We trained a machine learning model to predict the superconducting transition temperature (Tc) of materials, using the CatBoost algorithm and the DataG dataset. DataG is derived from SuperCon (https://doi.org/10.48505/nims.3739), the largest dataset of superconducting materials.
The SuperCon dataset contains 33407 superconducting compounds and is currently the largest and most comprehensive database of superconducting materials. CatBoost is a machine learning ensemble technique based on Gradient Boosted Decision Trees (GBDT). Since its launch in late 2018, researchers have used CatBoost successfully in machine learning studies involving Big Data.
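
To make the GBDT idea concrete, here is a minimal sketch of gradient boosting for squared-error regression, written in plain Python with depth-1 trees (stumps) on a single feature. It is not CatBoost itself: CatBoost adds ordered boosting, symmetric trees, and categorical-feature handling that this sketch leaves out. The function names and the toy values are hypothetical and are not taken from DataG or SuperCon.

```python
# Minimal gradient boosting sketch (squared error, one feature, depth-1 trees).
# Illustrative only; all names and data below are assumptions for this example.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    mean_all = sum(residuals) / len(residuals)
    best = (float("inf"), None, mean_all, mean_all)
    for threshold in sorted(set(xs))[:-1]:  # skip the largest x: right side would be empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    if threshold is None:  # no valid split was found: constant prediction
        return left_value
    return left_value if x <= threshold else right_value

def fit_gbdt(xs, ys, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate

def predict_gbdt(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def mean_squared_error(model, xs, ys):
    return sum((predict_gbdt(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Toy data: one made-up descriptor and made-up Tc values in kelvin.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
toy_tc = [2.0, 2.5, 3.0, 3.5, 20.0, 21.0, 22.0, 40.0, 41.0, 42.0]

model = fit_gbdt(toy_x, toy_tc, n_trees=100, learning_rate=0.1)
print("training MSE:", mean_squared_error(model, toy_x, toy_tc))
```

Each round fits a small tree to what the current ensemble still gets wrong and adds a shrunken copy of it, so the training error falls as trees are added. Production libraries such as CatBoost apply the same principle with deeper trees, many features, and regularization.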
Here, the Tc of superconducting materials is predicted from two datasets, DataG and DataH. We prepared DataG through several data preprocessing steps, and we strongly recommend it for predicting the Tc of superconducting materials. DataH, an adaptation of the SuperCon dataset, was presented by Kam Hamidieh [1]. The model presented here is a regression model only: it predicts the Tc of a superconducting material and does not decide whether a compound is superconducting. In the near future, we will design a machine learning classification model that predicts whether or not a compound will become superconducting.

[1] Hamidieh, K. A data-driven statistical model for predicting the critical temperature of a superconductor. Computational Materials Science 154, 346 (2018).
