Abstract
This work combines data-driven machine learning (ML) models with high-fidelity density functional theory (DFT) to identify new potential thermoelectric materials. The traditional approach to thermoelectric material discovery, which searches an almost limitless space of chemical compounds, relies on expensive and time-consuming experiments. In the current work, DFT simulations are used to compute the descriptors (features) and thermoelectric characteristics (labels) of a set of compounds. DFT simulations are computationally very expensive, and hence the resulting database is not exhaustive. On the expectation that ML can learn the important features from this limited database, and that this knowledge can then be used to predict the behavior of new compounds, the current work contributes to the understanding of (a) the influence of the selection of training/test data, (b) the influence of the complexity of the ML algorithms, and (c) the computational efficiency of the combined DFT-ML methodology.