%0 Journal Article %A Bharat Medasani %A Anthony C Gamst %A Hong Ding %A Wei Chen %A Kristin A Persson %A Mark D Asta %A Andrew Canning %A Maciej Haranczyk %B npj Computational Materials %D 2016 %G eng %N 1 %R 10.1038/s41524-016-0001-z %T Predicting defect behavior in B2 intermetallics by merging ab initio modeling and machine learning %V 2 %8 12/2016 %! npj Comput Mater %X
We present a combination of machine learning and high-throughput calculations to predict point defect behavior in binary intermetallic (A–B) compounds, using as an example systems with the cubic B2 crystal structure (with equiatomic AB stoichiometry). To the best of our knowledge, this work is the first application of machine learning models to point defect properties. High-throughput first-principles density functional calculations have been employed to compute intrinsic point defect energies in 100 B2 intermetallic compounds. The systems are classified into two groups: (i) those for which the intrinsic defects are antisites for both A-rich and B-rich compositions, and (ii) those for which vacancies are the dominant defect for either or both composition ranges. The data were analyzed by machine learning techniques using a decision tree model and full and reduced multiple additive regression tree (MART) models. Among these three schemes, a reduced MART (r-MART) model using six descriptors (formation energy, minimum and difference of electron densities at the Wigner–Seitz cell boundary, atomic radius difference, maximal atomic number and maximal electronegativity) gives the highest fit (98%) and predictive (75%) accuracy. This model is used to predict the defect behavior of other B2 compounds, and it is found that 45% of the compounds considered feature vacancies as dominant defects for either A-rich or B-rich compositions (or both). The ability to predict dominant defect types is important for the modeling of thermodynamic and kinetic properties of intermetallic compounds, and the present results illustrate how this information can be derived using modern tools combining high-throughput calculations and data analytics.