Available methods for predicting materials synthesizability using computational and machine learning approaches

  1. Available methods for predicting materials synthesizability using computational and machine learning approaches. Anubhav Jain, Lawrence Berkeley National Laboratory. TMS Spring Meeting, Mar 2023. Slides (already) posted to hackingmaterials.lbl.gov
  2. Joining Gerd’s group …
  3. Joining Gerd’s group … Gerd teaching thermodynamics, ~2006
  4. Congratulations Gerd! 2008 ECS 2011 The OG Materials Genome server at MIT 2008 – BURP! (“Bosch-Umicore Research Project”)
  5. Congratulations to G. Ceder! 2008 2011 The OG Materials Project server
  6. Outline of talk • Congratulations to Gerd Ceder • Old pictures • The dreaded puppet • Ceder group Jeopardy
  7. GETTING WITH THE PROGRAM • FONT SIZES! • MEDITATING IN QUIET SPACES • IS THIS A GOVERNMENT LAB? • POSTDOC OR COMPUTER? • IS IT A HOLIDAY TODAY?
  8. Outline • The explosion of new materials predictions, and dilemma of what to test • Can we trust ML algorithms that predict hull stability? • Beyond 0K “e above hull”: Efficient phonons • Integrating literature knowledge for efficient experimentation
  9. The pace of experimental materials discovery is about 10K-20K entries per year. Entries in the Powder Diffraction File (PDF; figure legend: Collaboration with ICSD, Collaboration with MPDS): ~20,000 entries per year over the last decade. Gates-Rector, S. & Blanton, T. The Powder Diffraction File: a quality materials characterization database. Powder Diffr. 34, 352–360 (2019). Inorganic Crystal Structure Database: ~10,000 entries per year over the last decade. Zagorac, D., Müller, H., Ruehl, S., Zagorac, J. & Rehme, S. Recent developments in the Inorganic Crystal Structure Database: theoretical crystal structure data and related features. J Appl Crystallogr 52, 918–925 (2019).
  10. However, machine learning is predicting very large numbers of new stable compounds. In a short period of time, ML algorithms can generate millions of potentially stable compounds. With multiple experiments likely needed per compound, even automated labs (“A-lab”, Ceder & colleagues) won’t be able to keep up. [Bar chart: number of compounds on an axis from 0 to 2,000,000 for MP stable, ICSD, PDF, and M3GNet stable.] M3GNet data: Chen, C., Ong, S.P. A universal graph deep learning interatomic potential for the periodic table. Nat Comput Sci 2, 718–728 (2022).
  11. How do we prioritize? We need to be able to accurately assess the likelihood of synthesis success to avoid wasting resources. [Diagram labels: likelihood of success; potential novelty, functionality; candidate predictions × millions.] Large error bars in the process make this difficult. How can we start working towards more confidence in the likelihood of success?
  12. Outline • The explosion of new materials predictions, and dilemma of what to test • Can we trust ML algorithms that predict hull stability? • Beyond 0K “e above hull”: Efficient phonons • Integrating literature knowledge for efficient experimentation
  13. Do ML algorithms work for new materials? Bartel et al., npj Comp. Mats., 6, 97 (2020). Structure relaxing algorithms partially fix this by solving an optimization problem. Zuo et al., MaterialsToday, 51 (2021)
  14. ML algorithms should be tested on a discovery-oriented task similar to how they’d be deployed. Matbench-discovery asks algorithms to rank a new chemical space’s candidates by predicted hull stability. [Diagram labels: MP stable structures; substituted, unrelaxed candidate structures (257k); model: Wrenformer, BOSWR, M3GNet, etc.] Wang, H.-C., Botti, S. & Marques, M. A. L. Predicting stable crystalline compounds using chemical similarity. npj Comput Mater 7, 12 (2021).
  15. How well do current algorithms do? Good, but still room for improvement. Precision (the fraction of materials predicted to be stable that actually are stable) peaks at about 0.5. Discovery acceleration factor (how much better than a random guess) peaks at almost 3 (maximum = 6). Surprisingly, MEGNet does better on these metrics than other models that relax the structure (although worse on MAE), but this is likely an artifact of the test (next slide). M3GNet is likely the best model of those tested (next slide)
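The two metrics on this slide can be sketched in a few lines. This is a minimal illustration, assuming the discovery acceleration factor (DAF) is defined as precision divided by the prevalence of stable materials in the test set; under that assumption the quoted maximum of 6 would correspond to roughly one stable material per six candidates. The function name, the stability threshold of 0 eV/atom, and the input lists are illustrative and not taken from the talk.

```python
def discovery_metrics(e_hull_true, e_hull_pred, threshold=0.0):
    """Return (precision, prevalence, daf) for a stable/unstable classification.

    A material counts as stable when its energy above hull is <= threshold.
    DAF is computed as precision / prevalence (an assumed definition).
    """
    if len(e_hull_true) != len(e_hull_pred) or len(e_hull_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    true_stable = [e <= threshold for e in e_hull_true]
    pred_stable = [e <= threshold for e in e_hull_pred]

    n_true = sum(true_stable)
    n_pred = sum(pred_stable)
    if n_true == 0 or n_pred == 0:
        raise ValueError("need at least one true and one predicted stable material")

    true_pos = sum(t and p for t, p in zip(true_stable, pred_stable))
    precision = true_pos / n_pred                # how many picks were right
    prevalence = n_true / len(true_stable)       # hit rate of a random pick
    return precision, prevalence, precision / prevalence
```

A DAF of 1 means the model does no better than picking candidates at random; the upper bound is 1 / prevalence, reached when every predicted-stable material is truly stable.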
  16. Further analyzing the results and performance curves, M3GNet is clearly the best model so far. [Plot region labels: likely true negatives; likely true positives; likely false negatives; likely false positives.]
  17. Matbench-discovery will be posted to Materials Project, and we will track evolution of algorithms over time! [Figure: data and granular metrics, with legend entries 2022, 2021, 2020, <2019.]
  18. Outline • The explosion of new materials predictions, and dilemma of what to test • Can we trust ML algorithms that predict hull stability? • Beyond 0K “e above hull”: Efficient phonons • Integrating literature knowledge for efficient experimentation
  19. “e above hull” is not particularly selective for observed vs unobserved materials. Observed (blue) and unobserved (red) phases have significant overlap in hull energies. The “best” hull energy cutoff to use is system-dependent, and may not work well at all for some chemistries (e.g., nitrides). Aykol, M., Dwaraknath, S. S., Sun, W. & Persson, K. A. Thermodynamic limit for synthesis of metastable inorganic materials. Sci. Adv. 4, eaaq0148 (2018). Sun, W. et al. The thermodynamic scale of inorganic crystalline metastability. Sci. Adv. 2, e1600225 (2016).
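For readers unfamiliar with the quantity, the sketch below computes the energy above hull for a toy binary A-B system: the vertical distance between a phase's formation energy and the lower convex hull of all phases. It is a minimal illustration, assuming formation energies per atom referenced to the elements (so both end members sit at 0) and that both end members are included in the input; the function names and numbers are made up for the example, and production codes handle multicomponent systems.

```python
import numpy as np

def _cross(o, a, b):
    """2D cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def energy_above_hull_binary(x, e_form):
    """Energy above the lower convex hull for each entry of a binary system.

    x      : fraction of element B in each entry (0 to 1)
    e_form : formation energy per atom of each entry
    Assumes entries at x = 0 and x = 1 are present.
    """
    x = np.asarray(x, dtype=float)
    e_form = np.asarray(e_form, dtype=float)

    # Keep only the lowest-energy entry at each composition.
    lowest = {}
    for xi, ei in zip(x, e_form):
        key = float(xi)
        if key not in lowest or ei < lowest[key]:
            lowest[key] = float(ei)

    # Lower hull via the monotone-chain construction (points sorted by x).
    hull = []
    for point in sorted(lowest.items()):
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], point) <= 0:
            hull.pop()
        hull.append(point)

    hull_x, hull_e = zip(*hull)
    return e_form - np.interp(x, hull_x, hull_e)
```

In this toy setting a phase on the hull returns 0, and a cutoff such as "below some number of meV/atom" is then applied to the returned values, which is exactly the step the slide argues is not very selective.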
  20. We know what’s needed to go beyond the hull, but it’s usually a pain and/or expensive. [“Synthesizability workflow” diagram labels: starting structure; 0K hull stability; dynamical stability + finite T Gibbs energy; oxidation & moisture resistance / passivation; aqueous / environment stability; amorphous hull limit / ensure hull is complete; defect & distortion tolerance.]
  21. Dynamical stability is not too hard to get, but routinely ignored. [Figure: hull energies of some hypothetical MAX phases; dynamically unstable phases are marked with a *.] Khaledialidusti, R., Khazaei, M., Khazaei, S. & Ohno, K. Nanoscale 13, 7294–7307 (2021). In trickier cases, dynamic stability can be T-dependent, so 0K dynamic stability may not be enough
  22. The vibrational contribution to free energy is another thing we usually ignore. Bartel, C. J. et al. Nat Commun 9, 4168 (2018)
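To make the ignored term concrete, here is a minimal sketch of the standard harmonic expression for the vibrational free energy, F_vib(T) = Σ [hν/2 + k_B T ln(1 − exp(−hν / k_B T))], summed over phonon modes. The formula is textbook lattice dynamics rather than something stated on the slide, and the function name and the flat list of mode frequencies are simplifying assumptions; a real calculation would sum over a Brillouin-zone mesh with proper weights.

```python
import math

H_EV_S = 4.135667696e-15   # Planck constant in eV*s
KB_EV_K = 8.617333262e-5   # Boltzmann constant in eV/K

def harmonic_free_energy(frequencies_thz, temperature_k):
    """Harmonic vibrational free energy in eV, summed over the given modes."""
    total = 0.0
    for nu in frequencies_thz:
        if nu <= 0.0:
            # Imaginary or zero-frequency modes have no harmonic free energy.
            raise ValueError("all mode frequencies must be positive")
        energy = H_EV_S * nu * 1e12          # mode energy h*nu in eV
        total += 0.5 * energy                # zero-point contribution
        if temperature_k > 0.0:
            kt = KB_EV_K * temperature_k
            total += kt * math.log1p(-math.exp(-energy / kt))
    return total
```

The thermal term is always negative, so soft (low-frequency) phases gain stability with temperature relative to stiff ones, which is why a 0K hull can rank competing phases differently from a finite-temperature one.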
  23. Calculating thermal properties of materials • The vibrational thermal properties of materials are determined by phonon behavior • In lattice dynamics, we typically Taylor expand the phonon interactions in the atomic displacements, and differentiate to solve for the interatomic force constants (IFCs)
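The equation from this slide did not survive in the transcript. For orientation, the standard textbook form of the expansion is given below; it is offered as the usual convention, not as a reproduction of the slide, and the index notation (atoms i, j, k; Cartesian directions α, β, γ) is an assumption.

```latex
\begin{aligned}
V &= V_0
   + \sum_{i,\alpha} \Phi_i^{\alpha}\, u_i^{\alpha}
   + \frac{1}{2!} \sum_{ij,\alpha\beta} \Phi_{ij}^{\alpha\beta}\, u_i^{\alpha} u_j^{\beta}
   + \frac{1}{3!} \sum_{ijk,\alpha\beta\gamma} \Phi_{ijk}^{\alpha\beta\gamma}\, u_i^{\alpha} u_j^{\beta} u_k^{\gamma}
   + \cdots \\
\Phi_{ij}^{\alpha\beta} &= \left.\frac{\partial^2 V}{\partial u_i^{\alpha}\, \partial u_j^{\beta}}\right|_{u=0},
\qquad
\Phi_{ijk}^{\alpha\beta\gamma} = \left.\frac{\partial^3 V}{\partial u_i^{\alpha}\, \partial u_j^{\beta}\, \partial u_k^{\gamma}}\right|_{u=0} \\
F_i^{\alpha} &= -\frac{\partial V}{\partial u_i^{\alpha}}
   = -\sum_{j,\beta} \Phi_{ij}^{\alpha\beta}\, u_j^{\beta}
     - \frac{1}{2} \sum_{jk,\beta\gamma} \Phi_{ijk}^{\alpha\beta\gamma}\, u_j^{\beta} u_k^{\gamma}
     - \cdots
\end{aligned}
```

The first-order term vanishes when the expansion is taken about a relaxed structure, which is why the force expression starts at the second-order (harmonic) force constants.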
  24. The problem: obtaining force constants can require many DFT calculations • Traditionally (finite-displacement method), one performs systematic displacements, each of which has only a few atom movements and solves only a small portion of the IFC matrix • To obtain 2nd order IFCs: 1 displacement in a supercell (usually <5 supercells needed) • To obtain 3rd order IFCs: 2 displacements in a supercell (# of supercells needed: 1000s-10000s) • Primitive cells with reduced symmetry and many atoms can easily require 1000 or more calculations • The scaling goes something like O(N^n), where N is the number of sites and n is the order of IFC you want. Not scalable! • By contrast, with IFCs extracted from HiPhive: to obtain any order of IFCs (2nd, 3rd, …) in one shot, displace each atom in a supercell (only 5~10 supercells needed in total!)
  25. The solution: perform non-systematic displacements • Instead of performing systematic displacements, perform non-systematic displacements in which many IFC terms are “mixed up” • Then, perform a best-fit procedure to fit the IFC matrix elements to the observed data • The fit is typically underdetermined, so regularization is important • This method has been suggested by several groups; for now we focus on the implementation in the HiPhive code (Erhart group, Chalmers University of Technology) • Disadvantage: this method requires careful selection of fit parameters to get correct results • Monte Carlo rattle penalizes displacements that lead to very small interatomic distances. [Figure: IFCs extracted from HiPhive; to obtain any order of IFCs (2nd, 3rd, …) in one shot, displace each atom in a supercell (only 5~10 supercells needed in total).] Fransson, E.; Eriksson, F.; Erhart, P. Efficient Construction of Linear Models in Materials Modeling and Applications to Force Constant Expansions. npj Comput Mater 2020, 6 (1), 135.
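The fitting idea can be illustrated on a toy model. The sketch below is not HiPhive's API: it assumes a 1D periodic chain with harmonic force constants that depend only on neighbor distance, displaces every atom at random in a handful of "supercells", and recovers the force constants with a plain ridge-regularized least-squares fit. All names, sizes, and numbers are hypothetical.

```python
import numpy as np

def design_matrix(displacements, cutoff):
    """One row per (structure, atom); one column per neighbor distance 0..cutoff.

    Encodes F_i = -phi_0 * u_i - sum_d phi_d * (u_{i+d} + u_{i-d}).
    """
    n_structures, n_atoms = displacements.shape
    if n_atoms <= 2 * cutoff:
        raise ValueError("chain too short for this cutoff")
    rows = []
    for u in displacements:
        for i in range(n_atoms):
            row = [-u[i]]
            for d in range(1, cutoff + 1):
                row.append(-(u[(i + d) % n_atoms] + u[(i - d) % n_atoms]))
            rows.append(row)
    return np.array(rows)

def fit_force_constants(displacements, forces, cutoff, ridge=1e-8):
    """Ridge-regularized least-squares fit of the distance-dependent constants."""
    a = design_matrix(displacements, cutoff)
    f = np.asarray(forces, dtype=float).reshape(-1)
    gram = a.T @ a + ridge * np.eye(cutoff + 1)
    return np.linalg.solve(gram, a.T @ f)

def demo(seed=0):
    """Rattle 5 chains of 16 atoms and recover 4 force constants."""
    rng = np.random.default_rng(seed)
    phi_true = np.array([2.0, -0.8, -0.15, -0.05])   # obeys the acoustic sum rule
    u = rng.normal(scale=0.03, size=(5, 16))          # every atom displaced
    forces = (design_matrix(u, 3) @ phi_true).reshape(5, 16)
    forces += rng.normal(scale=1e-4, size=forces.shape)  # mimic numerical noise
    return phi_true, fit_force_constants(u, forces, 3)
```

Because every atom moves in every structure, each structure contributes one equation per atom, so a few structures already give many more equations than symmetry-distinct parameters in this toy case. In realistic 3D systems with higher-order terms the parameter count grows quickly, which is where the choice of cutoff and regularization starts to matter.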
  26. We’ve been working to get the parameter selection problem sorted out … [Figure panels: effect of supercell size; effect of cutoff; effect of fitting method.] Other parameters like rattling amount, etc. were also tested
  27. We’ve wrapped these and other considerations into a fully automatic workflow. [Flowchart labels: INPUT; VASP DFT relaxation of primitive cell; VASP SCF on supercells (u = 0.01-0.05 Å); VASP SCF on supercells (u = 0.1-0.5 Å); HiPhive fit of harmonic Φ2; HiPhive fit of anharmonic Φ3, Φ4, etc.; complete Φ; bulk modulus; “Imaginary modes?” decision; stable phonon; renormalization at T ≥ 0 K (quantum covariance, renormalize Φ2); renormalized Φ; inner loop and outer loop with “Converged free energy?” decisions; expand lattice at T; corrected free energy; ShengBTE/FourPhonon Boltzmann transport. Properties listed: free energy, entropy, heat capacity, Gruneisen, thermal expansion, lattice thermal conductivity, phase transition, thermoelectric zT.] Non-analytical corrections for ionic compounds. Phonon renormalization for imaginary modes at finite T via Xia & Chan. Examples shown: Li (BCC, Im-3m), ZrO2 (cubic, Fm-3m), GeTe (cubic, Fm-3m), BaTiO3 (cubic, Pm-3m). Xia, Y. & Chan, M. K. Y. Anharmonic stabilization and lattice heat transport in rocksalt β-GeTe. Appl. Phys. Lett. 113, 193902 (2018).
  28. We see 100 – 1000X speedup compared to finite displacement method. [Figure labels: 100x speedup; 1000x speedup; axes: more thermal properties, higher physical accuracy, computational feasibility.] Harmonic terms (Φ2), 2nd order of IFCs: non-analytic correction (NAC), phonon dispersion/DOS, quasi-harmonic thermal properties (free energy, heat capacity, entropy). Anharmonic terms (Φ3, Φ4): 3rd order of IFCs give lattice thermal conductivity, Gruneisen parameter, coefficient of thermal expansion; 4th order of IFCs give finite-temperature phonon (renormalization) and corrected free energy. Dynamic stability, finite temperature Gibbs free energy, and other parameters are now accessible!
  29. Hopefully such calculations can become more routine in the future. [Diagram labels: starting structure; 0K hull stability; dynamical stability + finite T Gibbs energy; oxidation & moisture resistance / passivation; aqueous / environment stability; amorphous hull limit / ensure hull is complete; defect & distortion tolerance.]
  30. Outline • The explosion of new materials predictions, and dilemma of what to test • Can we trust ML algorithms that predict hull stability? • Beyond 0K “e above hull”: Efficient phonons • Integrating literature knowledge for efficient experimentation
  31. Data from the literature can be used not only to assess synthesizability, but also to make more efficient use of experiments. This means not only identifying synthesizable compounds, but also reducing the number of experiments it takes to make them. Huo, H. et al. Machine-Learning Rationalization and Prediction of Solid-State Synthesis Conditions. Chem. Mater. 34, 7323–7336 (2022).
  32. A main issue is getting clean data sets. There is “loss” at each step of the process. Ideally, we have fewer steps that can do more, and retain overall accuracy. Wang et al., https://arxiv.org/abs/2111.10874
  33. We have found that a fine-tuned GPT-3 model can be used to extract synthesis recipes or other data: (1) Initial training set of templates filled mostly manually, as zero-shot GPT is often poor for complex technical tasks (2) Fine-tune model to fill templates, use the model to assist in annotation (3) Repeat as necessary until desired inference accuracy is achieved
  34. Templated extraction of synthesis recipes • Annotate paragraphs to output structured recipe templates • JSON-format • Designed using domain knowledge from experimentalists • Template is relation graph to be filled in by model
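As a rough picture of what templated extraction involves on the consuming side, the sketch below parses a model completion as JSON and maps it onto a fixed recipe template. The template fields and the function name are hypothetical placeholders; the actual templates used in this work were designed with experimentalists and are not reproduced in the slides.

```python
import copy
import json

# Hypothetical template: field names are illustrative, not the authors' schema.
RECIPE_TEMPLATE = {
    "target": "",
    "precursors": [],   # e.g. [{"name": "...", "amount": "..."}]
    "steps": [],        # e.g. [{"action": "...", "temperature": "...", "time": "..."}]
}

def parse_completion(completion_text):
    """Parse a model completion as JSON and fill the recipe template.

    Missing fields fall back to the template defaults; unknown fields are
    rejected so that malformed completions are caught during annotation.
    """
    data = json.loads(completion_text)
    if not isinstance(data, dict):
        raise ValueError("completion must be a JSON object")
    unknown = set(data) - set(RECIPE_TEMPLATE)
    if unknown:
        raise ValueError(f"fields not in template: {sorted(unknown)}")
    record = copy.deepcopy(RECIPE_TEMPLATE)
    record.update(data)
    return record
```

Rejecting unknown fields is one possible design choice: it makes the annotate, fine-tune, and repeat loop described on the previous slide easier to audit, at the cost of discarding completions that are almost right.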
  35. Example Extraction for Au nanorod synthesis
  36. Training a decision tree to predict AuNR shape shows similar conclusions as literature • Decision tree shows seed capping agent type as first decision boundary for shape determination • “Citrate-capped gold seeds form penta-twinned structure, while CTAB-capped seeds are single crystalline, hence former leads to bipyramids and latter leads to rods” [1,2]. [Decision-tree leaf labels: rod, cube, bipyramid, star, none.] [1] Liu and Guyot-Sionnest, J. Phys. Chem. B, 2005, 109 (47), 22192-22200. [2] Grzelczak et al., Chem. Soc. Rev., 2008, 37, 1783-1791
  37. We see similar results in a parallel project about BiFeO3 synthesis via sol–gel
  38. We are also extending to applications like doping. Currently: ~357,000 processed abstracts; ~373,000 dopants in ~312,000 host materials
  39. This allows us to get doping statistics for common materials
  40. We can see which materials might have similar patterns of dopants … [Figure labels: hosts; dopants; occurrences (48k abstracts).]
  41. And use ML (collaborative filtering) to find unknown dopants
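Collaborative filtering on a host-by-dopant occurrence table can be sketched as follows. This is a generic neighborhood-based variant (cosine similarity between hosts), offered as an illustration under stated assumptions rather than the method used in the talk; the function name and the example counts are made up.

```python
import numpy as np

def recommend_dopants(occurrence, hosts, dopants, host, top_k=2):
    """Suggest dopants not yet reported for `host`, ranked by similar hosts.

    occurrence[h][d] is the number of abstracts mentioning dopant d in host h.
    """
    counts = np.asarray(occurrence, dtype=float)

    # Cosine similarity between host rows.
    norms = np.linalg.norm(counts, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    unit = counts / norms
    similarity = unit @ unit.T

    h = hosts.index(host)
    weights = similarity[h].copy()
    weights[h] = 0.0                       # ignore the host itself

    # Score each dopant by similarity-weighted counts in the other hosts.
    scores = weights @ counts
    scores[counts[h] > 0] = -np.inf        # drop dopants already known for this host

    order = np.argsort(-scores)
    return [dopants[j] for j in order[:top_k] if np.isfinite(scores[j])]
```

For example, with made-up counts for three hosts, `recommend_dopants([[5, 3, 0, 0], [4, 2, 3, 0], [0, 0, 0, 6]], ["ZnO", "CdO", "TiO2"], ["Al", "Ga", "In", "Nb"], "ZnO", top_k=1)` suggests the dopant seen in the most similar host but not yet in the query host.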
  42. Conclusions • Even automated labs won’t be able to keep up with the deluge of ML predictions of interesting / novel / functional compounds • ML does seem to do a relatively good job at finding 0K hull stable compounds • We need more informed criteria on how to rank the candidates coming out of the ML pipeline, and these criteria need to be easy to deploy • More efficient calculation strategies and NLP-based ML can play a role to help prioritize some compounds over others.
  43. Acknowledgements • NLP: Alex Dunn, John Dagdelen, Nick Walker, Sanghoon Lee, Kevin Cruse, Viktoriia Baibakova, Amalie Trewartha • Matbench-discovery: Janosh Riebesell, Alex Dunn, Rhys Goodall • Phonon workflow: Zhuoying Zhu, Hrushikesh Sahasrabuddhe • … and of course Gerd for setting me on this trajectory and continuing me on it … Funding provided by: • U.S. Department of Energy, Basic Energy Science, “Materials Project” program • U.S. Department of Energy, Basic Energy Science, “D2S2” program • Toyota Research Institutes, Accelerated Materials Design program