Deep Learning for the Atomic Scale: Graph Neural Networks and Deep Generative Models with Some Applications to Materials and Molecules
Published in Linköping University Electronic Press, 2025
Abstract:
The development of artificial intelligence, and in particular machine learning, has seen tremendous success in recent years. The use of machine learning, however, extends to vast application areas beyond those that we encounter in our day-to-day life. One such area is the natural sciences, where research has shown promising results in using machine learning to model systems of atoms. This is also the type of application that motivates the methods developed in this thesis. The thesis investigates and develops both predictive models that can predict properties and simulate these types of systems, and generative models that can propose new potential materials or molecules.
Particular emphasis is put on methods that model data as graphs, and the thesis starts with investigations of graph neural networks (GNNs) designed for predicting material and molecular properties. The performance of these models in the context of high-throughput screening is put under scrutiny. The performance of GNNs when predicting properties of materials that are only hypothetical, and of structures that are not completely relaxed, is investigated, and the insights are used to suggest a workflow that combines machine learning and conventional high-throughput methods. Additionally, an investigation of so-called knowledge distillation in the context of GNNs for systems of atoms has been performed. This study proposes some simple techniques for improving the performance of this type of GNN without sacrificing speed.
The generative modeling techniques developed in the thesis include both generally applicable methods and methods that specifically target the materials science domain. Among the general methods, the thesis investigates a type of generative autoregressive model in which the generation order is a random variable, and develops discriminator guidance for such models. Additionally, a new sequential Monte Carlo algorithm, DDSMC, is developed for general Bayesian inverse problems. A dedicated materials science model, WyckoffDiff, is developed, utilizing a description of materials that explicitly encodes information about their symmetries, with the aim of facilitating generation of materials with strict symmetry properties.
While predictive and generative models can be useful on their own, the study on WyckoffDiff also highlights how they can be used together as parts of a materials discovery pipeline, with predictive models predicting the properties of the materials generated by a generative model like WyckoffDiff.
Recommended citation: Ekström Kelvinius, F. (2025). Deep Learning for the Atomic Scale: Graph Neural Networks and Deep Generative Models with Some Applications to Materials and Molecules (PhD dissertation, Linköping University Electronic Press). https://doi.org/10.3384/9789181181852