A genetic algorithm (GA) for structure prediction of molecular crystals
A genetic algorithm performs global optimization by mimicking an evolutionary process. The property being optimized is mapped onto a fitness function, and structures with higher fitness are assigned a higher probability of mating. Crossover and mutation operators generate offspring by combining or altering the structural genes of parent structures, so that structural features associated with high fitness are propagated through the population. GAtor performs structure prediction for crystals of (semi-)rigid molecules with no internal rotational degrees of freedom. It is written in Python and distributed under a BSD-3 license. GAtor interfaces with the FHI-aims electronic structure package for energy evaluations and geometry relaxations with dispersion-inclusive density functional theory (DFT). It is recommended to start GAtor from a diverse initial pool of structures generated by Genarris.
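The mapping from property to fitness to mating probability can be illustrated with a minimal sketch. It assumes the optimized property is the total energy (lower is better) and uses simple roulette-wheel selection. The function names `energy_fitness` and `select_parents` are hypothetical and are not taken from GAtor's code base.

```python
import random

def energy_fitness(energies):
    """Map energies to fitness in [0, 1]; the lowest energy gets fitness 1.

    Assumption: a linear mapping, chosen only for illustration.
    """
    e_min, e_max = min(energies), max(energies)
    if e_max == e_min:
        # All structures are degenerate, so give them equal fitness.
        return [1.0] * len(energies)
    return [(e_max - e) / (e_max - e_min) for e in energies]

def select_parents(pool, fitness, rng, k=2):
    """Roulette-wheel selection: draw k parents with probability
    proportional to fitness (sampling with replacement)."""
    return rng.choices(pool, weights=fitness, k=k)

# Toy pool: structure labels with hypothetical relative energies.
pool = ["A", "B", "C"]
fitness = energy_fitness([-3.0, -1.0, -2.0])
parents = select_parents(pool, fitness, random.Random(0))
```

Because `random.Random.choices` samples with replacement, the same structure may be drawn twice. A production GA would typically reject such pairs before applying crossover.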
- Random structures are generated in all (or user-defined) space groups to exhaustively sample the potential energy surface.
- Physically motivated constraints are imposed on intermolecular distances to ensure the quality of the structures.
- Fast energy evaluations are performed with the Harris approximation.
- Machine learning is used for clustering based on a relative coordinate descriptor (RCD) developed specifically for molecular crystals.
- Standard and user-defined workflows comprising different sequences of energy evaluation, clustering, and selection steps produce curated sets of structures to serve as initial populations for global optimization algorithms and/or as training sets for machine learning.
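The intermolecular distance constraint used in random structure generation can be sketched as a simple closeness check. This is an illustration under stated assumptions, not Genarris's implementation: the cutoff is taken as a fraction `sr` of the sum of van der Waals radii, the value `sr=0.85` is a hypothetical default, the function name `too_close` is invented for this example, and periodic images are ignored for brevity.

```python
import math

# Bondi van der Waals radii in angstrom for a few common elements.
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}

def too_close(mol_a, mol_b, sr=0.85):
    """Return True if any atom pair across the two molecules is closer
    than sr times the sum of their van der Waals radii.

    Each molecule is a list of (element, (x, y, z)) tuples in angstrom.
    Assumption: no periodic boundary conditions are applied here.
    """
    for elem_a, pos_a in mol_a:
        for elem_b, pos_b in mol_b:
            cutoff = sr * (VDW_RADII[elem_a] + VDW_RADII[elem_b])
            if math.dist(pos_a, pos_b) < cutoff:
                return True
    return False

# A trial structure would be rejected if any pair of molecules is too close.
mol_1 = [("C", (0.0, 0.0, 0.0))]
mol_2 = [("O", (2.0, 0.0, 0.0))]
rejected = too_close(mol_1, mol_2)
```

In a real crystal the same check would also have to run over the periodic images of each molecule, which this sketch leaves out.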
A random structure generation package for molecular crystals
Genarris performs configuration space screening for crystals of (semi-)rigid molecules with no internal rotational degrees of freedom by random sampling with physical constraints. Genarris uses the Harris approximation to perform fast energy evaluations. The Harris density of a molecular crystal is constructed by replicating a single molecule density, which is calculated only once. The DFT energy is then evaluated for the Harris density without performing a self-consistency cycle. Genarris is written in Python and distributed under a BSD-3 license. Genarris interfaces with the FHI-aims electronic structure package for energy evaluations and geometry relaxations with dispersion-inclusive DFT.
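The construction of the Harris density described above can be written compactly. The notation here is an assumption introduced for illustration: $\rho_{\mathrm{mol}}$ is the density of the isolated molecule, $Z$ is the number of molecules in the unit cell, $\mathbf{R}_i$ and $\mathbf{t}_i$ are the orientation (a proper or improper rotation) and position of molecule $i$, and $\mathbf{T}$ runs over the lattice translations.

```latex
\rho_{\mathrm{Harris}}(\mathbf{r})
  = \sum_{\mathbf{T}} \sum_{i=1}^{Z}
    \rho_{\mathrm{mol}}\!\left( \mathbf{R}_i^{-1} \left( \mathbf{r} - \mathbf{t}_i - \mathbf{T} \right) \right),
\qquad
E_{\mathrm{Harris}} = E_{\mathrm{DFT}}\!\left[ \rho_{\mathrm{Harris}} \right]
```

Since $\rho_{\mathrm{mol}}$ is computed once and only rotated and translated afterwards, each new crystal structure costs a single energy evaluation on a fixed density rather than a full self-consistency cycle.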
- A variety of breeding operators (crossover and mutation) tailored for molecular crystals provide a balance between exploration and exploitation.
- Evolutionary niching, performed by using machine learning for clustering on the fly and then using a cluster-based fitness function, helps overcome initial pool biases and selection biases by steering the GA to under-explored regions of the configuration space.
- A massive parallelization scheme, tested on up to 105 CPU cores, enables effective utilization of high performance computing resources.
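The idea behind a cluster-based fitness function for evolutionary niching can be sketched with a fitness-sharing scheme, in which each structure's fitness is divided by the size of its cluster so that members of crowded clusters are penalized. This is one common way to realize niching and is shown as an assumption. The function name `cluster_based_fitness` is hypothetical, and the cluster labels are presumed to come from a separate clustering step.

```python
from collections import Counter

def cluster_based_fitness(fitness, labels):
    """Scale each structure's fitness by the inverse size of its cluster.

    fitness: list of raw fitness values, one per structure.
    labels:  list of cluster labels, one per structure (any hashable type).
    Assumption: simple fitness sharing, f_i / N_cluster(i).
    """
    sizes = Counter(labels)
    return [f / sizes[c] for f, c in zip(fitness, labels)]

# Two structures share cluster "a"; one structure sits alone in cluster "b".
raw = [1.0, 0.8, 0.5]
labels = ["a", "a", "b"]
shared = cluster_based_fitness(raw, labels)
```

In this toy case the lone member of cluster "b" ends up with the same shared fitness as the best member of the crowded cluster "a", which is how the selection pressure shifts toward under-explored regions.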