Publication

The predictive power of data-processing statistics

DOI: 10.1107/S2052252520000895

Authors: Melanie Vollmar (Diamond Light Source), James M. Parkhurst (Diamond Light Source; MRC Laboratory of Molecular Biology), Dominic Jaques (Diamond Light Source), Arnaud Baslé (Newcastle University), Garib N. Murshudov (MRC Laboratory of Molecular Biology), David G. Waterman (Science and Technology Facilities Council, Rutherford Appleton Laboratory), Gwyndaf Evans (Diamond Light Source)
Co-authored by industrial partner: No

Type: Journal Paper
Journal: IUCrJ, Vol. 7, pages 342-354

State: Published (Approved)
Published: March 2020

Open Access

Abstract: This study describes a method to estimate the likelihood of success in determining a macromolecular structure by X-ray crystallography and experimental single-wavelength anomalous dispersion (SAD) or multiple-wavelength anomalous dispersion (MAD) phasing based on initial data-processing statistics and sample crystal properties. Such a predictive tool can rapidly assess the usefulness of data and guide the collection of an optimal data set. The increase in data rates from modern macromolecular crystallography beamlines, together with a demand from users for real-time feedback, has led to pressure on computational resources and a need for smarter data handling. Statistical and machine-learning methods have been applied to construct a classifier that displays 95% accuracy for training and testing data sets compiled from 440 solved structures. Applying this classifier to new data achieved 79% accuracy. These scores already provide clear guidance as to the effective use of computing resources and offer a starting point for a personalized data-collection assistant.

Journal Keywords: macromolecular crystallography; experimental phasing; machine learning; structure determination; phasing; X-ray crystallography

Subject Areas: Information and Communication Technology


Technical Areas:

Documents:
jt5042.pdf