|
|
Beatriz Costa-Gomes, Joel Greer, Nikolai Juraschko, James Parkhurst, Jola Mirecka, Marjan Famili, Camila Rangel-Smith, Oliver Strickson, Alan Lowe, Mark Basham, Tom Burnley
Open Access
Abstract: Ease of access to data, tools and models expedites scientific research. In structural biology there are now numerous open repositories of experimental and simulated data sets. Being able to easily access and utilize these is crucial to allow researchers to make optimal use of their research effort. The tools presented here are useful for collating existing public cryoEM data sets and/or creating new synthetic cryoEM data sets to aid the development of novel data processing and interpretation algorithms. In recent years, structural biology has seen the development of a multitude of machine-learning-based algorithms to aid numerous steps in the processing and reconstruction of experimental data sets and the use of these approaches has become widespread. Developing such techniques in structural biology requires access to large data sets, which can be cumbersome to curate and unwieldy to make use of. In this paper, we present a suite of Python software packages, which we collectively refer to as PERC (profet, EMPIARreader and CAKED). These are designed to reduce the burden which data curation places upon structural biology research. The protein structure fetcher (profet) package allows users to conveniently download and cleave sequences or structures from the Protein Data Bank or AlphaFold databases. EMPIARreader allows lazy loading of Electron Microscopy Public Image Archive data sets in a machine-learning-compatible structure. The Class Aggregator for Key Electron-microscopy Data (CAKED) package is designed to seamlessly facilitate the training of machine-learning models on electron microscopy data, including electron-cryo-microscopy-specific data augmentation and labeling. These packages may be utilized independently or as building blocks in workflows. All are available in open-source repositories and designed to be easily extensible to facilitate more advanced workflows if required.
|
Oct 2025
|
|
|
|
Jingjing Zhao, Chen Huang, Ali Mostaed, Amirafshar Moshtaghpour, James M. Parkhurst, Ivan Lobato, Marcus Gallagher-Jones, Judy S. Kim, Mark Boyce, David Stuart, Elena A. Andreeva, Jacques-Philippe Colletier, Angus I. Kirkland
Open Access
Abstract: Exit wavefunction reconstruction is important in transmission electron microscopy for structural studies. We describe electron Fourier ptychography and its application to phase reconstruction of both radiation-resistant and beam-sensitive materials. We demonstrate that the phase of the exit wave can be reconstructed to high resolution using a modified iterative phase retrieval algorithm from data collected in an alternative optical geometry. This method achieves a spatial resolution of 0.63 nm at a fluence of 4.5 × 10² e⁻/nm², as validated on Cry11Aa protein crystals under cryogenic conditions. Notably, this method requires no instrumental modifications, is straightforward to implement, and can be seamlessly integrated with existing data collection software, providing a broadly accessible alternative approach for structural studies.
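As a quick unit check, reading the quoted fluence as 4.5 × 10² e⁻/nm², the sketch below converts it to e⁻/Å², the unit more commonly quoted in cryo-EM. The function name is illustrative and does not come from the paper.

```python
def fluence_nm2_to_A2(fluence_per_nm2: float) -> float:
    """Convert an electron fluence from e-/nm^2 to e-/A^2.

    1 nm = 10 A, so 1 nm^2 = 100 A^2 and the per-area value shrinks by 100.
    """
    return fluence_per_nm2 / 100.0


reported = 4.5e2  # e-/nm^2, the fluence quoted in the abstract
print(f"{reported:g} e-/nm^2 = {fluence_nm2_to_A2(reported):g} e-/A^2")
```

On this reading the reported fluence corresponds to 4.5 e⁻/Å².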
|
Oct 2025
|
|
|
|
Open Access
Abstract: Conformational heterogeneity of biological macromolecules is a challenge in single-particle averaging (SPA). Current standard practice is to employ classification and filtering methods that may allow a discrete number of conformational states to be reconstructed. However, the conformation space accessible to these molecules is continuous and, therefore, explored incompletely by a small number of discrete classes. Recently developed heterogeneous reconstruction algorithms (HRAs) to analyse continuous heterogeneity rely on machine-learning methods that employ low-dimensional latent space representations. The non-linear nature of many of these methods poses a challenge to their validation and interpretation and to identifying functionally relevant conformational trajectories. These methods would benefit from in-depth benchmarking using high-quality synthetic data and concomitant ground truth information. We present a framework for the simulation and subsequent analysis with respect to the ground truth of cryo-EM micrographs containing particles whose conformational heterogeneity is sourced from molecular dynamics simulations. These synthetic data can be processed as if they were experimental data, allowing aspects of standard SPA workflows as well as heterogeneous reconstruction methods to be compared with known ground truth using available utilities. The simulation and analysis of several such datasets are demonstrated and an initial investigation into HRAs is presented.
|
Nov 2024
|
|
I24-Microfocus Macromolecular Crystallography
|
Abstract: This chapter describes additions to the DIALS software package for processing serial still-shot crystallographic data, and the implementation of a pipeline, xia2.ssx, for processing and merging serial crystallography data using DIALS programs. To integrate partial still-shot diffraction data, a 3D Gaussian profile model was developed that can describe anisotropic spot shapes. This model is optimised by maximum likelihood methods using the pixel-intensity distributions of strong diffraction spots, enabling simultaneous refinement of the profile model and Ewald-sphere offsets. We demonstrate the processing of an example SSX dataset where the improved partiality estimates lead to better model statistics compared with post-refined isotropic models. We also demonstrate some of the workflows available for merging SSX data, including processing time/dose resolved data series, where data can be separated at the point of merging after scaling, and discuss the program outputs used to investigate the data throughout the pipeline.
|
Nov 2024
|
|
|
|
Open Access
Abstract: For cryo-electron tomography (cryo-ET) of beam-sensitive biological specimens, a planar sample geometry is typically used. As the sample is tilted, the effective thickness of the sample along the direction of the electron beam increases and the signal-to-noise ratio concomitantly decreases, limiting the transfer of information at high tilt angles. In addition, the tilt range where data can be collected is limited by a combination of various sample-environment constraints, including the limited space in the objective lens pole piece and the possible use of fixed conductive braids to cool the specimen. Consequently, most tilt series are limited to a maximum of ±70°, leading to the presence of a missing wedge in Fourier space. The acquisition of cryo-ET data without a missing wedge, for example using a cylindrical sample geometry, is hence attractive for volumetric analysis of low-symmetry structures such as organelles or vesicles, lysis events, pore formation or filaments for which the missing information cannot be compensated by averaging techniques. Irrespective of the geometry, electron-beam damage to the specimen is an issue and the first images acquired will transfer more high-resolution information than those acquired last. There is also an inherent trade-off between higher sampling in Fourier space and avoiding beam damage to the sample. Finally, the necessity of using a sufficient electron fluence to align the tilt images means that this fluence needs to be fractionated across a small number of images; therefore, the order of data acquisition is also a factor to consider. Here, an n-helix tilt scheme is described and simulated which uses overlapping and interleaved tilt series to maximize the use of a pillar geometry, allowing the entire pillar volume to be reconstructed as a single unit. 
Three related tilt schemes are also evaluated that extend the continuous and classic dose-symmetric tilt schemes for cryo-ET to pillar samples to enable the collection of isotropic information across all spatial frequencies. A fourfold dose-symmetric scheme is proposed which provides a practical compromise between uniform information transfer and complexity of data acquisition.
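To put a number on the missing wedge mentioned in this abstract, here is a minimal sketch for an idealised single-axis tilt series. It assumes the standard central-slice picture and ignores the finite tilt increment; the function name is illustrative and is not taken from the paper or from any software package.

```python
def missing_wedge_fraction(max_tilt_deg: float) -> float:
    """Fraction of Fourier space left unsampled by a single-axis tilt
    series covering -max_tilt_deg .. +max_tilt_deg.

    Each projection fills one central plane in Fourier space, so a +/-t
    range covers 2t of the 180 degrees needed for complete sampling.
    """
    if not 0.0 <= max_tilt_deg <= 90.0:
        raise ValueError("max_tilt_deg must lie in [0, 90]")
    return 1.0 - max_tilt_deg / 90.0


for tilt in (60.0, 70.0, 90.0):
    print(f"+/-{tilt:.0f} deg: {missing_wedge_fraction(tilt):.1%} missing")
```

Under these assumptions a ±70° series leaves about 22% of Fourier space unsampled, while a pillar sample measured over the full ±90° range leaves none.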
|
Jun 2024
|
|
|
|
Open Access
Abstract: Simulations of cryo-electron microscopy (cryo-EM) images of biological samples can be used to produce test datasets to support the development of instrumentation, methods, and software, as well as to assess data acquisition and analysis strategies. To be useful, these simulations need to be based on physically realistic models which include large volumes of amorphous ice. The gold standard model for EM image simulation is a physical atom-based ice model produced using molecular dynamics simulations. Although practical for small sample volumes, this approach can be too computationally expensive for simulating cryo-EM data from large sample volumes. We have evaluated a Gaussian random field (GRF) ice model which is shown to be more computationally efficient for large sample volumes. The simulated EM images are compared with the gold standard atom-based ice model approach and shown to be directly comparable. Comparison with experimentally acquired data shows the GRF ice model produces realistic simulations. The software required has been implemented in the Parakeet software package and the underlying atomic models are available online for use by the wider community.
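For readers unfamiliar with the general technique, the sketch below shows one common way to sample a Gaussian random field: filter white noise in Fourier space with the square root of a target power spectrum. It is not Parakeet's implementation. The function names and the `toy_spectrum` placeholder are assumptions made for illustration, and the spectrum is not fitted to amorphous ice.

```python
import numpy as np


def gaussian_random_field(shape, power_spectrum, seed=0):
    """Sample a real, zero-mean Gaussian random field on a periodic grid.

    White Gaussian noise is taken to Fourier space, each coefficient is
    scaled by the square root of the target power spectrum, and the
    result is transformed back to real space.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(shape)
    # Radial spatial frequency (cycles per pixel) of every Fourier coefficient.
    freqs = np.meshgrid(*(np.fft.fftfreq(n) for n in shape), indexing="ij")
    q = np.sqrt(sum(f ** 2 for f in freqs))
    amplitude = np.sqrt(power_spectrum(q))
    amplitude.flat[0] = 0.0  # drop the DC term so the field has zero mean
    return np.fft.ifftn(np.fft.fftn(noise) * amplitude).real


def toy_spectrum(q, q0=0.1, width=0.03):
    # HYPOTHETICAL placeholder: a single Gaussian ring in frequency space.
    return np.exp(-0.5 * ((q - q0) / width) ** 2)


ice = gaussian_random_field((64, 64), toy_spectrum, seed=1)
print(ice.shape, float(ice.std()))
```

The cost is dominated by two FFTs over the grid, which is why this kind of model scales more gently with volume than placing every water molecule explicitly.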
|
Nov 2023
|
|
|
|
Jon Agirre, Mihaela Atanasova, Haroldas Bagdonas, Charles B. Ballard, Arnaud Basle, James Beilsten-Edmands, Rafael J. Borges, David G. Brown, J. Javier Burgos-Marmol, John M. Berrisford, Paul S. Bond, Iracema Caballero, Lucrezia Catapano, Grzegorz Chojnowski, Atlanta G. Cook, Kevin D. Cowtan, Tristan I. Croll, Judit É. Debreczeni, Nicholas E. Devenish, Eleanor J. Dodson, Tarik R. Drevon, Paul Emsley, Gwyndaf Evans, Phil R. Evans, Maria Fando, James Foadi, Luis Fuentes-Montero, Elspeth F. Garman, Markus Gerstel, Richard J. Gildea, Kaushik Hatti, Maarten L. Hekkelman, Philipp Heuser, Soon Wen Hoh, Michael A. Hough, Huw T. Jenkins, Elisabet Jiménez, Robbie P. Joosten, Ronan M. Keegan, Nicholas Keep, Eugene B. Krissinel, Petr Kolenko, Oleg Kovalevskiy, Victor S. Lamzin, David M. Lawson, Andrey Lebedev, Andrew G. W. Leslie, Bernhard Lohkamp, Fei Long, Martin Maly, Airlie Mccoy, Stuart J. Mcnicholas, Ana Medina, Claudia Millán, James W. Murray, Garib N. Murshudov, Robert A. Nicholls, Martin E. M. Noble, Robert Oeffner, Navraj S. Pannu, James M. Parkhurst, Nicholas Pearce, Joana Pereira, Anastassis Perrakis, Harold R. Powell, Randy J. Read, Daniel J. Rigden, William Rochira, Massimo Sammito, Filomeno Sanchez Rodriguez, George M. Sheldrick, Kathryn L. Shelley, Felix Simkovic, Adam J. Simpkin, Pavol Skubak, Egor Sobolev, Roberto A. Steiner, Kyle Stevenson, Ivo Tews, Jens M. H. Thomas, Andrea Thorn, Josep Triviño Valls, Ville Uski, Isabel Uson, Alexei Vagin, Sameer Velankar, Melanie Vollmar, Helen Walden, David Waterman, Keith S. Wilson, Martyn Winn, Graeme Winter, Marcin Wojdyr, Keitaro Yamashita
Open Access
Abstract: The Collaborative Computational Project No. 4 (CCP4) is a UK-led international collective with a mission to develop, test, distribute and promote software for macromolecular crystallography. The CCP4 suite is a multiplatform collection of programs brought together by familiar execution routines, a set of common libraries and graphical interfaces. The CCP4 suite has experienced several considerable changes since its last reference article, involving new infrastructure, original programs and graphical interfaces. This article, which is intended as a general literature citation for the use of the CCP4 software suite in structure determination, will guide the reader through such transformations, offering a general overview of the new features and outlining future developments. As such, it aims to highlight the individual programs that comprise the suite and to provide the latest references to them for perusal by crystallographers around the world.
|
Jun 2023
|
|
Krios I-Titan Krios I at Diamond
|
James M. Parkhurst, Adam D. Crawshaw, C. Alistair Siebert, Maud Dumoux, C. David Owen, Pedro Nunes, David Waterman, Thomas Glen, David I. Stuart, James H. Naismith, Gwyndaf Evans
Open Access
Abstract: Three-dimensional electron diffraction (3DED) from nanocrystals of biological macromolecules requires the use of very small crystals. These are typically less than 300 nm thick in the direction of the electron beam due to the strong interaction between electrons and matter. In recent years, focused-ion-beam (FIB) milling has been used in the preparation of thin samples for 3DED. These instruments typically use a gallium liquid metal ion source. Inductively coupled plasma (ICP) sources in principle offer faster milling rates. Little work has been done to quantify the damage these sources cause to delicate biological samples at cryogenic temperatures. Here, an analysis of the effect that milling with plasma FIB (pFIB) instrumentation has on lysozyme crystals is presented. This work evaluates both argon and xenon plasmas and compares them with crystals milled with a gallium source. A milling protocol was employed that utilizes an overtilt to produce wedge-shaped lamellae with a shallow thickness gradient which yielded very thin crystalline samples. 3DED data were then acquired and standard data-processing statistics were employed to assess the quality of the diffraction data. An upper bound to the depth of the pFIB-milling damage layer of between 42.5 and 50 nm is reported, corresponding to half the thickness of the thinnest lamellae that resulted in usable diffraction data. A lower bound of between 32.5 and 40 nm is also reported, based on a literature survey of the minimum amount of diffracting material required for 3DED.
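The bounds quoted above can be cross-checked with simple arithmetic. Because the upper bound is stated as half the thinnest usable lamella thickness, doubling it gives the implied thickness. The second step is our own inference rather than a statement from the abstract: it assumes damage enters from both milled faces and pairs each lower-bound value with the corresponding upper-bound value. The function names are illustrative.

```python
def implied_lamella_thickness(upper_bound_nm: float) -> float:
    """Thinnest usable lamella implied by an upper bound equal to half of it."""
    return 2.0 * upper_bound_nm


def undamaged_core(thickness_nm: float, damage_depth_nm: float) -> float:
    """Material left if a damage layer penetrates from BOTH milled faces
    (an assumption of this sketch, not stated in the abstract)."""
    return thickness_nm - 2.0 * damage_depth_nm


upper_bounds_nm = (42.5, 50.0)  # quoted in the abstract
lower_bounds_nm = (32.5, 40.0)  # quoted in the abstract

for upper, lower in zip(upper_bounds_nm, lower_bounds_nm):
    thickness = implied_lamella_thickness(upper)
    core = undamaged_core(thickness, lower)
    print(f"thickness {thickness:.0f} nm, lower bound {lower} nm "
          f"-> undamaged core {core:.0f} nm")
```

Under these assumptions both pairs leave the same undamaged core, which would be consistent with a single minimum thickness of diffracting material drawn from the literature survey.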
|
May 2023
|
|
|
|
Open Access
Abstract: In cryo-electron tomography (cryo-ET) of biological samples, the quality of tomographic reconstructions can vary depending on the transmission electron microscope (TEM) instrument and data acquisition parameters. In this paper, we present Parakeet, a ‘digital twin’ software pipeline for the assessment of the impact of various TEM experiment parameters on the quality of three-dimensional tomographic reconstructions. The Parakeet digital twin is a digital model that can be used to optimize the performance and utilization of a physical instrument to enable in silico optimization of sample geometries, data acquisition schemes and instrument parameters. The digital twin performs virtual sample generation, TEM image simulation, and tilt series reconstruction and analysis within a convenient software framework. As well as being able to produce physically realistic simulated cryo-ET datasets to aid the development of tomographic reconstruction and subtomogram averaging programs, Parakeet aims to enable convenient assessment of the effects of different microscope parameters and data acquisition parameters on reconstruction quality. To illustrate the use of the software, we present the example of a quantitative analysis of missing wedge artefacts on simulated planar and cylindrical biological samples and discuss how data collection parameters can be modified for cylindrical samples where a full 180° tilt range might be measured.
|
Oct 2021
|
|
B24-Cryo Soft X-ray Tomography
|
Open Access
Abstract: Chlamydiae are strict intracellular pathogens residing within a specialised membrane-bound compartment called the inclusion. Therefore, each infected cell can be considered as a single entity where bacteria form a community within the inclusion. It remains unclear how the population of bacteria within the inclusion influences individual bacteria. The life cycle of Chlamydia involves transitioning between the invasive elementary bodies (EBs) and replicative reticulate bodies (RBs). We have used cryo-soft X-ray tomography to observe individual inclusions, an approach that combines 40 nm spatial resolution and large volume imaging (up to 16 µm). Using a semi-automated segmentation pipeline, we considered each inclusion as an individual bacterial niche. Within each inclusion, we identified and classified different forms of the bacteria and confirmed the recent finding that RBs have a variety of volumes (small, large and abnormal). We demonstrate that the proportions of these different RB forms depend on the bacterial concentration in the inclusion. We conclude that each inclusion operates as an autonomous community that influences the characteristics of individual bacteria within the inclusion.
|
Aug 2021
|
|