|
Rhiju Das, Rachael C. Kretsch, Adam J. Simpkin, Thomas Mulvaney, Phillip Pham, Ramya Rangan, Fan Bu, Ronan M. Keegan, Maya Topf, Daniel J. Rigden, Zhichao Miao, Eric Westhof
Open Access
Abstract: The prediction of RNA three-dimensional structures remains an unsolved problem. Here, we report assessments of RNA structure predictions in CASP15, the first CASP exercise that involved RNA structure modeling. Forty-two predictor groups submitted models for at least one of twelve RNA-containing targets. These models were evaluated by the RNA-Puzzles organizers and, separately, by a CASP-recruited team using metrics (GDT, lDDT) and approaches (Z-score rankings) initially developed for assessment of proteins and generalized here for RNA assessment. The two assessments independently ranked the same predictor groups as first (AIchemy_RNA2), second (Chen), and third (RNAPolis and GeneSilico, tied); predictions from deep learning approaches were significantly worse than those of these top-ranked groups, which did not use deep learning. Further analyses based on direct comparison of predicted models to cryogenic electron microscopy (cryo-EM) maps and X-ray diffraction data support these rankings. With the exception of two RNA-protein complexes, models submitted by CASP15 groups correctly predicted the global fold of the RNA targets. Comparisons of CASP15 submissions to designed RNA nanostructures as well as molecular replacement trials highlight the potential utility of current RNA modeling approaches for RNA nanotechnology and structural biology, respectively. Nevertheless, challenges remain in modeling fine details such as noncanonical pairs, in ranking among submitted models, and in prediction of multiple structures resolved by cryo-EM or crystallography.
|
Oct 2023
|
|
|
Open Access
Abstract: The results of tertiary structure assessment at CASP15 are reported. For the first time, recognizing the outstanding performance of AlphaFold 2 (AF2) at CASP14, all single-chain predictions were assessed together, irrespective of whether a template was available. At CASP15, there was no single stand-out group, with most of the best-scoring groups, led by PEZYFoldings, UM-TBM, and Yang Server, employing AF2 in one way or another. Many top groups paid special attention to generating deep Multiple Sequence Alignments (MSAs) and testing variant MSAs, thereby allowing them to successfully address some of the hardest targets. Such difficult targets, as well as lacking templates, were typically proteins with few homologues. Local divergence between prediction and target correlated with localization at crystal lattice or chain interfaces, and with regions exhibiting high B-factors in crystal structure targets, and should not necessarily be considered as representing error in the prediction. However, analysis of exposed and buried side chain accuracy showed room for improvement even in the latter. Nevertheless, a majority of groups produced high-quality predictions for most targets, which are valuable for experimental structure determination, functional analysis, and many other tasks across biology. These include those applying methods similar to those used to generate major resources such as the AlphaFold Protein Structure Database and the ESM Metagenomic Atlas: the confidence estimates of the former were also notably accurate.
|
Sep 2023
|
|
|
Jon Agirre, Mihaela Atanasova, Haroldas Bagdonas, Charles B. Ballard, Arnaud Basle, James Beilsten-Edmands, Rafael J. Borges, David G. Brown, J. Javier Burgos-Marmol, John M. Berrisford, Paul S. Bond, Iracema Caballero, Lucrezia Catapano, Grzegorz Chojnowski, Atlanta G. Cook, Kevin D. Cowtan, Tristan I. Croll, Judit É. Debreczeni, Nicholas E. Devenish, Eleanor J. Dodson, Tarik R. Drevon, Paul Emsley, Gwyndaf Evans, Phil R. Evans, Maria Fando, James Foadi, Luis Fuentes-Montero, Elspeth F. Garman, Markus Gerstel, Richard J. Gildea, Kaushik Hatti, Maarten L. Hekkelman, Philipp Heuser, Soon Wen Hoh, Michael A. Hough, Huw T. Jenkins, Elisabet Jiménez, Robbie P. Joosten, Ronan M. Keegan, Nicholas Keep, Eugene B. Krissinel, Petr Kolenko, Oleg Kovalevskiy, Victor S. Lamzin, David M. Lawson, Andrey Lebedev, Andrew G. W. Leslie, Bernhard Lohkamp, Fei Long, Martin Maly, Airlie Mccoy, Stuart J. Mcnicholas, Ana Medina, Claudia Millán, James W. Murray, Garib N. Murshudov, Robert A. Nicholls, Martin E. M. Noble, Robert Oeffner, Navraj S. Pannu, James M. Parkhurst, Nicholas Pearce, Joana Pereira, Anastassis Perrakis, Harold R. Powell, Randy J. Read, Daniel J. Rigden, William Rochira, Massimo Sammito, Filomeno Sanchez Rodriguez, George M. Sheldrick, Kathryn L. Shelley, Felix Simkovic, Adam J. Simpkin, Pavol Skubak, Egor Sobolev, Roberto A. Steiner, Kyle Stevenson, Ivo Tews, Jens M. H. Thomas, Andrea Thorn, Josep Triviño Valls, Ville Uski, Isabel Uson, Alexei Vagin, Sameer Velankar, Melanie Vollmar, Helen Walden, David Waterman, Keith S. Wilson, Martyn Winn, Graeme Winter, Marcin Wojdyr, Keitaro Yamashita
Open Access
Abstract: The Collaborative Computational Project No. 4 (CCP4) is a UK-led international collective with a mission to develop, test, distribute and promote software for macromolecular crystallography. The CCP4 suite is a multiplatform collection of programs brought together by familiar execution routines, a set of common libraries and graphical interfaces. The CCP4 suite has experienced several considerable changes since its last reference article, involving new infrastructure, original programs and graphical interfaces. This article, which is intended as a general literature citation for the use of the CCP4 software suite in structure determination, will guide the reader through such transformations, offering a general overview of the new features and outlining future developments. As such, it aims to highlight the individual programs that comprise the suite and to provide the latest references to them for perusal by crystallographers around the world.
|
Jun 2023
|
|
|
Open Access
Abstract: Determination of protein structures typically entails building a model that satisfies the collected experimental observations and its deposition in the Protein Data Bank. Experimental limitations can lead to unavoidable uncertainties during the process of model building, which result in the introduction of errors into the deposited model. Many metrics are available for model validation, but most are limited to consideration of the physico-chemical aspects of the model or its match to the experimental data. The latest advances in the field of deep learning have enabled the increasingly accurate prediction of inter-residue distances, an advance which has played a pivotal role in the recent improvements observed in the field of protein ab initio modelling. Here, new validation methods are presented based on the use of these precise inter-residue distance predictions, which are compared with the distances observed in the protein model. Sequence-register errors are particularly clearly detected and the register shifts required for their correction can be reliably determined. The method is available in the ConKit package (https://www.conkit.org).
|
Dec 2022
|
|
I04-Macromolecular Crystallography
|
Olga V. Moroz, Elena Blagova, Andrey A. Lebedev, Filomeno Sanchez Rodriguez, Daniel J. Rigden, Jeppe Wegener Tams, Reinhard Wilting, Jan Kjølhede Vester, Emily Longhi, Gustav Hammerich Hansen, Kristian Bertel Rømer Mørkeberg Krogh, Roland A. Pache, Gideon Davies, Keith S. Wilson
Diamond Proposal Number(s): [18598]
Abstract: β-Galactosidases catalyse the hydrolysis of lactose into galactose and glucose; as an alternative reaction, some β-galactosidases also catalyse the formation of galactooligosaccharides by transglycosylation. Both reactions have industrial importance: lactose hydrolysis is used to produce lactose-free milk, while galactooligosaccharides have been shown to act as prebiotics. For some multi-domain β-galactosidases, the hydrolysis/transglycosylation ratio can be modified by the truncation of carbohydrate-binding modules. Here, an analysis of BbgIII, a multidomain β-galactosidase from Bifidobacterium bifidum, is presented. The X-ray structure of an intact protein corresponding to a gene construct of eight domains has been determined. The use of evolutionary covariance-based predictions made sequence docking in low-resolution areas of the model spectacularly easy, confirming the relevance of this rapidly developing deep-learning-based technique for model building. The structure revealed two alternative orientations of the CBM32 carbohydrate-binding module relative to the GH2 catalytic domain in the six crystallographically independent chains. In one orientation the CBM32 domain covers the entrance to the active site of the enzyme, while in the other orientation the active site is open, suggesting a possible mechanism for switching between the two activities of the enzyme, namely lactose hydrolysis and transgalactosylation. The location of the carbohydrate-binding site of the CBM32 domain on the opposite side of the module to where it comes into contact with the catalytic GH2 domain is consistent with its involvement in adherence to host cells. The role of the CBM32 domain in switching between hydrolysis and transglycosylation modes offers protein-engineering opportunities for selective β-galactosidase modification for industrial purposes in the future.
|
Dec 2021
|
|
I03-Macromolecular Crystallography
|
Diamond Proposal Number(s): [12342]
Abstract: Insect juvenile hormones (JHs) are a family of sesquiterpenoid molecules that are secreted into the haemolymph. JHs have multiple roles in insect development, metamorphosis and sexual maturation. A number of pesticides work by chemically mimicking JHs, thus preventing insects from developing and reproducing normally. The haemolymph levels of JH are governed by the rates of its biosynthesis and degradation. One enzyme involved in JH catabolism is JH diol kinase (JHDK), which uses ATP (or GTP) to phosphorylate JH diol to JH diol phosphate, which can be excreted. The X-ray structure of JHDK from the silkworm Bombyx mori has been determined at a resolution of 2.0 Å with an R factor of 19.0% and an Rfree of 24.8%. The structure possesses three EF-hand motifs which are occupied by calcium ions. This is in contrast to the recently reported structure of the JHDK-like-2 protein from B. mori (PDB entry 6kth), which possessed only one calcium ion. Since JHDK is known to be inhibited by calcium ions, it is likely that our structure represents the calcium-inhibited form of the enzyme. The electrostatic surface of the protein suggests a binding site for the triphosphate of ATP close to the N-terminal end of the molecule in a cavity between the N- and C-terminal domains. Superposition with a number of calcium-activated photoproteins suggests that there may be parallels between the binding of JH diol to JHDK and the binding of luciferin to aequorin.
|
Dec 2021
|
|
|
Open Access
Abstract: We report here an assessment of the model refinement category of the 14th round of Critical Assessment of Structure Prediction (CASP14). As before, predictors submitted up to five ranked refinements, along with associated residue-level error estimates, for targets that had a wide range of starting quality. The ability of groups to accurately rank their submissions and to predict coordinate error varied widely. Overall only four groups out-performed a “naïve predictor” corresponding to resubmission of the starting model. Among the top groups there are interesting differences of approach and in the spread of improvements seen: some methods are more conservative, others more adventurous. Some targets were “double-barrelled” for which predictors were offered a high-quality AlphaFold 2 (AF2)-derived prediction alongside another of lower quality. The AF2-derived models were largely unimprovable, many of their apparent errors being found to reside at domain and, especially, crystal lattice contacts. Refinement is shown to have a mixed impact overall on structure-based function annotation methods to predict nucleic acid binding, spot catalytic sites and dock protein structures.
|
Jul 2021
|
|
|
Open Access
Abstract: Covariance-based predictions of residue contacts and inter-residue distances are an increasingly popular data type in protein bioinformatics. Here we present ConPlot, a web-based application for convenient display and analysis of contact maps and distograms. Integration of predicted contact data with other predictions is often required to facilitate inference of structural features. ConPlot can therefore use the empty space near the contact map diagonal to display multiple coloured tracks representing other sequence-based predictions. Popular file formats are natively read and bespoke data can also be flexibly displayed. This novel visualisation will enable easier interpretation of predicted contact maps.
|
Jan 2021
|
|
|
Open Access
Abstract: The conventional approach in molecular replacement is the use of a related structure as a search model. However, this is not always possible as the availability of such structures can be scarce for poorly characterized families of proteins. In these cases, alternative approaches can be explored, such as the use of small ideal fragments that share high, albeit local, structural similarity with the unknown protein. Earlier versions of AMPLE enabled the trialling of a library of ideal helices, which worked well for largely helical proteins at suitable resolutions. Here, the performance of libraries of helical ensembles created by clustering helical segments is explored. The impacts of different B-factor treatments and different degrees of structural heterogeneity are explored. A 30% increase in the number of solutions obtained by AMPLE was observed when using this new set of ensembles compared with the performance with ideal helices. The boost in performance was notable across three different fold classes: transmembrane, globular and coiled-coil structures. Furthermore, the increased effectiveness of these ensembles was coupled to a reduction in the time required by AMPLE to reach a solution. AMPLE users can now take full advantage of this new library of search models by activating the 'helical ensembles' mode.
|
Oct 2020
|
|
I02-Macromolecular Crystallography
I04-1-Macromolecular Crystallography (fixed wavelength)
I04-Macromolecular Crystallography
|
Diamond Proposal Number(s): [8997, 7146]
Abstract: Shiga toxin-encoding bacteriophages transfer Shiga toxin genes to Escherichia coli and are responsible for the emergence of pathogenic bacterial strains that cause severe foodborne human diseases. Gene vb_24B_21 is the most highly conserved gene across sequenced Shiga bacteriophages. Protein vb_24B_21 (also termed 933Wp42 and NanS-p) is a carbohydrate esterase with homology to the E. coli chromosomally encoded NanS that deacetylates sialic acid in the intestinal mucus. To assist the functional characterization of vb_24B_21, we have studied its molecular structure by homology modelling its esterase domain and by elucidating the crystal structure of its uncharacterized C-terminal domain at the atomic resolution of 0.97 Å. Our modelling confirms that NanS from the E. coli host is the closest structurally characterized homolog to the esterase domain of vb_24B_21. Like NanS, vb_24B_21 has an atypical active site, comprising a simple catalytic dyad Ser-His and a divergent oxyanion hole. The crystal structure of the C-terminal domain reveals a lectin-like, jelly-roll β-sandwich fold. The domain displays a prominent cleft that bioinformatics analysis predicts to be a carbohydrate binding site without catalytic properties. In summary, our study indicates that vb_24B_21 is a NanS-like atypical esterase that is assisted by a carbohydrate-binding module of yet undetermined binding specificity.
|
Aug 2020
|
|