
RefXAS: an open access database of X-ray absorption spectra

Under DAPHNE4NFDI, the X-ray absorption spectroscopy (XAS) reference database, RefXAS, has been set up. For this purpose, we developed a method to enable users to submit a raw dataset, with its associated metadata, via a dedicated website for inclusion in the database. Implementation of the database includes an upload of metadata to the scientific catalogue and an upload of files via object storage, with automated query capabilities through a web server and visualization of the data and files. Based on the mode of measurements, quality criteria have been formulated for the automated check of any uploaded data. In the present work, the significant metadata fields for reusability, as well as reproducibility of results (FAIR data principles), are discussed. Quality criteria for the data uploaded to the database have been formulated and assessed. Moreover, the usability and interoperability of available XAS data/file formats have been explored. The first version of the RefXAS database prototype is presented, which features a human verification procedure, currently being tested with a new user interface designed specifically for curators; a user-friendly landing page; a full list of datasets; advanced search capabilities; a streamlined upload process; and, finally, a server-side automatic authentication and (meta-) data storage via MongoDB, PostgreSQL and (data-) files via relevant APIs.





A distributed software system for integrating data-intensive imaging methods in a hard X-ray nanoprobe beamline at the SSRF

The development of hard X-ray nanoprobe techniques has given rise to a number of experimental methods, like nano-XAS, nano-XRD, nano-XRF, ptychography and tomography. Each method has its own unique data processing algorithms. With the increase in data acquisition rate, the large amount of generated data is now a big challenge to these algorithms. In this work, an intuitive, user-friendly software system is introduced to integrate and manage these algorithms; by taking advantage of the loosely coupled, component-based design approach of the system, the data processing speed of the imaging algorithm is enhanced through optimization of the parallelism efficiency. This study provides meaningful solutions to tackle complexity challenges faced in synchrotron data processing.





Comparing single-shot damage thresholds of boron carbide and silicon at the European XFEL

X-ray free-electron lasers (XFELs) enable experiments that would have been impractical or impossible at conventional X-ray laser facilities. Indeed, more XFEL facilities are being built and planned, with the aim of delivering larger pulse energies and higher peak brilliance. While seeking to increase the pulse power, it is essential to consider the maximum pulse fluence that a grazing-incidence FEL mirror can withstand. To address this issue, several studies were conducted on grazing-incidence damage by soft X-ray FEL pulses at the European XFEL facility. Boron carbide (B4C) coatings on polished silicon substrate were investigated using 1 keV photon energy, similar to the X-ray mirrors currently installed at the soft X-ray beamlines (SASE3). The purpose of this study is to compare the damage threshold of B4C and Si to determine the advantages, tolerance and limits of using B4C coatings.





Methyl 1-(4-fluoro­benz­yl)-1H-indazole-3-carboxyl­ate

The title compound, C16H13FN2O2, was synthesized by nucleophilic substitution of the indazole N—H hydrogen atom of methyl 1H-indazole-3-carboxyl­ate with 1-(bromo­meth­yl)-4-fluoro­benzene. In the crystal, some hydrogen-bond-like inter­actions are observed.





Bis[1,3-bis­(2,4,6-tri­methyl­phen­yl)imidazolium] bis(μ-cis-1,2-di­phenyl­ethene-1,2-di­thiol­ato-κ2S,S':κS)bis­[(cis-1,2-di­phenyl­ethene-1,2-di­thiol­ato-κ2S,S')iron(III)] di­methyl&

The mol­ecular structure of the solvated title salt, (C21H25N2)2[Fe2(C14H10S2)4]·2C3H7NO, reveals that the anion is situated on a crystallographic inversion center in the triclinic space group $P\overline{1}$. The title compound crystallizes utilizing a network of weak π-stacking inter­actions of phenyl rings pertaining to the di­thiol­ene unit. Moreover, the acidic imidazolium H atoms [N—C(H)—N] display non-classical hydrogen-bonding inter­actions of the C—H⋯O type to the oxygen atoms of the N,N-dimethylformamide solvent, and hydrogen atoms on the backbone of imidazolium rings display weak C—H⋯S inter­actions with the di­thiol­ene sulfur atoms.





Bis(2-hy­droxy-2,3-di­hydro-1H-inden-1-aminium) tetra­chlorido­palladate(II) hemihydrate

A new square-planar palladium complex salt hydrate, (C9H12NO)2[PdCl4]·0.5H2O, has been characterized. The asymmetric unit of the complex salt comprises two [PdCl4]2− dianions, four 2-hy­droxy-2,3-di­hydro-1H-inden-1-aminium cations, each derived from (1R,2S)-(+)-1-amino­indan-2-ol, and one water mol­ecule of crystallization. In the crystal, a two-dimensional layer parallel to (001) features a number of O—H⋯O, N—H⋯O, O—H⋯Cl and N—H⋯Cl hydrogen bonds.





Bis[2,6-bis­(benzimidazol-2-yl)pyridine-κ3N,N',N'']nickel(II) bis­(tri­fluoro­methane­sulfonate) diethyl ether monosolvate

In the title complex, [Ni(C19H13N5)2](CF3SO3)2·(CH3CH2)2O, the central NiII atom is sixfold coordinated by three nitro­gen atoms of each 2,6-bis­(2-benzimidazol­yl)pyridine ligand in a distorted octa­hedral geometry with two tri­fluoro­methane­sulfonate ions and a mol­ecule of diethyl ether completing the outer coordination sphere of the complex. Hydrogen bonding contributes to the organization of the asymmetric units in columns along the a axis generating a porous supra­molecular structure. The structure was refined as a two-component twin with a refined BASF value of 0.4104 (13).





Bis[2,6-bis­(1H-benzimidazol-2-yl)pyridine]ruthenium(II) bis(hexa­fluorido­phosphate) diethyl ether tris­olvate

The title compound, [Ru(C19H13N5)2](PF6)2·3C4H10O, was obtained from the reaction of Ru(bimpy)Cl3 [bimpy is 2,6-bis­(1H-benzimidazol-2-yl)pyridine] and bimpy in refluxing ethanol followed by recrystallization from diethyl ether/aceto­nitrile. At 125 K the complex has ortho­rhom­bic (Pca21) symmetry. The structure is almost centrosymmetric; however, refinement in space group Pbcn leads to disorder and clearly worse results. The complex is of inter­est with respect to the potential catalytic reduction of CO2. The structure displays N—H⋯O and N—H⋯F hydrogen bonding as well as significant π–π stacking and C—H⋯π inter­actions.





Redetermination of germacrone type II based on single-crystal X-ray data

The extraction and purification procedures, crystallization and crystal structure refinement (single-crystal X-ray data) of germacrone type II, C15H22O, are presented. The structural results are compared with a previous powder X-ray synchrotron study [Kaduk et al. (2022). Powder Diffr. 37, 98–104], revealing significant improvements in terms of accuracy and precision. Hirshfeld atom refinement (HAR), as well as Hirshfeld surface analysis, give insight into the inter­molecular inter­actions of germacrone type II.





(SC,RS)-Bromido­(N-{4-methyl-1-[(4-methyl­phenyl)sul­fan­yl]­pentan-2-yl}-N'-(pyridin-2-yl)imidazol-2-yl­idene)palladium(II) bromide

The mol­ecule of the title NCNHCS pincer N-heterocyclic carbene palladium(II) complex, [PdBr(C21H25N3S)]Br, exhibits a slightly distorted square-planar coordination at the palladium(II) atom, with the five-membered chelate ring nearly planar. The six-membered chelate ring adopts an envelope conformation. Upon chelation, the sulfur atom becomes a stereogenic centre with an RS configuration induced by the chiral carbon of the precursor imidazolium salt. There are intra­molecular C—H⋯Br—Pd hydrogen bonds in the structure. The two inter­stitial Br atoms, as the counter-anion of the structure, are both located on crystallographic twofold axes and are connected to the complex cations via C—H⋯Br hydrogen bonds.





trans-Di­chlorido­bis­(secnidazole-κN3)copper(II)

The use of acetic acid (HOAc) in a reaction between CuCl2·2H2O and secnid­azole, an active pharmaceutical ingredient useful in the treatment against a variety of anaerobic Gram-positive and Gram-negative bacteria, affords the title complex, [CuCl2(C7H11N3O3)2]. This compound was previously synthesized using ethanol as solvent, although its crystal structure was not reported [Betanzos-Lara et al. (2013). Inorg. Chim. Acta, 397, 94–100]. In the mol­ecular complex, the Cu2+ cation is situated at an inversion centre and displays a square-planar coordination environment. There is a hydrogen-bonded framework based on inter­molecular O—H⋯Cl inter­actions, characterized by H⋯Cl separations of 2.28 (4) Å and O—H⋯Cl angles of 175 (3)°. The resulting supra­molecular network is based on R22(18) ring motifs, forming chains in the [010] direction.





Octa­kis(di­butyl­ammonium) deca­molybdate(VI)

In the title salt, (C8H20N)8[Mo10O34], the [Mo10O34]8− polyanion is located about an inversion centre and can be considered as a β-type octa­molybdate anion to which two additional MoO4 tetra­hedra are linked via common corners. The [Mo10O34]8− polyanions are packed in rows extending parallel to [001] and are connected to the di­butyl­ammonium counter-cations through N—H⋯O hydrogen-bonding inter­actions.





(2,5-Di­methyl­imidazole){N,N',N'',N'''-[porphyrin-5,10,15,20-tetra­yltetra­(2,1-phenyl­ene)]tetra­kis(pyridine-3-carboxamide)}manganese(II) chloro­benzene disolvate

In the title compound, [Mn(C68H44N12O4)(C5H8N2)]·2C6H5Cl, the central MnII ion is coordinated by four pyrrole N atoms of the porphyrin core in the basal sites and one N atom of the 2,5-di­methyl­imidazole ligand in the apical site. Two chloro­benzene solvent mol­ecules are also present in the asymmetric unit. Due to the apical imidazole ligand, the Mn atom is displaced out of the 24-atom porphyrin mean plane by 0.66 Å. The average Mn—Np (p = porphyrin) bond length is 2.143 (8) Å, and the axial Mn—NIm (Im = 2,5-di­methyl­imidazole) bond length is 2.171 (8) Å. The structure displays inter­molecular and intra­molecular N—H⋯O, N—H⋯N, C—H⋯O and C—H⋯N hydrogen bonding. The crystal studied was refined as a two-component inversion twin.





Bis(ethyl­enedi­ammonium) μ-ethyl­enedi­aminetetra­acetato-1κ3O,N,O':2κ3O'',N',O'''-bis­[tri­oxidomolybdate(VI)] tetra­hydrate

The title compound, (C2H10N2)2[(C10H12N2O8)(MoO3)2]·4H2O, which crystallizes in the monoclinic C2/c space group, was obtained by mixing molybdenum oxide, ethyl­enedi­amine and ethyl­enedi­amine­tetra­acetic acid (H4edta) in a 2:4:1 ratio. The complex anion contains two MoO3 units bridged by an edta4− anion. The midpoint of the central C—C bond of the edta4− anion is located on a crystallographic inversion centre. The independent Mo atom is tridentately coordin­ated by a nitro­gen atom and two carboxyl­ate groups of the edta4− ligand, together with the three oxo ligands, producing a distorted octa­hedral coordination environment. In the three-dimensional supra­molecular crystal structure, the dinuclear anions, the organo­ammonium counter-ions and the solvent water mol­ecules are linked by N—H⋯Ow, N—H⋯Oedta and O—H⋯O hydrogen bonds.





μ-Chlorido-bis­{[1-benzyl-3-(2,4,6-tri­methyl­phen­yl)imidazol-2-yl­idene-κC]silver(I)} chloride 1,2-di­chloro­ethane hemisolvate

The title compound, [Ag2(C19H20N2)4]Cl·0.5C2H4Cl2, can be readily generated by treatment of 1-benzyl-3-(2,4,6-tri­methyl­phen­yl)imidazolium chloride with sodium bis­(tri­methyl­sil­yl)amide followed by silver chloride. The mol­ecular structure of the compound was confirmed using NMR spectroscopy and single-crystal X-ray diffraction analysis. The crystal structure of the title compound at 110 K has monoclinic (P21/c) symmetry. The silver compound presented here is of inter­est with respect to its anti­bacterial properties, and the structure displays a series of weak inter­molecular hydrogen-bonding inter­actions with the chloride counter-anion.





Bis[2-(isoquinolin-1-yl)phenyl-κ2N,C1](2-phenyl-1H-imidazo[4,5-f][1,10]phenanthroline-κ2N,N')iridium(III) hexa­fluorido­phosphate methanol monosolvate

The title compound, [Ir(C15H10N)2(C19H12N4)]PF6·CH3OH, crystallizes in the C2/c space group with one monocationic iridium complex, one hexa­fluorido­phosphate anion, and one methanol solvent mol­ecule of crystallization in the asymmetric unit, all in general positions. The anion and solvent are linked to the iridium complex cation via hydrogen bonding. All bond lengths and angles fall into expected ranges compared to similar compounds.





Redetermined structure of methyl 3-{4,4-di­fluoro-2-[2-(methoxy­car­bon­yl)­ethyl]-1,3,5,7-tetra­methyl-4-bora-3a,4a-di­aza-s-in­da­cen-6-yl}pro­pion­ate

In the title compound, C21H27BF2N2O4, a highly fluorescent boron–dipyrromethene dye, the methyl­propionate moieties have different conformations. In the crystal, weak C—H⋯F and C—H⋯O inter­actions link the mol­ecules. Some optical properties are presented.





4-Bromo-N,N'-di­phenyl­benzimidamide N'-oxide

The title compound, C19H15BrN2O, crystallizes with two similar mol­ecules in the asymmetric unit. The extended structure features dimers linked by pairs of N—H⋯O and C—H⋯O hydrogen bonds. The HNCNO moiety of the title compound shows delocalization over the N—C—N part, as evidenced by the similar C—N bond distances.





Crystal structure elucidation of a geminal and vicinal bis­(tri­fluoro­methane­sulfonate) ester

Geminal and vicinal bis­(tri­fluoro­methane­sulfonate) esters are highly reactive alkyl­ene synthons used as potent electrophiles in the macrocyclization of imid­azoles and the transformation of bipyridines to diquat derivatives via nucleophilic substitution reactions. Herein we report the crystal structures of methyl­ene (C3H2F6O6S2) and ethyl­ene bis­(tri­fluoro­methane­sulfonate) (C4H4F6O6S2), the first examples of a geminal and vicinal bis­(tri­fluoro­methane­sulfonate) ester characterized by single-crystal X-ray diffraction (SC-XRD). With melting points slightly below ambient temperature, both reported bis­(tri­fluoro­methane­sulfonate)s are air- and moisture-sensitive oils and were crys­tallized at 277 K to afford two-com­ponent non-merohedrally twinned crystals. The dominant inter­actions present in both com­pounds are non-classical C—H⋯O hydrogen bonds and inter­molecular C—F⋯F—C inter­actions between tri­fluoro­methyl groups. Mol­ecular electrostatic potential (MEP) cal­culations by DFT-D3 helped to qu­antify the polarity between O⋯H and F⋯F contacts to rationalize the self-sorting of both bis­(tri­fluoro­methane­sulfonate) esters in polar (non-fluorous) and non-polar (fluorous) domains within the crystal structure.





Data collection is your last experiment





TAAM refinement on high-resolution experimental and simulated 3D ED/MicroED data for organic mol­ecules

3D electron diffraction (3D ED), or microcrystal electron diffraction (MicroED), has become an alternative technique for determining the high-resolution crystal structures of compounds from sub-micron-sized crystals. Here, we considered l-alanine, α-glycine and urea, which are known to form good-quality crystals, and collected high-resolution 3D ED data on our in-house TEM instrument. In this study, we present a comparison of independent atom model (IAM) and transferable aspherical atom model (TAAM) kinematical refinement against experimental and simulated data. TAAM refinement on both experimental and simulated data clearly improves the model fitting statistics (R factors and residual electrostatic potential) compared to IAM refinement. This shows that TAAM better represents the experimental electrostatic potential of organic crystals than IAM. Furthermore, we compared the geometrical parameters and atomic displacement parameters (ADPs) resulting from the experimental refinements with the simulated refinements, with the periodic density functional theory (DFT) calculations and with published X-ray and neutron crystal structures. The TAAM refinements on the 3D ED data did not improve the accuracy of the bond lengths between the non-H atoms. The experimental 3D ED data provided more accurate H-atom positions than the IAM refinements on the X-ray diffraction data. The IAM refinements against 3D ED data had a tendency to lead to slightly longer X—H bond lengths than TAAM, but the difference was statistically insignificant. Atomic displacement parameters were too large by tens of percent for l-alanine and α-glycine. Most probably, other unmodelled effects were causing this behaviour, such as radiation damage or dynamical scattering.
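
The R factors cited above as model-fitting statistics are commonly of the conventional form R1 = Σ||Fo| − |Fc|| / Σ|Fo|. The following is a minimal sketch assuming that form and amplitudes that are already on a common scale; the function name and the sample values are illustrative and are not taken from the study.

```python
def r_factor(f_obs, f_calc):
    """Conventional R1 = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes both lists are in the same reflection order and that
    f_calc has already been scaled to f_obs (an assumption of this sketch).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Illustrative amplitudes only (not experimental data).
print(r_factor([10.0, 20.0], [9.0, 22.0]))
```

A lower value indicates closer agreement between the model and the data, which is the sense in which one refinement can be said to improve the fitting statistics over another.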





Deep residual networks for crystallography trained on synthetic data

The use of artificial intelligence to process diffraction images is challenged by the need to assemble large and precisely designed training data sets. To address this, a codebase called Resonet was developed for synthesizing diffraction data and training residual neural networks on these data. Here, two per-pattern capabilities of Resonet are demonstrated: (i) interpretation of crystal resolution and (ii) identification of overlapping lattices. Resonet was tested across a compilation of diffraction images from synchrotron experiments and X-ray free-electron laser experiments. Crucially, these models readily execute on graphics processing units and can thus significantly outperform conventional algorithms. While Resonet is currently utilized to provide real-time feedback for macromolecular crystallography users at the Stanford Synchrotron Radiation Lightsource, its simple Python-based interface makes it easy to embed in other processing frameworks. This work highlights the utility of physics-based simulation for training deep neural networks and lays the groundwork for the development of additional models to enhance diffraction collection and analysis.





A web-based dashboard for RELION metadata visualization

Cryo-electron microscopy (cryo-EM) has witnessed radical progress in the past decade, driven by developments in hardware and software. While current software packages include processing pipelines that simplify the image-processing workflow, they do not prioritize the in-depth analysis of crucial metadata, limiting troubleshooting for challenging data sets. The widely used RELION software package lacks a graphical native representation of the underlying metadata. Here, two web-based tools are introduced: relion_live.py, which offers real-time feedback on data collection, aiding swift decision-making during data acquisition, and relion_analyse.py, a graphical interface to represent RELION projects by plotting essential metadata including interactive data filtration and analysis. A useful script for estimating ice thickness and data quality during movie pre-processing is also presented. These tools empower researchers to analyse data efficiently and allow informed decisions during data collection and processing.





Advanced exploitation of unmerged reflection data during processing and refinement with autoPROC and BUSTER

The validation of structural models obtained by macromolecular X-ray crystallography against experimental diffraction data, whether before deposition into the PDB or after, is typically carried out exclusively against the merged data that are eventually archived along with the atomic coordinates. It is shown here that the availability of unmerged reflection data enables valuable additional analyses to be performed that yield improvements in the final models, and tools are presented to implement them, together with examples of the results to which they give access. The first example is the automatic identification and removal of image ranges affected by loss of crystal centering or by excessive decay of the diffraction pattern as a result of radiation damage. The second example is the `reflection-auditing' process, whereby individual merged data items showing especially poor agreement with model predictions during refinement are investigated thanks to the specific metadata (such as image number and detector position) that are available for the corresponding unmerged data, potentially revealing previously undiagnosed instrumental, experimental or processing problems. The third example is the calculation of so-called F(early) − F(late) maps from carefully selected subsets of unmerged amplitude data, which can not only highlight the location and extent of radiation damage but can also provide guidance towards suitable fine-grained parametrizations to model the localized effects of such damage.
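
The idea behind an F(early) − F(late) comparison can be sketched as follows. This is only an illustration of why unmerged data are needed (the image number of each observation must be known); it is not the autoPROC or BUSTER implementation. The function name, the tuple layout and the single split point are assumptions of the sketch, and scaling, error weighting and map calculation are omitted.

```python
from collections import defaultdict
from math import sqrt


def early_late_differences(observations, split_image):
    """Return {hkl: F_early - F_late} for reflections seen in both subsets.

    observations: iterable of (hkl, image_number, intensity) tuples,
    i.e. unmerged data that still carry their image number.
    split_image: last image number counted as "early" (hypothetical choice).
    """
    early, late = defaultdict(list), defaultdict(list)
    for hkl, image, intensity in observations:
        (early if image <= split_image else late)[hkl].append(intensity)

    def amplitude(intensities):
        # Merge by simple averaging, then convert intensity to amplitude.
        mean_i = sum(intensities) / len(intensities)
        return sqrt(max(mean_i, 0.0))

    common = early.keys() & late.keys()
    return {hkl: amplitude(early[hkl]) - amplitude(late[hkl]) for hkl in common}


# Illustrative observations only (not experimental data).
obs = [((1, 0, 0), 1, 100.0), ((1, 0, 0), 2, 100.0),
       ((1, 0, 0), 90, 64.0), ((0, 1, 0), 1, 25.0)]
print(early_late_differences(obs, split_image=45))
```

Reflections measured in only one subset are dropped because no difference can be formed for them, which is one reason the subsets need to be selected carefully.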





Tomo Live: an on-the-fly reconstruction pipeline to judge data quality for cryo-electron tomography workflows

Data acquisition and processing for cryo-electron tomography can be a significant bottleneck for users. To simplify and streamline the cryo-ET workflow, Tomo Live, an on-the-fly solution that automates the alignment and reconstruction of tilt-series data, enabling real-time data-quality assessment, has been developed. Through the integration of Tomo Live into the data-acquisition workflow for cryo-ET, motion correction is performed directly after each of the acquired tilt angles. Immediately after the tilt-series acquisition has completed, an unattended tilt-series alignment and reconstruction into a 3D volume is performed. The results are displayed in real time in a dedicated remote web platform that runs on the microscope hardware. Through this web platform, users can review the acquired data (aligned stack and 3D volume) and several quality metrics that are obtained during the alignment and reconstruction process. These quality metrics can be used for fast feedback for subsequent acquisitions to save time. Parameters such as Alignment Accuracy, Deleted Tilts and Tilt Axis Correction Angle are visualized as graphs and can be used as filters to export only the best tomograms (raw data, reconstruction and intermediate data) for further processing. Here, the Tomo Live algorithms and workflow are described and representative results on several biological samples are presented. The Tomo Live workflow is accessible to both expert and non-expert users, making it a valuable tool for the continued advancement of structural biology, cell biology and histology.





Efficient in situ screening of and data collection from microcrystals in crystallization plates

A considerable bottleneck in serial crystallography at XFEL and synchrotron sources is the efficient production of large quantities of homogenous, well diffracting microcrystals. Efficient high-throughput screening of batch-grown microcrystals and the determination of ground-state structures from different conditions is thus of considerable value in the early stages of a project. Here, a highly sample-efficient methodology to measure serial crystallography data from microcrystals by raster scanning within standard in situ 96-well crystallization plates is described. Structures were determined from very small quantities of microcrystal suspension and the results were compared with those from other sample-delivery methods. The analysis of a two-dimensional batch crystallization screen using this method is also described as a useful guide for further optimization and the selection of appropriate conditions for scaling up microcrystallization.





A database overview of metal-coordination distances in metalloproteins

Metalloproteins are ubiquitous in all living organisms and take part in a very wide range of biological processes. For this reason, their experimental characterization is crucial to obtain improved knowledge of their structure and biological functions. The three-dimensional structure represents highly relevant information since it provides insight into the interaction between the metal ion(s) and the protein fold. Such interactions determine the chemical reactivity of the bound metal. The available PDB structures can contain errors due to experimental factors such as poor resolution and radiation damage. A lack of use of distance restraints during the refinement and validation process also impacts the structure quality. Here, the aim was to obtain a thorough overview of the distribution of the distances between metal ions and their donor atoms through the statistical analysis of a data set based on more than 115 000 metal-binding sites in proteins. This analysis not only produced reference data that can be used by experimentalists to support the structure-determination process, for example as refinement restraints, but also resulted in an improved insight into how protein coordination occurs for different metals and the nature of their binding interactions. In particular, the features of carboxylate coordination were inspected, which is the only type of interaction that is commonly present for nearly all metals.
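
As a rough illustration of how such reference data might be derived, the sketch below groups metal–donor distances by (metal, donor) pair and reports the count, mean and sample standard deviation, which could then serve as a restraint target and tolerance. The record layout, the function name and the distances are hypothetical placeholders and do not come from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def distance_statistics(records):
    """Summarize distances per (metal, donor atom) pair.

    records: iterable of (metal, donor, distance_in_angstrom) tuples.
    Returns {(metal, donor): (count, mean, sample_std)}; pairs with fewer
    than two observations get a standard deviation of None.
    """
    grouped = defaultdict(list)
    for metal, donor, distance in records:
        grouped[(metal, donor)].append(distance)
    return {
        pair: (len(d), mean(d), stdev(d) if len(d) > 1 else None)
        for pair, d in grouped.items()
    }


# Placeholder values for illustration only (not taken from the PDB analysis).
sample = [("Zn", "N", 2.0), ("Zn", "N", 2.1), ("Zn", "N", 2.2), ("Zn", "O", 2.0)]
for pair, stats in distance_statistics(sample).items():
    print(pair, stats)
```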





Identifying and avoiding radiation damage in macromolecular crystallography

Radiation damage remains one of the major impediments to accurate structure solution in macromolecular crystallography. The artefacts of radiation damage can manifest as structural changes that result in incorrect biological interpretations being drawn from a model, they can reduce the resolution to which data can be collected and they can even prevent structure solution entirely. In this article, we discuss how to identify and mitigate against the effects of radiation damage at each stage in the macromolecular crystal structure-solution pipeline.





A small step towards an important goal: fragment screen of the c-di-AMP-synthesizing enzyme CdaA

CdaA is the most widespread diadenylate cyclase in many bacterial species, including several multidrug-resistant human pathogens. The enzymatic product of CdaA, cyclic di-AMP, is a secondary messenger that is essential for the viability of many bacteria. Its absence in humans makes CdaA a very promising and attractive target for the development of new antibiotics. Here, the structural results are presented of a crystallographic fragment screen against CdaA from Listeria monocytogenes, a saprophytic Gram-positive bacterium and an opportunistic food-borne pathogen that can cause listeriosis in humans and animals. Two of the eight fragment molecules reported here were localized in the highly conserved ATP-binding site. These fragments could serve as potential starting points for the development of antibiotics against several CdaA-dependent bacterial species.





Pillar data-acquisition strategies for cryo-electron tomography of beam-sensitive biological samples

For cryo-electron tomography (cryo-ET) of beam-sensitive biological specimens, a planar sample geometry is typically used. As the sample is tilted, the effective thickness of the sample along the direction of the electron beam increases and the signal-to-noise ratio concomitantly decreases, limiting the transfer of information at high tilt angles. In addition, the tilt range where data can be collected is limited by a combination of various sample-environment constraints, including the limited space in the objective lens pole piece and the possible use of fixed conductive braids to cool the specimen. Consequently, most tilt series are limited to a maximum of ±70°, leading to the presence of a missing wedge in Fourier space. The acquisition of cryo-ET data without a missing wedge, for example using a cylindrical sample geometry, is hence attractive for volumetric analysis of low-symmetry structures such as organelles or vesicles, lysis events, pore formation or filaments for which the missing information cannot be compensated by averaging techniques. Irrespective of the geometry, electron-beam damage to the specimen is an issue and the first images acquired will transfer more high-resolution information than those acquired last. There is also an inherent trade-off between higher sampling in Fourier space and avoiding beam damage to the sample. Finally, the necessity of using a sufficient electron fluence to align the tilt images means that this fluence needs to be fractionated across a small number of images; therefore, the order of data acquisition is also a factor to consider. Here, an n-helix tilt scheme is described and simulated which uses overlapping and interleaved tilt series to maximize the use of a pillar geometry, allowing the entire pillar volume to be reconstructed as a single unit. 
Three related tilt schemes are also evaluated that extend the continuous and classic dose-symmetric tilt schemes for cryo-ET to pillar samples to enable the collection of isotropic information across all spatial frequencies. A fourfold dose-symmetric scheme is proposed which provides a practical compromise between uniform information transfer and complexity of data acquisition.
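
The increase in effective thickness with tilt for a planar sample follows from simple geometry: for an idealized flat slab of thickness t0 tilted by θ, the path length along the beam is t0 / cos θ. A minimal sketch of that relation is given below; the function name and the 200 nm example thickness are assumptions chosen for illustration.

```python
import math


def effective_thickness(t0_nm, tilt_deg):
    """Beam path length through an idealized flat slab of thickness t0_nm
    when the slab is tilted by tilt_deg degrees (valid for |tilt| < 90)."""
    return t0_nm / math.cos(math.radians(tilt_deg))


# Hypothetical 200 nm lamella at a few tilt angles.
for tilt in (0, 30, 60, 70):
    print(tilt, round(effective_thickness(200.0, tilt), 1))
```

At 60° the path length is already twice the untilted thickness, which illustrates why information transfer degrades at the high tilt angles of a planar sample.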





Validation of electron-microscopy maps using solution small-angle X-ray scattering

The determination of the atomic resolution structure of biomacromolecules is essential for understanding details of their function. Traditionally, such a structure determination has been performed with crystallographic or nuclear resonance methods, but during the last decade, cryogenic transmission electron microscopy (cryo-TEM) has become an equally important tool. As the blotting and flash-freezing of the samples can induce conformational changes, external validation tools are required to ensure that the vitrified samples are representative of the solution. Although many validation tools have already been developed, most of them rely on fully resolved atomic models, which prevents early screening of the cryo-TEM maps. Here, a novel and automated method for performing such a validation utilizing small-angle X-ray scattering measurements, publicly available through the new software package AUSAXS, is introduced and implemented. The method has been tested on both simulated and experimental data, where it was shown to work remarkably well as a validation tool. The method provides a dummy atomic model derived from the EM map which best represents the solution structure.





Managing macromolecular crystallographic data with a laboratory information management system

Protein crystallography is an established method to study the atomic structures of macromolecules and their complexes. A prerequisite for successful structure determination is diffraction-quality crystals, which may require extensive optimization of both the protein and the conditions, and hence projects can stretch over an extended period, with multiple users being involved. The workflow from crystallization and crystal treatment to deposition and publication is well defined, and therefore an electronic laboratory information management system (LIMS) is well suited to management of the data. Completion of the project requires key information on all the steps being available and this information should also be made available according to the FAIR principles. As crystallized samples are typically shipped between facilities, a key feature to be captured in the LIMS is the exchange of metadata between the crystallization facility of the home laboratory and, for example, synchrotron facilities. On completion, structures are deposited in the Protein Data Bank (PDB) and the LIMS can include the PDB code in its database, completing the chain of custody from crystallization to structure deposition and publication. A LIMS designed for macromolecular crystallography, IceBear, is available as a standalone installation and as a hosted service, and the implementation of key features for the capture of metadata in IceBear is discussed as an example.





Post-translational modifications in the Protein Data Bank

Proteins frequently undergo covalent modification at the post-translational level, which involves the covalent attachment of chemical groups onto amino acids. This can entail the singular or multiple addition of small groups, such as phosphorylation; long-chain modifications, such as glycosylation; small proteins, such as ubiquitination; as well as the interconversion of chemical groups, such as the formation of pyroglutamic acid. These post-translational modifications (PTMs) are essential for the normal functioning of cells, as they can alter the physicochemical properties of amino acids and therefore influence enzymatic activity, protein localization, protein–protein interactions and protein stability. Despite their inherent importance, accurately depicting PTMs in experimental studies of protein structures often poses a challenge. This review highlights the role of PTMs in protein structures, as well as the prevalence of PTMs in the Protein Data Bank, directing the reader to accurately built examples suitable for use as a modelling reference.




da

Structural studies of β-glucosidase from the thermophilic bacterium Caldicellulosiruptor saccharolyticus

β-Glucosidase from the thermophilic bacterium Caldicellulosiruptor saccharolyticus (Bgl1) has been described as having an attractive catalytic profile for various industrial applications. Bgl1 catalyses the final step in the decomposition of cellulose, an unbranched glucose polymer that has attracted the attention of researchers in recent years as it is the most abundant renewable source of reduced carbon in the biosphere. With the aim of enhancing the thermostability of Bgl1 for a broad spectrum of biotechnological processes, it has been subjected to structural studies. Crystal structures of Bgl1 and its complex with glucose were determined at 1.47 and 1.95 Å resolution, respectively. Bgl1 is a member of glycosyl hydrolase family 1 (GH1 superfamily, EC 3.2.1.21) and the results showed that the 3D structure of Bgl1 follows the overall architecture of the GH1 family, with a classical (β/α)8 TIM-barrel fold. Comparisons of Bgl1 with sequence or structural homologues of β-glucosidase reveal quite similar structures but also unique structural features in Bgl1 with plausible functional roles.




da

EMhub: a web platform for data management and on-the-fly processing in scientific facilities

Most scientific facilities produce large amounts of heterogeneous data at a rapid pace. Managing users, instruments, reports and invoices presents additional challenges. To address these challenges, EMhub, a web platform designed to support the daily operations and record-keeping of a scientific facility, has been introduced. EMhub enables the easy management of user information, instruments, bookings and projects. The application was initially developed to meet the needs of a cryoEM facility, but its functionality and adaptability have proven to be broad enough to be extended to other data-generating centers. The expansion of EMhub is enabled by the modular nature of its core functionalities. The application allows external processes to be connected via a REST API, automating tasks such as folder creation, user and password generation, and the execution of real-time data-processing pipelines. EMhub has been used for several years at the Swedish National CryoEM Facility and has been installed in the CryoEM center at the Structural Biology Department at St. Jude Children's Research Hospital. A fully automated single-particle pipeline has been implemented for on-the-fly data processing and analysis. At St. Jude, the X-Ray Crystallography Center and the Single-Molecule Imaging Center have already expanded the platform to support their operational and data-management workflows.




da

STEM SerialED: achieving high-resolution data for ab initio structure determination of beam-sensitive nanocrystalline materials

Serial electron diffraction (SerialED), which applies a snapshot data acquisition strategy for each crystal, was introduced to tackle the problem of radiation damage in the structure determination of beam-sensitive materials by three-dimensional electron diffraction (3DED). The snapshot data acquisition in SerialED can be realized using both transmission and scanning transmission electron microscopes (TEM/STEM). However, the current SerialED workflow based on STEM setups requires special external devices and software, which limits broader adoption. Here, we present a simplified experimental implementation of STEM-based SerialED on Thermo Fisher Scientific STEMs using common proprietary software interfaced through Python scripts to automate data collection. Specifically, we utilize TEM Imaging and Analysis (TIA) scripting and TEM scripting to access the STEM functionalities of the microscope, and DigitalMicrograph scripting to control the camera for snapshot data acquisition. Data analysis adapts the existing workflow using the software CrystFEL, which was developed for serial X-ray crystallography. Our workflow for STEM SerialED can be used on any Gatan or Thermo Fisher Scientific camera. We apply this workflow to collect high-resolution STEM SerialED data from two aluminosilicate zeolites, zeolite Y and ZSM-25. We demonstrate, for the first time, ab initio structure determination through direct methods using STEM SerialED data. Zeolite Y is relatively stable under the electron beam, and STEM SerialED data extend to 0.60 Å. We show that the structural model obtained using STEM SerialED data merged from 358 crystals is nearly identical to that using continuous rotation electron diffraction data from one crystal. This demonstrates that accurate structures can be obtained from STEM SerialED. Zeolite ZSM-25 is very beam-sensitive and has a complex structure. 
We show that STEM SerialED greatly improves the data resolution of ZSM-25, compared with serial rotation electron diffraction (SerialRED), from 1.50 to 0.90 Å. This allows, for the first time, the use of standard phasing methods, such as direct methods, for the ab initio structure determination of ZSM-25.




da

Refining short-range order parameters from the three-dimensional diffuse scattering in single-crystal electron diffraction data

Our study compares short-range order parameters refined from the diffuse scattering in single-crystal X-ray and single-crystal electron diffraction data. Nb0.84CoSb was chosen as a reference material. The correlations between neighbouring vacancies and the displacements of Sb and Co atoms were refined from the diffuse scattering using a Monte Carlo refinement in DISCUS. The difference between the Sb and Co displacements refined from the diffuse scattering and the Sb and Co displacements refined from the Bragg reflections in single-crystal X-ray diffraction data is 0.012 (7) Å for the refinement on diffuse scattering in single-crystal X-ray diffraction data and 0.03 (2) Å for the refinement on the diffuse scattering in single-crystal electron diffraction data. As electron diffraction requires much smaller crystals than X-ray diffraction, this opens up the possibility of refining short-range order parameters in many technologically relevant materials for which no crystals large enough for single-crystal X-ray diffraction are available.




da

The interoperability of crystallographic data and databases

Interoperability of crystallographic data with other disciplines is essential for the smooth and rapid progress of structure-based science in the computer age. Within crystallography and closely related subject areas, there is already a high level of conformance to the generally accepted FAIR principles (that data be findable, accessible, interoperable and reusable) through the adoption of common information exchange protocols by databases, publishers, instrument vendors, experimental facilities and software authors. Driven by the success within these domains, the IUCr has worked closely with CODATA (the Committee on Data of the International Science Council) to help develop the latter's commitment to cross-domain integration of discipline-specific data. The IUCr has, in particular, emphasized the need for standards relating to data quality and completeness as an adjunct to the FAIR data landscape. This can ensure definitive reusable data, which in turn can aid interoperability across domains. A microsymposium at the IUCr 2023 Congress provided an up-to-date survey of data interoperability within and outside of crystallography, expounded using a broad range of examples.




da

Data reduction in protein serial crystallography

Serial crystallography (SX) has become an established technique for protein structure determination, especially when dealing with small or radiation-sensitive crystals and investigating fast or irreversible protein dynamics. The advent of newly developed multi-megapixel X-ray area detectors, capable of capturing over 1000 images per second, has brought about substantial benefits. However, this advancement also entails a notable increase in the volume of collected data. Today, up to 2 PB of data per experiment could be easily obtained under efficient operating conditions. The combined costs associated with storing data from multiple experiments provide a compelling incentive to develop strategies that effectively reduce the amount of data stored on disk while maintaining the quality of scientific outcomes. Lossless data-compression methods are designed to preserve the information content of the data but often struggle to achieve a high compression ratio when applied to experimental data that contain noise. Conversely, lossy compression methods offer the potential to greatly reduce the data volume. Nonetheless, it is vital to thoroughly assess the impact on data quality and scientific outcomes when employing lossy compression, as it inherently involves discarding information. The evaluation of lossy compression effects on data requires proper data quality metrics. In our research, we assess various approaches for both lossless and lossy compression techniques applied to SX data, and equally importantly, we describe metrics suitable for evaluating SX data quality.
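The trade-off between lossless and lossy compression described in this abstract can be illustrated with a minimal sketch. It is not the method evaluated in the article: the synthetic "frame" (Gaussian noise around a constant background), the use of zlib as the lossless codec, and uniform quantization as the lossy step are all assumptions chosen only to show why noisy data compress poorly until information is discarded.

```python
import random
import struct
import zlib


def make_frame(n_pixels=4096, seed=0):
    """Synthetic noisy detector counts (hypothetical stand-in for real SX data)."""
    rng = random.Random(seed)
    return [max(0, int(rng.gauss(100, 10))) for _ in range(n_pixels)]


def compression_ratio(values):
    """Raw size divided by zlib-compressed size for 16-bit unsigned pixels."""
    raw = struct.pack(f"<{len(values)}H", *values)
    return len(raw) / len(zlib.compress(raw, 9))


def quantize(values, step):
    """Lossy step: round each count down to a multiple of `step`."""
    return [(v // step) * step for v in values]


if __name__ == "__main__":
    frame = make_frame()
    lossy = quantize(frame, 8)
    print("lossless ratio:", round(compression_ratio(frame), 2))
    print("lossy ratio:   ", round(compression_ratio(lossy), 2))
    print("max pixel error:", max(a - b for a, b in zip(frame, lossy)))
```

The quantization step bounds the per-pixel error, but whether such an error is acceptable can only be judged with data quality metrics of the kind the abstract calls for.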




da

Community recommendations on cryoEM data archiving and validation

In January 2020, a workshop was held at EMBL-EBI (Hinxton, UK) to discuss data requirements for the deposition and validation of cryoEM structures, with a focus on single-particle analysis. The meeting was attended by 47 experts in data processing, model building and refinement, validation, and archiving of such structures. This report describes the workshop's motivation and history, the topics discussed, and the resulting consensus recommendations. Some challenges for future methods-development efforts in this area are also highlighted, as is the implementation to date of some of the recommendations.




da

RCSB Protein Data Bank: supporting research and education worldwide through explorations of experimentally determined and computationally predicted atomic level 3D biostructures

The Protein Data Bank (PDB) was established as the first open-access digital data resource in biology and medicine in 1971 with seven X-ray crystal structures of proteins. Today, the PDB houses >210 000 experimentally determined, atomic level, 3D structures of proteins and nucleic acids as well as their complexes with one another and small molecules (e.g. approved drugs, enzyme cofactors). These data provide insights into fundamental biology, biomedicine, bioenergy and biotechnology. They proved particularly important for understanding the SARS-CoV-2 global pandemic. The US-funded Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) and other members of the Worldwide Protein Data Bank (wwPDB) partnership jointly manage the PDB archive and support >60 000 'data depositors' (structural biologists) around the world. wwPDB ensures the quality and integrity of the data in the ever-expanding PDB archive and supports global open access without limitations on data usage. The RCSB PDB research-focused web portal at https://www.rcsb.org/ (RCSB.org) supports millions of users worldwide, representing a broad range of expertise and interests. In addition to retrieving 3D structure data, PDB 'data consumers' access comparative data and external annotations, such as information about disease-causing point mutations and genetic variations. RCSB.org also provides access to >1 000 000 computed structure models (CSMs) generated using artificial intelligence/machine-learning methods. To avoid doubt, the provenance and reliability of experimentally determined PDB structures and CSMs are identified. Related training materials are available to support users in their RCSB.org explorations.




da

Analysis of COF-300 synthesis: probing degradation processes and 3D electron diffraction structure

Although COF-300 is often used as an example to study the synthesis and structure of (3D) covalent organic frameworks (COFs), knowledge of the underlying synthetic processes is still fragmented. Here, an optimized synthetic procedure based on a combination of linker protection and modulation was applied. Using this approach, the influence of time and temperature on the synthesis of COF-300 was studied. Synthesis times that were too short produced materials with limited crystallinity and porosity, lacking the typical pore flexibility associated with COF-300. On the other hand, synthesis times that were too long could be characterized by loss of crystallinity and pore order by degradation of the tetrakis(4-aminophenyl)methane (TAM) linker used. The presence of the degradation product was confirmed by visual inspection, Raman spectroscopy and X-ray photoelectron spectroscopy (XPS). As TAM is by far the most popular linker for the synthesis of 3D COFs, this degradation process might be one of the reasons why the development of 3D COFs is still lagging compared with 2D COFs. However, COF crystals obtained via an optimized procedure could be structurally probed using 3D electron diffraction (3DED). The 3DED analysis resulted in a full structure determination of COF-300 at atomic resolution with satisfying data parameters. Comparison of our 3DED-derived structural model with previously reported single-crystal X-ray diffraction data for this material, as well as parameters derived from the Cambridge Structural Database, demonstrates the high accuracy of the 3DED method for structure determination. This validation might accelerate the exploitation of 3DED as a structure determination technique for COFs and other porous materials.




da

The evolution of raw data archiving and the growth of its importance in crystallography

The hardware for data archiving has expanded capacities for digital storage enormously in the past decade or more. The IUCr evaluated the costs and benefits of this within an official working group which advised that raw data archiving would allow ground truth reproducibility in published studies. Consultations of the IUCr's Commissions ensued via a newly constituted standing advisory committee, the Committee on Data. At all stages, the IUCr financed workshops to facilitate community discussions and possible methods of raw data archiving implementation. The recent launch of the IUCrData journal's Raw Data Letters is a milestone in the implementation of raw data archiving beyond the currently published studies: it includes diffraction patterns that have not been fully interpreted, if at all. The IUCr 75th Congress in Melbourne included a workshop on raw data reuse, discussing the successes and ongoing challenges of raw data reuse. This article charts the efforts of the IUCr to facilitate discussions and plans relating to raw data archiving and reuse within the various communities of crystallography, diffraction and scattering.




da

Benchmarking predictive methods for small-angle X-ray scattering from atomic coordinates of proteins using maximum likelihood consensus data

Stimulated by informal conversations at the XVII International Small Angle Scattering (SAS) conference (Traverse City, 2017), an international team of experts undertook a round-robin exercise to produce a large dataset from proteins under standard solution conditions. These data were used to generate consensus SAS profiles for xylose isomerase, urate oxidase, xylanase, lysozyme and ribonuclease A. Here, we apply a new protocol using maximum likelihood with a larger number of the contributed datasets to generate improved consensus profiles. We investigate the fits of these profiles to predicted profiles from atomic coordinates that incorporate different models to account for the contribution to the scattering of water molecules of hydration surrounding proteins in solution. Programs using an implicit, shell-type hydration layer generally optimize fits to experimental data with the aid of two parameters that adjust the volume of the bulk solvent excluded by the protein and the contrast of the hydration layer. For these models, we found the error-weighted residual differences between the model and the experiment generally reflected the subsidiary maxima and minima in the consensus profiles that are determined by the size of the protein plus the hydration layer. By comparison, all-atom solute and solvent molecular dynamics (MD) simulations are without the benefit of adjustable parameters and, nonetheless, they yielded at least equally good fits with residual differences that are less reflective of the structure in the consensus profile. Further, where MD simulations accounted for the precise solvent composition of the experiment, specifically the inclusion of ions, the modelled radius of gyration values were significantly closer to the experiment. 
The power of adjustable parameters to mask real differences between a model and the structure present in solution is demonstrated by the results for the conformationally dynamic ribonuclease A and calculations with pseudo-experimental data. This study shows that, while methods invoking an implicit hydration layer have the unequivocal advantage of speed, care is needed to understand the influence of the adjustable parameters. All-atom solute and solvent MD simulations are slower but are less susceptible to false positives, and can account for thermal fluctuations in atomic positions, and more accurately represent the water molecules of hydration that contribute to the scattering profile.
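The error-weighted residuals mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the protocol used in the study: it assumes a single least-squares scale factor between the predicted and experimental intensities and omits the two hydration-related parameters that the implicit-solvent programs adjust. All function names are hypothetical.

```python
def best_scale(i_exp, i_model, sigma):
    """Scale c minimizing sum(((I_exp - c * I_model) / sigma) ** 2)."""
    num = sum(e * m / s**2 for e, m, s in zip(i_exp, i_model, sigma))
    den = sum(m * m / s**2 for m, s in zip(i_model, sigma))
    return num / den


def weighted_residuals(i_exp, i_model, sigma):
    """Error-weighted residual (I_exp - c * I_model) / sigma at each q point."""
    c = best_scale(i_exp, i_model, sigma)
    return [(e - c * m) / s for e, m, s in zip(i_exp, i_model, sigma)]


def reduced_chi2(residuals, n_params=1):
    """Sum of squared weighted residuals per degree of freedom."""
    return sum(r * r for r in residuals) / (len(residuals) - n_params)


if __name__ == "__main__":
    # Made-up numbers purely to exercise the functions.
    i_model = [10.0, 5.0, 2.0, 1.0]
    i_exp = [20.5, 9.8, 4.1, 2.0]
    sigma = [0.5, 0.3, 0.2, 0.1]
    res = weighted_residuals(i_exp, i_model, sigma)
    print(res, reduced_chi2(res))
```

Plotting the residuals against q, rather than reporting only the reduced chi-square, is what reveals whether the misfit follows the maxima and minima of the profile, as the abstract describes.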




da

Comprehensive encoding of conformational and compositional protein structural ensembles through the mmCIF data structure

In the folded state, biomolecules exchange between multiple conformational states crucial for their function. However, most structural models derived from experiments and computational predictions only encode a single state. To represent biomolecules accurately, we must move towards modeling and predicting structural ensembles. Information about structural ensembles exists within experimental data from X-ray crystallography and cryo-electron microscopy. Although new tools are available to detect conformational and compositional heterogeneity within these ensembles, the legacy PDB data structure does not robustly encapsulate this complexity. We propose modifications to the macromolecular crystallographic information file (mmCIF) to improve the representation and interrelation of conformational and compositional heterogeneity. These modifications will enable the capture of macromolecular ensembles in a human and machine-interpretable way, potentially catalyzing breakthroughs for ensemble–function predictions, analogous to the achievements of AlphaFold with single-structure prediction.




da

On the structure refinement of metal complexes against 3D electron diffraction data using multipolar scattering factors

This study examines various methods for modelling the electron density and, thus, the electrostatic potential of an organometallic complex for use in crystal structure refinement against 3D electron diffraction (ED) data. It focuses on modelling the scattering factors of iron(III), considering the electron density distribution specific for coordination with organic linkers. We refined the structural model of the metal–organic complex, iron(III) acetylacetonate (FeAcAc), using both the independent atom model (IAM) and the transferable aspherical atom model (TAAM). TAAM refinement initially employed multipolar parameters from the MATTS databank for acetylacetonate, while iron was modelled with a spherical and neutral approach (TAAM-ligand). Later, custom-made TAAM scattering factors for Fe—O coordination were derived from DFT calculations [TAAM-ligand-Fe(III)]. Our findings show that, in this compound, the TAAM scattering factor corresponding to Fe3+ has a lower scattering amplitude than the Fe3+ charged scattering factor described by IAM. When using scattering factors corresponding to the oxidation state of iron, IAM inaccurately represents electrostatic potential maps and overestimates the scattering potential of the iron. In addition, TAAM significantly improved the fitting of the model to the data, shown by improved R1 values, goodness-of-fit (GooF) and reduced noise in the Fourier difference map (based on the residual distribution analysis). For 3D ED, R1 values improved from 19.36% (IAM) to 17.44% (TAAM-ligand) and 17.49% [TAAM-ligand-Fe(III)], and for single-crystal X-ray diffraction (SCXRD) from 3.82% to 2.03% and 1.98%, respectively. For 3D ED, the most significant R1 reductions occurred in the low-resolution region (8.65–2.00 Å), dropping from 20.19% (IAM) to 14.67% and 14.89% for TAAM-ligand and TAAM-ligand-Fe(III), respectively, with less improvement in high-resolution ranges (2.00–0.85 Å). 
This indicates that the major enhancements are due to better scattering modelling in low-resolution zones. Furthermore, when using TAAM instead of IAM, there was a noticeable improvement in the shape of the thermal ellipsoids, which more closely resembled those of an SCXRD-refined model. This study demonstrates the applicability of more sophisticated scattering factors to improve the refinement of metal–organic complexes against 3D ED data, suggesting the need for more accurate modelling methods and highlighting the potential of TAAM in examining the charge distribution of large molecular structures using 3D ED.
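For readers unfamiliar with the R1 values quoted above, the conventional definition is R1 = Σ||Fo| − |Fc|| / Σ|Fo|, summed over reflections. The sketch below computes it overall and within a resolution shell; it is only an illustration, with hypothetical function names and an assumed input layout of (d-spacing, Fo, Fc) tuples, and it does not reproduce the refinement software used in the study.

```python
def r1_factor(f_obs, f_calc):
    """Conventional R1: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of |Fo| is zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


def r1_in_shell(reflections, d_min, d_max):
    """R1 restricted to reflections with d_min <= d <= d_max (d in angstroms).

    `reflections` is an assumed layout: an iterable of (d, f_obs, f_calc).
    """
    shell = [(fo, fc) for d, fo, fc in reflections if d_min <= d <= d_max]
    return r1_factor([fo for fo, _ in shell], [fc for _, fc in shell])


if __name__ == "__main__":
    # Made-up reflections purely to exercise the functions.
    data = [(5.0, 10.0, 9.0), (3.0, 20.0, 22.0), (1.0, 30.0, 30.0)]
    print(r1_in_shell(data, 2.00, 8.65))  # low-resolution shell
    print(r1_in_shell(data, 0.85, 2.00))  # high-resolution shell
```

Splitting the statistic by shell in this way is what allows a statement such as "the largest reductions occurred in the low-resolution region" to be checked.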




da

Hirshfeld atom refinement and dynamical refinement of hexagonal ice structure from electron diffraction data

Reaching beyond the commonly used spherical atomic electron density model allows one to greatly improve the accuracy of hydrogen atom structural parameters derived from X-ray data. However, the effects of atomic asphericity are less explored for electron diffraction data. In this work, Hirshfeld atom refinement (HAR), a method that uses an accurate description of electron density by quantum mechanical calculation for a system of interest, was applied for the first time to the kinematical refinement of electron diffraction data. This approach was applied here to derive the structure of ordinary hexagonal ice (Ih). The effect of introducing HAR is much less noticeable than in the case of X-ray refinement and it is largely overshadowed by dynamical scattering effects. It led to only a slight change in the O—H bond lengths (shortening by 0.01 Å) compared with the independent atom model (IAM). The average absolute differences in O—H bond lengths between the kinematical refinements and the reference neutron structure were much larger: 0.044 Å for IAM and 0.046 Å for HAR. The refinement results changed considerably when dynamical scattering effects were modelled – with extinction correction or with dynamical refinement. The latter led to an improvement of the O—H bond length accuracy to 0.021 Å on average (with IAM refinement). Though there is a potential for deriving more accurate structures using HAR for electron diffraction, modelling of dynamical scattering effects seems to be a necessary step to achieve this. However, at present there is no software to support both HAR and dynamical refinement.




da

CheckMyMetal (CMM): validating metal-binding sites in X-ray and cryo-EM data

Identifying and characterizing metal-binding sites (MBS) within macromolecular structures is imperative for elucidating their biological functions. CheckMyMetal (CMM) is a web-based tool that facilitates the interactive validation of MBS in structures determined through X-ray crystallography and cryo-electron microscopy (cryo-EM). Recent updates to CMM have significantly enhanced its capability to efficiently handle large datasets generated from cryo-EM structural analyses. In this study, we address various challenges inherent in validating MBS within both X-ray and cryo-EM structures. Specifically, we examine the difficulties associated with accurately identifying metals and modeling their coordination environments by considering the ongoing reproducibility challenges in structural biology and the critical importance of well-annotated, high-quality experimental data. CMM employs a sophisticated framework of rules rooted in the valence bond theory for MBS validation. We explore how CMM validation parameters correlate with the resolution of experimentally derived structures of macromolecules and their complexes. Additionally, we showcase the practical utility of CMM by analyzing a representative cryo-EM structure. Through a comprehensive examination of experimental data, we demonstrate the capability of CMM to advance MBS characterization and identify potential instances of metal misassignment.




da

Waterless structures in the Protein Data Bank

The absence of solvent molecules in high-resolution protein crystal structure models deposited in the Protein Data Bank (PDB) contradicts the fact that, for proteins crystallized from aqueous media, water molecules are always expected to bind to the protein surface, as well as to some sites in the protein interior. An analysis of the contents of the PDB indicated that the expected ratio of the number of water molecules to the number of amino-acid residues exceeds 1.5 in atomic resolution structures, decreasing to 0.25 at around 2.5 Å resolution. Nevertheless, almost 800 protein crystal structures determined at a resolution of 2.5 Å or higher are found in the current release of the PDB without any water molecules, whereas some other depositions have unusually low or high occupancies of modeled solvent. Detailed analysis of these depositions revealed that the lack of solvent molecules might be an indication of problems with either the diffraction data, the refinement protocol, the deposition process or a combination of these factors. It is postulated that problems with solvent structure should be flagged by the PDB and addressed by the depositors.
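The water-to-residue ratio discussed in this abstract can be estimated from a coordinate file with a short script. The sketch below is an assumption-laden illustration, not the analysis performed by the authors: it reads legacy PDB-format text using the fixed column positions of ATOM/HETATM records, counts one residue per Cα atom, and treats residues named HOH, WAT or DOD as water. The function name is hypothetical.

```python
WATER_NAMES = {"HOH", "WAT", "DOD"}  # assumed set of water residue names


def water_to_residue_ratio(pdb_text):
    """Number of distinct water molecules divided by number of residues.

    Residues are counted via their CA atoms in ATOM records; alternate
    conformations are collapsed by keying on chain and residue number.
    """
    waters = set()
    residues = set()
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        res_name = line[17:20].strip()
        key = (line[21:22], line[22:27])  # chain ID, residue number + insertion code
        if res_name in WATER_NAMES:
            waters.add(key)
        elif record == "ATOM" and line[12:16].strip() == "CA":
            residues.add(key)
    if not residues:
        raise ValueError("no amino-acid residues found")
    return len(waters) / len(residues)
```

A ratio of zero for a structure at 2.5 Å resolution or better would correspond to the "waterless" depositions that the abstract argues should be flagged.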




da

Lattice response to the radiation damage of molecular crystals: radiation-induced versus thermal expansivity

The interaction of intense synchrotron radiation with molecular crystals frequently modifies the crystal structure by breaking bonds, producing fragments and, hence, inducing disorder. Here, a second-rank tensor of radiation-induced lattice strain is proposed to characterize the structural susceptibility to radiation. Quantitative estimates are derived using a linear response approximation from experimental data collected on three materials, Hg(NO3)2(PPh3)2, Hg(CN)2(PPh3)2 and BiPh3 [PPh3 = triphenylphosphine, P(C6H5)3; Ph = phenyl, C6H5], and are compared with the corresponding thermal expansivities. The associated eigenvalues and eigenvectors show that the two tensors are not the same and therefore probe truly different structural responses. The tensor of radiative expansion serves as a measure of the susceptibility of crystal structures to radiation damage.