Dr. Eugene Krissinel

Science and Technology Facilities Council,
CCP4 core group
Oxford, UK

HT3: Processing & Structure determination

L2: Model Validation and Deposition

Macromolecular structure determination is a complex process that involves a series of steps, including data processing, model building, and refinement. The accuracy and reliability of the final structure model depend on the quality of the data and methods used in each of these steps. They are often difficult to assess, partly because no single score unambiguously indicates structure quality. A number of quality indicators need to be checked before concluding that the structure is the best, or nearly the best, fit to the given experimental observations, i.e., that it is a valid result and accurately reflects the true nature of the macromolecule.
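The idea of checking several indicators together, rather than relying on one score, can be sketched in a few lines of Python. This is only an illustration: the indicator names, the `Limit` class, the `flag_indicators` function, and in particular the numerical limits are hypothetical placeholders introduced here, not recommended cut-offs and not part of any CCP4 tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Upper bound for one quality indicator (illustrative only)."""
    name: str
    maximum: float


# Placeholder limits chosen for illustration; NOT recommended cut-offs.
ILLUSTRATIVE_LIMITS = [
    Limit("r_free", 0.30),
    Limit("r_free_minus_r_work", 0.07),
    Limit("ramachandran_outliers_percent", 0.5),
    Limit("clashscore", 10.0),
]


def flag_indicators(scores, limits=ILLUSTRATIVE_LIMITS):
    """Return the names of indicators that are missing or exceed their limit."""
    flagged = []
    for limit in limits:
        value = scores.get(limit.name)
        # A missing indicator is flagged too: it has not been checked.
        if value is None or value > limit.maximum:
            flagged.append(limit.name)
    return flagged


# Made-up scores for a hypothetical model.
example_scores = {
    "r_free": 0.24,
    "r_free_minus_r_work": 0.04,
    "ramachandran_outliers_percent": 1.2,
    "clashscore": 6.0,
}

print(flag_indicators(example_scores))
```

With these made-up numbers, the model passes three checks and is flagged on one, which shows why a good value for a single indicator is not enough to call a structure valid.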

The assessment procedure, commonly called “structure validation”, should always be performed before depositing results in the Protein Data Bank (PDB). CCP4 offers a number of validation utilities, most of which are executed at the end of refinement jobs. Validation helps to identify errors and inconsistencies in the model, which should be corrected before deposition. This is critical because errors in a model can lead to incorrect interpretation of the biological function of the protein.

The CCP4 Cloud interface provides an easy-to-use platform for protein structure determination and validation. Most task reports in CCP4 Cloud include a Verdict section, which presents a validation summary of the task’s results and gives a quick overview of model quality. Verdicts come with a few quality assessment sections that provide more detailed information on various quality indicators, for example, per-residue B-factor analysis and electron density fit scores, Ramachandran outliers, and full MolProbity analysis. The Verdict also reports potential errors or issues in the model that need to be addressed before deposition, and suggests parameters that may be adjusted to improve the results.

PDB deposition requires mmCIF-formatted files containing the experimental observations (structure factors) and the model coordinates. These files are prepared with CCP4 Cloud’s deposition task, which also obtains PDB validation reports containing detailed quality analysis and scores. These reports are a copy of what PDB curators will examine when the structure is considered for deposition, and they will also be available to referees when the results are submitted for publication in a journal.