DAQ-Score Database

Precomputed Residue-Wise Local Quality Scores of Protein Models from Cryo-EM Maps

Developed by Kihara Lab
What is DAQ

DAQ is a deep-learning-based score that quantifies residue-wise local quality for protein models from cryo-electron microscopy (cryo-EM) maps.

Key Features

The model is colored by DAQ-score scaled from red (low) to blue (high). PDB 7JSN chain B Version 1, EMD-22458.

This database provides precomputed DAQ-scores for structure models in the PDB that were derived from cryo-EM maps. Entries can be searched by PDB ID, EMDB ID, or keywords. The DAQ-score along a protein structure model is shown in an interactive structure viewer, a graph, and a score table, all of which are linked to the model structure in the viewer. Three scores are provided, which evaluate the correctness of amino acid assignments, Cα positions, and secondary structure assignments.

Overview of DAQ

DAQ uses deep learning to compute, from local density features, the likelihood that each local position in a cryo-EM map corresponds to each secondary structure type, each amino acid type, and a Cα atom. The plausibility of each residue in a structure model built from the cryo-EM map is then quantified with the following equations.

The amino acid type of residue \(i\) in a model is evaluated as: \[ DAQ(AA)(i)=\log\left(\frac{P_{aa(i)}(i)}{\sum_{j}P_{aa(i)}(j)/N}\right), \] where \(aa(i)\) is the amino acid type of residue \(i\) and \(P_{aa(i)}(i)\) is the probability, computed by deep learning, of that amino acid type at residue \(i\). This probability is normalized by the average probability of the same amino acid type over all \(N\) atom positions in the protein model.

The Cα position of residue \(i\) in a model is evaluated as: \[ DAQ(C\alpha)(i)=\log\left(\frac{P_{C\alpha}(i)}{\sum_{j}P_{C\alpha}(j)/N}\right), \] where \(P_{C\alpha}(i)\) is the probability, computed by deep learning, that the position of residue \(i\) corresponds to a Cα atom. This probability is normalized by the average Cα probability over all \(N\) atom positions in the protein model.
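The amino acid and Cα scores share the same form: the log of one probability divided by its average over the model. A minimal Python sketch of that form follows. The function name `log_ratio_score`, the example probabilities, and the use of the natural logarithm are assumptions made for illustration, not part of DAQ itself.

```python
import math

def log_ratio_score(probs, i):
    """Log of probs[i] divided by the mean of probs.

    probs[j] is the probability of one fixed label (one amino acid type,
    or "Cα atom") at atom position j of the model. All values are assumed
    to be strictly positive.
    """
    mean_p = sum(probs) / len(probs)   # sum_j P(j) / N
    return math.log(probs[i] / mean_p)

# Hypothetical Cα probabilities at N = 4 positions (illustrative values only).
p_ca = [0.9, 0.3, 0.6, 0.6]

above_average = log_ratio_score(p_ca, 0)  # probability above the mean, score > 0
below_average = log_ratio_score(p_ca, 1)  # probability below the mean, score < 0
```

For \(DAQ(AA)\), the same function would be applied to the probabilities of the amino acid type \(aa(i)\) assigned to residue \(i\).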

Lastly, the secondary structure of residue \(i\) in a model is evaluated as: \[ DAQ(SS)(i)=\sum_{ss\in\{H,E,C\}}{Pseq}_{ss}(i)\log\left(\frac{P_{ss}(i)}{\sum_{j}P_{ss}(j)/N}\right), \] where \(ss\) runs over the secondary structure types helix (H), strand (E), and coil (C), and \({Pseq}_{ss}(i)\) is the probability of secondary structure \(ss\) for residue \(i\), predicted from the protein sequence with the secondary structure prediction method SPOT1D. \(P_{ss}(i)\) is the probability of secondary structure \(ss\) at residue \(i\) computed by deep learning, which is normalized by the average probability of that secondary structure type over all \(N\) atom positions in the protein model.
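The secondary structure score is a weighted sum of three log ratios, one per secondary structure type. The sketch below illustrates this under stated assumptions: the function name `daq_ss`, the dictionary layout, the natural logarithm, and all numbers are hypothetical, and the sequence-based probabilities are simply typed in rather than produced by SPOT1D.

```python
import math

def daq_ss(pseq, pmap, i):
    """Weighted sum over H, E, C of log(P_ss(i) / mean_j P_ss(j)).

    pseq[ss][i]: sequence-based probability of type ss for residue i.
    pmap[ss][j]: map-based probability of type ss at position j (all > 0).
    """
    total = 0.0
    for ss in ("H", "E", "C"):
        mean_p = sum(pmap[ss]) / len(pmap[ss])   # sum_j P_ss(j) / N
        total += pseq[ss][i] * math.log(pmap[ss][i] / mean_p)
    return total

# Hypothetical two-position model (illustrative values only).
pmap = {"H": [0.8, 0.2], "E": [0.1, 0.1], "C": [0.1, 0.7]}
pseq = {"H": [1.0, 0.0], "E": [0.0, 0.0], "C": [0.0, 1.0]}

score_0 = daq_ss(pseq, pmap, 0)  # sequence predicts helix, map agrees
score_1 = daq_ss(pseq, pmap, 1)  # sequence predicts coil, map agrees
```

Because each term is weighted by the sequence-based probability, a secondary structure type that the sequence prediction considers unlikely contributes little to the score.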

The computed scores are then averaged over a window of 19 residues along the sequence.
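The window averaging can be sketched as follows. The text does not say how the window is positioned or how chain ends are handled, so this example assumes a window centered on each residue (9 residues on each side) that is truncated at the termini. The function name `window_average` is hypothetical.

```python
def window_average(scores, window=19):
    """Average each score with its neighbors along the sequence.

    Assumes a centered window that shrinks near the chain ends,
    so the output has the same length as the input.
    """
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)                  # clip at the first residue
        hi = min(len(scores), i + half + 1)    # clip at the last residue
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed

# Small demonstration with a window of 3 instead of 19.
demo = window_average([1.0, 2.0, 3.0], window=3)
```

Averaging over neighbors means an isolated low-scoring residue is damped, while a run of consecutive low-scoring residues remains visible.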

Key Insights

  1. A residue in a model has a positive score if its secondary structure, amino acid, or Cα position assignment is correct. A negative score indicates that the assignment may be incorrect and is worth checking closely.
  2. A negative \(DAQ(AA)\) score indicates that the amino acid sequence may be misaligned with the local structure. A negative \(DAQ(C\alpha)\) score indicates that the local region may have an incorrect conformation.
  3. If a position in the map does not have a distinct density pattern for the assigned amino acid (or secondary structure, or Cα atom), \(DAQ\) will be close to 0.
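One way to act on these points is to list the stretches of a model where the score is negative, so they can be inspected in the viewer. The helper below is a hypothetical sketch, not a feature of the database. It uses 0-based residue indices and a cutoff of 0, which follows the interpretation above but can be changed.

```python
def negative_regions(scores, threshold=0.0):
    """Return (start, end) index pairs of consecutive scores below threshold."""
    regions = []
    start = None
    for i, s in enumerate(scores):
        if s < threshold:
            if start is None:
                start = i                      # a new negative stretch begins
        elif start is not None:
            regions.append((start, i - 1))     # the stretch ended at i - 1
            start = None
    if start is not None:                      # stretch runs to the last residue
        regions.append((start, len(scores) - 1))
    return regions

# Hypothetical per-residue scores (illustrative values only).
flagged = negative_regions([0.5, -0.2, -0.4, 0.1, -0.3])
```

Scores near 0 are not flagged by this cutoff. According to point 3, they indicate positions where the density is not distinctive rather than positions that are likely wrong.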