In my last post, I briefly mentioned our collaboration with Nicola de Maio from Nick Goldman’s lab at EMBL’s European Bioinformatics Institute (EBI). Nicola filters out unreliable SARS-CoV-2 sequences and then analyzes the remaining samples for genetic variability at the drug-binding sites of the SARS-CoV-2 proteins that we want to investigate.
First, I sent Nicola the list of 21 residues forming the catalytic site of the SARS-CoV-2 main protease (MPro). Nicola used his latest batch of over 15,000 SARS-CoV-2 genome samples (as of May 17th, 2020) to identify all mutations at these 21 positions. He reported his findings in his recent post on OpenLabNotebook here, and detailed methods are also available on Zenodo here. Mutations were extremely rare, but he did identify the following four in SARS-CoV-2 samples from COVID-19 patients: M49I, P52S, N142S, and P168S.
To predict the effect of these mutations on the binding of a known MPro inhibitor (PDB code 7bqy), we estimated the difference in the binding energy of the ligand between the wild-type and mutated forms of the protein with ICM (Molsoft, San Diego). ICM calculates the Gibbs binding energy of the ligand (dGbind, in kcal/mol) as the energy of the protein-ligand complex minus the energies of the isolated protein and the isolated ligand. The difference in binding energy (ddGbind), which is what we are interested in, is dGbind [mutant] minus dGbind [wild-type]. Since the precision of the method is about 2 kcal/mol [1], ddGbind values > 2 kcal/mol indicate mutations that significantly penalize ligand binding. Figure 1 below highlights the mutated residues in orange, and the table shows the ddGbind values.
Figure 1. Top: The amino acid mutations found at the catalytic site of SARS-CoV-2 MPro (PDB code: 7bqy) across over 15,000 SARS-CoV-2 genome samples are highlighted in orange. The effect of the mutations on the binding energy of the MPro inhibitor N3 (bound to MPro in the crystal structure 7bqy) is shown. The effects of mutating residues at positions known to be important for ligand binding (but not mutated in COVID-19 patients) are shown as positive controls. Bottom: The 3D representation of the SARS-CoV-2 MPro catalytic site. The residues highlighted in light grey are positive controls (M165I, Q189S), and those in orange are the four mutations identified from SARS-CoV-2 samples (M49I, P52S, N142S, and P168S). The dashed line shows the hydrogen bond between the inhibitor and Q189.
We also added two positive controls to our table: methionine 165 (M165) and glutamine 189 (Q189). Both residues are at positions that are critical for ligand binding, and we expected the ddGbind values to reflect that. To mimic the mutation M49I, we mutated M165 to Ile. Similarly, we mutated Q189 to Ser to mimic N142S.
As you can see, mutating M165 to an isoleucine residue, for example, increases the Gibbs binding free energy by 14.65 kcal/mol, which is very significant. In contrast, the change in Gibbs binding free energy for the mutations that Nicola found in the SARS-CoV-2 sequences ranges from -0.1 kcal/mol to 2 kcal/mol, which is not significant (as explained above).
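The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration of the definitions given above, not the ICM workflow itself: the function names are hypothetical, and the only numbers used are the ones quoted in this post (the 2 kcal/mol precision, the 14.65 kcal/mol value for M165I, and the two ends of the range for the observed mutations).

```python
# Illustrative sketch only: function names are hypothetical and this is
# not how ICM is invoked. All energies are in kcal/mol.

PRECISION_KCAL_MOL = 2.0  # approximate precision of the method quoted in the post


def dg_bind(e_complex: float, e_protein: float, e_ligand: float) -> float:
    """Binding energy: complex minus the isolated protein and ligand."""
    return e_complex - (e_protein + e_ligand)


def ddg_bind(dg_mutant: float, dg_wild_type: float) -> float:
    """Change in binding energy: dGbind [mutant] minus dGbind [wild-type]."""
    return dg_mutant - dg_wild_type


def penalizes_binding(ddg: float, precision: float = PRECISION_KCAL_MOL) -> bool:
    """True when ddGbind exceeds the precision of the method."""
    return ddg > precision


# ddGbind values quoted in the post.
quoted_values = [
    ("M165I (positive control)", 14.65),
    ("observed mutations, upper end of range", 2.0),
    ("observed mutations, lower end of range", -0.1),
]

for label, ddg in quoted_values:
    verdict = "significant" if penalizes_binding(ddg) else "not significant"
    print(f"{label}: ddGbind = {ddg:+.2f} kcal/mol ({verdict})")
```

Note that the comparison is strict, so a ddGbind of exactly 2 kcal/mol sits at the precision limit and is treated as not significant, which matches the reading given above.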
Consequently, our data suggest that the main protease mutations observed so far in SARS-CoV-2 samples should have a limited effect on the binding of the protease inhibitor N3. On the other hand, the two positive controls are predicted to significantly penalize ligand binding. Importantly, these are predictions only and would need to be validated experimentally.
The limited effect of these mutations on ligand binding predicted by ICM is further supported by the fact that all these positions are at the rim of the binding site rather than deep in the pocket, as shown in Figure 1. This is in contrast with the positive control M165, which lies at the bottom of the binding site, where mutations are more likely to induce steric clashes. Q189, the other positive control, is also at the rim but makes a hydrogen bond with the inhibitor, which is lost when the residue is mutated to Ser.
In conclusion, mutations identified across over 15,000 SARS-CoV-2 genome samples and positioned at the catalytic site of the viral main protease are rare and are expected to have minimal impact on the binding of N3, an inhibitor that was crystallized with the protein.
To see the detailed steps of my analysis, please refer to my Zenodo report here.
Please contact me via the “Leave a comment” link at the top of this post. Stay tuned for more updates on this project.
1. Schapira, M., Totrov, M., Abagyan, R. Prediction of the Binding Energy for Small Molecules, Peptides and Proteins. Journal of Molecular Recognition 12(3), 177-190 (1999). https://doi.org/10.1002/(SICI)1099-1352(199905/06)12:3<177::AID-JMR451>3.0.CO;2-Z