Rebuttal letter of the “comment on “contrastive pre-training and 3D convolution neural network for RNA and small molecule binding affinity prediction” by Sun and Gao”
Saisai Sun

| Methods | PCC | SPCC | RMSE | MAE |
|---|---|---|---|---|
| Vina | −0.386 | −0.389 | 0.277 | 0.257 |
| RF-score | 0.445 | 0.364 | 0.152 | 0.129 |
| RSAPred | 0.399 | 0.221 | 2.886 | 2.205 |
| 3DCNN | 0.466 | 0.408 | 0.165 | 0.129 |
| RLaffinity | 0.559 | 0.540 | 0.152 | 0.119 |
- Natural Science Basic Research Program of Shaanxi Province (funder ID 10.13039/501100017596)
- Young Scientists Fund of the National Natural Science Foundation of China (funder ID 10.13039/501100020771)
Dear Editor,
Thank you for providing us with the opportunity to clarify and address the questions raised in the comment letter. We appreciate the thoughtful comments and the opportunity for a detailed discussion about our research. In the following sections, we have provided detailed responses to each of the questions posed. Our aim is to offer comprehensive explanations and to elucidate any aspects of our work that may have prompted further inquiry. We hope that our responses will address the concerns effectively and enhance the understanding of our research.
1 Presence of DNA-ligand complex structures from PDBbind among the training, validation, and test datasets
There are several reasons for including DNA-ligand complex structures from PDBbind (Wang et al. 2004) in our training, validation, and test datasets. Firstly, the PDBbind dataset is relatively small, making it impractical to exclude the DNA complexes entirely. Additionally, we separated each double-stranded DNA into single chains and treated these chains as the binding targets. This approach aligns with the structure of some single-chain RNAs, such as the PDB structure 2f4s. Moreover, research has shown that RNA can adopt a wide conformational range between the canonical A- and B-forms at the localized single-strand domain (SSD) level, including many B-form-like conformations. This occurs through C2'-endo ribose conformations in one or both nucleotides and B-form-like neighboring base stacking patterns (Sedova and Banavali 2016).
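As a rough illustration of the chain separation described above, the following Python sketch groups the coordinate records of a PDB-format file by chain identifier. It is not the code used for RLaffinity; the function name `split_by_chain`, the helper `atom_line`, and the two example records are hypothetical. It relies only on the fixed-column PDB layout, in which the chain ID occupies column 22.

```python
from collections import defaultdict


def split_by_chain(pdb_text):
    """Group ATOM/HETATM records of a PDB-format string by chain ID.

    The chain identifier sits in column 22 of the fixed-width PDB
    format, i.e. index 21 in a zero-based Python string.
    """
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains[line[21]].append(line)
    return dict(chains)


def atom_line(serial, name, res, chain, resseq):
    """Build a minimal, hypothetical ATOM record (coordinates omitted)."""
    return f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}"


# Hypothetical two-strand example: one atom on chain A, one on chain B.
example = "\n".join([
    atom_line(1, "P", "DA", "A", 1),
    atom_line(2, "P", "DT", "B", 1),
])
strands = split_by_chain(example)
print(sorted(strands))  # chain IDs found in the example
```

Each value in `strands` is the list of records for one strand, which could then be written out as a separate single-chain target.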
2 MinMax transformation of output variable (pKd) affects generalization of model to unseen data
Firstly, translation (adding or subtracting a constant) and scaling (multiplying by a positive constant) do not affect the Pearson correlation. Since MinMax scaling is simply a combination of a translation and a positive scaling (without any shear), it does not change the correlation between predicted and actual values. Therefore, the MinMax transformation cannot artificially inflate Pearson correlation coefficients (PCCs). Regarding the case discussed by Krishnan et al., the reported binding affinity of the theophylline aptamer to caffeine, 3500 μM (Jenison et al. 1994; Menichelli et al. 2022), may not correspond accurately to the PDB structure 1EHT, deposited by Zimmermann et al. (Zimmermann et al. 1997).
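The invariance argument above can be checked numerically. The short Python sketch below computes the PCC on raw values and again after MinMax scaling each vector independently. The pKd values are hypothetical and chosen only for illustration, and `pearson` and `minmax` are helper names introduced here, not functions from RLaffinity.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def minmax(values):
    """Map values linearly onto [0, 1]: subtract the min, divide by the range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical experimental and predicted pKd values (illustration only).
y_true = [4.2, 5.1, 6.3, 7.0, 8.4]
y_pred = [4.9, 5.0, 5.8, 7.4, 7.9]

r_raw = pearson(y_true, y_pred)
r_scaled = pearson(minmax(y_true), minmax(y_pred))

# The two coefficients agree up to floating-point rounding.
print(math.isclose(r_raw, r_scaled, rel_tol=1e-9))
```

The agreement holds because subtracting the minimum cancels in the centered terms and the positive range factor cancels between the numerator and the denominator of the PCC.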
3 Inability of RLaffinity model to generalize to unseen RNA–ligand complexes
The results presented in Table 1 of the comment letter reflect a performance similar to that described in the RLaffinity manuscript (Sun and Gao 2024), which reported a PCC of approximately 0.56 and a Spearman’s rank correlation coefficient (SPCC) of around 0.54.
4 Prediction performance is not better than existing methods
Table 1 demonstrates that RSAPred can achieve a PCC of 0.90 and an SPCC of 0.75 on the seven riboswitch-ligand pairs when the RSAPred riboswitch model is used for prediction. According to the RSAPred paper (Krishnan et al. 2024), RSAPred performed best on riboswitches compared to other types of RNA-ligand pairs. In contrast, RLaffinity was trained on all types of RNA–ligand pairs, not specifically on riboswitch–ligand pairs, which may explain its lower performance in this context. Additionally, the mean absolute error values of RLaffinity and RSAPred are comparable. However, a comparison across all RNA–ligand pair types shows that RSAPred is less effective on type-insensitive RNA–ligand pairs (Sun and Gao 2024).
5 Incorrect comparison with existing methods
Firstly, Vina and RF-score are also capable of predicting RNA-small molecule interactions. Because few methods are available for predicting RNA-small molecule binding affinity, RLaffinity used Vina and RF-score as baseline methods, which provides additional context for evaluating its performance. As mentioned earlier, linear (MinMax) normalization does not affect the PCCs between predicted and actual values. RLaffinity is specifically designed for type-insensitive RNA-small molecule binding affinity prediction, which is why the comparison with RSAPred (Krishnan et al. 2024) did not account for RNA types. This approach also allows for evaluating RSAPred’s performance on RNAs of unknown type.
References
- Jenison RD, Gill SC, Pardi A et al. High-resolution molecular discrimination by RNA. Science 1994;263:1425–9. doi:10.1126/science.7510417. PMID: 7510417.
- Krishnan SR, Roy A, Gromiha MM. Reliable method for predicting the binding affinity of RNA-small molecule interactions using machine learning. Brief Bioinform 2024;25:bbae002. doi:10.1093/bib/bbae002. PMCID: PMC10805179. PMID: 38261341.
- Menichelli E, Lam BJ, Wang Y et al. Discovery of small molecules that target a tertiary-structured RNA. Proc Natl Acad Sci U S A 2022;119:e2213117119. doi:10.1073/pnas.2213117119. PMCID: PMC9860313. PMID: 36413497.
- Sedova A, Banavali NK. RNA approaches the B-form in stacked single strand dinucleotide contexts. Biopolymers 2016;105:65–82. doi:10.1002/bip.22750. PMID: 26443416.
- Sun S, Gao L. Contrastive pre-training and 3D convolution neural network for RNA and small molecule binding affinity prediction. Bioinformatics 2024;40:btae155. doi:10.1093/bioinformatics/btae155. PMCID: PMC11007238. PMID: 38507691.
- Wang R, Fang X, Lu Y et al. The PDBbind database: collection of binding affinities for protein-ligand complexes with known three-dimensional structures. J Med Chem 2004;47:2977–80. doi:10.1021/jm030580l. PMID: 15163179.
- Zimmermann GR, Jenison RD, Wick CL et al. Interlocking structural motifs mediate molecular discrimination by a theophylline-binding RNA. Nat Struct Biol 1997;4:644–9. doi:10.1038/nsb0897-644. PMID: 9253414.
