Pre-processing in AI based Prediction of QSARs
Om Prasad Patri, Amit Kumar Mishra

TL;DR
This paper emphasizes the importance of pre-processing and analysis of datasets in AI-based QSAR prediction, demonstrating how proper feature reduction and mapping methods improve understanding and model selection for drug design and toxicology.
Contribution
It introduces a systematic approach to dataset pre-analysis for QSAR problems, guiding the choice of mapping methods and classifiers based on data characteristics.
Findings
Pre-analysis reveals dataset relationships and suitable mapping methods.
Proper preprocessing aids in selecting effective feature extraction tools.
Insight into data nature improves classifier choice for QSAR tasks.
Abstract
Machine learning, data mining, and artificial intelligence (AI) based methods have been used to determine the relations between the chemical structure and biological activity of compounds, called quantitative structure-activity relationships (QSARs). The first step in this process is pre-processing of the dataset, which includes mapping a large number of molecular descriptors in the original high-dimensional space to a small number of components in a lower-dimensional space while retaining the features of the original data. A common practice is to apply a mapping method to a dataset without prior analysis. Our work stresses this pre-analysis by applying it to two important classes of QSAR prediction problems: drug design (predicting anti-HIV-1 activity) and predictive toxicology (estimating hepatocarcinogenicity of chemicals). We apply one linear and two nonlinear…
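The mapping step described in the abstract can be illustrated with a minimal sketch. The abstract is truncated before it names the methods used, so principal component analysis (PCA) is chosen here purely as an assumed example of a linear mapping, not as the authors' method. The descriptor matrix is synthetic, and the function name `pca_map` and all sizes (200 compounds, 50 descriptors, 3 components) are hypothetical.

```python
import numpy as np

def pca_map(X, n_components):
    """Map rows of X (compounds x descriptors) onto the top principal components.

    Assumption: PCA stands in for "a linear mapping method"; the source
    does not name the method it uses.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                     # center each descriptor
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T                  # low-dimensional coordinates
    explained_ratio = (S[:n_components] ** 2) / np.sum(S ** 2)
    return scores, components, explained_ratio

# Synthetic stand-in for a descriptor table (not real QSAR data):
# 50 descriptors driven by 3 hidden factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 50))

scores, components, ratio = pca_map(X, n_components=3)
print(scores.shape)   # one 3-component row per compound
print(ratio.sum())    # fraction of total variance retained by the mapping
```

Inspecting the retained-variance fraction before committing to a mapping is one simple form of the pre-analysis the paper argues for: if a few linear components already capture most of the variance, a linear mapping may suffice, whereas a low fraction suggests trying a nonlinear method. In practice descriptors are often standardized before PCA; that step is omitted here for brevity.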
Taxonomy
Topics: Fault Detection and Control Systems · Anomaly Detection Techniques and Applications
