International Journal on Science and Technology

E-ISSN: 2229-7677     Impact Factor: 9.88

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal


Genome-Based Drug Repurposing: Identifying Potential Targets Using FASTA Sequences and Machine Learning

Author(s) Mr. Ahmed Abdallah Alshahab, Dr. Vaishali A. Chavan
Country India
Abstract This study explores machine learning techniques for drug repurposing in viral diseases using genomic data. It aims to develop classification models that use FASTA protein sequence data to identify similar genomes; the similarities identified could then guide drug repurposing for viral diseases. Three machine learning models were implemented and assessed: Decision Tree, Random Forest, and K-Nearest Neighbours.

The study used genetic information from seven viral disease families obtained from the National Center for Biotechnology Information (NCBI). Data preprocessing involved cleaning and encoding the FASTA protein sequences.
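
The abstract does not specify how the sequences were cleaned or encoded. As a minimal sketch of one possible approach, the Python below parses FASTA text, drops non-standard residue characters, and encodes each sequence as normalized k-mer frequencies. The k-mer encoding, the value k=2, the function names, and the example record are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def clean(seq):
    """Uppercase the sequence and drop anything that is not a standard residue."""
    return "".join(c for c in seq.upper() if c in AMINO_ACIDS)


def kmer_vector(seq, k=2):
    """Encode a sequence as normalized counts over all 20**k possible k-mers."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values()) or 1  # avoid division by zero for short input
    return [counts["".join(p)] / total for p in product(AMINO_ACIDS, repeat=k)]


# Hypothetical record, not taken from the NCBI data used in the study.
example = """>demo_protein hypothetical record
MKTAYIAKQR
xqISFVKSHF*
"""
records = [(header, clean(seq)) for header, seq in parse_fasta(example)]
features = [kmer_vector(seq) for _, seq in records]
```

Fixed-length vectors like these can be passed to any of the three classifiers named above, since every sequence maps to the same 400 features regardless of its length.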

The three models were used to identify genomes similar to that of a targeted viral disease. HMPV was tested as the target disease on all three models, and Rotavirus was found to be the closest match.
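
To illustrate what "closest match" can mean in practice, the following sketch ranks reference families by the cosine similarity between amino-acid composition vectors, in the spirit of a nearest-neighbour lookup. The similarity measure, the composition encoding, the function names, and the toy sequences (`family_A`, `family_B`) are hypothetical and are not drawn from the study's data or code.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each standard amino acid in the sequence."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_family(query, reference):
    """Return the best-matching family label and the per-family scores.

    Each family is scored by its single most similar member (1-nearest neighbour).
    """
    q = composition(query)
    scores = {
        label: max(cosine(q, composition(s)) for s in seqs)
        for label, seqs in reference.items()
    }
    return max(scores, key=scores.get), scores


# Toy reference set with made-up sequences, for illustration only.
reference = {
    "family_A": ["MKKLLAAAKKLLAA", "MKKLAAAKKLLLAA"],
    "family_B": ["GSGSPPGSGSPPGS", "GSPPGSGSGSPPGG"],
}
best, scores = closest_family("MKKLLAAKKKLLAA", reference)
```

Here the query shares its residues only with `family_A`, so that family scores highest. A trained classifier would replace this direct similarity ranking in a real pipeline.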

The Random Forest model showed the best performance, with an accuracy of 98.79% and an F1 score of 0.988.
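
For reference, accuracy and F1 can be computed from true and predicted labels as in the sketch below. The abstract does not say how the F1 score was averaged across the virus families, so macro averaging is an assumption here, and the labels are toy values rather than results from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for three hypothetical classes.
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

With these toy labels, one of six predictions is wrong, and the per-class F1 scores are 0.8, 0.8, and 1.0.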
Keywords Machine Learning, Genome Predictor, FASTA sequence, Drug Repurposing
Field Computer > Artificial Intelligence / Simulation / Virtual Reality
Published In Volume 16, Issue 4, October-December 2025
Published On 2025-12-31
DOI https://doi.org/10.71097/IJSAT.v16.i4.10034
Short DOI https://doi.org/hbjmqd
