Drug discovery is expensive and time-consuming. Machine learning can drastically reduce time and cost, and allow collaboration and innovation.
Every year, the hepatitis C virus affects more than 200 million people across the world. The infection is a major cause of chronic liver disease, including chronic hepatitis, liver cirrhosis and liver cancer.
Unlike hepatitis A and hepatitis B, there are no approved vaccines against hepatitis C.
Though anti-viral drugs for hepatitis C have revolutionised treatment with high success rates, there’s another class of drugs that may be highly effective, particularly in cases of drug resistance or even disease prevention.
These drugs are based on peptides, short chains of amino acids that target viral components with higher specificity and offer better solubility and lower toxicity. They can also work against every genotype of the virus and are easier to synthesise than conventional small-molecule drugs.
The trouble, however, lies in finding peptides that will work against a specific virus. The number of possible peptides is enormous but very few of them act against any given virus, which makes the search akin to finding a needle in a haystack.
Recent research has shown that machine learning is an efficient way to search for these molecules.
Though machine learning has been used to predict general antimicrobial and antiviral peptides, only a handful of studies have concentrated on those that target specific viruses such as hepatitis C.
Proteins and peptides have garnered considerable attention from researchers worldwide for their potential use in diagnostics, therapeutics and drug delivery systems.
Insulin, for instance, is a peptide with 51 amino acids, and its discovery and development are considered among the most significant advances in drug discovery. More recently, the weight-loss drug semaglutide, a peptide with 31 amino acids, has been hailed as a game-changer for patients with Type 2 diabetes, obesity and heart disease.
Chemically, proteins and peptides are the same kind of molecule. A peptide chain is made up of amino acid residues, the same building blocks that constitute proteins and enable cells to perform various functions.
Given that 22 amino acids are found in nature, a short peptide chain can be assembled in an enormous number of ways. For example, a chain of 10 amino acid residues can take 22¹⁰, or roughly 26.6 trillion, distinct forms.
However, very few of these peptides have therapeutic effects, and even fewer can act against a specific virus.
Besides, the number of FDA-approved bioactive peptides — those with therapeutic properties — is quite limited. Hence, identifying those that exhibit antimicrobial and antiviral properties among the vast possibilities is a herculean task.
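To put the size of that search space in numbers, here is a minimal sketch that counts the possible chains for a few lengths. It assumes, as the text does, that any of 22 amino acids can sit at each position; the function name and the chosen lengths are illustrative only.

```python
# Number of distinct peptide chains of a given length, assuming each
# position can hold any one of 22 naturally occurring amino acids.
AMINO_ACIDS = 22


def sequence_space(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Count the distinct chains of `length` residues (alphabet ** length)."""
    return alphabet ** length


for length in (5, 10, 15):
    print(f"{length:>2} residues: {sequence_space(length):,} possible chains")
```

For a 10-residue chain this gives 26,559,922,791,424 possibilities, and every extra residue multiplies the total by another 22.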
Here, machine learning has come to the aid of researchers.
Traditional methods for identifying effective anti-hepatitis C peptides — or any drug molecule for that matter — are time-consuming and resource-intensive.
Some estimates suggest that traditional methods may require synthesising up to 5,000 molecules over four to six years to find one promising candidate for drug development.
Machine learning models, however, can bring down the screening time drastically. They can screen billions of candidate molecules, with researchers able to process a billion molecules each day. This makes the drug discovery process, particularly the early stages of drug development, quicker and more cost-efficient.
One such model that researchers have developed recently is Pred-AHCP (predict anti-hepatitis C peptides), a web-based predictive tool that evaluates whether a peptide can effectively inhibit the hepatitis C virus.
The model does this by analysing the peptide's amino acid composition and physico-chemical properties.
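As a rough illustration of what such inputs can look like, the sketch below turns a peptide sequence into two simple kinds of numbers: the fraction of each amino acid, and a crude net charge. This is not Pred-AHCP's code. The function names, the example sequence and the choice of charge as the physico-chemical property are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (an assumption: the two
# rarer ones, selenocysteine and pyrrolysine, are left out for simplicity).
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in the peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide sequence must not be empty")
    counts = Counter(peptide)
    # Letters outside the standard 20 are simply not reported.
    return {aa: counts[aa] / len(peptide) for aa in STANDARD_AMINO_ACIDS}


def approximate_net_charge(peptide: str) -> int:
    """Crude net charge near neutral pH: +1 per K or R, -1 per D or E.

    Histidine and the chain ends are ignored, so this is only a rough proxy.
    """
    peptide = peptide.upper()
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


example = "ACDKKLLRE"  # arbitrary sequence chosen only to exercise the code
composition = amino_acid_composition(example)
print({aa: round(frac, 3) for aa, frac in composition.items() if frac > 0})
print("approximate net charge:", approximate_net_charge(example))
```

Each peptide becomes a fixed-length list of numbers in this way, which is the form a machine learning model needs regardless of how long the original chain was.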
The method employs a two-step computational filtering process that relies on statistical algorithms. This approach is particularly useful because not only does it predict whether a peptide is likely to be anti-hepatitis C, but it also explains the reasons for its effectiveness by highlighting the molecule’s most significant molecular characteristics.
Hence, rather than just predicting candidates, researchers can understand the underlying mechanisms that might make them work.
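The two filtering steps are not spelled out here, so the following is only a generic sketch of what a two-stage statistical feature filter can look like, not Pred-AHCP's actual procedure. As an assumption, step one drops features that barely vary across peptides, and step two ranks the survivors by how strongly their average differs between active and inactive peptides, using Welch's t-statistic. The toy data, the threshold and the function names are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(active: list[float], inactive: list[float]) -> float:
    """Welch's t-statistic for the difference between two group means."""
    standard_error = sqrt(variance(active) / len(active)
                          + variance(inactive) / len(inactive))
    if standard_error == 0:
        return 0.0  # no spread in either group, so no usable score
    return (mean(active) - mean(inactive)) / standard_error


def two_step_filter(features, labels, min_variance=1e-4, keep=2):
    """Step 1: drop near-constant features. Step 2: keep the top |t| scores.

    `features` maps a feature name to one value per peptide;
    `labels` holds 1 (active) or 0 (inactive) for each peptide.
    """
    # Step 1: remove features with almost no variation across peptides.
    varying = {name: values for name, values in features.items()
               if variance(values) >= min_variance}

    # Step 2: score each surviving feature by how well it separates classes.
    scores = {}
    for name, values in varying.items():
        active = [v for v, y in zip(values, labels) if y == 1]
        inactive = [v for v, y in zip(values, labels) if y == 0]
        scores[name] = welch_t(active, inactive)

    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return [(name, scores[name]) for name in ranked[:keep]]


# Toy, made-up feature table: six peptides, three active and three inactive.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "fraction_K":     [0.30, 0.25, 0.28, 0.05, 0.10, 0.08],
    "fraction_G":     [0.10, 0.10, 0.10, 0.10, 0.10, 0.10],  # constant
    "hydrophobicity": [0.9, 1.1, 1.0, 0.8, 1.2, 1.0],
}

for name, t in two_step_filter(features, labels, keep=1):
    print(f"{name}: t = {t:+.2f}")
```

The surviving scores double as a simple explanation: a feature with a large score is one whose values differ clearly between peptides that work and peptides that do not.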
Beyond treating hepatitis C, this approach can be adapted to develop similar predictive tools to look for peptides against other viruses, thus creating a family of virus-specific prediction tools.
This would be particularly valuable for viruses that currently lack effective treatments, such as HIV, herpes simplex virus and Zika.
The findings from the development of Pred-AHCP can facilitate more efficient discovery of lead peptide candidates than generic antiviral peptide prediction methods.
This is in large measure due to the explainable nature of the machine learning model, which gives synthetic biochemists, organic chemists and bioengineers insights into which molecular features might make particular peptides effective against hepatitis C.
Such an understanding can guide the rational design of new therapeutics by emphasising essential characteristics such as the distribution of hydrophobicity — a crucial physico-chemical property, often used in drug design, toxicology and environmental monitoring — the existence of particular amino acid pairs, and other structural elements.
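Two of the features named above can be computed directly from a sequence. The sketch below, which is illustrative rather than the model's own code, profiles hydrophobicity along a peptide with a sliding window over the widely used Kyte-Doolittle scale and counts adjacent amino acid pairs. The window size, the function names and the example sequence are assumptions.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(peptide: str, window: int = 3) -> list[float]:
    """Mean hydropathy of each run of `window` consecutive residues."""
    values = [KYTE_DOOLITTLE[aa] for aa in peptide.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def pair_counts(peptide: str) -> Counter:
    """Count each adjacent amino acid pair (dipeptide) in the sequence."""
    peptide = peptide.upper()
    return Counter(peptide[i:i + 2] for i in range(len(peptide) - 1))


example = "LLKKLLKK"  # arbitrary sequence chosen only to exercise the code
print([round(value, 2) for value in hydrophobicity_profile(example)])
print(pair_counts(example).most_common(3))
```

A profile that swings between positive and negative values shows hydrophobic and water-loving stretches alternating along the chain, which is the kind of distribution a chemist could then try to adjust by swapping residues.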
One could, for instance, experiment with modifications guided by the model to maximise the effects of significant features, potentially creating peptides with greater efficacy, sustained virological response, and even improved pharmacological properties such as stability, permeability (how easily they diffuse across biological membranes), and bioavailability (the fraction of the drug that is absorbed into the bloodstream and remains active).
In addition, the model is available as a web server, making it accessible to researchers without specialised computational expertise.
Such democratisation of advanced computational tools by sharing therapeutic approaches has the potential to enhance collaboration and innovation in antiviral research globally for various infectious diseases.
Originally published under Creative Commons by 360info™.
Akash Saraswat is a senior research scholar at BML Munjal University, Gurugram, Haryana.
Bipin Singh is Assistant Professor, Centre for Life Sciences, Mahindra University, Hyderabad, Telangana.
Arijit Maitra is Associate Professor, School of Engineering and Technology, BML Munjal University, Gurugram, Haryana.