USC researchers have developed an artificial intelligence model called Deep Predictor of Binding Specificity (DeepPBS), described in a paper published in the journal Nature Methods. The model predicts how a diverse range of proteins bind to DNA, an advance that could aid the development of new drugs and medical treatments.
DeepPBS is a geometric deep learning model designed to predict protein-DNA binding specificity from the structures of protein-DNA complexes. Scientists and researchers can input the structure of a protein-DNA complex into an online computational platform to obtain a prediction.
“Structures of protein–DNA complexes contain proteins that are usually bound to a single DNA sequence. For understanding gene regulation, it is important to have access to the binding specificity of a protein to any DNA sequence or region of the genome,” said Remo Rohs, professor and founding chair in the Department of Quantitative and Computational Biology at the USC Dornsife College of Letters, Arts and Sciences.
“DeepPBS is an AI tool that replaces the need for high-throughput sequencing or structural biology experiments to reveal protein–DNA binding specificity.”
Geometric deep learning is a form of machine learning that analyzes data using geometric structures. DeepPBS applies it to the chemical properties and geometric contexts of protein-DNA interactions in order to predict binding specificity.
Using this data, DeepPBS generates spatial graphs that represent the protein structure and the relationship between protein and DNA. Unlike many existing methods, which are limited to specific protein families, DeepPBS can predict binding specificity across diverse protein families.
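To give a rough sense of what a spatial graph of a protein-DNA interface can look like, here is a minimal sketch in Python. It is not the DeepPBS implementation: the function name `build_contact_graph`, the 5 Å distance cutoff, and the toy coordinates are all assumptions made for illustration. The sketch simply connects each protein atom to every DNA atom that lies within the cutoff distance.

```python
import numpy as np


def build_contact_graph(protein_xyz, dna_xyz, cutoff=5.0):
    """Return (protein_index, dna_index, distance) edges for atom pairs within `cutoff`.

    Hypothetical helper for illustration only; the cutoff value is an assumption,
    not a parameter taken from DeepPBS.
    """
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    dna_xyz = np.asarray(dna_xyz, dtype=float)

    # Pairwise Euclidean distances, shape (n_protein_atoms, n_dna_atoms).
    diff = protein_xyz[:, None, :] - dna_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Keep only the pairs that are close enough to count as contacts.
    p_idx, d_idx = np.nonzero(dist <= cutoff)
    return [(int(p), int(d), float(dist[p, d])) for p, d in zip(p_idx, d_idx)]


# Toy coordinates in angstroms (made up, not from a real structure).
protein = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
dna = [[3.0, 0.0, 0.0], [10.0, 0.0, 2.0], [50.0, 50.0, 50.0]]

edges = build_contact_graph(protein, dna)
for p, d, r in edges:
    print(f"protein atom {p} <-> DNA atom {d}: {r:.1f} A")
```

In a real geometric deep learning model, the nodes and edges of such a graph would also carry chemical features, and the graph would be passed to a neural network rather than printed.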
“It is important for researchers to have a method available that works universally for all proteins and is not restricted to a well-studied protein family. This approach allows us also to design new proteins,” Rohs said.
Protein structure prediction has advanced rapidly thanks to DeepMind’s AlphaFold, which can predict protein structure from sequence. These advances have produced an abundance of structural data available for analysis. Combined with structure prediction methods, DeepPBS can predict binding specificity for proteins that have no experimentally determined structure.
Rohs emphasized the wide-ranging applications of DeepPBS. The tool has the potential to speed the development of new drugs and treatments for specific mutations in cancer cells. It could also lead to new discoveries in synthetic biology and find applications in RNA research.
Journal reference:
- Raktim Mitra, Jinsen Li, Jared M. Sagendorf, Yibei Jiang, Ari S. Cohen, Tsu-Pei Chiu, Cameron J. Glasscock & Remo Rohs. Geometric deep learning of protein–DNA binding specificity. Nature Methods, 2024; DOI: 10.1038/s41592-024-02372-w