PINNACLE AI model improves protein analysis in real-world context


A fish on land still moves its fins, but in water the results are significantly different. The analogy, attributed to renowned computer scientist Alan Kay, is intended to illustrate the power of context in elucidating questions under investigation.

A tool called PINNACLE applies Kay's insight to proteins, for the first time in the field of artificial intelligence (AI): it analyzes the behavior of proteins in their proper context, which is determined by the tissues and cells in which those proteins function and interact. In particular, PINNACLE overcomes some of the limitations of current AI models, which tend to analyze protein function and dysfunction in isolation, one cell and tissue type at a time.

The development of the new AI model, described in Nature Methods, was led by researchers at Harvard Medical School.

“The natural world is interconnected, and PINNACLE helps identify these connections that we can use to gain more detailed insights into proteins and safer, more effective drugs,” said study lead author Marinka Zitnik, assistant professor of biomedical informatics at HMS’s Blavatnik Institute. “It overcomes the limitations of current, context-free models and suggests the future direction for improving the analysis of protein interactions.”

This advance, the researchers say, could deepen current understanding of the role of proteins in health and disease and reveal new drug targets that will enable the development of more precise and tailored therapies.

PINNACLE is available free of charge to scientists everywhere.

A big step forward

Deciphering the interactions between proteins and the effects of their biological neighbors is difficult. Current analytical tools serve an important purpose by providing information about the structural properties and shapes of individual proteins. However, these tools are not designed to account for the contextual nuances of the entire protein environment. Instead, they produce context-free protein representations, meaning they lack contextual information about cell type and tissue type.

However, proteins play different roles in the different cellular and tissue contexts in which they are found, and those roles also depend on whether the tissue or cell is healthy or diseased. Single-protein representation models cannot identify protein functions that vary across contexts.

Protein behavior depends on location

Composed of twenty different amino acids, proteins are the building blocks of cells and tissues and are essential for a range of life-sustaining biological functions – from transporting oxygen throughout the body to contracting muscles for breathing and walking to enabling digestion and fighting off infection, to name a few.

Scientists estimate that there are between 20,000 and hundreds of thousands of proteins in the human body.

Proteins interact with each other, but also with other molecules such as DNA and RNA. The complex interplay among proteins creates nested networks of protein interactions. These networks exist within and between cells and take part in many complex interactions with other proteins and protein networks.

The advantage of PINNACLE is its ability to recognize that the behavior of proteins can vary depending on the cell and tissue type. The same protein can have a different function in a healthy lung cell than in a healthy kidney cell or a diseased colon cell.

PINNACLE sheds light on how these cells and tissues affect the same proteins differently, which is not possible with current models. Depending on the specific cell type a protein network is in, PINNACLE can determine which proteins participate in certain conversations and which remain silent. This helps PINNACLE better decipher how proteins communicate and behave, and ultimately allows it to predict narrowly tailored drug targets for disease-causing proteins.
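The idea of proteins "participating" or "remaining silent" depending on cell type can be illustrated with a small sketch. This is not PINNACLE's method, which is a trained AI model; it is only a toy picture of what a context-specific view of a protein network means. All protein names, cell type labels, and interactions below are made-up placeholders, and the function names are hypothetical.

```python
# Toy, made-up data: which proteins are active in which cell type.
# These names are placeholders, not real biology.
ACTIVE_PROTEINS = {
    "lung_cell": {"protein_A", "protein_B"},
    "kidney_cell": {"protein_B", "protein_C"},
    "colon_cell_diseased": {"protein_A", "protein_C"},
}

# A context-free interaction network: it lists pairs of proteins that can
# interact, with no information about where they do so.
INTERACTIONS = [
    ("protein_A", "protein_B"),
    ("protein_B", "protein_C"),
    ("protein_A", "protein_C"),
]


def contextual_network(cell_type):
    """Keep only interactions whose two proteins are both active in cell_type."""
    active = ACTIVE_PROTEINS[cell_type]
    return [(a, b) for a, b in INTERACTIONS if a in active and b in active]


def contexts_for(protein):
    """List the cell types in which a protein is active, in sorted order."""
    return sorted(ct for ct, active in ACTIVE_PROTEINS.items() if protein in active)


if __name__ == "__main__":
    for cell_type in ACTIVE_PROTEINS:
        print(cell_type, contextual_network(cell_type))
    print(contexts_for("protein_A"))
```

In this toy setup the same protein appears in different sub-networks depending on the cell type, which is the kind of variation a single context-free representation cannot express.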

According to the researchers, PINNACLE does not make single-representation models obsolete, but rather complements them because it can analyze protein interactions in different cellular contexts.

PINNACLE could thus enable researchers to better understand and predict protein function and help elucidate vital cellular processes and disease mechanisms.

This ability can help identify “druggable” proteins that can serve as targets for individual drugs and predict the effects of different drugs on different cell types. PINNACLE could therefore become a valuable tool for scientists and drug developers to identify potential targets much more efficiently.

Such optimization of the drug discovery process is urgently needed, said Zitnik, who is also an associate professor at Harvard University’s Kempner Institute for the Study of Natural and Artificial Intelligence.

It can take 10 to 15 years and cost up to a billion dollars to bring a new drug to market. The path from discovery to drug is notoriously bumpy and the end result often unpredictable. In fact, nearly 90 percent of drug candidates do not become drugs.

Building and training PINNACLE

Using human cell data from a comprehensive multi-organ atlas, combined with multiple networks of protein-protein interactions, cell type-to-cell type interactions, and tissues, the researchers trained PINNACLE to create panoramic graphical representations of proteins spanning 156 cell types and 62 tissues and organs.

PINNACLE has generated nearly 395,000 multidimensional representations to date, compared to about 22,000 possible representations for current single-protein models. Each of its 156 cell types contains context-rich protein interaction networks with about 2,500 proteins.
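The figures in this paragraph are roughly consistent with one another, which a quick back-of-the-envelope calculation makes visible. The sketch below uses only the rounded numbers quoted in the article; the variable names are illustrative, and the "about 2,500 proteins" figure is an approximation, so the product is an estimate rather than an exact count.

```python
# Rounded figures as quoted in the article.
CELL_TYPES = 156
PROTEINS_PER_CELL_TYPE = 2_500    # "about 2,500", an approximation
REPORTED_CONTEXTUAL = 395_000     # "nearly 395,000" representations
REPORTED_CONTEXT_FREE = 22_000    # "about 22,000" for single-protein models

# One representation per protein per cell type gives a rough estimate.
estimated_contextual = CELL_TYPES * PROTEINS_PER_CELL_TYPE

# How many times more representations the contextual approach yields.
ratio = REPORTED_CONTEXTUAL / REPORTED_CONTEXT_FREE

print(estimated_contextual)  # 390000
print(round(ratio, 1))       # 18.0
```

The estimate of 390,000 sits close to the reported total of nearly 395,000, and the contextual approach yields roughly 18 times as many representations as a context-free model.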

The current number of cell types, tissues and organs does not represent the upper limit of the model. The cell types studied so far come from living human donors and cover most, but not all, cell types in the human body. In addition, many cell types have not yet been identified, while others are rare or difficult to study, such as neurons in the brain.

To expand PINNACLE’s cellular repertoire, Zitnik plans to use a data platform that includes tens of millions of cells taken from throughout the human body.
