Proteins are the workhorses that keep our cells running, and our cells contain many thousands of types of proteins, each performing a specialized function. Researchers have long known that the structure of a protein determines what it can do. More recently, researchers are coming to appreciate that a protein’s localization is also critical for its function. Cells are full of compartments that help organize their many denizens. Along with the well-known organelles that adorn the pages of biology textbooks, these spaces also include a variety of dynamic, membrane-less compartments that concentrate certain molecules together to perform shared functions. Knowing where a given protein localizes, and with whom it co-localizes, can therefore be useful for better understanding that protein and its role in the healthy or diseased cell, but researchers have lacked a systematic way to predict this information.
Meanwhile, protein structure has been studied for more than half a century, culminating in the artificial intelligence tool AlphaFold, which can predict protein structure from a protein’s amino acid code, the linear string of building blocks within it that folds to create its structure. AlphaFold and models like it have become widely used tools in research.
Proteins also contain regions of amino acids that do not fold into a fixed structure, but are instead important for helping proteins join dynamic compartments in the cell. MIT Professor Richard Young and colleagues wondered whether the code in those regions could be used to predict protein localization, in the same way that other regions are used to predict structure. Other researchers have discovered some protein sequences that code for protein localization, and some have begun developing predictive models for protein localization. However, researchers did not know whether a protein’s localization to any dynamic compartment could be predicted based on its sequence, nor did they have a tool comparable to AlphaFold for predicting localization.
Now, Young, also a member of the Whitehead Institute for Biological Research; Young lab postdoc Henry Kilgore; Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in MIT’s Department of Electrical Engineering and Computer Science and a principal investigator in the Computer Science and Artificial Intelligence Laboratory (CSAIL); and colleagues have built such a model, which they call ProtGPS. In a paper published on Feb. 6 in the journal Science, with first authors Kilgore and Barzilay lab graduate students Itamar Chinn, Peter Mikhael, and Ilan Mitnikov, the cross-disciplinary team debuts their model. The researchers show that ProtGPS can predict to which of 12 known types of compartments a protein will localize, as well as whether a disease-associated mutation will change that localization. Additionally, the research team developed a generative algorithm that can design novel proteins to localize to specific compartments.
“My hope is that this is a first step towards a powerful platform that enables people studying proteins to do their research,” Young says, “and that it helps us understand how humans develop into the complex organisms that they are, how mutations disrupt those natural processes, and how to generate therapeutic hypotheses and design drugs to treat dysfunction in a cell.”
The researchers also validated many of the model’s predictions with experimental tests in cells.
“It really excited me to be able to go from computational design all the way to trying these things in the lab,” Barzilay says. “There are a lot of exciting papers in this area of AI, but 99.9 percent of those never get tested in real systems. Thanks to our collaboration with the Young lab, we were able to test, and really learn how well our algorithm is doing.”
Developing the model
The researchers trained and tested ProtGPS on two batches of proteins with known localizations. They found that it could predict where proteins end up with high accuracy. The researchers also tested how well ProtGPS could predict changes in protein localization caused by disease-associated mutations within a protein. Many mutations (changes to the sequence of a gene and its corresponding protein) have been found through association studies to contribute to or cause disease, but the ways in which the mutations give rise to disease symptoms remain unknown.
Figuring out the mechanism by which a mutation contributes to disease is important because researchers can then develop therapies that fix that mechanism, preventing or treating the disease. Young and colleagues suspected that many disease-associated mutations might contribute to disease by changing protein localization. For example, a mutation could make a protein unable to join a compartment containing essential partners.
They tested this hypothesis by feeding ProtGPS more than 200,000 proteins with disease-associated mutations, and then asking it both to predict where those mutated proteins would localize and to measure how much its prediction changed for a given protein from the normal to the mutated version. A large shift in the prediction indicates a likely change in localization.
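The comparison between a normal and a mutated protein can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `prediction_shift`, the three compartment labels, the invented scores, and the choice to measure the shift as the largest absolute change in a per-compartment score. It does not call ProtGPS and is not the scoring method the researchers used.

```python
from typing import Dict, Tuple


def prediction_shift(normal: Dict[str, float],
                     mutated: Dict[str, float]) -> Tuple[str, float]:
    """Return the compartment whose score changed most, and the signed change.

    Hypothetical helper: both inputs map compartment names to predicted
    scores for the normal and the mutated version of the same protein.
    """
    if normal.keys() != mutated.keys():
        raise ValueError("both predictions must cover the same compartments")
    # Signed change per compartment: positive means the mutation raised the score.
    changes = {name: mutated[name] - normal[name] for name in normal}
    # Pick the compartment with the largest change in either direction.
    top = max(changes, key=lambda name: abs(changes[name]))
    return top, changes[top]


# Invented scores for illustration only; not output from ProtGPS.
normal_scores = {"nucleolus": 0.80, "nuclear speckle": 0.10, "stress granule": 0.05}
mutated_scores = {"nucleolus": 0.30, "nuclear speckle": 0.15, "stress granule": 0.40}

compartment, change = prediction_shift(normal_scores, mutated_scores)
print(compartment, round(change, 2))
```

In this toy case the mutation lowers the nucleolus score far more than it moves any other score, which is the kind of large shift that would flag a protein for follow-up in cells.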
The researchers found many cases in which a disease-associated mutation appeared to change a protein’s localization. They tested 20 examples in cells, using fluorescence to compare where in the cell a normal protein and its mutated version ended up. The experiments confirmed ProtGPS’s predictions. Altogether, the findings support the researchers’ suspicion that mis-localization may be an underappreciated mechanism of disease, and demonstrate the value of ProtGPS as a tool for understanding disease and identifying new therapeutic avenues.
“The cell is such a complicated system, with so many components and complex networks of interactions,” Mitnikov says. “It’s super interesting to think that with this approach, we can perturb the system, see the outcome of that, and so drive discovery of mechanisms in the cell, or even develop therapeutics based on that.”
The researchers hope that others begin using ProtGPS in the same way that they use predictive structural models like AlphaFold, advancing various projects on protein function, dysfunction, and disease.
Moving beyond prediction to novel generation
The researchers were excited about the possible uses of their prediction model, but they also wanted the model to go beyond predicting the localization of existing proteins and allow them to design completely new ones. The goal was for the model to make up entirely new amino acid sequences that, when formed in a cell, would localize to a desired location. Generating a novel protein that can actually accomplish a function (in this case, localizing to a specific cellular compartment) is incredibly difficult. To improve their model’s chances of success, the researchers constrained their algorithm to design only proteins like those found in nature. This is an approach commonly used in drug design, for logical reasons; nature has had billions of years to figure out which protein sequences work well and which do not.
Because of the collaboration with the Young lab, the machine learning team was able to test whether their protein generator worked. The model had good results. In one round, it generated 10 proteins intended to localize to the nucleolus. When the researchers tested these proteins in the cell, they found that four of them strongly localized to the nucleolus, and others may have had slight biases toward that location as well.
“The collaboration between our labs has been so generative for all of us,” Mikhael says. “We’ve learned how to speak each other’s languages, in our case learned a lot about how cells work, and by having the chance to experimentally test our model, we’ve been able to figure out what we need to do to actually make the model work, and then make it work better.”
Being able to generate functional proteins in this way could improve researchers’ ability to develop therapies. For example, if a drug must interact with a target that localizes within a certain compartment, researchers could use this model to design a drug that also localizes there. This should make the drug more effective and reduce side effects, since the drug would spend more time engaging with its target and less time interacting with other molecules, which causes off-target effects.
The machine learning team members are enthused about the prospect of using what they have learned from this collaboration to design novel proteins with other functions beyond localization, which would expand the possibilities for therapeutic design and other applications.
“A lot of papers show they can design a protein that can be expressed in a cell, but not that the protein has a particular function,” Chinn says. “We actually had functional protein design, and a relatively huge success rate compared to other generative models. That’s really exciting to us, and something we would like to build on.”
All of the researchers involved see ProtGPS as an exciting beginning. They anticipate that their tool will be used to learn more about the roles of localization in protein function and of mis-localization in disease. In addition, they are interested in expanding the model’s localization predictions to include more types of compartments, testing more therapeutic hypotheses, and designing increasingly functional proteins for therapies or other applications.
“Now that we know that this protein code for localization exists, and that machine learning models can make sense of that code and even create functional proteins using its logic, that opens up the door for so many potential studies and applications,” Kilgore says.