Deciphering protein size and structure with Eddie

Insights into the phenomenon of co-translational assembly

Mihaly is a PhD student at the Marsh Lab at the Institute of Genetics and Cancer. In his research, he examines when and how proteins are built, especially the phenomenon of co-translational assembly. Co-translational assembly refers to the idea that protein subunits assemble with each other whilst they are being made in the cell, rather than some time after they have been made.

Mihaly used structural protein data from a resource called the Protein Data Bank. He examined the properties of proteins that undergo co-translational assembly – their size and structure, but most importantly their “interfaces”. Proteins across most domains of life function as part of larger protein complexes with many subunits. This means that protein subunits have interfaces through which they assemble into, and interact within, these complexes.

View publication


Digital tools were crucial

When speaking about digital tools, Mihaly explains: “Everything in my research revolves around the large structural data that is generated. I increasingly rely on large datasets, and need more computational power than a laptop can lend me. Data can get quite large, and we need to store and share it with others”.

The bulk of his work was done on Eddie: Mihaly used it to map genomic coordinates to the protein level and to compute the interface areas of thousands of proteins. Because of Eddie’s very high computational power, lots of data could be generated, and jobs could be run in parallel. “I worked on Eddie’s scratch directory, which has 2 terabytes per user. Throughout the process, Eddie provided a snapshot of the data on the Protein Data Bank”.
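
To illustrate what mapping genomic coordinates to the protein level involves, here is a minimal Python sketch. It is not Mihaly's pipeline: the function name, the toy exon layout, and the assumptions of a forward-strand transcript with 1-based inclusive coordinates are all hypothetical, chosen only to show the arithmetic (genomic position, then offset within the coding sequence, then codon, then residue number).

```python
from typing import List, Optional, Tuple


def genomic_to_residue(pos: int, cds_exons: List[Tuple[int, int]]) -> Optional[int]:
    """Map a genomic position to a 1-based protein residue number.

    Assumptions (hypothetical, for illustration only): forward strand,
    1-based inclusive coordinates, and cds_exons listing only the coding
    part of each exon, sorted by start position.
    """
    offset = 0  # coding bases counted in the exons before the current one
    for start, end in cds_exons:
        if start <= pos <= end:
            cds_index = offset + (pos - start)  # 0-based index in the coding sequence
            return cds_index // 3 + 1           # three bases per codon
        offset += end - start + 1
    return None  # intronic, or outside the coding sequence


# Toy transcript with two coding exons of 6 and 9 bases (5 codons in total).
exons = [(100, 105), (200, 208)]
print(genomic_to_residue(100, exons))  # first base of the first codon
print(genomic_to_residue(200, exons))  # first base after the intron
print(genomic_to_residue(150, exons))  # intronic position
```

A reverse-strand transcript would need the exon order and the offset direction flipped; that case is left out to keep the sketch short.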

Whenever Eddie was being updated or experiencing issues, Mihaly was notified immediately and kept up to date with the repair process until Eddie was back online. He was very impressed with the professionalism and support for this tool.

Once the data was processed and cleaned up, Mihaly moved it onto DataStore. Data accumulated on Eddie’s scratch directory is deleted after one or two months, whereas DataStore offers secure, backed-up storage for active research data.

Moreover, DataStore’s group space function can be set up so that everyone in the lab can access the data. Beyond that, Mihaly was also working with researchers from Singapore through a collaboration with Singapore 10K. Singapore 10K is a project which seeks to map the complete set of DNA of 100,000 Singaporeans with the aim of gaining insights into the underlying causes of cancer and chronic diseases. DataStore made the collaboration between Mihaly and researchers in Singapore easier, as they could access data through the shared group space.



Example of a bacterial operon-encoded protein complex

Figure 5E from the eLife paper

“Essentially, it shows an example of a bacterial operon-encoded protein complex, where the interface that is made first in the sequence (N-terminal) is larger than the interface that is made last (C-terminal). We think this facilitates a process called co-translational assembly, which is when a subunit assembles with another while it is being made in the cell. Although this is just one example, we routinely compute the interface areas of thousands of complexes at residue-level on Eddie.” - Mihaly Badonyi
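
The comparison described in the quote can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the paper: the data layout (a mapping from residue number to buried surface area for each interface of one subunit), the function names, and the use of an area-weighted mean residue number to decide which interface lies nearer the N-terminus are all assumptions made here, and the numbers are made up.

```python
from typing import Dict

# Hypothetical layout: residue number -> buried surface area in square angstroms.
Interface = Dict[int, float]


def interface_area(iface: Interface) -> float:
    """Total interface area as the sum of per-residue buried areas."""
    return sum(iface.values())


def interface_position(iface: Interface) -> float:
    """Area-weighted mean residue number (an assumed proxy for where
    along the sequence the interface sits)."""
    total = interface_area(iface)
    if total <= 0:
        raise ValueError("interface has no buried area")
    return sum(res * area for res, area in iface.items()) / total


def first_interface_is_larger(a: Interface, b: Interface) -> bool:
    """True if the interface nearer the N-terminus has the larger area."""
    first, last = sorted((a, b), key=interface_position)
    return interface_area(first) > interface_area(last)


# Toy values for one subunit with two partner interfaces.
n_terminal_iface = {12: 40.0, 15: 55.5, 20: 30.0, 31: 62.5}
c_terminal_iface = {140: 25.0, 151: 35.0}

print(interface_area(n_terminal_iface), interface_area(c_terminal_iface))
print(first_interface_is_larger(n_terminal_iface, c_terminal_iface))
```

Summing per-residue values keeps the calculation at residue level, so the same table can also be used to ask which residues contribute most to each interface.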


The biggest challenges

Mihaly has a wet lab background, so he needed to learn everything from scratch, including the Unix shell and Bash scripting. He felt very well supported in this process, and explained that there were a lot of training opportunities and resources available. For instance, he took up training provided by the Bioinformatics and Statistics department and by the Institute of Genetics and Cancer (“Intro to Eddie” workshop).

Mihaly’s advice to future users: “Unless you are from a computational background, don’t miss the training sessions. The information provided in these sessions is more useful than any information you could find online. The staff running these sessions are great”.


Promising results pose questions for future research

Mihaly found that larger protein interfaces are more likely to assemble early. This brings up a chicken-and-egg question: Did large interfaces evolve to promote co-translational assembly? Or does co-translational assembly simply result in large interfaces? To approach this question, Mihaly compared certain interfaces in bacterial, yeast, and human protein complexes. The results strongly suggest that large interfaces have evolved to maximise the chance of successful co-translational assembly.

Many questions remain open for future study. For instance, it is unclear to what extent other components in the cell (e.g. ribosome-associated chaperones) choreograph the assembly steps in protein synthesis. Furthermore, why large interfaces increase the chance of successful co-translational assembly remains open, although some hypotheses exist. Finally, more research needs to be done on what co-translational assembly means for human genetic disease. This last point is something Mihaly is working on at the moment, with a preprint published. Work like this can play a role in predicting types of genetic mutations that lead to disease.


Mihaly Badonyi is a PhD student at the Marsh Lab within the Institute of Genetics and Cancer.

Twitter: @BdonyiMihly, @jmarshlab



This case study was written by Dr Sarah Janac, Research Facilitator for the College of Medicine and Veterinary Medicine.
