Katie Dubarry is in the final year of her PhD at the Roslin Institute and Scotland’s Rural College (SRUC). In her project, she focuses on gene expression of sheep’s immune cells, using techniques in molecular biology and bioinformatics.
Her aim is to find genetic variants that could be affecting immune gene expression, and her project involved several components. First, she generated and sequenced RNA from healthy sheep of different breeds. Her sample size was more than 100 animals. “This large number is unusual in sheep research”, she explains. Katie also performed single cell nuclei RNA sequencing, highlighting that, until then, “there was no data on immune cells in sheep at single cell resolution”. Finally, Katie’s work also involved expression quantitative trait loci (eQTL) analysis. She used a subset of her large dataset (around 60 sheep) with matched genotypes, looking for genomic regions that influenced the expression levels of genes. “eQTL analysis is common in medical research. However, little has been explored for sheep in this area”.
An array of University tools, including Eddie, DataStore, DataSync and GitLab, were key to her work. Due to the large sample size, and the large file sizes that come with genome sequencing, Eddie was crucial in all three types of analysis. Eddie is the University’s supercomputer, and is free at the point of use. Eddie also integrates nicely with DataStore, the University’s secure file storage system.
Using Eddie requires some prior knowledge of the Linux operating system. “Before I started my PhD, I had played around with Linux before. Moreover, in my lab group there was a bioinformatician who was very well-versed using Eddie. They taught me a lot of the basics in 6 months”. Katie supplemented her knowledge by reviewing the information on the wiki, attending training at Edinburgh Genomics and taking Data Carpentries online courses.
During her PhD, Katie needed to transfer data to a collaborator at the Royal Veterinary College. The collaborator had a Master’s student who needed some of the RNA sequencing data that Katie had generated. Katie used DataSync to send it over. DataSync is a system that integrates with DataStore, allowing you to easily send data to external collaborators. “It was reasonably straightforward; I just followed some online guidance. The file was password protected as well, which was helpful”.
Katie also started using GitLab during her PhD. She admits that she does not use every feature, as the learning curve is steep. She still knows only the basics, but she finds it useful for version control and integrates it into other workflows and her R environment. The rest of her lab use it as well, which is how she got on board in the first place.
Another practice that her lab introduced her to was using the University wiki as a sort of lab notebook. Katie and her colleagues use it to document anything, from lab experiments to computational research to conference presentations. “I particularly like the friendly interface”, Katie explains. “It helps to visualise things outside of the standard folder-subfolder system”. The lab has its wiki set to private. Katie has considered that it might be useful to add a public page highlighting key research outcomes and documentation, but this is just an early thought.
All in all, through her PhD, Katie has managed to generate unique reference datasets for sheep research. These can be mined and exploited for other sorts of analyses, e.g. disease challenge studies. “Having validated that these different techniques work well in sheep for immune tissue means that the work can be scaled up in future – especially the single cell work. I hope that my work has prepared the ground so that sheep can get some attention too.”
This case study was written by Sarah Janac, Research Facilitator for CMVM.
Twitter/X: @KatieDubarry