Door 23: Struggles, Setbacks & Solutions

I’m three semesters into a four-semester MSc, meaning that in a few short months I will be delivering a completed thesis, with detailed results and an in-depth discussion. I’ve had a timeline laid out since the very beginning and am largely on track. That isn’t to say there haven’t been a fair few setbacks along the way. An important skill I’ve learned as a researcher is to accept these setbacks and, more importantly, to work out how to overcome them.

So, for context, my thesis is about macrosynteny in Lophotrochozoa/Spiralia. What that means is, I’m looking at groups of genes found in multiple species, and comparing how they’re distributed across the entire genome. Are these groups split across multiple chromosomes, or are they all found on one? And as mentioned, I’m looking within Lophotrochozoa – a clade of protostome animals. Lophotrochozoa comprises a significant number of phyla within Metazoa, from the better-known molluscs, to the more-rarely discussed gastrotrichs (maybe).
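
To make that question concrete, here is a minimal sketch in Python of the "one chromosome or many?" check. It is only an illustration, not the pipeline used in the thesis: the gene IDs, group labels, species names and the `chromosomes_per_group` helper are all hypothetical, and a real analysis would read positions from genome annotations rather than hand-written dictionaries.

```python
from collections import defaultdict

def chromosomes_per_group(gene_to_group, gene_to_chrom):
    """Map each gene group to the set of chromosomes its genes sit on."""
    placement = defaultdict(set)
    for gene, group in gene_to_group.items():
        chrom = gene_to_chrom.get(gene)
        if chrom is not None:  # skip genes with no known position
            placement[group].add(chrom)
    return dict(placement)

# Hypothetical toy data: five genes in two conserved groups, A and B.
gene_to_group = {"g1": "A", "g2": "A", "g3": "A", "g4": "B", "g5": "B"}

# Hypothetical chromosome assignments for the same genes in two species.
species_x = {"g1": "chr1", "g2": "chr1", "g3": "chr1", "g4": "chr2", "g5": "chr2"}
species_y = {"g1": "chr1", "g2": "chr3", "g3": "chr1", "g4": "chr2", "g5": "chr2"}

for name, positions in [("species_x", species_x), ("species_y", species_y)]:
    placement = chromosomes_per_group(gene_to_group, positions)
    for group, chroms in sorted(placement.items()):
        status = "intact" if len(chroms) == 1 else "split"
        print(name, group, sorted(chroms), status)
```

In this toy case, group A sits on a single chromosome in the first species but is spread over two in the second, which is the kind of difference a macrosynteny comparison looks for.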

Now, a problem with studying genomes in this group is the lack of data available. Despite the recent influx of more advanced methodologies, many phyla here remain elusive for various reasons. A few are systemic – a lack of funding and sampling efforts for non-charismatic species, as a major example. Others are scientific – there are known difficulties that come with sampling meiofauna (which many lophotrochozoans are).

These are, naturally, struggles for me – how do you conduct comparative genomic studies on species without published genomes? Well, unfortunately, for some groups, you simply don’t. Not yet at least.

However, these are setbacks that can be healed with time. As evidenced by my fellow student Nhu Dinh’s recent thesis, pipelines to sequence these trickier species are undergoing development and refinement all the time.

Another solution is outreach – a few of the CEG advent calendars have been drawing attention to this very problem of underrepresentation (especially in smaller life, from protists to meiofauna to fungi), and they advocate for rectifying it (for example, Torsten Struck’s “Quo vadis biodiversity genomic research?”, Håkon Knudsen’s “Discovering hidden microscopic diversity in Norway” and Anna-Lotta Hiillos’s “Small creatures and studying them matters”).

Other problems are more rapidly fixable in the short term – individual setbacks, as opposed to full-on systemic roadblocks. For example, in addition to being underrepresented, many species in my groups of study lack annotations – that is, we might have a full genome, but without an annotation, we don’t know where the genes are located along the chromosomes. Annotation can be a tricky, multi-step process. However, as I’m interested in gene placement on a full-genome scale, I reasoned that precision (and the time and resources it takes to achieve that level of precision) could be sacrificed in the name of a larger dataset, using a simpler pipeline.

Or so I thought.

Turns out that the annotation pipeline I used is efficient for some of my species, but for others simply returns unusable information (for example, the Longfin Inshore Squid, as seen in this post’s banner. It is my “test” species, on which I’ve been trying out a bunch of new ideas!). After multiple rounds of troubleshooting, it was by chance that I stumbled on annotations for some of the same species I was interested in, performed by another researcher with a more robust pipeline than the one I was using. Lo and behold, when using his annotations, I got results in my downstream analyses.

This is frustrating! A long time was spent creating my annotations, and it’s disappointing to learn that something you did five steps ago in your process has negative ramifications now. But we can look at it as a negative and be upset, or we can get excited. Sure, it’s a setback and some prior work is now unusable, but this also gives me a chance to ask new questions. Why did my pipeline work on some species but not others? What’s the difference between what they did, and what I did? What are the biological features underpinning this difference?

The important thing to recognise here is that I also have a solution. I can adapt and use a similar pipeline for my other poor-performing species. I think that’s the key – to recognise that each setback also comes with the opportunity to innovate, explore, ask new questions, and develop new solutions.
