How Text-Mining Tools Can Improve Your Literature Searches

Before starting any new research project, it’s essential that you have as complete an understanding as possible of the current research literature. Knowing what other people have done will prevent you from duplicating existing work, and will perhaps indicate under-explored niches. If you work in the same subject area over a number of years, you will accrue this knowledge from your own reading and from your colleagues and supervisor. But how can you get up to speed quickly in a new area? Perhaps your 2-hybrid / ChIP / microarray experiment has suggested some possible interacting proteins you know nothing about. Or maybe you’re formulating a new hypothesis and want to find supporting evidence in the literature.

PubMed is most scientists’ first port of call for literature searches, and there are many fine tutorials on this site explaining how to get the most from this tool. However, many people don’t know that there are also several promising text-mining tools that offer more sophisticated functions, such as semantic searches, and are quite accessible to experimental biologists. These tools analyse the free text of publicly available Medline abstracts, extract relationships between the terms they find, index these relationships, and present their results almost instantly.

Here, I’ll highlight three text-mining tools developed by the UK’s National Centre for Text Mining (NaCTeM) in Manchester. All of these tools have convenient web interfaces, so it’s easy to give them a try and see if they are useful to you.

MEDIE is an intelligent search engine that easily identifies biomedical correlations. You can use this tool to search for relations between biological entities using a subject-verb-object query, for example, ‘myprotein-causes-cancer’ or ‘mygene-regulates-lipid metabolism’. A nice feature of this tool is that results are returned with snippets of text showing the sentences from which the relationships were deduced, so you can assess their relevance very quickly.
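To get a feel for what a subject-verb-object query asks for, here is a toy sketch in Python. It is not how MEDIE works internally; it simply assumes that a crude regular expression can stand in for real linguistic analysis. The function name `find_svo_sentences` and the example abstract are made up for illustration.

```python
import re

def find_svo_sentences(text, subject, verb, obj):
    """Return sentences where subject, verb and object appear in that order.

    Toy stand-in for a semantic search: it checks word order only and
    knows nothing about grammar, negation or passive constructions.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(
        rf"\b{re.escape(subject)}\b.*\b{re.escape(verb)}\w*\b.*\b{re.escape(obj)}\b",
        re.IGNORECASE,
    )
    return [s for s in sentences if pattern.search(s)]

# Made-up abstract text, for illustration only.
abstract = (
    "We show that p53 regulates apoptosis in damaged cells. "
    "Apoptosis was measured by flow cytometry. "
    "Loss of p53 did not alter lipid metabolism."
)

# The verb is given as a stem so that 'regulates' and 'regulated' both match.
hits = find_svo_sentences(abstract, "p53", "regulat", "apoptosis")
for sentence in hits:
    print(sentence)
```

Like the snippets MEDIE returns, the output is the supporting sentence itself, so you can judge its relevance at a glance. A word-order check like this would miss passive sentences such as ‘apoptosis is regulated by p53’, which is one reason a dedicated tool is worth trying.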

Kleio provides enhanced searching functionality by disambiguating alternative names and searching for all synonyms of the search term. For example, a search for ‘interleukin-1’ would also match texts containing the terms ‘IL1’ and ‘IL-1’.
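The idea of synonym expansion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SYNONYMS` table is hand-made and hypothetical, and the function names are invented here. It says nothing about how Kleio builds or stores its own dictionaries.

```python
import re

# Hypothetical, hand-made synonym table keyed by lower-case term.
SYNONYMS = {
    "interleukin-1": ["interleukin-1", "IL1", "IL-1"],
}

def expand_query(term):
    """Return every known name for a term, or just the term itself."""
    return SYNONYMS.get(term.lower(), [term])

def matches_any_synonym(term, text):
    """True if the text mentions the term under any of its known names."""
    for name in expand_query(term):
        # The lookarounds stop 'IL-1' from matching inside 'IL-10'.
        if re.search(rf"(?<![\w-]){re.escape(name)}(?![\w-])", text, re.IGNORECASE):
            return True
    return False

print(matches_any_synonym("interleukin-1", "IL-1 levels rose after infection."))
print(matches_any_synonym("interleukin-1", "IL-10 was unchanged."))
```

The second call shows why disambiguation matters as much as expansion: a plain substring search for ‘IL-1’ would wrongly pick up ‘IL-10’.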

Finally, Facta searches for pairwise associations between related concepts. If you have questions like: ‘What diseases are relevant to a particular gene?’ or ‘What chemical compounds are relevant to a particular disease?’, then this tool may be able to help you. One particularly interesting feature is Facta’s ability to search for indirect associations that would not be immediately obvious from reading individual abstracts. This tool also highlights relevant text from the abstract to provide evidence for the associations it identifies.
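One simple way to picture direct and indirect associations is to count which concepts are mentioned together in the same abstract. The sketch below assumes exactly that co-occurrence model, which may differ from the scoring Facta actually uses. The concept names and the three ‘abstracts’ are made up, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_counts(docs):
    """Count how many documents mention each pair of concepts together."""
    counts = defaultdict(int)
    for concepts in docs:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts

def direct_partners(counts, concept):
    """Concepts that share at least one document with `concept`."""
    partners = set()
    for a, b in counts:
        if a == concept:
            partners.add(b)
        elif b == concept:
            partners.add(a)
    return partners

def indirect_partners(counts, concept):
    """Concepts linked to `concept` only through a shared intermediate.

    Returns a dict mapping each indirect partner to its intermediates.
    """
    direct = direct_partners(counts, concept)
    indirect = defaultdict(set)
    for mid in direct:
        for other in direct_partners(counts, mid):
            if other != concept and other not in direct:
                indirect[other].add(mid)
    return dict(indirect)

# Made-up concept annotations for three abstracts.
docs = [
    ["geneA", "inflammation"],
    ["inflammation", "diseaseX"],
    ["geneA", "inflammation", "compoundY"],
]
counts = cooccurrence_counts(docs)

print(direct_partners(counts, "geneA"))
print(indirect_partners(counts, "geneA"))
```

Here ‘geneA’ and ‘diseaseX’ never appear in the same abstract, yet both co-occur with ‘inflammation’, so the link only shows up when the abstracts are considered together. That is the kind of connection you would not spot from reading any single abstract.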

Of course, text mining is not perfect yet – the English language is so rich and varied that an idea can be expressed in a myriad of ways, not all of which are captured by the heuristic rules of the text-mining algorithm. But these tools are so easy and fast to use that they can be added to your literature-searching repertoire today!

Have you used any of these tools to search the literature? What other tools do you recommend?
