Search Smarter with the Latest Technology

The amount of published biomedical literature has been growing exponentially for decades, and that trend is not slowing down anytime soon. With this explosion of published content, it can be overwhelming to find exactly what you are looking for.

The 21st Century Digital Age

The start of the 21st century was heralded as the “digital age”, and the growth of content shifted from a linear to an exponential growth model. There were approximately 13 million citations in PubMed at the start of the 21st century. Within the first decade that number rose to 20 million. Today there are over 36 million citations in PubMed. 

Growth of PubMed citations from 1986 to 2010
Source: Lu Z. PubMed and beyond: a survey of web tools for searching biomedical literature. Database (Oxford). 2011;2011:baq036. Published 2011 Jan 18.

Zhiyong Lu, from the National Center for Biotechnology Information (NCBI), wrote about going beyond PubMed back in 2011, and shared an initial overview of web-based tools available that work alongside or on top of PubMed to provide more search functionality to users.

From the Digital Age to the Age of Artificial Intelligence

Today, as we inch closer to the quarter-century point, digital technology has begun taking on a life of its own. With the advent of machine learning and generative artificial intelligence, technology itself can now create its own content! And while there are plenty of ethical issues surrounding the use and abuse of AI that cover nearly all aspects of life, this technology allows for considerable benefits as well.

New tools have emerged to help us better navigate, digest, and synthesize the overwhelming amount of digital information available, including biomedical literature. Many of these tools are web-based resources that either overlay or work in conjunction with PubMed to provide functionality that goes beyond basic search and retrieval.

Last month, Zhiyong Lu and several of his colleagues from NCBI published an update to his 2011 overview, titled “PubMed and beyond: biomedical literature search in the age of artificial intelligence.” This update focuses on how users’ search needs have expanded and how AI tools can provide search functionality to address these different needs.

Overview of five specialised search scenarios in biomedicine
Source: Jin Q, Leaman R, Lu Z. PubMed and beyond: biomedical literature search in the age of artificial intelligence. EBioMedicine. 2024;100:104988. 

They looked at five specific types of specialized search needs, and addressed the various tools and resources that can provide necessary functionality to support those search needs: evidence-based medicine, precision medicine, semantic searching, recommendations, and text mining.

Harness Technology with these Search Tools

Organized by these five categories of search needs, the selected resources below can help users navigate and digest the ever-expanding field of biomedical research.

Evidence-Based Medicine

PubMed Clinical Queries: This PubMed tool uses predefined filters to help you quickly refine PubMed searches on clinical or disease-specific topics.
Cochrane Clinical Answers: CCA provides readable, digestible, clinically focused, actionable point-of-care information directly from Cochrane Reviews.
Joanna Briggs Institute (JBI) EBP Database: The JBI EBP Database provides the latest research and evidence-based guidelines regarding patient care, treatment options, and interventions to empower clinicians and healthcare administrators to make informed, confident decisions.
TRIP Database: TRIP is a clinical search engine designed to allow users to quickly and easily find and use high-quality research evidence to support their practice and/or care.

Precision Medicine & Genomics

OncoSearch: OncoSearch is a text mining search engine that searches Medline abstracts for sentences describing gene expression changes in cancers.
LitVar: LitVar normalizes different forms of the same variant into a unique and standardized name so that all matching articles can be returned regardless of the specific name used in the query.
DigSee: DigSee is a text mining search engine that provides evidence sentences describing how “genes” are involved in the development of a “disease” through “biological events”. Given a query of (disease, genes, events), it retrieves Medline abstracts with the evidence sentences highlighted.

Semantic Searching

LitSense: LitSense is a unique search system for making sense of the biomedical literature at the sentence level. Given a query, LitSense finds the best-matching sentences based on overlapping terms as well as semantic similarity via a cutting-edge neural embedding approach.
AskMEDLINE: Search PubMed using free-text, natural-language queries.
BioMed Explorer: BioMed Explorer applies semantic understanding of the content of the papers to pull out answers and highlight snippets and evidence for the user.
Semantic Scholar: Semantic Scholar provides free, AI-driven search and discovery tools, and open resources for the global research community.

Literature Recommendations

LitSuggest: Advanced machine learning and information retrieval techniques are utilized for finding and ranking publications pertinent to a topic of interest.
Connected Papers: Connected Papers is a unique, visual tool to help researchers and applied scientists find and explore papers relevant to their field of work.

Text Mining

PubTator: PubTator Central (PTC) is a web-based system providing automatic annotations of biomedical concepts such as genes and mutations in PubMed abstracts and PMC full-text articles.
PubMedKB: PubMedKB combines a multitude of state-of-the-art text-mining tools optimized to automatically identify the complex relationships between biomedical entities in PubMed abstracts.

As technology evolves, so will the research environment, and it’s imperative that we are able to leverage technology to keep up. It’s important to understand these new technologies, how they work, and how they can be used to make work more efficient. It’s equally important to understand their limitations and the ethical issues that could arise when using these technologies without further human insight.

Systematic Bulk Downloading of Articles from PubMed Central (PMC)

In this era of artificial intelligence (AI) and machine learning (ML), there is increased interest in accessing large numbers of full-text articles to train deep learning models and/or evaluate their performance. The U.S. National Library of Medicine’s (NLM) PubMed Central (PMC) full-text article repository is a popular choice with AI/ML researchers, who are often looking for a free, openly accessible source of the scholarly biomedical literature. For a recent example of research carried out using the PMC Open Access Subset, see PMID: 37094464.

Although the NLM is generally accommodating of researchers using and even building upon the tools and resources that it develops and supports, it expects researchers to work within its rules and restrictions. Anyone interested in “automated retrieval of articles in machine-readable formats in PubMed Central (PMC)” is encouraged to explore the “several large datasets of journal articles and other scientific publications made available for retrieval under license terms that generally allow for more liberal redistribution and reuse than a traditional copyrighted work (e.g., Creative Commons licenses)”. However, there are “Restrictions on the Systematic Downloading of Articles”; see https://www.ncbi.nlm.nih.gov/pmc/tools/textmining/

When researchers try to bulk download a large amount of content via the regular PMC web interface on their own, PMC’s systems notice the increased activity and block the IP range(s) responsible. Such downloading violates the terms of the PMC Copyright Notice, which states that “Systematic downloading of batches of articles from the main PMC web site, in any way, is prohibited because of copyright restrictions.”

From: https://www.ncbi.nlm.nih.gov/pmc/about/copyright/:

PMC makes certain subsets of articles (i.e., the PMC Article Datasets) accessible through auxiliary services that may be used for automated retrieval and downloading. These are:

These services are the only services that may be used for this purpose. Do not use any other automated processes for downloading articles, even if you are only retrieving articles from the PMC Article Datasets (including the PMC Open Access Subset).
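
As a rough illustration of addressing a request to an auxiliary service rather than the main PMC web site, the sketch below builds (but does not send) a single-record request URL using the standard OAI-PMH protocol. It assumes that an OAI-PMH service is among PMC's approved services. The base URL, the identifier format, and the function name `build_get_record_url` are assumptions made for illustration, so check them against the PMC documentation linked above before use.

```python
from urllib.parse import urlencode

# Assumption: base URL of the PMC OAI-PMH service. Endpoints change, so
# confirm it against the current PMC documentation before relying on it.
PMC_OAI_BASE_URL = "https://www.ncbi.nlm.nih.gov/pmc/oai/oai.cgi"


def build_get_record_url(pmcid, metadata_prefix="oai_dc", base_url=PMC_OAI_BASE_URL):
    """Build an OAI-PMH GetRecord URL for one PMC article (no request is sent)."""
    # Accept "PMC1234567" or "1234567"; the identifier below uses digits only
    # (an assumed convention for PMC's OAI identifiers).
    numeric_id = pmcid.strip().upper()
    if numeric_id.startswith("PMC"):
        numeric_id = numeric_id[3:]
    if not numeric_id.isdigit():
        raise ValueError(f"Not a PMC ID: {pmcid!r}")
    params = {
        "verb": "GetRecord",  # standard OAI-PMH verb
        "identifier": f"oai:pubmedcentral.nih.gov:{numeric_id}",
        "metadataPrefix": metadata_prefix,  # "oai_dc" is the format every OAI-PMH service must support
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical PMC ID, for illustration only.
    print(build_get_record_url("PMC1234567"))
```

Keeping URL construction separate from any network call makes it easy to review exactly which service a script will contact before it runs.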

Questions? Be sure to Ask Us at the MSK Library!

ChatGPT and Fake Citations: MSK Library Edition

Since the launch of ChatGPT, an artificial intelligence chatbot developed by OpenAI, we at the MSK Library have seen an uptick in requests to track down what turn out to be fake citations for studies related to cancer research.

We decided to pick a topic we were recently asked to conduct a literature search on (survival outcomes, recurrence, and pathology characteristics of poorly differentiated thyroid carcinoma) to see how ChatGPT handled it. Below are screenshots from our conversation. 

Looks pretty good, right? We asked for the full citations. 

Voila, ChatGPT delivered! We then attempted to verify these citations. We first looked them up in databases and citation indexes like PubMed and Google Scholar. Then we checked the DOIs, or digital object identifiers. Finally, we went directly to the journals these “articles” were “published” in to see if they appeared in the same journal, issue, and volume ChatGPT cited, or if they appeared in these journals at all. These citations didn’t appear to be legitimate, so we let ChatGPT know.

ChatGPT gave the same incorrect citations again. We asked if it was fabricating this information.

Still no dice. It appeared that ChatGPT was “hallucinating.” Learn more about this phenomenon here and here

We asked ChatGPT why it was creating these fake citations, and its response was illuminating. 

Our interaction with ChatGPT isn’t surprising – it’s a large language model and not a database or citation index. ChatGPT is great for some aspects of research, but not others. Check out Duke University Libraries’ blog post ChatGPT and Fake Citations for more information. 
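
Two of the verification steps we used, the PubMed lookup and the DOI check, can be partly scripted. The sketch below is a minimal illustration and not our actual workflow: it only builds the lookup URLs and checks that a DOI is well formed. The helper names are hypothetical, the DOI pattern is a commonly used heuristic rather than a formal definition, and the example DOI and title are placeholders.

```python
import re
from urllib.parse import quote, urlencode

# Heuristic pattern for modern DOIs ("10.", a registrant code, "/", a suffix).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# NCBI E-utilities ESearch endpoint, used here to search PubMed by title.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def looks_like_doi(doi):
    """Return True if the string is shaped like a DOI (says nothing about whether it exists)."""
    return bool(DOI_PATTERN.match(doi.strip()))


def doi_resolver_url(doi):
    """Build the doi.org URL that should redirect to the cited article."""
    return "https://doi.org/" + quote(doi.strip(), safe="/")


def pubmed_title_search_url(title):
    """Build an ESearch URL that looks for the cited title in PubMed."""
    params = {"db": "pubmed", "term": f"{title}[Title]", "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder values, not a real citation.
    print(looks_like_doi("10.1000/xyz123"))
    print(doi_resolver_url("10.1000/xyz123"))
    print(pubmed_title_search_url("poorly differentiated thyroid carcinoma"))
```

A well-formed DOI proves nothing on its own. A fabricated citation can carry a DOI that fails to resolve or that resolves to an unrelated article, so the journal, volume, issue, and authors still need to be compared by a person.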

Learn more about AI by visiting our Artificial Intelligence guide. Need help finding evidence-based information? Ask Us.