Understand Open Access Publishing

Scholarly publishing has been an integral part of scientific discovery and dissemination for the past several hundred years; however, the 21st century has shifted that paradigm. When there were only a handful of scientific journals, the process of publishing and dissemination was slow yet consistent, but it was also heavily gated, with male academics in their ivory towers deciding what was and was not important enough to be published.

The dawn of the internet spurred a massive shift in how publishing fundamentally worked. Many of the major scientific journal publishers began to steadily increase their prices, and at the same time new journals began publishing exclusively online, which decreased their overhead costs of print, and began experimenting with the idea of open access. In turn, traditional peer-reviewed journals continued to increase their subscription rates, often making them prohibitively expensive for low-income countries, as well as small institutions and libraries. 

In the past quarter-century the number of scientific journals has exploded, and in turn the number of articles published has grown exponentially. While this has opened the door for many more researchers to get their research published and has significantly increased the speed and depth of scientific discovery and dissemination, the shift has left many in the scientific community unsure how best to proceed in this age of open access, and unfortunately some are taken advantage of in their desire to publish.

The Scholarly Publishing and Academic Resources Coalition (SPARC) has written a very good overview of why open access (OA) matters and how it works. The initiative also extends to textbooks and research data sets.

The Open Access Initiative

Open access is a publishing and distribution model that makes scholarly research literature—much of which is funded by taxpayers around the world—freely available to the public online, without restrictions. The Open Access Initiative formally began on February 14, 2002, after a conference in Budapest in December 2001 led to a public statement regarding the principles of open access to research literature. This became known as the Budapest Open Access Initiative.

On April 11, 2003, a group of researchers convened at the Howard Hughes Medical Institute with the goal of improving access to scholarly literature by working out the logistics of how this material would be made available. The group formulated a definition of an open access journal as:

one that grants free, irrevocable, worldwide, perpetual right of access to, and a license to copy, use, distribute, transmit, and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship.

The public statement released after this meeting became known as the Bethesda Statement on Open Access Publishing.

In October 2003 another group convened during a conference in Berlin to establish an international statement on open access and the availability of information and knowledge to the entire international scientific community. The resulting statement, the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities, was published on October 22, 2003.

At a 2005 follow-up conference, the declaration was refined to two key principles: signatories should require researchers to deposit a copy of their work in an open access repository and encourage the publication of work in open access journals when available. Today these two concepts are often called “Green OA” and “Gold OA” respectively, and the two combined are referred to as an open-access mandate.

These three statements (Budapest, Bethesda, and Berlin) are collectively known as the “BBB definition.” From them, the open access movement took shape and grew into the open access publishing environment that we have today.

NIH Public Access Policy

The NIH Public Access Policy was implemented in 2008 to advance science and improve human health by providing free online access to full-text, peer-reviewed journal articles arising from taxpayer-funded research.

The National Institutes of Health (NIH) requires every scientist who receives an NIH research grant and publishes the results in a peer-reviewed journal to deposit a digital copy of the article in its digital archive, PubMed Central (PMC). In turn, the NIH will make these articles freely available within one year of publication.

While the NIH Public Access Policy essentially makes articles freely available to the public even when they are published in non-open access journals, it applies only to articles funded with US-taxpayer money (any NIH grant, including P30 Core Grants). It has also allowed an embargo of up to 12 months before an article must be made available. That will be changing, as a policy change announced in 2022 will require any research supported by federal grant funds to be available immediately upon publication. This policy change will go into effect no later than December 31, 2025.

How Do Open Access Journals Work?

Open access journals are journals whose articles are freely available worldwide without restrictions or embargo using a Creative Commons License granted by the authors. The primary difference between open access journals and closed access journals is how they are funded.

  • Closed access journals (traditional subscription-based journals such as NEJM, JAMA, etc.) are funded by their readers through individual and institutional subscriptions, as well as pay-per-view. Usually this means libraries are forced to foot the cost of ever-increasing subscriptions to continue to get access. 
  • Open access journals are freely accessible to readers, but without subscription fees they need to find funding and make money through other means. The primary means of funding open access journals is author-paid article processing charges (APCs). Other journals may be funded through institutions, professional societies, and consortia and rely on volunteers to publish.
  • Hybrid journals are closed access journals that offer authors the option to make their work freely available after paying the journal a hefty fee. This hybrid model can be seen as “double dipping” by the publishers who are making money off of both their readers (subscriptions) and authors (processing fees), and is seen as controversial and in disagreement with the open access philosophy. 

Types of Open Access

Gold Open Access

Gold OA is the type of open access that is directly connected to “open access journals”.

  • The final publisher version is open access via the journal website without any embargo period.
  • The publication has a license intended to maximize reuse, such as CC-BY.
  • The author(s) may be required to pay an article processing charge (APC) to the publisher.

Green Open Access

Green OA is the type of open access that harnesses self-archiving by the author to provide freely available access.

  • Authors can archive their paper in a full-text journal or subject repository or in their institutional repository.
  • The version archived is usually the final author version as accepted for publication.
  • Preprints fall under Green OA.
  • There are no additional charges (APCs) to be paid.
  • A publisher embargo period may apply (usually 12 months).

Platinum/Diamond OA

  • Journals that publish OA but do not charge the authors APCs
  • Usually funded by institutions, advertising, philanthropy, etc.

Bronze OA

  • Journal free to read online but doesn’t have a license
  • Not generally sharable or reusable

Black OA

  • Illegal open access
  • Pirated versions of articles

Open Access Resources

MSK Library Information Guides

MSK Documentation for NIH Policy

Open Access Resources

NIH Public Access Resources

What’s NOT: More About the Boolean Operator “NOT”

Boolean operators (AND, OR, NOT) are tools for combining search terms and are an inherent part of online database searching. While experienced searchers will use Boolean operators directly in their search strategies, even novice searchers who just enter a string of terms into a database’s search box will end up indirectly using the Boolean operator AND. The database treats each space between words as AND, combining the terms into a search strategy that retrieves only results in which all terms are present.

Image Source: https://sru.libguides.com/english/librarybasics/booleanoperators

Most search strategies will either use just AND or a combination of both AND and OR. The third Boolean operator, NOT, is much more complicated and requires some understanding to use properly in a search.
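The implicit AND described above can be pictured with a short Python sketch. The function name is made up for illustration, and real databases may also apply their own term mapping or phrase handling before combining terms, so this is only a simplified picture of the default behavior.

```python
def implicit_and(search_box_text):
    """Join space-separated terms with AND, as many databases do by default.

    Hypothetical helper for illustration only.
    """
    return " AND ".join(search_box_text.split())


# Three words typed into a search box become one AND-combined strategy.
print(implicit_and("cancer children chemotherapy"))
```

Typing three words therefore behaves like a strategy that requires all three terms to appear in each result.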

Using the Boolean Operator NOT

The Boolean operator NOT can be used when one or more terms need to be excluded from your search strategy.

For example, if you were interested in articles that looked at children with cancer, but you did not want articles that looked specifically at infants, you could create a search strategy like this:

cancer AND child* NOT infant*
— or —
(cancer AND child*) NOT infant*

The Problem with NOT

Using the Boolean operator NOT to exclude terms becomes problematic because the database excludes every record that contains the excluded term(s), including records that also contain the term(s) you want in your search.

In the above example, the search will exclude not only articles about cancer in infants but also any articles that discuss cancer in both children and infants.
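This effect can be illustrated with a small Python sketch that treats each search term as the set of records mentioning it. The record IDs and their contents are invented for illustration, and real databases match terms against indexed fields rather than simple sets, so this is a simplified model and not how any particular database is implemented.

```python
# Hypothetical records: ID -> concepts mentioned (illustrative data only).
records = {
    1: {"cancer", "child"},            # children only
    2: {"cancer", "infant"},           # infants only
    3: {"cancer", "child", "infant"},  # both children and infants
    4: {"diabetes", "child"},          # off-topic
}


def matching(term):
    """Return the IDs of records that mention the given term."""
    return {rid for rid, terms in records.items() if term in terms}


# (cancer AND child) NOT infant, expressed as set operations.
wanted = matching("cancer") & matching("child")
results = wanted - matching("infant")

# Relevant records that NOT removed along with the infant-only material.
lost = wanted & matching("infant")

print("Retrieved:", sorted(results))
print("Relevant but excluded:", sorted(lost))
```

In this toy data set, record 3 discusses children as well as infants, yet it is dropped because NOT removes every record that contains the excluded term.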

Information professionals (librarians and informationists) advise using the Boolean operator NOT with extreme caution when conducting searches. It’s better to reach out to an information professional for assistance with complex search techniques and how to best proceed with a search when there is a term you want to avoid.

Variations Across Databases

Not all databases function the same way, and using the Boolean operator NOT is no different. While most databases allow for using simply NOT to exclude terms, depending on the database or platform, you might need to use the operator AND NOT instead (Scopus), or once the search is performed use the Exclude button found within the Refine Search panel (also in Scopus).
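As a rough sketch of this syntax difference, the hypothetical helper below rewrites a bare NOT into the AND NOT form described above for Scopus. The function name is invented for illustration, it only recognizes uppercase operators, and it does not account for NOT appearing inside quoted phrases, so any converted strategy should still be checked against the database's own help pages.

```python
import re


def to_and_not(query):
    """Rewrite each bare NOT as AND NOT, leaving existing AND NOT untouched.

    Hypothetical helper for illustration; not part of any database's tooling.
    """
    # Match NOT as a whole word unless it is already preceded by "AND ".
    return re.sub(r"(?<!AND )\bNOT\b", "AND NOT", query)


print(to_and_not("(cancer AND child*) NOT infant*"))
print(to_and_not("cancer AND child* AND NOT infant*"))
```

The first call yields the AND NOT form of the earlier example, while the second query is returned unchanged because it already uses AND NOT.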

Takeaway

The Boolean operator NOT should be used with extreme caution. It is best to consult a librarian about its use in your search.

Search Smarter with the Latest Technology

The amount of published biomedical literature has been growing exponentially for decades, and that trend is not slowing down anytime soon. With this explosion of published content, it can be overwhelming to find exactly what you are looking for.

The 21st Century Digital Age

The start of the 21st century was heralded as the “digital age”, and the growth of content shifted from a linear to an exponential growth model. There were approximately 13 million citations in PubMed at the start of the 21st century. Within the first decade that number rose to 20 million. Today there are over 36 million citations in PubMed. 

Growth of PubMed citations from 1986 to 2010
Source: Lu Z. PubMed and beyond: a survey of web tools for searching biomedical literature. Database (Oxford). 2011;2011:baq036. Published 2011 Jan 18.

Zhiyong Lu, from the National Center for Biotechnology Information (NCBI), wrote about going beyond PubMed back in 2011, and shared an initial overview of web-based tools available that work alongside or on top of PubMed to provide more search functionality to users.

From the Digital Age to the Age of Artificial Intelligence

Today, as we approach the quarter-century point, digital technology has begun taking on a life of its own. With the advent of machine learning and generative artificial intelligence, technology can now create its own content. While there are plenty of ethical issues surrounding the use and abuse of AI that touch nearly all aspects of life, this technology offers considerable benefits as well.

New tools have emerged to help us better navigate, digest, and synthesize the overwhelming amount of digital information available, including biomedical literature. Many of these tools are web-based resources that either overlay or work in conjunction with PubMed to provide functionality that goes beyond basic search and retrieval.

Last month, Zhiyong Lu and several of his colleagues from NCBI published an update to his 2011 overview: “PubMed and beyond: biomedical literature search in the age of artificial intelligence.” This update focused on how user search needs have expanded and how AI tools can provide search functionality to address these different needs.

Overview of five specialised search scenarios in biomedicine
Source: Jin Q, Leaman R, Lu Z. PubMed and beyond: biomedical literature search in the age of artificial intelligence. EBioMedicine. 2024;100:104988. 

They looked at five specific types of specialized search needs, and addressed the various tools and resources that can provide necessary functionality to support those search needs: evidence-based medicine, precision medicine, semantic searching, recommendations, and text mining.

Harness Technology with these Search Tools

Organized by these five categories of search needs, the selected resources below can assist users in navigating and digesting the ever-expanding field of biomedical research.

Evidence-Based Medicine

PubMed Clinical Queries: This PubMed tool uses predefined filters to help you quickly refine PubMed searches on clinical or disease-specific topics.
Cochrane Clinical Answers: CCA provides readable, digestible, clinically focused actionable point-of-care information directly from Cochrane Reviews.
Joanna Briggs Institute (JBI) EBP Database: The JBI EBP Database provides the latest research and evidence-based guidelines regarding patient care, treatment options, and interventions to empower clinicians and healthcare administrators to make informed, confident decisions.
TRIP Database: TRIP is a clinical search engine designed to allow users to quickly and easily find and use high-quality research evidence to support their practice and/or care.

Precision Medicine & Genomics

OncoSearch: OncoSearch is a text mining search engine that searches Medline abstracts for sentences describing gene expression changes in cancers.
LitVar: LitVar normalizes different forms of the same variant into a unique and standardized name so that all matching articles can be returned regardless of the specific name used in the query.
DigSee: DigSee is a text mining search engine that provides evidence sentences describing how “genes” are involved in the development of “disease” through “biological events”. A query of (disease, genes, events) retrieves Medline abstracts with the evidence sentences highlighted.

Semantic Searching

LitSense: LitSense is a unique search system for making sense of the biomedical literature at the sentence level. Given a query, LitSense finds the best-matching sentences based on overlapping terms as well as semantic similarity via a cutting-edge neural embedding approach.
AskMEDLINE: Search PubMed using free-text and natural language.
BioMed Explorer: BioMed Explorer applies semantic understanding of the content of the papers to pull out answers and highlight snippets and evidence for the user.
Semantic Scholar: Semantic Scholar provides free, AI-driven search and discovery tools, and open resources for the global research community.

Literature Recommendations

LitSuggest: Advanced machine learning and information retrieval techniques are utilized for finding and ranking publications pertinent to a topic of interest.
Connected Papers: Connected Papers is a unique, visual tool to help researchers and applied scientists find and explore papers relevant to their field of work.

Text Mining

PubTator: PubTator Central (PTC) is a web-based system providing automatic annotations of biomedical concepts such as genes and mutations in PubMed abstracts and PMC full-text articles.
PubMedKB: PubMedKB combines a multitude of state-of-the-art text-mining tools optimized to automatically identify the complex relationships between biomedical entities in PubMed abstracts.

As technology evolves, so will the research environment, and it’s imperative that we are able to leverage technology to keep up. It’s important to understand these new technologies, how they work, and how they can be used to make work more efficient. It’s equally important to understand their limitations and the ethical issues that could arise when they are used without further human insight.