Preprints: Latest News & Useful Tips

International Open Access Week is a good time to revisit preprints and their growing role in the biomedical scholarly communication landscape. Although researchers in fields like physics have embraced preprints for decades, only in the last few years have researchers and funding agencies in the biomedical sciences begun to take preprint servers seriously.

How are preprint servers the same as or different from open access (OA) journals?

The most important difference is that manuscripts posted to a preprint server have not undergone peer review, whereas OA articles published in reputable scholarly journals will have gone through a rigorous peer-review process before publication. As such, it may be worth taking extra precautions before citing research that appears only as a preprint, for example by checking that it has not been listed in the Retraction Watch database.

That said, most publishers allow manuscripts previously posted as preprints to be submitted to their journals for eventual publication as peer-reviewed articles. There is even a Preprint/Journal Manuscript matcher tool that uses automated text matching to help authors who have posted to the bioRxiv or medRxiv preprint servers identify good journal contenders for their manuscript.

Although both OA journal articles and preprints are freely available for readers to view and download, posting to a preprint server is free for authors, whereas most OA journals charge an Article Processing Charge (APC) or publication fee. Researchers can search the Directory of Open Access Journals (DOAJ) for OA journal APC information.

Both document formats are now also accepted as works that can be cited in NIH awards reporting. The NIH has in fact expanded their working definition of publication to better accommodate “Interim Research Products” like preprints. As per NIH guidance:

Publication: A “Publication” includes (a) published research results in any manuscript that is peer-reviewed and accepted by a journal or (b) a complete and public draft of a scientific document (commonly referred to as preprint).

It is also important to note that most preprint servers assign a DOI (digital object identifier) to the preprint manuscript that is different from the DOI that may eventually be assigned to the final published article. As such, the two versions can and should be treated as separate “citable” items that can both be included in a researcher’s author profile(s) and CV.

Select preprints have also begun to be indexed in PMC and PubMed, initially as part of a pilot project for COVID-19 research, but “NLM will expand the pilot to include preprints resulting from the broader spectrum of NIH-supported research as curation and ingest workflows are refined, automated, and made scalable.”

Lastly, ORCiD has also added features and functionality to accommodate preprint citation information in its author profiles. A preprint work type category has been added, as well as the ability for preprint servers that are ORCiD members to transfer citation information into author profiles. Furthermore, linkages within ORCiD can be created once the published article citation information related to that preprint becomes available.

To learn more about preprints, be sure to check out NLM’s new self-paced tutorial on Preprints or Ask Us at the MSK Library!

Journal/Manuscript Matching Tools

As the number of new journal titles steadily increases from year to year, choosing the most appropriate publication outlet for a manuscript is becoming more complex than ever. Add to that the very real threat posed by predatory publishers who are actively trying to deceive unsuspecting authors – not to mention the multiple open access options available to choose from – and you have a journal selection process severely in need of some support.

Luckily, automation has come to the rescue! In this case, via the development of a category of journal selection/finder tools often referred to as Journal/Manuscript Matching Tools.

What are Journal/Manuscript Matching tools?

Journal/Manuscript Matchers are innovative tools that suggest journals to consider for a manuscript, based on a text-matching search using the title and abstract provided by the author. This software is similar to plagiarism detection software in that it compares strings of text. However, the two kinds of tools likely apply very different cutoff values for variables like percent similarity. Journal finders also don’t run the text comparison against other full-text articles, but rather against bibliographic literature database records (i.e., other titles and abstracts).

The Journal/Manuscript matching tools work under the assumption that journals that have published papers on a similar topic before may be more likely to be interested in publishing on this topic again in the future. By searching for published papers that are “similar” (as per text word match based on the title and abstract only) to an author’s manuscript, these tools are able to suggest potential journal homes where “similar” papers have been accepted before for publication. Generally, these tools are helpful in identifying appropriate journals in terms of scope and target audience.
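The idea behind this kind of matching can be sketched in a few lines of Python. This is only a toy illustration, not the method used by JANE or any other tool named here: the journal names, the sample records, the stopword list, and the helper functions are all hypothetical, and word-count cosine similarity simply stands in for whatever scoring a real tool applies. Each record's title-and-abstract text is compared with the manuscript text, and the similarity scores are added up per journal.

```python
import math
import re
from collections import Counter, defaultdict

# Small illustrative stopword list (an assumption, not taken from any real tool).
STOPWORDS = {"a", "an", "and", "in", "of", "the", "for", "to", "with", "on"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_journals(manuscript_text, records, top_n=3):
    """Rank journals by the summed similarity of their records to the manuscript."""
    query = Counter(tokenize(manuscript_text))
    scores = defaultdict(float)
    for journal, title_and_abstract in records:
        scores[journal] += cosine_similarity(query, Counter(tokenize(title_and_abstract)))
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical database records: (journal name, title + abstract text).
RECORDS = [
    ("Journal of Oncology Examples",
     "Immunotherapy response in melanoma patients treated with checkpoint inhibitors"),
    ("Journal of Oncology Examples",
     "Checkpoint inhibitors and tumor immunotherapy outcomes"),
    ("Hypothetical Cardiology Letters",
     "Blood pressure control and heart failure outcomes"),
]

suggestions = suggest_journals("Checkpoint inhibitor immunotherapy for melanoma", RECORDS)
for journal, score in suggestions:
    print(f"{journal}: {score:.2f}")
```

Even in this toy version, the journal whose past records share the most vocabulary with the manuscript rises to the top, which is also why a recent special issue can skew the suggestions.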

A limitation, however, is that past journal editor behavior does not always predict future journal editor decisions. (For example, a journal that is broader in scope may have recently published a special issue on a particular topic or subject area that the editors are not planning on revisiting any time in the very near future.)

What are some examples of available Journal/Manuscript Matching Tools?

Which Journal/Manuscript Matcher should you use?

Depending on the discipline and subject area whose audience you are targeting, one of these options may offer more relevant suggestions than the others. For example, JANE (the Journal/Author Name Estimator) conducts the text-matching search against the PubMed database and is best for targeting biomedical journal titles. If you are targeting an engineering journal, you may be better off exploring the suggestions provided by the IEEE publication recommender, or more multidisciplinary journal matchers like Elsevier’s or Clarivate’s journal finder tools.

The oldest of these tools, JANE, was developed in 2007 by a Biosemantics research group at the University Medical Center Rotterdam’s Department of Medical Informatics (Netherlands). It is the least biased towards a particular publisher’s journal titles, and it offers a fairly current snapshot of publication trends because the PubMed data that it runs the comparisons against is updated monthly.

To get a better understanding of what you get out of these journal selection tools in general and to learn more about JANE, see this recent review:

Curry CL. Journal/Author Name Estimator (JANE). J Med Libr Assoc. 2019 Jan;107(1):122–4. doi: 10.5195/jmla.2019.598. Epub 2019 Jan 1. PMCID: PMC6300233.

To submit a request for assistance with the journal selection process, be sure to Ask Us at the MSK Library!

Checking to What Extent PubMed and MEDLINE Index a Journal

Understanding the extent to which a particular database indexes the contents of a journal is a crucial step towards maximizing the visibility and reach of your published work(s).

Although social media and other marketing channels have definitely helped with getting the word out about new research in scholarly publishing, the reach of bibliographic indexes in terms of providing access to content beyond an individual author’s personal and professional networks is still very significant.

There are a few factors that impact the visibility and reach of a literature database:

1) Public access versus commercial databases

Content that is not stored behind a paywall is accessible to everyone, regardless of how well funded their institution is or whether they are affiliated with a research library, and so has the potential to reach a wide range of audiences across the globe. In the case of PubMed, for example: “On an average working day approximately 2.5 million users from around the world access PubMed to perform about 3 million searches and 9 million page views.”

2) Syndicated/leased/shared versus proprietary content

“Syndicated content is content that is published on multiple sites beyond the source, which broadens its reach and visibility.” There are some databases, like MEDLINE, whose content is leased to other database vendors and can be searched (in whole or in part) in other resources. For example, MEDLINE content is included in EMBASE, CINAHL, and Cochrane Library. Having a journal indexed in both MEDLINE and PubMed, therefore, increases the potential for the contents of that journal to be discovered by searchers of databases beyond NLM’s free PubMed search interface. For this reason, it is helpful to understand the difference between PubMed and MEDLINE and how each of these resources is put together.

3) “Surface” versus “deep” web indexing by search engines

Another important question to ask of an online database is: Do regular web search engines, like Google, “see” the contents of this database? In the case of the vast majority of database resources on the Internet, the standard World Wide Web search engines generally stop at the front door of the database tool and do not index the actual contents within the database. In the case of PubMed, however, Google actually “crawls” the records contained within the database, increasing their findability by Google Scholar searchers who may never search the PubMed database via its native interface.

From Vine R. Google Scholar. J Med Libr Assoc. 2006 Jan;94(1):97–9. PMC1324783:

“Much of Google Scholar’s index derives from a crawl of full-text journal content provided by both commercial and open source publishers. Specialized bibliographic databases like OCLC’s Open WorldCat and the National Library of Medicine’s PubMed are also crawled. Since 2003, Google has entered into numerous individual agreements with publishers to index full-text content not otherwise accessible via the open Web. Although Google does not divulge the number or names of publishers that have entered into crawling or indexing agreements with the company, it is easy to see why publishers would be eager to boost their content’s visibility through a powerhouse like Google.”

In short, selecting a journal to publish in that is indexed in PubMed as well as in MEDLINE gives your manuscript a good head start towards achieving maximum international reach and visibility.

Follow these steps to determine whether a journal is indexed in PubMed alone, in both PubMed and MEDLINE, or in neither:

Click on the “Journals” link of the PubMed homepage or go directly to the NLM Catalog to search for Journals referenced in the NCBI Databases. Once you bring up a catalog record for a journal of interest, click on the title to open the full record where you can confirm a journal’s MEDLINE “Current Indexing Status”.
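For readers who prefer to script this lookup, the sketch below builds a search URL for the NLM Catalog through NCBI's E-utilities `esearch` endpoint. It is a minimal illustration resting on assumptions that should be checked against the NLM Catalog help pages before use: that an ISSN can be searched with the `[ISSN]` field tag, and that the `currentlyindexed[All]` filter limits results to journals currently indexed for MEDLINE. The function name and the placeholder ISSN are hypothetical. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def nlm_catalog_query_url(issn, medline_only=False):
    """Build an esearch URL that looks up a journal in the NLM Catalog by ISSN.

    Assumptions to verify in the NLM Catalog help: the [ISSN] field tag and
    the currentlyindexed[All] filter for current MEDLINE indexing.
    """
    term = f"{issn}[ISSN]"
    if medline_only:
        term += " AND currentlyindexed[All]"
    params = {"db": "nlmcatalog", "term": term, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


# "1234-5678" is a placeholder ISSN, not a real journal.
url = nlm_catalog_query_url("1234-5678", medline_only=True)
print(url)
```

If the unfiltered search finds the journal but the filtered one returns nothing, that would suggest the journal is not currently indexed for MEDLINE; the “Current Indexing Status” field in the full catalog record remains the authoritative place to confirm it.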

Below are examples of the indexing status information provided in NLM Catalog records:

Questions?  Ask Us at the MSK Library!