Preprints: Latest News & Useful Tips

International Open Access Week is a good time to revisit preprints and their growing role in the biomedical scholarly communication landscape. Although researchers in fields like physics have embraced preprints for decades, only in the last few years have researchers (and funding agencies) in the biomedical sciences begun to take preprint servers seriously.

How are preprint servers similar to or different from open access (OA) journals?

The most important difference is that manuscripts posted to a preprint server have not been vetted through peer review, whereas OA articles published in reputable scholarly journals will have gone through a rigorous peer-review process before publication. As such, it may be worth taking extra precautions before citing research that appears only as a preprint, for example by checking that it has not been listed in the Retraction Watch database.

That said, most publishers allow manuscripts previously posted as preprints to be submitted to their journals for eventual publication as peer-reviewed articles. There is even a Preprint/Journal Manuscript matcher tool that uses automated text matching to help authors who have posted to the bioRxiv or medRxiv preprint servers identify good candidate journals for their manuscript.

Although both OA journal articles and preprints are freely available for readers to view and download, posting to a preprint server is free for authors, whereas many OA journals charge an Article Processing Charge (APC) or publication fee. Researchers can search the Directory of Open Access Journals (DOAJ) for OA journal APC information.

Both document formats are now also accepted as works that can be cited in NIH award reporting. The NIH has in fact expanded its working definition of publication to better accommodate “Interim Research Products” like preprints. As per NIH guidance:

Publication: A “Publication” includes (a) published research results in any manuscript that is peer-reviewed and accepted by a journal or (b) a complete and public draft of a scientific document (commonly referred to as preprint).

It is also important to note that most preprint servers will assign a DOI (digital object identifier) to the preprint manuscript that is different from the DOI that may eventually be assigned to the final published article. As such, the two versions can and should be treated as separate “citable” items that can both be included in a researcher’s author profile(s) and CV.

Select preprints have also begun to be indexed in PMC and PubMed, initially as part of a pilot project for COVID-19 research, but “NLM will expand the pilot to include preprints resulting from the broader spectrum of NIH-supported research as curation and ingest workflows are refined, automated, and made scalable.”

Lastly, ORCID has also added features and functionality to accommodate preprint citation information in its author profiles. A preprint work type category has been added, as well as the ability for preprint servers that are ORCID members to transfer citation information into author profiles. Furthermore, links can be created within ORCID once the citation information for the published article related to that preprint becomes available.

To learn more about preprints, be sure to check out NLM’s new self-paced tutorial on Preprints or Ask Us at the MSK Library!

Be in the Know: Access The New York Times

Did you know that the MSK Library provides access to current and back issues of the New York Times? There is also no longer any need to register or log in to the NYT to access content (previously our subscription required you to register for a NYT account and log in). We have upgraded our NYT subscription to work with OpenAthens, so simply click and authenticate and you are set to read, whether on campus, at home, on VPN, or on your mobile device!

Explore even more of our resources now accessible remotely through OpenAthens, such as UpToDate! We are constantly making new databases accessible through this easy-to-use software.

NOTE: OpenAthens is only available to MSK staff through their MSK login credentials.

Citing Code via GitHub

As we were taught in school, whenever someone quotes, paraphrases, summarizes, or otherwise references another scholar’s research, they must properly attribute that research with a citation in their work. This same rule applies to code!

Citing code is not only required as part of the publication process; its value also includes:

  • contributing to ethical and transparent science,
  • recognizing the contributions of programmers to a research project,
  • tracking reuse of code over time, and
  • reinforcing the value of non-traditional bibliographic research outputs (like code, datasets, and software).

Code can be challenging to cite because the traditional bibliographic elements are not always readily apparent. Often the only citation information in a code repo has to be gleaned from a README.md file or from the original publication that references that code, if such a publication exists.

If you are maintaining your code in GitHub, you have a few options to encourage proper citation by self-identifying contributors and citation elements.

DOI for Code. In 2016, GitHub partnered with Zenodo, the CERN-operated open-access research repository, to mint Digital Object Identifiers (DOIs) for archived repos. A DOI is a persistent identifier registered in an internationally recognized database, which gives your code (or data) an unambiguous, permanent link. DOIs are a great first step in ensuring that the correct version of code is clearly identified with proper attribution.

To take advantage of this, create a free account with Zenodo and be prepared to archive a specific version of your code. Read more information on how to set up the webhooks between your repos and Zenodo!
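To make the "permanent link" idea concrete, here is a minimal Python sketch of how a DOI string maps to a resolvable link through the doi.org resolver. The helper name doi_to_url and the example DOI are hypothetical placeholders for illustration; substitute the DOI that Zenodo actually mints for your archived release.

```python
from urllib.parse import quote

# Public DOI resolver; a DOI appended to this address redirects to the record.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a DOI (hypothetical helper).

    Accepts a bare DOI, a "doi:" prefixed DOI, or a full doi.org URL.
    """
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(cleaned, safe="/")


# Placeholder DOI for illustration only; it does not point to a real record.
example_doi = "10.5281/zenodo.0000000"
print(doi_to_url(example_doi))
```

Because the link is built from the DOI rather than from the repository address, it should keep working in a citation even if the repo is later renamed or moved.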

Citation Support for Code. Recently (August 2021), GitHub announced enhanced support for citation, adding the ruby-cff RubyGem to its codebase to support .cff citation files. Adding a CITATION.cff file to one’s GitHub repository lets the owner identify attribution elements, and automatically generates a simple ‘Cite this repository’ button in the repo with APA and BibTeX citation formatting.

Some of the elements a repo owner can include are:

  • code author names,
  • author ORCID iDs,
  • preferred software name,
  • DOI, and
  • other info related to date and version.

In particular, ORCID iDs and DOIs have value as disambiguation elements, which ensure that credit is correctly attributed. Read more information on how to set up citation support in GitHub, or on the schema elements for .cff files!
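As a rough illustration of the elements listed above, the following Python sketch assembles the text of a CITATION.cff file. The key names (cff-version, message, title, authors, family-names, given-names, orcid, version, doi, date-released) follow the Citation File Format 1.2.0 schema as we understand it; the function name build_citation_cff and every value shown (author, ORCID iD, DOI, date) are made-up placeholders, so check the schema documentation before relying on this.

```python
import json


def yaml_str(value: str) -> str:
    # A JSON string literal is also a valid YAML double-quoted scalar,
    # so json.dumps gives safe quoting without a third-party YAML library.
    return json.dumps(value)


def build_citation_cff(title, authors, version, doi, date_released):
    """Return CITATION.cff text (hypothetical helper, CFF 1.2.0 keys assumed)."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {yaml_str(title)}",
        "authors:",
    ]
    for author in authors:
        lines.append(f"  - family-names: {yaml_str(author['family'])}")
        lines.append(f"    given-names: {yaml_str(author['given'])}")
        if author.get("orcid"):
            # ORCID iDs are written as full URLs in the file.
            orcid_url = "https://orcid.org/" + author["orcid"]
            lines.append(f"    orcid: {yaml_str(orcid_url)}")
    lines += [
        f"version: {yaml_str(version)}",
        f"doi: {yaml_str(doi)}",
        f"date-released: {yaml_str(date_released)}",
    ]
    return "\n".join(lines) + "\n"


# All values below are placeholders for illustration only.
cff_text = build_citation_cff(
    title="Example Analysis Toolkit",
    authors=[{"family": "Doe", "given": "Jane", "orcid": "0000-0000-0000-0000"}],
    version="1.0.0",
    doi="10.5281/zenodo.0000000",
    date_released="2021-10-25",
)
print(cff_text)
```

Saving the printed text as CITATION.cff in the root of the repository's default branch is what should prompt GitHub to show the ‘Cite this repository’ button.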

If you need help understanding how to set this up or want to discuss how you can get and/or give proper citation to code, data, or software, please reach out to Anthony Dellureficio, Associate Librarian, Research Data Management.