Citing Code via GitHub

As we were taught in school, whenever someone quotes, paraphrases, summarizes, or otherwise references another scholar’s research, they must properly attribute that research with a citation in their work. This same rule applies to code!

Citing code is not only required as part of the publication process; its value also includes:

  • contributing to ethical and transparent science,
  • recognizing the contributions of programmers to a research project,
  • tracking reuse of code over time, and
  • reinforcing the value of non-traditional bibliographic research outputs (like code, datasets, and software).

Code can be challenging to cite because the traditional bibliographic elements are not always readily apparent. Often the only citation information for a code repo has to be gleaned from a file in the repo or from the original publication that references the code, if such a publication exists.

If you maintain your code on GitHub, you have a few options for encouraging proper citation by identifying the contributors and citation elements yourself.

DOI for Code. In 2014, GitHub partnered with Zenodo, the CERN-operated open-source data repository, to mint Digital Object Identifiers (DOIs) for archived repos. A DOI is a persistent identifier registered in an internationally recognized database, which gives your code (or data) an unambiguous, permanent link. DOIs are a great first step toward ensuring that the correct version of code is clearly identified and properly attributed.

To take advantage of this, create a free account with Zenodo and be prepared to archive a specific version of your code. For details, see the documentation on how to generate the webhooks between your repos and Zenodo.
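Because a DOI is a persistent identifier, it can be turned into a resolvable link by prefixing it with the DOI resolver address. The following is a minimal sketch in Python, not part of any GitHub or Zenodo tooling; the function name `doi_to_url` and the sample DOI are hypothetical placeholders.

```python
def doi_to_url(doi: str) -> str:
    """Return the https://doi.org resolver link for a DOI string.

    Accepts a bare DOI, a "doi:" prefixed DOI, or an existing resolver URL.
    """
    cleaned = doi.strip()
    # Strip common prefixes so only the bare DOI remains.
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Every DOI starts with "10." and has a "/" between prefix and suffix.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"Not a recognizable DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Placeholder DOI for illustration only; substitute the one Zenodo mints for you.
example_url = doi_to_url("doi:10.5281/zenodo.1234567")
print(example_url)
```

A link built this way keeps pointing at the archived version even if the repository itself is later renamed or moved.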

Citation Support for Code. In August 2021, GitHub announced enhanced support for citation, adding the ruby-cff RubyGem to its code to incorporate .cff citation files. Adding a CITATION.cff file to a GitHub repository lets the owner identify attribution elements and automatically generates a simple ‘Cite this repository’ button in the repo with APA and BibTeX citation formatting.

Some of the elements a repo owner can include are:

  • code author names,
  • author ORCID iDs,
  • preferred software name,
  • DOI, and
  • other info related to date and version.
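To make the elements listed above concrete, here is a minimal sketch in Python that writes them out as the text of a CITATION.cff file. The key names follow the Citation File Format 1.2.0 schema as we understand it; the helper `build_citation_cff`, the `Author` class, and every value in the example (names, ORCID iD, DOI, version, date) are hypothetical placeholders, so check the schema before relying on it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Author:
    family_names: str
    given_names: str
    orcid: str  # bare ORCID iD, e.g. "0000-0000-0000-0000"


def build_citation_cff(title: str, authors: List[Author], version: str,
                       date_released: str, doi: str) -> str:
    """Return the text of a minimal CITATION.cff file (assumes CFF 1.2.0 keys)."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    for author in authors:
        lines.append(f'  - family-names: "{author.family_names}"')
        lines.append(f'    given-names: "{author.given_names}"')
        lines.append(f'    orcid: "https://orcid.org/{author.orcid}"')
    lines.append(f'version: "{version}"')
    lines.append(f'doi: "{doi}"')
    lines.append(f'date-released: "{date_released}"')
    return "\n".join(lines) + "\n"


# All values below are placeholders for illustration.
example = build_citation_cff(
    title="My Research Software",
    authors=[Author("Doe", "Jane", "0000-0000-0000-0000")],
    version="1.0.0",
    date_released="2021-08-11",
    doi="10.5281/zenodo.1234567",
)
print(example)
```

Saving the returned text as a file named CITATION.cff in the root of the repository is what GitHub looks for when it builds the ‘Cite this repository’ button.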

In particular, ORCID iDs and DOIs have value as disambiguation elements: they ensure that credit goes to the right person and the right piece of work. For details, see the documentation on how to set up citation support in GitHub or the schema elements for .cff files.

If you need help understanding how to set this up or want to discuss how you can get and/or give proper citation to code, data, or software, please reach out to Anthony Dellureficio, Associate Librarian, Research Data Management.