Explore the New Research Data Management Webpage

Curious about what services the Library offers to help you find, manage, and share your research data? We now have a webpage dedicated to providing information about the Library’s Research Data Management services and resources. Learn about our Data Catalog, get help understanding publisher data sharing requirements, or schedule a consultation to discuss best practices. We’ll be updating this page regularly as we bring more services online. 

We’re here to help you throughout the life of your research, from planning to publication and beyond.

RDM and COVID-19 (Part 4): Data Repositories and Publishers

This is the final post in a four-part series on Research Data Management and COVID-19. Click here for part 1, part 2, and part 3.

Generalist data repositories and publishers have created some impressive collaborative, open-access resources to support COVID-19 research. Even some traditionally for-pay companies have been contributing to the effort. Many of these repositories allow researchers to contribute their own data and also provide platforms that enhance the discoverability of datasets, code, and ongoing projects. In late April, NIH hosted a webinar in which representatives from several generalist repositories spoke about their efforts to promote sharing, discovery, and citation guidance for COVID-19 data and code. The recorded webinar is available to the public: NIH/NLM Sharing, Discovering, and Citing COVID-19 Data and Code in Generalist Repositories Webinar. Here are a few of the COVID-19 resources advanced by generalist repositories:

  • Figshare: As a discipline-agnostic data repository, Figshare provides a free platform (after creating a login) for researchers to store their datasets regardless of the data format. In response to COVID-19, they have created a subset of open datasets based on tags applied by data submitters and curated by Figshare staff. In addition to traditional numerical and quantity-based datasets, they encourage submission of code, presentation slides, conference posters, and other formats of information that might not typically be viewed as ‘data’, particularly since many conferences and meetings were canceled in light of the global pandemic.
  • Mendeley: This data repository, provided by the publisher Elsevier, has also created a COVID-19 hub. They index numerous repositories, curating an aggregate of datasets including those deposited in their own repository. Access to the Elsevier Coronavirus Research Hub is free with registration to individual researchers wishing to use Mendeley to organize their own research. Alternatively, COVID-19 datasets indexed in Mendeley are freely accessible to the public without an account. You can search or view them here.
  • Zenodo: Part of the CERN Against COVID-19 project, Zenodo has allocated additional dedicated server space for COVID-19 research and strengthened its overall storage infrastructure in anticipation of researchers’ expanded computing needs. Their main ‘Coronavirus Disease Research Community’ page contains a simple but powerful search interface to discover information about COVID-19 projects and datasets. The database is powered by Elasticsearch and populated in part by the OpenAIRE harvesting gateway to bring in records for datasets from across the European research community. Zenodo has also added a librarian to their staff specifically dedicated to curating COVID-19 datasets.
  • GitHub: The open code repository and versioning tool GitHub has aggregated public COVID-19 projects to aid in discoverability and reduce collaboration barriers. Based on tags added by contributors to their own repos, GitHub runs a weekly ingest that adds projects to its covid-19-repo-data repository. The README page of this repository contains additional documentation about how to view, search, and use code and data in this collection.
  • Dryad: Although Dryad, another public platform for storing and sharing research data, does not have a dedicated COVID-19 portal, they have engaged in specific initiatives related to supporting open science during this pandemic. Dryad supports building standards for COVID-19 data in an effort to prepare the data for future use. They are also offering guidance for researchers on the appropriate handling of Protected Health Information (PHI), since many researchers have shifted to working with human subjects for the first time.
  • Vivli: For clinical trials, Vivli has created a COVID-19 portal to encourage submission and sharing of participant-level clinical trial data. A challenging aspect of this has been maintaining a balance between openness, speed, and privacy. Vivli is waiving fees for submission and sharing of COVID-19 datasets.

We hope the resources highlighted in this series on Research Data Management and COVID-19 will keep any interested researchers busy exploring and potentially contributing to the global effort to understand, treat, and eradicate COVID-19!

Just remember that if you decide to take part in this tremendous collaborative effort, all data included in a publication must be cited. If you’re unsure of how to do that or would like guidance in finding datasets, using data tools, or sharing your data, please reach out to Anthony Dellureficio, Associate Librarian for Data Management Services.

RDM and COVID-19 (Part 3): Institutional Collaborations

This is part 3 of a four-part series on Research Data Management and COVID-19. Click here for part 1 and part 2.

Collaborative efforts between institutions have yielded numerous open-access datasets, visualization tools, and resources designed to accelerate COVID-19 research by making datasets, tools, and computational resources more discoverable and accessible. Examples include:

  • MIDAS – A “global network of scientists and practitioners from academia, industry, government, and non-governmental agencies” has created a COVID-19 portal which collates publicly available datasets and code to enable researchers to contribute towards COVID-19 modeling research.
  • Terra – Terra provides cloud-based workspaces to support bioinformatics research by integrating data pipelines with analytical tools, such as Jupyter Notebook. They have set up specific workspaces to accelerate COVID-19 research. Access to Terra is free but requires setting up an account. You can use your Google account.
  • VODAN – The Virus Outbreak Data Network (VODAN) is a collaborative effort between the Committee on Data for Science and Technology (CODATA), the Research Data Alliance, the World Data System, and GO FAIR, organizations that promote Open Science efforts in research data. In an effort to encourage best practices and standards for data repositories while still hastening access to COVID-19 data, these organizations have created the FAIR Data Points repository, which can be granted access to the local data of participating institutions, such as hospitals and research centers. VODAN can then assist institutions with data stewardship, best practices, and managed access to the data in their repositories.
  • High Performance Computing Consortium – This consortium of academic, industry, and government organizations pools available computing resources to provide accelerator services for participating members around COVID-19 research. Access to the computing resources is free but requires submission and approval of a research proposal. Researchers utilizing the consortium resources are expected to maintain public updates of their progress and eventually publish their findings.

Likewise, there are numerous institution-specific sources of COVID-19 research data. No doubt many people are already aware of the Johns Hopkins Coronavirus Resource Center, whose maps, visualizations, dashboards, and data connections have been used by many news organizations to display tracking data, as in the NY Times’ global outbreak map. Another example is NYU’s list of ongoing data projects mobilized to combat COVID-19. Many of the ongoing institution-specific data projects are also being indexed by the above-mentioned government and generalist repositories. As a member of the Data Discovery Collaboration, the MSK Library has been participating in discussions about how institutions and researchers can best make their COVID-19 research more visible and useful to other researchers.

Interactive map from Johns Hopkins shows coronavirus

In the final post in this series, we’ll showcase some of the Data Repository and Publisher responses to the COVID-19 pandemic.