Research Data Management

Citing data

Researchers should cite data when communicating their scholarly or scientific findings in the same way that they cite articles, books, and other sources. Data citation gives credit and attribution to the creator, encourages sharing, collaboration, and re-use, enables verification of research results, and allows for tracking usage and impact. Data takes many forms across academic disciplines. Some of these include:

  • Instrument readings
  • Spreadsheets
  • Data from structured and unstructured interviews
  • Survey data
  • Genetic sequences
  • Textual corpora
  • Satellite and geographic data
  • Software code
  • 3-D Modelling data

Common elements of data citation

Although uniform citation formats have been slow to develop, data citations share a set of commonly accepted elements:

  • Author(s) - a person, organization, government agency, or other responsible party
  • Title - name given to dataset or the study
  • Year of publication - the date the dataset was published or released, or the date of its most recent version
  • Publisher - the data center/repository
  • Edition or version
  • Access - URL, DOI, or other location information for the data

Looking for guidance

When citing data for publication, there are a number of places for researchers to look for guidance: 

  • Journals often have instructions on how to cite data in manuscript submissions.
  • Refer to relevant style guides, some of which specifically address data citation. 
  • Archives and repositories usually provide a suggested data citation.
  • Absent any other guidance, researchers should construct their own data citations from the common elements above, arranging the elements according to the order and punctuation of the citation style being used.
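
For researchers who manage citations programmatically, the last point can be sketched in a few lines of Python. This is only an illustration: the `DataCitation` class and the `format_citation` function are hypothetical names, and the generic author-date layout it produces is an assumption, not the format of any particular style guide. The sample values are taken from the Harvard Dataverse example on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """The common elements of a data citation (hypothetical container)."""
    authors: List[str]             # people or organizations responsible
    title: str                     # name of the dataset or study
    year: int                      # year published, released, or last updated
    publisher: str                 # data center or repository
    access: str                    # URL, DOI, or other location information
    version: Optional[str] = None  # edition or version, if any


def format_citation(c: DataCitation) -> str:
    """Arrange the elements in a generic author-date order.

    The order and punctuation here are an assumed layout; adapt them
    to the style guide or journal instructions being followed.
    """
    authors = "; ".join(c.authors)
    title = c.title
    if c.version:
        title += f" (Version {c.version})"
    return f"{authors} ({c.year}). {title}. {c.publisher}. {c.access}"


example = DataCitation(
    authors=["Lee, John D.", "Alsaid, Areen"],
    title=("A Machine Vision Approach for Estimating Motion Discomfort "
           "in Simulators and in Self-Driving"),
    year=2020,
    publisher="Harvard Dataverse",
    access="https://doi.org/10.7910/DVN/ZHGT7U",
    version="V1",
)

print(format_citation(example))
```

Keeping the elements separate from the formatting makes it easier to re-arrange the same record when a journal or style guide asks for a different order.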

Examples

From the American Economic Review guidelines (https://www-aeaweb-org.libproxy.temple.edu/journals/policies/sample-references)

Leiss, Amelia. 1999. “Arms Transfers to Developing Countries, 1945–1968.” Inter-University Consortium for Political and Social Research, Ann Arbor, MI. ICPSR05404-v1. https://doi.org/10.3886/ICPSR05404.

--

From the American Sociological Review guidelines (https://journals-sagepub-com.libproxy.temple.edu/author-instructions/ASR)

Deschenes, Elizabeth Piper, Susan Turner, and Joan Petersilia. Intensive Community Supervision in Minnesota, 1990–1992: A Dual Experiment in Prison Diversion and Enhanced Supervised Release [Computer file]. ICPSR06849-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2000. doi:10.3886/ICPSR06849.

--

From PLOS guidelines (https://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories)

Andrikou C, Thiel D, Ruiz-Santiesteban JA, Hejnol A. Active mode of excretion across digestive tissues predates the origin of excretory organs. 2019. Dryad Digital Repository. https://doi.org/10.5061/dryad.bq068jr

--

From the APA Style Blog (https://apastyle.apa.org/style-grammar-guidelines/references/examples/data-set-references)

O’Donohue, W. (2017). Content analysis of undergraduate psychology textbooks (ICPSR 21600; Version V1) [Data set]. ICPSR. https://doi.org/10.3886/ICPSR36966.v1

--

From the Chicago Manual of Style Online (14.257: Citing data from a scientific database)

2. GenBank (for RP11-322N14 BAC [accession number AC087526.3]; accessed April 6, 2016), http://www.ncbi.nlm.nih.gov/nuccore/19683167.

--

Produced by the Harvard Dataverse (https://doi.org/10.7910/DVN/ZHGT7U)

Lee, John D.; Alsaid, Areen, 2020, "A Machine Vision Approach for Estimating Motion Discomfort in Simulators and in Self-Driving", https://doi.org/10.7910/DVN/ZHGT7U, Harvard Dataverse, V1

--

Produced by Zenodo (http://doi.org/10.5281/zenodo.3747600)

Thanasis Vergoulis, Ilias Kanellos, Serafeim Chatzopoulos, Danae Pla Karidi, & Theodore Dalamagas. (2020). BIP4COVID19: Impact metrics and indicators for coronavirus related publications (Version 1.1) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3747600