Data Citation Index

The University Library now subscribes to the Data Citation Index from Thomson Reuters (which also provides Web of Science).  You can access the Data Citation Index by searching for it in the Library’s Online Journals & Databases system.

The goal of the Data Citation Index is to support data discovery, reuse and interpretation.  To achieve this, the Data Citation Index brings together results from data repositories across disciplines.  The rough breakdown of repositories by discipline is: life sciences (48%), physical sciences (23%), social sciences (20%), arts & humanities (7%), and multidisciplinary (2%). Examples of repositories included are: Gene Expression Omnibus, WormBase, Dryad, NOAA National Geophysical Data Center, Inter-University Consortium for Political and Social Research (ICPSR), Archaeology Data Service, and figshare.

The Data Citation Index provides suggested citations for the data, based on the data citation recommendations of

The Data Citation Index also provides links between the data and the articles that cite it.  For example, search for “GSE2814” to see mouse liver tissue expression data that has been cited by 6 articles in Web of Science.  Because data citation is neither standardized nor common practice, most data in the Data Citation Index has rarely been cited.  This feature is therefore of limited use for now, but it has interesting potential.

Social Science Data Repositories

The various disciplines of the social sciences yield research data as diverse as the repositories that preserve these data sets and make them accessible. The many fields that make up the social sciences (archaeology, geography, sociology, economics, political science, and psychology, to name a few) produce numeric and spatial data that document a vast array of research on human behavior, culture, society, landscapes, and economic structures. Given the number of options, social science researchers must consider which repository is best suited for the long-term preservation and curation of their data.

Some social science data repositories are broad in scope; the ICPSR (Inter-university Consortium for Political and Social Research) at the University of Michigan, for instance, preserves data related to geography and environment, economic behavior and attitudes, community and urban studies, and education. Other repositories provide data curation services for a single discipline such as tDAR (The Digital Archaeological Record) for archaeology.

Social science repositories have varying requirements for data deposit. The ICPSR, for example, provides a comprehensive guide for preparing data for preservation and archiving. Such guidelines outline specifics such as acceptable file formats for data submission, requirements for metadata descriptions, and the deposit of supporting documentation (such as laboratory notebooks) that helps illustrate the context in which the data was created.

A list of social science repositories/resources for finding and depositing data can be found below:


ADS (Archaeology Data Service)

CoPAR (Council for the Preservation of Anthropological Records)

Open Context

Registry of Anthropological Data wiki

tDAR (The Digital Archaeological Record)

Criminal Justice

NACJD (National Archive of Criminal Justice Data – associated with ICPSR)

U.S. Bureau of Justice Statistics

Demographics/Government Data Repositories

Minnesota Population Center, University of Minnesota

U.S. Bureau of Labor Statistics

U.S. Bureau of Transportation Statistics

U.S. Census Bureau


National Bureau of Economic Research


National Center for Education Statistics

General Social Sciences

Australian Data Archive

CESSDA (Consortium of European Social Science Data Archives)

ICPSR (Inter-university Consortium for Political and Social Research) at the University of Michigan. ICPSR also manages the NCAA Student-Athlete Experiences Data Archive.

IQSS (The Institute for Quantitative Social Science) at Harvard University

Pew Research Center

UK Data Archive

UK Data Service

UNESCO Institute for Statistics





National Oceanographic Data Center



Sociology/Public Opinion

American National Election Studies

National Data Archive on Child Abuse and Neglect at Cornell University

Roper Center Public Opinion Archives

This is not a complete list; other data repositories can be found through data clearinghouses such as OpenDOAR and Databib. The Scholarly Commons also maintains a list of Geospatial Data Repositories and Numeric Data Repositories. Likewise, the College of Liberal Arts and Sciences’ ATLAS provides a list of repositories relating to public opinion and government data.

For more information on the Scholarly Commons’ services relating to the social sciences, please visit the Numeric and Spatial Data Services site.


DataCite – Find, Identify, and Cite Datasets

DataCite is a non-profit organization created to establish easier access to research data, increase acceptance of research data as legitimate, citable contributions to the scholarly record, and support data archiving.  The organization seeks to bring institutions, researchers and other interested groups together to address the challenges of making research data accessible and visible.  Through this collaboration, researchers find support in locating, identifying, and citing research datasets with confidence.

Data centers are provided with persistent identifiers for datasets, plus workflows and standards for data publication. Journal publishers receive support for linking research articles with data.  DataCite works with organizations, data centers, and libraries that host data to assign persistent identifiers to datasets.

Data citation is important for data reuse, verification and tracking.  Citable datasets become legitimate contributions to scholarly communication, paving the way for new metrics and publication models that recognize and reward data sharing. More information on DataCite services, resources and events can be found

Databib: A Directory of Research Data Repositories

Databib is a collaborative, annotated directory of research data repositories.  It currently includes over 600 data repositories, which are international in scope and cover a variety of disciplines.

Databib’s repository records include a URL, subject tags, a brief description, and information about how the existing data can be reused and whether new data can be deposited.  The records can be searched using a basic keyword search or via an advanced search targeting specific fields. They also can be browsed alphabetically or by broad subjects.
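As a rough illustration of the record structure and the two search modes described above, here is a minimal Python sketch. The class name, field names, and sample records are assumptions made for this example; they are not Databib's actual schema or data.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryRecord:
    """Hypothetical stand-in for one directory record; field names are assumptions."""
    name: str
    url: str
    description: str
    subjects: list[str] = field(default_factory=list)
    reuse_terms: str = ""            # how existing data can be reused
    accepts_deposits: bool = False   # whether new data can be deposited


def keyword_search(records, term):
    """Basic search: match the term in the name, description, or subject tags."""
    needle = term.lower()
    return [
        r for r in records
        if needle in r.name.lower()
        or needle in r.description.lower()
        or any(needle in s.lower() for s in r.subjects)
    ]


def field_search(records, subject=None, accepts_deposits=None):
    """Advanced search: filter on specific fields; None means 'no filter'."""
    hits = records
    if subject is not None:
        hits = [r for r in hits if subject.lower() in (s.lower() for s in r.subjects)]
    if accepts_deposits is not None:
        hits = [r for r in hits if r.accepts_deposits == accepts_deposits]
    return hits


# Illustrative sample data only, not actual Databib records.
records = [
    RepositoryRecord("Example Archaeology Archive", "https://example.org/arch",
                     "Excavation reports and survey data.", ["archaeology"],
                     "CC BY", True),
    RepositoryRecord("Example Survey Bank", "https://example.org/survey",
                     "Public opinion survey datasets.",
                     ["sociology", "political science"],
                     "Registration required", False),
]

print([r.name for r in keyword_search(records, "survey")])
print([r.name for r in field_search(records, subject="archaeology",
                                    accepts_deposits=True)])
```

The keyword search casts a wide net across several fields, while the field search narrows results to repositories that match a subject tag and, if desired, accept new deposits.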

With its useful search functionality and regularly updated content, Databib can be a helpful tool to discover new research data repositories and to identify appropriate repositories where you can submit or acquire data.

Databib can also help increase the visibility of research data repositories:  Anyone can submit a new record or edit an existing record, and an editorial board then reviews additions and changes before they are accepted into Databib.

Openness is one of the guiding principles of Databib:  All Databib data are made available to the public domain using the Creative Commons Zero protocol.  All of the records can be downloaded in RDF/XML format.  Databib also supports OpenSearch, which enables users to search Databib directly from their browsers without having to return to the Databib website.
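For readers who want to work with downloaded records, the sketch below shows one way to read RDF/XML with Python's standard library. The sample document and its use of Dublin Core elements (dc:title, dc:subject) are assumptions for illustration; the actual Databib export may use different vocabularies and structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real export may be structured differently.
SAMPLE_RDF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="https://example.org/repository/1">
    <dc:title>Example Survey Bank</dc:title>
    <dc:subject>sociology</dc:subject>
    <dc:subject>political science</dc:subject>
  </rdf:Description>
</rdf:RDF>
"""

# Namespace prefixes used in the lookups below.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_records(rdf_xml):
    """Return one dict per rdf:Description element in the document."""
    root = ET.fromstring(rdf_xml.strip())
    parsed = []
    for desc in root.findall("rdf:Description", NS):
        parsed.append({
            # Attributes are looked up by their fully qualified name.
            "url": desc.get("{%s}about" % NS["rdf"]),
            "title": desc.findtext("dc:title", default="", namespaces=NS),
            "subjects": [s.text for s in desc.findall("dc:subject", NS)],
        })
    return parsed


for rec in parse_records(SAMPLE_RDF):
    print(rec["title"], rec["url"], rec["subjects"])
```

Once the records are plain dictionaries, they can be filtered or loaded into a spreadsheet or database like any other tabular data.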

Originally sponsored by an IMLS Sparks! Innovation National Leadership Grant awarded to Purdue University and Pennsylvania State University, Databib is now guided and maintained by international advisory and editorial boards.  The Purdue University Libraries hosts Databib.  Databib is endorsed by DataCite, which features the Databib list of repositories on its website.

It is also worth noting that the Registry of Research Data Repositories is a similar initiative, funded by the German Research Foundation (DFG) for 2012-2014.  Its goal is to create a global registry of research data repositories.

If you have questions about using Databib to submit or acquire data, contact the Library at