Searching for Existing Data

Researchers are increasingly asked to make their research data available to others for further research. However, many datasets are still not easy to find or to re-use. If you’re looking for existing data, which search strategies could you use?

Data repositories and related sources

Researchers increasingly choose to archive datasets in dedicated data archives (repositories), such as DANS in the Netherlands. Together with their dataset, they provide the information other researchers need to find it and to determine whether it is of interest to them. You can usually download or request a dataset via the repository’s catalogue. Sometimes, the entry will refer you to another organization. Gregory et al. (2018) offer advice on how to approach a search via such repositories.

Data repositories exist at the organizational, national and international level. Some are connected to a specific discipline or type of data; others are more interdisciplinary. This means that you’ll likely need to consult several catalogues and databases.

Don’t hesitate to ask for help from specialized librarians in your institution or experts from the repository that holds the datasets that you are interested in.

Statistical data & open government data

Public sector information can also contain a wealth of relevant data. There are many overviews of statistical offices worldwide, such as the list maintained by the United Nations Statistics Division.

For public sector open data, the European Union Open Data Portal provides an overview of European open data catalogues. These portals are aimed at a broader group of social and economic actors than researchers alone. They could refer you to, for example, judicial and court statistics, crime statistics or court information. Of course, the dataset that you are looking for may not have been integrated into such portals, in which case you will have to contact the relevant organizations directly.

Other sources and search options

By searching for publications, you can try to identify those that are based on data relevant to you. It might be helpful to include a broader range of topics or disciplines in your searches. In the best case, the publication itself contains an identifier or web link to the dataset(s). Otherwise, the author information and the methods section might tell you enough about the dataset and where it can be obtained.
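When a publication is available as text, dataset identifiers can sometimes be spotted automatically. The following is a minimal sketch in Python, assuming the identifier is a DOI; the pattern is a loose heuristic rather than a full DOI grammar, and the function names and the sample DOI (10.1234/example.5678) are hypothetical, made up for illustration.

```python
import re

# Loose heuristic for DOIs: "10.", a 4-9 digit registrant code, "/", then a suffix.
# Assumption: this will miss unusual DOIs and may catch a few false positives.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_dois(text):
    """Return DOI-like strings found in text, in order of appearance, without duplicates."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Trim punctuation that usually belongs to the sentence, not the DOI.
        # (A DOI can legitimately end in ")" in rare cases; this sketch ignores that.)
        doi = match.group(0).rstrip(".,;:)]")
        if doi not in found:
            found.append(doi)
    return found


def doi_to_url(doi):
    """Build the resolver link for a DOI."""
    return "https://doi.org/" + doi


# Hypothetical sentence from a methods section.
sample = (
    "The survey data are archived (https://doi.org/10.1234/example.5678). "
    "See also doi:10.1234/example.5678."
)

for doi in find_dois(sample):
    print(doi_to_url(doi))
```

Any identifier found this way still needs to be checked by hand: a DOI in a reference list may point to another publication rather than to a dataset.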

If you need to request the dataset directly from the author(s), keep in mind that researchers change their affiliation and contact details during their careers. You could check whether the author maintains a profile under an author identifier such as ORCID, which links together affiliations, publications, education and so forth.
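If a publication lists an ORCID iD, it can be worth checking that the iD was copied correctly before searching with it. An ORCID iD consists of 16 characters, the last of which is a check character calculated with the ISO 7064 MOD 11-2 algorithm. The sketch below, in Python, validates only that check character; it does not confirm that the iD is registered or that it belongs to the author you are looking for. The function names are hypothetical. The example iD 0000-0002-1825-0097 is the sample iD that ORCID uses in its own documentation for a fictitious researcher.

```python
def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is a well-formed ORCID iD with a correct check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    if not all(ch in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # sample iD from the ORCID documentation
print(is_valid_orcid("0000-0002-1825-0098"))  # same iD with the last digit changed
```

A valid iD can then be looked up as https://orcid.org/ followed by the iD to see the affiliations and works the author has chosen to make public.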

If you have been unable to find the data underlying a publication, you can ask a colleague and/or librarian for advice and map out further options, such as:

  • Directly contacting the author’s publisher, institution or (former) colleagues
  • Looking up citations of the publication/dataset and contacting the authors of these more recent publications to see if they have a copy of the dataset

Once you’ve found a dataset, make sure to check the conditions for reuse (e.g. citation and credit requirements) and any legal aspects before using it. When in doubt, contact the relevant research support services at your institution.

This entry was based on the following blog post: