Search Queries for "Mapping Research Output to the Sustainable Development Goals (SDGs)" v5.0.2

  • Maurice Vanderfeesten (Contributor)
  • René Otten (Contributor)
  • Eike Spielberg (Contributor)
  • Nykohla Strong (University of Aberdeen) (Contributor)
  • Felix Schmidt (Contributor)
  • Baldvin Zarioh (Contributor)
  • Didier Vercueil (Contributor)
  • Alessandro Arienzo (Contributor)
  • Roberto Delle Donne (Contributor)
  • Ignasi Salvadó Estivill (Contributor)
  • José Luis González Ugarte (Contributor)
  • Linda Hasse (Contributor)
  • Ane Sesma (Contributor)
  • Friedrich Gaigg (Contributor)
  • Nicolien van der Grijp (Contributor)
  • Yasin Gunes (Contributor)
  • Peter van den Besselaar (Contributor)
  • Joeri Both (Contributor)
  • Kees Kouwenaar (Contributor)



This package contains machine-readable (XML) search queries for the Scopus publication database. The queries find domain-specific research output that is related to the 17 Sustainable Development Goals (SDGs).

[ SDG QUERIES PAGES ] [ PROJECT WEBSITE ] [ FORK ON GITHUB ]

The Sustainable Development Goals are the 17 global challenges set by the United Nations. Each goal specifies targets and indicators for monitoring progress towards reaching that goal by 2030. To capture how research contributes to moving the needle on those challenges, we earlier made an initial classification model that makes it possible to quickly identify which research output relates to which SDG. (The Aurora SDG dashboard is the initial outcome, as a proof of practice.)

The initiative started in 2017 in the Aurora Universities Network working group "Societal Impact and Relevance of Research". Its aim is to investigate and make visible:

1. what research is done that is relevant to topics or challenges that live in society (for the proof of practice this has been scoped down to the SDGs), and
2. what the effect or impact is of applying those research outcomes to those societal challenges (this has also been scoped down, to research output cited in policy documents from national and local governments and NGOs).

The classification model consists of 17 different search queries on the Scopus database. The search queries are elegant constructions of keyword combinations and boolean operators, written in the syntax of the Scopus Query Language. We used Scopus because it covers more research areas that are relevant to the SDGs, and because it let us filter on the Aurora institutions much more easily.

Versions

Different versions of the search queries have been made over the past years to improve the precision (soundness) and recall (completeness) of the results. The queries were made in a team effort by several bibliometric experts from the Aurora universities.
Each expert took two or three SDGs, and the experts then reviewed each other's work.

v1.0, January 2018. Initial 'strict' version. In this version we used only the terms that appear in the SDG policy text of the targets and indicators defined by the UN. At this point we were aware of the SDSN compiled list of keywords and used it as inspiration. The rule of thumb was to use keyword-combination searches rather than single-keyword searches as much as possible, to be more precise rather than to yield large numbers of false-positive papers. We also did not use the inverse or 'NOT' operator, to prevent removing true positives from the result set. This version has not been reviewed by peers. Download from: GitHub / Zenodo

v2.0, March 2018. Reviewed 'strict' version. Same as version 1, but now reviewed by peers. Download from: GitHub / Zenodo

v3.0, May 2019. 'Echo chamber' version. We noticed that when we strictly used the terms that UN policy makers use in the targets and indicators, much of the research that did not use those specific terms was left out of the result set (e.g. "mortality" vs "deaths"). To increase recall without reducing the precision of the papers in the results, we added keywords that were obvious synonyms and antonyms of the existing 'strict' keywords. This was done based on the keywords that appeared in papers in the result set of version 2. This creates an 'echo chamber' that results in more of the same papers. Download from: GitHub / Zenodo

v4.0, August 2019. Uniform 'split' version. Over the years, the UN changed and added targets and indicators. To keep track of whether we missed a target, we split the queries to match the targets within the goals. This gives much more control when maintaining the queries. In this version the use of brackets, quotation marks, etc. has also been made uniform, so that the queries work with APIs and not only with GUIs.
This version was evaluated with a survey to obtain baseline measurements for precision and recall. Published here: Survey data of "Mapping Research output to the SDGs" by Aurora Universities Network (AUR), doi:10.5281/zenodo.3798385. Download from: GitHub / Zenodo

v5.0, June 2020. 'Improved' version. To better reflect the academic representation of research output that relates to the SDGs, we added more keyword combinations to the queries. This increases the recall and yields more research papers related to the SDGs, using academic terminology.

We mainly used the input from the Survey data of "Mapping Research output to the SDGs" by Aurora Universities Network (AUR), doi:10.5281/zenodo.3798385. We ran several text analyses on it: frequent term combinations in the titles and abstracts of suggested papers and of selected (accepted) papers, suggested journals, etc., found in this data set:
Spielberg, Eike, & Hasse, Linda. (2020). Text Analyses of Survey Data on "Mapping Research Output to the Sustainable Development Goals (SDGs)" (Version 1.0) [Data set]. Zenodo.

Secondly, we took inspiration from the Elsevier SDG queries:
Jayabalasingham, Bamini; Boverhof, Roy; Agnew, Kevin; Klein, Lisette (2019), "Identifying research supporting the United Nations Sustainable Development Goals", Mendeley Data, v1.

Thirdly, we took inspiration from this controlled vocabulary containing closely related terms:
Duran-Silva, Nicolau, Fuster, Enric, Massucci, Francesco Alessandro, & Quinquillà, Arnau. (2019). A controlled vocabulary defining the semantic perimeter of Sustainable Development Goals (Version 1.2) [Data set]. Zenodo.

Download from: GitHub / Zenodo

Contribute and improve the SDG Search Queries

We welcome you to join the GitHub community: fork the queries, improve them, and make a pull request to add your improvements to the new version of the SDG queries.
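For readers who want to experiment before contributing, the following is a minimal sketch, not the project's actual tooling, of the two ideas the description relies on: combining keywords with boolean operators, and measuring precision and recall. The function names, the example keywords other than "mortality" and "deaths", and the paper identifiers are all hypothetical. The sketch assumes the Scopus field code TITLE-ABS-KEY and, in line with the v1.0 rule of thumb, uses keyword combinations and no NOT operator.

```python
from typing import Iterable, Sequence, Set, Tuple


def quote(term: str) -> str:
    """Wrap a term in double quotes so a multi-word phrase stays together."""
    return '"' + term.replace('"', "") + '"'


def build_query(combinations: Sequence[Sequence[Iterable[str]]]) -> str:
    """Build one boolean query string (illustrative helper, not project code).

    Each combination is a list of synonym groups. Terms inside a group are
    joined with OR, the groups of one combination with AND, and the
    combinations themselves with OR. No NOT operator is used.
    """
    clauses = []
    for combination in combinations:
        groups = ["(" + " OR ".join(quote(t) for t in group) + ")"
                  for group in combination]
        clauses.append("(" + " AND ".join(groups) + ")")
    return "TITLE-ABS-KEY(" + " OR ".join(clauses) + ")"


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for a result set against a labelled sample."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical keyword combination: a synonym group AND a second group.
example = [[["mortality", "deaths"], ["maternal", "neonatal"]]]
print(build_query(example))

# Hypothetical paper identifiers: what the query returned vs. what reviewers
# judged to be relevant.
retrieved = {"p1", "p2", "p3", "p4"}
relevant = {"p2", "p3", "p5"}
print(precision_recall(retrieved, relevant))
```

Keeping the synonym groups as data rather than as one long hand-written string makes it easier to add terms per target, as the 'split' version does, and to apply brackets and quotation marks uniformly.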

    Data type

    Zenodo dataset
    Supplementary GitHub site:

    Copyright and Open Data Licencing

    CC 4.0
    Date made available

    2020
