
APIs for Scholarly Resources

What is an API?

API stands for application programming interface. An API is a protocol that lets a user query a resource and retrieve data in a machine-readable format. Researchers sometimes use APIs to download collections of texts, such as scholarly journal articles, so they can perform automated text mining on the downloaded corpus.

Here is a simple tutorial that explains what an API is. 

Below are some APIs that are available to researchers. Some are open to the public, while others are available according to the terms of Temple University Libraries' subscriptions. Many require you to create an API key, which is a quick and free process.

 

How do I Use APIs?

You can run a simple query by typing it into the address bar of a web browser. A more complex query generally requires a programming language; Python and R are commonly used. (R is a language widely used for statistical computing.) The documentation for the APIs listed below typically does not include sample code; it only explains how the data is structured so that users can write their own queries.
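As a rough illustration of the difference between a browser query and a scripted one, the sketch below builds a query URL in Python using only the standard library. The endpoint and the parameter names (`search_query`, `start`, `max_results`) are assumptions taken from arXiv's public API documentation; other APIs use different names, so check the documentation of the API you are querying. The printed URL can also be pasted into a browser's address bar.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and parameter names (arXiv's public API); other APIs differ.
BASE_URL = "http://export.arxiv.org/api/query"


def build_query_url(search_query, start=0, max_results=10):
    """Return a query URL with the parameters safely percent-encoded."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    return BASE_URL + "?" + urlencode(params)


def fetch(url, timeout=30):
    """Download the response body as text (makes a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_query_url("all:electron", max_results=5)
print(url)
```

Calling `fetch(url)` would send the request and return the raw response, which you would then parse according to the format the API documents.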

List of APIs for Scholarly Research

arXiv API
Content: metadata and article abstracts for the e-prints hosted on arXiv.org
Permissions: no registration required
Limitations: maximum number of results returned from a single call is 30,000, in slices of 2,000 at a time
Contact: https://groups.google.com/forum/#!forum/arxiv-api, https://arxiv.org/help/api/index
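The limits above (at most 30,000 results per call, delivered in slices of 2,000) imply that a large result set has to be paged through. A minimal sketch of the paging arithmetic follows; the function name `page_offsets` is hypothetical, and each `(start, size)` pair would be passed to whatever offset and page-size parameters the API defines.

```python
def page_offsets(total, page_size=2000, cap=30000):
    """Yield (start, size) pairs that cover `total` results.

    Results beyond `cap` are not requested, mirroring the per-call
    maximum stated in the entry above.
    """
    limit = min(total, cap)
    for start in range(0, limit, page_size):
        yield start, min(page_size, limit - start)


for start, size in page_offsets(4500):
    print(f"request {size} results starting at offset {start}")
```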

 

SAO/NASA Astrophysics Data System (ADS) API
Content: bibliographic data on astronomy and physics publications from SAO/NASA astrophysics databases
Permissions: free to register; request a key at https://github.com/adsabs/adsabs-dev-api
Limitations: varies
Contact: https://groups.google.com/forum/#!forum/adsabs-dev-api, adshelp@cfa.harvard.edu

 

BioMed Central API
Content: metadata and full-text content for open access journals published by BioMed Central
Permissions: free to access, request a key at https://dev.springer.com/signup
Limitations: none
Contact: info@biomedcentral.com

 

Chronicling America API (Library of Congress)
Content: digitized newspapers from 1789-1963, as well as a directory of newspapers published from 1690 to the present, with information on library holdings
Permissions: no registration required
Limitations: none
Contact: http://www.loc.gov/rr/askalib/ask-webcomments.html

 

CORE API
Content: metadata and full text of millions of open access (OA) research papers
Permissions: free to access, request a key at https://core.ac.uk/api-keys/register
Limitations: limits range from 1-10 requests per 10 seconds, depending on the type of query. Contact CORE if you need a faster rate.
Contact: theteam@core.ac.uk

 

CrossRef REST API
Content: metadata records with CrossRef DOIs, covering 75 million scholarly works from ~5,000 publishers
Permissions: no registration required
Limitations: guides to avoid overloading the servers available at https://github.com/CrossRef/rest-api-doc#api-etiquette
Contact: labs@crossref.org

 

Digital Public Library of America (DPLA) API
Content: metadata on items and collections indexed by the DPLA
Permissions: no registration required
Limitations: none, "However, the DPLA reserves the right to protect the integrity of the API and the data it dispenses from abuse (in its discretion), particularly activity that has the effect of denying or unduly degrading service to other API users."
Contact: codex@dp.la

 

Elsevier APIs
Content: multiple APIs for full-text books and journals from ScienceDirect and citation data from Scopus, Engineering Village and Embase
Permissions: free to register; click "Get API Key" to request a personal key: https://dev.elsevier.com/
Limitations: "Researchers at subscribing academic institutions can text mine subscribed full-text ScienceDirect content via the Elsevier APIs for non-commercial purposes." Other authorized use cases are described at https://dev.elsevier.com/policy.html
Contact: integrationsupport@elsevier.com

 

HathiTrust Bibliographic API
Content: bibliographic and rights information for items in the HathiTrust Digital Library
Permissions: no registration required
Limitations: may request up to 20 records at once. Not intended for bulk retrieval
Contact: feedback@issues.hathitrust.org
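Because at most 20 records may be requested at once, a longer list of identifiers has to be split into batches before querying. The sketch below shows only that batching step; the function name `batches` and the sample identifiers are hypothetical, and how each batch is turned into a request depends on the API's own documentation.

```python
def batches(identifiers, size=20):
    """Split a list of identifiers into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [identifiers[i:i + size] for i in range(0, len(identifiers), size)]


# Hypothetical identifiers, used only to illustrate the batching.
record_ids = [f"id-{n}" for n in range(45)]
for batch in batches(record_ids):
    print(len(batch), "records in this request")
```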

 

HathiTrust Data API
Content: full text of HathiTrust and Google digitized texts of public domain works
Permissions: free to access, request a key at https://babel.hathitrust.org/cgi/kgs/request
Limitations: "Please contact [HathiTrust] to determine the suitability of the API for intended uses."
Contact: feedback@issues.hathitrust.org

 

IEEE Xplore API
Content: metadata for articles included in IEEE Xplore
Permissions: must be affiliated with an institution that subscribes to IEEE Xplore. Temple is a subscriber.
Limitations: maximum 1,000 results per query
Contact: onlinesupport@ieee.org

 

JSTOR Data for Research
Content: full-text articles from JSTOR
Permissions: free to use, register at https://www.jstor.org/dfr/
Limitations: not a true API, but allows users to construct a search and then download the results as a dataset for text-mining purposes. Users can download up to 25,000 documents; larger datasets are available by special request.
Contact: https://support.jstor.org/hc/en-us

 

National Library of Medicine (NLM) APIs
Content: 60 separate APIs for accessing various NLM databases, including PubMed Central, ToxNet, and ClinicalTrials.gov. The PubMed API is listed separately below.
Permissions: varies
Limitations: varies
Contact: varies

 

Nature API
Content: bibliographic data for content hosted on Nature.com, including news stories, research articles and citations
Permissions: free to access
Limitations: varies
Contact: interfaces@nature.com

 

OECD API
Content: a selection of the most-used datasets covering OECD countries and selected non-member economies. The datasets included appear in the catalogue of OECD databases with API access
Permissions: no registration required, see terms and conditions
Limitations: max 1,000,000 results per query, max URL length of 1,000 characters.
Contact: OECDdotStat@oecd.org

 

PLOS API
Content: full text of research articles in PLOS journals
Permissions: free to access, register at http://api.plos.org/registration/
Limitations: maximum of 7,200 requests per day, 300 per hour, and 10 per minute. Users should allow 5 seconds for each query to return results. Requests should not return more than 100 rows. API users are limited to five concurrent connections from a single IP address. High-volume users should contact api@plos.org.
Contact: api@plos.org
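Rate limits like the ones above are usually respected by spacing requests out in the client. The sketch below is one simple way to do that, not an official client: the `Throttle` class is hypothetical, and it enforces only a minimum interval between consecutive requests. The clock and sleep functions are passed in so the logic can be tested without waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, max_per_minute=10, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# 5 per minute means one request every 12 seconds, i.e. 300 per hour.
throttle = Throttle(max_per_minute=5)
print(throttle.min_interval)
```

Call `throttle.wait()` immediately before each request. Note that a steady 10 requests per minute would add up to 600 per hour, so a long-running job needs the 12-second spacing shown here to stay within the hourly and daily limits as well.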

 

PubMed API (NCBI E-utilities)
Content: information stored in 38 NCBI databases, including some information from PubMed. Retrieves a PubMed ID when citation information is supplied.
Permissions: API key required starting May 1, 2018
Limitations: After May 1, 2018, with an API key a site can post up to 10 requests per second by default. Large jobs should be limited to outside 9-5 weekday hours. Higher rates are available by request (see contact information below)
Contact: eutilities@ncbi.nlm.nih.gov

 

Springer API
Content: full text of SpringerOpen and BioMed Central journal content, as well as metadata from other Springer resources
Permissions: free to access, request a key at https://dev.springer.com/signup
Limitations: noncommercial use
Contact: tdm@springernature.com

 

World Bank APIs
Content: APIs for the following datasets: Indicators (time series data), Projects (data on the World Bank’s operations), and World Bank financial data (World Bank Finances API)
Permissions: no registration required
Limitations: See Terms & Conditions of Using our Site
Contact: data@worldbankgroup.org 

 

 

 

Acknowledgements

We would like to acknowledge API guides created by the Libraries at MIT, Berkeley, Purdue and Drexel that informed our work on this guide.

Librarian

Gretchen Sneff
Contact:
gsneff@temple.edu

Librarian

Karen Kohn
Contact:
Paley Library, Room 101
215-204-4428