------------------------------------------------------------------------
API calls to ADS
------------------------------------------------------------------------

A good starting point for ADS, in general, is
    https://adsabs.github.io/help/
Underneath this URL you will find how to construct a "Query", review
"Search Results" and understand the default values in "User preferences".

Reference (for examples):
    https://github.com/adsabs/adsabs-dev-api/blob/master/Search_API.ipynb

------------------------------------------------------------------------
I. Get an API Token
------------------------------------------------------------------------

To make API calls you need to have an ADS account. You also need to
have your own "API Token". You get this at
    https://ui.adsabs.harvard.edu/user/settings/token
Save the 40-character token (for future use) in a file "ADS_Token".

$ export Token=$(cat ADS_Token)   #this now becomes a shell variable, $Token
                                  #note: no space surrounding "="

Please do not share your API Token with anyone else. Anyone who has the
Token can look at your private library.

------------------------------------------------------------------------
II. Structure of an API call:
------------------------------------------------------------------------

An API call can be made from the command line using, for example, curl.
For details of curl please look at Appendix I. Some APIs are open and
some, like ADS, are not. The structure of a simple call is

$ curl -H Authorization BaseURL Endpoint Query

For the ADS API we have

$ export Authorize="Authorization: Bearer $Token"   #no ":" after Bearer
$ export BaseURL="https://api.adsabs.harvard.edu/v1"
                                  #the header contains spaces, so always
                                  #quote it when used: -H "$Authorize"

and the endpoints are: /search, /metrics, /libraries

The query has to be URL-encoded (" " is %20; see Appendix V). For the
ADS API the query is limited to 1000 characters.

Though not essential at this point, note that there are three "methods"
which interface with the API gateway: GET (most common), POST (for
large jobs) and PUT.
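
The pieces of a call described above can be assembled with two small
shell functions. This is only a sketch: the names ads_header and ads_url
are hypothetical (they are not part of ADS or curl), and the header is
assumed to take the form "Authorization: Bearer <token>", as in the
bigquery call of Example IV.

```shell
# Sketch only: ads_header and ads_url are hypothetical helper names.
BaseURL="https://api.adsabs.harvard.edu/v1"

# Print the Authorization header for the token given as $1.
ads_header() {
    printf 'Authorization: Bearer %s\n' "$1"
}

# Print the full request URL: ads_url ENDPOINT QUERY [FIELDS]
# QUERY must already be URL-encoded; FIELDS is an optional "fl" list.
ads_url() {
    if [ -n "${3:-}" ]; then
        printf '%s%s?q=%s&fl=%s\n' "$BaseURL" "$1" "$2" "$3"
    else
        printf '%s%s?q=%s\n' "$BaseURL" "$1" "$2"
    fi
}

# Show the URL that a search for "star" with two output fields would use.
ads_url /search/query star bibcode,author

# The call itself (needs network access and a valid $Token) would be:
#   curl -H "$(ads_header "$Token")" "$(ads_url /search/query star bibcode)"
```

Keeping the header and the URL in separate functions makes it easy to
print and inspect both before any request is sent.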
The POST method has two additional items: "payload" and "data" (see
Appendix I).

------------------------------------------------------------------------
III. Determine quota for queries
------------------------------------------------------------------------

You can make up to 5000 queries per day. See
    https://github.com/adsabs/adsabs-dev-api/blob/master/README.md#access

Below we launch a simple API call (query: papers with "star" anywhere;
by default the first 10 entries are displayed). We decode the message
(stderr) to determine today's query quota.

    #note the additional "-v" switch (for verbose)
    #note how stderr is piped to grep

$ curl -v -H "$Authorize" 'https://api.adsabs.harvard.edu/v1/search/query?q=star' 2>&1 >/dev/null | grep ratelimit-
< x-ratelimit-limit: 5000         #maximum queries per 24 hours
< x-ratelimit-remaining: 4999     #number of queries left
< x-ratelimit-reset: 1591791308   #quota reset time (Unix epoch seconds)

$ date -r 1591791308              #BSD/macOS; GNU date: date -d @1591791308
Wed Jun 10 05:15:08 PDT 2020

------------------------------------------------------------------------
EXAMPLE I: Simple query and a menu of outputs
------------------------------------------------------------------------

#will give basic information of first 10 papers for query "star"
$ curl -H "$Authorize" 'https://api.adsabs.harvard.edu/v1/search/query?q=star'

#the output is not pretty.
#so direct it to "jq"
$ curl -H "$Authorize" 'https://api.adsabs.harvard.edu/v1/search/query?q=star' \
    | jq

#request one output field: in this case bibcode
$ curl -H "$Authorize" 'https://api.adsabs.harvard.edu/v1/search/query?q=star&fl=bibcode' | jq

#you can request multiple fields: bibcode, author, author_count
#see Appendix II for a comprehensive list of fields
$ curl -H "$Authorize" 'https://api.adsabs.harvard.edu/v1/search/query?q=star&fl=bibcode,author_count,author' | jq

------------------------------------------------------------------------
EXAMPLE II: Given a bibcode return title, citations & references
------------------------------------------------------------------------

#reduce typing by using $BaseURL
#save output to "a" so we can inspect the JSON structure
$ curl -H "$Authorize" "$BaseURL/search/query?q=bibcode:2005Natur.434...28K&fl=title,reference,citation" \
    | jq > a

#the output is in structure "response.docs"
$ jq 'keys' a
[
  "response",
  "responseHeader"
]

$ jq '.response.docs[] | keys' a
[
  "citation",
  "reference",
  "title"
]

#extract list of ref bibcodes & cites bibcodes
$ jq '.response.docs[] | .citation[]' a       #quotes on
$ jq -r '.response.docs[] | .reference[]' a   #quotes off

------------------------------------------------------------------------
EXAMPLE III: Given multiple bibcodes return the title ...
------------------------------------------------------------------------

Here we seek titles, refs and citations for two bibcodes
    2005Natur.434...50H, 2019BAAS...51g.255H

In the GUI query box you would type either of these two queries
    bibcode:2005Natur.434...50H OR bibcode:2019BAAS...51g.255H
or
    bibcode:(2005Natur.434...50H OR 2019BAAS...51g.255H)

You can view the actual call the GUI makes by looking at the browser
address bar (top line).

There is one catch: you need to replace " " by its URL encoding (%20;
see Appendix V). Apparently ":" is okay but not " ".
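
For more than a couple of bibcodes, typing the %20OR%20 separators by
hand gets tedious. Below is a minimal sketch of a shell function that
joins its arguments into the parenthesised form of the query. The name
bibcode_query is hypothetical, and the bibcodes themselves are assumed
to contain no characters that need encoding (a bibcode containing "&",
for example, would need %26).

```shell
# Sketch only: bibcode_query is a hypothetical helper name.
# Join the bibcodes given as arguments into one query, with the
# separating " OR " already written as "%20OR%20".
# The body runs in a subshell ( ... ) so q and b do not leak out.
bibcode_query() (
    q=
    for b in "$@"; do
        if [ -z "$q" ]; then
            q=$b
        else
            q="$q%20OR%20$b"
        fi
    done
    printf 'bibcode:(%s)\n' "$q"
)

# The two bibcodes of this example:
bibcode_query 2005Natur.434...50H 2019BAAS...51g.255H
```

The printed string can be placed directly after "q=" in the URL.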
With the spaces encoded, the second form of the query reads
    bibcode:(2005Natur.434...50H%20OR%202019BAAS...51g.255H)

$ curl -H "$Authorize" \
    "$BaseURL/search/query?q=bibcode:(2005Natur.434...50H%20OR%202019BAAS...51g.255H)&fl=title,reference,citation" \
    | jq > a

$ jq '.response|keys' a
[
  "docs",
  "numFound",     #number of "docs"
  "start"         #starting index of "docs"
]
$ jq '.response.numFound' a
2
$ jq '.response.start' a
0

$ jq -r '.response.docs[0] | .citation' a   #extract cites for paper 1
$ jq -r '.response.docs[1] | .citation' a   #extract cites for paper 2

------------------------------------------------------------------------
EXAMPLE IV: Given a large set of bibcodes, return authors
------------------------------------------------------------------------

You encode the request in the parameter that follows "-d". My
understanding is that, at the present time, the only filters you can
use are bibcodes. The list has "bibcode\n" followed by
"bibcode1\nbibcode2\n ...". Keep "q=*:*" and set "fl" to what you want
(for example "fl=bibcode,author" for the authors).

$ curl -H "Content-Type: big-query/csv" -H "Authorization: Bearer $Token" \
    "https://api.adsabs.harvard.edu/v1/search/bigquery?q=*:*&fl=bibcode,title&rows=2000" \
    -d $'bibcode\n2005Natur.434...50H\n2019BAAS...51g.255H'

This returns the same two records as Example III, here with the fields
bibcode and title.

------------------------------------------------------------------------
EXAMPLE V: Converting inquiries to utility
------------------------------------------------------------------------

------------------------------------------------------------------------
Appendix I: curl
------------------------------------------------------------------------

  -#  ... display progress-bar
  -d  ... send specified data in POST request
  -H  ... headers to supply with request
  -k  ... allow insecure connections to succeed
  -o  ... output file
  -s  ... silent (quiet) mode
  -v  ... verbose
  -w  ... write out
  -X  ... method to use (e.g.
          POST)

When using with POST you need to specify how you encode "-d"

    curl -H "Content-Type: application/x-www-form-urlencoded" -d "param1=value1&param2=value2"
                                                          OR  -d "@data.txt"
OR
    curl -H "Content-Type: application/json" -d '{"key1":"value1", "key2":"value2"}'
                                         OR  -d "@data.json"

------------------------------------------------------------------------
Appendix II: Fields (name & explanation)
------------------------------------------------------------------------

abstract                    the abstract of the record
ack                         Contains acknowledgements extracted from fulltexts (if
aff                         List of provided affiliations in a given paper
aff_id                      List of curated affiliation IDs in a given paper
alternate_bibcode           List of alternate bibcodes for that document
alternate_title             Alternate title, usually when the original
arxiv_class                 Which arXiv class was the paper submitted to
author                      List of authors on a paper
author_count                Number of authors on a paper
author_facet                Contains list of names with the number of
author_facet_hier           Hierarchical facet field which contains
author_norm                 List of authors with their first names shortened?
bibcode                     ADS identifier of a paper
bibgroup                    Bibliographic group that the bibcode belongs to
bibgroup_facet              Contains list of groups with the number of
bibstem                     the abbreviated name of the journal or publication,
bibstem_facet               Technical field, used for faceting by
body                        Contains extracted fulltext minus acknowledgements
citation                    List of bibcodes that cite the paper
citation_count              number of citations the item has received
cite_read_boost             Float values containing normalized (float)
classic_factor              Integer values containing the boost factor used
comment                     This is currently indexed, but not stored.
                            To see
copyright                   Copyright given by the publisher
data                        List of sources that have data related to this bibcode
data_facet                  Contains list of data with the number of
database                    Database that the paper resides in (astronomy or
date                        Same as pubdate, but of time format and used for
doctype                     Type of document: article, thesis, etc, these stem
doctype_facet_hier          Hierarchical facets consisting of nested
doi                         Digital object identifier
eid                         electronic id of the paper (equivalent of page number)
email                       List of e-mails for the authors that included them in
entdate                     Creation date of ADS record in user-friendly format
entry_date                  Creation date of ADS record in RFC 3339
esources                    Types of electronic sources available for a record
facility                    List of facilities declared in paper (controlled
first_author                First author of the paper
first_author_facet_hier     Contains list of first names with the
first_author_norm           First author of the paper with their first
grant                       Field that contains both grant ids and grant agencies.
grant_agencies              Index with just the grant agencies names (e.g.
grant_facet_hier            Hierarchical facet field which contains
grant_id                    Index with just the grant ids (e.g. 0618398)
id                          a unique integer for this record. Generally not useful,
identifier                  Abstract field that can be used to search an array
ids_data                    https://github.com/adsabs/issues/issues/73
indexstamp                  Date at which the document was indexed by Solr
inst                        List of curated affiliations (institutions) in paper
isbn                        ISBN of the publication (this applies to books)
issn                        ISSN of the publication (applies to journals - ie.
issue                       Issue number of the journal that includes the article
keyword                     an array of normalized and non-normalized keyword
keyword_facet               Contains list of keywords with the number of
keyword_norm                Controlled keywords, each entry will have a
keyword_schema              Schema for each controlled keyword, i.e., what
lang                        In ADS this field contains a language of the main title.
links_data                  We use it to contain info on what readable linked
nedid                       List of NED IDs within a record
nedtype                     Keywords used to describe the NED type (e.g. galaxy,
nedtype_object_facet_hier   Hierarchical facet consisting of NED
orcid_other                 ORCID claims from users who used the ADS claiming
orcid_pub                   ORCID IDs supplied by publishers
orcid_user                  ORCID claims from users who gave ADS consent to
page                        First page of a record
page_count                  If page_range is present, gives the difference
page_range                  Range of page numbers covered by the record
property                    an array of miscellaneous flags associated with the
pub                         canonical name of the publication the record appeared in
pub_raw                     Name of publisher, but also includes the volume, page,
pubdate                     publication date in the form YYYY-MM-DD (DD value will
pubnote                     Comments submitted with the arXiv version of the paper
read_count                  number of times the record has been viewed within
reader                      List of identifiers for people who have read the article
recid                       Unique identifier of the document, Integer version of
reference                   List of references inside a paper
simbad_object_facet_hier    The hierarchical facets consisting of
simbid                      List of SIMBAD IDs within the paper.
                            This has privacy
simbtype                    Keywords used to describe the SIMBAD type
thesis                      https://github.com/adsabs/issues/issues/72
title                       the title of the record
vizier                      Keywords, “subject” tags from VizieR
vizier_facet                Contains list of VizieR keywords with the number
volume                      Volume of the journal that the article exists in
year                        Year of publication

------------------------------------------------------------------------
Appendix III:
------------------------------------------------------------------------

abs                   combo: abstract, title, keyword
all                   combo: author_norm,alternate_title,bibcode,doi,identifier
arxiv                 query parser token
citations()           returns list of citations from given papers;
                      use |[citations]| to get the field contents
citis()               like citations(), uses less memory but is slower
classic_relevance()   Toy-implementation of the ADS Classic relevance
                      score algorithm... wrap any query & obtain hits
                      sorted in the ADS Classic ways (sort of)
full                  combines: title^2, abstract^2, body, keyword, ack
instructive()         Synonym of reviews()
joincitations()       Equivalent of citations() but implemented using
                      lucene block-join
joinreferences()      Equivalent of references() but implemented using
                      lucene block-join
orcid                 combines: orcid_pub, orcid_user, orcid_other
pos()                 The pos() operator allows you to search for an
                      item within a field by specifying the position in
                      the field.
                      Syntax: pos(fieldedquery,position,[endposition]).
                      If no endposition is given, endposition =
                      position, otherwise this performs a query within
                      the range [position, endposition].
references()          returns list of references from given papers
reviews()             returns the list of documents citing the most
                      relevant papers on the topic being researched;
                      these are papers containing the most extensive
                      reviews of the field.
reviews2()            Original implementation of reviews
topn()                Return the top N number of documents
trending()            Trending: returns the list of documents most read
                      by users who read recent papers on the topic
                      being researched; these are papers currently
                      being read by people interested in this field.
useful()              Useful: returns the list of documents frequently
                      cited by the most relevant papers on the topic
                      being researched; these are studies which discuss
                      methods & techniques useful to conduct research
                      in this field.
useful2()             original implementation of useful()

------------------------------------------------------------------------
Appendix IV. ADS help: abbreviations of journals, custom output format
------------------------------------------------------------------------

ADS: Journal Abbreviation List
    http://adsabs.harvard.edu/abs_doc/journal_abbr.html

ADS Custom Output
    https://adsabs.github.io/help/actions/export

  %[n.m]l ... print min(n,m) authors
  %c      ... number of citations
  %R      ... bibcode
  %T      ... title
  %Y      ... Year
  %l      ... author list

------------------------------------------------------------------------
Appendix V. URL Encoding
------------------------------------------------------------------------

Decimal  Character                     URL Encoding (UTF-8)
0        NUL(null character)           %00
1        SOH(start of header)          %01
2        STX(start of text)            %02
3        ETX(end of text)              %03
4        EOT(end of transmission)      %04
5        ENQ(enquiry)                  %05
6        ACK(acknowledge)              %06
7        BEL(bell (ring))              %07
8        BS(backspace)                 %08
9        HT(horizontal tab)            %09
10       LF(line feed)                 %0A
11       VT(vertical tab)              %0B
12       FF(form feed)                 %0C
13       CR(carriage return)           %0D
14       SO(shift out)                 %0E
15       SI(shift in)                  %0F
16       DLE(data link escape)         %10
17       DC1(device control 1)         %11
18       DC2(device control 2)         %12
19       DC3(device control 3)         %13
20       DC4(device control 4)         %14
21       NAK(negative acknowledge)     %15
22       SYN(synchronize)              %16
23       ETB(end transmission block)   %17
24       CAN(cancel)                   %18
25       EM(end of medium)             %19
26       SUB(substitute)               %1A
27       ESC(escape)                   %1B
28       FS(file separator)            %1C
29       GS(group separator)           %1D
30       RS(record separator)          %1E
31       US(unit separator)            %1F
32       space                         %20
33       !                             %21
34       "                             %22
35       #                             %23
36       $                             %24
37       %                             %25
38       &                             %26
39       '                             %27
40       (                             %28
41       )                             %29
42       *                             %2A
43       +                             %2B
44       ,                             %2C
45       -                             %2D
46       .                             %2E
47       /                             %2F
48       0                             %30
49       1                             %31
50       2                             %32
51       3                             %33
52       4                             %34
53       5                             %35
54       6                             %36
55       7                             %37
56       8                             %38
57       9                             %39
58       :                             %3A
59       ;                             %3B
60       <                             %3C
61       =                             %3D
62       >                             %3E
63       ?                             %3F
64       @                             %40
65       A                             %41
66       B                             %42
67       C                             %43
68       D                             %44
69       E                             %45
70       F                             %46
71       G                             %47
72       H                             %48
73       I                             %49
74       J                             %4A
75       K                             %4B
76       L                             %4C
77       M                             %4D
78       N                             %4E
79       O                             %4F
80       P                             %50
81       Q                             %51
82       R                             %52
83       S                             %53
84       T                             %54
85       U                             %55
86       V                             %56
87       W                             %57
88       X                             %58
89       Y                             %59
90       Z                             %5A
91       [                             %5B
92       \                             %5C
93       ]                             %5D
94       ^                             %5E
95       _                             %5F
96       `                             %60
97       a                             %61
98       b                             %62
99       c                             %63
100      d                             %64
101      e                             %65
102      f                             %66
103      g                             %67
104      h                             %68
105      i                             %69
106      j                             %6A
107      k                             %6B
108      l                             %6C
109      m                             %6D
110      n                             %6E
111      o                             %6F
112      p                             %70
113      q                             %71
114      r                             %72
115      s                             %73
116      t                             %74
117      u                             %75
118      v                             %76
119      w                             %77
120      x                             %78
121      y                             %79
122      z                             %7A
123      {                             %7B
124      |                             %7C
125      }                             %7D
126      ~                             %7E
127      DEL(delete (rubout))          %7F
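
The table can be applied mechanically. Below is a minimal sketch of a
shell function that percent-encodes a query string. The name urlencode
is hypothetical, the input is assumed to be plain ASCII, and every
character other than letters, digits and "-", ".", "_", "~" is encoded.
That is stricter than ADS appears to need (":" works unencoded), but the
encoded form should be decoded back to the same query by the server.

```shell
# Sketch only: urlencode is a hypothetical helper name; ASCII input assumed.
# The body runs in a subshell ( ... ) so its variables do not leak out.
urlencode() (
    LC_ALL=C                    # compare characters as plain bytes
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}             # the string without its first character
        c=${s%"$rest"}          # the first character itself
        s=$rest
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;                  # keep as is
            *) out=$out$(printf '%%%02X' "'$c") ;;          # %XX from the table
        esac
    done
    printf '%s\n' "$out"
)

urlencode 'bibcode:(2005Natur.434...50H OR 2019BAAS...51g.255H)'
# prints bibcode%3A%282005Natur.434...50H%20OR%202019BAAS...51g.255H%29
```

The leading quote in "'$c" makes printf print the numeric code of the
character, which %02X then renders as the two hex digits of the table.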