| Title: | Interface to the arXiv API | 
| Version: | 0.12 | 
| Date: | 2025-07-29 | 
| Description: | An interface to the API for 'arXiv', a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics. | 
| URL: | https://docs.ropensci.org/aRxiv/, https://github.com/ropensci/aRxiv | 
| BugReports: | https://github.com/ropensci/aRxiv/issues | 
| Depends: | R (≥ 3.5.0) | 
| License: | MIT + file LICENSE | 
| Imports: | httr, utils, XML | 
| Suggests: | devtools, knitr, rmarkdown, roxygen2, testthat | 
| VignetteBuilder: | knitr | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-07-29 14:37:34 UTC; kbroman | 
| Author: | Karthik Ram | 
| Maintainer: | Karl Broman <broman@wisc.edu> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-07-29 15:20:09 UTC | 
arXiv subject classifications
Description
arXiv subject classifications: their abbreviations and corresponding descriptions.
Usage
data(arxiv_cats)
Format
A data frame with five columns: the abbreviations of the
subject classifications (category), the field of study,
subfield of study (within Physics; NA otherwise), a short
description, and a longer description.
Source
https://arxiv.org/category_taxonomy
Examples
arxiv_cats
Count number of results for a given search
Description
Count the number of results for a given search. Useful to check before attempting to pull down a very large number of records.
Usage
arxiv_count(query = NULL, id_list = NULL)
Arguments
| query | Search pattern as a string; a vector of such strings is
also allowed, in which case the elements are combined with  | 
| id_list | arXiv doc IDs, as comma-delimited string or a vector of such strings | 
Value
Number of results (integer). An attribute
"search_info" contains information about the search
parameters and the time at which it was performed.
See Also
arxiv_search(), query_terms(),
arxiv_cats()
Examples
# count papers in category stat.AP (applied statistics)
arxiv_count(query = "cat:stat.AP")
# count papers by Peter Hall in any stat category
arxiv_count(query = 'au:"Peter Hall" AND cat:stat*')
# count papers for a range of dates
#    here, everything in 2013
arxiv_count("submittedDate:[2013 TO 2014]")
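The "check before pulling" advice in the Description can be sketched as a small wrapper. This is an illustration only: `fetch_if_small` and its `max_records` threshold are hypothetical names, not part of aRxiv, and the function needs network access when it is called.

```r
# Hypothetical helper: count first, then download only if the result set
# is small enough. Not part of the aRxiv package.
fetch_if_small <- function(query, max_records = 500) {
  # as.integer() drops the "search_info" attribute, leaving a plain count
  n <- as.integer(aRxiv::arxiv_count(query = query))
  if (n == 0) {
    return(NULL)
  }
  if (n > max_records) {
    message("Query matches ", n, " records; narrow it before downloading.")
    return(invisible(NULL))
  }
  aRxiv::arxiv_search(query = query, limit = n)
}

# Example call (requires the aRxiv package and a network connection):
# z <- fetch_if_small('au:"Peter Hall" AND cat:stat*')
```

Defining the threshold as an argument keeps the decision about what counts as "very large" with the caller.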
Open abstract for results of arXiv search
Description
Open, in a web browser, the abstract page for each of a set of arXiv search results.
Usage
arxiv_open(search_results, limit = 20)
Arguments
| search_results | Data frame of search results, as returned from  | 
| limit | Maximum number of abstracts to open in one call. | 
Details
There is a delay between calls to
utils::browseURL(), with the amount taken from the R
option "aRxiv_delay" (in seconds); if missing, the default
is 3 sec.
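The delay option described above can be set and read with base R's options() and getOption(). The sketch below only shows how such an option is set and looked up with a 3-second fallback; it does not reproduce the package's internal code, and the value 5 is an arbitrary choice for illustration.

```r
# Set a longer delay (in seconds) and remember the previous setting.
old <- options(aRxiv_delay = 5)

# Look the option up, falling back to 3 seconds if it is not set.
delay <- getOption("aRxiv_delay", default = 3)
print(delay)

# Restore whatever was set before.
options(old)
```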
Value
(Invisibly) Vector of character strings with URLs of abstracts opened.
See Also
Examples
z <- arxiv_search('au:"Peter Hall" AND ti:deconvolution')
arxiv_open(z)
The main search function for aRxiv
Description
Allows for programmatic searching of the arXiv preprint repository.
Usage
arxiv_search(
  query = NULL,
  id_list = NULL,
  start = 0,
  limit = 10,
  sort_by = c("submitted", "updated", "relevance"),
  ascending = TRUE,
  batchsize = 100,
  force = FALSE,
  output_format = c("data.frame", "list"),
  sep = "|"
)
Arguments
| query | Search pattern as a string; a vector of such strings is
also allowed, in which case the elements are combined with  | 
| id_list | arXiv doc IDs, as comma-delimited string or a vector of such strings | 
| start | An offset for the start of search | 
| limit | Maximum number of records to return. | 
| sort_by | How to sort the results (ignored if  | 
| ascending | If TRUE, sort in ascending order; else descending
(ignored if  | 
| batchsize | Maximum number of records to request at one time | 
| force | If TRUE, force search request even if it seems extreme | 
| output_format | Indicates whether output should be a data frame or a list. | 
| sep | String to use to separate multiple authors,
affiliations, DOI links, and categories, in the case that
 | 
Value
If output_format="data.frame", the result is a data
frame with each row being a manuscript and columns being the
various fields.
If output_format="list", the result is a list parsed from
the XML output of the search, closer to the raw output from arXiv.
The data frame format has the following columns.
| [,1] | id | arXiv ID | 
| [,2] | submitted | date first submitted | 
| [,3] | updated | date last updated | 
| [,4] | title | manuscript title | 
| [,5] | summary | abstract | 
| [,6] | authors | author names | 
| [,7] | affiliations | author affiliations | 
| [,8] | link_abstract | hyperlink to abstract | 
| [,9] | link_pdf | hyperlink to pdf | 
| [,10] | link_doi | hyperlink to DOI | 
| [,11] | comment | authors' comment | 
| [,12] | journal_ref | journal reference | 
| [,13] | doi | published DOI | 
| [,14] | primary_category | primary category | 
| [,15] | categories | all categories | 
The contents are all strings; missing values are empty strings ("").
The columns authors, affiliations, link_doi,
and categories may have multiple entries separated by
sep (by default, "|").
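Multi-valued columns can be split back into vectors with base R. The sketch below uses a made-up one-row data frame with some of the documented column names, so it runs without querying arXiv; the values are placeholders, not real records.

```r
# Mock result row; the values are invented for illustration.
z <- data.frame(
  id = "0000.0000v1",
  authors = "A. Author|B. Author|C. Author",
  categories = "stat.ME|stat.AP",
  stringsAsFactors = FALSE
)
sep <- "|"

# fixed = TRUE matters here: "|" is a regular-expression metacharacter.
author_list <- strsplit(z$authors, sep, fixed = TRUE)
n_authors <- lengths(author_list)
print(author_list[[1]])
print(n_authors)
```

Because missing values are empty strings, strsplit() returns a zero-length vector for them, so `lengths()` reports 0 for records with no entry.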
The result includes an attribute "search_info" that includes
information about the details of the search parameters, including
the time at which it was completed. Another attribute
"total_results" is the total number of records that match
the query.
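The "total_results" attribute, together with start and limit, is enough to walk through a large result set one page at a time. The following is a sketch of the offset arithmetic only; `page_starts` is a hypothetical helper, not part of aRxiv, and the totals are made-up numbers.

```r
# Hypothetical helper: zero-based offsets for paging through all matches.
page_starts <- function(total_results, page_size) {
  if (total_results <= 0) {
    return(integer(0))
  }
  seq(0, total_results - 1, by = page_size)
}

starts <- page_starts(total_results = 25, page_size = 10)
print(starts)

# Each offset would then be passed on as, for example,
# arxiv_search(query, start = starts[i], limit = 10)
```

With 25 matches and pages of 10, the offsets are 0, 10 and 20, and the last page holds the remaining 5 records.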
See Also
arxiv_count(), arxiv_open(),
query_terms(), arxiv_cats()
Examples
# search for author Peter Hall with deconvolution in title
z <- arxiv_search(query = 'au:"Peter Hall" AND ti:deconvolution', limit=2)
attr(z, "total_results") # total no. records matching query
z$title
# search for a set of documents by arxiv identifiers
z <- arxiv_search(id_list = c("0710.3491v1", "0804.0713v1", "1003.0315v1"))
# can also use a comma-separated string
z <- arxiv_search(id_list = "0710.3491v1,0804.0713v1,1003.0315v1")
# Journal references, if available
z$journal_ref
# search for a range of dates (in this case, one day)
z <- arxiv_search("submittedDate:[199701010000 TO 199701012400]", limit=2)
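Date-range queries like the one above can be assembled from R dates rather than typed by hand. This is a minimal sketch: `date_range_query` is a hypothetical helper, not part of aRxiv, and it simply reproduces the 0000 to 2400 time bounds used in the example above.

```r
# Hypothetical helper: build a submittedDate range covering whole days.
date_range_query <- function(from, to) {
  from <- format(as.Date(from), "%Y%m%d")
  to <- format(as.Date(to), "%Y%m%d")
  sprintf("submittedDate:[%s0000 TO %s2400]", from, to)
}

q <- date_range_query("1997-01-01", "1997-01-01")
cat(q, "\n")

# The string can then be passed on, e.g. arxiv_search(q, limit = 2)
```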
Check for connection to arXiv API
Description
Check for connection to arXiv API
Usage
can_arxiv_connect(max_time = 5)
Arguments
| max_time | Maximum wait time in seconds | 
Value
Returns TRUE if connection is established and FALSE otherwise.
Examples
can_arxiv_connect(2)
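One possible use of the return value is to guard code that needs the API, for instance in scripts or tests that may run offline. The requireNamespace() check is an addition so that the sketch also runs when the package is not installed.

```r
# TRUE only if the package is installed and the API answers within 5 seconds.
ok <- requireNamespace("aRxiv", quietly = TRUE) &&
  aRxiv::can_arxiv_connect(max_time = 5)

if (ok) {
  n <- aRxiv::arxiv_count(query = "cat:stat.AP")
  print(n)
} else {
  message("arXiv API not reachable; skipping query.")
}
```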
arXiv query field terms
Description
Possible terms that correspond to different fields in arXiv searches.
Usage
data(query_terms)
Format
A data frame with two columns: the term and corresponding
description.
Author(s)
Karl W Broman
Source
https://arxiv.org/help/api/user-manual.html
Examples
query_terms
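Field terms are combined with values as `term:value`, as in the search examples elsewhere in this manual. The sketch below assembles such a string from a named vector; `build_query` is a hypothetical helper, not part of aRxiv, and it only handles the simple case of joining every clause with AND.

```r
# Hypothetical helper: names are field terms, values are search strings.
build_query <- function(terms) {
  # Quote values that contain spaces so they are treated as phrases.
  vals <- ifelse(grepl(" ", terms), sprintf('"%s"', terms), terms)
  paste(sprintf("%s:%s", names(terms), vals), collapse = " AND ")
}

q <- build_query(c(au = "Peter Hall", ti = "deconvolution"))
cat(q, "\n")
```

The result is the same string used in the arxiv_search() examples, `au:"Peter Hall" AND ti:deconvolution`.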