# let's load this package before getting started
library(libbib)
#> Loading required package: data.table
Intermediate knowledge of R is required to follow the examples here.
Basic knowledge of how to manipulate data.tables is helpful, but
following the examples should be possible without it.
libbib’s ability to communicate with the WorldCat Search
API is probably the most helpful capability of the package. As such, an
entire vignette dedicated to this powerful tool is warranted.
Also of note is that documentation on how to use this API is
scattered across the web, and some of it is no longer available
except through the Internet Archive.
Because of this, the goal of this vignette is not only to document
libbib’s worldcat_api_search function, but also to provide a single
location that compiles this archived documentation and offers
information and examples of what the WorldCat Search API is capable of
in general.
The terminology/nomenclature of the different WorldCat API offerings is fuzzy and inconsistent. The specific kind of Search API that this package offers usage of can, more specifically, be described as the ‘WorldCat SRU Search API version 1’.
There is a version 2 of this API, but version 1 hasn’t been sunsetted as of yet.
First, let’s speak of what this API is and what it is not.
This API allows a developer to search for bibliographic records that are cataloged in WorldCat.
The queries are made using the SRU standard search protocol (Search/Retrieve via URL) using a standard query syntax called CQL (Contextual Query Language). The WorldCat Search API doesn’t implement all features of CQL; this vignette will illustrate SRU/CQL only insofar as it is supported by this API.
This API is open to libraries that maintain both WorldCat Discovery
and OCLC Cataloging subscriptions and needs an API key (called a
WSKey) to work. A request can be made for a key (if your
institution doesn’t already have one) via this link.
This API is not the OpenSearch/Basic API, which doesn’t allow field-specific searches and only supports keyword-anywhere searches.
This is also not the same as using the advanced search option on worldcat.org. That option only provides a subset of the bibliographic records in WorldCat and is a far less powerful tool.
This API is most akin to using the “expert search” option in OCLC’s FirstSearch, using the WorldCat database, but differs in that (a) searching the API is programmatically automate-able, and (b) the API allows for, still, more powerful search queries.
What libbib’s worldcat_api_search function provides
At its most basic, this function takes an SRU query and returns a
data.table with most of the bibliographic metadata from the
MARCXML that the API returns. We’ll see how the behavior of this
function can be controlled by specifying certain function
parameters.
This function also offers assistance with the SRU query syntax. Mainly, the function allows you to substitute more human-readable equivalents, prefixed by a (US) dollar sign, for the arcane search index codes. Examples of these aids will be explained later in the vignette.
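To make the idea of alias substitution concrete, here is a minimal sketch of how such a translation could work. The helper name `translate_aliases` and the lookup vector `alias_lookup` are hypothetical and only cover a few of the alias/index-code pairs listed in this vignette; this is not libbib’s actual implementation.

```r
# Hypothetical lookup: a small subset of the alias / index-code pairs
# documented in this vignette.
alias_lookup <- c("$title" = "srw.ti", "$author" = "srw.au",
                  "$language" = "srw.ln", "$language_code" = "srw.la",
                  "$keyword" = "srw.kw")

# Illustrative helper (an assumption, not libbib's own code): replace
# each alias with its formal index code by plain string substitution.
translate_aliases <- function(sru, lookup = alias_lookup) {
  # substitute longer aliases first so that "$language_code" is not
  # partially rewritten by the shorter "$language"
  for (alias in names(lookup)[order(-nchar(names(lookup)))]) {
    sru <- gsub(alias, lookup[[alias]], sru, fixed = TRUE)
  }
  sru
}

translate_aliases('$title="Madame Bovary" and $author="Gustave Flaubert"')
#> [1] "srw.ti=\"Madame Bovary\" and srw.au=\"Gustave Flaubert\""
```

The translated string has the same shape as the value shown in the query column of the results.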
Let’s look at an example of a simple query and what the returned
data.table looks like.
Specifically, we’ll search for “Madame Bovary” by “Gustave Flaubert” and we’ll only show the first three results:
library(libbib) # load this package
result <- worldcat_api_search('$title="Madame Bovary" and
                               $author="Gustave Flaubert"')
# get the column names
names(result)
#> [1] "total_wc_results" "result_number" "oclc" "isbn"
#> [5] "issn" "title" "author" "pub_date"
#> [9] "lang_code" "bib_level" "record_type" "pub_place_code"
#> [13] "publisher" "leader" "oh08" "query"
# show the first three results
result[1:3,]
#> total_wc_results result_number oclc isbn issn
#> <char> <int> <char> <char> <char>
#> 1: 986 1 1125170419 9782253183464 <NA>
#> 2: 986 2 1049849403 9788415618843 <NA>
#> 3: 986 3 1203070641 9781664921993 <NA>
#> title author pub_date lang_code bib_level
#> <char> <char> <int> <char> <char>
#> 1: Madame Bovary : Flaubert, Gustave, 2019 fre Monograph/Item
#> 2: Madame Bovary / Flaubert, Gustave, 2019 spa Monograph/Item
#> 3: Madame Bovary / Flaubert, Gustave, 2021 eng Monograph/Item
#> record_type pub_place_code publisher
#> <char> <char> <char>
#> 1: Language Material fr le Livre de poche,
#> 2: Language Material sp <NA>
#> 3: Nonmusical sound recording ohu <NA>
#> leader oh08
#> <char> <char>
#> 1: 00000cam a2200000Mi 4500 190619s2019 fr a g 000 1 fre d
#> 2: 00000cam a2200000Ii 4500 180827t20192018sp a 000 1 spa d
#> 3: 00000cim a2200000Mi 4500 201104s2021 ohunnnneq f n eng d
#> query
#> <char>
#> 1: srw.ti="Madame Bovary" and srw.au="Gustave Flauber...
#> 2: srw.ti="Madame Bovary" and srw.au="Gustave Flauber...
#> 3: srw.ti="Madame Bovary" and srw.au="Gustave Flauber...
This should give you an idea of the rich information returned in the
results data.table. All of the information returned is from
(or derived from) the MARCXML that the API returns, save for the
following columns:
total_wc_results: the total number of results the query yields in
WorldCat, even if the number of records/results requested is lower than
this number.
result_number: the number of the result, which is helpful if you’re
using a starting position other than 1 (the default).
query: the final SRU query (after any of the SRU query assistance
routines step in) that is sent to the API. This can be useful for
debugging.
Hereafter, the output of the example queries will be abbreviated, truncated, elided, or only show a subset of columns in order to save space and aid in following the guide.
If you craft a query that yields no results, a message telling you
so will be displayed, and the return value is NULL.
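Because an empty search returns NULL rather than an empty data.table, downstream code should check for it before indexing. A minimal sketch, with a hypothetical helper name `count_results`:

```r
# Hypothetical helper: number of rows returned by a search, treating
# the NULL that signals "no results found" as zero rows.
count_results <- function(res) {
  if (is.null(res)) return(0L)
  nrow(res)
}

count_results(NULL)
#> [1] 0
```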
If you make an error in the query syntax, no results will be returned, but a diagnostic message returned from the server may tell you what went wrong. Here are three examples…
# missing ending double quotes
worldcat_api_search('$title="Madame Bovary and $language=greek')
#> Received diagnostic message: Query syntax error (org.z3950.zing.cql.CQLParseException: expected index or term, got EOF)
#> no results found
#> NULL
# "$titley" is not a valid search index
worldcat_api_search('$titley="Madame Bovary" and $language=greek')
#> Received diagnostic message: Unsupported index (srw.tiy)
#> no results found
#> NULL
# "$holding_library" cannot be searched on its own
worldcat_api_search("$holding_library=NYP")
#> Received diagnostic message: Limit index, can only be used to narrow a result
#> for a non-limit index (srw.li)
#> no results found
#> NULL
This last one failed because $holding_library is considered a
limit index. This means that you can’t search on it directly; you can
only combine it with other, non-limit indexes to filter their results.
Hopefully this will make more sense as you go through this vignette.
worldcat_api_search by example / quick start
In guides like this, there is frequently tension between wanting to be a complete reference, and wanting to show cool/helpful examples at a glance.
In service of easing this tension, we’ll first look at some examples illustrating things this function can do and then turn our attention to an adequate explanation of boolean operators, relations operators, search indexes, etc…
These examples are largely taken from the examples section of the R
documentation for this function.
These examples (and the ones further along in this vignette) will
also use data.table syntax to limit the number of rows and columns
returned, to aid reading.
We’ll be focusing on the types of queries you can use in this section; controlling the behavior of the function via changing parameters will come later.
library(libbib)
# title search for "The Brothers Karamazov"
results <- worldcat_api_search('$title="Brothers Karamazov"')
# Madame Bovary by Gustave Flaubert in Greek
sru <- '$author="Gustave Flaubert" and
        $title="Madame Bovary" and
        $language=greek'
results <- worldcat_api_search(sru)
# Hip Hop materials on wax, cassette, or CD published from
# 1987 to 1990
sru <- '(($material_type=cas or $material_type=cda or $material_type=lps)
         and $subject="Rap") and $year="1987-1990"'
results <- worldcat_api_search(sru)
# keyword search for "Common Lisp" for materials held
# at The New York Public Library
sru <- '$keyword="common lisp" and $holding_library=NYP'
results <- worldcat_api_search(sru)
# keyword search for "Common Lisp" for materials held
# by any of the members of the "Manhattan Research
# Library Initiative" (MaRLI) joint borrowing program
# (New York Public Library, Columbia University, and
# New York University)
sru <- '($keyword="common lisp" and $holding_library=NYP)
        or ($keyword="common lisp" and $holding_library=ZYU)
        or ($keyword="common lisp" and $holding_library=ZCU)'
results <- worldcat_api_search(sru)
# Books (only books) about Ethics (by dewey division 170s or
# LC call number subject class "BJ") published in the 19th
# century
sru <- '($dewey="17*" or $lc_call="bj*") and $year="18*" and
        $material_type=bks'
results <- worldcat_api_search(sru)
# Materials on Musicology (by Dewey division 780s) at
# the New York Public Library and not held by any
# other institution
sru <- '$dewey="78*" and $holding_library=NYP and
        $library_holdings_group=11'
results <- worldcat_api_search(sru)
# Search for materials on "Danger Music" published since 2010
results <- worldcat_api_search('$keyword="danger music" and $year="2010-"')
Now that we’ve seen these examples, sans explanation, we can now have a closer look into the components of a query. Broadly speaking, there are four concepts to be aware of…
relations operators
boolean operators
wildcards
search indexes
We’ll be looking at each in this order, because I think that makes the most sense. To illustrate the first three concepts, though, we have to use search indexes, before a formal explanation of what they are.
Briefly, a search index is a facet along which to search. In all of
the examples above, the search indexes were prefixed by a $
character (e.g. $title, $author, $keyword, etc…)
There are four relations operators available for use…
=
exact
any
all
It should be noted that not every relations operator is available for use with every search index.
=
This was the most common operator used in the quick-start examples
above. Though this operator can be most fully understood via contrast
with the next operator, exact, suffice it to say, for now, that using
= means that all of your search terms must match, without intervening
words. This is sometimes referred to as an “un-anchored” search.
If the phrase you’re searching for has spaces in it, you have to
surround it with double quotes. Since the SRU query given to the
worldcat_api_search function must be a string, and R strings can be
delimited with either single quotes or double quotes, we surround the
entire query with single quotes so that the double quotes inside it
need no special treatment.
We could instead surround the query with double quotes and “escape” the inner quotes, but using only single quotes to surround the whole query is the most elegant approach.
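As a quick check of the two quoting styles, the following base R snippet (which needs no API access) shows that a single-quoted query and its escaped, double-quoted counterpart are the same string:

```r
# The same query written two ways: single quotes around the whole
# query versus double quotes with the inner quotes escaped.
single_quoted <- '$title="Common Lisp"'
escaped       <- "$title=\"Common Lisp\""

identical(single_quoted, escaped)
#> [1] TRUE
```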
You can use both spaces and + to separate the search index and the
search term on both sides of the relations operator…
sru <- '$title+=+"Common Lisp"'
results <- worldcat_api_search(sru)
results[1:3, .(total_wc_results, title)]
#> total_wc_results title
#> <char> <char>
#> 1: 367 Common Lisp /
#> 2: 367 Practical Common Lisp /
#> 3: 367 Common LISP /
is the same as…
sru <- '$title = "Common Lisp"'
results <- worldcat_api_search(sru)
results[1:3, .(total_wc_results, title)]
#> total_wc_results title
#> <char> <char>
#> 1: 367 Common Lisp /
#> 2: 367 Practical Common Lisp /
#> 3: 367 Common LISP /
But we’ll be using spaces here.
The = relations operator is the only one that doesn’t require any
space or + on either side of the operator, and we’ll be using both the
spaced version and the non-spaced version in these examples for this
operator only.
exact (anchored search)
The exact operator, in contrast with the = operator, signals to the
API that your search must match as an exact phrase, without any other
terms in the matching string. For this reason, this operator is
sometimes referred to as an “anchored search”.
Here’s an example using the same title search as above…
results <- worldcat_api_search('$title exact "Common Lisp"')
results[1:3, .(total_wc_results, title)]
#> total_wc_results title
#> <char> <char>
#> 1: 37 Common Lisp /
#> 2: 37 Common LISP /
#> 3: 37 Common LISP /
Note that the title “Practical Common Lisp” was returned by the =
operator query. The exact operator will not match this title since it
has another word in it besides “Common Lisp”.
Using an exact search is probably a better fit than = for some indexes
and in some situations. For example, if you know the exact title of a
book (e.g. $title exact "Brothers Karamazov"), using exact will ensure
that books with a title like “A Guide to The Brothers Karamazov” or
“The Brothers Karamazov in Pictures” are not returned by the search.
That being said, we’ll be using = most heavily, since it’s easier to
read.
any
Using any means that any of your search terms (inside the double
quotes) can match. For example…
results <- worldcat_api_search('$title any "Common Lisp"')
results[1:3, .(total_wc_results, title)]
#> total_wc_results title
#> <char> <char>
#> 1: 351777 An Inquiry into the Human Mind on the Principles o...
#> 2: 351777 The book of common prayer.
#> 3: 351777 Dictionary of Phrase and Fable
matches the title “The book of common prayer” (decidedly not a book on the subject of Common Lisp) since it has the word “common” in it. (Note that capitalization doesn’t matter.)
Using any in this context is tantamount to using an or boolean
operator, which we’ll look at in the next section…
results <- worldcat_api_search('$title = Common or $title = "Lisp"')
results[1:3, .(total_wc_results, title)]
#> total_wc_results title
#> <char> <char>
#> 1: 351777 An Inquiry into the Human Mind on the Principles o...
#> 2: 351777 The book of common prayer.
#> 3: 351777 Dictionary of Phrase and Fable
(Note the same number and order of results)
all
Using the all operator means that all of the search terms must match,
but the search terms can be in any order and have intervening terms
between them. For example, a search like $title all "Common Lisp" can
match a (fictional) book named “Speaking with a lisp is common”.
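To contrast the four relations side by side, here is a toy, purely local approximation of the behavior described above. The function names `normalize_text` and `title_matches` are hypothetical, and the real matching happens on OCLC’s servers with normalization rules that may well differ; treat this only as a mental model.

```r
# Lower-case a string, drop punctuation, and trim surrounding spaces.
normalize_text <- function(x) {
  trimws(tolower(gsub("[^[:alnum:] ]", "", x)))
}

# Hypothetical local stand-in for the four relations operators.
title_matches <- function(title, relation, phrase) {
  title  <- normalize_text(title)
  phrase <- normalize_text(phrase)
  terms  <- strsplit(phrase, " +")[[1]]
  words  <- strsplit(title, " +")[[1]]
  if (relation == "=") {
    # the phrase must appear with no intervening words (un-anchored)
    grepl(paste0(" ", phrase, " "), paste0(" ", title, " "), fixed = TRUE)
  } else if (relation == "exact") {
    # the whole title must be the phrase (anchored)
    identical(title, phrase)
  } else if (relation == "any") {
    any(terms %in% words)
  } else if (relation == "all") {
    all(terms %in% words)
  } else {
    stop("unknown relation: ", relation)
  }
}

title_matches("Practical Common Lisp /", "=", "Common Lisp")
title_matches("Practical Common Lisp /", "exact", "Common Lisp")
title_matches("The book of common prayer.", "any", "Common Lisp")
title_matches("Speaking with a lisp is common", "all", "Common Lisp")
```

Under this model the first, third, and fourth calls match, while the second does not, mirroring the examples in the text.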
If your search term is one word, you do not have to use double quotes
to surround the term. For example, $title = Ethics and
$title exact Ethics work perfectly well.
If your search term has a single quote in it, it must be escaped so that R doesn’t interpret it as the end of the search string. That being said, it appears as if you can just drop the single quote and the search will carry on perfectly fine…
results <- worldcat_api_search('$title exact "Finnegans Wake"')
results[1:3, .(total_wc_results, title, query)]
#> total_wc_results title query
#> <char> <char> <char>
#> 1: 761 Finnegans wake / srw.ti exact "Finnegans Wake"
#> 2: 761 Finnegans Wake srw.ti exact "Finnegans Wake"
#> 3: 761 Finnegans wake srw.ti exact "Finnegans Wake"
# yields the same results as
results <- worldcat_api_search('$title exact "Finnegan\'s Wake"')
results[1:3, .(total_wc_results, title, query)]
#> total_wc_results title query
#> <char> <char> <char>
#> 1: 761 Finnegans wake / srw.ti exact "Finnegan's Wake"
#> 2: 761 Finnegans Wake srw.ti exact "Finnegan's Wake"
#> 3: 761 Finnegans wake srw.ti exact "Finnegan's Wake"
I suspect single quotes are automatically elided by the API.
As evinced in the examples shown earlier in this vignette, you can use the boolean operators and, or, and not to refine your search.
As you mix different boolean operators in a single query, care must be taken to ensure that the order/precedence of these operators matches your intention.
For example, in one of the examples above, we searched for Hip Hop materials on wax, cassette, or CD published from 1987 to 1990 with the following SRU query
'(($material_type=cas or $material_type=cda or $material_type=lps)
and $subject="Rap") and $year="1987-1990"'
It’s important to note that if we wrote the query like shown below
'$material_type=cas or $material_type=cda or $material_type=lps
and $subject="Rap" and $year="1987-1990"'
without any parentheses, the meaning of the query would be ambiguous. Make sure you use parentheses around distinct sections of your search incantation to disambiguate the query.
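One way to keep the grouping explicit is to assemble the parenthesized pieces programmatically. The sketch below uses only base R string functions; the helper name `or_group` is hypothetical and is not part of libbib.

```r
# Hypothetical helper: join several values for one index with "or"
# and wrap the result in parentheses so its scope is unambiguous.
or_group <- function(index, values) {
  paste0("(", paste0(index, "=", values, collapse = " or "), ")")
}

formats <- or_group("$material_type", c("cas", "cda", "lps"))
sru <- paste0("(", formats, ' and $subject="Rap") and $year="1987-1990"')
sru
```

The resulting string is the Hip Hop query from the quick-start examples, written on a single line, and can be passed to worldcat_api_search as-is.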
taken verbatim from the archived documentation
For right truncation use an asterisk - * There must be at least three characters before the * for the query to work. There is no left truncation.
To wildcard a single character wildcard use a number sign - #. So a query for wom#n provides results that include both woman and women in the results.
For a 0-9 number of characters as wildcard characters use ?n. So a query for colo?1r provides results of color and colour.
To wildcard characters within (or at the end) a term use the question mark - ?. So a query for colo?r provides results of color, colour, colonizer, and colorimeter.
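To get a feel for these wildcard rules without calling the API, the following sketch translates a wildcard pattern into a regular expression and tests it against a few words locally. The function `wildcard_to_regex` is hypothetical and assumes the term contains only letters plus the wildcard characters; it models the rules as quoted above, not the server’s actual matching.

```r
# Hypothetical translation of the documented wildcard characters into
# a regular expression, for local experimentation only.
wildcard_to_regex <- function(pattern) {
  rx <- gsub("*", ".*", pattern, fixed = TRUE)  # *  : right truncation
  rx <- gsub("\\?([0-9])", ".{0,\\1}", rx)      # ?n : zero to n characters
  rx <- gsub("?", ".*", rx, fixed = TRUE)       # ?  : any run of characters
  rx <- gsub("#", ".", rx, fixed = TRUE)        # #  : exactly one character
  paste0("^", rx, "$")
}

grepl(wildcard_to_regex("wom#n"), c("woman", "women", "womn"))
#> [1]  TRUE  TRUE FALSE
grepl(wildcard_to_regex("colo?1r"), c("color", "colour", "colonizer"))
#> [1]  TRUE  TRUE FALSE
grepl(wildcard_to_regex("colo?r"), c("color", "colour", "colonizer", "colorimeter"))
#> [1] TRUE TRUE TRUE TRUE
```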
Formal search index code | libbib alias |
---|---|
srw.kw | $keyword |
srw.ti | $title |
srw.ln | $language |
srw.au | $author |
srw.yr | $year |
srw.su | $subject |
srw.li | $holding_library |
srw.mt | $material_type |
srw.no | $oclc |
srw.lc | $lc_call |
srw.dd | $dewey |
srw.dn | $lccn |
srw.bn | $isbn |
srw.in | $issn |
srw.cg | $library_holdings_group |
srw.la | $language_code |
srw.pl | $place_of_publication |
srw.pb | $publisher |
srw.am | $access_method |
srw.cn | $corporate_conference_name |
srw.pc | $dlc_limit |
srw.dt | $document_type |
srw.gn | $government_document_number |
srw.mn | $music_publisher_number |
srw.nt | $notes |
srw.on | $open_digital_limit |
srw.pn | $personal_name |
srw.se | $series |
srw.sn | $standard_number |
According to the archived Search API documentation, the title search phrase will automatically elide certain common and “un-important” words (though I have strong doubts about this). In the field of Natural Language Processing, we call these “stop words”.
The archived docs indicate that the following words will be removed from the search phrase:
a, als, am, an, are, as, at, auf, aus, be, but, by, das, dass, de, der, des, dich, dir, du, er, es, for, from, had, have, he, her, his, how, ihr, ihre, ihres, im, in, is, ist, it, kein, la, le, les, mein, mich, mir, mit, of, on, sein, sie, that, the, this, to, un, une, von, was, wer, which, wie, wird, with, you.
This means that title searches for The Brothers Karamazov, La Noche Boca Arriba, Das Spiel ist aus, or Das Kapital will, allegedly, internally use the search phrases Brothers Karamazov, Noche Boca Arriba, Spiel, and Kapital, respectively.
When searching the title index, though, I’d leave these words in the SRU search query for two reasons…
It makes the title search phrase more readable, especially with non-English phrases.
I have strong doubts that all of these words are elided. For example, a search for Das Spiel ist aus, yields results as if the search phrase were indeed Das Spiel ist aus, and not as if it were simply Spiel.
As shown in one of the examples above, the holding library search term is the official OCLC designator. You can search for an institution’s code by the institution’s name using this link.
An exhaustive crosswalk of all material types and their (normally 3-letter) codes would be too large to include here, but you can access the crosswalk in the documentation that is archived at this link.
The “library holdings group” search term is, perhaps, a confusing one. The table below is a crosswalk between the search codes to use in the SRU query, and what they mean.
Search code | Meaning |
---|---|
05 | 5 or more holdings |
06 | 10 or more holdings |
07 | 50 or more holdings |
08 | 100 or more holdings |
09 | 500 or more holdings |
10 | No holdings |
11 | 1 holding only |
12 | 2 - 4 holdings |
13 | 5 - 9 holdings |
14 | 10 - 24 holdings |
15 | 25 - 49 holdings |
16 | 50 - 74 holdings |
17 | 75 - 99 holdings |
18 | 100 - 149 holdings |
19 | 150 - 199 holdings |
20 | 200 - 299 holdings |
21 | 300 - 399 holdings |
22 | 400 - 499 holdings |
23 | 500 - 599 holdings |
24 | 600 - 699 holdings |
25 | 700 - 799 holdings |
26 | 800 - 899 holdings |
27 | 900 - 999 holdings |
28 | 1,000 - 1,499 holdings |
29 | 1,500 - 1,999 holdings |
30 | 2,000 - 2,499 holdings |
31 | 2,500 or more holdings |
So, for example, if you wanted to limit your search to items that are
held by between 100 and 149 institutions, you would have to add
$library_holdings_group=18 to your SRU query.
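If you find yourself looking up these codes often, the range codes (10 through 31) in the table can be computed from a holdings count. The following is a minimal sketch; `holdings_group_code` is a hypothetical helper, not a libbib function, and its lower bounds are copied from the crosswalk above.

```r
# Hypothetical helper: map a number of holdings to the matching
# range code (10-31) from the crosswalk table.
holdings_group_code <- function(n_holdings) {
  # lower bound of each range, in the order of codes 10, 11, ..., 31
  lower_bounds <- c(0, 1, 2, 5, 10, 25, 50, 75, 100, 150, 200, 300,
                    400, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500)
  sprintf("%02d", 9L + findInterval(n_holdings, lower_bounds))
}

holdings_group_code(120)
#> [1] "18"
paste0("$library_holdings_group=", holdings_group_code(120))
#> [1] "$library_holdings_group=18"
```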
For all the available search indexes, it is very helpful to know which exact MARC fields are searched in each one. This archived documentation link contains that information.
On that page, the SRU index codes have a prefix of sru, but it’s
really srw. Consult the table of all available search indexes above to
find the libbib aliases for the indexes of interest (though, of course,
you can use the un-translated index codes if you’d like, too).
worldcat_api_search function
Besides the SRU query itself, the worldcat_api_search function takes a
number of optional parameters that can be used to alter its behavior.
Below is a list and explanation of each of those parameters. (This
information is also available by running help("worldcat_api_search")
in an R console after loading the libbib package.)
max_records The maximum number of search results to return. This must
be a number between 0 and 100, or Inf. If Inf, the function will
automatically make all the follow-up requests needed to retrieve every
search result. To limit the number of times the API is hit, the default
is 10 search results.
sru_query_assist A logical (boolean) indicating whether translation
from the more human-readable aliases to the SRU search index codes
should be performed. The default is TRUE. You can control this
parameter globally by setting the libbib.sru_query_assist option,
e.g. options(libbib.sru_query_assist=FALSE).
frbrGrouping FRBR (Functional Requirements for Bibliographic Records)
is a conceptual framework for understanding the relationships between a
“work” (e.g. a novel as planned by an author), its “expression”
(e.g. the manuscript of that novel), its “manifestation” (e.g. the
first published version of the novel), and, finally, an “item” (e.g. an
actual physical book [or microform; audio file; etc…] of the work).
With the frbrGrouping parameter set to on (the default), an attempt is
made by the WorldCat API to group together similar editions and present
only the top held record as the representative record for that group.
This, conceptually, can be viewed as an attempt to return search
results for any expression of the “work” (or works) referred to in the
search query.
start_at The search result to start at (the default is 1).
wskey A WorldCat API key. This function is easiest to use by setting
the wskey globally, with the following incantation:
options(libbib.wskey="YOUR WSKEY GOES HERE")
more A logical indicating whether more information from the MARCXML
search results should be returned (publisher, bib level, etc…). The
default is TRUE.
print.progress A logical indicating whether a message should be
displayed for each API request. If max_records is Inf, a message will
be displayed for every group of 100 search results the function
fetches. The default is TRUE.
debug A logical indicating whether the HTTP and API responses should be
printed (for debugging). The default is FALSE.
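To illustrate how start_at and the 100-record ceiling on max_records relate when retrieving a large result set, here is a sketch of the starting positions that a sequence of follow-up requests would use. The helper `batch_starts` is hypothetical; it mirrors the paging behavior described above rather than libbib’s internal code.

```r
# Hypothetical helper: the start_at value for each request needed to
# walk through `total_results` records, `batch_size` records at a time.
batch_starts <- function(total_results, batch_size = 100) {
  if (total_results < 1) return(integer(0))
  seq(1, total_results, by = batch_size)
}

# the Madame Bovary query earlier reported 986 total results
batch_starts(986)
#> [1]   1 101 201 301 401 501 601 701 801 901
```

With max_records=Inf the function handles this for you; a helper like this is only useful if you prefer to page manually, for example to pause between requests.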
libbib
There are numerous ways to combine worldcat_api_search with the other
functions that libbib provides to do some really useful investigations.
In the example below, we’ll be using the worldcat_api_search and
worldcat_api_locations_by_oclc functions to get a list of institutions
that hold any edition of my textbook. (This example uses some
data.table-specific syntax for brevity, but it will work with base R
[or “tidyverse”] translations just fine.)
First, let’s use the search function to get a list of all search results for the book…
results <- worldcat_api_search('srw.ti="Data Analysis with r"
                                and srw.au=fischetti',
                               max_records=Inf)
# inspect some of the columns in the first 5 results
results[1:5, .(total_wc_results, result_number, oclc,
               title, author, pub_date)]
#> total_wc_results result_number oclc title
#> <char> <int> <char> <char>
#> 1: 11 1 1005106045 DATA ANALYSIS WITH R -.
#> 2: 11 2 1089176194 Data analysis with R :
#> 3: 11 3 949229431 Data analysis with R :
#> 4: 11 4 1242682069 Data Analysis with R
#> 5: 11 5 1242707288 Data Analysis with R
#> author pub_date
#> <char> <int>
#> 1: FISCHETTI, TONY. 2018
#> 2: Fischetti, Tony. 2018
#> 3: Fischetti, Tony. 2015
#> 4: Fischetti, Tony 2015
#> 5: Fischetti, Tony 2015
Now let’s get all the unique OCLC numbers from all the search results.
all_the_oclcs <- results[, unique(oclc)]
all_the_oclcs
#> [1] "1005106045" "1089176194" "949229431" "1242682069" "1242707288"
#> [6] "1244405806" "1104264768" "1242685020" "1242707684" "1244406814"
#> [11] "1104846312"
On to the worldcat_api_locations_by_oclc function! Since this function
takes one OCLC number at a time, we need to use a looping construct to
run the function with all the OCLC numbers in all_the_oclcs. We’ll be
using the pblapply function (from the great pbapply package) to do this
because we get a useful progress bar with no extra effort.
library(pbapply) # provides pblapply
holds <- pblapply(all_the_oclcs,
                  function(x){
                    worldcat_api_locations_by_oclc(x,
                                                   include.bib.info=FALSE)
                  })
Since the pblapply function returns a list of data.tables (one for each
OCLC number), we’ll use data.table’s rbindlist function to combine them
into one data.table containing all the results.
all_holdings_dt <- rbindlist(holds)
all_holdings_dt[1:3]
#> oclc institution_identifier institution_name copies
#> <char> <char> <char> <char>
#> 1: 1005106045 FEM The Ferguson Library 1
#> 2: 1005106045 YDX YBP Library Services 1
#> 3: 1005106045 DUQ Duquesne University Library 1
all_holdings_dt[, .(institution_name)]
#> institution_name
#> <char>
#> 1: The Ferguson Library
#> 2: YBP Library Services
#> 3: Duquesne University Library
#> 4: Centennial College
#> 5: George Brown College
#> ---
#> 1029: Hochschule Mittweida (FH), Hochschulbibliothek
#> 1030: Cyberlibris
#> 1031: Cyberlibris
#> 1032: Cyberlibris
#> 1033: Bibliothèque de l'Université du Québec à Trois-Riv...
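There you have it! Across all of its editions, my textbook accounts for 1033 holdings at OCLC institutions. (An institution that holds more than one edition appears more than once; Cyberlibris, for instance, is listed three times at the end of the output above, so the number of distinct institutions is somewhat lower.)
Although the example above only searches the holding institutions of one specific book, the idiom is most helpful/interesting/cool when used for finding the holding institutions of an entire class of materials.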
For example, in a recent project for a curator at my institution, I
used the Search API to find all search results for materials on a very
specific topic, got all the holding institutions for each of the search
results, and then aggregated the institutions (with this package’s
dt_counts_and_percents function) to find the institutions holding the
most items on this particular (very specific) topic.
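As a rough base R illustration of that aggregation step, the sketch below counts distinct institutions and ranks them by number of holdings. The data frame `toy_holdings` is made-up example data with the same column names as the holdings table above; the same expressions should also work on a data.table such as all_holdings_dt.

```r
# Made-up example data, shaped like the holdings table above.
toy_holdings <- data.frame(
  oclc = c("1", "1", "2", "2", "3"),
  institution_identifier = c("AAA", "BBB", "AAA", "CCC", "AAA"),
  stringsAsFactors = FALSE
)

# number of distinct holding institutions
n_distinct <- length(unique(toy_holdings$institution_identifier))
n_distinct
#> [1] 3

# institutions ranked by how many of the records they hold
holding_counts <- sort(table(toy_holdings$institution_identifier),
                       decreasing = TRUE)
names(holding_counts)[1]
#> [1] "AAA"
```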
WorldCat Search API current documentation
scroll to SRU
Current documentation on the (SRU) WorldCat Search API call. Contains all index names and codes, parameters available, and the meaning of API status code return numbers
WorldCat Search API > Using the API > Request Types > SRU
(archived OCLC documentation link from 2013)
Examples of simple SRU requests and an explanation of the frbrGrouping
and servicelevel parameters.
WorldCat Search API > Indexes
(archived OCLC documentation link from 2013)
Explanation of relations operators, boolean operators, title index
stop-words (removed words), the MARC subfields searched by (only) the
subject index, and an explanation of the wildcard characters *, #,
and ?.
WorldCat Search API > Indexes > Complete List of Indexes
(archived OCLC documentation link from 2013)
Explanation of relations operators and a complete list of all indexes,
their respective SRU index codes, the relations available for use with
each, and the MARC fields searched for each index. Here, the SRU index
codes have a prefix of sru, but it’s really srw.
Material Types Names and Codes
A complete list of the Material Types you can search for and their
codes, for use with the srw.mt/$material_type index.
WorldCat Search API > Indexes > Tips for specific indexes
(archived OCLC documentation link from 2013)
Contains a lot of very helpful information about some of the different search indexes, the stop words they use, normalization rules, and a very helpful cross-walk on the “Number of Holding Libraries” index.
WorldCat Search API > Using the API > Parameters
(archived OCLC documentation link from 2013)
Helpful information about what the different SRU search parameters
mean. Remember that the parameters of the worldcat_api_search function
(a) only contain a subset of these, and (b) may have slightly different
names.
https://web.archive.org/web/20121101034843/http://oclc.org/developer/documentation/worldcat-search-api/parameters
Finding Institution Codes
A search box for searching institution names and returning institution codes suitable for use in the holding institution search index.
Searching WorldCat Indexes
(archived OCLC documentation link from 2012)
Information about search indexes that are largely available in the above links. Not related to the (SRU) Search API but could be helpful, anyway.
CQL specification
Helpful information about CQL query syntax. Not all is applicable to the WorldCat Search API that is discussed here.
Searching WorldCat Indexes
Current page with links containing a lot of information about wielding WorldCat searches. Not all information is applicable to the WorldCat Search API that is discussed here, and most of the most helpful information is already covered in the links above.
https://help.oclc.org/Librarian_Toolbox/Searching_WorldCat_Indexes
A Ruby Gem to communicate with the WorldCat Search API
Python scripts using the WorldCat search API
Using the WorldCat API to Develop Data-Driven Decision-Making for Gifts-in-Kind
https://journals.ala.org/index.php/lrts/article/view/7074/9638
Destroyer and Preserver, Hear, Oh Hear! Not All Uncirculated Books Must Chariotest to a Dark Wintry Bed: How We Used the OCLC WorldCat Search API to Inform Our Weeding Decisions with Holdings Data
https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=2026&context=charleston