“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
(from Mr. Tilney in Northanger Abbey)
This package provides access to the full texts of Jane Austen’s 6 completed, published novels. The UTF-8 plain text for each novel was sourced from Project Gutenberg, processed a bit, and is ready for text analysis. Each text is in a character vector with elements of about 70 characters. The package contains:
sensesensibility: Sense and Sensibility, published in 1811
prideprejudice: Pride and Prejudice, published in 1813
mansfieldpark: Mansfield Park, published in 1814
emma: Emma, published in 1815
northangerabbey: Northanger Abbey, published posthumously in 1818
persuasion: Persuasion, also published posthumously in 1818
There is also a function austen_books() that returns a tidy data frame of all 6 novels.
Users should be aware that there are some differences in usage between the novels as made available by Project Gutenberg. For example, “anything” vs. “any thing”, “Mr” vs. “Mr.”, and using underscores vs. all caps to indicate italics/emphasis.
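If these differences matter for an analysis, one option is to harmonise the texts before comparing novels. The following is only a minimal sketch in base R: normalize_austen() is a hypothetical helper, not part of janeaustenr, and the direction of each substitution (for example dropping the period after "Mr") is an arbitrary choice made for illustration.

```r
# Hypothetical helper (not part of janeaustenr): harmonise a few of the
# usage differences between the Project Gutenberg texts.
normalize_austen <- function(x) {
  # "any thing" -> "anything", keeping a capital "A" if present
  x <- gsub("\\b([Aa])ny thing\\b", "\\1nything", x, perl = TRUE)
  # "Mr." / "Mrs." -> "Mr" / "Mrs"
  x <- gsub("\\b(Mrs?)\\.", "\\1", x, perl = TRUE)
  # "_word_" -> "word": drop underscores that mark emphasis
  x <- gsub("_([^_]+)_", "\\1", x, perl = TRUE)
  x
}

# Made-up sample lines, not quotations from the novels
sample_lines <- c("Any thing for _you_, Mr. Darcy.",
                  "Mrs. Bennet said any thing.")
normalize_austen(sample_lines)
```

This sketch works one element at a time, so emphasis that spans a line break is left untouched, and emphasis written in all caps is not handled at all.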
You can install the released version of janeaustenr from CRAN with:
install.packages("janeaustenr")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("juliasilge/janeaustenr")
For some ideas on getting started with analyzing these texts, see my blog post on sentiment analysis of Austen’s novels. For help within R, try ?persuasion or similar for getting started with the data sets.
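A first look at the data might go something like the sketch below. It assumes janeaustenr is already installed (the block does nothing otherwise) and that the data frame returned by austen_books() has a column named book identifying each novel; run str() on the result to confirm the column names in your installed version.

```r
# Assumes janeaustenr is installed; skip quietly otherwise
if (requireNamespace("janeaustenr", quietly = TRUE)) {
  library(janeaustenr)

  # Each novel is a character vector, one element per line of text
  print(head(persuasion, 10))

  # austen_books() combines all six novels into one data frame
  books <- austen_books()
  str(books)

  # Lines per novel; `book` is assumed to be the column naming each novel
  print(table(books$book))
}
```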
This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.