Version: 0.1-3
Title: PDF Tools Based on Poppler
Description: PDF tools based on the Poppler PDF rendering library. See http://poppler.freedesktop.org/ for more information on Poppler.
License: GPL-2
SystemRequirements: Poppler Glib interface headers and libraries (<http://poppler.freedesktop.org/>) [Debian/Ubuntu: libpoppler-glib-dev, Fedora: poppler-glib-devel]
NeedsCompilation: yes
Packaged: 2024-08-13 11:10:12 UTC; hornik
Author: Kurt Hornik [aut, cre]
Maintainer: Kurt Hornik <Kurt.Hornik@R-project.org>
Repository: CRAN
Date/Publication: 2024-08-13 11:13:31 UTC

PDF document reference

Description

Create a reference to a Portable Document Format (PDF) file for use in subsequent information extraction from the file.

Usage

PDF_doc(file)

Arguments

file

A character string giving the path to a PDF file.

Value

A reference to a PDF file (external pointer object).

Examples

file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
doc <- PDF_doc(file)
## Can now use the reference for information extraction, avoiding
## the creation of new PopplerDocument objects when doing so.
PDF_info(doc)
PDF_fonts(doc)

PDF font information

Description

Obtain the fonts used in a Portable Document Format (PDF) file and further information about these fonts.

Usage

PDF_fonts(file)

Arguments

file

A character string giving the path to a PDF file, or an object of class "PDF_doc" giving a reference to a PDF file.

Value

A data frame inheriting from class "PDF_fonts" (which has a useful print method), with the following variables:

name

the full name of the font (character)

type

the font type (Type 1, Type 3, etc.; character)

file

the file name of the font (character; empty if the font is embedded)

emb

whether the font is embedded in the PDF file or not (logical)

sub

whether the font is a subset of another font (logical)

Examples

file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
PDF_fonts(file)
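
Because the result is a data frame, the documented variables can be used to filter it. The following is a minimal sketch, not part of the package: non_embedded_fonts is a hypothetical helper, and the package name Rpoppler used in the guarded call is an assumption, since this page does not name the package.

```r
## Hypothetical helper (not part of the package): keep only the rows for
## fonts that are not embedded, based on the documented logical column 'emb'.
non_embedded_fonts <- function(fonts) {
    stopifnot(is.data.frame(fonts), "emb" %in% names(fonts))
    ## which() drops NA values, so fonts with unknown status are excluded.
    fonts[which(!fonts$emb), , drop = FALSE]
}

## Guarded usage: runs only if the package (assumed to be named Rpoppler)
## is installed and the sample PDF is available.
file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
if (requireNamespace("Rpoppler", quietly = TRUE) && nzchar(file)) {
    fonts <- Rpoppler::PDF_fonts(file)
    print(non_embedded_fonts(fonts))
}
```

Fonts that are not embedded depend on the viewer or printer having a suitable substitute, so a check of this kind can be useful before distributing a file.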

PDF document information

Description

Extract document information from a Portable Document Format (PDF) file.

Usage

PDF_info(file)

Arguments

file

A character string giving the path to a PDF file, or an object of class "PDF_doc" giving a reference to a PDF file.

Value

An object of class "PDF_info" (which has useful format and print methods). It contains the information in the PDF Info dictionary (title, subject, keywords, author, creator, producer, creation date, modification date), the number of pages and the page sizes, whether the document is optimized (linearized), and the PDF version it uses.

Examples

file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
PDF_info(file)

PDF text extraction

Description

Extract text from a Portable Document Format (PDF) file.

Usage

PDF_text(file)

Arguments

file

A character string giving the path to a PDF file, or an object of class "PDF_doc" giving a reference to a PDF file.

Value

A character vector with one element per page, each giving the text extracted from that page.

Examples

file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
PDF_text(file)
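
Since the value is a character vector with the text for each page, standard string functions can be applied to it page by page. The following is a minimal sketch, not part of the package: pages_matching is a hypothetical helper, and the package name Rpoppler used in the guarded call is an assumption, since this page does not name the package.

```r
## Hypothetical helper (not part of the package): indices of the pages whose
## extracted text contains a given fixed string.
pages_matching <- function(pages, pattern) {
    stopifnot(is.character(pages),
              is.character(pattern), length(pattern) == 1L)
    which(grepl(pattern, pages, fixed = TRUE))
}

## Guarded usage: runs only if the package (assumed to be named Rpoppler)
## is installed and the sample PDF is available.
file <- system.file(file.path("doc", "Sweave.pdf"), package = "utils")
if (requireNamespace("Rpoppler", quietly = TRUE) && nzchar(file)) {
    pages <- Rpoppler::PDF_text(file)
    cat("Number of pages:", length(pages), "\n")
    print(pages_matching(pages, "Sweave"))
}
```

The match is done with fixed = TRUE so that the pattern is treated as a literal string rather than a regular expression.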
