Type: Package
Title: Wrapper for the 'mediacloud.org' API
Version: 0.1.0
Depends: R (≥ 3.2.0)
Description: API wrapper to gather news stories, media information and tags from the 'mediacloud.org' API (https://mediacloud.org/), based on a multilevel query. A personal API key is required.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Imports: httr, jsonlite, rvest, xml2
Suggests: testthat, covr, knitr, rmarkdown
RoxygenNote: 6.1.1
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2019-07-21 08:33:43 UTC; jan
Author: Dix Jan [cre, aut]
Maintainer: Dix Jan <jan.dix@uni-konstanz.de>
Repository: CRAN
Date/Publication: 2019-07-24 07:50:02 UTC

Extract meta data

Description

extract_meta_data extracts native, Open Graph and Twitter meta data from HTML documents. The meta data include url, title, description and image. The HTML document is parsed within the function, so it is passed in as raw text.

Usage

extract_meta_data(html_doc)

Arguments

html_doc

Character string containing the HTML document.

Value

List with three sublists, one each for native, Open Graph and Twitter meta data.

Examples

## Not run: 
 library(httr)
 url <- "https://bits.blogs.nytimes.com/2013/04/07/the-potential-and-the-risks-of-data-science"
 response <- GET(url)
 html_document <- content(response, type = "text", encoding = "UTF-8")
 meta_data <- extract_meta_data(html_doc = html_document)

## End(Not run)
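
The following is a minimal sketch of the kind of extraction described above, not the package's actual implementation. It uses xml2 (one of the package's imports) to read native, Open Graph and Twitter meta tags from an inline HTML string. The helper meta_by_prefix, the sample HTML and the names of the result list are all illustrative assumptions.

```r
library(xml2)

# Hypothetical helper, not part of the package: collect the content of all
# <meta> tags whose key attribute (name or property) starts with a prefix.
meta_by_prefix <- function(doc, attr, prefix) {
  xpath <- sprintf("//meta[starts-with(@%s, '%s')]", attr, prefix)
  nodes <- xml_find_all(doc, xpath)
  values <- xml_attr(nodes, "content")
  names(values) <- xml_attr(nodes, attr)
  as.list(values)
}

# Small inline document so the sketch runs without network access.
html_document <- '<html><head>
  <title>Example page</title>
  <meta name="description" content="A native description">
  <meta property="og:title" content="An Open Graph title">
  <meta name="twitter:title" content="A Twitter title">
</head><body></body></html>'

doc <- read_html(html_document)

# Assumed result layout: one sublist per meta data family.
meta_data <- list(
  native = list(
    title = xml_text(xml_find_first(doc, "//title")),
    description = xml_attr(
      xml_find_first(doc, "//meta[@name='description']"), "content"
    )
  ),
  open_graph = meta_by_prefix(doc, "property", "og:"),
  twitter = meta_by_prefix(doc, "name", "twitter:")
)
```

The list returned by extract_meta_data may use different element names; inspect it with str() before relying on a particular layout.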


Get media by id

Description

get_media_source returns a media source by its id. A media source is one publisher. Every story that can be collected via get_story or get_story_list belongs to one media source.

Usage

get_media_source(media_id, api_key = Sys.getenv("MEDIACLOUD_API_KEY"))

Arguments

media_id

Positive integer that contains a valid media id.

api_key

Character string with the API key you get from mediacloud.org. A key is required. It can either be passed explicitly or, as by default, read from the MEDIACLOUD_API_KEY environment variable.

Value

Data frame with results. See https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md#media for field descriptions.

Examples

## Not run: 
 media_source <- get_media_source(media_id = 604L)

## End(Not run)
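
Because api_key defaults to Sys.getenv("MEDIACLOUD_API_KEY"), the key can be set once per session instead of being passed on every call. The sketch below illustrates that default-argument pattern with a hypothetical stand-in function, describe_key, which is not part of the package; the key values are placeholders.

```r
# Set the key once per session (placeholder value, not a real key).
Sys.setenv(MEDIACLOUD_API_KEY = "my-secret-key")

# Hypothetical stand-in that mirrors the api_key default used by the
# package functions; it only reports whether a key was found.
describe_key <- function(api_key = Sys.getenv("MEDIACLOUD_API_KEY")) {
  if (!nzchar(api_key)) {
    stop("No API key found: pass api_key or set MEDIACLOUD_API_KEY.")
  }
  paste0("key with ", nchar(api_key), " characters")
}

from_env <- describe_key()                       # read from the environment
explicit <- describe_key(api_key = "other-key")  # explicit argument is used
```

For a persistent setup, the same variable can be placed in the user's .Renviron file so that it is available in every R session.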


Get story by id

Description

get_story returns news stories by their id. One story represents one online publication. Each story refers to a single URL from any feed within a single media source.

Usage

get_story(story_id, api_key = Sys.getenv("MEDIACLOUD_API_KEY"))

Arguments

story_id

Positive numeric value that contains a valid story id.

api_key

Character string with the API key you get from mediacloud.org. A key is required. It can either be passed explicitly or, as by default, read from the MEDIACLOUD_API_KEY environment variable.

Value

Data frame with results. See https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md#stories for field descriptions.

Examples

## Not run: 
 story <- get_story(story_id = 604L)

## End(Not run)


Get story list

Description

get_story_list returns a list of stories based on a multifaceted query. One story represents one online publication. Each story refers to a single URL from any feed within a single media source.

Usage

get_story_list(last_process_stories_id = 0L, rows = 100,
  feeds_id = NULL, q = NULL, fq = NULL,
  sort = "processed_stories_id", wc = FALSE, show_feeds = FALSE,
  api_key = Sys.getenv("MEDIACLOUD_API_KEY"))

Arguments

last_process_stories_id

Return stories in which the processed_stories_id is greater than this value.

rows

Number of stories to return, max 1000.

feeds_id

Return only stories that match the given feeds_id, sorted by descending publish date.

q

If specified, return only results that match the given Solr query. Only one q parameter may be included.

fq

If specified, filter results by the given Solr query. More than one fq parameter may be included.

sort

Sort order of the returned results. Supported values: processed_stories_id, random.

wc

If set to TRUE, include a 'word_count' field with each story that contains a count of the most common words in the story.

show_feeds

If set to TRUE, include a 'feeds' field with a list of the feeds associated with this story.

api_key

Character string with the API key you get from mediacloud.org. A key is required. It can either be passed explicitly or, as by default, read from the MEDIACLOUD_API_KEY environment variable.

Value

Data frame with results. See https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md#stories for field descriptions.
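
Since rows is capped at 1000, larger result sets have to be collected page by page, feeding the highest processed_stories_id seen so far back into last_process_stories_id. The sketch below shows one way this might be done. collect_stories is a hypothetical helper, not part of the package, and it assumes that the returned data frame has a processed_stories_id column and that the default sort order is used. A fake fetch function stands in for the API so the example runs offline.

```r
# Hypothetical helper, not part of the package. `fetch_page` is any function
# taking (last_id, rows) and returning a data frame with a
# processed_stories_id column (an assumption about the result).
collect_stories <- function(fetch_page, rows = 100, max_pages = 10) {
  pages <- list()
  last_id <- 0
  for (i in seq_len(max_pages)) {
    page <- fetch_page(last_id, rows)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    last_id <- max(page$processed_stories_id)
    if (nrow(page) < rows) break  # a short page means nothing is left
  }
  if (length(pages) == 0) return(data.frame())
  do.call(rbind, pages)
}

# Offline stand-in for the API: 250 fake stories with increasing ids.
fake_stories <- data.frame(processed_stories_id = seq(10, 2500, by = 10))
fake_fetch <- function(last_id, rows) {
  keep <- fake_stories$processed_stories_id > last_id
  head(fake_stories[keep, , drop = FALSE], rows)
}

all_stories <- collect_stories(fake_fetch, rows = 100)

# With the real package, the fetch function could wrap get_story_list:
# fetch <- function(last_id, rows) {
#   get_story_list(last_process_stories_id = last_id, rows = rows, q = "Trump")
# }
```

The max_pages argument is a safety limit so that a broad query does not loop indefinitely.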

Examples

## Not run: 
 stories <- get_story_list()
 stories <- get_story_list(q = "Trump")

## End(Not run)
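
The q and fq arguments take Solr query strings, so a query can be narrowed by combining a text query with a filter. The sketch below builds a date-range filter as a plain string. solr_date_range is a hypothetical helper, not part of the package, and the field name publish_date is an assumption that should be checked against the Media Cloud API specification.

```r
# Hypothetical helper, not part of the package: build a Solr range clause
# for a date field from two Date objects.
solr_date_range <- function(field, from, to) {
  fmt <- "%Y-%m-%dT00:00:00Z"
  sprintf("%s:[%s TO %s]", field, format(from, fmt), format(to, fmt))
}

q <- "Trump AND climate"
fq <- solr_date_range(
  "publish_date",                      # assumed Solr field name
  as.Date("2019-01-01"),
  as.Date("2019-02-01")
)

# With the real package and a valid API key:
# stories <- get_story_list(q = q, fq = fq)
```

Building the filter as a string keeps the date formatting in one place and makes the query easy to print and inspect before it is sent.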


HTML document to test extract_meta_data

Description

An HTML document with basic meta tags for Open Graph, Twitter and native meta data.

Usage

meta_data_html

Format

An object of class character of length 1.
