vosonSML Twitter functions to support major changes made in rtweet
release version 1.0.2.

endpoint parameter to the Twitter Collect function. It is set to search
by default, which is the usual collect behaviour, but can now also be
set to timeline to collect user timelines instead. See
Collect.timeline.twitter() for parameters.

vosonSML functions are now silent by default. Using the verbose
parameter will again print function output.

Output now uses the message() function instead of the cat() function by
default. Setting the global option options(voson.msg = FALSE) will again
redirect output to cat(). The option can be removed by assigning a value
of NULL.

voson.data
option allowing a directory path to be set for writeToFile output files.
Files are output to the current working directory by default; however, a
new directory can now be set with options(voson.data = "my-data"), for
example. The directory path can be relative or a full path, but must be
created beforehand or already exist. If the path is invalid or does not
exist, the default behaviour continues. This option can be removed by
assigning a value of NULL. This will not affect other file write
operations performed by the user.
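As a minimal sketch of the two global options described above, assuming only base R: the option names and the my-data directory name come from the notes, while the temporary parent directory is an assumption made so the example can run anywhere. How vosonSML reads these options internally is not shown.

```r
# Illustrative only: set and clear the vosonSML global options using base R.
# The option names voson.msg and voson.data are taken from the notes above.

# The directory must exist before it is set as the output path.
data_dir <- file.path(tempdir(), "my-data")
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)

# Redirect package output to cat() and set the writeToFile directory.
options(voson.msg = FALSE, voson.data = data_dir)

msg_opt <- getOption("voson.msg")
dir_opt <- getOption("voson.data")
dir_ok <- dir.exists(dir_opt)

# Assigning NULL removes both options again.
options(voson.msg = NULL, voson.data = NULL)
cleared <- is.null(getOption("voson.msg")) && is.null(getOption("voson.data"))
```

Creating the directory first matters because, as noted above, an invalid or missing path falls back to the default behaviour.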
AddText() and AddUserData() functions now work with most Twitter network
types.

AddText() now adds columns for embedded tweet text and has a hashtags
parameter to add a list of tweet hashtags as a network attribute.

AddUserData() now adds an additional dataframe for missing_users. It
lists the ids and screen names of users that did not have metadata
embedded in the collected data. Using the lookupUsers parameter will
retrieve the metadata using the Twitter API. Additionally, passing the
refresh = TRUE parameter will now retrieve and update the metadata for
all users in the network.

tweets and users.

Removed the ImportData function and replaced it with ImportRtweet() for
rtweet version 1.0 format data.

Merge() and MergeFiles() functions to support the merging of collected
data from separate operations. These functions support input of multiple
Collect objects or .RDS files, automatically detect the datasource type
and support the writeToFile parameter for file output of merged data.
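The idea of merging data saved from separate operations can be sketched in base R. This is not the Merge() or MergeFiles() implementation: the file names, the toy columns and the de-duplication rule below are all assumptions made for illustration.

```r
# Illustrative only: combine data saved as .RDS files from separate operations.
# This is a base R sketch, not how vosonSML implements Merge() or MergeFiles().

out_dir <- file.path(tempdir(), "merge-sketch")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Two hypothetical collections with one overlapping row (id "2").
first  <- data.frame(id = c("1", "2"), text = c("a", "b"), stringsAsFactors = FALSE)
second <- data.frame(id = c("2", "3"), text = c("b", "c"), stringsAsFactors = FALSE)
saveRDS(first,  file.path(out_dir, "collect-1.rds"))
saveRDS(second, file.path(out_dir, "collect-2.rds"))

# Read every .rds file in the directory and bind the rows together.
files <- list.files(out_dir, pattern = "\\.rds$", full.names = TRUE, ignore.case = TRUE)
parts <- lapply(files, readRDS)
merged <- do.call(rbind, parts)

# Keep the first occurrence of each id so overlapping rows appear once.
merged <- merged[!duplicated(merged$id), , drop = FALSE]

# Write the merged data to a file outside the input directory.
saveRDS(merged, file.path(tempdir(), "merged-sketch.rds"))
```

The merged output is written outside the input directory so that re-running the sketch does not read its own result back in.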
Id extraction from url function is now more robust and supports YouTube
shorts urls.

GetYoutubeVideoIDs function. The YouTube collect function parameter
videoIDs will now accept video ids or video urls.

auth_twitter_app(), auth_twitter_dev() and auth_twitter_user() functions
for each token type. The collect_reddit_threads() and
collect_web_hyperlinks() functions skip the unnecessary Authenticate
step for Reddit and web data collection.

status ID to summarise collected tweet range. The Min ID and Max ID are
not necessarily the earliest and latest tweets in the tweets collected
and are therefore not ideal for delimiting subsequent collections.
Instead, the two Earliest Obs and two Latest Obs tweets as returned by
the Twitter API are now reported.

endpoint
parameter to Collect, allowing search or timeline to be specified for a
twitter data collection. If it is not specified, the default is a
twitter search.

timeline collection accepts a users vector of user names or IDs or a
mixture of both, and will return up to 3,200 of each user's most recent
tweets.

Create.actor.twitter and Create.activity.twitter to use dplyr and
data.table techniques consistent with other package network creation
functions. Both functions are significantly faster for large collection
dataframes.

Create.actor.twitter
includes two new parameters for mentions: inclMentions, which will
process and include mentions edges in the network, and inclRtMentions,
which will process and include mentions found in retweets. The
inclMentions parameter is set to TRUE by default and inclRtMentions to
FALSE. The inclRtMentions parameter is a subset of mentions, so for it
to be set to TRUE, inclMentions must also be TRUE.

Create.activity.twitter network creation. Added author_id and
author_screen_name to nodes to assist with labels or re-creating tweet
URLs from data.

rmEdgeTypes parameter to Create.activity.twitter and
Create.actor.twitter. These accept a list of edge types that can be
filtered out of the network during network creation.

Graph
function.

Min ID will be the same, but sometimes the Min ID is outside of the
expected collection range. The last observation is a more reliable tweet
to use as the starting point for subsequent search collections.

Collect method with hyperlink network creation. The Create function with
activity type parameter creates a network where nodes are web pages and
edges are the hyperlinks linking them (extracted from a href HTML tags).
The actor network has page or site domains as the nodes and again the
hyperlinks from linking pages between domains.

Collect dataframes to avoid dplyr issues.

retryOnRateLimit set to FALSE if rate limit cannot be determined.

ImportData will now accept a file path or a dataframe.

Collect
dataframes after writeToFile. It should no longer be required to
manually add class names or use ImportData to load RDS files to use
previously saved data with Create functions.

Create.semantic.twitter, Create.twomode.twitter and the
Intro-to-vosonSML vignette:

tidyr, tidytext and stopwords package requirements in descriptions and
examples

twomode networks as 2-mode where possible

vctrs error when using dplyr functions. The classes are no longer needed
post-method routing so they are simply removed.

dplyr::funs function that was generating a warning.

bind_rows error on joining dataframes with different types for the
structure column. Column type was being set to integer instead of
character in cases when every thread comment has no replies or depth
(except the OP).

Create.semantic.twitter
and Create.twomode.twitter functions using the tidytext package. They
now better support tokenization of tweet text and allow a range of
stopword lists and sources to be used from the stopwords package. The
semantic network function requires the tidytext and tidyr packages to be
installed before use.

Create.semantic.twitter:

removeNumbers and removeUrls, default value is TRUE.

assoc parameter has been added to choose which node associations or ties
to include in the network. The default value is "limited" and includes
only ties between most frequently occurring hashtags and terms in
tweets. A value of "full" will also include ties between most frequently
occurring hashtags and hashtags, and terms with terms, creating a more
densely connected network.

stopwords language, e.g. stopwordsLang = "en", and source, e.g.
stopwordsSrc = "smart", have been added. These correspond to the
language and source parameters of the tidytext::get_stopwords function.
The stopwords default value is TRUE.

Create.twomode.twitter
function is weighted by default, but weighting can be disabled by
setting the new weighted parameter to FALSE.

replies_from_text parameter to repliesFromText and at_replies_only to
atRepliesOnly in the AddText.actor.youtube function for consistency.

tm package dependency.

Introduction to vosonSML vignette Merging Collected Data examples.

Collect.youtube that was causing no video comments to be collected if
there were no reply comments for any of the video's first maxComments
number of top level comments. For example, if maxComments is set to 100
and the first 100 comments made to a video had no replies, then no
results would be returned.

rtweet::rate_limit
function that resulted in an error when using the rtweet
retryonratelimit search parameter. The rate_limit function was being
called by vosonSML to check the twitter rate limit regardless of whether
the search parameter was set or not, and so was failing Collect with an
error. A fix was made so that vosonSML checks if rtweet::rate_limit
succeeds, and if not automatically sets retryonratelimit to FALSE so
that a twitter Collect can still be performed without error should this
problem occur again.

pkgdown site navbar.

Introduction to vosonSML vignette.

Introduction to vosonSML vignette to the package.

ImportData.

Authenticate
and ImportData.

jsonlite::fromJSON.

tictoc package from dependency imports to suggested packages.

rtweet package is installed.

RedditExtractoR package from imports.

twomode networks.

bimodal networks to twomode.

AddText() and Graph(). Also improved consistency of output messages from
Collect and Create functions.

reddit gsub locale error https://github.com/vosonlab/vosonSML/issues/21.

bimodal network hashtags to lowercase as filter terms when entered are
converted to lowercase.

bimodal and semantic networks.

GetVideoData() function call in AddVideoData.

AddText
functions related to strict typing by dplyr::if_else function.

AddText function to redirect edges towards actors based on the presence
of a screen name or @screen name that may be found at the beginning of a
reply comment. Typically reply comments are directed towards a top-level
comment; this instead captures when reply comments are directed to other
commenters in the thread.

actor network identifiers to be their unique Channel ID instead of their
screen names.

AddVideoData function to add collected video data to the youtube actor
network. The main purpose of this function is to replace video
identifiers with the Channel ID of the video publisher (actor) instead.
To get the Channel ID of video publishers an additional API lookup for
the videos in the network is required. Additional columns such as video
Title, Description and Published time are also added to the network
$edges dataframe as well as returned in their own dataframe called
$videos.

AddText
function to add collected text data to networks. This feature applies to
activity and actor networks and will typically add a node attribute to
activity networks and an edge attribute to actor networks. For example,
this function will add the column vosonTxt_tweets containing tweet text
to $nodes if passed an activity network, and to $edges if passed an
actor network.

Creation of igraph graph objects and subsequent writing to file has been
removed from the Create function and placed in a new function Graph.
This change abstracts the graph creation and makes it optional, but also
allows supplemental network steps such as AddText to be performed prior
to creating the final igraph object.

Removed the writeToFile parameter from Create functions and added it to
Graph.

Removed the weightEdges, textData and cleanText parameters from
Create.actor.reddit. cleanText is now a parameter of
AddText.activity.reddit and AddText.actor.reddit.

AddTwitterUserData
with AddUserData function that works similarly to AddText. This function
currently only applies to twitter actor networks and will add, or
download and add if missing, user profile information to actors as node
attributes.

activity network type for reddit. In the reddit activity network, nodes
are the thread posts and comments, and edges represent where comments
are directed in the threads.

activity network type for twitter and youtube Create function. In this
network, nodes are the items collected, such as tweets returned from a
twitter search and comments posted to youtube videos. Edges represent
the platform relationship between the tweets or comments.

self-loop. This aims to facilitate the later addition of tweet text to
the network graph for user tweets that have no ties to other users.

rtweet::create_token. Method is used when only twitter app name and
consumer keys are passed to Authenticate.twitter as parameters, e.g.
Authenticate("twitter", appName = "An App", apiKey = "xxxxxxxxxxxx", apiSecret = "xxxxxxxxxxxx").
A browser tab will open asking the user to authorize the app to their
twitter account to complete authentication. This uses Twitter's
Application-user authentication: OAuth 1a (access token for user
context) method.

file
) via the HTTPUserAgent option. It is temporarily set to package name
and current version number for Collect, e.g. vosonSML v.0.27.2 (R
Package).

Create.semantic.twitter in which a sum operation calculating edge
weights would set NA values for all edges due to NA values present in
the hashtag fields. This occurs when there are tweets with no hashtags
in the twitter collection and is now checked.

Create.semantic.twitter were also fixed.

Collect.twitter in which any additional twitter API parameters, e.g.
lang or until, were not being passed properly to rtweet::search_tweets.
This resulted in the additional parameters being ignored.

SaveCredential
and LoadCredential functions, as well as the useCachedToken parameter
for Authenticate.twitter. These were simply calling the saveRDS and
readRDS functions and not performing any additional processing. Using
saveRDS and readRDS directly to save and load an Authenticate credential
object to file is simpler.

cleanText parameter works in Create.actor.reddit so that it is more
permissive. Addresses encoding issues with apostrophes and pound symbols
and removes unicode characters not permitted by the XML 1.0 standard as
used in graphml files. This is best effort and does not resolve all
reddit text encoding issues.

Collect.twitter
summary information that includes the earliest (min) and latest (max)
tweet status_id collected with timestamp. The status_id values can be
used to frame subsequent collections as since_id or max_id parameter
values. If the until date parameter was used, the timestamp can also be
used as a quick confirmation.

Collect method.

Create.actor.reddit that were incorrectly creating edges between
top-level commenters and thread authors from different threads. These
bugs were only observable when collecting multiple reddit threads.

reddit collection. Removed the progress bar and added a table of results
summarising the number of comments collected for each thread.

twitter collection output the user's twitter API reset time.

Create.actor.twitter and Create.bimodal.twitter in which the vertices
dataframe provided to the graph_from_data_frame function contained
duplicate names, raising an error.

roxygen
documentation and examples for all package functions.

Authenticate, Collect and Create S3 methods to implement function
routing based on object class names.

pkgdown web site for github hosted package documentation.

twitteR twitter collection implementation with the rtweet package.

twitter authentication token can now be cached in the
.twitter_oauth_token file and used for subsequent twitter API requests
without re-authentication. A new authentication token can be cached by
deleting this file and re-using the parameter useCachedToken = TRUE.