Title:

'Pubmed' Word Clouds

Description:

Create a word cloud using the abstract of publications from 'Pubmed'.

Version:

0.3.6

Date:

2019-02-28

Author:

Felix Yanhui Fan <nolanfyh@gmail.com>

Imports:

XML, stringr, RCurl, wordcloud, tm, RColorBrewer

Maintainer:

Felix Yanhui Fan <nolanfyh@gmail.com>

License:

GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]

URL:

http://felixfan.github.io/PubMedWordcloud/

RoxygenNote:

6.0.1

NeedsCompilation:

Packaged:

2019-03-01 02:04:15 UTC; alicefelix

Repository:

CRAN

Date/Publication:

2019-03-01 05:30:07 UTC

clean data

Description

Removes punctuation and numbers, converts characters to lower or upper case, removes English stopwords and user-specified words, and optionally stems words.

Usage

cleanAbstracts(abstracts, rmNum = TRUE, tolw = TRUE, toup = FALSE,
  rmWords = TRUE, yrWords = NULL, stemDoc = FALSE)

Arguments

abstracts

output of getAbstracts, or just a paragraph of text

rmNum

Whether to remove numbers from the text.

tolw

Whether to convert characters to lower case.

toup

Whether to convert characters to upper case.

rmWords

Whether to remove a set of English stopwords (e.g., 'the').

yrWords

A character vector listing the words to be removed.

stemDoc

Whether to stem words using Porter's stemming algorithm.

Examples

# Abs=getAbstracts(c("22693232", "22564732"))
# cleanAbs=cleanAbstracts(Abs)

# text="Jobs received a number of honors and public recognition."
# cleanD=cleanAbstracts(text)
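
The cleaning steps described above can be illustrated with a minimal base R sketch. It is not the package's implementation (the package imports tm); the helper name clean_text_sketch, its argument names, and the tiny stopword list are hypothetical, and the word/freq output shape is assumed from the description of the abs argument of plotWordCloud. Stemming and upper-casing are left out.

```r
# Hypothetical helper: a base R approximation of the cleaning steps.
# It is an illustration only, not the code used by cleanAbstracts.
clean_text_sketch <- function(text, rm_num = TRUE, to_lower = TRUE,
                              stop_words = c("a", "of", "and", "the"),
                              extra_words = NULL) {
  text <- paste(text, collapse = " ")
  text <- gsub("[[:punct:]]+", " ", text)              # remove punctuation
  if (rm_num) text <- gsub("[[:digit:]]+", " ", text)  # remove numbers
  if (to_lower) text <- tolower(text)                  # convert to lower case
  words <- strsplit(text, "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]                        # drop empty tokens
  # Drop stopwords and user-specified words (case-insensitive match).
  words <- words[!tolower(words) %in% c(stop_words, tolower(extra_words))]
  freq <- sort(table(words), decreasing = TRUE)
  data.frame(word = as.character(names(freq)), freq = as.integer(freq),
             stringsAsFactors = FALSE)
}

cleaned <- clean_text_sketch(
  "Jobs received a number of honors and public recognition.",
  extra_words = "number"
)
print(cleaned)
```

The result is a data frame with one row per remaining word and its count.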

plot colors

Description

plot colors.

Usage

colSets(type)

Arguments

type

Palette name; one of Accent, Dark2, Pastel1, Pastel2, Paired, Set1, Set2, Set3.

Examples

# colors= colSets(type="Accent")
# colors= colSets(type="Paired")
# colors= colSets(type="Set3")
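
The palette names listed above match the qualitative ColorBrewer palettes shipped with RColorBrewer, which this package imports. The sketch below queries those palettes directly with RColorBrewer; whether colSets delegates to brewer.pal in this way is an assumption, and the variable names are illustrative.

```r
library(RColorBrewer)

# Palette names accepted by colSets, per the argument description.
qualitative <- c("Accent", "Dark2", "Pastel1", "Pastel2",
                 "Paired", "Set1", "Set2", "Set3")

# brewer.pal.info records the largest size each palette supports.
max_n <- brewer.pal.info[qualitative, "maxcolors"]
names(max_n) <- qualitative

# Request each palette at its maximum size; each entry is a vector
# of hex color strings.
palettes <- lapply(qualitative, function(p) brewer.pal(max_n[[p]], p))
names(palettes) <- qualitative

print(max_n)
print(palettes$Accent)
```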

edit PMIDs

Description

Combine two sets of PMIDs, or exclude one set of PMIDs from another.

Usage

editPMIDs(x, y, method = c("add", "exclude"))

Arguments

x

output of getPMIDs, or a set of PMIDs

y

output of getPMIDs, or a set of PMIDs

method

Either 'add' (default) or 'exclude'. See Details.

Details

When method is 'add', the PMIDs in 'x' and 'y' are combined. When method is 'exclude', the PMIDs in 'y' are removed from 'x'.

Examples

# pmid1=getPMIDs(author="Yan-Hui Fan",dFrom=2007,dTo=2013,n=10)
# rm1="22698742"
# pmids1=editPMIDs(x=pmid1,y=rm1,method="exclude")

# pmid2=getPMIDs(author="Yanhui Fan",dFrom=2007,dTo=2013,n=10)
# rm2="20576513"
# pmids2=editPMIDs(x=pmid2,y=rm2,method="exclude")

# pmids=editPMIDs(x=pmids1,y=pmids2,method="add")
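
The two methods behave like set operations, which can be sketched in base R without contacting PubMed. The function edit_pmids_sketch is a hypothetical stand-in, not the package's code; in particular, dropping duplicates when combining is an assumption of this sketch.

```r
# Hypothetical stand-in for the two methods described in Details.
edit_pmids_sketch <- function(x, y, method = c("add", "exclude")) {
  method <- match.arg(method)  # defaults to "add", the first choice
  x <- as.character(x)
  y <- as.character(y)
  switch(method,
         add = union(x, y),        # combine x and y, dropping duplicates
         exclude = setdiff(x, y))  # keep PMIDs of x that are not in y
}

x <- c("22693232", "22564732", "22698742")
y <- c("22698742", "20576513")

combined <- edit_pmids_sketch(x, y, method = "add")
remaining <- edit_pmids_sketch(x, y, method = "exclude")
print(combined)
print(remaining)
```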

get Abstracts

Description

retrieve abstracts of the specified PMIDs from PubMed.

Usage

getAbstracts(pmid, https = TRUE, s = 100)

Arguments

pmid

a set of PMIDs

https

use https instead of http

s

Number of PMIDs to download per request.

Examples

# pmids=c("22693232", "22564732", "22301463", "22015308", "21283797", "19412437")
# abstracts=getAbstracts(pmids)

# pmid="22693232"
# abstract=getAbstracts(pmid)

# pmids=getPMIDs(author="Yan-Hui Fan",dFrom=2007,dTo=2013,n=10)
# abstracts=getAbstracts(pmids)
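
The s argument implies that long PMID vectors are fetched in batches. A minimal sketch of such batching in base R follows; batch_pmids is a hypothetical helper, and joining each batch into a comma-separated id string is an assumption about how a request might be formed, not a description of the package internals.

```r
# Hypothetical helper: split PMIDs into batches of at most `s`.
batch_pmids <- function(pmid, s = 100) {
  stopifnot(s >= 1)
  pmid <- as.character(pmid)
  # Assign each PMID a batch index (1, 1, ..., 2, 2, ...) and split on it.
  unname(split(pmid, ceiling(seq_along(pmid) / s)))
}

pmids <- c("22693232", "22564732", "22301463",
           "22015308", "21283797", "19412437")
batches <- batch_pmids(pmids, s = 4)

# Each batch could then be sent as one comma-separated id list.
id_strings <- vapply(batches, paste, character(1), collapse = ",")
print(id_strings)
```

A smaller s means more requests with fewer PMIDs each.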

get PMIDs using author names

Description

Retrieve PMIDs (each PMID is 8 digits long) from PubMed for the given author and date range.

Usage

getPMIDs(author, dFrom, dTo, n = 500, https = TRUE)

Arguments

author

author's name

dFrom

start year

dTo

end year

n

max number of retrieved articles

https

use https instead of http

Examples

# getPMIDs(author="Yan-Hui Fan",dFrom=2007,dTo=2013,n=10)

# getPMIDs(author="Yanhui Fan",dFrom=2007,dTo=2013,n=10)

get PMIDs using Journal names and Keywords

Description

Retrieve PMIDs (each PMID is 8 digits long) from PubMed for the specified journal, keywords, and date range.

Usage

getPMIDsByKeyWords(keys = NULL, journal = NULL, dFrom = NULL,
  dTo = NULL, n = 10000, https = TRUE)

Arguments

keys

keywords

journal

journal name

dFrom

start year

dTo

end year

n

max number of retrieved articles

https

use https instead of http

Examples

# getPMIDsByKeyWords(keys="breast cancer", journal="science",dTo=2013)

# getPMIDsByKeyWords(keys="breast cancer", journal="science")

# getPMIDsByKeyWords(keys="breast cancer",dFrom=2012,dTo=2013)

# getPMIDsByKeyWords(journal="science",dFrom=2012,dTo=2013)
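
To show how the optional filters might combine into a single query, the sketch below assembles a PubMed search term and an NCBI E-utilities ESearch URL without sending any request. The helper build_esearch_url is hypothetical; the use of the [Journal] and [dp] field tags and the handling of NULL arguments are assumptions, not a description of what getPMIDsByKeyWords does internally. For simplicity the date filter is only added when both years are supplied.

```r
# Hypothetical helper: assemble a search term and an ESearch URL.
build_esearch_url <- function(keys = NULL, journal = NULL, dFrom = NULL,
                              dTo = NULL, n = 10000, https = TRUE) {
  parts <- character(0)
  if (!is.null(keys)) parts <- c(parts, keys)
  if (!is.null(journal)) parts <- c(parts, sprintf("%s[Journal]", journal))
  # Assumption: open-ended ranges are skipped in this sketch.
  if (!is.null(dFrom) && !is.null(dTo)) {
    parts <- c(parts, sprintf("%s:%s[dp]", dFrom, dTo))
  }
  stopifnot(length(parts) > 0)
  term <- paste(parts, collapse = " AND ")
  scheme <- if (https) "https" else "http"
  url <- sprintf(
    "%s://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=%s&retmax=%d",
    scheme, utils::URLencode(term, reserved = TRUE), as.integer(n)
  )
  list(term = term, url = url)
}

query <- build_esearch_url(keys = "breast cancer", journal = "science",
                           dFrom = 2012, dTo = 2013, n = 10)
print(query$term)
print(query$url)
```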

PubMed wordcloud using function 'wordcloud' of package wordcloud

Description

PubMed wordcloud.

Usage

plotWordCloud(abs, scale = c(3, 0.3), min.freq = 1, max.words = 100,
  random.order = FALSE, rot.per = 0.35, use.r.layout = FALSE,
  colors = brewer.pal(8, "Dark2"))

Arguments

abs

output of cleanAbstracts, or a data frame with one column of 'word' and one column of 'freq'.

scale

A vector of length 2 indicating the range of the size of the words.

min.freq

words with frequency below min.freq will not be plotted

max.words

Maximum number of words to be plotted; the least frequent terms are dropped.

random.order

Plot words in random order. If FALSE, they are plotted in order of decreasing frequency.

rot.per

Proportion of words rotated by 90 degrees.

use.r.layout

If FALSE, C++ code is used for collision detection; otherwise R is used.

colors

color words from least to most frequent

Details

This function simply calls 'wordcloud' from the wordcloud package. See that package's documentation for details about the parameters.

Examples

# text="Jobs received a number of honors and public recognition." 
# cleanD=cleanAbstracts(text)
# plotWordCloud(cleanD,min.freq=1,scale=c(2,1))
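
The effect of min.freq and max.words can be previewed on a word/freq data frame before plotting. The sketch below uses base R only; select_words is a hypothetical helper, the sample counts are made up for illustration, and the selection logic is a simplified reading of the argument descriptions rather than the code inside wordcloud.

```r
# Hypothetical helper: pick the rows a word cloud would draw.
select_words <- function(word_freq, min.freq = 1, max.words = 100) {
  stopifnot(all(c("word", "freq") %in% names(word_freq)))
  # Drop words with frequency below min.freq.
  kept <- word_freq[word_freq$freq >= min.freq, , drop = FALSE]
  # Order by decreasing frequency, then keep at most max.words rows.
  kept <- kept[order(kept$freq, decreasing = TRUE), , drop = FALSE]
  utils::head(kept, max.words)
}

# Made-up counts in the shape accepted by the abs argument.
word_freq <- data.frame(
  word = c("gene", "cancer", "variant", "risk", "cohort"),
  freq = c(12, 30, 7, 1, 3),
  stringsAsFactors = FALSE
)

top <- select_words(word_freq, min.freq = 3, max.words = 3)
print(top)
```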
