Words used in Portuguese Wikipedia
This data package contains a dataset of the words used in a random sample of ~15,000 pages from the Portuguese Wikipedia.
It can be installed using:
devtools::install_github("dfalbel/ptwikiwords")
After installing the package, you can load the dataset using:
library(ptwikiwords)
data(ptwikiwords)
head(ptwikiwords)
#> # A tibble: 6 × 3
#>   word   count check
#>   <chr>  <int> <lgl>
#> 1 de    210954 TRUE
#> 2 a     109652 TRUE
#> 3 e     100028 TRUE
#> 4 o      87839 TRUE
#> 5 em     67040 TRUE
#> 6 do     59489 TRUE
The dataset contains 3 columns: word, count, and check.
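As a small illustration of working with the word, count, and check columns, the sketch below computes each word's share of the total count and keeps only the rows where check is TRUE. It rebuilds the six rows printed above as a stand-in data frame so that it runs without the package installed; the names sample_words, share, and checked are made up for this example and are not part of the package.

```r
# Stand-in for head(ptwikiwords): the six rows shown in the output above.
# (Assumption: a plain data.frame is used here instead of the package's tibble.)
sample_words <- data.frame(
  word  = c("de", "a", "e", "o", "em", "do"),
  count = c(210954L, 109652L, 100028L, 87839L, 67040L, 59489L),
  check = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Share of each word within this six-row sample only (not the full dataset).
sample_words$share <- sample_words$count / sum(sample_words$count)

# Keep the rows flagged TRUE in `check`, ordered by descending count.
checked <- sample_words[sample_words$check, ]
checked <- checked[order(-checked$count), ]

print(checked)
```

With the package installed, the same steps should apply to ptwikiwords directly by replacing sample_words with the loaded dataset.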
Here is a wordcloud of those words:
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(wordcloud))
# Keep only the rows where `check` is TRUE, then take the first 300 of them
words_filter <- ptwikiwords %>%
  filter(check) %>%
  slice(1:300)
wordcloud(words_filter$word, words_filter$count)