A sample dataframe, Word_Pairs, is included in the package; its first five rows are shown below. Word pairs are arrayed in two columns, and those columns need not be immediately adjacent within your dataframe.
word1 | word2 |
---|---|
Dog | trumpet |
the | BANANA |
rat | astronaut |
*&^% | lizard |
bird | bird |
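To try the functions on your own data, a dataframe with the same layout can be built by hand. The sketch below is only an illustration: the object name `my_pairs` and the extra `trial` column are hypothetical, and the five rows simply copy the sample shown above. The `trial` column sits between the two word columns to show that they need not be adjacent.

```r
# Hypothetical stand-in for the packaged sample data (names are assumptions)
my_pairs <- data.frame(
  word1 = c("Dog", "the", "rat", "*&^%", "bird"),
  trial = 1:5,  # unrelated column placed between the two word columns
  word2 = c("trumpet", "BANANA", "astronaut", "lizard", "bird"),
  stringsAsFactors = FALSE
)

# Inspect the structure: two character columns plus one integer column
str(my_pairs)
```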
Arguments to clean_paired_cols are:
dat = your raw dataframe with two columns of paired text
wordcol1 = quoted column name where your first word lives
wordcol2 = quoted column name where your second word lives
lemmatize = transforms each raw word to its lemmatized form, T/F; default is TRUE
WordPairs_Clean <- clean_paired_cols(dat=Word_Pairs, wordcol1='word1', wordcol2='word2', lemmatize=TRUE)
knitr::kable(head(WordPairs_Clean, 6), format = "simple", digits=2)
id_row_orig | word1_clean | word2_clean | word1 | word2 |
---|---|---|---|---|
1 | dog | trumpet | Dog | trumpet |
2 | the | banana | the | BANANA |
3 | rat | astronaut | rat | astronaut |
5 | bird | bird | bird | bird |
6 | shark | shark | shark | shark |
8 | dog | leash | Dog | leash |
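The table above shows three visible effects of cleaning: text is lowercased, the original row number is kept in id_row_orig, and the row whose first word was `*&^%` is gone. A rough base-R sketch of just those three effects follows. It is not the package's implementation; `sketch_clean` and `raw` are hypothetical names, and lemmatization and any other processing the package performs are left out.

```r
# Toy input copying the five sample rows shown earlier (hypothetical object)
raw <- data.frame(
  word1 = c("Dog", "the", "rat", "*&^%", "bird"),
  word2 = c("trumpet", "BANANA", "astronaut", "lizard", "bird"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lowercase, strip non-letters, mark empty results as NA
sketch_clean <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z]", "", x)
  x[x == ""] <- NA
  x
}

out <- data.frame(
  id_row_orig = seq_len(nrow(raw)),  # remember the original row position
  word1_clean = sketch_clean(raw$word1),
  word2_clean = sketch_clean(raw$word2),
  word1 = raw$word1,
  word2 = raw$word2,
  stringsAsFactors = FALSE
)

# Drop pairs where either cleaned word ended up empty
out <- out[!is.na(out$word1_clean) & !is.na(out$word2_clean), ]
print(out)
```

Because id_row_orig is assigned before rows are dropped, the gap in its sequence shows which original rows were removed.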
dist_paired_cols generates semantic distances (GloVe and SD15) between the word pairs held in separate columns. The output below comes from running dist_paired_cols on a two-column dataframe. Argument to dist_paired_cols:
dat = dataframe with word pairs arrayed in columns, cleaned and prepped using the clean_paired_cols function
Columns_Dists <- dist_paired_cols(dat=WordPairs_Clean)
knitr::kable(head(Columns_Dists, 6), format = "simple", digits=2)
id_row_orig | word1_clean | word2_clean | word1 | word2 | CosDist_SD15 | CosDist_GLO |
---|---|---|---|---|---|---|
1 | dog | trumpet | Dog | trumpet | 0.45 | 0.84 |
2 | the | banana | the | BANANA | 1.18 | 0.77 |
3 | rat | astronaut | rat | astronaut | 1.22 | 0.93 |
5 | bird | bird | bird | bird | 0.00 | 0.00 |
6 | shark | shark | shark | shark | 0.00 | 0.00 |
8 | dog | leash | Dog | leash | 0.68 | 0.50 |
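Identical pairs score 0.00 and some pairs score above 1 (for example 1.18 and 1.22). Both patterns are what you would expect if the CosDist_ columns follow the usual definition of cosine distance, one minus cosine similarity, which ranges from 0 to 2. That definition is an assumption here, not something the table confirms. The sketch below uses made-up toy vectors, not real GloVe or SD15 embeddings, and `cos_dist` is a hypothetical helper.

```r
# Cosine distance under the usual definition: 1 - cosine similarity
cos_dist <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

v_same <- c(0.2, -0.5, 0.7)  # toy vector, not a real embedding

d_identical  <- cos_dist(v_same, v_same)          # same vector: approximately 0
d_orthogonal <- cos_dist(c(1, 0), c(0, 1))        # unrelated directions: 1
d_opposed    <- cos_dist(c(1, 0), c(-1, 0.2))     # mostly opposite: greater than 1

print(c(d_identical, d_orthogonal, d_opposed))
```

A distance above 1 therefore means the two vectors point in partly opposite directions, not that something went wrong in the computation.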