Detect the Language of Text
Franc has no external dependencies and supports 310 languages, including all languages spoken by more than one million people. Franc is a port of the JavaScript project of the same name; see https://github.com/wooorm/franc.
install.packages("franc")
library(franc)
Simply supply the text, and franc detects its language:
franc("Alle menslike wesens word vry")
#> [1] "afr"
franc("এটি একটি ভাষা একক IBM স্ক্রিপ্ট")
#> [1] "ben"
franc("Alle mennesker er født frie og")
#> [1] "nno"
head(franc_all("O Brasil caiu 26 posições"))
#> language score
#> 1 por 1.0000000
#> 2 src 0.8800937
#> 3 glg 0.8702576
#> 4 snn 0.8637002
#> 5 bos 0.8168618
#> 6 hrv 0.8103044
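The scores returned by franc_all() can also hint at how clear-cut a guess is. The following is a minimal sketch, assuming the result is a data frame with language and score columns sorted from best to worst, as in the output above. score_gap is a hypothetical helper, not part of franc:

```r
library(franc)

# Hypothetical helper: difference between the best and second-best score.
# Assumes `candidates` has a numeric `score` column sorted in decreasing order.
score_gap <- function(candidates) {
  if (nrow(candidates) < 2) return(NA_real_)
  candidates$score[1] - candidates$score[2]
}

candidates <- franc_all("O Brasil caiu 26 posições")
score_gap(candidates)
```

With the scores shown above, the gap between por and src is about 0.12. A small gap suggests that the top candidates are hard to tell apart for this input.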
und is the undefined language. It is returned if the input is too short (shorter than 10 characters by default); the min_length argument changes this threshold:
franc("the")
#> [1] "und"
franc("the", min_length = 3)
#> [1] "sco"
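When labelling many texts, it may be convenient to treat und as a missing value. The following is a minimal sketch, assuming franc() takes one string per call and returns a single character code; detect_or_na is a hypothetical helper, not part of franc:

```r
library(franc)

# Hypothetical helper: detect the language of each text, mapping the
# undefined code "und" to NA so that short inputs are easy to filter out.
# Extra arguments (for example min_length) are passed on to franc().
detect_or_na <- function(texts, ...) {
  codes <- vapply(texts, function(x) franc(x, ...), character(1),
                  USE.NAMES = FALSE)
  codes[codes == "und"] <- NA_character_
  codes
}

detect_or_na(c("the", "Alle menslike wesens word vry"))
```

Calling franc() once per element keeps the sketch independent of whether franc() itself is vectorized.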
You can provide a whitelist or a blacklist:
franc_all("O Brasil caiu 26 posições",
          whitelist = c("por", "src", "glg", "spa"))
#> language score
#> 1 por 1.0000000
#> 2 src 0.8800937
#> 3 glg 0.8702576
#> 4 spa 0.7995316
head(franc_all("O Brasil caiu 26 posições",
               blacklist = c("src", "glg", "lav")))
#> language score
#> 1 por 1.0000000
#> 2 snn 0.8637002
#> 3 bos 0.8168618
#> 4 hrv 0.8103044
#> 5 cat 0.8065574
#> 6 spa 0.7995316
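A whitelist is useful when the set of possible languages is known in advance. The following is a minimal sketch, assuming franc_all() returns its candidates sorted from best to worst, as in the output above. The expected vector and the detect_among helper are illustrative and not part of franc:

```r
library(franc)

# Languages expected in this (hypothetical) corpus, as three-letter codes.
expected <- c("por", "spa", "glg")

# Hypothetical helper: best guess among the expected languages only.
# as.character() guards against the column being a factor.
detect_among <- function(text, languages) {
  candidates <- franc_all(text, whitelist = languages)
  as.character(candidates$language[1])
}

detect_among("O Brasil caiu 26 posições", expected)
```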
The R version of franc supports 310 languages. By default, only languages with more than 1 million speakers are used, which amounts to 175 languages. The min_speakers argument relaxes this threshold and allows more languages to be considered:
head(franc_all("O Brasil caiu 26 posições"))
#> language score
#> 1 por 1.0000000
#> 2 src 0.8800937
#> 3 glg 0.8702576
#> 4 snn 0.8637002
#> 5 bos 0.8168618
#> 6 hrv 0.8103044
head(franc_all("O Brasil caiu 26 posições", min_speakers = 0))
#> language score
#> 1 lad 1.0000000
#> 2 por 0.9442724
#> 3 pov 0.8788147
#> 4 ast 0.8677576
#> 5 roh 0.8363556
#> 6 src 0.8310482
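As the two outputs above show, the top guess can change with the speaker threshold. The following is a minimal sketch for comparing several thresholds side by side; top_by_threshold is a hypothetical helper, and the threshold values are only examples:

```r
library(franc)

# Hypothetical helper: top-ranked language for each min_speakers threshold.
# Assumes franc_all() returns candidates sorted from best to worst.
top_by_threshold <- function(text, thresholds) {
  top <- vapply(thresholds, function(n) {
    as.character(franc_all(text, min_speakers = n)$language[1])
  }, character(1))
  data.frame(min_speakers = thresholds, top = top, stringsAsFactors = FALSE)
}

top_by_threshold("O Brasil caiu 26 posições", c(1e6, 0))
```

A lower threshold admits more candidate languages, which can help with rare languages but may also introduce close competitors for common ones.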
MIT © Mango Solutions, Titus Wormer, Maciej Ceglowski, Jacob R. Rideout, Kent S. Johnson, Gábor Csárdi