- nlp_split_sentences(), nlp_tokenize_text() (word and Biber methods), and nlp_cast_tokens() demonstrated stepwise and as a single pipe.
- util_fetch_embeddings() re-added for embedding generation via Hugging Face inference endpoints (reverses the 1.1.0 removal; it now calls the HF inference API rather than loading models locally).
- nlp_cast_tokens() documented and surfaced: flattens the token list from nlp_tokenize_text() into a long-format data frame with optional character spans.
- ellmer and other unused package dependencies removed.
- fetch_urls() (from web search), fetch_wiki_urls(), and fetch_wiki_refs() return URLs or metadata, not full text.
- read_urls() reads content from URLs into R (replaces web_scrape_urls).
- nlp_split_*, nlp_tokenize_text(), and nlp_index_tokens(), plus nlp_roll_chunks() for rolling windows.
- search_regex() (regex/KWIC), search_index() (BM25), search_vector() (cosine similarity over your own embeddings), and search_dict() (dictionary match; replaces ner_extract_entities).
- Argument renames: corpus (replaces tif); by (replaces text_hierarchy).
- Removed: web_search, wiki_search, wiki_find_references, web_scrape_urls, ner_extract_entities, and sem_nearest_neighbors / sem_search_corpus (replaced by search_vector
and search_regex).
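To make the "stepwise and as a single pipe" usage above concrete, here is a minimal sketch. The function names come from this changelog; the package attach step, the sample data frame, and all argument shapes are assumptions, not documented signatures.

```r
# Assumes the package providing these functions is attached, e.g. via library().

# A tiny hypothetical corpus: one row per document.
corpus <- data.frame(
  doc_id = c("d1", "d2"),
  text   = c("Hello world. This is a test.",
             "Another short document.")
)

# Stepwise: split into sentences, tokenize, then flatten.
sentences <- nlp_split_sentences(corpus)
tokens    <- nlp_tokenize_text(sentences)  # word or Biber tokenization
flat      <- nlp_cast_tokens(tokens)       # long-format data frame,
                                           # optionally with character spans

# Or as a single pipe (base R |>):
flat2 <- corpus |>
  nlp_split_sentences() |>
  nlp_tokenize_text() |>
  nlp_cast_tokens()
```

Consult the package reference manual for the actual argument names (note the renames listed above: corpus replaces tif, by replaces text_hierarchy).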
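The search family listed above could be exercised along these lines. This is a sketch under assumptions: the pattern and query argument names are illustrative guesses, and only the function names themselves are taken from the changelog.

```r
# Assumes the package providing these functions is attached.
# Hypothetical corpus; argument names below are illustrative, not documented.
corpus <- data.frame(
  doc_id = c("d1", "d2"),
  text   = c("The cat sat on the mat.",
             "Dogs chase cats.")
)

# Regex / KWIC matching (search_regex replaces sem_search_corpus-style lookups).
hits_regex <- search_regex(corpus, "cats?")

# BM25-ranked retrieval over an index.
hits_bm25 <- search_index(corpus, "cat")

# search_vector() computes cosine similarity over embeddings you supply,
# e.g. generated with util_fetch_embeddings() via the HF inference API.
# search_dict() performs dictionary matching (replaces ner_extract_entities).
```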