ragnar_register_tool_retrieve() now registers a tool that will not return previously returned chunks, enabling the LLM to perform deeper searches of a ragnar store with repeated tool calls (#106).
Updates for ellmer v0.3.0 and duckdb v1.3.1 (#99)
Improved docs and error message in ragnar_store_insert() (@mattwarkentin, #88)
ragnar_find_links() can now parse sitemap.xml files. It also gains a validate argument, which sends a HEAD request to each link and filters out broken links (#83).
ragnar_inspector() now renders all URLs as clickable links in the chunk markdown viewer, even if a URL is not a formal markdown link (#82).
Before running examples and tests we now check if ragnar can load DuckDB extensions. This fixes issues in environments where DuckDB pre-built binaries for extensions are not compatible with the installed DuckDB version (#94).
Added embed_lm_studio to use LMStudio as an embedding provider (#100).
Fixed a bug causing ragnar_retrieve() to fail when documents were inserted without an origin (#102).
We now suppress the “Couldn’t find ffmpeg or avconv” warning emitted when read_as_markdown() imports markitdown. The warning is only relevant for users doing audio transcription (#103).
Added embed_google_gemini to use the Google Gemini API as an embedding provider (#105).
ragnar_store_create() gains a new argument, version, with default 2. Store version 2 adds support for chunk deoverlapping on retrieval and automatic chunk augmentation with headings. To support these features, the internal schema and ingestion requirements differ from version 1. See markdown_chunk() and the new S7 classes MarkdownDocument and MarkdownDocumentChunks. Backwards compatibility is maintained with version = 1. (#58, #39, #36)
ragnar_store_create() now supports Date and POSIXct classes supplied to extra_cols.
ragnar_store_create() now supports remote MotherDuck databases specified with md:<dbname> as the location argument. (#50)
ragnar_retrieve() and friends gain a filter argument, adding support for efficiently filtering retrieval results.
ragnar_retrieve_bm25() gains arguments b, k, and conjunctive (#56).
ragnar_retrieve_vss() gains argument query_vector, supporting workflows that preprocess the query string before embedding.
The set of valid method choices for ragnar_retrieve_vss() has been narrowed to ensure that an HNSW index scan is used.
Passing a tbl(store) to ragnar_retrieve() is deprecated.
New chunker markdown_chunk() with support for chunk heading context generation, semantic boundary selection, overlapping chunks, document segmentation, and more. (#56)
New function embed_google_vertex() (@dfalbel, #49)
New function embed_databricks() (@atheriel, #45)
New function ragnar_chunks_view() for quickly previewing chunks (#42)
ragnar_register_tool_retrieve() gains optional name and title arguments to allow for more descriptive tool registration. These values can also be set in ragnar_store_create() (#43).
ragnar_read() and read_as_markdown() now accept paths that begin with ~ (@topepo, #46, #48).
Changes to read_as_markdown() HTML conversion (#40, #51):
html_extract_selectors and html_zap_selectors provide a flexible way to exclude some
html page elements from being included in the converted markdown.