RAGFlowChainR is an R package that brings Retrieval-Augmented Generation (RAG) capabilities to R, inspired by LangChain. It enables intelligent retrieval of documents from a local vector store (DuckDB), enhanced with optional web search, and seamless integration with Large Language Models (LLMs).
Features include:
For the Python version, see RAGFlowChain (PyPI).
GitHub (R): RAGFlowChainR
GitHub (Python): RAGFlowChain
# Install from GitHub
if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("knowusuboaky/RAGFlowChainR")
To use features like web search (Tavily) and LLMs (OpenAI, Groq, Anthropic), you'll need to set up your API keys as environment variables. This ensures that sensitive credentials are never hardcoded in your scripts.
# Add these to your .Renviron file or run once per session
Sys.setenv(TAVILY_API_KEY = "your-tavily-api-key")
Sys.setenv(OPENAI_API_KEY = "your-openai-api-key")
Sys.setenv(GROQ_API_KEY = "your-groq-api-key")
Sys.setenv(ANTHROPIC_API_KEY = "your-anthropic-api-key")
💡 Tip: To persist these keys across sessions, add them to a ~/.Renviron file (not tracked by git) instead of your code.
Place this in a file named .Renviron in your home directory:
TAVILY_API_KEY=your-tavily-api-key
OPENAI_API_KEY=your-openai-api-key
GROQ_API_KEY=your-groq-api-key
ANTHROPIC_API_KEY=your-anthropic-api-key
Then restart R for the changes to take effect.
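Before calling any provider, it can help to confirm which keys the current R session can actually see. A minimal base-R sketch (the helper name `key_is_set` is not part of the package, just an illustration):

```r
# Sketch: check which API keys are visible to this R session.
# Uses only base R; makes no network calls.
required_keys <- c("TAVILY_API_KEY", "OPENAI_API_KEY",
                   "GROQ_API_KEY", "ANTHROPIC_API_KEY")

key_is_set <- function(name) nzchar(Sys.getenv(name, unset = ""))

status <- vapply(required_keys, key_is_set, logical(1))
print(status)  # named logical vector: TRUE where a key is available
```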
fetch_data()
library(RAGFlowChainR)
# Read local files and websites
local_files <- c("documents/sample.pdf", "documents/sample.txt")
website_urls <- c("https://www.r-project.org")
crawl_depth <- 1

data <- fetch_data(local_paths = local_files, website_urls = website_urls, crawl_depth = crawl_depth)
head(data)
con <- create_vectorstore("my_vectors.duckdb", overwrite = TRUE)
docs <- data.frame(
  source = "Test Source",
title = "Test Title",
author = "Test Author",
publishedDate = "2025-01-01",
description = "Test Description",
content = "Hello world",
url = "https://example.com",
source_type = "txt",
stringsAsFactors = FALSE
)
insert_vectors(
con = con,
df = docs,
embed_fun = embed_openai(), # Or embed_ollama()
chunk_chars = 12000
)
build_vector_index(con, type = c("vss", "fts"))
results <- search_vectors(con, query_text = "Who is Messi?", top_k = 5)
print(results)
dbDisconnect(con)
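The `chunk_chars` argument above controls how long documents are split into pieces before embedding. As a rough base-R illustration of fixed-size character chunking (this is not the package's internal implementation, just the idea behind the parameter):

```r
# Illustration only: split text into fixed-size character chunks,
# mirroring what a chunk_chars-style parameter controls.
chunk_text <- function(x, chunk_chars = 12000) {
  starts <- seq(1, nchar(x), by = chunk_chars)
  vapply(starts, function(s) substr(x, s, s + chunk_chars - 1), character(1))
}

chunks <- chunk_text(strrep("abcde", 10), chunk_chars = 12)
print(chunks)  # 50 characters split into chunks of at most 12
```

Smaller chunks give more precise retrieval hits; larger chunks preserve more surrounding context per match.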
rag_chain <- create_rag_chain(
  llm = call_llm,
vector_database_directory = "my_vectors.duckdb",
method = "DuckDB",
embedding_function = embed_openai(),
use_web_search = FALSE
)
# Ask a question
response <- rag_chain$invoke("Tell me about Messi")
cat(response$answer)
# Get related documents
context <- rag_chain$custom_invoke("Tell me about Messi")
print(context$documents)
# Review and clear chat history
print(rag_chain$get_session_history())
rag_chain$clear_history()
rag_chain$disconnect()
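Web search can be layered onto the same chain via the `use_web_search` flag shown above. A minimal sketch, assuming a `TAVILY_API_KEY` is set in the environment and reusing the same arguments:

```r
# Sketch: the same chain with live web search enabled.
# Assumes TAVILY_API_KEY is set (see the setup section above).
rag_chain_web <- create_rag_chain(
  llm = call_llm,
  vector_database_directory = "my_vectors.duckdb",
  method = "DuckDB",
  embedding_function = embed_openai(),
  use_web_search = TRUE
)

response <- rag_chain_web$invoke("What happened in football this week?")
cat(response$answer)
rag_chain_web$disconnect()
```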
RAGFlowChainR includes built-in support for calling LLMs from
providers such as OpenAI, Groq, and
Anthropic via the call_llm()
utility:
call_llm(
prompt = "Summarize the capital of France.",
provider = "groq",
model = "llama3-8b",
temperature = 0.7,
max_tokens = 200
)
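Because `call_llm()` takes the provider and model as plain arguments, the same prompt can be compared across providers in a loop. A sketch, assuming all three API keys are set; the model names here are illustrative and may need updating to each provider's current offerings:

```r
# Sketch: same prompt across providers (model names are illustrative).
providers <- list(
  list(provider = "openai",    model = "gpt-4o-mini"),
  list(provider = "groq",      model = "llama3-8b"),
  list(provider = "anthropic", model = "claude-3-haiku")
)

for (p in providers) {
  ans <- call_llm(
    prompt     = "Summarize the capital of France.",
    provider   = p$provider,
    model      = p$model,
    max_tokens = 100
  )
  cat(p$provider, ":", ans, "\n")
}
```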
chatLLM
We're developing a standalone R package, chatLLM, that will offer a unified, modular interface for interacting with popular LLM providers (OpenAI, Groq, and Anthropic) via a clean, extensible API.
Features planned:
- A single interface across providers (openai, groq, anthropic)
- Integration with RAGFlowChainR
Stay tuned on GitHub for updates!
MIT © Kwadwo Daddy Nyame Owusu Boakye