ggmlR includes a built-in zero-dependency ONNX loader (hand-written protobuf parser in C). Load any compatible ONNX model and run inference on CPU or Vulkan GPU — no Python, no TensorFlow, no ONNX Runtime required.
Note: The examples below require a valid .onnx model file. Replace "path/to/model.onnx" with the actual path on your system.
model <- onnx_load("path/to/model.onnx")
# Model summary (layers, ops, parameters)
onnx_summary(model)
# Input tensor info (name, shape, dtype)
onnx_inputs(model)

Inputs are named R arrays in NCHW order (matching the ONNX model’s expected layout).
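As a pure base-R illustration of the NCHW layout (using only aperm() and dim(), no ggmlR calls), an H x W x C image array — the shape returned by typical R image readers — can be rearranged into a 1 x C x H x W batch like this:

```r
# H x W x C image, as returned by e.g. png::readPNG() (random values here)
img <- array(runif(224 * 224 * 3), dim = c(224L, 224L, 3L))

# Permute to C x H x W, then prepend the batch dimension: 1 x C x H x W
nchw <- aperm(img, c(3L, 1L, 2L))
dim(nchw) <- c(1L, dim(nchw))

stopifnot(identical(dim(nchw), c(1L, 3L, 224L, 224L)))
```

aperm() copies the data into the new dimension order, so nchw[1, c, h, w] equals img[h, w, c].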
# Random image batch — replace with real data
input <- array(runif(1 * 3 * 224 * 224), dim = c(1L, 3L, 224L, 224L))
result <- onnx_run(model, list(input_name = input))
cat("Output shape:", paste(dim(result[[1]]), collapse = " x "), "\n")

For models with multiple inputs, pass a named list:
result <- onnx_run(model, list(
input_ids = array(as.integer(tokens), dim = c(1L, length(tokens))),
attention_mask = array(1L, dim = c(1L, length(tokens)))
))

By default ggmlR tries Vulkan first and falls back to CPU automatically. To force a specific backend:
# Check what's available
if (ggml_vulkan_available()) {
cat("Vulkan GPU ready\n")
ggml_vulkan_status()
}
# Load with explicit device
model_gpu <- onnx_load("path/to/model.onnx", device = "vulkan")
model_cpu <- onnx_load("path/to/model.onnx", device = "cpu")

Weights are transferred to the GPU once at load time. Repeated calls to onnx_run() do not re-transfer weights.
Some models accept variable-length inputs. Override shapes at load time:
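The shape-override example itself appears to be missing here. The sketch below shows only the general shape such a call would take; the input_shapes argument name is an assumption, not a documented parameter — consult the package help (?onnx_load) for the actual interface.

```r
# HYPOTHETICAL: the 'input_shapes' argument name is an assumption --
# check ?onnx_load in your installed version for the real parameter.
model <- onnx_load("path/to/model.onnx",
                   input_shapes = list(input_ids = c(1L, 128L)))
```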
Run in half-precision for faster GPU inference:
model_fp16 <- onnx_load("path/to/model.onnx", dtype = "f16")
result <- onnx_run(model_fp16, list(input = input))

ggmlR supports 50+ ONNX operators, including:
Custom fused ops: RelPosBias2D (BoTNet).
For full working examples with real ONNX Zoo models see:
# GPU vs CPU benchmark across multiple models
# inst/examples/benchmark_onnx.R
# FP16 inference benchmark
# inst/examples/benchmark_onnx_fp16.R
# Run all supported ONNX Zoo models
# inst/examples/test_all_onnx.R
# BERT sentence similarity
# inst/examples/bert_similarity.R

If a model fails to load or produces wrong results:
Check operator support — print the model’s op list with Python’s onnx package and compare against the table above.
Verify protobuf field numbers — the built-in parser is hand-written; an unexpected field can cause silent mis-parsing.
NaN tracing — use the eval callback for per-node inspection rather than a post-compute scan (which aliases buffers and gives false readings).
Repeated-run aliasing — ggml_backend_sched aliases intermediate buffers over weight buffers. ggmlR calls sched_alloc_and_load() before each compute to reset allocation. If you see correct results on the first run but garbage on subsequent runs, this is the cause.
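To make the "silent mis-parsing" point above concrete: protobuf encodes field tags and lengths as base-128 varints, so a single misread continuation bit shifts every subsequent field. A minimal base-R varint decoder (illustrative only, not the package's actual C parser) looks like this:

```r
# Decode one protobuf varint from a raw vector, starting at byte `pos`.
# Each byte carries 7 payload bits; the high bit flags continuation.
decode_varint <- function(bytes, pos = 1L) {
  value <- 0
  shift <- 0
  repeat {
    b <- as.integer(bytes[pos])
    pos <- pos + 1L
    value <- value + bitwAnd(b, 0x7FL) * 2^shift
    if (bitwAnd(b, 0x80L) == 0L) break
    shift <- shift + 7
  }
  list(value = value, pos = pos)
}

# 300 is encoded as the two bytes 0xAC 0x02
decode_varint(as.raw(c(0xAC, 0x02)))$value  # 300
```

A field header is itself a varint, tag = field_number * 8 + wire_type, so for example field 1 with wire type 2 (length-delimited) is the single byte 0x0A — which is why one unexpected field number can derail everything that follows.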