* `dp_train(make_model, data, loss_fn, forward_fn, target_fn, n_gpu, n_iter, lr, max_norm, verbose)` — data-parallel training across multiple replicas. Weights are broadcast from replica 0 before the first step; gradients are averaged across replicas each iteration; weights are re-broadcast after each optimizer update. Returns `list(params, loss_history, model)`.
* `ag_mul` and `ag_sub` now support CPU broadcast: `[d×s] * [1×s]` and `[d×s] * [d×1]` shapes work correctly with proper gradient reduction.
* `ag_softmax_cross_entropy_loss` accepts integer target vectors (0-based class indices) and converts them to one-hot automatically.
* `ggml_sum_rows` f16 on Vulkan: F16→F16 dispatch now supported natively (no CPU fallback).
* `ag_tensor()` / `ag_param()` — environment-backed tensors with reference semantics; in-place optimizer updates are visible to all references.
* `with_grad_tape({ ... })` — enables the global gradient tape for the enclosed forward pass.
* `backward(loss)` — reverse-mode automatic differentiation; returns a gradient environment keyed by tensor id.
* `ag_matmul`, `ag_add` (with bias broadcast), `ag_sub`, `ag_mul`, `ag_scale`.
* `ag_relu`, `ag_sigmoid`, `ag_tanh`, `ag_softmax`.
* `ag_sum`, `ag_mean`, `ag_log`, `ag_exp`, `ag_pow`, `ag_clamp`.
* `ag_reshape`, `ag_transpose`.
* `ag_mse_loss`, `ag_cross_entropy_loss`, `ag_softmax_cross_entropy_loss` (numerically stable, fused).
* `optimizer_sgd()` — SGD with optional momentum.
* `optimizer_adam()` — Adam with bias-corrected moment estimates.
* `ag_linear()` — Glorot-initialised dense layer (closure-based; returns `$forward`, `$params()`).
* `ag_gradcheck()` — central finite-difference gradient checker (like `torch.autograd.gradcheck`).
* `ag_sequential(...)` — ordered layer container; collects all parameters for the optimizer.
* `ag_dropout(rate)` — inverted dropout; identity in eval mode.
* `ag_batch_norm(num_features)` — batch normalisation with running statistics and learnable γ/β.
* `ag_embedding(vocab_size, dim)` — token lookup with scatter-add backward.
* `ag_train(model)` / `ag_eval(model)` — switch all sub-layers between train and eval mode.
* `ag_dataloader(x, y, batch_size, shuffle, col_major)` — mini-batch iterator with shuffling and an `$epoch()` helper.
* `lr_scheduler_step(optimizer, step_size, gamma)` — step-decay learning rate.
* `lr_scheduler_cosine(optimizer, T_max, lr_min, restart)` — cosine annealing (with optional SGDR warm restarts).
* `clip_grad_norm(params, grads, max_norm)` — clips all gradients by global L2 norm in place.
* `ggml_layer_lstm()` — LSTM recurrent layer (unrolled BPTT).
* `ggml_layer_gru()` — GRU recurrent layer (unrolled BPTT).
* `ggml_layer_global_max_pooling_2d()` — reduces `[H,W,C]` to `[C]` via max pooling.
* `ggml_layer_global_average_pooling_2d()` — reduces `[H,W,C]` to `[C]` via average pooling.
* `ggml_save_model()` — saves the full model (architecture + weights) to an RDS file.
* `ggml_load_model()` — restores a model saved with `ggml_save_model()`.
* `ggml_dense()`, `ggml_conv_2d()`, `ggml_conv_1d()`, `ggml_batch_norm()`, `ggml_embedding()`, `ggml_lstm()`, `ggml_gru()` — layer-object constructors returning a reusable `ggml_layer` object.
* `ggml_apply(tensor, layer)` — applies a `ggml_layer` object to a tensor node; weights are shared by object identity.
* `ggml_layer_dropout()` — dropout with deterministic or stochastic (per-epoch Bernoulli mask) mode.
* `ggml_layer_embedding()` — token embedding lookup for integer inputs.
* `ggml_input()` gains a `dtype` argument (`"float32"` or `"int32"`).
* `ggml_model()` and `ggml_predict()`.
* `ggml_input()` — declare a symbolic input tensor node (Functional API).
* `ggml_model()` — assemble a `ggml_functional_model` from input/output nodes.
* `ggml_layer_add()` — element-wise addition of tensor nodes (residual connections).
* `ggml_layer_concatenate()` — concatenate tensor nodes along an axis.
* `ggml_layer_*()` functions now accept a `ggml_tensor_node` as first argument (Functional API mode).
* `ggml_compile()`, `ggml_fit()`, `ggml_evaluate()`, `ggml_predict()` are now S3 generics with methods for `ggml_functional_model`.
* `ggml_fit_opt()` — low-level optimizer loop with callbacks and learning-rate control.
* `ggml_callback_early_stopping()` — stops training when a metric stagnates.
* `ggml_schedule_step_decay()` — step learning-rate decay.
* `ggml_schedule_cosine_decay()` — cosine learning-rate annealing.
* `ggml_schedule_reduce_on_plateau()` — reduces the learning rate when a metric stops improving.
* `ggml_opt_init_for_fit()`, `ggml_opt_set_lr()`, `ggml_opt_get_lr()` — learning-rate control without recreating the optimizer context.
* `configure.win`.
* `ggml_layer_conv_1d()` — 1D convolution layer.
* `ggml_layer_batch_norm()` — batch normalisation layer.
* `ggml_predict_classes()` — argmax wrapper returning 1-based class indices.
* `summary.ggml_sequential_model()` — detailed model summary with parameter counts.
* `ggml_fit()` now returns `model$history` (class `ggml_history`) with `print` and `plot` methods.
* `ggml_model_sequential()`, `ggml_layer_dense()`, `ggml_layer_conv_2d()`, `ggml_layer_max_pooling_2d()`, `ggml_layer_flatten()`, `ggml_compile()`, `ggml_fit()`, `ggml_evaluate()`, `ggml_predict()`, `ggml_save_weights()`, `ggml_load_weights()`.
* `ggml_timestep_embedding()` — sinusoidal timestep embeddings.
* `ggml_set_f32_nd()`, `ggml_get_f32_nd()`, `ggml_set_i32_nd()`, `ggml_get_i32_nd()`.
* `ggml_tensor_nb()`, `ggml_tensor_num()`, `ggml_tensor_copy()`, `ggml_tensor_set_f32_scalar()`, `ggml_get_first_tensor()`, `ggml_get_next_tensor()`.
* `libggml.a` exported for linking by dependent packages.
* `gguf.cpp` added for GGUF file-format support.
* `inst/include/` for `LinkingTo`.
* `ggml_opt_init()`, `ggml_opt_free()`, `ggml_opt_fit()`, `ggml_opt_epoch()`, `ggml_opt_eval()`.
* `ggml_opt_dataset_init()`, `ggml_opt_dataset_data()`, `ggml_opt_dataset_labels()`, `ggml_opt_dataset_shuffle()`.
* `ggml_opt_result_init()`, `ggml_opt_result_loss()`, `ggml_opt_result_accuracy()`,
`ggml_opt_result_pred()`.
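The autograd entries above can be combined into a conventional training loop. The sketch below uses only the names and signatures listed in this file; the iteration protocol (`$next_batch()`), the optimizer's `$step()` method, and the batch field names are assumptions introduced for illustration, not documented API.

```r
# Sketch only: a closure-based training loop over the ag_* API listed above.
model <- ag_sequential(
  ag_linear(4, 16),      # Glorot-initialised dense layer
  ag_dropout(0.1),       # inverted dropout; identity in eval mode
  ag_linear(16, 3)
)
opt    <- optimizer_adam(model$params())   # `$params()` per the ag_linear entry
sched  <- lr_scheduler_cosine(opt, T_max = 100, lr_min = 1e-5, restart = FALSE)
loader <- ag_dataloader(x, y, batch_size = 32, shuffle = TRUE, col_major = TRUE)

ag_train(model)                            # switch sub-layers to train mode
while (!is.null(batch <- loader$next_batch())) {   # `$next_batch()` is assumed;
                                                   # the file documents only `$epoch()`
  with_grad_tape({                         # record the forward pass on the tape
    logits <- model$forward(batch$x)       # batch field names assumed
    loss   <- ag_softmax_cross_entropy_loss(logits, batch$y)  # integer targets OK
  })
  grads <- backward(loss)                  # gradient environment keyed by tensor id
  clip_grad_norm(model$params(), grads, max_norm = 1.0)
  opt$step(grads)                          # `$step()` is an assumed method
}
ag_eval(model)                             # dropout becomes identity in eval mode
```

`ag_gradcheck()` (central finite differences, in the spirit of `torch.autograd.gradcheck`) can be used to validate custom ops against the tape before training.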
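A Functional API sketch tying together `ggml_input()`, the node-accepting `ggml_layer_*()` forms, and the `ggml_functional_model` S3 methods. Layer keyword arguments (`filters`, `kernel_size`, `units`) and the `ggml_compile()`/`ggml_fit()` arguments are assumptions; only the function names and the node-first calling convention come from the entries above.

```r
# Sketch only: assemble a functional model from symbolic tensor nodes.
inp <- ggml_input(shape = c(28, 28, 1), dtype = "float32")  # symbolic input node
x   <- ggml_layer_conv_2d(inp, filters = 32, kernel_size = 3)  # node-first form
x   <- ggml_layer_global_average_pooling_2d(x)     # [H,W,C] -> [C]
h   <- ggml_layer_dense(x, units = 64)
x   <- ggml_layer_add(h, ggml_layer_dense(h, units = 64))  # residual connection
out <- ggml_layer_dense(x, units = 10)

model <- ggml_model(inp, out)          # ggml_functional_model from input/output nodes
ggml_compile(model)                    # S3 generic; compile arguments assumed
ggml_fit(model, x_train, y_train)      # returns model$history (ggml_history)
classes <- ggml_predict_classes(model, x_test)  # 1-based class indices

# Weight sharing via layer objects: one ggml_dense() object applied to two nodes
# shares its weights by object identity.
shared <- ggml_dense(64)
a <- ggml_apply(node_a, shared)
b <- ggml_apply(node_b, shared)
```

A `plot(model$history)` call after fitting visualises the training curves via the `ggml_history` plot method.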
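The `dp_train()` entry above can be read as a single-call wrapper around the broadcast/average/re-broadcast cycle it describes. A sketch under assumptions: the bodies of `make_model`, `forward_fn`, and `target_fn` below are invented for illustration; only `dp_train()`'s argument list and return value are taken from this file.

```r
# Sketch only: data-parallel training with the dp_train() signature above.
result <- dp_train(
  make_model = function() ag_sequential(ag_linear(8, 32), ag_linear(32, 1)),
  data       = train_data,
  loss_fn    = ag_mse_loss,
  forward_fn = function(model, batch) model$forward(batch$x),  # assumed body
  target_fn  = function(batch) batch$y,                        # assumed body
  n_gpu      = 2,      # replicas; weights broadcast from replica 0 before step 1
  n_iter     = 1000,   # gradients averaged across replicas each iteration
  lr         = 1e-3,
  max_norm   = 1.0,    # global-L2 gradient clipping
  verbose    = TRUE
)
# Per the entry above: returns list(params, loss_history, model)
plot(result$loss_history, type = "l")
```

Because weights are re-broadcast after every optimizer update, all replicas hold identical parameters on exit, so `result$model` is usable directly for inference.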