
Reproducible Output

Reproducibility is a cornerstone of scientific research. localLLM is designed with reproducibility as a first-class feature, ensuring that your LLM-based analyses can be reliably replicated.

Deterministic Generation by Default

All generation functions in localLLM (quick_llama(), generate(), and generate_parallel()) use deterministic greedy decoding by default. This means running the same prompt twice will produce identical results.

library(localLLM)

# Run the same query twice
response1 <- quick_llama("What is the capital of France?")
response2 <- quick_llama("What is the capital of France?")

# Results are identical
identical(response1, response2)
#> [1] TRUE

Seed Control for Stochastic Generation

Stochastic generation (temperature > 0) is also reproducible when you supply an explicit seed:

# Stochastic generation with seed control
response1 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 92092
)

response2 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 92092
)

# Still reproducible with matching seeds
identical(response1, response2)
#> [1] TRUE

# Different seeds produce different outputs
response3 <- quick_llama(
  "Write a haiku about data science",
  temperature = 0.9,
  seed = 12345
)

identical(response1, response3)
#> [1] FALSE

Input/Output Hash Verification

All generation functions compute SHA-256 hashes for both inputs and outputs. These hashes enable verification that collaborators used identical configurations and obtained matching results.

result <- quick_llama("What is machine learning?")

# Access the hashes
hashes <- attr(result, "hashes")
print(hashes)
#> $input
#> [1] "a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1"
#>
#> $output
#> [1] "b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5"

The input hash includes:

- Model identifier
- Prompt text
- Generation parameters (temperature, seed, max_tokens, etc.)

The output hash covers the generated text, allowing collaborators to verify they obtained matching results.
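
A collaborator can then confirm a replication by comparing their hashes against the reported values. A minimal sketch (the reference hashes below are the placeholder values from the example above):

# Hashes reported with the original analysis (placeholder values)
reported <- list(
  input  = "a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1",
  output = "b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5"
)

# Re-run with the same configuration and compare
replication <- quick_llama("What is machine learning?")
obtained <- attr(replication, "hashes")

identical(obtained$input, reported$input)    # same configuration?
identical(obtained$output, reported$output)  # same generated text?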

Hashes with explore()

For multi-model comparisons, explore() computes hashes per model:

# `models` and `template_builder` are assumed to be defined earlier
# in your workflow (model identifiers and a prompt set, respectively)
res <- explore(
  models = models,
  prompts = template_builder,
  hash = TRUE
)

# View hashes for each model
hash_df <- attr(res, "hashes")
print(hash_df)
#>   model_id                         input_hash                        output_hash
#> 1  gemma4b a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5... b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9...
#> 2  llama3b c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0... d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1...

Set hash = FALSE to disable hash computation if not needed.
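
For example, to rerun the comparison above without hashing:

res <- explore(
  models = models,
  prompts = template_builder,
  hash = FALSE
)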

Automatic Documentation

Use document_start() and document_end() to capture everything that happens during your analysis. The log records each generation call with its timestamp, model, parameters, and input/output hashes:

# Start documentation
document_start(path = "analysis-log.txt")

# Run your analysis
result1 <- quick_llama("Classify this text: 'Great product!'")
result2 <- explore(models = models, prompts = prompts)

# End documentation
document_end()

The log file contains a complete audit trail:

================================================================================
localLLM Analysis Log
================================================================================
Start Time: 2025-01-15 14:30:22 UTC
R Version: 4.4.0
localLLM Version: 1.1.0
Platform: aarch64-apple-darwin22.6.0

--------------------------------------------------------------------------------
Event: quick_llama call
Time: 2025-01-15 14:30:25 UTC
Model: Llama-3.2-3B-Instruct-Q5_K_M.gguf
Parameters: temperature=0, max_tokens=256, seed=1234
Input Hash: a3f2b8c9...
Output Hash: b4c5d6e7...
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Event: explore call
Time: 2025-01-15 14:31:45 UTC
Models: gemma4b, llama3b
Prompts: 100 samples
Engine: parallel
--------------------------------------------------------------------------------

================================================================================
End Time: 2025-01-15 14:35:12 UTC
Session Hash: e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2...
================================================================================

Best Practices for Reproducible Research

1. Always Set Seeds

Even with temperature = 0, explicitly setting seeds documents your intent:

result <- quick_llama(
  "Analyze this text",
  temperature = 0,
  seed = 42  # Explicit for documentation
)

2. Log Your Environment

Record your setup at the start of analysis:

# Check hardware profile
hw <- hardware_profile()
print(hw)
#> $os
#> [1] "macOS 14.0"
#>
#> $cpu_cores
#> [1] 10
#>
#> $ram_gb
#> [1] 32
#>
#> $gpu
#> [1] "Apple M2 Pro"

3. Use Document Functions for Audit Trails

Wrap your entire analysis in documentation calls:

document_start(path = "my_analysis_log.txt")

# All your analysis code here
# ...

document_end()

4. Share Hashes for Verification

When publishing or sharing results, include hashes so others can verify:

result <- quick_llama("Your prompt here", seed = 42)

# Report these in your paper/documentation
cat("Input hash:", attr(result, "hashes")$input, "\n")
cat("Output hash:", attr(result, "hashes")$output, "\n")

5. Version Control Your Models

Track which model versions you used:

# List cached models with metadata
cached <- list_cached_models()
print(cached[, c("name", "size_bytes", "modified")])
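
For stronger provenance, you can also record a checksum of the model file itself. A minimal sketch using the digest package (the cache path shown is hypothetical):

library(digest)

# Hypothetical location of a cached model file
model_path <- path.expand("~/.cache/localLLM/Llama-3.2-3B-Instruct-Q5_K_M.gguf")

# SHA-256 checksum of the model weights; record this alongside your results
digest(model_path, algo = "sha256", file = TRUE)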

Summary

Feature               Function/Parameter                 Purpose
--------------------  ---------------------------------  ----------------------------------
Deterministic output  temperature = 0 (default)          Same input = same output
Seed control          seed = 42                          Reproducible stochastic generation
Hash verification     attr(result, "hashes")             Verify identical configurations
Audit trails          document_start()/document_end()    Complete session logging
Hardware info         hardware_profile()                 Record execution environment

With these tools, your LLM-based analyses become fully reproducible and verifiable.
