Getting Started with ankiR

ankiR provides a tidy interface for reading Anki flashcard databases in R. This vignette shows common workflows for analyzing your Anki learning data.

Installation

# From CRAN
install.packages("ankiR")

# Or from GitHub for the development version
remotes::install_github("chrislongros/ankiR")

Opening a Collection

ankiR can automatically detect your Anki installation:

library(ankiR)

# Auto-detect (uses first profile found)
col <- anki_collection()

# Specify a profile
col <- anki_collection(profile = "User 1")

# Or provide a path directly
col <- anki_collection(path = "/path/to/collection.anki2")

The collection object provides methods to access different data:

notes <- col$notes()
cards <- col$cards()
reviews <- col$revlog()
decks <- col$decks()
models <- col$models()

# Always close when done
col$close()

Convenience Functions

For one-off queries, use the standalone functions. They handle connection cleanup automatically:

# These are equivalent to opening, querying, and closing
notes <- anki_notes()
cards <- anki_cards()
reviews <- anki_revlog()
decks <- anki_decks()
models <- anki_models()

Understanding the Data

Notes

Notes contain the actual content of your flashcards:

notes <- anki_notes()
# nid: Note ID
# mid: Model (note type) ID
# tags: Space-separated tags
# flds: All fields in one string, separated by the \x1f (unit separator) character
# sfld: Sort field (usually the front)
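
Because flds packs every field into a single string, a common first step is to split it. The following is a minimal base R sketch; notes_demo is a small hand-made stand-in for the output of anki_notes(), and its nid values and field contents are invented for illustration.

```r
# Toy stand-in for anki_notes(); real output has more columns.
sep <- "\x1f"  # unit separator used between fields
notes_demo <- data.frame(
  nid  = c(1, 2),
  flds = c(
    paste("hola", "hello", sep = sep),
    paste("adios", "goodbye", "extra note", sep = sep)
  ),
  stringsAsFactors = FALSE
)

# Split each flds string on the separator (fixed = TRUE: no regex).
field_list <- strsplit(notes_demo$flds, sep, fixed = TRUE)

# First field of every note (often the front of the card).
front <- vapply(field_list, function(x) x[1], character(1))

# Number of fields per note; this depends on the note type.
n_fields <- lengths(field_list)
```

The result is kept as a list because notes of different types can have different numbers of fields.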

Cards

Cards are generated from notes. One note can produce multiple cards:

cards <- anki_cards()
# cid: Card ID
# nid: Note ID (links to notes table)
# did: Deck ID
# type: 0=new, 1=learning, 2=review, 3=relearning
# queue: -1=suspended, 0=new, 1=learning, 2=review
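# due: Queue position for new cards; otherwise a due date (in the raw Anki database, a day number for review cards and a Unix timestamp for learning cards)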
# ivl: Current interval in days
# reps: Number of reviews
# lapses: Number of times forgotten
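
The integer codes in type and queue are easier to read once mapped to labels. This is a small base R sketch on a hand-made data frame (cards_demo and its values are invented; the code meanings are the ones listed above).

```r
# Toy stand-in for anki_cards(); real output has more columns.
cards_demo <- data.frame(
  cid   = c(101, 102, 103, 104),
  type  = c(0L, 2L, 2L, 3L),
  queue = c(0L, 2L, -1L, 1L),
  ivl   = c(0L, 35L, 10L, 1L)
)

# type runs from 0 to 3, so type + 1 indexes the label vector.
type_labels <- c("new", "learning", "review", "relearning")
cards_demo$type_label <- factor(
  type_labels[cards_demo$type + 1L],
  levels = type_labels
)

# A card is suspended when its queue is -1, whatever its type.
cards_demo$suspended <- cards_demo$queue == -1L

# Count cards per state (zero counts are kept because of the factor levels).
table(cards_demo$type_label)
```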

Decks

decks <- anki_decks()
# did: Deck ID
# name: Deck name (includes parent::child hierarchy)
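
Since the hierarchy is encoded in the name with a :: separator, the levels can be recovered by splitting the string. A minimal base R sketch, using invented deck names:

```r
# Example deck names (invented); real names come from anki_decks()$name.
deck_names <- c(
  "Languages",
  "Languages::Spanish",
  "Languages::Spanish::Verbs"
)

# Split on the literal "::" separator.
parts <- strsplit(deck_names, "::", fixed = TRUE)

# Top-level deck, innermost deck, and nesting depth for each name.
top_level <- vapply(parts, function(x) x[1], character(1))
leaf      <- vapply(parts, function(x) x[length(x)], character(1))
depth     <- lengths(parts)
```

Grouping by top_level instead of name is one way to roll subdeck statistics up to the parent deck.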

Review Log

Every review is recorded:

reviews <- anki_revlog()
# rid: Review ID (timestamp in milliseconds)
# cid: Card ID
# ease: Button pressed (1=Again, 2=Hard, 3=Good, 4=Easy)
# ivl: Interval after review
# time: Time taken in milliseconds
# review_date: Date of review
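
Because rid is a millisecond timestamp, the exact time of each review can be recovered from it. The sketch below uses base R and two invented rid values; it converts in UTC, whereas the time zone and day-rollover rule behind review_date are not stated here, so the derived dates may differ from that column.

```r
# Two invented review IDs (milliseconds since 1970-01-01 UTC).
rid     <- c(1700000000000, 1700086400000)
time_ms <- c(8200, 15400)  # invented answer times in milliseconds

# Milliseconds to seconds, then to a date-time in UTC.
reviewed_at <- as.POSIXct(rid / 1000, origin = "1970-01-01", tz = "UTC")

# Calendar day of each review (UTC) and answer time in seconds.
review_day <- as.Date(reviewed_at, tz = "UTC")
time_s     <- time_ms / 1000
```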

Working with FSRS

If you use FSRS (Free Spaced Repetition Scheduler), ankiR can extract the memory state parameters:

cards_fsrs <- anki_cards_fsrs()

# Additional columns:
# stability: Time in days for recall probability to drop to 90%
# difficulty: How hard the card is (1-10)
# desired_retention: Target recall probability
# decay: FSRS-6 decay parameter (w20)

Calculating Retrievability

Retrievability is the probability you’ll recall a card right now:

# For a card with 30-day stability, reviewed 15 days ago
fsrs_retrievability(stability = 30, days_elapsed = 15)
#> 0.946

# Using the per-card decay from FSRS-6
fsrs_retrievability(stability = 30, days_elapsed = 15, decay = 0.3)
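
To see where these numbers come from, here is a sketch of the published FSRS power forgetting curve in base R. fsrs_r_sketch is a hypothetical helper written for this illustration, not part of ankiR, and it assumes that decay is passed as a positive number with a default of 0.5; that assumption reproduces the 0.946 shown above, but ankiR's internals may differ.

```r
# Hypothetical helper: FSRS power forgetting curve.
# Assumption: decay is positive and defaults to 0.5.
fsrs_r_sketch <- function(stability, days_elapsed, decay = 0.5) {
  # Scale chosen so that retrievability is exactly 0.9
  # when days_elapsed equals stability.
  scale_factor <- 0.9^(-1 / decay) - 1
  (1 + scale_factor * days_elapsed / stability)^(-decay)
}

r_half <- fsrs_r_sketch(stability = 30, days_elapsed = 15)  # about 0.946
r_full <- fsrs_r_sketch(stability = 30, days_elapsed = 30)  # 0.9 by construction
```

The second call illustrates the definition of stability: after exactly that many days, the predicted recall probability is 90%.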

Calculating Optimal Intervals

# When should I review for 90% retention?
fsrs_interval(stability = 30, desired_retention = 0.9)
#> 30

# For 85% retention (fewer reviews, more forgetting): the interval is longer than at 90%
fsrs_interval(stability = 30, desired_retention = 0.85)
#> 49.1
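
The interval is the forgetting curve solved for time. The sketch below inverts the published FSRS power curve in base R; fsrs_interval_sketch is a hypothetical helper for illustration, not ankiR's implementation, and it assumes a positive decay with a default of 0.5.

```r
# Hypothetical helper: days until retrievability falls to the target.
# Assumption: decay is positive and defaults to 0.5.
fsrs_interval_sketch <- function(stability, desired_retention, decay = 0.5) {
  # Same scale as the forgetting curve: R = 0.9 at t = stability.
  scale_factor <- 0.9^(-1 / decay) - 1
  stability / scale_factor * (desired_retention^(-1 / decay) - 1)
}

# Intervals for a 30-day stability at decreasing retention targets.
targets   <- c(0.95, 0.90, 0.80)
intervals <- fsrs_interval_sketch(stability = 30, desired_retention = targets)
```

At a target of 0.90 the interval equals the stability, and it grows as the target falls, since the card is allowed to decay further before the next review.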

Example Analysis: Review Patterns

library(ankiR)
library(dplyr)
library(ggplot2)

# Get data
reviews <- anki_revlog()
cards <- anki_cards()
decks <- anki_decks()

# Daily review count
daily_reviews <- reviews |>
  count(review_date, name = "reviews")

ggplot(daily_reviews, aes(review_date, reviews)) +
  geom_col(fill = "steelblue") +
  labs(title = "Daily Reviews", x = NULL, y = "Reviews") +
  theme_minimal()

# Card maturity by deck
cards |>
  left_join(decks, by = "did") |>
  filter(type == 2) |>  # Review cards only
  group_by(name) |>
  summarise(
    cards = n(),
    avg_interval = mean(ivl),
    mature = sum(ivl >= 21),  # Cards with 21+ day interval
    .groups = "drop"
  ) |>
  arrange(desc(cards))

Example: FSRS Memory Analysis

cards_fsrs <- anki_cards_fsrs()

# Distribution of stability values
cards_fsrs |>
  filter(!is.na(stability), stability > 0) |>
  ggplot(aes(stability)) +
  geom_histogram(bins = 50, fill = "steelblue") +
  scale_x_log10() +
  labs(
    title = "Distribution of Card Stability",
    x = "Stability (days, log scale)",
    y = "Count"
  ) +
  theme_minimal()

# Difficulty vs Stability
cards_fsrs |>
  filter(!is.na(stability), !is.na(difficulty)) |>
  ggplot(aes(difficulty, stability)) +
  geom_point(alpha = 0.3) +
  scale_y_log10() +
  labs(
    title = "Card Difficulty vs Stability",
    x = "Difficulty (1-10)",
    y = "Stability (days, log scale)"
  ) +
  theme_minimal()

Tips

Close connections: Always call col$close() when using anki_collection() directly, or use the convenience functions which handle this automatically.
Anki must be closed: The database is locked while Anki is running. Close Anki before reading the database.
Backup first: While ankiR only reads data (never writes), it’s good practice to back up your collection before any analysis.
Large collections: For very large collections, consider using SQL queries directly via DBI::dbGetQuery(col$con, "SELECT ...") for better performance.
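
As an illustration of the direct SQL approach, the sketch below aggregates reviews per card inside SQLite, so only the summary is loaded into R. It runs against an in-memory database with invented rows, assumes the RSQLite package is installed, and assumes the raw revlog table has id, cid, and ease columns as in the Anki schema; with a real collection you would pass col$con instead of con.

```r
library(DBI)

# In-memory SQLite database standing in for a real collection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Invented review rows using a subset of the raw revlog columns.
dbWriteTable(con, "revlog", data.frame(
  id   = c(1700000000000, 1700000060000, 1700086400000),
  cid  = c(101, 102, 101),
  ease = c(3L, 1L, 4L)
))

# Aggregate in the database and return one row per card.
per_card <- dbGetQuery(con, "
  SELECT cid, COUNT(*) AS reviews, AVG(ease) AS mean_ease
  FROM revlog
  GROUP BY cid
  ORDER BY cid
")

dbDisconnect(con)
```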
