---
title: "Getting Started with ankiR"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with ankiR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

ankiR provides a tidy interface for reading Anki flashcard databases in R. This vignette shows common workflows for analyzing your Anki learning data.

## Installation

```{r install}
# From CRAN
install.packages("ankiR")

# Or from GitHub for the development version
remotes::install_github("chrislongros/ankiR")
```

## Opening a Collection

ankiR can automatically detect your Anki installation:

```{r open}
library(ankiR)

# Auto-detect (uses first profile found)
col <- anki_collection()

# Specify a profile
col <- anki_collection(profile = "User 1")

# Or provide a path directly
col <- anki_collection(path = "/path/to/collection.anki2")
```

The collection object provides methods to access different data:

```{r methods}
notes <- col$notes()
cards <- col$cards()
reviews <- col$revlog()
decks <- col$decks()
models <- col$models()

# Always close when done
col$close()
```

## Convenience Functions

For one-off queries, use the standalone functions. They handle connection cleanup automatically:

```{r convenience}
# These are equivalent to opening, querying, and closing
notes <- anki_notes()
cards <- anki_cards()
reviews <- anki_revlog()
decks <- anki_decks()
models <- anki_models()
```

## Understanding the Data

### Notes

Notes contain the actual content of your flashcards:

```{r notes}
notes <- anki_notes()

# nid: Note ID
# mid: Model (note type) ID
# tags: Space-separated tags
# flds: Fields separated by \x1f character
# sfld: Sort field (usually the front)
```

### Cards

Cards are generated from notes.
One note can produce multiple cards:

```{r cards}
cards <- anki_cards()

# cid: Card ID
# nid: Note ID (links to notes table)
# did: Deck ID
# type: 0=new, 1=learning, 2=review, 3=relearning
# queue: -1=suspended, 0=new, 1=learning, 2=review
# due: Due date/position
# ivl: Current interval in days
# reps: Number of reviews
# lapses: Number of times forgotten
```

### Decks

```{r decks}
decks <- anki_decks()

# did: Deck ID
# name: Deck name (includes parent::child hierarchy)
```

### Review Log

Every review is recorded:

```{r revlog}
reviews <- anki_revlog()

# rid: Review ID (timestamp in milliseconds)
# cid: Card ID
# ease: Button pressed (1=Again, 2=Hard, 3=Good, 4=Easy)
# ivl: Interval after review
# time: Time taken in milliseconds
# review_date: Date of review
```

## Working with FSRS

If you use FSRS (Free Spaced Repetition Scheduler), ankiR can extract the memory state parameters:

```{r fsrs}
cards_fsrs <- anki_cards_fsrs()

# Additional columns:
# stability: Time in days for recall probability to drop to 90%
# difficulty: How hard the card is (1-10)
# desired_retention: Target recall probability
# decay: FSRS-6 decay parameter (w20)
```

### Calculating Retrievability

Retrievability is the probability you'll recall a card right now:

```{r retrievability}
# For a card with 30-day stability, reviewed 15 days ago
fsrs_retrievability(stability = 30, days_elapsed = 15)
#> 0.946

# Using the per-card decay from FSRS-6
fsrs_retrievability(stability = 30, days_elapsed = 15, decay = 0.3)
```

### Calculating Optimal Intervals

```{r intervals}
# When should I review for 90% retention?
fsrs_interval(stability = 30, desired_retention = 0.9)
#> 30

# For 85% retention (fewer reviews, longer interval than at 90%)
fsrs_interval(stability = 30, desired_retention = 0.85)
#> 49.1
```

## Example Analysis: Review Patterns

```{r analysis, message=FALSE}
library(ankiR)
library(dplyr)
library(ggplot2)

# Get data
reviews <- anki_revlog()
cards <- anki_cards()
decks <- anki_decks()

# Daily review count
daily_reviews <- reviews |>
  count(review_date, name = "reviews")

ggplot(daily_reviews, aes(review_date, reviews)) +
  geom_col(fill = "steelblue") +
  labs(title = "Daily Reviews", x = NULL, y = "Reviews") +
  theme_minimal()

# Card maturity by deck
cards |>
  left_join(decks, by = "did") |>
  filter(type == 2) |> # Review cards only
  group_by(name) |>
  summarise(
    cards = n(),
    avg_interval = mean(ivl),
    mature = sum(ivl >= 21), # Cards with 21+ day interval
    .groups = "drop"
  ) |>
  arrange(desc(cards))
```

## Example: FSRS Memory Analysis

```{r fsrs-analysis}
cards_fsrs <- anki_cards_fsrs()

# Distribution of stability values
cards_fsrs |>
  filter(!is.na(stability), stability > 0) |>
  ggplot(aes(stability)) +
  geom_histogram(bins = 50, fill = "steelblue") +
  scale_x_log10() +
  labs(
    title = "Distribution of Card Stability",
    x = "Stability (days, log scale)",
    y = "Count"
  ) +
  theme_minimal()

# Difficulty vs Stability
cards_fsrs |>
  filter(!is.na(stability), !is.na(difficulty)) |>
  ggplot(aes(difficulty, stability)) +
  geom_point(alpha = 0.3) +
  scale_y_log10() +
  labs(
    title = "Card Difficulty vs Stability",
    x = "Difficulty (1-10)",
    y = "Stability (days, log scale)"
  ) +
  theme_minimal()
```

## Tips

1. **Close connections**: Always call `col$close()` when using `anki_collection()` directly, or use the convenience functions, which handle this automatically.
2. **Anki must be closed**: The database is locked while Anki is running. Close Anki before reading the database.
3. **Back up first**: While ankiR only reads data (never writes), it's good practice to back up your collection before any analysis.
4. **Large collections**: For very large collections, consider using SQL queries directly via `DBI::dbGetQuery(col$con, "SELECT ...")` for better performance.
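
The direct-SQL tip can be illustrated with a small sketch. To keep it runnable without a real collection, the example builds an in-memory SQLite database as a stand-in for `col$con`. The mock rows and the `ease_summary` name are hypothetical, and the column names (`id`, `ease`, `time`) are assumed to follow Anki's raw `revlog` table rather than the renamed columns ankiR returns. It needs the DBI and RSQLite packages.

```{r sql-sketch}
library(DBI)

# Stand-in for `col$con`: an in-memory SQLite database with a tiny mock
# `revlog` table (hypothetical rows; only a few of Anki's columns are included)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "revlog", data.frame(
  id   = c(1700000000000, 1700000060000, 1700086400000, 1700086460000),
  cid  = c(1, 2, 1, 3),
  ease = c(3, 1, 3, 4),
  time = c(4000, 9000, 3000, 2000)
))

# Aggregate inside SQLite so only one row per answer button reaches R
ease_summary <- dbGetQuery(con, "
  SELECT ease,
         COUNT(*)           AS reviews,
         AVG(time) / 1000.0 AS avg_seconds
  FROM revlog
  GROUP BY ease
  ORDER BY ease
")

dbDisconnect(con)
ease_summary
```

With a real collection, the same query string would presumably be passed to `DBI::dbGetQuery(col$con, ...)`, followed by `col$close()`. Aggregating in SQL avoids pulling the whole review log into R when only a summary is needed.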