Automated vocabulary mining

Upload content.
Get a deck.
Export to Anki.

Drop any PDF, video, or audio file. Saikutsu extracts vocabulary, builds cards from real sentences, and exports to Anki. Minutes, not hours.

Beta testers get free access and help shape the tool. Or join the waitlist and we'll email you when Saikutsu is open to everyone.

採掘 — Murakami Ch.3
265 cards | 83 i+1 sentences | 47 already known, skipped

青豆は、その風景を黙って見つめていた。

風景 (ふうけい)

scenery; landscape; view

Ready to export
Export to Anki →

The problem

Mining shouldn't take longer than reading.

Look up a word, copy the sentence, format the card, add it to Anki. Repeat 50 times. By then, you've forgotten what you were reading.

30–45 min

spent per session creating cards manually. Most people give up before finishing.

5 tools

PDF reader, dictionary, spreadsheet, Anki, tokenizer. For podcasts, add even more.

0 flow

Context switching between reading and card creation destroys immersion entirely.

Features

Everything you need to mine smarter.

Any input, one pipeline

PDF, video, or audio — novels, podcasts, anime, lectures. We extract or transcribe, tokenize, and build your deck.

Cards in context

Every card shows the target word bolded inside a real sentence from your content. Flip for the definition and reading. Context-first learning.

All words + i+1 decks

Get two decks: one with every unique word, and one with only i+1 sentences — where you know every word except the target. The fastest path to comprehension.

Smart deduplication

Already know a word? We skip it. Every new deck is checked against your entire collection so you never create the same card twice.

Frequency ranking

Words are ranked by how often they appear in your document and by how common they are in the language overall. Learn high-value words first.
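As a rough illustration of ranking by two frequency signals, the sketch below sorts words by their count in the document and breaks ties with a language-wide count. The function name `rank_words`, the tie-breaking rule, and the sample counts are assumptions made for this example; this page does not describe how Saikutsu actually combines the two signals.

```python
from collections import Counter

def rank_words(tokens, language_freq):
    """Sort unique words: most frequent in the document first,
    ties broken by a (hypothetical) language-wide count."""
    doc_freq = Counter(tokens)
    return sorted(
        doc_freq,
        key=lambda w: (-doc_freq[w], -language_freq.get(w, 0), w),
    )

# Made-up sample data, for illustration only.
tokens = ["風景", "見つめる", "風景", "黙る", "見つめる", "風景"]
language_freq = {"風景": 1200, "見つめる": 900, "黙る": 1500}

ranked = rank_words(tokens, language_freq)
print(ranked)
```

A weighted score that mixes both counts would be another reasonable choice; the strict tie-break here just keeps the example easy to follow.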

Deep Anki integration

Import existing Anki decks — we skip words you already know. Export new decks as .apkg with sentences, readings, and definitions. No duplicates, ever.

Subtitle mining

Search for TV shows and anime, then extract vocabulary directly from subtitle files. Jimaku for Japanese, OpenSubtitles for other languages — build decks from content you're actually watching.

How it works

Three steps. That's it.

01

Upload your content

Drag and drop any PDF, video, or audio file. Novels, podcast episodes, anime, lectures. Japanese, Spanish, French, German, Italian, and Portuguese.

podcast_ep42.mp3 18.7 MB

02

We do the mining

Transcription, tokenization, dictionary lookups, frequency analysis, i+1 sentence detection, deduplication against your existing cards. Usually under 5 minutes.

Audio transcribed
Tokenized — 312 words (47 already known, skipped)
Found 83 i+1 sentences
Building cards...

03

Export to Anki

Your deck is ready. Browse cards, edit anything you want, then export as .apkg and import straight into Anki. Mining in minutes instead of hours.

265 cards ready
Export .apkg →

Early access

Two ways to get involved

Saikutsu is in private beta. You can help shape the tool now, or get notified when it's ready for everyone.

beta tester

Help us build it

Get free access now. Test new features, report bugs, and have direct input on what we build next.

  • Free access for the entire beta
  • Direct input on features and pricing
  • Founding member pricing at launch
Apply to be a tester

waitlist

Get notified at launch

Not ready to test yet? Leave your email and we'll let you know when Saikutsu is open to everyone.

  • No commitment, just a notification
  • Be first in line when we open up
Join the Waitlist

Compatibility

Built for Anki users.

Saikutsu handles the mining. Anki handles the reviewing. Your other tools still fit right in.

anki

Made for Anki

Import your existing Anki decks so we know what you've already learned — we'll skip those words when mining new content. When you're done, export your new cards as .apkg and import them straight into Anki.

Cards include the word in context (with the target word bolded), definitions, and readings. Ready to study the moment you import.
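A minimal sketch of what such a card could look like as data, assuming cards are stored as HTML fields (Anki renders HTML in note fields). The function `make_card` and the `front`/`back` field names are hypothetical, not Saikutsu's actual card template.

```python
import html

def make_card(sentence, word, reading, definition):
    """Build front/back fields; the first occurrence of the target
    word is wrapped in <b> on the front."""
    safe_sentence = html.escape(sentence)
    safe_word = html.escape(word)
    front = safe_sentence.replace(safe_word, f"<b>{safe_word}</b>", 1)
    back = f"{safe_word} ({html.escape(reading)})<br>{html.escape(definition)}"
    return {"front": front, "back": back}

card = make_card(
    "青豆は、その風景を黙って見つめていた。",
    "風景",
    "ふうけい",
    "scenery; landscape; view",
)
print(card["front"])
print(card["back"])
```

This simple string match only works when the word appears in the sentence in exactly the given form. Conjugated or inflected words would need the token's position from the tokenizer instead.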

yomitan

Works alongside Yomitan

Yomitan is great for what it does: hover over a word while you're reading, get an instant definition, and mine it on the spot. That's real-time, one-at-a-time mining, and it's perfect for immersion reading.

Saikutsu is for everything else. Drop an entire PDF, a podcast episode, or a movie file and extract hundreds of vocabulary words at once. Batch mining and real-time mining solve different problems, so use both.

jimaku

Works alongside Jimaku

Saikutsu searches Jimaku's subtitle library to find Japanese subtitles for the shows you're watching. Select an episode, and we tokenize the subtitle text, look up definitions, and generate vocabulary cards automatically.

Jimaku handles hosting and indexing Japanese subtitles — we handle turning them into Anki-ready decks. For non-Japanese languages, we pull from OpenSubtitles instead.
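To make the first step concrete, here is a small sketch that pulls the spoken text out of a subtitle file before tokenization. It assumes the file is in SubRip (.srt) format; other subtitle formats would need a different parser. The function name `subtitle_lines` and the sample cues are made up for illustration.

```python
import re

# Matches an SRT timing row such as "00:00:01,000 --> 00:00:03,500".
TIMESTAMP = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)

def subtitle_lines(srt_text):
    """Return only the dialogue rows of an SRT file, without cue
    counters, timing rows, or inline tags like <i>."""
    texts = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        rows = [row.strip() for row in block.splitlines()]
        if rows and rows[0].isdigit():
            rows = rows[1:]  # drop the cue counter
        rows = [r for r in rows if r and not TIMESTAMP.search(r)]
        texts.extend(re.sub(r"<[^>]+>", "", r) for r in rows)
    return texts

sample = """1
00:00:01,000 --> 00:00:03,500
青豆は、その風景を
黙って見つめていた。

2
00:00:04,000 --> 00:00:06,000
<i>行こう。</i>
"""

lines = subtitle_lines(sample)
print(lines)
```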

FAQ

Common questions.

Which languages are supported?

We currently support Japanese, Spanish, French, German, Italian, and Portuguese. All languages are available on every plan, including Free. We're actively adding more.

How does tokenization work for different languages?

For Japanese, we use MeCab-compatible tokenization with full dictionary lookup. For European languages (Spanish, French, German, Italian, Portuguese), we use lemmatization to reduce words to their base dictionary form — so conjugated verbs, plural nouns, and declined adjectives all map back to the root word you actually need to learn. No more creating separate cards for 'hablar,' 'hablando,' and 'hablo.'
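The idea of mapping inflected forms back to one entry can be sketched as follows. The lookup table `TOY_LEMMAS` is a tiny hand-written stand-in for a real lemmatizer, and `group_by_lemma` is a hypothetical helper; neither reflects the actual tooling behind Saikutsu.

```python
from collections import defaultdict

# Hand-written toy table standing in for a real lemmatizer.
TOY_LEMMAS = {
    "hablo": "hablar",
    "hablando": "hablar",
    "hablar": "hablar",
    "libros": "libro",
    "libro": "libro",
}

def group_by_lemma(tokens):
    """Group surface forms under their base form so each lemma
    yields one card instead of one card per inflection."""
    groups = defaultdict(list)
    for token in tokens:
        lemma = TOY_LEMMAS.get(token.lower(), token.lower())
        groups[lemma].append(token)
    return dict(groups)

groups = group_by_lemma(["Hablo", "hablando", "hablar", "libros"])
print(groups)
```

Three forms of "hablar" collapse into a single entry, which is the behavior the answer above describes.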

I already use Anki. Why would I use this?

Saikutsu isn't replacing Anki — it's the fastest way to get cards into it. Upload a PDF or video, and we'll extract vocabulary, find example sentences, look up definitions, and export a ready-to-import .apkg. You keep studying in Anki. We just handle the mining.

What formats are supported?

Video: MP4, MKV, WebM, MOV. Audio: MP3, M4A, WAV, OGG, FLAC. Documents: PDF with selectable text. Files over 25 MB are automatically chunked for transcription.
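A short sketch of how the format list and size limit above could be checked in code. It assumes the limit means 25 MiB and that splitting is planned by file size; real audio chunking is more likely done by duration, so treat `chunks_needed` as an approximation. Both helper names are hypothetical.

```python
import math
from pathlib import Path

SUPPORTED = {
    "video": {".mp4", ".mkv", ".webm", ".mov"},
    "audio": {".mp3", ".m4a", ".wav", ".ogg", ".flac"},
    "document": {".pdf"},
}
CHUNK_LIMIT_BYTES = 25 * 1024 * 1024  # assumption: 25 MB read as 25 MiB

def classify(filename):
    """Return 'video', 'audio', 'document', or None if unsupported."""
    ext = Path(filename).suffix.lower()
    for kind, extensions in SUPPORTED.items():
        if ext in extensions:
            return kind
    return None

def chunks_needed(size_bytes):
    """Number of pieces needed to keep each at or under the limit."""
    return max(1, math.ceil(size_bytes / CHUNK_LIMIT_BYTES))

print(classify("podcast_ep42.mp3"), chunks_needed(int(18.7 * 1024 * 1024)))
print(classify("lecture.mkv"), chunks_needed(60 * 1024 * 1024))
```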

What's an i+1 sentence?

A sentence where you know every word except one. It's the ideal context for learning — enough comprehension to guess meaning from context, with exactly one new word to acquire. We find these automatically across your content.
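The definition translates almost directly into code. In this sketch, sentences are assumed to arrive already tokenized into base forms, and `find_i_plus_one` is a hypothetical helper name; the sample sentences and known-word set are made up.

```python
def find_i_plus_one(sentences, known):
    """Return (tokens, target) pairs for sentences that contain
    exactly one distinct word not in the known set."""
    results = []
    for tokens in sentences:
        unknown = {t for t in tokens if t not in known}
        if len(unknown) == 1:
            results.append((tokens, next(iter(unknown))))
    return results

known = {"青豆", "その", "黙る", "見つめる"}
sentences = [
    ["青豆", "その", "風景", "黙る", "見つめる"],  # one unknown word: i+1
    ["風景", "黙る", "景色"],                      # two unknown words: skipped
    ["青豆", "見つめる"],                          # nothing new: skipped
]

matches = find_i_plus_one(sentences, known)
print(matches)
```

Using a set for the unknown words means a new word that appears twice in one sentence still counts as a single target.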

What happens with words I already know?

We check every word against your entire card collection across all decks. Import your existing Anki decks first, and we'll skip those words automatically when mining new content. No duplicates, no wasted cards.
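One way to picture this check: merge the words from every imported deck into a single set, then split freshly mined words into new and skipped. The deck names, sample words, and both function names below are assumptions for illustration only.

```python
def known_from_decks(decks):
    """Merge the words of all decks into one set of known words."""
    known = set()
    for words in decks.values():
        known.update(words)
    return known

def split_new_and_known(mined_words, known):
    """Return (new, skipped) lists, keeping first-seen order and
    dropping repeats within the mined words themselves."""
    new, skipped, seen = [], [], set()
    for word in mined_words:
        if word in seen:
            continue
        seen.add(word)
        (skipped if word in known else new).append(word)
    return new, skipped

# Hypothetical decks and mined words.
decks = {"Core": ["風景", "見る"], "Mined": ["黙る"]}
mined = ["風景", "景色", "黙る", "景色", "見つめる"]

new_words, skipped_words = split_new_and_known(mined, known_from_decks(decks))
print(len(new_words), "new,", len(skipped_words), "already known, skipped")
```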

Can I edit cards before exporting?

Yes. Browse and edit any card in your deck — change definitions, add notes, delete words you don't want. Export when you're satisfied.

Can I use scanned PDFs?

Not yet. We need digital text to extract vocabulary, so if your PDF has selectable text, you're good. We're working on OCR so that we can support every type of PDF, but the quality isn't good enough yet.

Found a bug?

Email kamdyn@saikutsu.com or visit our contact page. We're a small team building this because we use it ourselves.

Built with

Saikutsu is built on top of several open-source projects and community-maintained data sources. We're grateful for their work.

Help us build the mining tool you actually want.

We're a small team from the immersion learning community. We built Saikutsu because we needed it ourselves — and now we're looking for early users to help us get it right.