Speech-check language configuration

Speech-check treats two languages separately: the email input language (the fixture language) and the speakable output language (what TTS will read aloud). Leaderboard rows are comparable only when model, backend, input filter, and output language all match.


Two language axes

| Axis | CLI flag | Meaning |
| --- | --- | --- |
| Input language | --input-lang | Which fixture emails to run. Filters by each case’s input_lang (email body language). Omit the flag or pass all to run the full multilingual suite (shown as multi in reports). |
| Output language | --output-lang | Language of the speakable line the summarizer must produce. Must be a Supertonic 3 TTS code. Default: [speech].target_lang or [tts].lang in voxpost.toml. |

Important: Output language is never inferred from the email. The model is instructed to brief the user in the chosen output language even when the mail is French, Japanese, etc.


Allowed output languages (Supertonic)

ar, bg, hr, cs, da, nl, en, et, fi, fr, de, el, hi, hu, id, it, ja, ko, lv, lt, pl, pt, ro, ru, sk, sl, es, sv, tr, uk, vi

List in terminal:

voxpost summarize speech-check --list-languages
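
A script that prepares benchmark runs could check an output code against the list above before invoking the CLI. The following is a minimal sketch, not voxpost's own validation: the helper name `check_output_lang` is hypothetical, and the lowercase/whitespace normalization is an assumption about what is convenient, not documented CLI behavior.

```python
# Codes copied from the "Allowed output languages" list above.
SUPERTONIC_LANGS = frozenset(
    "ar bg hr cs da nl en et fi fr de el hi hu id it ja ko lv lt pl pt "
    "ro ru sk sl es sv tr uk vi".split()
)


def check_output_lang(code: str) -> str:
    """Return the normalized code, or raise ValueError if it is not listed.

    Hypothetical helper: normalization (strip + lowercase) is an assumption.
    """
    normalized = code.strip().lower()
    if normalized not in SUPERTONIC_LANGS:
        raise ValueError(f"unsupported output language: {code!r}")
    return normalized
```

Note that all is a value for the input filter only; it is not in the output list, so this sketch rejects it.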

Fixture input languages (shipped)

Derived from each JSON fixture’s input_lang field, or from the case_id prefix (en_…, fr_…, …):

| Code | Cases (approx.) | Examples |
| --- | --- | --- |
| en | 15 | en_short_ack, en_angry_order |
| fr | 3 | fr_forward_phone, fr_meeting_move |
| de | 1 | de_tax_notice |
| es | 1 | es_delivery |
| pt | 1 | pt_hotel_confirm |
| it | 1 | it_dinner_invite |
| nl | 1 | nl_interview_invite |
| ja | 1 | ja_en_mixed_vendor (mixed body; tagged ja) |
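
The derivation rule above (explicit input_lang first, case_id prefix as fallback) can be sketched as follows. This is an illustration under assumptions, not the shipped loader: the function names are hypothetical, and it assumes each case is a dict carrying a `case_id` key.

```python
def resolve_input_lang(case: dict) -> str:
    """Prefer the explicit input_lang field; otherwise use the case_id prefix."""
    explicit = case.get("input_lang")
    if explicit:
        return explicit
    # Fallback: "fr_meeting_move" -> "fr" (text before the first underscore).
    prefix, sep, _ = case.get("case_id", "").partition("_")
    if not sep or not prefix:
        raise ValueError("case has neither input_lang nor a prefixed case_id")
    return prefix


def filter_cases(cases: list[dict], input_lang: str = "all") -> list[dict]:
    """Keep cases matching input_lang; "all" keeps the full suite."""
    if input_lang == "all":
        return list(cases)
    return [c for c in cases if resolve_input_lang(c) == input_lang]
```

With this rule, a mixed-body case such as ja_en_mixed_vendor lands under ja whether it is tagged explicitly or resolved from its prefix.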

Example commands

Full multilingual suite, English speakable output (default leaderboard shape):

voxpost summarize speech-check --model qwen3.5:2b
# input=multi, output=en (from TOML if [speech] target_lang = en)

English emails only, French speakable output:

voxpost summarize speech-check --model qwen3.5:2b --input-lang en --output-lang fr

French fixtures only, French output:

voxpost summarize speech-check --model qwen3.5:2b --input-lang fr --output-lang fr
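
When several input/output pairs need to be run for the same model, the command lines above can be generated rather than typed. The sketch below only builds and prints the commands; it uses just the flags shown in this document, and the helper name `speech_check_argv` is hypothetical.

```python
import shlex

BASE = ["voxpost", "summarize", "speech-check"]


def speech_check_argv(model, input_lang=None, output_lang=None):
    """Build an argv list; None means the flag is omitted (defaults apply)."""
    argv = BASE + ["--model", model]
    if input_lang is not None:
        argv += ["--input-lang", input_lang]
    if output_lang is not None:
        argv += ["--output-lang", output_lang]
    return argv


# (input, output) pairs mirroring the three examples above.
pairs = [(None, "en"), ("en", "fr"), ("fr", "fr")]
commands = [shlex.join(speech_check_argv("qwen3.5:2b", i, o)) for i, o in pairs]
for line in commands:
    print(line)
```

Passing the output language explicitly for every pair keeps the resulting runs independent of whatever the local voxpost.toml sets.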

TOML vs CLI

| Setting | Source |
| --- | --- |
| Default output language | ~/.config/voxpost/voxpost.toml: [speech] mode=fixed + target_lang, else [tts] lang |
| Speech-check override | --output-lang CODE (benchmark runs should set this explicitly when not en) |
| Input filter | --input-lang CODE or omit for all fixtures |
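
The precedence in the table can be read as: CLI flag first, then [speech] target_lang when mode is fixed, then [tts] lang. A minimal sketch of that reading follows. It is not voxpost's actual resolver: `resolve_output_lang` is a hypothetical name, `config` is assumed to be the already-parsed TOML as a dict (for example from `tomllib.load`), and returning None when nothing is set is an assumption.

```python
from typing import Optional


def resolve_output_lang(
    config: dict, cli_output_lang: Optional[str] = None
) -> Optional[str]:
    """Pick the output language: CLI, then [speech] (fixed mode), then [tts]."""
    if cli_output_lang:
        return cli_output_lang
    speech = config.get("speech", {})
    if speech.get("mode") == "fixed" and speech.get("target_lang"):
        return speech["target_lang"]
    # Assumption: fall through to [tts] lang, or None if that is unset too.
    return config.get("tts", {}).get("lang")
```

For example, with `{"speech": {"mode": "fixed", "target_lang": "en"}, "tts": {"lang": "de"}}` the sketch returns "en" unless a CLI value is passed.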

Report / run log filename

Auto reports include language tokens:

{model}__{backend}__in-{input}__out-{output}__{completed}of{total}__{status}__run-{id}.md

Example:

qwen3.5-2b__ollama__in-multi__out-en__24of24__complete__run-20260524-143052-a1b2c3.md

The metadata table inside the file repeats Input language and Output language.
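
Because the language tokens sit at fixed positions in the filename, a report name can be split back into its fields. The regular expression below is a sketch written against the pattern and example shown above. It assumes no field contains a double underscore, and the names `REPORT_NAME` and `parse_report_name` are hypothetical.

```python
import re

# Mirrors: {model}__{backend}__in-{input}__out-{output}__{completed}of{total}__{status}__run-{id}.md
REPORT_NAME = re.compile(
    r"^(?P<model>.+?)__(?P<backend>.+?)__in-(?P<input>.+?)__out-(?P<output>.+?)"
    r"__(?P<completed>\d+)of(?P<total>\d+)__(?P<status>.+?)__run-(?P<run_id>.+)\.md$"
)


def parse_report_name(filename: str) -> dict:
    """Split a report filename into its fields; counts become integers."""
    match = REPORT_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a speech-check report name: {filename!r}")
    fields = match.groupdict()
    fields["completed"] = int(fields["completed"])
    fields["total"] = int(fields["total"])
    return fields


info = parse_report_name(
    "qwen3.5-2b__ollama__in-multi__out-en__24of24__complete__run-20260524-143052-a1b2c3.md"
)
print(info["input"], info["output"])
```

In the example filename the model appears as qwen3.5-2b rather than qwen3.5:2b, which suggests the colon is replaced to keep the name filesystem-safe; the sketch parses whatever is there and does not reverse that substitution.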


Leaderboard rows

When opening a PR, include in docs/MODEL_LEADERBOARD.md:

  • Input lang: multi (full suite), or en, fr, … if you filtered
  • Output lang: Supertonic code used (en, fr, …)
  • PASS / WEAK / FAIL: counts for that case subset (e.g. 15/24 if --input-lang en only)

Do not compare scores across different input/output pairs without noting the mismatch.
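
A reviewer could flag mismatched language pairs mechanically before comparing two rows. The following sketch assumes a simple in-memory row type; `LeaderboardRow` and `language_mismatches` are hypothetical names and do not reflect how docs/MODEL_LEADERBOARD.md is actually parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderboardRow:
    model: str
    input_lang: str   # "multi" for the full suite, or a fixture code
    output_lang: str  # Supertonic code


def language_mismatches(a: LeaderboardRow, b: LeaderboardRow) -> list[str]:
    """Return the language axes on which two rows differ (empty if none)."""
    return [
        axis
        for axis in ("input_lang", "output_lang")
        if getattr(a, axis) != getattr(b, axis)
    ]


full = LeaderboardRow("qwen3.5:2b", "multi", "en")
french = LeaderboardRow("qwen3.5:2b", "fr", "fr")
print(language_mismatches(full, french))
```

An empty list means the two rows share an input/output pair; anything else is the mismatch that should be noted alongside the scores.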


Contributing new fixture languages

Use the GitHub issue template “Multilingual speech-check fixture” (.github/ISSUE_TEMPLATE/multilingual_fixture.yml) or open a PR adding src/voxpost/speech_check/fixtures/{lang}_{scenario}.json with:

  • "input_lang": "xx" (ISO 639-1)
  • Realistic event (from, subject, body)
  • intent, label, grading hints (must_mention_any, must_not_mention, max_words)

See existing fixtures under src/voxpost/speech_check/fixtures/.
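
Before opening a PR, a new fixture can be checked for the fields listed above. The sketch below is illustrative only. The nesting (from, subject, body under an `event` key, with grading hints at the top level) is an assumption, as are the placeholder values and the idea that all three grading hints are required; the existing fixtures are the authority on the real layout.

```python
import json

# Assumed layout, inferred from the bullet list above; verify against shipped fixtures.
REQUIRED_TOP_LEVEL = ("input_lang", "event", "intent", "label")
REQUIRED_EVENT = ("from", "subject", "body")
GRADING_HINTS = ("must_mention_any", "must_not_mention", "max_words")


def missing_fields(fixture: dict) -> list[str]:
    """List expected keys that are absent from a fixture dict."""
    missing = [k for k in REQUIRED_TOP_LEVEL if k not in fixture]
    event = fixture.get("event", {})
    missing += [f"event.{k}" for k in REQUIRED_EVENT if k not in event]
    missing += [k for k in GRADING_HINTS if k not in fixture]
    return missing


# Hypothetical placeholder content, not a shipped fixture.
example = {
    "input_lang": "sv",
    "event": {
        "from": "sender@example.com",
        "subject": "placeholder subject",
        "body": "placeholder body in the fixture language",
    },
    "intent": "placeholder_intent",
    "label": "placeholder_label",
    "must_mention_any": ["placeholder"],
    "must_not_mention": [],
    "max_words": 25,
}
print(json.dumps(example, indent=2, ensure_ascii=False))
print(missing_fields(example))
```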