Skip to main content

champollion

A fully customizable internationalization framework. One command translates your locale files. One config controls every method, model, and language pair. And if the built-in methods aren't enough — build your own, prove it works, and deploy it.

npx champollion sync

champollion auto-detects your locale files, format, and target languages. It translates what's missing, skips what's done, validates every result, and writes clean output. That's the starting line.


Why Not Just Script It Yourself?

You could write a quick loop that calls Google Translate on each key. Most developers do — it takes about 30 lines. Here's where it breaks:

  • No change detection. Update an English string — the translation stays stale forever. champollion tracks every source value with SHA-256 hashes and re-translates only what changed.
  • No batching. One API call per key means 200 keys = 200 round trips. champollion batches intelligently (configurable, default 80 keys/batch for LLM, 128 for Google).
  • No caching. Every sync re-translates everything. champollion's Translation Memory caches translations by source text + locale + method — re-running sync after one key change only translates that one key, not the entire file.
  • No quality gate. Machine translation hallucinates, echoes the source back, or outputs in the wrong script. champollion validates every translation before writing it — wrong-script, length inflation, and source echoes are caught and rejected.
  • No format awareness. Hardcoded to JSON? champollion handles JSON, TOML, YAML, and Hugo Markdown (frontmatter + body) with auto-detection.
  • No method control. Every pair gets the same method. champollion lets you use Google Translate for French, an LLM for Japanese, and a custom community-hosted pipeline for Cree — in the same config file.

champollion is the production version of that script.


What Makes It Different

Every method is a plugin

The translation method is configurable per language pair. Mix Google Translate, LLMs, coached prompts, and custom APIs in the same project:

champollion.config.json
{
"version": 3,
"pairs": {
"en:fr": { "method": "google-translate" },
"en:ja": { "method": "llm", "model": "google/gemini-2.5-pro" },
"en:crk": { "methodPlugin": "crk-coached-v1" }
}
}

French gets Google Translate (fast, cheap). Japanese gets a premium LLM (nuanced). Plains Cree gets a coached plugin with grammar rules, dictionaries, and morphological validation. Same sync command. Same quality gate. Same CLI.

Prove it

Think your method can translate English to Spanish? Turkish to Azerbaijani? English to Cree?

Prove it. The companion eval harness benchmarks any translation method with reproducible, fingerprinted scoring. The leaderboard tracks every submission.

The eval harness and the production CLI share the same plugin interface. A method that scores well in the harness can be used in production — if the community whose language it serves gives consent. For Indigenous and low-resource languages, that consent matters. See Data Sovereignty.

# Benchmark your method (in the eval harness repo)
cd arena
python eval/baseline_experiment.py --dataset data/edtekla-dev-v1.json --submit

# Use it locally
npx champollion sync

Same plugin. Plug and test.

The full toolkit

champollion isn't just sync. It's a complete i18n pipeline:

CommandWhat It Does
syncTranslate missing and stale keys (with post-sync verification)
watchAuto-sync when your source file changes
lintScan source code for hardcoded strings
wrapAuto-wrap hardcoded strings in t() calls
auditList all [EN] fallback markers from prior runs
verifyVerify translations are present and correct (CI gate)
integrityDetect placeholder corruption, encoding issues, and ICU plural completeness
seoGenerate hreflang tags, sitemaps, and JSON-LD schema
statusShow pair config, plugins, and benchmark scores
provenanceAudit translation resource licensing
pluginInstall, remove, and list method plugins
fontsDownload web fonts for PUA script converters
tmManage Translation Memory cache (stats, clear, per-locale)
xliffExport/import XLIFF 1.2 for professional translator review

Four of these — lint, sync, verify, audit — form a CI pipeline that catches hardcoded strings, translates them, verifies correctness, and fails the build if any locale is incomplete.


The Arena

The Method Leaderboard is the scoreboard. Every submission is fingerprinted to a Git commit, versioned to a specific dataset, and scored by the same harness. Anyone can submit.

What can you prove? The harness takes JSON. Plugins take JSON. Any method that produces JSON can be tested:

ApproachExample
Coached LLMInject grammar rules and dictionaries into a frontier model's prompt
Fine-tuned modelTrain an open model on parallel text — just not on the eval data
FST-gated pipelineLLM generates → finite-state transducer validates morphology → retry
Chained modelsModel A drafts → Model B post-edits → Model C scores
Dictionary + LLMForce known terms from a dictionary, let the LLM handle the rest
EvolutionaryGenerate candidates, score them, mutate the best, repeat
Partial translationTranslate a sample by hand, prove your LLM matches, auto-translate the rest

Fine-tune models. Deploy evolutionary algorithms. Test student answers on language exams. Build lookup tables. Chain three models together. As long as your method produces JSON, the harness scores it and the framework runs it.

:::danger The one rule Do not train on the evaluation data. Methods exposed to the benchmark dataset will be disqualified. Fine-tune on whatever you want. Just not on the test set. :::

This is an open invitation. If you work with a low-resource language — as a researcher, a community member, a student, or just someone who cares — build a method, run the harness, and claim the top score. The problem is unsolved. The infrastructure is here.

→ View the leaderboard


Next Steps

Getting started:

Customizing your setup:

Going deeper: