champollion

A fully customizable internationalization framework. One command translates your locale files. One config controls every method, model, and language pair. And if the built-in methods aren't enough — build your own, prove it works, and deploy it.

npx champollion sync

champollion auto-detects your locale files, format, and target languages. It translates what's missing, skips what's done, validates every result, and writes clean output. That's the starting line.

Why Not Just Script It Yourself?

You could write a quick loop that calls Google Translate on each key. Most developers do — it takes about 30 lines. Here's where it breaks:

No change detection. Update an English string — the translation stays stale forever. champollion tracks every source value with SHA-256 hashes and re-translates only what changed.
No batching. One API call per key means 200 keys = 200 round trips. champollion batches intelligently (configurable, default 80 keys/batch for LLM, 128 for Google).
No caching. Every sync re-translates everything. champollion's Translation Memory caches translations by source text + locale + method — re-running sync after one key change only translates that one key, not the entire file.
No quality gate. Machine translation hallucinates, echoes the source back, or outputs in the wrong script. champollion validates every translation before writing it — wrong-script, length inflation, and source echoes are caught and rejected.
No format awareness. Hardcoded to JSON? champollion handles JSON, TOML, YAML, and Hugo Markdown (frontmatter + body) with auto-detection.
No method control. Every pair gets the same method. champollion lets you use Google Translate for French, an LLM for Japanese, and a custom community-hosted pipeline for Cree — in the same config file.

champollion is the production version of that script.

What Makes It Different

Every method is a plugin

The translation method is configurable per language pair. Mix Google Translate, LLMs, coached prompts, and custom APIs in the same project:

champollion.config.json
{
  "version": 3,
  "pairs": {
    "en:fr": { "method": "google-translate" },
    "en:ja": { "method": "llm", "model": "google/gemini-2.5-pro" },
    "en:crk": { "methodPlugin": "crk-coached-v1" }
  }
}

French gets Google Translate (fast, cheap). Japanese gets a premium LLM (nuanced). Plains Cree gets a coached plugin with grammar rules, dictionaries, and morphological validation. Same sync command. Same quality gate. Same CLI.

Prove it

Think your method can translate English to Spanish? Turkish to Azerbaijani? English to Cree?

Prove it. The companion eval harness benchmarks any translation method with reproducible, fingerprinted scoring. The leaderboard tracks every submission.

The eval harness and the production CLI share the same plugin interface. A method that scores well in the harness can be used in production — if the community whose language it serves gives consent. For Indigenous and low-resource languages, that consent matters. See Data Sovereignty.

# Benchmark your method (in the eval harness repo)
cd arena
python eval/baseline_experiment.py --dataset data/edtekla-dev-v1.json --submit

# Use it locally
npx champollion sync

Same plugin. Plug and test.

The full toolkit

champollion isn't just sync. It's a complete i18n pipeline:

Command	What It Does
`sync`	Translate missing and stale keys (with post-sync verification)
`watch`	Auto-sync when your source file changes
`lint`	Scan source code for hardcoded strings
`wrap`	Auto-wrap hardcoded strings in `t()` calls
`audit`	List all `[EN]` fallback markers from prior runs
`verify`	Verify translations are present and correct (CI gate)
`integrity`	Detect placeholder corruption, encoding issues, and ICU plural completeness
`seo`	Generate hreflang tags, sitemaps, and JSON-LD schema
`status`	Show pair config, plugins, and benchmark scores
`provenance`	Audit translation resource licensing
`plugin`	Install, remove, and list method plugins
`fonts`	Download web fonts for PUA script converters
`tm`	Manage Translation Memory cache (stats, clear, per-locale)
`xliff`	Export/import XLIFF 1.2 for professional translator review

Four of these — lint, sync, verify, audit — form a CI pipeline that catches hardcoded strings, translates them, verifies correctness, and fails the build if any locale is incomplete.

The Arena

The Method Leaderboard is the scoreboard. Every submission is fingerprinted to a Git commit, versioned to a specific dataset, and scored by the same harness. Anyone can submit.

What can you prove? The harness takes JSON. Plugins take JSON. Any method that produces JSON can be tested:

Approach	Example
Coached LLM	Inject grammar rules and dictionaries into a frontier model's prompt
Fine-tuned model	Train an open model on parallel text — just not on the eval data
FST-gated pipeline	LLM generates → finite-state transducer validates morphology → retry
Chained models	Model A drafts → Model B post-edits → Model C scores
Dictionary + LLM	Force known terms from a dictionary, let the LLM handle the rest
Evolutionary	Generate candidates, score them, mutate the best, repeat
Partial translation	Translate a sample by hand, prove your LLM matches, auto-translate the rest

Fine-tune models. Deploy evolutionary algorithms. Test student answers on language exams. Build lookup tables. Chain three models together. As long as your method produces JSON, the harness scores it and the framework runs it.

:::danger The one rule Do not train on the evaluation data. Methods exposed to the benchmark dataset will be disqualified. Fine-tune on whatever you want. Just not on the test set. :::

This is an open invitation. If you work with a low-resource language — as a researcher, a community member, a student, or just someone who cares — build a method, run the harness, and claim the top score. The problem is unsolved. The infrastructure is here.

→ View the leaderboard

Next Steps

Getting started:

Installation — Set up in 2 minutes
Quick Start — Run your first sync
Supported Languages — What's available out of the box

Customizing your setup:

Translation Methods — Choose the right method per pair
Translation Memory — How caching saves you money
Configuration — Full config reference
Hugo Multilingual Site — Markdown content translation

Going deeper:

Working with Professional Translators — XLIFF export/import workflow
Data Sovereignty — OCAP, CARE, and Māori Data Sovereignty principles
Support a Low-Resource Language — The challenge that started it all
Cookbook: FST-Gated Pipeline — Build a decomposition pipeline
MT Evaluation — How the harness and leaderboard work
Method Leaderboard — Live scores and submissions

Why Not Just Script It Yourself?​

What Makes It Different​

Every method is a plugin​

Prove it​

The full toolkit​

The Arena​

Next Steps​