MultiSynt Nemotron-CC Models by Language

Models follow the naming convention MultiSynt/nemotron-cc-{language}-{translation-model}.

Language	tower9b	tower72b	opus	include	belebele	flores	others
Basque	✗	✗	✓	✓	✓	✗
Catalan	✗	✗	✓	✗	✓	✗	copa_ca
Danish	✓	✗	✓	✗	✓	✓
Dutch	✓	✗	✓	✓	✓	✓
Finnish	✓	✓	✓	✓	✓	✓
French	✓	✗	✓	✓	✓	✓
German	✓	✓	✓	✓	✓	✓
Icelandic	✓	✗	✓	✗	✗	✗	icelandic_winogrande
Italian	✓	✓	✓	✓	✓	✓
Norwegian	✓	✗	✗	✗	✓	✗
Polish	✓	✗	✓	✓	✓	✓
Portuguese	✗	✗	✓	✓	✓	✓
Romanian	✓	✗	✓	✗	✓	✓
Spanish	✓	✓	✓	✓	✓	✓
Swedish	✓	✓	✓	✗	✓	✓

Analysis

Unlike HPLT2 monolingual models, no model has intermediate checkpoints available.

Model coverage

Full coverage (all 3 models): Finnish, German, Italian, Spanish, Swedish
Missing tower72b only: Danish, Dutch, French, Icelandic, Polish, Romanian
Missing tower9b + tower72b: Basque, Catalan, Portuguese
Missing opus only: Norwegian
Norwegian is the only language missing opus while having tower9b.

Benchmark coverage

include: Basque, Dutch, Finnish, French, German, Italian, Polish, Portuguese, Spanish (9/15 languages)
belebele: Basque, Catalan, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Polish, Portuguese, Romanian, Spanish, Swedish (14/15 languages)
flores: Danish, Dutch, Finnish, French, German, Italian, Polish, Portuguese, Romanian, Spanish, Swedish (11/15 languages — Catalan, Basque, Icelandic, Norwegian pending flores_plus migration, see flores_migration.md)
others: Catalan (copa_ca), Icelandic (icelandic_winogrande)
No benchmark coverage: (none)

Steps:

evaluate multisynt models with tower72b > tower9B > opus on all languages
evaluate hplt models on all languages
add missing evaluation noreval etc in oellm-cli

Generate Evaluations

1. Generate the evaluation list

The script generate_eval_list.py produces multisynt_evals.csv, a flat CSV with one row per (model, task) pair, ready to be passed to oellm-cli.

The core of the script is LANGUAGE_TASKS, a dictionary mapping each language to its list of individual evaluation tasks with their n-shot settings and eval suite. The tasks are chosen based on the benchmark coverage table above — for example, languages present in belebele get belebele_{lang}_Latn, languages in flores get the corresponding flores200: tasks, and so on. Languages with no benchmark coverage (Catalan, Icelandic, Norwegian) have an empty task list. For each model, the best available translation model is selected following the priority tower72b > tower9b > opus.

To regenerate the CSV:

uv run generate_eval_list.py

This writes multisynt_evals.csv with columns: model_path, task_path, n_shot, eval_suite.

2. Schedule evaluations on a cluster

Copy the CSV to the cluster (e.g. LUMI):

scp multisynt_evals.csv lumi:

Then on the cluster, schedule all evaluations with oellm-cli:

oellm schedule-eval --eval_csv_path multisynt_evals.csv

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
analyse-results		analyse-results
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
generate_eval_list.py		generate_eval_list.py
multisynt_evals.csv		multisynt_evals.csv
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MultiSynt Nemotron-CC Models by Language

Analysis

Model coverage

Benchmark coverage

Generate Evaluations

1. Generate the evaluation list

2. Schedule evaluations on a cluster

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MultiSynt Nemotron-CC Models by Language

Analysis

Model coverage

Benchmark coverage

Generate Evaluations

1. Generate the evaluation list

2. Schedule evaluations on a cluster

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages