AI skill for generating validated Moodle CodeRunner programming questions
Generate auto-graded programming questions for Moodle's CodeRunner plugin using a minimal pipeline: the AI writes the code and Jobe computes the expected output. Every question is compiled and tested on a Jobe server before delivery. Supports Java, Python, C, C++, and Node.js across 10 question types.
Battle-tested: 1000+ questions generated and validated with a 100% pass rate.
# Claude Code CLI
mkdir -p ~/.claude/skills
git clone https://github.com/danielcregg/moodle-coderunner.git ~/.claude/skills/moodle-coderunner
# Use it
/moodle-coderunner
Generate 5 Java CodeRunner questions on arrays for first-year students
For Claude Desktop: Settings > Skills > Upload SKILL.md.
Generates complete, import-ready Moodle XML with:
| Component | Purpose |
|---|---|
| Question text | Clear instructions with examples and penalty regime |
| Answer preload | Compilable skeleton code for students |
| Model solution | Complete correct answer (hidden) |
| Visible test(s) | Students see expected input/output |
| Hidden test(s) | Prevents hard-coding, tests edge cases |
| General feedback | Explanation shown after quiz closes |
| Jobe validation | Every question compiled and tested before delivery |
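To make the component table concrete, here is a rough sketch of how those pieces might map onto a question element. The element names (`questiontext`, `generalfeedback`, `coderunnertype`, `answer`, `answerpreload`, `testcases`, `display`) are assumptions based on what a CodeRunner XML export typically contains, and `build_question` and `text_child` are hypothetical helpers, not part of this skill. The result is a skeleton for illustration and omits fields a real import needs.

```python
import xml.etree.ElementTree as ET


def text_child(parent, tag, text, **attrs):
    """Add <tag><text>...</text></tag>, the wrapper Moodle XML uses for many fields."""
    node = ET.SubElement(parent, tag, attrs)
    ET.SubElement(node, "text").text = text
    return node


def build_question(name, qtype, prompt, preload, solution, feedback, tests):
    """Assemble an illustrative question skeleton (element names are assumptions)."""
    q = ET.Element("question", {"type": "coderunner"})
    text_child(q, "name", name)
    text_child(q, "questiontext", prompt, format="html")
    text_child(q, "generalfeedback", feedback, format="html")
    ET.SubElement(q, "coderunnertype").text = qtype
    ET.SubElement(q, "answer").text = solution          # model solution (hidden)
    ET.SubElement(q, "answerpreload").text = preload    # skeleton shown to students
    cases = ET.SubElement(q, "testcases")
    for t in tests:
        visible = t["visible"]
        case = ET.SubElement(cases, "testcase", {"useasexample": "1" if visible else "0"})
        text_child(case, "testcode", t["code"])
        text_child(case, "expected", t["expected"])
        text_child(case, "display", "SHOW" if visible else "HIDE")
    return q


question = build_question(
    name="Sum of a list",
    qtype="python3",
    prompt="<p>Write a function total(xs) that returns the sum of xs.</p>",
    preload="def total(xs):\n    pass\n",
    solution="def total(xs):\n    return sum(xs)\n",
    feedback="<p>The built-in sum() handles the empty list.</p>",
    tests=[
        {"code": "print(total([1, 2, 3]))", "expected": "6", "visible": True},
        {"code": "print(total([]))", "expected": "0", "visible": False},
    ],
)
print(ET.tostring(question, encoding="unicode")[:60])
```

Comparing the output against a question exported from your own Moodle instance is the safest way to confirm the exact element set.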
Previous versions generated questions but couldn't verify they actually worked. v4.0 introduces:
- Jobe server validation — every question is compiled and executed against test cases before delivery
- 23 battle-tested rules — derived from analysing 370+ questions across 7 AI models
- Self-healing loop — if validation fails, the AI fixes the question and retries
- 10 question types — function, class, method, and program types for all languages
- Advanced features — partial credit, precheck, give-up, conditional display, time/memory limits
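The self-healing loop can be pictured as a bounded retry loop. This is a minimal sketch, not the skill's actual implementation: `generate_with_retries`, `generate`, and `validate` are hypothetical names, and the limit of three attempts is an assumption.

```python
from typing import Callable, Optional, Tuple


def generate_with_retries(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Regenerate a question until validation passes or attempts run out.

    generate(feedback) returns question XML; feedback is None on the first try
    and the previous validation report afterwards.
    validate(xml) returns (passed, report).
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        xml = generate(feedback)
        passed, report = validate(xml)
        if passed:
            return xml, attempt
        feedback = report  # hand the failure details back to the generator
    return None, max_attempts


# Stand-ins for the AI and for Jobe, only to show the control flow.
def fake_generate(feedback):
    return "<broken/>" if feedback is None else "<fixed/>"


def fake_validate(xml):
    ok = xml == "<fixed/>"
    return ok, "" if ok else "WRONG_OUTPUT"


result = generate_with_retries(fake_generate, fake_validate)
print(result)  # ('<fixed/>', 2)
```

Capping the attempts keeps a question that cannot be repaired from looping forever; the caller sees `None` and can drop or flag it.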
We tested the same 12 questions across 7 models. The skill rules are the difference:
| Model | Size | Pass Rate |
|---|---|---|
| Claude (with skill) | Cloud | 100% |
| GPT-5.4 (with rules) | Cloud | 100% |
| z.ai GLM-5.1 | Cloud | 17% |
| Gemma 4 E4B | 4B | 16% |
| Gemma 4 26B | 26B MoE | 0% |
| DeepSeek Coder V2 | 16B MoE | 0% |
| Qwen 3 8B | 8B | 0% |
Open-source models (7-26B) cannot reliably generate CodeRunner XML even with templates. The rules are necessary but only effective with frontier models.
| Language | Types | Reliability |
|---|---|---|
| Java | java_class, java_method, java_program | HIGH / HIGH / LOW |
| Python | python3, python3_w_input | HIGH / MEDIUM |
| C | c_function, c_program | HIGH / MEDIUM |
| C++ | cpp_function, cpp_program | MEDIUM / MEDIUM |
| Node.js | nodejs | HIGH |
Reliability reflects first-attempt pass rates. LOW types (java_program) have known EOF pitfalls but the skill's rules handle them.
The skill contains 23 rules. Here are the most critical:
| # | Rule | Prevents |
|---|---|---|
| 1 | Java Scanner: always hasNext() before reads | NoSuchElementException (30% of Java failures) |
| 2 | Python stdin: use sys.stdin.read() not input() | EOFError on empty input |
| 3 | C/C++: Jobe uses -Werror (all warnings fatal) | Compilation failures |
| 4 | C++: size_t not int for .size() loops | -Werror=sign-compare |
| 5 | C/C++: #include in <answer>, not just test code | -Werror=implicit-function-declaration |
| 6 | Array printing: if(i) printf(" ") pattern | Trailing space output mismatch |
| 7 | No exact floating-point comparisons | Rounding differences |
| 8 | Trace every test case mentally | Wrong expected output (20% of failures) |
See SKILL.md Section 2 for the complete per-language rule tables.
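As an illustration, rules 2, 6, and 7 might look like this in a Python model solution. The helper names are hypothetical, and reading rule 7 as "print a fixed number of decimals" is an assumption about how the rule is applied rather than a quote from SKILL.md.

```python
import io
import sys


def read_ints(stream=None):
    """Rule 2: read all of stdin at once, so empty input gives [] instead of EOFError."""
    stream = sys.stdin if stream is None else stream
    return [int(token) for token in stream.read().split()]


def format_row(values):
    """Rule 6: join with single spaces so the line never ends in a trailing space."""
    return " ".join(str(v) for v in values)


def format_average(values):
    """Rule 7: emit a fixed number of decimals rather than a raw float."""
    if not values:
        return "0.00"
    return f"{sum(values) / len(values):.2f}"


# io.StringIO stands in for the stdin that Jobe would supply.
demo = io.StringIO("3 1 2\n")
print(format_row(sorted(read_ints(demo))))  # prints "1 2 3"
```

`" ".join(...)` is the Python counterpart of the `if(i) printf(" ")` pattern from the table: the separator only ever appears between elements.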
The skill supports all major CodeRunner options:
| Feature | XML Tag | Effect |
|---|---|---|
| Partial credit | <allornothing>0</allornothing> | Weighted marks per test |
| Precheck | <precheck>1</precheck> | Students can test before submitting |
| Give up | <giveupallowed>1</giveupallowed> | Students can reveal answer |
| Time limit | <cputimelimitsecs>2</cputimelimitsecs> | Enforce CPU time limit |
| Memory limit | <memlimitmb>64</memlimitmb> | Enforce memory limit |
| Progressive tests | hiderestiffail="1" | Hide remaining tests on failure |
| Conditional display | HIDE_IF_FAIL, HIDE_IF_SUCCEED | Show/hide based on result |
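For readers who post-process the generated XML themselves, the tags in the table could be set with the standard library along these lines. `apply_options` is a hypothetical helper, the values are the ones shown in the table, and writing `1` to `allornothing` when partial credit is off is an assumption.

```python
import xml.etree.ElementTree as ET


def apply_options(question, *, partial_credit=False, precheck=False,
                  give_up=False, cpu_secs=None, mem_mb=None):
    """Set the option tags from the table on a <question> element, in place."""

    def set_child(tag, value):
        # Reuse the tag if it already exists, otherwise create it.
        node = question.find(tag)
        if node is None:
            node = ET.SubElement(question, tag)
        node.text = str(value)

    set_child("allornothing", 0 if partial_credit else 1)
    if precheck:
        set_child("precheck", 1)
    if give_up:
        set_child("giveupallowed", 1)
    if cpu_secs is not None:
        set_child("cputimelimitsecs", cpu_secs)
    if mem_mb is not None:
        set_child("memlimitmb", mem_mb)
    return question


q = ET.Element("question", {"type": "coderunner"})
apply_options(q, partial_credit=True, precheck=True, cpu_secs=2, mem_mb=64)
print(ET.tostring(q, encoding="unicode"))
```

Options that are not requested are left out, so CodeRunner's own defaults apply to them.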
The repo includes validate_coderunner.py — a standalone Python script (stdlib only) that validates CodeRunner XML against a Jobe server:
python3 validate_coderunner.py questions.xml --jobe-url http://your-jobe-server:4000
- Parses Moodle XML, constructs full source per question type
- Submits to Jobe for compilation and execution
- Compares actual output against expected
- Reports PASS/FAIL with error details (COMPILE_ERROR, WRONG_OUTPUT, RUNTIME_ERROR)
- Supports all 10 question types
- No pip dependencies required
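The submit-and-compare steps can be sketched with the standard library alone. This is not the code in validate_coderunner.py: the function names are hypothetical, the `/jobe/index.php/restapi/runs` path assumes a default Jobe deployment, and the outcome codes (11 for a compile error, 15 for a successful run) should be checked against the Jobe REST API documentation for your server version.

```python
import json
import urllib.request

OUTCOME_COMPILE_ERROR = 11  # assumed Jobe outcome code for a failed compile
OUTCOME_OK = 15             # assumed Jobe outcome code for a normal run


def build_run_spec(language_id, source, filename, stdin=""):
    """Build the JSON body for one run request."""
    return {
        "run_spec": {
            "language_id": language_id,
            "sourcecode": source,
            "sourcefilename": filename,
            "input": stdin,
        }
    }


def submit(jobe_url, payload, timeout=30):
    """POST a run to Jobe and return the decoded JSON result (needs a live server)."""
    request = urllib.request.Request(
        jobe_url.rstrip("/") + "/jobe/index.php/restapi/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8",
                 "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def classify(result, expected):
    """Map a Jobe result onto the PASS/FAIL labels listed above."""
    outcome = result.get("outcome")
    if outcome == OUTCOME_COMPILE_ERROR:
        return "COMPILE_ERROR"
    if outcome != OUTCOME_OK:
        # Timeouts and memory-limit hits are lumped in here for simplicity.
        return "RUNTIME_ERROR"
    # Only trailing newlines are ignored, so a stray trailing space still fails.
    if result.get("stdout", "").rstrip("\n") != expected.rstrip("\n"):
        return "WRONG_OUTPUT"
    return "PASS"


payload = build_run_spec("python3", "print(6 * 7)", "test.py")
print(classify({"outcome": OUTCOME_OK, "stdout": "42\n"}, "42"))  # prints "PASS"
```

With a server available, `classify(submit("http://your-jobe-server:4000", payload), "42")` would run the same check end to end.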
/moodle-coderunner
Create 5 Java java_class questions on encapsulation (getters, setters, constructors)
for first-year students. All-or-nothing grading.
/moodle-coderunner
Generate 8 Python CodeRunner questions on list manipulation and string methods.
Mix of easy, medium, and hard difficulty. Validate on Jobe.
/moodle-coderunner
Create 3 C function questions with partial credit (weighted marks),
precheck enabled, and a 2-second CPU time limit. Include edge case tests.
mkdir -p ~/.claude/skills
git clone https://github.com/danielcregg/moodle-coderunner.git ~/.claude/skills/moodle-coderunner
- Download SKILL.md from this repo
- Open Claude Desktop > Settings > Skills > Upload skill
- Drag and drop SKILL.md
See the ChatGPT setup guide in the companion repo for instructions on creating a GPT with Jobe validation via Actions.
moodle-coderunner/
SKILL.md # The AI skill (v4.0 — 23 rules, 10 sections)
validate_coderunner.py # Standalone Jobe validation script
README.md # This file
LICENSE # MIT
| Platform | Supported | How |
|---|---|---|
| Claude Code (CLI) | Yes | Install as skill, invoke with /moodle-coderunner |
| Claude Desktop | Yes | Upload SKILL.md via Settings > Skills |
| ChatGPT (Custom GPT) | Yes | Use SKILL.md as GPT instructions + Jobe Action |
| Moodle 4.x + CodeRunner 4.x | Yes | Import generated XML via Question bank |
| Moodle 3.x + CodeRunner 3.x | Yes | Import generated XML via Question bank |
| Version | Date | Changes |
|---|---|---|
| 4.0.0 | 2026-04-07 | Complete rewrite. Jobe validation loop. 23 rules from 300+ validated questions. Per-language rule tables. Model comparison data. Validation script. Advanced features support (partial credit, precheck, give-up, time/memory limits). Optimised for AI consumption (341 lines, tables over prose). |
| 3.x | 2026-04-07 | Internal iterations. Stress-tested across 10 question types. Discovered stdin newline, -Werror, trailing space, and #include placement rules. |
| 2.0.0 | 2026-03-24 | Added Python support, full CodeRunner type list, review document workflow. Published to GitHub. |
| 1.0 | 2025-11-09 | Initial skill for Java CodeRunner questions. Critical <name> tag fix. |
| Project | Purpose |
|---|---|
| coderunner-question-forge | Web app for generating and validating CodeRunner question banks |
| moodle-coderunner-ai | AI tutor that gives feedback on student CodeRunner submissions |
| moodle-mcq | Claude skill for generating Moodle MCQ questions (GIFT/XML) |