feat: add MiniMax as alternative LLM provider for evaluation#177

Open
octo-patch wants to merge 1 commit into OpenDriveLab:main from octo-patch:feature/add-minimax-provider

Conversation

@octo-patch

Summary

  • Add MiniMax as an alternative LLM provider for GPT-score evaluation alongside OpenAI
  • Make GPTEvaluation class configurable with --provider and --model CLI arguments
  • Auto-detect provider from environment variables (MINIMAX_API_KEY / OPENAI_API_KEY)
  • Handle MiniMax-specific constraints: clamp temperature into (0.0, 1.0] and strip <think> reasoning tags from M2.7 model responses
  • Replace hardcoded API key placeholder with proper env var support
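The behavior described in the bullets above can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code: the helper names (`resolve_provider`, `clamp_temperature`, `strip_think_tags`) and the lower clamp bound of 0.01 are assumptions.

```python
import os
import re

def resolve_provider(explicit=None):
    """Pick a provider: an explicit --provider value wins, else detect from env vars."""
    if explicit:
        return explicit
    if os.environ.get("MINIMAX_API_KEY"):
        return "minimax"
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    raise RuntimeError("No API key found: set MINIMAX_API_KEY or OPENAI_API_KEY")

def clamp_temperature(t):
    """MiniMax rejects temperature <= 0 or > 1; clamp into (0.0, 1.0]."""
    return min(max(t, 0.01), 1.0)  # 0.01 as the lower bound is an assumption

def strip_think_tags(text):
    """Remove <think>...</think> reasoning blocks emitted by MiniMax M2.7 models."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
```

Auto-detection checking MiniMax first means setting only `MINIMAX_API_KEY` is enough to switch providers without passing `--provider`.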

Changes

| File | Change |
| --- | --- |
| `challenge/gpt_eval.py` | Multi-provider support, `PROVIDER_CONFIGS`, auto-detection, temperature clamping, think-tag stripping |
| `challenge/evaluation.py` | `--provider` and `--model` CLI arguments, passed through to `GPTEvaluation` |
| `challenge/tests/test_gpt_eval.py` | 31 unit tests covering provider resolution, temperature clamping, think-tag stripping, config validation |
| `challenge/tests/test_integration.py` | 3 MiniMax + 1 OpenAI live API integration tests (skipped if key unavailable) |
| `README.md` | LLM provider configuration documentation with usage examples |
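One plausible shape for the `PROVIDER_CONFIGS` table named in the changes above. The field names, the MiniMax base URL, and the default model IDs are all assumptions for illustration and should be checked against the actual diff and the MiniMax docs:

```python
# Hypothetical PROVIDER_CONFIGS layout; field values are illustrative.
PROVIDER_CONFIGS = {
    "openai": {
        "env_key": "OPENAI_API_KEY",
        "base_url": None,                  # use the OpenAI SDK default endpoint
        "default_model": "gpt-3.5-turbo",  # assumed default
        "strip_think_tags": False,
    },
    "minimax": {
        "env_key": "MINIMAX_API_KEY",
        "base_url": "https://api.minimax.chat/v1",  # assumed; verify against MiniMax docs
        "default_model": "MiniMax-M2.7",            # assumed model id
        "strip_think_tags": True,                   # M2.7 emits <think> blocks
    },
}

def get_config(provider):
    """Look up a provider config, failing loudly on unknown names."""
    try:
        return PROVIDER_CONFIGS[provider]
    except KeyError:
        raise ValueError(
            f"Unknown provider: {provider!r}; expected one of {sorted(PROVIDER_CONFIGS)}"
        )
```

Centralizing per-provider quirks in one dict keeps `GPTEvaluation` itself provider-agnostic.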

MiniMax API Reference

Usage

```shell
# Using MiniMax for evaluation
export MINIMAX_API_KEY="your-key"
python challenge/evaluation.py --root_path1 pred.json --root_path2 test.json --provider minimax

# Using OpenAI (default, backward compatible)
export OPENAI_API_KEY="your-key"
python challenge/evaluation.py --root_path1 pred.json --root_path2 test.json
```
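A sketch of how the `--provider` and `--model` arguments might be wired into `evaluation.py`. The flag names match the PR description; the help strings, defaults, and the `build_parser` helper are illustrative assumptions:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="GPT-score evaluation")
    parser.add_argument("--root_path1", required=True, help="Prediction JSON")
    parser.add_argument("--root_path2", required=True, help="Ground-truth JSON")
    parser.add_argument("--provider", choices=["openai", "minimax"], default=None,
                        help="LLM provider; auto-detected from env keys if omitted")
    parser.add_argument("--model", default=None,
                        help="Override the provider's default model")
    return parser
```

Defaulting both new flags to `None` is what preserves backward compatibility: existing invocations parse exactly as before, and auto-detection only kicks in when `--provider` is omitted.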

Test plan

  • 31 unit tests pass (pytest challenge/tests/test_gpt_eval.py)
  • 3 MiniMax integration tests pass with live API
  • Backward compatible — existing OpenAI usage unchanged when no new args are provided
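The "skipped if key unavailable" behavior for the live API tests can be expressed with a pytest marker along these lines (illustrative, not the PR's actual test code):

```python
import os
import pytest

# Reusable marker: skip live MiniMax tests when no key is configured.
requires_minimax = pytest.mark.skipif(
    not os.environ.get("MINIMAX_API_KEY"),
    reason="MINIMAX_API_KEY not set; skipping live MiniMax API test",
)

@requires_minimax
def test_minimax_live_eval():
    # A real integration test would call the MiniMax API here.
    ...
```

This lets CI without credentials stay green while developers with keys exercise the live path.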

- Make GPTEvaluation support configurable LLM providers (OpenAI, MiniMax)
- Add MINIMAX_API_KEY environment variable support with auto-detection
- Add temperature clamping for MiniMax API constraints
- Strip <think> reasoning tags from MiniMax M2.7 responses
- Add --provider and --model CLI arguments to evaluation.py
- Add 31 unit tests and 3 integration tests
- Update README with MiniMax evaluation provider documentation