Contributing to RubyLLM

Thank you for considering contributing to RubyLLM! We're aiming to build a high-quality, robust library, and thoughtful contributions are welcome.

Development Setup and Workflow

Getting started and contributing follows a typical GitHub-based workflow:

  1. Fork & Clone: Fork the repository to your own GitHub account and then clone it locally.
    gh repo fork crmne/ruby_llm --clone
    cd ruby_llm
  2. Install Dependencies:
    bundle install
  3. Set Up Git Hooks (required): Install the Overcommit hooks so the style checks run before each commit.
    overcommit --install
  4. Branch: Create a new branch for your feature or bugfix. If it relates to an existing issue, you can use the gh CLI to help:
    gh issue develop 123 --checkout # Substitute 123 with the relevant issue number
  5. Code & Test: Make your changes and cover them with tests (see the "Running Tests" section below).
  6. Commit: Write clear and concise commit messages.
  7. Pull Request: Create a Pull Request (PR) against the main branch of the crmne/ruby_llm repository.
    • Thoroughly review your own PR before submitting. Check for any "vibe coding" – unnecessary files, experimental code that doesn't belong, or incomplete work.
    • Write a clear and detailed PR description explaining the "what" and "why" of your changes. Link to any relevant issues.
    • Poorly written or vibe-coded PRs with minimal descriptions will likely be closed or receive extensive review comments, which slows things down for everyone. Follow the existing conventions of RubyLLM and aim for quality.
    gh pr create --web

Model Registry (models.json) & Aliases (aliases.json)

These files are critical for how RubyLLM identifies and uses AI models. Both are auto-generated by rake tasks. Do not edit them manually or include manual changes to them in PRs.

models.json: The Model Catalog

  • How it's made: The rake models:update task builds this file. It fetches model data directly from configured provider APIs (processing these details via each provider's capabilities.rb file) and also from the Parsera LLM Specs API. These lists are then merged, with Parsera's standardized data generally taking precedence for common models, augmented by provider-specific metadata. Models unique to a provider's API (and not yet in Parsera) are also included.
  • Updating Model Information:
    • Incorrect public specs (pricing, context size, etc.)? Parsera scrapes public provider documentation. If data for a publicly documented model is wrong or missing on Parsera, please file an issue with Parsera. Once they update, rake models:update will fetch the corrections.
    • Models not in public docs / Provider-specifics: If a model isn't well-documented publicly by the provider (e.g., older or preview models) or needs provider-specific handling within RubyLLM, update the relevant lib/ruby_llm/providers/<provider>/capabilities.rb and potentially models.rb. Then run bundle exec rake models:update.
    • New Provider Support: This involves more in-depth work to create the provider-specific modules and ensure integration with the models:update task.
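
To make the merge order described above concrete, here is a minimal sketch of combining a provider list with a Parsera list, where Parsera's values win for shared models and provider-only models are kept. This is not the code behind rake models:update: the method name, hash keys, and sample model IDs are all hypothetical and chosen only for illustration.

```ruby
# Hypothetical sketch: none of these names come from RubyLLM itself.
# Merges two model lists keyed by model ID. For a model known to both sources,
# Parsera's fields win on conflict and provider-only fields are kept as extra
# metadata. Models known only to the provider are included unchanged.
def merge_model_lists(provider_models, parsera_models)
  provider_by_id = provider_models.to_h { |model| [model[:id], model] }
  parsera_by_id  = parsera_models.to_h { |model| [model[:id], model] }

  (provider_by_id.keys | parsera_by_id.keys).sort.map do |id|
    provider = provider_by_id.fetch(id, {})
    parsera  = parsera_by_id.fetch(id, {})
    # Hash#merge keeps the argument's value on key conflicts, so Parsera wins.
    provider.merge(parsera)
  end
end

# Made-up sample data (assumption, not real catalog entries).
provider_models = [
  { id: "model-a", context_window: 8_000, supports_vision: true },
  { id: "model-preview", context_window: 4_000 }
]
parsera_models = [
  { id: "model-a", context_window: 16_000, input_price_per_million: 1.0 }
]

merged = merge_model_lists(provider_models, parsera_models)
merged.each { |model| puts model.inspect }
```

In this sketch, "model-a" ends up with Parsera's context window plus the provider's vision flag, while "model-preview" passes through untouched because only the provider lists it.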

aliases.json: User-Friendly Shortcuts

  • Purpose: Maps common names (e.g., claude-3-5-sonnet) to precise, versioned model IDs.
  • How it's made: Generated by rake aliases:generate using the current models.json. Run this task after models.json is updated.
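
As an illustration of what an alias map does, the sketch below derives short names from versioned IDs and resolves them. It assumes that versioned IDs end in a -YYYYMMDD suffix and that the newest date should win. The actual rules used by rake aliases:generate and the real layout of aliases.json may differ, and the method names and model IDs here are hypothetical.

```ruby
# Hypothetical sketch; the real aliases:generate task may work differently.
# Assumption: a versioned model ID ends in a -YYYYMMDD date suffix.
DATE_SUFFIX = /-(\d{8})\z/

# Maps each base name to its most recently dated versioned ID.
def build_aliases(model_ids)
  dated = model_ids.select { |id| id.match?(DATE_SUFFIX) }
  dated.group_by { |id| id.sub(DATE_SUFFIX, "") }
       .transform_values { |ids| ids.max_by { |id| id[DATE_SUFFIX, 1] } }
end

# Returns the versioned ID for an alias, or the name itself if no alias exists.
def resolve(aliases, name)
  aliases.fetch(name, name)
end

# Made-up IDs for illustration only.
model_ids = ["example-model-20240101", "example-model-20240601", "other-model"]
aliases = build_aliases(model_ids)

puts resolve(aliases, "example-model")
puts resolve(aliases, "other-model")
```

Falling back to the given name in resolve means callers can pass either an alias or a full ID without checking which one they have.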

Running Tests

Tests are crucial. We use RSpec and VCR.

# Run all tests (uses existing VCR cassettes)
bundle exec rspec

# Run a specific test file
bundle exec rspec spec/ruby_llm/chat_spec.rb

# To re-record a specific test's cassette, first remove its .yml file:
rm spec/fixtures/vcr_cassettes/chat_vision_models_*_can_understand_local_images.yml # Adjust file name as needed
# Then run the specific test or test file that uses this cassette.

# Run a specific test by its description string (or part of it)
bundle exec rspec -e "can understand local images"

Testing Philosophy & VCR

  • New tests should generally be end-to-end to verify integration with actual provider APIs (via VCR).
  • Keep tests minimal and focused. We don't need to test every single model variant for every feature if the underlying API mechanism is the same. One or two representative models per provider for a given feature is usually sufficient.
  • API Call Costs: VCR cassettes are used to avoid hitting live APIs on every test run. However, recording these cassettes costs real money for API calls. Please be mindful of this when adding tests that would require new recordings. If you're adding extensive tests that significantly increase API usage for VCR recording, consider sponsoring the project on GitHub to help offset these costs.

Recording VCR Cassettes

If your changes affect API interactions, you'll need to re-record the VCR cassettes.

To re-record cassettes for specific providers (e.g., OpenAI and Anthropic):

# Set necessary API keys as environment variables
export OPENAI_API_KEY="your_openai_key"
export ANTHROPIC_API_KEY="your_anthropic_key"

# Run the rake task, specifying providers
bundle exec rake vcr:record[openai,anthropic]

To re-record all cassettes (requires all relevant API keys to be set):

bundle exec rake vcr:record[all]

The rake task will delete the relevant existing cassettes and re-run the tests to record fresh interactions.

CRITICAL: After recording new or updated VCR cassettes, manually inspect the YAML files in spec/fixtures/vcr_cassettes/. Ensure that no sensitive information (API keys, personal data, etc.) has accidentally been recorded. The VCR configuration has filters for common keys, but diligence is required.
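
A small script can serve as a first pass before the manual inspection, though it does not replace reading the files. The sketch below is not part of RubyLLM. The two patterns are assumptions about what a leaked credential might look like, so extend them for the providers you recorded against.

```ruby
# Hypothetical helper, not part of RubyLLM: flags lines in recorded cassettes
# that look like unfiltered credentials. The patterns are illustrative guesses.
SUSPICIOUS_PATTERNS = {
  "secret key (sk-...)" => /sk-[A-Za-z0-9_-]{20,}/,
  "bearer token"        => /Bearer\s+[A-Za-z0-9._-]{20,}/
}.freeze

# Returns one { line:, label: } hash per pattern match in the given text.
def suspicious_lines(text)
  text.each_line.with_index(1).flat_map do |line, number|
    SUSPICIOUS_PATTERNS.select { |_label, pattern| line.match?(pattern) }
                       .keys
                       .map { |label| { line: number, label: label } }
  end
end

# Scans every .yml file under dir and returns { path => hits } for files with hits.
def scan_cassettes(dir)
  Dir.glob(File.join(dir, "**", "*.yml")).sort.each_with_object({}) do |path, report|
    hits = suspicious_lines(File.read(path))
    report[path] = hits unless hits.empty?
  end
end

if $PROGRAM_NAME == __FILE__
  scan_cassettes("spec/fixtures/vcr_cassettes").each do |path, hits|
    hits.each { |hit| puts "#{path}:#{hit[:line]} looks like a #{hit[:label]}" }
  end
end
```

A clean run only means these particular patterns found nothing. Personal data in prompts or responses still needs a human read.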

Coding Style

We follow the Standard Ruby style guide.

# Check your code style
bundle exec rubocop

# Auto-fix style issues where possible
bundle exec rubocop -A

The Overcommit pre-commit hook should help enforce this.

Documentation

If you add new features or change existing behavior, please update the documentation:

  • Update relevant guides in the docs/guides/ directory.
  • Ensure the README.md remains a concise and helpful entry point for new users.

Release Process

Gem versioning follows Semantic Versioning (SemVer):

  1. MAJOR version for incompatible API changes.
  2. MINOR version for adding functionality in a backward-compatible manner.
  3. PATCH version for backward-compatible bug fixes.
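
The three rules can be expressed as a small function. This is only a sketch of the SemVer arithmetic, with made-up version numbers and a hypothetical bump method. It says nothing about how the maintainers' release tooling works.

```ruby
# Illustrative only: bumps a MAJOR.MINOR.PATCH string for a given kind of change.
def bump(version, change)
  major, minor, patch = version.split(".").map { |part| Integer(part, 10) }
  case change
  when :major then "#{major + 1}.0.0"               # incompatible API change
  when :minor then "#{major}.#{minor + 1}.0"        # backward-compatible feature
  when :patch then "#{major}.#{minor}.#{patch + 1}" # backward-compatible bug fix
  else raise ArgumentError, "unknown change type: #{change.inspect}"
  end
end

puts bump("1.4.2", :patch) # => 1.4.3
puts bump("1.4.2", :minor) # => 1.5.0
puts bump("1.4.2", :major) # => 2.0.0
```

Note that a MINOR bump resets PATCH to zero, and a MAJOR bump resets both MINOR and PATCH.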

Releases are handled by the maintainers through the CI/CD pipeline.


Thanks for contributing to RubyLLM,

Carmine