
feat: add LiteLLM as AI gateway model backend #2441

Open
RheagalFire wants to merge 3 commits into open-compass:main from RheagalFire:feat/add-litellm-provider

RheagalFire commented Apr 22, 2026

Motivation

OpenCompass currently requires a separate provider file for each LLM backend. Users who want to evaluate models across Azure, Bedrock, Vertex AI, or Groq need provider-specific code. LiteLLM provides a unified completion() interface that handles auth, formatting, and provider-specific quirks, enabling cross-provider evaluation with a single configuration change.

Supersedes stale PR #202 (ishaan-jaff, Aug 2023). Both review points from @gaotongxiao are addressed: litellm is an optional dependency in requirements/api.txt (not core), and it is imported lazily inside _generate().

Also addresses provider flexibility mentioned in #2147 -- AI/ML API is accessible via LiteLLM as aiml/<model>.

Modification

  • opencompass/models/litellm_api.py -- new LiteLLMAPI extending BaseAPIModel (236 lines). Handles all 3 input formats: plain str, CHATML-shaped dicts, and OpenCompass-native PromptList (HUMAN/BOT/SYSTEM).
  • opencompass/models/__init__.py -- import + registration in alphabetical order
  • requirements/api.txt -- added litellm>=1.55,<1.85
  • docs/en/user_guides/models.md -- added LiteLLM to supported API providers list
  • docs/zh_cn/user_guides/models.md -- added LiteLLM to supported API providers list (Chinese)
  • tests/models/test_litellm_api.py -- 18 unit tests covering init, message translation, generation, retry, error handling, registry
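
The three input formats listed above can be normalised into one chat-message list. The following is a minimal sketch, not the code in litellm_api.py: the helper name `to_chat_messages`, the `ROLE_MAP` table, and the assumption that OpenCompass-native items carry `role` and `prompt` keys while CHATML-shaped items carry `role` and `content` are all illustrative.

```python
from typing import Dict, List, Optional, Union

# Assumed mapping from OpenCompass roles to chat-completion roles.
ROLE_MAP = {'HUMAN': 'user', 'BOT': 'assistant', 'SYSTEM': 'system'}

PromptType = Union[str, List[Dict[str, str]]]


def to_chat_messages(prompt: PromptType,
                     system_prompt: Optional[str] = None
                     ) -> List[Dict[str, str]]:
    """Hypothetical helper: turn any supported prompt into chat messages."""
    if isinstance(prompt, str):
        # Plain string: a single user turn.
        messages = [{'role': 'user', 'content': prompt}]
    else:
        messages = []
        for item in prompt:
            if 'content' in item:
                # CHATML-shaped dict: pass through unchanged.
                messages.append({'role': item['role'],
                                 'content': item['content']})
            else:
                # OpenCompass-native item, assumed to look like
                # {'role': 'HUMAN', 'prompt': '...'}.
                messages.append({'role': ROLE_MAP[item['role']],
                                 'content': item['prompt']})
    # Prepend the system prompt only when no system message exists yet.
    if system_prompt and not any(m['role'] == 'system' for m in messages):
        messages.insert(0, {'role': 'system', 'content': system_prompt})
    return messages
```

The final step mirrors the behaviour implied by the test names test_system_prompt_prepended_when_absent and test_system_prompt_not_duplicated_when_system_exists.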

Key details:

  • drop_params=True by default -- silently drops provider-unsupported kwargs for cross-provider compatibility
  • Lazy import litellm inside _generate() -- base install unaffected
  • Flexible auth: accepts key= param or provider-specific env vars (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
  • extra_body dict for forwarding provider-specific params (reasoning_effort, seed, etc.)
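
One way to combine these details when assembling the arguments for `litellm.completion()` is sketched below. The function name `build_call_kwargs` is hypothetical, and spreading `extra_body` into top-level keyword arguments is an assumption about how the PR forwards it; the keyword names `model`, `messages`, `temperature`, `max_tokens`, `api_key`, and `drop_params` are ones `litellm.completion()` accepts.

```python
from typing import Any, Dict, List, Optional


def build_call_kwargs(path: str,
                      messages: List[Dict[str, str]],
                      max_out_len: int,
                      temperature: float = 0.0,
                      key: Optional[str] = None,
                      drop_params: bool = True,
                      extra_body: Optional[Dict[str, Any]] = None
                      ) -> Dict[str, Any]:
    """Hypothetical helper: assemble kwargs for litellm.completion()."""
    # Start from the extras so the core values written next always win.
    kwargs: Dict[str, Any] = dict(extra_body or {})
    kwargs.update(
        model=path,
        messages=messages,
        temperature=temperature,
        max_tokens=max_out_len,
        drop_params=drop_params,
    )
    # Omit api_key when blank so LiteLLM falls back to provider env vars
    # such as OPENAI_API_KEY or ANTHROPIC_API_KEY.
    if key:
        kwargs['api_key'] = key
    return kwargs
```

Writing the core parameters after the extras is one simple way to get the behaviour that the test name test_extra_body_cannot_override_core_params describes.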

BC-breaking

None. Additive only -- existing providers untouched. litellm is in requirements/api.txt (optional extra), not in the base requirements.txt.
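
A lazy import is what keeps the base install unaffected: litellm is only looked up when a request is actually made. The sketch below illustrates the pattern under that assumption; the helper name `load_litellm` and the error message are hypothetical, not taken from the PR.

```python
import importlib


def load_litellm():
    """Hypothetical helper: import litellm only when it is first needed."""
    try:
        return importlib.import_module('litellm')
    except ImportError as exc:
        # Fail with an actionable message instead of a bare import error.
        raise ImportError(
            'LiteLLMAPI requires the optional dependency litellm. '
            'Install it with `pip install litellm`.') from exc
```

Importing the module that defines such a helper succeeds without litellm installed; the error surfaces only when generation is attempted.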

Usage and Testing

from opencompass.models import LiteLLMAPI

models = [
    dict(
        type=LiteLLMAPI,
        path='anthropic/claude-sonnet-4-20250514',  # any LiteLLM model string
        temperature=0,
        max_seq_len=16384,
        query_per_second=2,
        retry=2,
    ),
]

Supported model strings include:

  • OpenAI: gpt-4o, gpt-4o-mini
  • Anthropic: anthropic/claude-sonnet-4-20250514
  • Azure: azure/gpt-4o
  • Bedrock: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0
  • Vertex AI: vertex_ai/gemini-2.5-pro
  • Together AI: together_ai/meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo
  • Groq: groq/meta-llama/llama-4-scout-17b-16e-instruct
  • 100+ more at https://docs.litellm.ai/docs/providers
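
Because only the model string changes between providers, a multi-provider config can be generated from a list of paths. This is a sketch: the helper `make_models` is hypothetical, and `type` is given as a string only to keep the snippet self-contained. In a real OpenCompass config you would pass the imported `LiteLLMAPI` class, as in the usage example above.

```python
# Shared settings, copied from the single-model usage example.
COMMON = dict(temperature=0, max_seq_len=16384, query_per_second=2, retry=2)

# Model strings taken from the supported list above.
PATHS = [
    'gpt-4o',
    'anthropic/claude-sonnet-4-20250514',
    'groq/meta-llama/llama-4-scout-17b-16e-instruct',
]


def make_models(paths, model_type='LiteLLMAPI'):
    """Hypothetical helper: one config dict per LiteLLM model string."""
    return [dict(type=model_type, path=path, **COMMON) for path in paths]


models = make_models(PATHS)
```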

Unit tests (18/18 pass):

  tests/models/test_litellm_api.py::TestLiteLLMAPIInit::test_call_kwargs_forwards_provider_credentials PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIInit::test_call_kwargs_omits_optional_when_blank PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIInit::test_default_init_stores_path PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIInit::test_drop_params_default_true PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIInit::test_extra_body_cannot_override_core_params PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIInit::test_registers_in_models_registry PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIMessages::test_chatml_shaped_prompt_list_passes_through PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIMessages::test_opencompass_native_prompt_list PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIMessages::test_plain_string_input PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIMessages::test_system_prompt_not_duplicated_when_system_exists PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIMessages::test_system_prompt_prepended_when_absent PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIGenerate::test_generate_handles_none_content PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIGenerate::test_generate_preserves_batch_order PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIGenerate::test_generate_raises_after_exhausting_retries PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIGenerate::test_generate_raises_import_error_without_litellm PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIGenerate::test_generate_retries_then_succeeds PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIGenerate::test_generate_single_string PASSED
  tests/models/test_litellm_api.py::TestLiteLLMAPIGenerate::test_generate_translates_opencompass_native_prompt_list PASSED
  ======================== 18 passed in 4.71s ========================
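
The retry and None-content tests listed above point at a request loop along the following lines. This is only a sketch of the pattern, not the PR's implementation: the name `call_with_retry`, the reading of `retry=2` as two extra attempts after the first, the empty-string fallback for None content, and the RuntimeError raised at the end are all assumptions.

```python
import time
from typing import Callable, Optional


def call_with_retry(send: Callable[[], Optional[str]],
                    retry: int = 2,
                    backoff_s: float = 0.0) -> str:
    """Hypothetical helper: call send() until it succeeds or retries run out."""
    last_exc: Optional[Exception] = None
    for attempt in range(retry + 1):
        try:
            content = send()
            # Some providers return None content; normalise it to ''.
            return content if content is not None else ''
        except Exception as exc:  # broad on purpose in this sketch
            last_exc = exc
            if attempt < retry:
                # Linear backoff before the next attempt.
                time.sleep(backoff_s * (attempt + 1))
    raise RuntimeError(
        f'request failed after {retry + 1} attempts') from last_exc
```

In this sketch `send` would wrap a single `litellm.completion()` call and return the text of the first choice.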
                                                                                                                                                                                                                                                                                                                                                                          
Live E2E against Anthropic Claude Sonnet 4-6 (via Azure):

  ============================================================
  Live E2E Test: opencompass LiteLLMAPI
  Model: anthropic/claude-sonnet-4-6
  ============================================================

  [Test 1] Plain string input
    Answer: 4

  [Test 2] OpenCompass PromptList (HUMAN/BOT roles)
    Answer: The square root of 144 is 12.

  [Test 3] Batch generation (3 inputs)
    [0] Hello!
    [1] Goodbye!
    [2] Thanks!

  ============================================================
  All 3 tests passed
  ============================================================

Lint: flake8 --max-line-length 99 + isort --check -> all clean.

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
  • CLA has been signed and all committers have signed the CLA in this PR.

@RheagalFire
Author

cc @tonysy @gaotongxiao
