ML Model Router

Route LLM requests to a fast model or a reasoning model based on complexity analysis.

Installation

pip install lilith-ml-model-router

Quick Start

from ml_model_router import ModelRouter, RoutingStrategy

router = ModelRouter(
    fast_model="ministral-3b-instruct",
    reasoning_model="ministral-14b-reasoning",
    strategy=RoutingStrategy.COMPLEXITY_BASED
)

# Async usage (call from inside a coroutine; bare `await` only works in an async REPL)
result = await router.select_model(
    message="Hey, are you free Saturday?",
    context_length=5,
    requires_reasoning=False
)

print(result.choice.value)  # "fast"
print(result.model_name)    # "ministral-3b-instruct"

# Sync usage
result = router.select_model_sync(
    message="Complex question requiring analysis?",
    context_length=10
)

Routing Strategies

COMPLEXITY_BASED (default)

Analyzes message complexity using multiple heuristics:

  • Message length: Messages > 200 chars prefer reasoning
  • Question count: Multiple questions prefer reasoning
  • Emotional keywords: Detected emotional content prefers reasoning
  • Context length: Conversations > 10 messages prefer reasoning
  • Complex indicators: Comparative/analytical language prefers reasoning
  • Simple patterns: Greetings and yes/no responses use the fast model
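
To make the heuristics above concrete, here is a minimal sketch of how such signals could be combined into one additive score. It is not the library's implementation: the names `sketch_score`, `SketchScore`, and `prefers_reasoning`, the keyword sets, the weights, and the cutoff are all assumptions chosen for illustration. Only the 200-character and 10-message thresholds come from the list above.

```python
import re
from dataclasses import dataclass

# Illustrative assumptions only: these keyword sets, weights, and names
# are NOT taken from ml_model_router.
EMOTIONAL = frozenset({"sad", "worried", "upset", "anxious", "love"})
COMPLEX = frozenset({"compare", "versus", "analyze", "trade-off", "why"})
SIMPLE = re.compile(
    r"^(hi|hey|hello|yes|no|ok|okay|thanks)\b[\s!.?]*$", re.IGNORECASE
)


@dataclass(frozen=True)
class SketchScore:
    is_simple: bool
    raw_score: float


def sketch_score(message: str, context_length: int = 0) -> SketchScore:
    """Combine the listed heuristics into a single additive score."""
    text = message.strip()
    if SIMPLE.match(text):
        # Greetings and yes/no replies short-circuit to the fast path.
        return SketchScore(is_simple=True, raw_score=0.0)

    words = set(re.findall(r"[a-z\-]+", text.lower()))
    score = 0.0
    if len(text) > 200:        # message length
        score += 0.3
    if text.count("?") >= 2:   # multiple questions
        score += 0.3
    if words & EMOTIONAL:      # emotional keywords
        score += 0.3
    if context_length > 10:    # long conversation
        score += 0.2
    if words & COMPLEX:        # comparative/analytical language
        score += 0.3
    return SketchScore(is_simple=False, raw_score=score)


def prefers_reasoning(score: SketchScore, cutoff: float = 0.3) -> bool:
    """Route to the reasoning model once any single signal fires."""
    return not score.is_simple and score.raw_score >= cutoff
```

Because the weights simply add up, a message that trips several signals can score above 1.0, which is one plausible reading of the "0.0-1.0+" range documented for `raw_score`.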

CONTEXT_LENGTH

Routes based solely on conversation context length:

router = ModelRouter(
    strategy=RoutingStrategy.CONTEXT_LENGTH,
    # Uses default threshold of 10
)
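
As a rough illustration of what routing "solely on conversation context length" might look like, the sketch below uses a stand-in `Choice` enum and a hypothetical `route_by_context` function. Neither is part of the package, and treating the threshold as a strict "greater than" comparison is an assumption based on the "> 10 messages" wording used for the complexity heuristics.

```python
from enum import Enum


class Choice(Enum):
    """Hypothetical stand-in for the library's ModelChoice."""
    FAST = "fast"
    REASONING = "reasoning"


def route_by_context(context_length: int, threshold: int = 10) -> Choice:
    """Pick the reasoning model once the conversation exceeds the threshold."""
    # Assumption: strictly greater than, so exactly 10 messages stays fast.
    return Choice.REASONING if context_length > threshold else Choice.FAST
```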

ALWAYS_FAST / ALWAYS_REASONING

For testing or when you want consistent behavior:

router = ModelRouter(strategy=RoutingStrategy.ALWAYS_FAST)
router = ModelRouter(strategy=RoutingStrategy.ALWAYS_REASONING)

Complexity Analysis

Access the underlying complexity analyzer for custom logic:

from ml_model_router import ComplexityAnalyzer, ComplexityConfig

# Custom configuration
config = ComplexityConfig(
    length_threshold=300,      # Characters
    context_threshold=15,      # Messages
    question_threshold=2,      # Questions before reasoning
)

analyzer = ComplexityAnalyzer(config)
score = analyzer.analyze(
    message="Your message here",
    context_length=5,
)

print(score.is_simple)      # bool
print(score.is_emotional)   # bool
print(score.is_complex)     # bool
print(score.raw_score)      # float, 0.0 and up (may exceed 1.0)
print(score.to_model_choice())  # ModelChoice.FAST or REASONING

Configuration

RouterConfig

from ml_model_router import ModelRouter, RouterConfig, RoutingStrategy

config = RouterConfig(
    fast_model="ministral-3b-instruct",
    reasoning_model="ministral-14b-reasoning",
    strategy=RoutingStrategy.COMPLEXITY_BASED,
    context_threshold=10,
)

router = ModelRouter(config=config)

Custom Complexity Config

from ml_model_router import ModelRouter, ComplexityConfig

router = ModelRouter(
    complexity_config=ComplexityConfig(
        length_threshold=250,
        emotional_keywords=frozenset({"custom", "keywords"}),
    )
)

API Reference

ModelRouter

  • select_model(message, context_length, requires_reasoning) - Async model selection
  • select_model_sync(message, context_length, requires_reasoning) - Sync model selection
  • get_model_for_choice(choice) - Get model name for "fast" or "reasoning"
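
The async/sync pairing above is a common pattern. One way a sync variant can be layered on top of an async one is sketched below with standalone functions. This shows the general technique, not how `ModelRouter` is actually implemented, and the selector body is a placeholder.

```python
import asyncio


async def select_model(message: str, context_length: int = 0) -> str:
    """Hypothetical async selector standing in for ModelRouter.select_model."""
    await asyncio.sleep(0)  # placeholder for any awaitable work
    return "reasoning" if context_length > 10 else "fast"


def select_model_sync(message: str, context_length: int = 0) -> str:
    """Run the async selector to completion from synchronous code."""
    # asyncio.run() starts a fresh event loop, so this wrapper must not be
    # called from code that is already running inside an event loop.
    return asyncio.run(select_model(message, context_length))
```

The same `asyncio.run(...)` call is also the usual way to drive an `await`-based call from a plain script.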

RoutingResult

  • choice - ModelChoice.FAST or ModelChoice.REASONING
  • model_name - The actual model identifier
  • strategy_used - The strategy that made the decision
  • reasoning - Human-readable explanation
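
The four fields above are enough to build a compact log line for each routing decision. The helper below is a hypothetical sketch: `describe_result` is not part of the package, and the object passed to it is a stand-in built with `SimpleNamespace` rather than a real `RoutingResult`.

```python
from types import SimpleNamespace


def describe_result(result) -> str:
    """Format a routing result as one log line using its documented fields."""
    return (
        f"{result.choice.value} -> {result.model_name} "
        f"[{result.strategy_used}] {result.reasoning}"
    )


# Stand-in for demonstration only; a real RoutingResult comes from the router.
fake_result = SimpleNamespace(
    choice=SimpleNamespace(value="fast"),
    model_name="ministral-3b-instruct",
    strategy_used="complexity_based",
    reasoning="short greeting",
)
print(describe_result(fake_result))
```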

ComplexityScore

  • is_simple - Message matches simple patterns
  • is_emotional - Emotional keywords detected
  • is_complex - Complex indicators found
  • question_count - Number of questions
  • raw_score - Numerical complexity score, 0.0 and up (may exceed 1.0)
  • to_model_choice() - Convert to ModelChoice

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/

# Linting
ruff check src/

License

MIT