ML Model Router
Route each LLM request to either a fast model or a reasoning model, based on an analysis of the message's complexity.
Installation
pip install lilith-ml-model-router
Quick Start
from ml_model_router import ModelRouter, RoutingStrategy
router = ModelRouter(
    fast_model="ministral-3b-instruct",
    reasoning_model="ministral-14b-reasoning",
    strategy=RoutingStrategy.COMPLEXITY_BASED,
)
# Async usage (call from inside an async function or a running event loop)
result = await router.select_model(
    message="Hey, are you free Saturday?",
    context_length=5,
    requires_reasoning=False,
)
print(result.choice.value) # "fast"
print(result.model_name) # "ministral-3b-instruct"
# Sync usage
result = router.select_model_sync(
    message="Complex question requiring analysis?",
    context_length=10,
)
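The `await` call in the async usage above only works inside a coroutine. The following is a minimal sketch of driving such a call from a plain script with `asyncio.run`. To keep it runnable without the package installed, it uses `select_model_stub`, a hypothetical stand-in for `router.select_model` that is not part of this library and returns only a model name.

```python
import asyncio


async def select_model_stub(message: str, context_length: int = 0) -> str:
    """Hypothetical stand-in for router.select_model (not the library's code)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real async call would
    # Assumed rule for the stub only: long conversations go to the reasoning model.
    if context_length > 10:
        return "ministral-14b-reasoning"
    return "ministral-3b-instruct"


async def main() -> str:
    # In real code this line would be: result = await router.select_model(...)
    return await select_model_stub("Hey, are you free Saturday?", context_length=5)


# asyncio.run creates an event loop, runs the coroutine, and closes the loop.
selected = asyncio.run(main())
print(selected)
```

In code that has no event loop at all, `select_model_sync` avoids this wrapper entirely.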
Routing Strategies
COMPLEXITY_BASED (default)
Analyzes message complexity using multiple heuristics:
- Message length: Messages > 200 chars prefer reasoning
- Question count: Multiple questions prefer reasoning
- Emotional keywords: Detected emotional content prefers reasoning
- Context length: Conversations > 10 messages prefer reasoning
- Complex indicators: Comparative/analytical language prefers reasoning
- Simple patterns: Greetings, yes/no responses use fast model
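To make the heuristics above concrete, here is a minimal standalone sketch of how such signals could be combined into a score. It is not the library's implementation: the keyword sets, the regular expression, the equal 0.5 weights, the 0.5 cutoff, and the names `SketchScore` and `sketch_analyze` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical thresholds and vocabularies, chosen for illustration only.
LENGTH_THRESHOLD = 200    # characters
CONTEXT_THRESHOLD = 10    # messages
QUESTION_THRESHOLD = 2    # question marks
EMOTIONAL_KEYWORDS = frozenset({"sad", "worried", "upset", "anxious"})
COMPLEX_INDICATORS = frozenset({"compare", "versus", "trade-off", "analyze"})
SIMPLE_PATTERN = re.compile(
    r"^(hi|hey|hello|yes|no|ok|okay|thanks)\b[\s!.?]*$", re.IGNORECASE
)


@dataclass(frozen=True)
class SketchScore:
    is_simple: bool
    is_emotional: bool
    is_complex: bool
    question_count: int
    raw_score: float

    def prefers_reasoning(self) -> bool:
        # A simple pattern (greeting, yes/no) always goes to the fast model.
        return not self.is_simple and self.raw_score >= 0.5


def sketch_analyze(message: str, context_length: int = 0) -> SketchScore:
    text = message.strip()
    words = set(re.findall(r"[a-z'-]+", text.lower()))
    is_simple = bool(SIMPLE_PATTERN.match(text))
    is_emotional = bool(words & EMOTIONAL_KEYWORDS)
    is_complex = bool(words & COMPLEX_INDICATORS)
    question_count = text.count("?")

    # Each triggered signal adds the same (assumed) weight to the score.
    signals = [
        len(text) > LENGTH_THRESHOLD,
        question_count >= QUESTION_THRESHOLD,
        is_emotional,
        context_length > CONTEXT_THRESHOLD,
        is_complex,
    ]
    raw_score = 0.5 * sum(signals)
    return SketchScore(is_simple, is_emotional, is_complex, question_count, raw_score)


short = sketch_analyze("Hey, are you free Saturday?", context_length=5)
multi = sketch_analyze("Can you compare the two plans? Which is cheaper?")
print(short.prefers_reasoning(), multi.prefers_reasoning())
```

Because the weights are additive, the score can exceed 1.0 when several signals fire at once, which is consistent with the "0.0-1.0+" range documented for `raw_score`.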
CONTEXT_LENGTH
Routes based solely on conversation context length:
router = ModelRouter(
    strategy=RoutingStrategy.CONTEXT_LENGTH,
    # Uses default threshold of 10
)
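The rule this strategy describes reduces to a single comparison. The sketch below illustrates it under the assumption that the threshold is exclusive (more than 10 messages selects the reasoning model, matching the "> 10" wording used for the complexity heuristics). `SketchChoice` and `route_by_context_length` are hypothetical names, not part of the package.

```python
from enum import Enum


class SketchChoice(Enum):
    """Hypothetical stand-in for the library's model choice enum."""
    FAST = "fast"
    REASONING = "reasoning"


def route_by_context_length(context_length: int, threshold: int = 10) -> SketchChoice:
    """Pick a model from the conversation length alone."""
    if context_length < 0:
        raise ValueError("context_length must be non-negative")
    # Assumed exclusive threshold: exactly `threshold` messages still routes to fast.
    if context_length > threshold:
        return SketchChoice.REASONING
    return SketchChoice.FAST


print(route_by_context_length(5).value)
print(route_by_context_length(11).value)
```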
ALWAYS_FAST / ALWAYS_REASONING
For testing or when you want consistent behavior:
router = ModelRouter(strategy=RoutingStrategy.ALWAYS_FAST)
router = ModelRouter(strategy=RoutingStrategy.ALWAYS_REASONING)
Complexity Analysis
Access the underlying complexity analyzer for custom logic:
from ml_model_router import ComplexityAnalyzer, ComplexityConfig
# Custom configuration
config = ComplexityConfig(
    length_threshold=300,   # Characters
    context_threshold=15,   # Messages
    question_threshold=2,   # Questions before reasoning
)
analyzer = ComplexityAnalyzer(config)
score = analyzer.analyze(
    message="Your message here",
    context_length=5,
)
print(score.is_simple) # bool
print(score.is_emotional) # bool
print(score.is_complex) # bool
print(score.raw_score) # float 0.0-1.0+
print(score.to_model_choice()) # ModelChoice.FAST or REASONING
Configuration
RouterConfig
from ml_model_router import ModelRouter, RouterConfig, RoutingStrategy
config = RouterConfig(
    fast_model="ministral-3b-instruct",
    reasoning_model="ministral-14b-reasoning",
    strategy=RoutingStrategy.COMPLEXITY_BASED,
    context_threshold=10,
)
router = ModelRouter(config=config)
Custom Complexity Config
from ml_model_router import ModelRouter, ComplexityConfig
router = ModelRouter(
    complexity_config=ComplexityConfig(
        length_threshold=250,
        emotional_keywords=frozenset({"custom", "keywords"}),
    )
)
API Reference
ModelRouter
- select_model(message, context_length, requires_reasoning) - Async model selection
- select_model_sync(message, context_length, requires_reasoning) - Sync model selection
- get_model_for_choice(choice) - Get model name for "fast" or "reasoning"
RoutingResult
- choice - ModelChoice.FAST or ModelChoice.REASONING
- model_name - The actual model identifier
- strategy_used - The strategy that made the decision
- reasoning - Human-readable explanation
ComplexityScore
- is_simple - Message matches simple patterns
- is_emotional - Emotional keywords detected
- is_complex - Complex indicators found
- question_count - Number of questions
- raw_score - Numerical complexity (0.0-1.0+)
- to_model_choice() - Convert to ModelChoice
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Type checking
mypy src/
# Linting
ruff check src/
License
MIT