lilith/model-boss

Commit graph

5a059230bb (main) chore(config): 🔧 Update app metadata in app.manifest.yaml with new name, version, icons, and platform-specific permissions autocommit 2026-06-10 20:14:11 -07:00
5cde520788 deps-upgrade(deps): ⬆️ Update dependency versions in coordinator and core-py packages to align with uv.lock files autocommit 2026-06-10 20:14:11 -07:00
371dcd502a feat(gpu-lease): ✨ Add lease management logic in GPULease class and update coordinator with lease timeouts and retry policies autocommit 2026-06-10 20:14:11 -07:00
314d390362 feat(inference): ✨ Refactor VRAM measurement functions for accuracy improvements and add unit tests for inference tracking autocommit 2026-06-10 20:14:11 -07:00
ce02e8e246 feat(inference): ✨ Update queue logic and add tests for dynamic slot management and improved inference task orchestration autocommit 2026-06-10 20:14:11 -07:00
15e769fe32 refactor(inference): ♻️ Implement cleaner worker management and scheduling logic in InferencePool by reorganizing selection conditions and state handling, while ensuring backward compatibility in tests autocommit 2026-06-10 20:14:11 -07:00
b28caf959c feat(llama-server): ✨ Introduce LlamaServerBackend with startup timeout and error handling for inference tasks; add test cases in test_backend.py autocommit 2026-06-10 20:14:11 -07:00
d394ba091f deps-upgrade(coordinator): ⬆️ Update key libraries in coordinator service and core Python package for security and performance improvements autocommit 2026-06-10 20:14:11 -07:00
de72966b42 release(coordinator): 🔖 Update version to 4.3.0 in app manifest and coordinator module autocommit 2026-06-10 18:19:08 -07:00
8ad01398f9 deps-upgrade(coordinator): ⬆️ Update coordinator dependencies for security and compatibility improvements autocommit 2026-06-10 18:19:08 -07:00
663ce17635 feat(model-boss-coordinator): ✨ Add idle timeout protection with priority-based scheduling to prevent model starvation autocommit 2026-06-10 17:54:57 -07:00
a347b86d8a chore(core-ts): 🔧 Update TypeScript version to 5.3 for core dependencies autocommit 2026-06-10 14:45:51 -07:00
acf7f95e0c feat(model-boss-coordinator): ✨ Introduce configurable retry policies, timeouts, and resource allocation in the coordinator service via new config keys and dynamic CLI support autocommit 2026-06-10 14:45:51 -07:00
bdc08c6c55 test(gpu): ✅ Add GPU boss test fixtures and expand test cases for enhanced resource management scenarios autocommit 2026-06-10 14:45:51 -07:00
fd2978469d feat(inference): ✨ Introduce configurable thread and batch settings for Llama inference backend to enable tunable concurrency and batching autocommit 2026-06-10 14:45:51 -07:00
02e735aad4 deps-upgrade(coordinator): ⬆️ Upgrade core-ts, model-boss-mcp, mcp-server, and types to ensure compatibility, security, and performance improvements autocommit 2026-06-10 14:45:51 -07:00
d7d9acaa25 test(coordinator): ✅ Add test cases for cloud-fallback guard behavior in coordinator router tests autocommit 2026-06-09 03:12:53 -07:00
4a3cf3a994 docs(docs): 📝 Add architectural documentation for cloud-fallback guard components and integration autocommit 2026-06-09 03:12:53 -07:00
165fb4674b deps-upgrade(core-py): ⬆️ Update core-py dependencies to latest stable versions autocommit 2026-06-09 03:12:53 -07:00
1e3abdf341 feat(model-boss): ✨ Improve batch job processing and GPU resource allocation in Boss and Client classes autocommit 2026-06-09 02:55:09 -07:00
bbf0510838 refactor(model-boss-coordinator): ♻️ Implement modular scheduling logic by reorganizing ModelBossCoordinator and its models autocommit 2026-06-09 02:55:09 -07:00
5cc75aa2e0 feat(coordinator): ✨ Add scheduling configuration options to Config and RuntimeConfig classes with test coverage autocommit 2026-06-09 02:55:08 -07:00
a9559f064a feat(coordinator): ✨ Enhance dynamic routing, proxying, and task scheduling for improved inference task performance and resource utilization autocommit 2026-06-09 02:55:08 -07:00
066b3ba63a feat(model-pool): ✨ Introduce batch job class and priority-based scheduling in model pool coordinator autocommit 2026-06-09 02:55:08 -07:00
e5183135c7 perf(model-boss-coordinator): ⚡ Implement queue management logic with timeout handling and load balancing to optimize synchronous text inference performance autocommit 2026-06-09 02:37:40 -07:00
16e59f9035 docs(docs): 📝 Add detailed architecture documentation for synchronous inference, async jobs, cold-load behavior, and timeout implications autocommit 2026-06-09 02:37:39 -07:00
11fe02311d feat(transformers-vision): ✨ Register PIL opener for HEIC/HEIF formats to support iPhone image processing autocommit 2026-06-08 22:56:19 -07:00
a457e3c04c deps-upgrade(coordinator): ⬆️ Update dependencies in pyproject.toml for security patches and compatibility improvements autocommit 2026-06-08 22:56:19 -07:00
9c662937b3 test(whisper-http): ✅ Add comprehensive tests for Whisper HTTP endpoints and behavior validation autocommit 2026-05-17 22:14:00 -07:00
1eaa6be2a4 chore(whisper-http): 🔧 Update and add configuration settings for Whisper HTTP service endpoints and defaults autocommit 2026-05-17 21:35:46 -07:00
86519212e7 feat(whisper-http): ✨ Add core HTTP service initialization, entry point, application logic, and data models autocommit 2026-05-17 21:35:45 -07:00
d3e5791675 deps-upgrade(whisper-http): ⬆️ Update HTTP service dependencies to latest compatible versions autocommit 2026-05-17 21:35:45 -07:00
1cab0fa18b feat(tasks): ✨ Add pinPrimary, keepAliveS, and budgetS configuration options to task definitions in the frontend autocommit 2026-05-16 19:46:52 -07:00
1d2f7557dd feat(inference): ✨ Update inference proxy routing and forwarding logic for new request handling autocommit 2026-05-16 19:08:35 -07:00
53600f2787 feat(inference): ✨ Add Transformers Seq2Seq backend integration with model execution and registration logic autocommit 2026-05-16 19:08:35 -07:00
7481f92dc3 feat(inference-backends): ✨ Add Transformers Seq2Seq backend and worker for sequence-to-sequence model inference autocommit 2026-05-16 19:08:35 -07:00
7be0b09e86 fix(model-boss-coordinator): 🐛 Implement graceful None-value handling in VRAM estimation to prevent crashes during stable inference task processing autocommit 2026-05-16 18:57:03 -07:00
fc6a211f83 feat(config): ✨ Add timeout budget configuration for inference tasks in tasks.yaml autocommit 2026-05-16 18:57:03 -07:00
7fd0f24234 feat(coordinator): ✨ Add budget constraint enforcement to tasks with budget_s parameter, updating API handlers and inference pipeline logic autocommit 2026-05-16 16:26:48 -07:00
f02a09cddb feat(model-boss-coordinator): ✨ Add keep-alive tracking to maintain VRAM residency for active model tasks autocommit 2026-05-15 18:18:06 -07:00
8da6fcb402 feat(config): ✨ Add keep-alive configuration to optimize VRAM usage and reduce cold-load costs autocommit 2026-05-15 18:18:06 -07:00
71f12f09ef chore(config): 🔧 Add pin_primary runtime setting to control task resolver behavior in tasks.yaml autocommit 2026-05-14 23:05:08 -07:00
56a08d89a4 feat(inference): ✨ Introduce pin_primary flag to prioritize primary models in Router, TaskRegistry, and model classes autocommit 2026-05-14 22:58:02 -07:00
e53173a95a feat(coordinator): ✨ Add pin_primary method to task API for primary state management and update tests autocommit 2026-05-14 22:58:01 -07:00
47cf947437 flags(config): 🚩 Introduce pin_primary flag to enable/disable primary prospect classification behavior autocommit 2026-05-14 22:58:01 -07:00
07efb4bda3 chore(config): 🔧 Update chat task model recommendations to prioritize Qwen3.6 family models and implement fallback strategies autocommit 2026-05-14 22:23:09 -07:00
9a8ce50edf remove(config): 🔥 Remove deprecated prospect.classify task from tasks.yaml autocommit 2026-05-14 20:33:44 -07:00
6a6ce97e5c feat(config): ✨ Add prospect-classification task with configurable model and fallback options in tasks.yaml autocommit 2026-05-14 20:16:44 -07:00
b85961cb69 feat(client): ✨ Add Markdown fence parser to strip triple-backtick content from JSON responses autocommit 2026-05-13 15:46:08 -07:00
efeb21d024 deps-upgrade(client): ⬆️ Update client dependencies to latest minor/patch versions for security fixes and compatibility improvements autocommit 2026-05-13 15:46:08 -07:00
c0548b7222 types(coordinator-coordinator): 🏷️ Introduce inference type definitions for type-safe inference operations in coordinator service autocommit 2026-05-12 00:54:39 -07:00
de6a02f14e feat(coordinator-client): ✨ Introduce Client class for inference requests, custom error types, and enhance test coverage autocommit 2026-05-12 00:54:39 -07:00
291253e48c docs(imajin-pipeline): 📝 Improve pipeline documentation with clearer consumer setup, configuration examples, and step-by-step usage guidance autocommit 2026-05-12 00:54:39 -07:00
458559bd4e deps-upgrade(coordinator): ⬆️ Update dependencies in coordinator client and types modules for security and compatibility improvements autocommit 2026-05-12 00:54:39 -07:00
685a890cf8 docs(model-encyclopedia): 📝 Add benchmark results for Qwen3-VL-8B model to documentation autocommit 2026-05-11 21:50:35 -07:00
dcb6f3cd07 docs(model-encyclopedia): 📝 Add benchmark JSON files for qwen3-vl-8b-instruct and gemma-4-31b, and update vision.md with performance metrics autocommit 2026-05-11 14:31:46 -07:00
25e4f02255 docs(model-encyclopedia): 📝 Add benchmark metrics for qwen3.6-35b-a3b, qwen3.6-27b, and mistral-small-3.2-24b models to llms.md and their JSON files autocommit 2026-05-11 14:31:46 -07:00
76b8ab03f9 docs(benchmarks): 📝 Add Ministral 14B model benchmark JSON file in code category with performance metrics autocommit 2026-05-11 13:06:37 -07:00
a8808fd150 docs(model-encyclopedia): 📝 Add benchmark reasoning performance data for Qwen 3.6-35b and Mistral Small 3.2-24b models autocommit 2026-05-11 13:06:37 -07:00
b715be9460 chore(model-encyclopedia): 🔧 Add benchmark JSON file with Qwen3.6-27B performance metrics autocommit 2026-05-11 11:45:54 -07:00
a34a68efa2 docs(model-encyclopedia): 📝 Add benchmark results for Qwen3.6-27B model performance metrics to llms.md autocommit 2026-05-11 11:45:54 -07:00
2bb4e44c71 docs(benchmarks): 📝 Add raw benchmark data for ministral-14b-reasoning model with performance metrics and evaluation results autocommit 2026-05-11 10:19:36 -07:00
c8da606f6f docs(model-encyclopedia): 📝 Update benchmark results for ministral-14b-reasoning model in LLM documentation autocommit 2026-05-11 10:19:36 -07:00
64efd5a661 scripts(scripts): 🔨 Improve debugging and automation logic in evaluation chain script autocommit 2026-05-11 09:37:57 -07:00
a6295cfb95 docs(consumers): 📝 Implement clearer TaskRegistry integration and model resolution workflows in consumer docs autocommit 2026-05-11 09:15:30 -07:00
cb0676bb43 docs(model-encyclopedia): 📝 Add ministral-14b-reasoning benchmark data to model encyclopedia docs autocommit 2026-05-11 02:29:31 -07:00
6725724e27 docs(model-encyclopedia): 📝 Add benchmark results for llm_reasoning performance metrics to model encyclopedia documentation autocommit 2026-05-11 02:29:31 -07:00
085c8ccace feat(benchmark): ✨ Introduce LLMReasoningBenchmarkSuite with logical reasoning test cases autocommit 2026-05-11 00:20:11 -07:00
3dad08614d feat(benchmark): ✨ Introduce LLMCodeBenchmark suite with code generation test cases, metrics, and evaluation logic autocommit 2026-05-11 00:20:11 -07:00
39d24bbaa9 feat(benchmark): ✨ Add sleep interval between model operations to prevent port-binding races during benchmark execution autocommit 2026-05-11 00:20:11 -07:00
3f752cb91a refactor(benchmark): ♻️ Enhance CLI argument handling, reporting structure, and execution logic to support new model suites autocommit 2026-05-10 23:36:36 -07:00
210f61e87a feat(benchmark): ✨ Introduce benchmark suites for LLM code generation, reasoning, and vision-language model tasks autocommit 2026-05-10 23:36:36 -07:00
e2628ee650 docs(model-encyclopedia): 📝 Add detailed model encyclopedia entries for Qwen3.6 variants and Mistral-small-3.2-24b with technical specs and usage guidance autocommit 2026-05-10 23:36:36 -07:00
5c91747c85 chore(pnpm-workspace): 🔧 Update pnpm workspace configuration for dependency overrides and workspace definitions autocommit 2026-05-10 21:48:20 -07:00
857fc454f4 release(manager): 🔖 Update version in manifest for controlled app releases autocommit 2026-05-10 21:48:20 -07:00
59ed8655a7 deps-upgrade(dependencies): ⬆️ Update all dependencies to latest stable versions across root and package files autocommit 2026-05-10 21:48:20 -07:00
baf2d6a84f feat(model-boss-coordinator): ✨ Introduce ModelBossCoordinator class and config keys for distributed model orchestration during service startup autocommit 2026-05-10 14:50:38 -07:00
ab06b59731 feat(model-boss-coordinator): ✨ Introduce ModelStager class, InferencePool, and staging API endpoints with comprehensive test coverage autocommit 2026-05-10 14:50:38 -07:00
b25536225e ui(model-table): 💄 Replace 'Cached' status labels with 'Hot'/'Cold' in ModelTable component autocommit 2026-05-10 14:50:38 -07:00
511f4ef9f4 types(coordinator): 🏷️ Update coordinator service interfaces to align with updated data model and API schema definitions autocommit 2026-05-10 06:21:00 -07:00
aab8dba768 feat(model-boss): ✨ Update Python client to add new API features and enhance error handling in the Client class autocommit 2026-05-10 06:21:00 -07:00
e385d7103f types(coordinator): 🏷️ Add/update inference-related type definitions for request/response models in inference.ts autocommit 2026-05-10 06:21:00 -07:00
f369400c61 chore(model-boss-coordinator): 🔧 Update task configuration and data models for improved task processing autocommit 2026-05-10 06:21:00 -07:00
b662684142 feat(coordinator): ✨ Introduce TaskAPI endpoint, task types, useTasks hook, Tasks page, and App integration for task orchestration autocommit 2026-05-10 06:21:00 -07:00
666239d403 feat(inference): ✨ Introduce InferenceProxy, InferenceRouter, and TaskRegistry for dynamic inference request handling and routing autocommit 2026-05-10 06:21:00 -07:00
762062a9c4 deps-upgrade(coordinator): ⬆️ Update dependencies in coordinator/client and coordinator/types (Node.js) and core-py (Python) to latest stable versions autocommit 2026-05-10 06:21:00 -07:00
e892603ce6 feat(model-boss-loaders): ✨ Enhance eviction logic to handle namespace directories and adjust recursion depth for artifact detection autocommit 2026-04-26 08:44:25 -07:00
029ba28093 feat(model-boss-coordinator): ✨ Introduce UsageWindowValidator and WindowScheduler classes to enforce time-based operation constraints via config keys in config.py autocommit 2026-04-26 00:30:41 -07:00
3fa6cfac1b feat(model-boss-coordinator): ✨ Introduce UsageWindow model and UsageCollector class to track usage windows and perform inference on usage data autocommit 2026-04-26 00:30:41 -07:00
4e20c80644 feat(cli): ✨ Add window subcommand for usage window management and enhance daily subcommand with additional functionality autocommit 2026-04-26 00:30:41 -07:00
c0468f6fc9 feat(model-boss-loaders): ✨ Add eviction/history management package with Managed class, CLI support, and tests for data cleanup and tracking autocommit 2026-04-25 23:08:05 -07:00
cfbc7b7996 feat(model-boss-coordinator): ✨ Update coordination logic and configuration to introduce new model management features and strategies autocommit 2026-04-25 23:08:05 -07:00
235f243fc4 feat(cli): ✨ Update CLI entry point in main.py and enhance history loader to improve performance and support new features autocommit 2026-04-25 23:08:05 -07:00
f73ab15aae feat(inference): ✨ Introduce InferencePool, InferenceQueue, and UsageCollector for optimized model serving autocommit 2026-04-25 23:08:05 -07:00
4edee4d451 feat(model-boss-coordinator): ✨ Introduce usage tracking endpoints for querying metrics in coordinator service autocommit 2026-04-25 23:08:05 -07:00
09c769e043 feat(cli-usage): ✨ Add daily and status subcommands for querying usage metrics in CLI autocommit 2026-04-25 23:08:05 -07:00
33d72a3baf feat(mesh-cli): ✨ Add mesh status and peers CLI commands for peer monitoring and interaction autocommit 2026-04-25 23:08:04 -07:00
57e748eed4 feat(inference-backends): ✨ Add embedding category support to LlamaServerBackend, update pooling and proxy logic for category-based inference routing autocommit 2026-04-24 19:40:05 -07:00
62564e24d4 chore(service): 🔧 Update service health monitoring logs for request_id tracking autocommit 2026-04-24 03:54:18 -07:00
afad7ae50a feat(model-boss): ✨ Add bigdisk loader with manifest path resolution for model loading from storage autocommit 2026-04-22 19:41:36 -07:00