Commit graph

21 commits

Author SHA1 Message Date
autocommit
4a3cf3a994 docs(docs): 📝 Add architectural documentation for cloud-fallback guard components and integration
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-06-09 03:12:53 -07:00
autocommit
16e59f9035 docs(docs): 📝 Add detailed architecture documentation for synchronous inference, async jobs, cold-load behavior, and timeout implications
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-06-09 02:37:39 -07:00
autocommit
685a890cf8 docs(model-encyclopedia): 📝 Add benchmark results for Qwen3-VL-8B model to documentation
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 21:50:35 -07:00
autocommit
dcb6f3cd07 docs(model-encyclopedia): 📝 Add benchmark JSON files for qwen3-vl-8b-instruct and gemma-4-31b, and update vision.md with performance metrics
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 14:31:46 -07:00
autocommit
25e4f02255 docs(model-encyclopedia): 📝 Add benchmark metrics for qwen3.6-35b-a3b, qwen3.6-27b, and mistral-small-3.2-24b models to llms.md and their JSON files
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 14:31:46 -07:00
autocommit
76b8ab03f9 docs(benchmarks): 📝 Add Ministral 14B model benchmark JSON file in code category with performance metrics
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 13:06:37 -07:00
autocommit
a8808fd150 docs(model-encyclopedia): 📝 Add benchmark reasoning performance data for Qwen 3.6-35b and Mistral Small 3.2-24b models
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 13:06:37 -07:00
autocommit
b715be9460 chore(model-encyclopedia): 🔧 Add benchmark JSON file with Qwen3.6-27B performance metrics
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 11:45:54 -07:00
autocommit
a34a68efa2 docs(model-encyclopedia): 📝 Add benchmark results for Qwen3.6-27B model performance metrics to llms.md
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 11:45:54 -07:00
autocommit
2bb4e44c71 docs(benchmarks): 📝 Add raw benchmark data for ministral-14b-reasoning model with performance metrics and evaluation results
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 10:19:36 -07:00
autocommit
c8da606f6f docs(model-encyclopedia): 📝 Update benchmark results for ministral-14b-reasoning model in LLM documentation
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 10:19:36 -07:00
autocommit
cb0676bb43 docs(model-encyclopedia): 📝 Add ministral-14b-reasoning benchmark data to model encyclopedia docs
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 02:29:31 -07:00
autocommit
6725724e27 docs(model-encyclopedia): 📝 Add benchmark results for llm_reasoning performance metrics to model encyclopedia documentation
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-11 02:29:31 -07:00
autocommit
e2628ee650 docs(model-encyclopedia): 📝 Add detailed model encyclopedia entries for Qwen3.6 variants and Mistral-small-3.2-24b with technical specs and usage guidance
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-05-10 23:36:36 -07:00
Claude Code
506f66ee04 perf(inference-specific): Refactor inference task pooling and queuing logic to optimize throughput and resource utilization
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-27 17:50:38 -07:00
Claude Code
82814ed90e docs(model-encyclopedia): 📝 Improve model encyclopedia documentation with expanded definitions, usage examples, and structured entries
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-27 13:09:15 -07:00
Claude Code
ea0ea05b2e docs(docs): 📝 Update architecture and consumer integration documentation to clarify system design and integration patterns
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-25 22:57:30 -07:00
Lilith
c6eff0fcd9 feat(coordinator): Add Oracle-specific routing logic with documentation and test coverage
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-08 22:08:20 -07:00
Lilith
76f525fb7f docs(docs): 📝 Update deprecation warnings, migration paths, and CLI documentation to reflect changes in GPUBOSS_INTEGRATION.md, LLAMA_SEGFAULT_FIX.md, MIGRATION.md, and CLI.md while cleaning TODO.md
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-02-28 20:04:41 -08:00
Lilith
0690309799 docs(cli): 📝 Update CLI documentation and migration guides with clearer instructions, setup steps, and paths for core-py, core-ts, and loaders-py packages
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-02-25 15:48:38 -08:00
Lilith
b39ce3a725 feat: model-boss architecture with extracted loaders and InferenceRouter
Architecture refactor per the plan:

Packages:
- core-py: Lean model-boss with GPUBoss, RAMBoss, path resolution
- core-ts: TypeScript client (@lilith/model-boss)
- loaders-py: Extracted direct model loaders (optional, for dev/testing)

New in core-py:
- InferenceRouter for service discovery and routing
- LlamaHttpClient, DiffusionHttpClient typed clients
- VRAM estimation utilities

Services use model-boss for:
- VRAM lease coordination (GPUBoss)
- Model path resolution
- Service discovery (InferenceRouter)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 12:39:05 -08:00