model-boss/docs/CONSUMERS.md
2026-03-25 22:57:30 -07:00

7.4 KiB

model-boss Consumers

All services, packages, and projects that depend on model-boss for GPU coordination, inference routing, or VRAM management.

Coordinator Queue Consumers

Services that route inference requests through the coordinator at :8210 with x_client_id, x_priority, x_stay_warm, x_cooldown extension fields.

Consumer x_client_id Priority Backend Integration Path
auto-commit-service (legacy) auto-commit-service batch llama-server ModelBoss.chat() @applications/@ml/auto-commit-service
auto-commit-service (multi-model) auto-commit-service batch llama-server InferenceClient.chat() @applications/@ml/auto-commit-service
cot-reasoning cot-reasoning normal llama-server ModelBoss.chat() @applications/@ml/cot-reasoning
knowledge-platform QA knowledge-platform normal llama-server httpx POST @applications/@ml/knowledge-platform/features/tools/builtin/ask_qa_specialist.py
knowledge-platform verification knowledge-platform normal llama-server httpx POST @applications/@ml/knowledge-platform/features/tools/builtin/ask_verification_specialist.py
knowledge-platform NLI gen knowledge-platform-nli-gen batch llama-server httpx POST (sync) @applications/@ml/knowledge-platform/features/trainer/service/src/nli_data_generator.py
imajin-prompt imajin-prompt normal llama-server HttpLlamaClient @applications/@imajin/services/imajin-prompt
imajin-prompt-generator imajin-prompt-generator normal llama-server HttpLlamaClient @applications/@imajin/services/imajin-prompt-generator
imajin-pipeline (LLM) imajin-pipeline normal llama-server httpx LLMClient @applications/@imajin/orchestrators/imajin-pipeline/src/image_pipeline/utils/llm_client.py
knowledge-platform API (TS) llama-server httpx @applications/@ml/knowledge-platform/features/api/service/src/llm-corrector.ts
nutrition-service llama-server config ref @applications/@health/nutrition-service
kthulu llama-server model-client adapter @projects/@kthulu/codebase

Direct GPU Lease Consumers

Services that call GPUBoss.acquire() directly for VRAM leases. These manage their own model loading and do not route through the coordinator queue. Phase 2 target: migrate to coordinator backends.

Consumer Model Type Backend (future) GPU Work Path
imajin-pipeline GenerateStage diffusion diffusers SDXL/FLUX/SD3.5, ControlNet, PuLID @applications/@imajin/orchestrators/imajin-pipeline/src/image_pipeline/stages/generate.py
imajin-diffusion diffusion diffusers Diffusion + BullMQ worker @applications/@imajin/services/imajin-diffusion
imajin-adversarial ONNX/PyTorch direct lease* InsightFace + PGD gradient attack @applications/@imajin/services/imajin-adversarial
imajin-aesthetic HuggingFace hf ImageReward scorer @applications/@imajin/services/imajin-aesthetic
imajin-semantic HuggingFace hf SigLIP2 image-text matching @applications/@imajin/services/imajin-semantic
imajin-identity ONNX onnx InsightFace face embeddings @applications/@imajin/services/imajin-identity
imajin-video ONNX onnx InsightFace face detection @applications/@imajin/services/imajin-video
imajin-prompt (llama backend) GGUF llama-server Direct llama-cpp-python loading @applications/@imajin/services/imajin-prompt/service/src/llm/llama.py
chatterbox-tts-service PyTorch chatterbox Chatterbox TTS model @applications/@audio/speech-synthesis/chatterbox-tts-service

*imajin-adversarial uses tight iterative PGD gradient loops that cannot be expressed as coordinator requests. Direct lease is the correct pattern.

Training Lease Consumers

Services that use GPUBoss.acquire() for training workloads (not inference). These are long-running GPU jobs with preemption support.

Consumer What Path
assistant-trainer Training subprocess leases @applications/@ml/assistant-trainer
lora-trainer LoRA fine-tuning @applications/@ml/@train/lora-trainer
knowledge-platform (fine_tune) Model fine-tuning @applications/@ml/knowledge-platform/features/trainer/service/src/fine_tune.py
knowledge-platform (compare) Model comparison @applications/@ml/knowledge-platform/features/trainer/service/src/compare_models.py

Library Consumers

Packages that depend on or integrate with model-boss APIs.

Package Role Path
vram-boss GPUBoss CLI wrapper + lease management @packages/@py/vram-boss
ram-boss RAM-aware GPUBoss extension @packages/@py/ram-boss
ml-training GPULease, preemption callbacks @applications/@ml/@packages/@py/ml-training
ml-data-engine GPU coordination utilities @applications/@ml/@packages/@py/ml-data-engine
ml-memory-store Embedding via model-boss @packages/@py/ml-memory-store
ml-model-router Model routing @packages/@py/ml-model-router
service-fastapi-bootstrap Lifespan GPUBoss integration @packages/@py/service-fastapi-bootstrap
truth-service Legal validator @packages/@py/truth-service
queue (lilith-queue-cli) Queue CLI tools @packages/@py/queue
@ts/provider-clients TS GPUBoss/llama-service client @applications/@ml/@packages/@ts/provider-clients
@ts/vram-boss TS VRAMBoss client @applications/@ml/@packages/@ts/vram-boss
@ts/domain-events Service discovery event types @packages/@ts/@service/domain-events

TypeScript Consumers (OpenAI SDK → coordinator HTTP)

Services that use the OpenAI Node SDK pointed at the coordinator's /v1 endpoint. These don't depend on lilith-model-boss (Python) — they talk directly to the coordinator's OpenAI-compatible API.

Consumer Client ID Integration Path
life-platform AI (platform-ai) life-manager OpenAI SDK + X-Client-Id header @projects/@life/@applications/ai/services/platform-ai/src/features/assistant/assistant/llm-client.service.ts
life-platform AI (companion) life-manager OpenAI SDK + X-Client-Id header @projects/@life/@applications/ai/services/companion/src/features/assistant/assistant/llm-client.service.ts
life-platform health checker fetch()/models + /chat/completions @projects/@life/@applications/api/src/modules/service-health/checkers/llm.checker.ts
life-platform web dashboard fetch()/api/v1/gpu/status @projects/@life/@applications/web/src/hooks/api/useModelBossStatus.ts
kthulu model-client Custom adapter → coordinator @projects/@kthulu/codebase/@packages/model-client
kthulu CLI/API/web Via model-client @projects/@kthulu/codebase/apps/
nutrition-service LLM_BASE_URL config ref @applications/@health/nutrition-service/src/config.ts
knowledge-platform API (TS) llm-corrector.ts → coordinator @applications/@ml/knowledge-platform/features/api/service/src/llm-corrector.ts

Test/Mock Consumers

Project Role Path
lilith-platform GPUBoss mocks, ModelLoader mocks, integration tests @applications/@lilith/lilith-platform/shared/testing