@lilith/ml-ram-boss
System RAM lease coordinator for ML models - TypeScript client.
Prevents race conditions when multiple processes try to load models that compete for system RAM. Processes coordinate through a lease-based system backed by Redis.
Features
- Lease-based coordination: Acquire RAM leases before loading models
- Automatic heartbeats: Leases refresh automatically to prove the owning process is still alive
- Preemption support: Higher-priority processes can ask lower-priority processes to release RAM
- Stale lease cleanup: Automatic cleanup of leases from crashed processes
- Queue support: Processes wait in a priority queue when RAM is full
- Type-safe: Full TypeScript support with comprehensive type definitions
Installation
pnpm add @lilith/ml-ram-boss
Quick Start
import { RAMBoss, Priority } from '@lilith/ml-ram-boss';

// Connect to Redis
const boss = new RAMBoss({
  redisUrl: 'redis://localhost:6379',
});
await boss.connect();

// Initialize RAM tracking (typically done once on startup)
await boss.initializeRAM(32000); // 32GB total RAM

// Acquire a lease
const lease = await boss.acquire({
  ramMb: 8000,
  priority: Priority.NORMAL,
  processId: 'my-model-service',
});

try {
  // Load your model
  await loadModel();

  // Register preemption handler
  lease.onPreempt(async (reason) => {
    console.log(`Preemption requested: ${reason}`);
    await unloadModel();
  });

  // Use the model
  const result = await model.process(input);

  // Touch lease to update activity
  lease.touch();
} finally {
  // Always release the lease
  await lease.release();
}

// Close connection
await boss.close();
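The acquire / try / finally pattern above can be wrapped in a small helper so that callers cannot forget the release step. The sketch below is not part of the package: `withLease` and the `Releasable` type are hypothetical names, and the helper assumes only what the Quick Start shows, namely that `acquire` returns a promise of an object with an async `release()` method.

```typescript
// Minimal structural type for this sketch; the real package exports its own types.
interface Releasable {
  release(): Promise<void>;
}

// Hypothetical helper (not provided by @lilith/ml-ram-boss): run `work` while
// holding a lease, and release the lease whether `work` resolves or throws.
async function withLease<TOptions, TLease extends Releasable, TResult>(
  boss: { acquire(options: TOptions): Promise<TLease> },
  options: TOptions,
  work: (lease: TLease) => Promise<TResult>,
): Promise<TResult> {
  const lease = await boss.acquire(options);
  try {
    return await work(lease);
  } finally {
    // Mirrors the `finally` block in the Quick Start.
    await lease.release();
  }
}
```

Because the helper is generic over the lease type, the callback still sees whatever extra methods the real lease exposes, such as `onPreempt` and `touch`.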
Configuration
const boss = new RAMBoss({
  redisUrl: 'redis://localhost:6379', // Redis connection
  heartbeatIntervalMs: 10_000,        // Heartbeat frequency
  staleLeaseTimeoutMs: 60_000,        // When to consider lease stale
  preemptionGracePeriodMs: 30_000,    // Time to gracefully unload
  defaultTimeoutMs: 300_000,          // Default acquire timeout
  keyPrefix: 'ram',                   // Redis key prefix
  autoCleanup: true,                  // Auto cleanup stale leases
  cleanupIntervalSeconds: 30,         // Cleanup task frequency
});
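To see how the heartbeat and stale-lease settings relate, the following sketch works through the arithmetic with the values from the example above. The staleness rule used here (a lease is stale once its last heartbeat is older than `staleLeaseTimeoutMs`) is an assumption about how the timeout is applied, and `isStale` and `heartbeatsPerTimeout` are illustrative names rather than package functions.

```typescript
interface TimingConfig {
  heartbeatIntervalMs: number;
  staleLeaseTimeoutMs: number;
}

// Values copied from the configuration example.
const timing: TimingConfig = {
  heartbeatIntervalMs: 10_000,
  staleLeaseTimeoutMs: 60_000,
};

// Assumed rule: stale once the last heartbeat is older than the timeout.
function isStale(lastHeartbeatMs: number, nowMs: number, cfg: TimingConfig): boolean {
  return nowMs - lastHeartbeatMs > cfg.staleLeaseTimeoutMs;
}

// Number of heartbeat intervals that fit inside one stale-lease window.
function heartbeatsPerTimeout(cfg: TimingConfig): number {
  return Math.floor(cfg.staleLeaseTimeoutMs / cfg.heartbeatIntervalMs);
}
```

With these values, six heartbeat intervals fit in one timeout window, so a single delayed heartbeat should not by itself make a lease look stale. Shrinking `staleLeaseTimeoutMs` toward `heartbeatIntervalMs` removes that margin.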
Priority Levels
enum Priority {
  URGENT = 1,  // Immediate, bypasses queue
  HIGH = 5,    // Critical paths
  NORMAL = 10, // Default
  LOW = 20,    // Background tasks
  BATCH = 50,  // Bulk operations, lowest priority
}
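Lower numeric values mean higher priority. As an illustration of what that implies for waiting requests, the sketch below orders a queue by priority value. The tie-breaking rule (earlier arrival first) and the `QueuedRequest` shape are assumptions made for this example. The package's actual queue lives in Redis and may differ.

```typescript
// Local copy of the enum above so the sketch is self-contained.
enum Priority {
  URGENT = 1,
  HIGH = 5,
  NORMAL = 10,
  LOW = 20,
  BATCH = 50,
}

// Hypothetical shape of a waiting request.
interface QueuedRequest {
  processId: string;
  priority: Priority;
  enqueuedAt: number; // arrival time, e.g. a millisecond timestamp
}

// Assumed ordering: lower priority value first, then earlier arrival.
function compareRequests(a: QueuedRequest, b: QueuedRequest): number {
  return a.priority - b.priority || a.enqueuedAt - b.enqueuedAt;
}

// Returns a sorted copy and leaves the input untouched.
function orderQueue(requests: readonly QueuedRequest[]): QueuedRequest[] {
  return [...requests].sort(compareRequests);
}
```

Under this ordering a `BATCH` request that arrived first is still served after any later `NORMAL` or `URGENT` request.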
API
RAMBoss
Main coordinator class.
Methods
- connect(): Connect to Redis
- close(): Close Redis connection
- initializeRAM(totalMb: number): Initialize RAM tracking
- acquire(options: AcquireOptions): Acquire a RAM lease
- getStatus(): Get current system status
- forceRelease(leaseId: string): Force release a lease
- sendPreemption(leaseId: string, reason: string): Send preemption signal
- cleanupStale(): Clean up stale leases
- drainAll(reason?: string): Request all leases to release
RAMLease
Represents an active RAM lease.
Properties
- leaseId: Unique lease identifier
- ramMb: Amount of RAM reserved (MB)
- priority: Lease priority
- processId: Process identifier
- isReleased: Whether lease has been released
Methods
- onPreempt(callback): Register preemption callback
- touch(): Update activity timestamp
- release(): Release the lease
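A preemption callback has a limited time to unload (see `preemptionGracePeriodMs` in the configuration). This README does not describe how the package enforces that window, so the following is only a sketch of the general idea: race the handler against a timer. `runWithGracePeriod` and `PreemptHandler` are hypothetical names, not part of the package API.

```typescript
type PreemptHandler = (reason: string) => Promise<void>;

// Hypothetical helper: wait for the handler, but stop waiting after
// `gracePeriodMs`. The handler itself is not cancelled by the timeout.
async function runWithGracePeriod(
  handler: PreemptHandler,
  reason: string,
  gracePeriodMs: number,
): Promise<"unloaded" | "timed_out"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timed_out">((resolve) => {
    timer = setTimeout(() => resolve("timed_out"), gracePeriodMs);
  });
  try {
    return await Promise.race([
      handler(reason).then(() => "unloaded" as const),
      timeout,
    ]);
  } finally {
    // Clear the timer so a fast handler does not leave it pending.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

For handler authors, the practical point is to keep `unloadModel`-style work well under the grace period.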
Architecture
The system uses Redis for coordination:
- Leases: Stored in a Redis hash with a heartbeat TTL
- Lua scripts: Atomic operations for acquire/release
- Pub/Sub: Preemption signaling via Redis channels
- Queuing: Priority queue for waiting requests
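The reason acquire and release need to be atomic is that "check free RAM" and "reserve RAM" must happen as one step. Otherwise two processes can both see enough free memory and both reserve it. The in-memory sketch below illustrates that check-and-reserve logic. It is not the package's Lua script, and `RamState`, `tryAcquire`, and `releaseLease` are names invented for this example.

```typescript
// Hypothetical in-memory stand-in for the state kept in Redis.
interface RamState {
  totalMb: number;
  leases: Map<string, number>; // leaseId -> reserved MB
}

function usedMb(state: RamState): number {
  let total = 0;
  state.leases.forEach((mb) => {
    total += mb;
  });
  return total;
}

// Check and reserve in a single synchronous step. In a shared store this
// whole function body would have to run atomically.
function tryAcquire(state: RamState, leaseId: string, ramMb: number): boolean {
  if (state.leases.has(leaseId)) return false; // lease id already in use
  if (usedMb(state) + ramMb > state.totalMb) return false; // would overcommit
  state.leases.set(leaseId, ramMb);
  return true;
}

// Returns the number of MB freed (0 if the lease was unknown).
function releaseLease(state: RamState, leaseId: string): number {
  const freed = state.leases.get(leaseId) ?? 0;
  state.leases.delete(leaseId);
  return freed;
}
```

With 32000 MB total, leases of 8000 MB and 20000 MB succeed, a further 8000 MB request is refused, and it succeeds once the first lease is released.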
Integration with Python
This package mirrors the Python lilith-ram-boss package, allowing TypeScript/Node.js services to coordinate RAM usage with Python ML services.
License
MIT