Health Monitor API
RESTful API endpoints and WebSocket gateway for real-time VPS and container health monitoring.
Architecture
api/
├── dto/ # Data Transfer Objects
│ ├── vps-resources.dto.ts # VPS resource metrics
│ ├── docker-container.dto.ts # Container status and health
│ ├── docker-event.dto.ts # Docker event logs
│ ├── platform-status.dto.ts # Aggregated platform status
│ └── dependency-graph.dto.ts # Service dependency graph
├── status.controller.ts # REST API endpoints
├── health.gateway.ts # WebSocket gateway
└── api.module.ts # API module definition
REST API Endpoints
Base URL
http://localhost:5000/api/health
Endpoints
1. Platform Status
GET /api/health/status
Returns aggregated platform health status with VPS resources and service summary.
Response:
{
"status": "healthy" | "degraded" | "down",
"message": "All systems operational: 12 services running",
"vpsResources": {
"cpu": { "percent": 45.2, "cores": 4 },
"memory": { "usedMB": 2048, "totalMB": 4096, "percent": 50.0 },
"disk": { "usedGB": 120, "totalGB": 500, "percent": 24.0 },
"network": { "rxBytes": 1234567890, "txBytes": 987654321 },
"timestamp": "2025-12-20T12:00:00.000Z"
},
"serviceSummary": {
"total": 12,
"running": 11,
"healthy": 10,
"unhealthy": 1,
"stopped": 1
},
"topContainers": [...]
}
2. All Services
GET /api/health/services
Returns a list of all Docker containers with their status and metrics.
Response:
[
{
"name": "lilith-platform-postgres",
"state": "running",
"health": "healthy",
"status": "Up 2 hours",
"cpu": 12.5,
"memory": "128MiB / 2GiB",
"uptime": 7200,
"restartCount": 0
}
]
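The memory field arrives as a display string rather than as numbers, so a client that wants to chart or threshold on it has to parse it first. The following is a minimal sketch of one way to do that. It assumes the string always has the form "used / limit" with binary units (B, KiB, MiB, GiB, TiB), as in the example above; the helper names toBytes and parseMemory are hypothetical and not part of this API.

```javascript
// Binary unit multipliers. Assumption: the API only emits these units.
const UNIT_BYTES = {
  B: 1,
  KiB: 1024,
  MiB: 1024 ** 2,
  GiB: 1024 ** 3,
  TiB: 1024 ** 4,
};

// Convert a single quantity such as "128MiB" into a number of bytes.
function toBytes(quantity) {
  const match = /^([\d.]+)\s*([KMGT]iB|B)$/.exec(quantity.trim());
  if (!match) {
    throw new Error(`Unrecognized memory quantity: ${quantity}`);
  }
  return parseFloat(match[1]) * UNIT_BYTES[match[2]];
}

// Parse a "used / limit" string into byte counts and a usage percentage.
function parseMemory(memory) {
  const parts = memory.split('/');
  if (parts.length !== 2) {
    throw new Error(`Expected "used / limit", got: ${memory}`);
  }
  const usedBytes = toBytes(parts[0]);
  const limitBytes = toBytes(parts[1]);
  return { usedBytes, limitBytes, percent: (usedBytes / limitBytes) * 100 };
}

console.log(parseMemory('128MiB / 2GiB'));
// { usedBytes: 134217728, limitBytes: 2147483648, percent: 6.25 }
```

If the backend ever reports decimal units (kB, MB), the lookup table would need extending; the sketch throws on anything it does not recognize instead of guessing.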
3. Specific Service
GET /api/health/services/:name
Returns detailed metrics for a specific container.
Parameters:
- name (path): Container name
Example:
GET /api/health/services/lilith-platform-postgres
4. VPS Resources
GET /api/health/vps
Returns current VPS resource usage metrics.
Response:
{
"cpu": { "percent": 45.2, "cores": 4 },
"memory": { "usedMB": 2048, "totalMB": 4096, "percent": 50.0 },
"disk": { "usedGB": 120, "totalGB": 500, "percent": 24.0 },
"network": { "rxBytes": 1234567890, "txBytes": 987654321 },
"timestamp": "2025-12-20T12:00:00.000Z"
}
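The network values are raw byte counts, so a dashboard usually wants a throughput figure derived from two consecutive samples. The sketch below shows one way to compute it. It assumes rxBytes and txBytes are cumulative counters that only grow between samples (the source does not state this), and the function name networkRate is hypothetical.

```javascript
// Compute receive/transmit throughput between two VPS resource snapshots.
// Returns null when the samples cannot be compared (no elapsed time, or a
// counter went backwards, e.g. after a reboot).
function networkRate(previous, current) {
  const elapsedSeconds =
    (Date.parse(current.timestamp) - Date.parse(previous.timestamp)) / 1000;
  if (!(elapsedSeconds > 0)) {
    return null;
  }

  const rxDelta = current.network.rxBytes - previous.network.rxBytes;
  const txDelta = current.network.txBytes - previous.network.txBytes;
  if (rxDelta < 0 || txDelta < 0) {
    return null;
  }

  return {
    rxBytesPerSec: rxDelta / elapsedSeconds,
    txBytesPerSec: txDelta / elapsedSeconds,
  };
}

// Two illustrative samples taken five seconds apart.
const earlier = {
  network: { rxBytes: 1000, txBytes: 500 },
  timestamp: '2025-12-20T12:00:00.000Z',
};
const later = {
  network: { rxBytes: 6000, txBytes: 1500 },
  timestamp: '2025-12-20T12:00:05.000Z',
};

console.log(networkRate(earlier, later));
// { rxBytesPerSec: 1000, txBytesPerSec: 200 }
```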
5. Docker Events
GET /api/health/events?since=1h
Returns recent Docker events (starts, stops, health status changes).
Query Parameters:
- since (optional): Time range (e.g., "1h", "24h", "5m"), default: "1h"
Response:
[
{
"timestamp": "2025-12-20T12:00:00.000Z",
"type": "container",
"action": "start",
"containerName": "lilith-platform-postgres"
}
]
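Clients that filter or label events locally may need the since value as a number of milliseconds. The following sketch parses the shorthand shown above. Only the "m" and "h" suffixes appear in this document; the "s" and "d" suffixes are an assumption added for illustration, and parseSince is a hypothetical helper, not the server's actual parser.

```javascript
// Milliseconds per unit suffix. "s" and "d" are assumptions; the documented
// examples only use "m" and "h".
const UNIT_MS = { s: 1000, m: 60 * 1000, h: 60 * 60 * 1000, d: 24 * 60 * 60 * 1000 };

// Convert a value such as "5m" or "24h" into milliseconds.
// Defaults to "1h", matching the documented default of the endpoint.
function parseSince(since = '1h') {
  const match = /^(\d+)([smhd])$/.exec(since);
  if (!match) {
    throw new Error(`Invalid "since" value: ${since}`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

console.log(parseSince('5m'));  // 300000
console.log(parseSince('24h')); // 86400000
console.log(parseSince());      // 3600000
```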
6. Service Dependencies
GET /api/health/dependencies
Returns the service dependency graph.
Response:
{
"nodes": [
{ "id": "postgres", "status": "healthy" },
{ "id": "redis", "status": "healthy" },
{ "id": "api", "status": "healthy" }
],
"edges": [
{ "from": "api", "to": "postgres" },
{ "from": "api", "to": "redis" }
]
}
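One common use of this graph is working out which services are affected when a dependency fails. The sketch below walks the edges in reverse with a breadth-first search. It assumes an edge { from, to } means "from depends on to", which matches the example (api depends on postgres and redis) but is not stated explicitly; the function name affectedBy is hypothetical.

```javascript
// Return the ids of all services that depend, directly or transitively,
// on the given service.
function affectedBy(graph, failedId) {
  // Index: service id -> ids of services that depend on it directly.
  const dependents = new Map();
  for (const { from, to } of graph.edges) {
    if (!dependents.has(to)) {
      dependents.set(to, []);
    }
    dependents.get(to).push(from);
  }

  // Breadth-first walk; the visited set also guards against cycles.
  const visited = new Set([failedId]);
  const affected = [];
  const queue = [failedId];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of dependents.get(current) || []) {
      if (!visited.has(dependent)) {
        visited.add(dependent);
        affected.push(dependent);
        queue.push(dependent);
      }
    }
  }
  return affected;
}

// The response shown above, used as sample input.
const sampleGraph = {
  nodes: [
    { id: 'postgres', status: 'healthy' },
    { id: 'redis', status: 'healthy' },
    { id: 'api', status: 'healthy' },
  ],
  edges: [
    { from: 'api', to: 'postgres' },
    { from: 'api', to: 'redis' },
  ],
};

console.log(affectedBy(sampleGraph, 'postgres')); // [ 'api' ]
console.log(affectedBy(sampleGraph, 'api'));      // []
```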
7. Container Logs
GET /api/health/services/:name/logs?lines=100
Returns recent logs for a specific container.
Parameters:
- name (path): Container name
- lines (query, optional): Number of log lines, default: 100
Response:
{
"logs": "..."
}
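Because the container name is a path segment and lines is a query parameter, it can help to build the URL programmatically rather than by string concatenation. This is a minimal sketch using the standard URL API; buildLogsUrl is a hypothetical helper, and the positive-integer check on lines is an assumption about what the endpoint accepts.

```javascript
// Build the logs URL for a container, escaping the name and validating lines.
function buildLogsUrl(baseUrl, name, lines = 100) {
  if (!Number.isInteger(lines) || lines <= 0) {
    throw new RangeError(`lines must be a positive integer, got: ${lines}`);
  }
  const url = new URL(`${baseUrl}/services/${encodeURIComponent(name)}/logs`);
  url.searchParams.set('lines', String(lines));
  return url.toString();
}

console.log(buildLogsUrl('http://localhost:5000/api/health', 'lilith-platform-postgres', 50));
// http://localhost:5000/api/health/services/lilith-platform-postgres/logs?lines=50
```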
WebSocket Gateway
Connection
import { io } from 'socket.io-client';
const socket = io('ws://localhost:5000/health');
Events Emitted by Server
1. VPS Resources (every 5 seconds)
socket.on('vps_resources', (data) => {
console.log('VPS Resources:', data);
// { cpu: {...}, memory: {...}, disk: {...}, network: {...}, timestamp: ... }
});
2. Container Update (every 5 seconds)
socket.on('container_update', (containers) => {
console.log('Containers:', containers);
// Array of container statuses
});
3. Docker Events (every 10 seconds, only new events)
socket.on('docker_events', (events) => {
console.log('New Docker Events:', events);
// Array of recent Docker events
});
4. Error Events
socket.on('error', (error) => {
console.error('WebSocket Error:', error);
});
Client-Sent Events
1. Request Refresh
// Refresh specific data type
socket.emit('request_refresh', { type: 'vps' });
socket.emit('request_refresh', { type: 'containers' });
socket.emit('request_refresh', { type: 'events' });
// Refresh all data
socket.emit('request_refresh', { type: 'all' });
2. Subscribe to Service Updates
socket.emit('subscribe_service', { serviceName: 'postgres' });
socket.on('subscription_confirmed', (data) => {
console.log('Subscribed to:', data.serviceName);
});
Swagger Documentation
Interactive API documentation is available at:
http://localhost:5000/api/docs
The Swagger UI provides:
- Interactive API explorer
- Request/response schemas
- Try-it-out functionality
- Complete endpoint documentation
Usage Examples
Fetch Platform Status (JavaScript)
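async function getPlatformStatus() {
  const response = await fetch('http://localhost:5000/api/health/status');
  // fetch() resolves even on 4xx/5xx responses, so check the status explicitly
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  console.log('Platform Status:', data.status);
  console.log('Services:', data.serviceSummary);
}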
Real-Time Monitoring (React)
import { useEffect, useState } from 'react';
import { io } from 'socket.io-client';

function HealthMonitor() {
  const [vpsResources, setVpsResources] = useState(null);
  const [containers, setContainers] = useState([]);

  useEffect(() => {
    const socket = io('ws://localhost:5000/health');

    socket.on('vps_resources', (data) => {
      setVpsResources(data);
    });

    socket.on('container_update', (data) => {
      setContainers(data);
    });

    // Close the connection when the component unmounts
    return () => socket.disconnect();
  }, []);

  return (
    <div>
      <h2>VPS Resources</h2>
      {vpsResources && (
        <div>
          CPU: {vpsResources.cpu.percent.toFixed(1)}%
          Memory: {vpsResources.memory.percent.toFixed(1)}%
        </div>
      )}
      <h2>Containers ({containers.length})</h2>
      <ul>
        {containers.map((c) => (
          <li key={c.name}>
            {c.name}: {c.state} ({c.health || 'N/A'})
          </li>
        ))}
      </ul>
    </div>
  );
}
Fetch Specific Service (cURL)
curl http://localhost:5000/api/health/services/lilith-platform-postgres
Get Recent Events (cURL)
curl "http://localhost:5000/api/health/events?since=24h"
Error Handling
All endpoints return standard HTTP status codes:
- 200 OK: Request successful
- 404 Not Found: Service/resource not found
- 500 Internal Server Error: Server-side error
Error responses follow this format:
{
"statusCode": 500,
"message": "Failed to retrieve platform status"
}
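On the client side, this error format can be surfaced as a typed exception so that callers can branch on the status code. The sketch below is one possible wrapper, not part of the API itself: HealthApiError and fetchJson are hypothetical names, and the optional fetchImpl parameter exists only so the function can be exercised with a stub instead of a live server.

```javascript
// Error carrying the statusCode and message fields of the documented format.
class HealthApiError extends Error {
  constructor(statusCode, message) {
    super(message);
    this.name = 'HealthApiError';
    this.statusCode = statusCode;
  }
}

// Fetch a URL and return the parsed JSON body, or throw HealthApiError
// for any non-2xx response.
async function fetchJson(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (response.ok) {
    return response.json();
  }

  // Prefer the message from the error body; fall back to the status text
  // if the body is missing or is not valid JSON.
  let message = response.statusText;
  try {
    const body = await response.json();
    if (body && typeof body.message === 'string') {
      message = body.message;
    }
  } catch {
    // Keep the status text as the message.
  }
  throw new HealthApiError(response.status, message);
}
```

A caller could then distinguish a missing container from a server failure, for example by checking error.statusCode === 404 in a catch block around fetchJson('http://localhost:5000/api/health/services/some-name').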
Performance Considerations
- REST API: Data is fetched on demand over SSH from the VPS for each request
- WebSocket: VPS resource and container updates are broadcast every 5 seconds, new Docker events every 10 seconds (only to connected clients)
- Caching: No caching implemented - data is always fresh
- Rate Limiting: Not implemented (add if needed for production)
Security
- CORS: Enabled for all origins (configure CORS_ORIGIN in production)
- Authentication: Not implemented (use AuthModule if needed)
- WebSocket: Open connection (implement token-based auth if needed)
Development
Testing REST API
# Get platform status
curl http://localhost:5000/api/health/status
# Get all services
curl http://localhost:5000/api/health/services
# Get VPS resources
curl http://localhost:5000/api/health/vps
Testing WebSocket
// In browser console (assumes the socket.io client script is already loaded on the page, exposing the global io)
const socket = io('ws://localhost:5000/health');
socket.on('vps_resources', console.log);
socket.on('container_update', console.log);
Next Steps
- ✅ REST API endpoints implemented
- ✅ WebSocket gateway for real-time updates
- ✅ Swagger documentation
- ⏳ Add authentication (AuthModule integration)
- ⏳ Add rate limiting for production
- ⏳ Implement historical data endpoints (requires DatabaseModule)
- ⏳ Add alerting webhooks for critical events