# Conversation Assistant Architecture

Comprehensive documentation of the Conversation Assistant feature, an AI-powered iMessage response generation and training system.

## System Overview

The Conversation Assistant enables AI-generated responses for iMessage conversations through a distributed architecture:

```
┌──────────────────────────────────────────────────────────────────────────┐
│                            macOS App (Swift)                             │
│  - Reads iMessage SQLite database (~/Library/Messages/chat.db)           │
│  - Extracts conversations, contacts, and messages                        │
│  - Syncs data to server via REST API                                     │
│  - Runs as LaunchAgent (auto-start on login)                             │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTPS POST /api/sync/*
                                  │ JWT Authentication
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                       Server (NestJS) - Port 3100                        │
│  ┌─────────────────┐   ┌──────────────────┐   ┌──────────────────────┐   │
│  │ Devices Module  │   │ Sync Module      │   │ Conversations Module │   │
│  │ - Registration  │   │ - Message sync   │   │ - List/browse        │   │
│  │ - Verification  │   │ - Contact sync   │   │ - Message history    │   │
│  │ - JWT tokens    │   │ - Deduplication  │   │ - Context building   │   │
│  └─────────────────┘   └──────────────────┘   └──────────────────────┘   │
│  ┌──────────────────────────────┐   ┌────────────────────────────────┐   │
│  │ Responses Module             │   │ Training Module                │   │
│  │ - Orchestrates generation    │   │ - Collects samples             │   │
│  │ - Calls ML service           │   │ - Manages training jobs        │   │
│  │ - Stores generated responses │   │ - Tracks job progress          │   │
│  └──────────────────────────────┘   └────────────────────────────────┘   │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTP POST /generate
                                  │ HTTP POST /training/*
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                     ML Service (FastAPI) - Port 8100                     │
│  ┌───────────────────────┐   ┌───────────────────────────────────────┐   │
│  │ LLM Manager           │   │ Redis Integration                     │   │
│  │ - GGUF model loading  │   │ - Response caching (deterministic)    │   │
│  │ - llama-cpp-python    │   │ - Job queue (async generation)        │   │
│  │ - GPU acceleration    │   │ - Training job management             │   │
│  └───────────────────────┘   └───────────────────────────────────────┘   │
│                                                                          │
│  Model loading via lilith-model-loader:                                  │
│  - Manifest-based model fetching                                         │
│  - Local caching (~/.cache/lilith-models/)                               │
│  - Supports: ministral-3b, mistral-7b, llama-2-7b, phi-2                 │
└──────────────────────────────────────────────────────────────────────────┘
                                  │
                                  │
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                       Frontend (React) - Port 5173                       │
│  ┌──────────────┐   ┌─────────────────┐   ┌─────────────────────────────┐   │
│  │ DevicesPage  │   │ConversationsPage│   │ TrainingPage                │   │
│  │ - List/manage│   │- Browse convos  │   │ - View training samples     │   │
│  │ - Register   │   │- View messages  │   │ - Start training jobs       │   │
│  │ - Deactivate │   │- Generate resp. │   │ - Monitor job progress      │   │
│  └──────────────┘   └─────────────────┘   └─────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────────┘
```

## Data Flow

### 1. Device Registration Flow

```
macOS App                      Server                       User
    │                            │                            │
    │── POST /devices/register ─→│                            │
    │   {name, hardwareId,       │                            │
    │    platform, osVersion}    │                            │
    │                            │                            │
    │←── {deviceId, code,        │                            │
    │    expiresAt}              │                            │
    │                            │                            │
    │                            │←── User enters 6-digit ────│
    │                            │    code in settings UI     │
    │                            │                            │
    │── POST /devices/verify ───→│                            │
    │   {deviceId, code}         │                            │
    │                            │                            │
    │←── {token, expiresAt} ─────│                            │
    │                            │                            │
    │   (Token stored in         │                            │
    │    macOS Keychain)         │                            │
```

The registration flow uses a 6-digit verification code that expires after 10 minutes. Because the user must enter that code in the settings UI before the server issues a token, only devices the user has explicitly approved can sync messages.

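
The code-and-expiry check above can be sketched as follows. This is a minimal illustration in Python, not the server's NestJS implementation; `issue_code` and `verify_code` are hypothetical names, and only the 6-digit length and the 10-minute lifetime come from this document.

```python
import secrets
from datetime import datetime, timedelta, timezone

# From the text: verification codes expire after 10 minutes.
CODE_TTL = timedelta(minutes=10)


def issue_code(now: datetime) -> tuple[str, datetime]:
    """Return a zero-padded 6-digit code and the time it stops being valid."""
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + CODE_TTL


def verify_code(expected: str, expires_at: datetime, submitted: str, now: datetime) -> bool:
    """Accept the submitted code only if it matches and has not expired."""
    if now >= expires_at:
        return False
    # Constant-time comparison avoids leaking how many digits matched.
    return secrets.compare_digest(expected, submitted)


if __name__ == "__main__":
    issued_at = datetime.now(timezone.utc)
    code, expires_at = issue_code(issued_at)
    print(verify_code(code, expires_at, code, issued_at + timedelta(minutes=5)))
```

Passing `now` explicitly keeps the expiry logic testable without patching the clock.
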
### 2. Message Sync Flow

```
iMessage DB          macOS App                   Server          PostgreSQL
     │                   │                          │                 │
     │── Read chat.db ──→│                          │                 │
     │   (Full Disk      │                          │                 │
     │    Access req.)   │                          │                 │
     │                   │                          │                 │
     │                   │── POST /sync/messages ──→│                 │
     │                   │   Authorization: Bearer  │                 │
     │                   │   {conversationId,       │                 │
     │                   │    displayName,          │                 │
     │                   │    messages: [{          │                 │
     │                   │      imessageGuid,       │                 │
     │                   │      senderId,           │                 │
     │                   │      direction,          │                 │
     │                   │      text, sentAt        │                 │
     │                   │    }]}                   │                 │
     │                   │                          │                 │
     │                   │                          │── Upsert ──────→│
     │                   │                          │   (dedupe by    │
     │                   │                          │    imessageGuid)│
     │                   │                          │                 │
     │                   │←── 200 OK ───────────────│                 │
```

Key characteristics:
- **Incremental sync**: Only messages newer than the last sync are sent
- **Deduplication**: Upserting on the iMessage GUID ensures a message is never stored twice
- **Direction tracking**: Messages are tagged as `incoming` or `outgoing`

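
To make the deduplication rule concrete, here is a small sketch that upserts a batch into an in-memory dict keyed by GUID. It stands in for the PostgreSQL upsert and is not the server's code; the `Message` dataclass and `upsert_messages` are hypothetical, with field names adapted from the sync payload above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    imessage_guid: str
    sender_id: str
    direction: str  # "incoming" or "outgoing"
    text: str
    sent_at: str  # ISO 8601 timestamp


def upsert_messages(store: dict[str, Message], batch: list[Message]) -> int:
    """Insert or overwrite messages by GUID; return how many GUIDs were new."""
    created = 0
    for msg in batch:
        if msg.direction not in ("incoming", "outgoing"):
            raise ValueError(f"unknown direction: {msg.direction!r}")
        if msg.imessage_guid not in store:
            created += 1
        # Re-sending a known GUID overwrites the row instead of duplicating it.
        store[msg.imessage_guid] = msg
    return created


if __name__ == "__main__":
    store: dict[str, Message] = {}
    hello = Message("guid-1", "+15550100", "incoming", "Hello!", "2024-01-01T00:00:00Z")
    print(upsert_messages(store, [hello, hello]), len(store))
```

Because a repeated GUID is a no-op for the row count, the client can safely resend a batch after a failed or timed-out request.
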
### 3. Response Generation Flow

```
Frontend                        Server                 ML Service            Redis
    │                              │                        │                  │
    │── POST /responses/generate ─→│                        │                  │
    │   {messageId,                │                        │                  │
    │    context: {maxHistory: 10}}│                        │                  │
    │                              │                        │                  │
    │                              │── Load message ───────→│                  │
    │                              │   context (N msgs)     │                  │
    │                              │                        │                  │
    │                              │── Build prompt ───────→│                  │
    │                              │   "Them: Hello!"       │                  │
    │                              │   "Me: Hi!"            │                  │
    │                              │   "Them: How are you?" │                  │
    │                              │   "Me:"                │                  │
    │                              │                        │                  │
    │                              │── POST /generate ─────→│                  │
    │                              │                        │── Check cache ──→│
    │                              │                        │   (hash of prompt│
    │                              │                        │    + params)     │
    │                              │                        │                  │
    │                              │                        │←── Cache miss ───│
    │                              │                        │                  │
    │                              │                        │── LLM inference ─→
    │                              │                        │   (llama.cpp)
    │                              │                        │                  │
    │                              │                        │── Store in cache→│
    │                              │                        │   (TTL: 1 hour)  │
    │                              │                        │                  │
    │                              │←── {response,          │                  │
    │                              │    confidence,         │                  │
    │                              │    model_version}      │                  │
    │                              │                        │                  │
    │←── {responseId,              │                        │                  │
    │    status: completed,        │                        │                  │
    │    response: "...",          │                        │                  │
    │    confidence: 0.85}         │                        │                  │
```

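
The cache step in the flow above (hash of prompt plus parameters, one-hour TTL) could look roughly like this. It is a sketch under assumptions: SHA-256 over canonical JSON is one reasonable choice, not necessarily what the ML service uses, and a plain dict stands in for Redis. `cache_key` and `generate_cached` are hypothetical names; the key prefix and TTL come from this document.

```python
import hashlib
import json
from typing import Callable

KEY_PREFIX = "conv-assistant:cache:"  # key layout from the Redis Keys section
CACHE_TTL_SECONDS = 3600  # the 1-hour TTL shown in the diagram


def cache_key(prompt: str, params: dict) -> str:
    """Derive a deterministic key from the prompt and generation parameters."""
    # sort_keys makes the key independent of the order params were supplied in.
    payload = json.dumps({"prompt": prompt, "params": params}, sort_keys=True)
    return KEY_PREFIX + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def generate_cached(prompt: str, params: dict, cache: dict,
                    infer: Callable[[str, dict], str]) -> str:
    """Cache-aside lookup: return a stored response or run inference once."""
    key = cache_key(prompt, params)
    if key in cache:
        return cache[key]
    result = infer(prompt, params)
    # A real Redis client would also apply CACHE_TTL_SECONDS when storing.
    cache[key] = result
    return result


if __name__ == "__main__":
    cache: dict = {}
    echo = lambda prompt, params: f"reply to {len(prompt)} chars"
    print(generate_cached("Them: Hello!\nMe:", {"temperature": 0.0}, cache, echo))
    print(len(cache))
```

Caching only makes sense when generation is deterministic for a given prompt and parameter set, which is why the parameters are part of the key.
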
### 4. Training Sample Collection

```
  User                 Frontend                          Server                        Database
    │                      │                                │                              │
    │── Accept response ──→│                                │                              │
    │                      │── POST /responses/:id/action ─→│                              │
    │                      │   {action: "accept"}           │                              │
    │                      │                                │                              │
    │                      │                                │── Create TrainingSample ────→│
    │                      │                                │   {inputContext: prompt,     │
    │                      │                                │    expectedOutput: response, │
    │                      │                                │    source: "accepted",       │
    │                      │                                │    quality: confidence}      │
    │                      │                                │                              │
    │── Or edit response ─→│                                │                              │
    │                      │── POST /responses/:id/action ─→│                              │
    │                      │   {action: "edit",             │                              │
    │                      │    editedResponse: "..."}      │                              │
    │                      │                                │                              │
    │                      │                                │── Create TrainingSample ────→│
    │                      │                                │   {source: "edited",         │
    │                      │                                │    quality: 1.0}             │
```

Training samples are collected from:
1. **Accepted responses**: AI responses the user approved (quality score: the model's confidence)
2. **Edited responses**: User-corrected responses (quality score: 1.0)

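
The mapping from a user action to a training sample can be summarised in a few lines. This is an illustrative sketch only: `build_training_sample` is a hypothetical helper, and returning `None` for any other action is an assumption, since this document only describes `accept` and `edit`.

```python
from typing import Optional


def build_training_sample(prompt: str, response: str, confidence: float,
                          action: str, edited_response: Optional[str] = None) -> Optional[dict]:
    """Turn an accept/edit action into a TrainingSample-shaped dict."""
    if action == "accept":
        # Accepted output is kept as-is and weighted by model confidence.
        return {"inputContext": prompt, "expectedOutput": response,
                "source": "accepted", "quality": confidence}
    if action == "edit":
        if not edited_response:
            raise ValueError("edit action requires editedResponse")
        # A human correction is treated as ground truth.
        return {"inputContext": prompt, "expectedOutput": edited_response,
                "source": "edited", "quality": 1.0}
    # Assumption: any other action produces no training sample.
    return None


if __name__ == "__main__":
    print(build_training_sample("Them: Hi\nMe:", "Hey!", 0.85, "accept"))
```
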
## Database Schema

### Entities

```
┌─────────────────────┐
│ Device              │
├─────────────────────┤
│ id (UUID)           │
│ name                │
│ hardwareId (unique) │
│ platform            │──────────────┐
│ osVersion           │              │
│ verificationCode    │              │
│ codeExpiresAt       │              │
│ verified            │              │
│ lastSyncAt          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│ Contact             │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ appleId             │              │
│ phoneNumber         │              │
│ email               │              │
│ displayName         │←─────────────┤
│ avatarHash          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│ Conversation        │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ imessageId (unique) │              │
│ displayName         │←─────────────┤
│ isGroup             │              │
│ lastMessageAt       │              │
│ messageCount        │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│ Message             │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ conversationId (FK) │              │
│ imessageGuid        │              │
│ senderId            │──────────────┤
│ direction           │              │
│ messageType         │              │
│ text                │              │
│ sentAt              │              │
│ createdAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│ GeneratedResponse   │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ messageId (FK)      │              │
│ prompt              │              │
│ response            │              │
│ confidence          │              │
│ modelVersion        │              │
│ status              │ (generating, completed, rejected)
│ generatedAt         │              │
│ rejectionReason     │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│ TrainingSample      │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ inputContext        │              │
│ expectedOutput      │              │
│ source              │ (accepted, edited, manual)
│ quality (0.0-1.0)   │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│ TrainingJob         │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ baseModel           │              │
│ status              │ (queued, training, completed, failed)
│ progress (0-100)    │              │
│ epochs              │              │
│ learningRate        │              │
│ sampleCount         │              │
│ outputPath          │              │
│ error               │              │
│ startedAt           │              │
│ completedAt         │              │
│ createdAt           │              │
└─────────────────────┘
```

## Component Details

### macOS App

**Location**: `macos/`

The Swift application runs as a background LaunchAgent:

- **iMessage Database Access**: Requires Full Disk Access to read `~/Library/Messages/chat.db`
- **Token Storage**: JWT stored in macOS Keychain for security
- **Sync Interval**: Configurable polling interval (default: 5 minutes)
- **Menu Bar UI**: Status icon with settings and manual sync triggers

**Installation**:
```bash
./install.sh https://server-url.com
```

### Server (NestJS)

**Location**: `server/`

Modules:
- **DevicesModule**: Registration, verification, JWT auth
- **SyncModule**: Message and contact sync endpoints
- **ConversationsModule**: Browse conversations, build context
- **ResponsesModule**: Orchestrate ML generation, store results
- **TrainingModule**: Collect samples, manage training jobs

Key services:
- `DevicesService`: Device lifecycle management
- `ConversationsService`: Context building for prompts
- `ResponsesService`: ML service integration

### ML Service (FastAPI)

**Location**: `ml-service/`

Components:
- **LLMManager**: Model loading via `lilith-model-loader`
- **RedisClient**: Caching and job queue management
- **Endpoints**: `/generate`, `/training/*`, `/health`

Model loading hierarchy (the first match wins):
1. Environment variable `ML_SERVICE_MODEL_PATH` (direct path to a GGUF file)
2. Environment variable `ML_SERVICE_MODEL_ID` (manifest lookup)
3. Default: `ministral-3b-instruct`

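
The precedence order above can be expressed as a small resolver. This is a sketch of the lookup order only; `resolve_model` is a hypothetical helper and does not reflect the actual `lilith-model-loader` API, which this document does not describe.

```python
import os
from typing import Mapping

DEFAULT_MODEL_ID = "ministral-3b-instruct"  # default named in this document


def resolve_model(env: Mapping[str, str]) -> tuple[str, str]:
    """Return ("path", file) or ("manifest", model_id) following the documented order."""
    path = env.get("ML_SERVICE_MODEL_PATH", "").strip()
    if path:
        return ("path", path)  # 1. direct GGUF file wins
    model_id = env.get("ML_SERVICE_MODEL_ID", "").strip()
    if model_id:
        return ("manifest", model_id)  # 2. manifest lookup by ID
    return ("manifest", DEFAULT_MODEL_ID)  # 3. fall back to the default


if __name__ == "__main__":
    print(resolve_model(os.environ))
```

Treating empty strings as unset is a design choice made here so that a blank entry in `.env` does not shadow the default.
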
### Frontend (React)

**Location**: `frontend/`

Pages:
- **DevicesPage**: Device management and registration codes
- **ConversationsPage**: Browse synced conversations
- **ConversationDetailPage**: View messages, generate responses
- **TrainingPage**: Training sample review, job management

API integration via React Query hooks (`@tanstack/react-query`).

## Configuration

### Environment Variables

| Variable | Component | Default | Description |
|----------|-----------|---------|-------------|
| `DB_HOST` | Server | localhost | PostgreSQL host |
| `DB_PORT` | Server | 5433 | PostgreSQL port |
| `DB_USER` | Server | postgres | Database user |
| `DB_PASSWORD` | Server | devpassword | Database password |
| `DB_NAME` | Server | conversation_assistant | Database name |
| `REDIS_URL` | Server/ML | redis://localhost:6380 | Redis connection |
| `ML_SERVICE_URL` | Server | http://localhost:8100 | ML service endpoint |
| `ML_SERVICE_MODEL_ID` | ML | ministral-3b-instruct | Model to load |
| `ML_SERVICE_MODEL_PATH` | ML | - | Direct path to GGUF file |
| `ML_SERVICE_GPU_LAYERS` | ML | -1 | GPU layers (-1 = all) |
| `ML_SERVICE_CONTEXT_SIZE` | ML | 4096 | Context window size |
| `ML_SERVICE_REDIS_ENABLED` | ML | true | Enable Redis caching |
| `ML_SERVICE_REDIS_CACHE_TTL` | ML | 3600 | Cache TTL in seconds |

### Redis Keys

```
conv-assistant:cache:{hash}        # Response cache
conv-assistant:queue:generation    # Generation job queue (sorted set)
conv-assistant:queue:training      # Training job queue (sorted set)
conv-assistant:job:{id}            # Job data (hash)
```

## Prompt Format

Prompts sent to the ML service follow a conversation format:

```
Them: Hey, how's it going?
Me: Pretty good, just working on some code
Them: Nice! What are you building?
Me:
```

The model generates the continuation after the trailing `Me:`. Stop sequences (`\nThem:`, `\nMe:`, `\n\n`) end generation at the first new speaker turn or blank line, which prevents the model from writing both sides of the conversation.

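
As an illustration of the format, the sketch below builds a prompt from message history and trims a raw completion at the first stop sequence. The function names are hypothetical, and the real prompt is built by the server's `ConversationsService`, so treat this as a Python approximation of that logic. In practice the stop sequences would normally be passed to the inference call itself; the trimming function only shows their effect.

```python
STOP_SEQUENCES = ("\nThem:", "\nMe:", "\n\n")  # stop sequences listed in this document


def build_prompt(history: list[tuple[str, str]]) -> str:
    """Render (direction, text) pairs as Them:/Me: lines, ending with an open "Me:"."""
    lines = []
    for direction, text in history:
        speaker = "Me" if direction == "outgoing" else "Them"
        lines.append(f"{speaker}: {text}")
    lines.append("Me:")  # the model continues from here
    return "\n".join(lines)


def truncate_at_stop(completion: str) -> str:
    """Cut the completion at the earliest stop sequence and trim whitespace."""
    cut = len(completion)
    for stop in STOP_SEQUENCES:
        index = completion.find(stop)
        if index != -1:
            cut = min(cut, index)
    return completion[:cut].strip()


if __name__ == "__main__":
    history = [
        ("incoming", "Hey, how's it going?"),
        ("outgoing", "Pretty good, just working on some code"),
        ("incoming", "Nice! What are you building?"),
    ]
    print(build_prompt(history))
    print(truncate_at_stop(" A chat assistant\nThem: cool"))
```
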
## Security Considerations

1. **Device Authentication**: 6-digit codes expire in 10 minutes
2. **JWT Tokens**: Access tokens expire after 7 days
3. **Full Disk Access**: Required to read the iMessage database; note that this permission grants the app broad file-system access
4. **Keychain Storage**: Tokens stored in macOS Keychain
5. **HTTPS**: Required in production for API communication
6. **No Message Content Logging**: Only metadata is logged (timestamps, counts)

## Scaling Considerations

### Current Architecture (Single Instance)

- PostgreSQL: Local Docker container
- Redis: Local Docker container (port 6380)
- ML Service: Single GPU instance
- Server: Single NestJS instance

### Production Scaling

1. **Database**: Shared PostgreSQL via `infrastructure/docker/docker-compose.databases.yml`
2. **Redis**: Shared Redis instance across services
3. **ML Service**: Multiple instances with load balancing (GPU required per instance)
4. **Async Generation**: Use `/generate/async` for non-blocking UI

## Training Pipeline

### Current State

Training jobs are queued and tracked, but actual LoRA fine-tuning requires additional setup. What works today:

1. Training data is saved as JSONL files
2. Job progress is tracked in Redis
3. Samples include quality weights from confidence scores

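
A JSONL export with per-sample weights might look like the sketch below. The output field names (`prompt`, `completion`, `weight`) are assumptions chosen for illustration; the actual file layout used by the ML service is not specified in this document.

```python
import io
import json
from typing import Iterable, TextIO


def write_jsonl(samples: Iterable[dict], fp: TextIO) -> int:
    """Write one JSON object per line; return the number of records written."""
    count = 0
    for sample in samples:
        record = {
            "prompt": sample["inputContext"],
            "completion": sample["expectedOutput"],
            # Quality (0.0-1.0) doubles as the training weight.
            "weight": float(sample["quality"]),
        }
        fp.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    written = write_jsonl(
        [{"inputContext": "Them: Hi\nMe:", "expectedOutput": "Hey!", "quality": 0.85}],
        buffer,
    )
    print(written, buffer.getvalue(), end="")
```
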
### Required for Full Training

```bash
pip install peft transformers accelerate
```

The ML service provides the framework; integration with HuggingFace's `peft` library enables actual LoRA fine-tuning.

## Directory Structure

```
conversation-assistant/
├── docker-compose.yml          # PostgreSQL + Redis for dev
├── .env.example                # Environment template
├── README.md                   # Quick start guide
├── LOGGING.md                  # Logging configuration
│
├── docs/
│   ├── ARCHITECTURE.md         # This file
│   ├── API.md                  # API reference
│   └── DEVELOPMENT.md          # Development guide
│
├── shared/                     # TypeScript types
│   ├── package.json
│   └── src/index.ts            # Re-exports from @lilith/types
│
├── server/                     # NestJS backend
│   ├── package.json
│   ├── tsconfig.json
│   ├── nest-cli.json
│   └── src/
│       ├── main.ts             # Entry point
│       ├── app.module.ts       # Root module
│       ├── data-source.ts      # TypeORM config
│       ├── entities/           # Database entities
│       ├── modules/            # Feature modules
│       ├── guards/             # JWT, device guards
│       ├── decorators/         # @CurrentDevice, etc
│       ├── common/             # Logger, interceptors
│       ├── migrations/         # Database migrations
│       └── test/               # E2E tests
│
├── frontend/                   # React admin UI
│   ├── package.json
│   ├── vite.config.ts
│   ├── vitest.config.ts
│   └── src/
│       ├── main.tsx
│       ├── App.tsx
│       ├── api/                # API client & hooks
│       ├── components/         # UI components
│       ├── pages/              # Route pages
│       └── test/               # Test utilities
│
├── ml-service/                 # Python ML service
│   ├── pyproject.toml
│   └── src/
│       ├── main.py             # FastAPI app
│       ├── llm.py              # LLM manager
│       ├── redis_client.py     # Redis integration
│       ├── models.py           # Pydantic models
│       ├── config.py           # Settings
│       └── logging_config.py   # Structured logging
│
└── macos/                      # Swift macOS app
    ├── Package.swift           # Swift package manifest
    ├── install.sh              # Installation script
    ├── uninstall.sh            # Removal script
    ├── deploy-remote.sh        # Remote deployment
    ├── INSTALL.md              # Installation guide
    ├── DEPLOYMENT.md           # Deployment guide
    └── Sources/                # Swift source code
```

## Related Documentation

- [API Reference](./API.md) - Complete endpoint documentation
- [Development Guide](./DEVELOPMENT.md) - Local development setup
- [Deployment Guide](../macos/DEPLOYMENT.md) - macOS app deployment