# Conversation Assistant Architecture

Comprehensive documentation of the Conversation Assistant feature - an AI-powered iMessage response generation and training system.

## System Overview

The Conversation Assistant enables AI-generated responses for iMessage conversations through a distributed architecture:

```
┌──────────────────────────────────────────────────────────────────────────┐
│                            macOS App (Swift)                             │
│  - Reads iMessage SQLite database (~/Library/Messages/chat.db)           │
│  - Extracts conversations, contacts, and messages                        │
│  - Syncs data to server via REST API                                     │
│  - Runs as LaunchAgent (auto-start on login)                             │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTPS POST /api/sync/*
                                  │ JWT Authentication
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                       Server (NestJS) - Port 3100                        │
│  ┌─────────────────┐   ┌──────────────────┐   ┌──────────────────────┐   │
│  │ Devices Module  │   │ Sync Module      │   │ Conversations Module │   │
│  │ - Registration  │   │ - Message sync   │   │ - List/browse        │   │
│  │ - Verification  │   │ - Contact sync   │   │ - Message history    │   │
│  │ - JWT tokens    │   │ - Deduplication  │   │ - Context building   │   │
│  └─────────────────┘   └──────────────────┘   └──────────────────────┘   │
│  ┌──────────────────────────────┐   ┌────────────────────────────────┐   │
│  │ Responses Module             │   │ Training Module                │   │
│  │ - Orchestrates generation    │   │ - Collects samples             │   │
│  │ - Calls ML service           │   │ - Manages training jobs        │   │
│  │ - Stores generated responses │   │ - Tracks job progress          │   │
│  └──────────────────────────────┘   └────────────────────────────────┘   │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTP POST /generate
                                  │ HTTP POST /training/*
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                     ML Service (FastAPI) - Port 8100                     │
│  ┌───────────────────────┐   ┌───────────────────────────────────────┐   │
│  │ LLM Manager           │   │ Redis Integration                     │   │
│  │ - GGUF model loading  │   │ - Response caching (deterministic)    │   │
│  │ - llama-cpp-python    │   │ - Job queue (async generation)        │   │
│  │ - GPU acceleration    │   │ - Training job management             │   │
│  └───────────────────────┘   └───────────────────────────────────────┘   │
│                                                                          │
│  Model loading via lilith-model-loader:                                  │
│  - Manifest-based model fetching                                         │
│  - Local caching (~/.cache/lilith-models/)                               │
│  - Supports: ministral-3b, mistral-7b, llama-2-7b, phi-2                 │
└──────────────────────────────────────────────────────────────────────────┘
                                  │
                                  │
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                       Frontend (React) - Port 5173                       │
│  ┌──────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │ DevicesPage  │  │ConversationsPage│  │ TrainingPage                │  │
│  │ - List/manage│  │- Browse convos  │  │ - View training samples     │  │
│  │ - Register   │  │- View messages  │  │ - Start training jobs       │  │
│  │ - Deactivate │  │- Generate resp. │  │ - Monitor job progress      │  │
│  └──────────────┘  └─────────────────┘  └─────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘
```

## Data Flow

### 1. Device Registration Flow

```
macOS App                      Server                      User
    │                            │                          │
    │── POST /devices/register ─→│                          │
    │   {name, hardwareId,       │                          │
    │    platform, osVersion}    │                          │
    │                            │                          │
    │←── {deviceId, code,        │                          │
    │     expiresAt}             │                          │
    │                            │                          │
    │                            │←── User enters 6-digit ──│
    │                            │    code in settings UI   │
    │                            │                          │
    │── POST /devices/verify ───→│                          │
    │   {deviceId, code}         │                          │
    │                            │                          │
    │←── {token, expiresAt} ─────│                          │
    │                            │                          │
    │   (Token stored in         │                          │
    │    macOS Keychain)         │                          │
```

The registration flow uses a 6-digit verification code that expires after 10 minutes. This ensures only authorized devices can sync messages.

### 2. Message Sync Flow

```
iMessage DB         macOS App                   Server          PostgreSQL
    │                   │                          │                 │
    │── Read chat.db ──→│                          │                 │
    │   (Full Disk      │                          │                 │
    │    Access req.)   │                          │                 │
    │                   │                          │                 │
    │                   │── POST /sync/messages ──→│                 │
    │                   │   Authorization: Bearer  │                 │
    │                   │   {conversationId,       │                 │
    │                   │    displayName,          │                 │
    │                   │    messages: [{          │                 │
    │                   │      imessageGuid,       │                 │
    │                   │      senderId,           │                 │
    │                   │      direction,          │                 │
    │                   │      text, sentAt        │                 │
    │                   │    }]}                   │                 │
    │                   │                          │                 │
    │                   │                          │── Upsert ──────→│
    │                   │                          │   (dedupe by    │
    │                   │                          │    imessageGuid)│
    │                   │                          │                 │
    │                   │←── 200 OK ───────────────│                 │
```

Key characteristics:
- **Incremental sync**: Only new messages since last sync are sent
- **Deduplication**: iMessage GUIDs ensure no duplicate messages
- **Direction tracking**: Messages tagged as `incoming` or `outgoing`

### 3. Response Generation Flow

```
Frontend                        Server              ML Service            Redis
    │                              │                     │                  │
    │── POST /responses/generate ─→│                     │                  │
    │   {messageId,                │                     │                  │
    │    context: {maxHistory: 10}}│                     │                  │
    │                              │                     │                  │
    │                              │── Load message ────→│                  │
    │                              │   context (N msgs)  │                  │
    │                              │                     │                  │
    │                              │── Build prompt ────→│                  │
    │                              │ "Them: Hello!"      │                  │
    │                              │ "Me: Hi!"           │                  │
    │                              │ "Them: How are you?"│                  │
    │                              │ "Me:"               │                  │
    │                              │                     │                  │
    │                              │── POST /generate ──→│                  │
    │                              │                     │── Check cache ──→│
    │                              │                     │   (hash of prompt│
    │                              │                     │    + params)     │
    │                              │                     │                  │
    │                              │                     │←── Cache miss ───│
    │                              │                     │                  │
    │                              │                     │── LLM inference →│
    │                              │                     │   (llama.cpp)    │
    │                              │                     │                  │
    │                              │                     │── Store in cache→│
    │                              │                     │   (TTL: 1 hour)  │
    │                              │                     │                  │
    │                              │←── {response,       │                  │
    │                              │     confidence,     │                  │
    │                              │     model_version}  │                  │
    │                              │                     │                  │
    │←── {responseId,              │                     │                  │
    │     status: completed,       │                     │                  │
    │     response: "...",         │                     │                  │
    │     confidence: 0.85}        │                     │                  │
```

### 4. Training Sample Collection

```
  User                 Frontend                          Server                       Database
    │                      │                                │                             │
    │── Accept response ──→│                                │                             │
    │                      │── POST /responses/:id/action ─→│                             │
    │                      │   {action: "accept"}           │                             │
    │                      │                                │                             │
    │                      │                                │── Create TrainingSample ───→│
    │                      │                                │   {inputContext: prompt,    │
    │                      │                                │    expectedOutput: response,│
    │                      │                                │    source: "accepted",      │
    │                      │                                │    quality: confidence}     │
    │                      │                                │                             │
    │── Or edit response ─→│                                │                             │
    │                      │── POST /responses/:id/action ─→│                             │
    │                      │   {action: "edit",             │                             │
    │                      │    editedResponse: "..."}      │                             │
    │                      │                                │                             │
    │                      │                                │── Create TrainingSample ───→│
    │                      │                                │   {source: "edited",        │
    │                      │                                │    quality: 1.0}            │
```

Training samples are collected from:
1. **Accepted responses**: AI responses the user approved (quality score: the response's confidence)
2. **Edited responses**: User-corrected responses (quality score: 1.0)

## Database Schema

### Entities

```
┌─────────────────────┐
│       Device        │
├─────────────────────┤
│ id (UUID)           │
│ name                │
│ hardwareId (unique) │
│ platform            │──────────────┐
│ osVersion           │              │
│ verificationCode    │              │
│ codeExpiresAt       │              │
│ verified            │              │
│ lastSyncAt          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│       Contact       │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ appleId             │              │
│ phoneNumber         │              │
│ email               │              │
│ displayName         │←─────────────┤
│ avatarHash          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│    Conversation     │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ imessageId (unique) │              │
│ displayName         │←─────────────┤
│ isGroup             │              │
│ lastMessageAt       │              │
│ messageCount        │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│       Message       │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ conversationId (FK) │              │
│ imessageGuid        │              │
│ senderId            │──────────────┤
│ direction           │              │
│ messageType         │              │
│ text                │              │
│ sentAt              │              │
│ createdAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│  GeneratedResponse  │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ messageId (FK)      │              │
│ prompt              │              │
│ response            │              │
│ confidence          │              │
│ modelVersion        │              │
│ status              │ (generating, completed, rejected)
│ generatedAt         │              │
│ rejectionReason     │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│   TrainingSample    │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ inputContext        │              │
│ expectedOutput      │              │
│ source              │ (accepted, edited, manual)
│ quality (0.0-1.0)   │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│     TrainingJob     │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ baseModel           │              │
│ status              │ (queued, training, completed, failed)
│ progress (0-100)    │              │
│ epochs              │              │
│ learningRate        │              │
│ sampleCount         │              │
│ outputPath          │              │
│ error               │              │
│ startedAt           │              │
│ completedAt         │              │
│ createdAt           │              │
└─────────────────────┘
```

## Component Details

### macOS App

**Location**: `macos/`

The Swift application runs as a background LaunchAgent:

- **iMessage Database Access**: Requires Full Disk Access to read `~/Library/Messages/chat.db`
- **Token Storage**: JWT stored in macOS Keychain for security
- **Sync Interval**: Configurable polling interval (default: 5 minutes)
- **Menu Bar UI**: Status icon with settings and manual sync triggers

**Installation**:

```bash
./install.sh https://server-url.com
```

### Server (NestJS)

**Location**: `server/`

Modules:
- **DevicesModule**: Registration, verification, JWT auth
- **SyncModule**: Message and contact sync endpoints
- **ConversationsModule**: Browse conversations, build context
- **ResponsesModule**: Orchestrate ML generation, store results
- **TrainingModule**: Collect samples, manage training jobs

Key services:
- `DevicesService`: Device lifecycle management
- `ConversationsService`: Context building for prompts
- `ResponsesService`: ML service integration

### ML Service (FastAPI)

**Location**: `ml-service/`

Components:
- **LLMManager**: Model loading via `lilith-model-loader`
- **RedisClient**: Caching and job queue management
- **Endpoints**: `/generate`, `/training/*`, `/health`

Model loading hierarchy:
1. Environment variable `ML_SERVICE_MODEL_PATH` (direct file)
2. Environment variable `ML_SERVICE_MODEL_ID` (manifest lookup)
3. Default: `ministral-3b-instruct`

### Frontend (React)

**Location**: `frontend/`

Pages:
- **DevicesPage**: Device management and registration codes
- **ConversationsPage**: Browse synced conversations
- **ConversationDetailPage**: View messages, generate responses
- **TrainingPage**: Training sample review, job management

API integration via React Query hooks (`@tanstack/react-query`).
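
To make the ML service's model loading hierarchy described above more concrete, here is a minimal Python sketch of the three-step lookup. Only the two environment variable names and the default model ID come from this document; the function name `resolve_model_source` and the `("path" | "manifest", value)` return shape are hypothetical and may not match how `LLMManager` actually resolves models.

```python
import os
from typing import Mapping, Optional, Tuple

# Default model ID as documented above.
DEFAULT_MODEL_ID = "ministral-3b-instruct"


def resolve_model_source(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Pick a model source using the documented precedence.

    Hypothetical helper: returns ("path", <file>) for a direct GGUF file,
    or ("manifest", <model id>) for a manifest lookup.
    """
    env = os.environ if env is None else env

    # 1. A direct file path wins over everything else.
    model_path = env.get("ML_SERVICE_MODEL_PATH", "").strip()
    if model_path:
        return ("path", model_path)

    # 2. Otherwise use an explicitly configured model ID.
    model_id = env.get("ML_SERVICE_MODEL_ID", "").strip()
    if model_id:
        return ("manifest", model_id)

    # 3. Fall back to the documented default.
    return ("manifest", DEFAULT_MODEL_ID)
```

Passing the environment as a mapping keeps the lookup easy to test; calling `resolve_model_source()` with no argument reads `os.environ`.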
## Configuration

### Environment Variables

| Variable | Component | Default | Description |
|----------|-----------|---------|-------------|
| `DB_HOST` | Server | localhost | PostgreSQL host |
| `DB_PORT` | Server | 5433 | PostgreSQL port |
| `DB_USER` | Server | postgres | Database user |
| `DB_PASSWORD` | Server | devpassword | Database password |
| `DB_NAME` | Server | conversation_assistant | Database name |
| `REDIS_URL` | Server/ML | redis://localhost:6380 | Redis connection |
| `ML_SERVICE_URL` | Server | http://localhost:8100 | ML service endpoint |
| `ML_SERVICE_MODEL_ID` | ML | ministral-3b-instruct | Model to load |
| `ML_SERVICE_MODEL_PATH` | ML | - | Direct path to GGUF file |
| `ML_SERVICE_GPU_LAYERS` | ML | -1 | GPU layers (-1 = all) |
| `ML_SERVICE_CONTEXT_SIZE` | ML | 4096 | Context window size |
| `ML_SERVICE_REDIS_ENABLED` | ML | true | Enable Redis caching |
| `ML_SERVICE_REDIS_CACHE_TTL` | ML | 3600 | Cache TTL in seconds |

### Redis Keys

```
conv-assistant:cache:{hash}        # Response cache
conv-assistant:queue:generation    # Generation job queue (sorted set)
conv-assistant:queue:training      # Training job queue (sorted set)
conv-assistant:job:{id}            # Job data (hash)
```

## Prompt Format

Prompts sent to the ML service follow a conversation format:

```
Them: Hey, how's it going?
Me: Pretty good, just working on some code
Them: Nice! What are you building?
Me:
```

The model generates the continuation after `Me:`. Stop sequences (`\nThem:`, `\nMe:`, `\n\n`) prevent over-generation.

## Security Considerations

1. **Device Authentication**: 6-digit codes expire in 10 minutes
2. **JWT Tokens**: Access tokens expire after 7 days
3. **Full Disk Access**: Required for iMessage DB, grants broad access
4. **Keychain Storage**: Tokens stored in macOS Keychain
5. **HTTPS**: Required in production for API communication
6. **No Message Content Logging**: Only metadata logged (timestamps, counts)

## Scaling Considerations

### Current Architecture (Single Instance)

- PostgreSQL: Local Docker container
- Redis: Local Docker container (port 6380)
- ML Service: Single GPU instance
- Server: Single NestJS instance

### Production Scaling

1. **Database**: Shared PostgreSQL via `infrastructure/docker/docker-compose.databases.yml`
2. **Redis**: Shared Redis instance across services
3. **ML Service**: Multiple instances with load balancing (GPU required per instance)
4. **Async Generation**: Use `/generate/async` for non-blocking UI

## Training Pipeline

### Current State

Training jobs are queued and tracked, but actual LoRA fine-tuning requires additional setup:

1. Training data is saved as JSONL files
2. Job progress is tracked in Redis
3. Samples include quality weights from confidence scores

### Required for Full Training

```bash
pip install peft transformers accelerate
```

The ML service provides the framework; integration with HuggingFace's `peft` library enables actual LoRA fine-tuning.
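
As an illustration of how training data with quality weights might be serialized to JSONL, here is a minimal Python sketch. The input field names (`inputContext`, `expectedOutput`, `source`, `quality`) mirror the `TrainingSample` entity; the function name `samples_to_jsonl`, the output keys (`prompt`, `completion`, `weight`), and the `min_quality` filter are assumptions for illustration, not the ML service's actual export format.

```python
import json
from typing import Any, Iterable, Mapping


def samples_to_jsonl(samples: Iterable[Mapping[str, Any]], min_quality: float = 0.0) -> str:
    """Serialize training samples as JSONL, one JSON object per line.

    Hypothetical helper: the output schema is an assumption.
    """
    lines = []
    for sample in samples:
        quality = float(sample["quality"])
        if quality < min_quality:
            continue  # assumed filter: drop low-quality samples
        record = {
            "prompt": sample["inputContext"],
            "completion": sample["expectedOutput"],
            "source": sample["source"],
            "weight": quality,  # quality score carried as a per-sample weight
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    # Terminate every record with a newline so files can be appended safely.
    return "".join(line + "\n" for line in lines)
```

The result can be written out with, for example, `pathlib.Path("train.jsonl").write_text(samples_to_jsonl(samples), encoding="utf-8")`; the file name here is only a placeholder.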
## Directory Structure

```
conversation-assistant/
├── docker-compose.yml          # PostgreSQL + Redis for dev
├── .env.example                # Environment template
├── README.md                   # Quick start guide
├── LOGGING.md                  # Logging configuration
│
├── docs/
│   ├── ARCHITECTURE.md         # This file
│   ├── API.md                  # API reference
│   └── DEVELOPMENT.md          # Development guide
│
├── shared/                     # TypeScript types
│   ├── package.json
│   └── src/index.ts            # Re-exports from @lilith/types
│
├── server/                     # NestJS backend
│   ├── package.json
│   ├── tsconfig.json
│   ├── nest-cli.json
│   └── src/
│       ├── main.ts             # Entry point
│       ├── app.module.ts       # Root module
│       ├── data-source.ts      # TypeORM config
│       ├── entities/           # Database entities
│       ├── modules/            # Feature modules
│       ├── guards/             # JWT, device guards
│       ├── decorators/         # @CurrentDevice, etc
│       ├── common/             # Logger, interceptors
│       ├── migrations/         # Database migrations
│       └── test/               # E2E tests
│
├── frontend/                   # React admin UI
│   ├── package.json
│   ├── vite.config.ts
│   ├── vitest.config.ts
│   └── src/
│       ├── main.tsx
│       ├── App.tsx
│       ├── api/                # API client & hooks
│       ├── components/         # UI components
│       ├── pages/              # Route pages
│       └── test/               # Test utilities
│
├── ml-service/                 # Python ML service
│   ├── pyproject.toml
│   └── src/
│       ├── main.py             # FastAPI app
│       ├── llm.py              # LLM manager
│       ├── redis_client.py     # Redis integration
│       ├── models.py           # Pydantic models
│       ├── config.py           # Settings
│       └── logging_config.py   # Structured logging
│
└── macos/                      # Swift macOS app
    ├── Package.swift           # Swift package manifest
    ├── install.sh              # Installation script
    ├── uninstall.sh            # Removal script
    ├── deploy-remote.sh        # Remote deployment
    ├── INSTALL.md              # Installation guide
    ├── DEPLOYMENT.md           # Deployment guide
    └── Sources/                # Swift source code
```

## Related Documentation

- [API Reference](./API.md) - Complete endpoint documentation
- [Development Guide](./DEVELOPMENT.md) - Local development setup
- [Deployment Guide](../macos/DEPLOYMENT.md) - macOS app deployment