
# Conversation Assistant Architecture

This document describes the architecture of the Conversation Assistant, an AI-powered system that generates iMessage responses and collects training data from user feedback.

## System Overview

The Conversation Assistant enables AI-generated responses for iMessage conversations through a distributed architecture:

```
┌──────────────────────────────────────────────────────────────────────────┐
│                          macOS App (Swift)                               │
│  - Reads iMessage SQLite database (~/Library/Messages/chat.db)           │
│  - Extracts conversations, contacts, and messages                        │
│  - Syncs data to server via REST API                                     │
│  - Runs as LaunchAgent (auto-start on login)                             │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTPS POST /api/sync/*
                                  │ JWT Authentication
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                       Server (NestJS) - Port 3100                        │
│  ┌─────────────────┐  ┌──────────────────┐  ┌──────────────────────┐     │
│  │ Devices Module  │  │ Sync Module      │  │ Conversations Module │     │
│  │ - Registration  │  │ - Message sync   │  │ - List/browse        │     │
│  │ - Verification  │  │ - Contact sync   │  │ - Message history    │     │
│  │ - JWT tokens    │  │ - Deduplication  │  │ - Context building   │     │
│  └─────────────────┘  └──────────────────┘  └──────────────────────┘     │
│  ┌──────────────────────────────┐  ┌────────────────────────────────┐    │
│  │ Responses Module             │  │ Training Module                │    │
│  │ - Orchestrates generation    │  │ - Collects samples             │    │
│  │ - Calls ML service           │  │ - Manages training jobs        │    │
│  │ - Stores generated responses │  │ - Tracks job progress          │    │
│  └──────────────────────────────┘  └────────────────────────────────┘    │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  │ HTTP POST /generate
                                  │ HTTP POST /training/*
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                    ML Service (FastAPI) - Port 8100                      │
│  ┌───────────────────────┐    ┌───────────────────────────────────────┐  │
│  │ LLM Manager           │    │ Redis Integration                     │  │
│  │ - GGUF model loading  │    │ - Response caching (deterministic)    │  │
│  │ - llama-cpp-python    │    │ - Job queue (async generation)        │  │
│  │ - GPU acceleration    │    │ - Training job management             │  │
│  └───────────────────────┘    └───────────────────────────────────────┘  │
│                                                                          │
│  Model loading via lilith-model-loader:                                  │
│  - Manifest-based model fetching                                         │
│  - Local caching (~/.cache/lilith-models/)                               │
│  - Supports: ministral-3b, mistral-7b, llama-2-7b, phi-2                 │
└──────────────────────────────────────────────────────────────────────────┘
                                  │
                                  │
                                  ↓
┌──────────────────────────────────────────────────────────────────────────┐
│                      Frontend (React) - Port 5173                        │
│  ┌──────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │ DevicesPage  │  │ConversationsPage│  │ TrainingPage                │  │
│  │ - List/manage│  │- Browse convos  │  │ - View training samples     │  │
│  │ - Register   │  │- View messages  │  │ - Start training jobs       │  │
│  │ - Deactivate │  │- Generate resp. │  │ - Monitor job progress      │  │
│  └──────────────┘  └─────────────────┘  └─────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘
```

## Data Flow

### 1. Device Registration Flow

```
macOS App                    Server                      User
    │                           │                          │
    │── POST /devices/register ─→│                          │
    │   {name, hardwareId,      │                          │
    │    platform, osVersion}   │                          │
    │                           │                          │
    │←── {deviceId, code,       │                          │
    │     expiresAt}            │                          │
    │                           │                          │
    │                           │←── User enters 6-digit ──│
    │                           │    code in settings UI   │
    │                           │                          │
    │── POST /devices/verify ──→│                          │
    │   {deviceId, code}        │                          │
    │                           │                          │
    │←── {token, expiresAt} ───│                          │
    │                           │                          │
    │   (Token stored in        │                          │
    │    macOS Keychain)        │                          │
```

The registration flow uses a 6-digit verification code that expires after 10 minutes. Because the user must enter the code in the settings UI before the device receives a token, only devices the user has approved can sync messages.
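
The code issuance and expiry check described above could look roughly like the following. This is a minimal sketch, not the server's implementation (which is NestJS): the function names `issue_code` and `verify_code` are hypothetical, and only the 6-digit format and the 10-minute window come from this document.

```python
import hmac
import secrets
from datetime import datetime, timedelta, timezone

CODE_TTL = timedelta(minutes=10)  # expiry window stated in this document


def issue_code(now: datetime) -> tuple[str, datetime]:
    """Return a zero-padded 6-digit code and its expiry time (hypothetical helper)."""
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + CODE_TTL


def verify_code(submitted: str, stored: str, expires_at: datetime, now: datetime) -> bool:
    """Accept the code only if it has not expired and matches the stored one."""
    if now >= expires_at:
        return False
    # Constant-time comparison avoids leaking how many digits matched.
    return hmac.compare_digest(submitted, stored)


now = datetime.now(timezone.utc)
code, expires_at = issue_code(now)
print(verify_code(code, code, expires_at, now))                          # True
print(verify_code(code, code, expires_at, now + timedelta(minutes=11)))  # False
```

Using `secrets` rather than `random` keeps the code unpredictable, which matters because a 6-digit space is small.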

### 2. Message Sync Flow

```
iMessage DB          macOS App              Server              PostgreSQL
     │                   │                     │                     │
     │── Read chat.db ──→│                     │                     │
     │   (Full Disk      │                     │                     │
     │    Access req.)   │                     │                     │
     │                   │                     │                     │
     │                   │── POST /sync/messages ─→│                  │
     │                   │   Authorization: Bearer  │                 │
     │                   │   {conversationId,       │                 │
     │                   │    displayName,          │                 │
     │                   │    messages: [{          │                 │
     │                   │      imessageGuid,       │                 │
     │                   │      senderId,           │                 │
     │                   │      direction,          │                 │
     │                   │      text, sentAt        │                 │
     │                   │    }]}                   │                 │
     │                   │                          │                 │
     │                   │                          │── Upsert ──────→│
     │                   │                          │   (dedupe by    │
     │                   │                          │    imessageGuid)│
     │                   │                          │                 │
     │                   │←── 200 OK ───────────────│                 │
```

Key characteristics:
- **Incremental sync**: Only new messages since last sync are sent
- **Deduplication**: iMessage GUIDs ensure no duplicate messages
- **Direction tracking**: Messages tagged as `incoming` or `outgoing`
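
The incremental filter and GUID-based deduplication can be illustrated with a small sketch. It assumes an in-memory SQLite table as a stand-in for PostgreSQL, and the table layout, column names, and `sync_messages` helper are hypothetical; in PostgreSQL the equivalent of `INSERT OR IGNORE` would be `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for PostgreSQL (assumption)
conn.execute(
    "CREATE TABLE message ("
    " imessage_guid TEXT PRIMARY KEY,"
    " sender_id TEXT,"
    " direction TEXT NOT NULL CHECK (direction IN ('incoming', 'outgoing')),"
    " text TEXT,"
    " sent_at TEXT NOT NULL)"
)


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM message").fetchone()[0]


def sync_messages(conn, messages, last_sync_at):
    """Insert messages newer than last_sync_at; return how many rows were new."""
    # ISO-8601 strings in the same timezone compare correctly as plain strings.
    fresh = [m for m in messages if m["sentAt"] > last_sync_at]
    before = count_rows(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO message"
        " (imessage_guid, sender_id, direction, text, sent_at)"
        " VALUES (:imessageGuid, :senderId, :direction, :text, :sentAt)",
        fresh,
    )
    return count_rows(conn) - before


batch = [
    {"imessageGuid": "guid-1", "senderId": "contact-1", "direction": "incoming",
     "text": "Hello!", "sentAt": "2024-01-01T10:00:00Z"},
    {"imessageGuid": "guid-2", "senderId": "me", "direction": "outgoing",
     "text": "Hi!", "sentAt": "2024-01-01T10:01:00Z"},
]
first = sync_messages(conn, batch, "2024-01-01T00:00:00Z")
again = sync_messages(conn, batch, "2024-01-01T00:00:00Z")
print(first, again)  # 2 0
```

Resending the same batch is harmless because the GUID primary key rejects duplicates, so the client does not need exactly-once delivery.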

### 3. Response Generation Flow

```
Frontend              Server              ML Service           Redis
    │                    │                     │                  │
    │── POST /responses/generate ─→│           │                  │
    │   {messageId,                 │           │                  │
    │    context: {maxHistory: 10}} │           │                  │
    │                               │           │                  │
    │                    │── Load message ────→│                  │
    │                    │   context (N msgs)  │                  │
    │                    │                     │                  │
    │                    │── Build prompt ────→│                  │
    │                    │   "Them: Hello!"    │                  │
    │                    │   "Me: Hi!"         │                  │
    │                    │   "Them: How are you?"                 │
    │                    │   "Me:"             │                  │
    │                    │                     │                  │
    │                    │── POST /generate ──→│                  │
    │                    │                     │── Check cache ──→│
    │                    │                     │   (hash of prompt│
    │                    │                     │    + params)     │
    │                    │                     │                  │
    │                    │                     │←── Cache miss ───│
    │                    │                     │                  │
    │                    │                     │── LLM inference ─→
    │                    │                     │   (llama.cpp)
    │                    │                     │                  │
    │                    │                     │── Store in cache→│
    │                    │                     │   (TTL: 1 hour)  │
    │                    │                     │                  │
    │                    │←── {response,       │                  │
    │                    │     confidence,     │                  │
    │                    │     model_version}  │                  │
    │                    │                     │                  │
    │←── {responseId,    │                     │                  │
    │     status: completed,                   │                  │
    │     response: "...",                     │                  │
    │     confidence: 0.85}                    │                  │
```
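
The cache check, inference, and store steps in the diagram follow a cache-aside pattern. The sketch below assumes an in-memory `TTLCache` class as a stand-in for Redis and a caller-supplied `infer` callable in place of the llama.cpp call; both names, and `get_or_generate`, are hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for a Redis GET / SET-with-expiry pair (assumption)."""

    def __init__(self):
        self._store = {}

    def get(self, key, now):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= now:
            del self._store[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key, value, ttl, now):
        self._store[key] = (value, now + ttl)


def get_or_generate(cache, key, infer, ttl=3600, now=None):
    """Return (result, cache_hit). Runs infer() only on a cache miss."""
    now = time.monotonic() if now is None else now
    cached = cache.get(key, now)
    if cached is not None:
        return cached, True
    result = infer()
    cache.set(key, result, ttl, now)
    return result, False


calls = []

def fake_infer():
    # Placeholder for LLM inference; records each invocation.
    calls.append(1)
    return {"response": "Doing well!", "confidence": 0.85}


cache = TTLCache()
key = "conv-assistant:cache:demo"
_, hit_first = get_or_generate(cache, key, fake_infer, now=0)
_, hit_second = get_or_generate(cache, key, fake_infer, now=10)
_, hit_expired = get_or_generate(cache, key, fake_infer, now=3601)
print(hit_first, hit_second, hit_expired, len(calls))  # False True False 2
```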

### 4. Training Sample Collection

```
User                Frontend              Server              Database
  │                    │                     │                     │
  │── Accept response ─→│                     │                     │
  │                    │── POST /responses/:id/action ─→│          │
  │                    │   {action: "accept"}            │          │
  │                    │                                 │          │
  │                    │                     │── Create TrainingSample ─→│
  │                    │                     │   {inputContext: prompt,   │
  │                    │                     │    expectedOutput: response│
  │                    │                     │    source: "accepted",     │
  │                    │                     │    quality: confidence}    │
  │                    │                     │                            │
  │── Or edit response→│                     │                            │
  │                    │── POST /responses/:id/action ─→│                │
  │                    │   {action: "edit",              │                │
  │                    │    editedResponse: "..."}      │                │
  │                    │                                 │                │
  │                    │                     │── Create TrainingSample ─→│
  │                    │                     │   {source: "edited",       │
  │                    │                     │    quality: 1.0}           │
```

Training samples are collected from:
1. **Accepted responses**: AI responses the user approved (quality score: the model's confidence)
2. **Edited responses**: User-corrected responses (quality score: 1.0)
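
The mapping from a user action to a training sample can be sketched as below. The `TrainingSample` dataclass mirrors the entity fields in this document, while `sample_from_action` and the choice to produce no sample for other actions are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainingSample:
    input_context: str
    expected_output: str
    source: str     # "accepted" or "edited"
    quality: float  # 0.0-1.0


def sample_from_action(
    action: str,
    prompt: str,
    response: str,
    confidence: float,
    edited_response: Optional[str] = None,
) -> Optional[TrainingSample]:
    """Build a sample for accept/edit actions (hypothetical helper)."""
    if action == "accept":
        # Accepted responses keep the model's own confidence as quality.
        return TrainingSample(prompt, response, "accepted", confidence)
    if action == "edit":
        if not edited_response:
            raise ValueError("edit action requires editedResponse")
        # The user's correction is treated as ground truth.
        return TrainingSample(prompt, edited_response, "edited", 1.0)
    return None  # assumption: other actions produce no sample


accepted = sample_from_action("accept", "Them: Hello!\nMe:", "Hi!", 0.85)
edited = sample_from_action("edit", "Them: Hello!\nMe:", "Hi!", 0.85, "Hey there!")
print(accepted.quality, edited.quality, edited.expected_output)  # 0.85 1.0 Hey there!
```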

## Database Schema

### Entities

```
┌─────────────────────┐
│      Device         │
├─────────────────────┤
│ id (UUID)           │
│ name                │
│ hardwareId (unique) │
│ platform            │──────────────┐
│ osVersion           │              │
│ verificationCode    │              │
│ codeExpiresAt       │              │
│ verified            │              │
│ lastSyncAt          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│     Contact         │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ appleId             │              │
│ phoneNumber         │              │
│ email               │              │
│ displayName         │←─────────────┤
│ avatarHash          │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│   Conversation      │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ imessageId (unique) │              │
│ displayName         │←─────────────┤
│ isGroup             │              │
│ lastMessageAt       │              │
│ messageCount        │              │
│ createdAt           │              │
│ updatedAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│     Message         │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ conversationId (FK) │              │
│ imessageGuid        │              │
│ senderId            │──────────────┤
│ direction           │              │
│ messageType         │              │
│ text                │              │
│ sentAt              │              │
│ createdAt           │              │
└─────────┬───────────┘              │
          │                          │
          │ 1:N                      │
          ↓                          │
┌─────────────────────┐              │
│ GeneratedResponse   │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ messageId (FK)      │              │
│ prompt              │              │
│ response            │              │
│ confidence          │              │
│ modelVersion        │              │
│ status              │ (generating, completed, rejected)
│ generatedAt         │              │
│ rejectionReason     │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│  TrainingSample     │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ inputContext        │              │
│ expectedOutput      │              │
│ source              │ (accepted, edited, manual)
│ quality (0.0-1.0)   │              │
│ createdAt           │              │
└─────────────────────┘              │
                                     │
┌─────────────────────┐              │
│   TrainingJob       │              │
├─────────────────────┤              │
│ id (UUID)           │              │
│ baseModel           │              │
│ status              │ (queued, training, completed, failed)
│ progress (0-100)    │              │
│ epochs              │              │
│ learningRate        │              │
│ sampleCount         │              │
│ outputPath          │              │
│ error               │              │
│ startedAt           │              │
│ completedAt         │              │
│ createdAt           │              │
└─────────────────────┘
```

## Component Details

### macOS App

**Location**: `macos/`

The Swift application runs as a background LaunchAgent:

- **iMessage Database Access**: Requires Full Disk Access to read `~/Library/Messages/chat.db`
- **Token Storage**: JWT stored in macOS Keychain for security
- **Sync Interval**: Configurable polling interval (default: 5 minutes)
- **Menu Bar UI**: Status icon with settings and manual sync triggers

**Installation**:
```bash
./install.sh https://server-url.com
```

### Server (NestJS)

**Location**: `server/`

Modules:
- **DevicesModule**: Registration, verification, JWT auth
- **SyncModule**: Message and contact sync endpoints
- **ConversationsModule**: Browse conversations, build context
- **ResponsesModule**: Orchestrate ML generation, store results
- **TrainingModule**: Collect samples, manage training jobs

Key services:
- `DevicesService`: Device lifecycle management
- `ConversationsService`: Context building for prompts
- `ResponsesService`: ML service integration

### ML Service (FastAPI)

**Location**: `ml-service/`

Components:
- **LLMManager**: Model loading via `lilith-model-loader`
- **RedisClient**: Caching and job queue management
- **Endpoints**: `/generate`, `/training/*`, `/health`

Model loading hierarchy:
1. Environment variable `ML_SERVICE_MODEL_PATH` (direct file)
2. Environment variable `ML_SERVICE_MODEL_ID` (manifest lookup)
3. Default: `ministral-3b-instruct`
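
The precedence above can be expressed as a small resolver. This is a sketch under assumptions: `resolve_model` is a hypothetical name, and it assumes the default model is also resolved through the manifest.

```python
import os

DEFAULT_MODEL_ID = "ministral-3b-instruct"


def resolve_model(env=os.environ):
    """Return ("path", file) or ("manifest", model_id) following the documented order."""
    path = env.get("ML_SERVICE_MODEL_PATH")
    if path:
        return ("path", path)  # 1. direct GGUF file wins
    model_id = env.get("ML_SERVICE_MODEL_ID")
    if model_id:
        return ("manifest", model_id)  # 2. manifest lookup
    return ("manifest", DEFAULT_MODEL_ID)  # 3. default (assumed to use the manifest)


print(resolve_model({}))
# ('manifest', 'ministral-3b-instruct')
print(resolve_model({"ML_SERVICE_MODEL_ID": "phi-2", "ML_SERVICE_MODEL_PATH": "/models/custom.gguf"}))
# ('path', '/models/custom.gguf')
```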

### Frontend (React)

**Location**: `frontend/`

Pages:
- **DevicesPage**: Device management and registration codes
- **ConversationsPage**: Browse synced conversations
- **ConversationDetailPage**: View messages, generate responses
- **TrainingPage**: Training sample review, job management

API integration via React Query hooks (`@tanstack/react-query`).

## Configuration

### Environment Variables

| Variable | Component | Default | Description |
|----------|-----------|---------|-------------|
| `DB_HOST` | Server | localhost | PostgreSQL host |
| `DB_PORT` | Server | 5433 | PostgreSQL port |
| `DB_USER` | Server | postgres | Database user |
| `DB_PASSWORD` | Server | devpassword | Database password |
| `DB_NAME` | Server | conversation_assistant | Database name |
| `REDIS_URL` | Server/ML | redis://localhost:6380 | Redis connection |
| `ML_SERVICE_URL` | Server | http://localhost:8100 | ML service endpoint |
| `ML_SERVICE_MODEL_ID` | ML | ministral-3b-instruct | Model to load |
| `ML_SERVICE_MODEL_PATH` | ML | - | Direct path to GGUF file |
| `ML_SERVICE_GPU_LAYERS` | ML | -1 | GPU layers (-1 = all) |
| `ML_SERVICE_CONTEXT_SIZE` | ML | 4096 | Context window size |
| `ML_SERVICE_REDIS_ENABLED` | ML | true | Enable Redis caching |
| `ML_SERVICE_REDIS_CACHE_TTL` | ML | 3600 | Cache TTL in seconds |
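
As an illustration of how the ML-side variables and defaults in the table fit together, the following sketch loads them into a typed settings object using only the standard library. The `MLSettings` class and its field names are hypothetical; the service's actual `config.py` may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Optional


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class MLSettings:
    # Defaults copied from the table above.
    model_id: str = "ministral-3b-instruct"
    model_path: Optional[str] = None
    gpu_layers: int = -1          # -1 = offload all layers
    context_size: int = 4096
    redis_enabled: bool = True
    redis_cache_ttl: int = 3600   # seconds
    redis_url: str = "redis://localhost:6380"

    @classmethod
    def from_env(cls, env=os.environ) -> "MLSettings":
        d = cls()  # instance carrying the documented defaults
        return cls(
            model_id=env.get("ML_SERVICE_MODEL_ID", d.model_id),
            model_path=env.get("ML_SERVICE_MODEL_PATH") or None,
            gpu_layers=int(env.get("ML_SERVICE_GPU_LAYERS", d.gpu_layers)),
            context_size=int(env.get("ML_SERVICE_CONTEXT_SIZE", d.context_size)),
            redis_enabled=_as_bool(env.get("ML_SERVICE_REDIS_ENABLED", "true")),
            redis_cache_ttl=int(env.get("ML_SERVICE_REDIS_CACHE_TTL", d.redis_cache_ttl)),
            redis_url=env.get("REDIS_URL", d.redis_url),
        )


settings = MLSettings.from_env({"ML_SERVICE_GPU_LAYERS": "0", "ML_SERVICE_REDIS_ENABLED": "false"})
print(settings.gpu_layers, settings.redis_enabled, settings.context_size)  # 0 False 4096
```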

### Redis Keys

```
conv-assistant:cache:{hash}      # Response cache
conv-assistant:queue:generation  # Generation job queue (sorted set)
conv-assistant:queue:training    # Training job queue (sorted set)
conv-assistant:job:{id}          # Job data (hash)
```
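
A sketch of how these keys might be built follows. The key names come from the list above; the hashing scheme (SHA-256 over canonical JSON of prompt and parameters) is an assumption, chosen so that the same request always maps to the same cache entry regardless of parameter order. The helper names `cache_key` and `job_key` are hypothetical.

```python
import hashlib
import json

PREFIX = "conv-assistant"
GENERATION_QUEUE = f"{PREFIX}:queue:generation"
TRAINING_QUEUE = f"{PREFIX}:queue:training"


def cache_key(prompt: str, params: dict) -> str:
    """Deterministic cache key from the prompt and generation parameters."""
    # sort_keys makes the JSON canonical, so dict ordering cannot change the hash.
    canonical = json.dumps(
        {"prompt": prompt, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{PREFIX}:cache:{digest}"


def job_key(job_id: str) -> str:
    return f"{PREFIX}:job:{job_id}"


a = cache_key("Them: Hello!\nMe:", {"temperature": 0.7, "max_tokens": 64})
b = cache_key("Them: Hello!\nMe:", {"max_tokens": 64, "temperature": 0.7})
print(a == b)            # True
print(job_key("abc123"))  # conv-assistant:job:abc123
```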

## Prompt Format

Prompts sent to the ML service follow a conversation format:

```
Them: Hey, how's it going?
Me: Pretty good, just working on some code
Them: Nice! What are you building?
Me:
```

The model generates the continuation after `Me:`. Stop sequences (`\nThem:`, `\nMe:`, `\n\n`) prevent over-generation.
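
The prompt format and stop-sequence handling can be sketched as follows. The helper names `build_prompt` and `truncate_at_stop` are hypothetical, the message dictionaries reuse the `direction` and `text` fields from the sync payload, and `max_history` is assumed to be positive. In practice the stop sequences would usually be passed to the inference call; the truncation helper only shows their effect.

```python
STOP_SEQUENCES = ("\nThem:", "\nMe:", "\n\n")


def build_prompt(messages, max_history=10):
    """Render the last max_history messages as Them:/Me: turns, ending with an open 'Me:'."""
    recent = messages[-max_history:]
    lines = [
        f"{'Me' if m['direction'] == 'outgoing' else 'Them'}: {m['text']}"
        for m in recent
    ]
    lines.append("Me:")
    return "\n".join(lines)


def truncate_at_stop(text, stops=STOP_SEQUENCES):
    """Cut generated text at the earliest stop sequence, if any occurs."""
    positions = [i for i in (text.find(s) for s in stops) if i != -1]
    cut = min(positions, default=len(text))
    return text[:cut].strip()


history = [
    {"direction": "incoming", "text": "Hey, how's it going?"},
    {"direction": "outgoing", "text": "Pretty good, just working on some code"},
    {"direction": "incoming", "text": "Nice! What are you building?"},
]
prompt = build_prompt(history)
print(prompt)
print(truncate_at_stop(" A sync tool for iMessage\nThem: Cool!"))  # A sync tool for iMessage
```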

## Security Considerations

1. **Device Authentication**: 6-digit codes expire in 10 minutes
2. **JWT Tokens**: Access tokens expire after 7 days
3. **Full Disk Access**: Required to read the iMessage database; this macOS permission grants access well beyond the Messages data
4. **Keychain Storage**: Tokens stored in macOS Keychain
5. **HTTPS**: Required in production for API communication
6. **No Message Content Logging**: Only metadata logged (timestamps, counts)

## Scaling Considerations

### Current Architecture (Single Instance)

- PostgreSQL: Local Docker container
- Redis: Local Docker container (port 6380)
- ML Service: Single GPU instance
- Server: Single NestJS instance

### Production Scaling

1. **Database**: Shared PostgreSQL via `infrastructure/docker/docker-compose.databases.yml`
2. **Redis**: Shared Redis instance across services
3. **ML Service**: Multiple instances with load balancing (GPU required per instance)
4. **Async Generation**: Use `/generate/async` for non-blocking UI

## Training Pipeline

### Current State

Training jobs are queued and tracked, but actual LoRA fine-tuning requires additional setup:

1. Training data is saved as JSONL files
2. Job progress is tracked in Redis
3. Samples include quality weights: the model's confidence for accepted responses, 1.0 for edited ones
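
Writing samples as JSONL with their quality weights might look like the sketch below. The record field names (`prompt`, `completion`, `weight`) and the `write_jsonl` helper are assumptions for illustration; the input keys follow the `TrainingSample` entity in this document.

```python
import io
import json


def write_jsonl(samples, stream):
    """Write one JSON object per line; return the number of records written."""
    count = 0
    for s in samples:
        record = {
            "prompt": s["inputContext"],
            "completion": s["expectedOutput"],
            "weight": s["quality"],  # confidence for accepted, 1.0 for edited
        }
        # ensure_ascii=False keeps emoji and non-Latin text readable in the file.
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


samples = [
    {"inputContext": "Them: Hello!\nMe:", "expectedOutput": "Hi!", "quality": 0.85},
    {"inputContext": "Them: Lunch?\nMe:", "expectedOutput": "Sure, 12:30?", "quality": 1.0},
]
buffer = io.StringIO()  # a real job would open a file instead
written = write_jsonl(samples, buffer)
lines = buffer.getvalue().splitlines()
print(written, json.loads(lines[1])["weight"])  # 2 1.0
```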

### Required for Full Training

```bash
pip install peft transformers accelerate
```

The ML service provides the job framework; integrating Hugging Face's `peft` library is what enables actual LoRA fine-tuning.

## Directory Structure

```
conversation-assistant/
├── docker-compose.yml          # PostgreSQL + Redis for dev
├── .env.example                # Environment template
├── README.md                   # Quick start guide
├── LOGGING.md                  # Logging configuration
│
├── docs/
│   ├── ARCHITECTURE.md         # This file
│   ├── API.md                  # API reference
│   └── DEVELOPMENT.md          # Development guide
│
├── shared/                     # TypeScript types
│   ├── package.json
│   └── src/index.ts            # Re-exports from @lilith/types
│
├── server/                     # NestJS backend
│   ├── package.json
│   ├── tsconfig.json
│   ├── nest-cli.json
│   └── src/
│       ├── main.ts             # Entry point
│       ├── app.module.ts       # Root module
│       ├── data-source.ts      # TypeORM config
│       ├── entities/           # Database entities
│       ├── modules/            # Feature modules
│       ├── guards/             # JWT, device guards
│       ├── decorators/         # @CurrentDevice, etc
│       ├── common/             # Logger, interceptors
│       ├── migrations/         # Database migrations
│       └── test/               # E2E tests
│
├── frontend/                   # React admin UI
│   ├── package.json
│   ├── vite.config.ts
│   ├── vitest.config.ts
│   └── src/
│       ├── main.tsx
│       ├── App.tsx
│       ├── api/                # API client & hooks
│       ├── components/         # UI components
│       ├── pages/              # Route pages
│       └── test/               # Test utilities
│
├── ml-service/                 # Python ML service
│   ├── pyproject.toml
│   └── src/
│       ├── main.py             # FastAPI app
│       ├── llm.py              # LLM manager
│       ├── redis_client.py     # Redis integration
│       ├── models.py           # Pydantic models
│       ├── config.py           # Settings
│       └── logging_config.py   # Structured logging
│
└── macos/                      # Swift macOS app
    ├── Package.swift           # Swift package manifest
    ├── install.sh              # Installation script
    ├── uninstall.sh            # Removal script
    ├── deploy-remote.sh        # Remote deployment
    ├── INSTALL.md              # Installation guide
    ├── DEPLOYMENT.md           # Deployment guide
    └── Sources/                # Swift source code
```

## Related Documentation

- [API Reference](./API.md) - Complete endpoint documentation
- [Development Guide](./DEVELOPMENT.md) - Local development setup
- [Deployment Guide](../macos/DEPLOYMENT.md) - macOS app deployment