rag-retrieval/README.md
Lilith ad0c0a438b feat(rag-retrieval): initial implementation of RAG retrieval service
Retrieval-Augmented Generation service with:
- FastAPI service using lilith-ml-service-base
- Vector search via Redis HNSW
- Context augmentation and claim extraction
- TypeScript client package @lilith/rag-client
- Service-addresses integration for port resolution (8111)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-12 09:18:00 -08:00

RAG Retrieval Service

Retrieval-Augmented Generation service with vector search and context augmentation.

Overview

This service provides document retrieval, context augmentation, and claim extraction for LLM prompt enhancement. It uses Redis with HNSW indexing for vector search.

Architecture

rag-retrieval/
├── service/src/          # Python FastAPI service
│   ├── api/main.py       # FastAPI endpoints
│   ├── config.py         # Configuration (uses service-addresses)
│   ├── retrieval/        # Vector search engine
│   ├── augmentation/     # Context formatting
│   └── indexing/         # Document indexing
├── packages/client/      # @lilith/rag-client TypeScript package
└── pyproject.toml        # Python package definition

Port

Port 8111 is allocated in lilith-platform/infrastructure/ports.yaml under ml.rag-retrieval.

The service uses lilith-service-addresses for runtime port resolution.
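
As a rough illustration of runtime port resolution, the sketch below falls back to the allocated port 8111 unless an environment variable overrides it. It does not use the lilith-service-addresses API, which this README does not document; the function name resolve_port and the variable RAG_RETRIEVAL_PORT are assumptions made for the example.

```python
import os

DEFAULT_PORT = 8111  # allocated under ml.rag-retrieval in ports.yaml


def resolve_port(env=None, var="RAG_RETRIEVAL_PORT"):
    """Return the port from a (hypothetical) env var, else the default."""
    env = os.environ if env is None else env
    raw = env.get(var)
    if raw is None or raw.strip() == "":
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```

Passing the environment as a parameter keeps the function easy to exercise without touching the real process environment.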

Installation

Python Service

pip install rag-retrieval

TypeScript Client

npm install @lilith/rag-client

Usage

Python Service

# Start the service (requires Redis)
python -m service

TypeScript Client

import { RAGClient } from '@lilith/rag-client';

// Auto-discovers port from service-addresses
const client = await RAGClient.create();

// Retrieve relevant documents
const docs = await client.retrieve({
  query: 'creator earnings percentage',
  subject_type: 'worker',
  limit: 5,
});

// Augment content with verified facts
const augmented = await client.augment({
  content: 'Creators keep 85% of earnings',
  subject_type: 'worker',
});

console.log(augmented.augmented_prompt);
console.log(augmented.claims);

API Endpoints

POST /retrieve

Search for relevant documents.

{
  "query": "search query",
  "subject_type": "worker",
  "limit": 5
}
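
For callers outside TypeScript, the request body above can be sent with nothing but the Python standard library. This is a minimal sketch, assuming the service is reachable at http://localhost:8111 and returns JSON; the helper names build_retrieve_request and retrieve are illustrative and not part of any published package.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8111"  # assumption: service running locally


def build_retrieve_request(query, subject_type, limit=5, base_url=BASE_URL):
    """Build (but do not send) a POST /retrieve request."""
    payload = {"query": query, "subject_type": subject_type, "limit": limit}
    return urllib.request.Request(
        f"{base_url}/retrieve",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def retrieve(query, subject_type, limit=5, timeout=10.0):
    """Send the request and return the decoded JSON response."""
    req = build_retrieve_request(query, subject_type, limit)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Splitting request construction from sending makes the payload easy to inspect before any network call is made.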

POST /augment

Extract claims from the content and augment it with verified context.

{
  "content": "content with claims",
  "subject_type": "worker"
}
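
The TypeScript example reads augmented_prompt and claims from the result. The sketch below shows one way to handle the same two fields in Python once the JSON response has been decoded. It assumes the response is a JSON object carrying those keys; the structure of each claim is not specified here, so claims is passed through untouched. Both function names are hypothetical.

```python
def build_augment_payload(content, subject_type):
    """Request body for POST /augment, mirroring the example above."""
    return {"content": content, "subject_type": subject_type}


def split_augment_response(body):
    """Return (augmented_prompt, claims) from a decoded response body.

    Raises KeyError if either field is absent, so a malformed response
    fails loudly instead of yielding a silent None.
    """
    missing = [key for key in ("augmented_prompt", "claims") if key not in body]
    if missing:
        raise KeyError(f"response missing fields: {missing}")
    return body["augmented_prompt"], body["claims"]
```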

POST /index

Index documents into the vector store.

{
  "path": "/path/to/docs",
  "recursive": true
}

GET /health

Health check endpoint.
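
A common use of a health endpoint is to block until the service is ready, for example in a startup script or test fixture. The following is a sketch under two assumptions: the service listens on http://localhost:8111, and an HTTP 200 from /health means healthy (the response body is not documented here, so it is ignored). The helper names are illustrative.

```python
import time
import urllib.error
import urllib.request


def is_healthy(base_url="http://localhost:8111", timeout=2.0):
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status.
        return False


def wait_until_healthy(check=is_healthy, attempts=10, delay=0.5, sleep=time.sleep):
    """Poll `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The check and sleep functions are injectable so the polling logic can be tested without a running service.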

Dependencies

  • lilith-ml-service-base - FastAPI scaffolding
  • lilith-service-addresses - Port resolution
  • redis - Vector store backend
  • httpx - HTTP client for embeddings

License

MIT