Three pipelines, one system

The SEO engine orchestrates three independent systems, each valuable on its own. The SEO Content Pipeline generates and validates pages. The Source Verification Pipeline keeps generated content consistent with client sources through semantic RAG. The Image Generation Pipeline produces contextual imagery with GPU-accelerated diffusion models. Together they produce complete, verified, illustrated pages at scale.

Architecture diagram

Inputs:
  • City database: GeoNames + OSM
  • Attribute schema: client-defined filters
  • Client docs: 700+ source files for RAG

SEO Content Pipeline stages:
  • Route Generator: locales × categories × cities × filters
  • Content Generator: self-hosted LLM (no API calls)
  • Source Verification + Schema Generation: RAG source-checking (economics, claims, competitors) → Schema.org JSON-LD structured data
  • Translation Engine: 40+ languages
  • Static Builder: static build → CDN

Source Verification Pipeline stages:
  • Embedding Service: embedding model
  • Vector Index: fast semantic search
  • Semantic Source Validator: pattern matching + RAG retrieval + claim verification

Image Generation Pipeline stages (GPU-accelerated):
  • Prompt Builder: LLM → diffusion prompt craft
  • Image Diffusion: GPU-powered generation
  • Processing: resize, optimize, families
  • API Orchestrator: API service layer

Output:
  • Static HTML + images: CDN-ready pages with structured data, sitemaps, contextual imagery

Key architectural decisions:
  ✔ Self-hosted LLM = near-zero marginal content cost at scale
  ✔ Self-hosted diffusion = near-zero marginal image cost (GPU amortized)
  ✔ RAG source verification = no hallucinated claims in output
  ✔ Static output = CDN-friendly, sub-second loads, max Lighthouse
  ✔ No cloud lock-in = runs on any Linux box with a GPU, consumer or server
  ✔ Each pipeline independently valuable and independently licensable

Each pipeline is independently valuable

SEO Content Pipeline

The orchestrator. Coordinates the other two pipelines and produces CDN-ready static HTML.

  • Route generation (locales × cities × categories × attributes)
  • Brand-aware prompt building with voice presets
  • Self-hosted LLM content creation
  • Source verification & schema generation
  • 40+ language translation
  • Static build to CDN
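
The route generation step in the list above is a Cartesian product over four dimensions. The following is a minimal sketch of that idea, not the engine's actual code: the `Route` type, the `generate_routes` function, the URL layout, and the sample values are all hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Iterable, Iterator, NamedTuple


class Route(NamedTuple):
    locale: str
    city: str
    category: str
    attribute: str

    @property
    def path(self) -> str:
        # Hypothetical URL layout; the real scheme is not described here.
        return f"/{self.locale}/{self.city}/{self.category}/{self.attribute}/"


def generate_routes(
    locales: Iterable[str],
    cities: Iterable[str],
    categories: Iterable[str],
    attributes: Iterable[str],
) -> Iterator[Route]:
    """Yield one Route per combination of the four dimensions."""
    for combo in product(locales, cities, categories, attributes):
        yield Route(*combo)


# Illustrative sample values (assumptions, not client data).
routes = list(
    generate_routes(
        locales=["en", "de"],
        cities=["berlin", "paris", "lisbon"],
        categories=["coworking"],
        attributes=["24-7-access", "pet-friendly"],
    )
)
print(len(routes))     # 2 * 3 * 1 * 2 = 12
print(routes[0].path)  # /en/berlin/coworking/24-7-access/
```

Because the page count is the product of the dimension sizes, each added locale or attribute multiplies the number of pages, so a generator that yields routes lazily keeps memory use flat.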

Source Verification Pipeline

Every LLM-generated claim is checked against the client's source documents via semantic RAG. This ensures the pipeline says what the client says.

  • Ingests client source docs (700+ files in current deployment)
  • Generates semantic embeddings for source matching
  • High-performance vector index for fast retrieval
  • Semantic matching validates economics, claims, competitors
  • Unverifiable claims flagged for human review
Independently licensable
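
The retrieve-and-flag step described above can be sketched as follows. This is a simplified illustration under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the linear scan stands in for a vector index, and the `check_claims` name and the 0.6 threshold are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model (assumption): bag-of-words counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def check_claims(claims, passages, threshold=0.6):
    """Match each claim to its closest source passage and flag weak matches."""
    # Linear scan over pre-embedded passages; a vector index would replace this.
    index = [(p, embed(p)) for p in passages]
    report = []
    for claim in claims:
        vec = embed(claim)
        source, score = max(
            ((p, cosine(vec, v)) for p, v in index),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        status = "supported" if score >= threshold else "needs_review"
        # The closest passage is kept either way so a reviewer can see it.
        report.append(
            {"claim": claim, "status": status, "source": source, "score": round(score, 2)}
        )
    return report


# Illustrative data (assumptions, not client documents).
passages = [
    "Membership costs 49 euros per month.",
    "All locations offer 24/7 access.",
]
claims = [
    "Membership costs 49 euros per month in every city.",
    "We are the cheapest provider in Europe.",
]
for row in check_claims(claims, passages):
    print(row["status"], "|", row["claim"])
```

Similarity alone only finds the most relevant passage. A claim that reuses the source wording with a different number would still score highly, so a production validator would need a separate check on the claim's content after retrieval.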

Image Generation Pipeline

GPU-accelerated diffusion generates 9 image families per page from a single seed. Not stock photos.

  • 9 families: square, hero, portrait, OG, compact, tall, ultrawide, sidebar, header
  • Same seed = visual cohesion across all layouts
  • Different aspect ratios = optimized composition per context
  • LLM-crafted category-aware, city-atmospheric prompts
  • Art-directed per viewport and device
Independently licensable
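
One way to picture "nine families from a single seed" is a job list in which every family shares the page's seed and prompt but has its own dimensions. The sketch below makes several assumptions: the pixel sizes, the `RenderJob` type, and the hash-based `page_seed` helper are hypothetical, and no diffusion library is called.

```python
import hashlib
from dataclasses import dataclass

# Family names come from the list above; the pixel sizes are hypothetical.
FAMILY_SIZES = {
    "square": (1024, 1024),
    "hero": (1920, 1080),
    "portrait": (768, 1024),
    "og": (1200, 630),
    "compact": (640, 480),
    "tall": (720, 1280),
    "ultrawide": (2560, 1080),
    "sidebar": (400, 800),
    "header": (1600, 400),
}


@dataclass(frozen=True)
class RenderJob:
    family: str
    width: int
    height: int
    seed: int
    prompt: str


def page_seed(page_path: str) -> int:
    # Assumption: derive a stable 32-bit seed from the page path so that
    # rebuilding a page reproduces the same seed.
    digest = hashlib.sha256(page_path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_jobs(page_path: str, prompt: str) -> list[RenderJob]:
    """Create one render job per family, all sharing the page's seed."""
    seed = page_seed(page_path)
    return [
        RenderJob(family, width, height, seed, prompt)
        for family, (width, height) in FAMILY_SIZES.items()
    ]


jobs = build_jobs("/en/berlin/coworking/", "quiet coworking loft in Berlin at dusk")
for job in jobs:
    print(f"{job.family:10s} {job.width}x{job.height} seed={job.seed}")
```

Each job would then be handed to the diffusion step. Keeping the seed and prompt fixed while varying only width and height is what ties the nine outputs for a page together.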