Architecture

Design decisions and patterns used in the package.

Package Structure

src/lilith_content_understanding/
├── __init__.py              # Main exports
├── detectors/               # Binary classification / object detection
│   ├── __init__.py
│   ├── nsfw_detector.py     # HuggingFace NSFW classification
│   └── body_part_detector.py # NudeNet body detection
├── analyzers/               # Rich analysis modules
│   ├── __init__.py
│   ├── depth_analyzer.py    # Depth estimation
│   ├── color_analyzer.py    # Color palette extraction
│   ├── composition_analyzer.py # Composition analysis
│   └── scene_classifier.py  # Scene classification
└── api/                     # REST API service
    ├── __init__.py
    └── service.py           # FastAPI application

Design Principles

1. Lazy Initialization

Models are loaded on first use, not at import:

class NSFWDetector:
    def __init__(self):
        self._classifier = None
        self._initialized = False

    def _ensure_initialized(self):
        if self._initialized:
            return
        # Load model here
        self._initialized = True

    def classify(self, image):
        self._ensure_initialized()  # Called before use
        return self._classifier(image)

Benefits:

  • Fast import times
  • No GPU memory until needed
  • Graceful degradation if models unavailable

2. Device Auto-Detection

GPU is used when available without configuration:

import torch

if device is None:
    device = "cuda" if torch.cuda.is_available() else "cpu"
self.device = device  # An explicit argument is kept as-is

Override when needed:

detector = NSFWDetector(device="cpu")  # Force CPU
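
The same resolution logic can be sketched as a standalone helper. The name resolve_device is hypothetical, and falling back to "cpu" when torch cannot be imported is an assumption rather than documented package behavior.

```python
from typing import Optional


def resolve_device(device: Optional[str] = None) -> str:
    """Return the explicit device if given, otherwise pick one automatically."""
    if device is not None:
        return device  # An explicit override always wins
    try:
        import torch  # Imported lazily so the helper also works without torch
    except ImportError:
        return "cpu"  # Assumption: no torch means CPU-only operation
    return "cuda" if torch.cuda.is_available() else "cpu"
```

Calling resolve_device("cpu") forces CPU, while resolve_device() returns "cuda" only when torch reports an available GPU.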

3. Dataclass Results

All results use dataclasses for:

  • Type hints and IDE support
  • Immutability when declared with frozen=True (plain dataclasses are mutable)
  • Easy serialization

@dataclass
class NSFWResult:
    is_nsfw: bool
    confidence: float
    category: str
    all_scores: dict[str, float]
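
A short sketch of how such a result could be serialized and protected against mutation. The frozen=True flag and the sample values (including the "sfw" label) are assumptions for illustration; the package may declare its dataclasses differently.

```python
import json
from dataclasses import FrozenInstanceError, asdict, dataclass


@dataclass(frozen=True)  # Assumption: frozen=True makes attributes read-only
class NSFWResult:
    is_nsfw: bool
    confidence: float
    category: str
    all_scores: dict[str, float]


result = NSFWResult(
    is_nsfw=False,
    confidence=0.97,
    category="sfw",  # Illustrative label, not a documented category name
    all_scores={"sfw": 0.97, "nsfw": 0.03},
)

# asdict() converts the dataclass recursively into plain dicts for JSON
payload = json.dumps(asdict(result))

try:
    result.confidence = 0.5  # Rejected because the dataclass is frozen
except FrozenInstanceError:
    mutated = False
else:
    mutated = True
```

Note that frozen=True only blocks attribute assignment; the all_scores dict itself can still be modified in place.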

4. Optional Dependencies

Heavy dependencies are optional:

[project.optional-dependencies]
nudenet = ["nudenet>=3.4.2"]
depth = ["timm>=0.9.0"]
api = ["fastapi>=0.100.0", "uvicorn>=0.23.0"]

Graceful handling:

try:
    from nudenet import NudeDetector
except ImportError as exc:
    # Re-raise with install instructions, keeping the original error chained
    raise ImportError(
        'Install with: pip install "lilith-content-understanding[nudenet]"'
    ) from exc
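
When a feature needs to check for an optional dependency without importing it, the standard library can answer the question directly. The helper names has_optional and require_optional below are hypothetical; only importlib.util.find_spec is a real API.

```python
import importlib.util


def has_optional(module_name: str) -> bool:
    """Report whether a top-level module is importable, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, extra: str) -> None:
    """Raise a helpful ImportError when an optional extra is missing."""
    if not has_optional(module_name):
        raise ImportError(
            f"{module_name} is not installed. Install with: "
            f'pip install "lilith-content-understanding[{extra}]"'
        )
```

A detector could call require_optional("nudenet", "nudenet") inside its lazy initializer, so the error appears only when the feature is actually used.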

5. Normalized Coordinates

All coordinates are normalized to 0-1 range:

# Bounding boxes
bbox = (x1, y1, x2, y2)  # All values 0-1

# To convert to pixels:
pixel_x1 = int(x1 * image.width)

Benefits:

  • Resolution-independent
  • Easy to compare across images
  • Consistent API
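
A minimal sketch of converting a full bounding box in both directions. The function names are hypothetical; truncation with int() follows the single-coordinate example above.

```python
def to_pixels(bbox, width, height):
    """Convert a normalized (x1, y1, x2, y2) box to integer pixel coordinates."""
    x1, y1, x2, y2 = bbox
    return (int(x1 * width), int(y1 * height), int(x2 * width), int(y2 * height))


def to_normalized(bbox_px, width, height):
    """Convert a pixel (x1, y1, x2, y2) box back to the 0-1 range."""
    x1, y1, x2, y2 = bbox_px
    return (x1 / width, y1 / height, x2 / width, y2 / height)


pixel_box = to_pixels((0.25, 0.5, 0.75, 1.0), width=640, height=480)
```

The x values scale by image width and the y values by image height, so the same normalized box maps correctly onto images of any resolution.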

6. Health Check Methods

All components expose health information:

def get_info(self) -> dict[str, Any]:
    return {
        "model": self.model_name,
        "device": self.device,
        "initialized": self._initialized,
    }

Model Selection

NSFW Detection

Primary: Marqo/nsfw-image-detection-384

  • 98.56% reported accuracy
  • Lightweight, fast
  • Good category coverage

Fallback: Falconsai/nsfw_image_detection

  • 98.04% reported accuracy
  • More established
  • Different failure modes
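
The text names a primary and a fallback model but does not say when the fallback is used. One plausible sketch, assuming the fallback is tried whenever the primary fails to load; load_first_available and the loader callable are hypothetical.

```python
from typing import Any, Callable, Sequence

MODEL_CANDIDATES = (
    "Marqo/nsfw-image-detection-384",  # Primary
    "Falconsai/nsfw_image_detection",  # Fallback
)


def load_first_available(
    loader: Callable[[str], Any],
    candidates: Sequence[str] = MODEL_CANDIDATES,
) -> tuple[str, Any]:
    """Try each model name in order and return the first one that loads."""
    errors = []
    for name in candidates:
        try:
            return name, loader(name)
        except Exception as exc:  # Broad on purpose: any load failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No NSFW model could be loaded: " + "; ".join(errors))
```

Returning the model name alongside the model lets a health check report which candidate is actually in use.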

Depth Estimation

Default: Depth Anything V2 Small

  • State-of-the-art accuracy
  • Reasonable speed
  • Small memory footprint

Alternative: MiDaS

  • Intel-backed
  • Well-tested
  • Good fallback

Scene Classification

Approach: CLIP zero-shot classification

  • No training required
  • Flexible categories
  • Good generalization

Why not Places365:

  • Requires a dedicated model trained on the Places365 dataset
  • Limited to trained categories
  • Less flexible
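
The scoring step behind zero-shot classification can be sketched without any model: compare an image embedding against one text embedding per candidate label, then apply a softmax. The toy 3-dimensional vectors stand in for real CLIP embeddings, and the logit_scale of 100 is an assumption approximating the temperature CLIP typically uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def zero_shot_scores(image_emb, label_embs, logit_scale=100.0):
    """Softmax over scaled cosine similarities between image and label embeddings."""
    logits = {
        label: logit_scale * cosine(image_emb, emb)
        for label, emb in label_embs.items()
    }
    peak = max(logits.values())  # Subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Toy vectors standing in for CLIP text embeddings of each prompt
labels = {
    "a photo of a beach": [0.9, 0.1, 0.0],
    "a photo of an office": [0.0, 0.2, 0.9],
}
scores = zero_shot_scores([0.8, 0.2, 0.1], labels)
best = max(scores, key=scores.get)
```

Changing the categories only means changing the label prompts, which is the flexibility the CLIP approach offers over a fixed-category classifier.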

API Design

Separation of Concerns

Detectors → Binary decisions (is/isn't)
Analyzers → Rich information (details)
API       → HTTP interface

Stateful vs Stateless

Detectors/Analyzers: Stateful (hold model)

  • Reuse instances for efficiency
  • Load once, use many times

API: Stateless requests

  • Each request independent
  • Models are kept in global state and shared across requests

Error Handling

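import importlib.util

from fastapi import HTTPException

# Validation errors → 400
if not valid_image:  # valid_image: result of earlier image validation
    raise HTTPException(status_code=400, detail="Invalid image")

# Missing features → 501
if importlib.util.find_spec("nudenet") is None:  # nudenet is not installed
    raise HTTPException(status_code=501, detail="Install nudenet")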

Memory Management

GPU Memory

  • Models loaded lazily
  • Single model instance per type
  • No automatic cleanup (manual if needed)
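
One way to get a single instance per model type, with manual cleanup, is to cache a factory function. This is a sketch under assumptions: the stand-in NSFWDetector class, get_nsfw_detector, and release_models are hypothetical and not the package's actual mechanism.

```python
from functools import lru_cache


class NSFWDetector:
    """Stand-in for the real detector, which would load a model lazily."""

    def __init__(self, device: str):
        self.device = device


@lru_cache(maxsize=None)
def get_nsfw_detector(device: str) -> NSFWDetector:
    """Return one shared detector per device, creating it on first request."""
    return NSFWDetector(device=device)


def release_models() -> None:
    """Manual cleanup: drop cached instances so they can be garbage collected."""
    get_nsfw_detector.cache_clear()
```

On a GPU, dropping the Python references may not return memory to the driver immediately; calling torch.cuda.empty_cache() afterwards is a common extra step.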

Image Processing

# Resize for analysis
small = image.resize((max_dim, max_dim))

# Process in memory
# Don't save intermediate files
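
The resize above produces a square image. If distortion matters for a given analyzer, the target size can instead be computed so that the aspect ratio is preserved. The helper fit_within is hypothetical and only computes dimensions; it does not touch the image.

```python
def fit_within(width: int, height: int, max_dim: int) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_dim."""
    longest = max(width, height)
    if longest <= max_dim:
        return width, height  # Already small enough; never upscale
    scale = max_dim / longest
    # Clamp to 1 so extremely thin images never get a zero-sized side
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The result can be passed straight to the resize call, for example image.resize(fit_within(image.width, image.height, max_dim)).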

Batch Processing

# Good: Reuse detector
detector = NSFWDetector()
for image in images:
    result = detector.classify(image)

# Bad: Reload each time
for image in images:
    detector = NSFWDetector()  # Reloads the model on every iteration
    result = detector.classify(image)

Extensibility

Adding New Detectors

  1. Create detectors/new_detector.py
  2. Follow the pattern:
    • __init__ with config
    • _ensure_initialized for lazy loading
    • Main method (e.g., detect, classify)
    • get_info for health checks
  3. Export in detectors/__init__.py
  4. Add to main __init__.py
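
The four-part pattern can be sketched as a skeleton. ExampleDetector, ExampleResult, and the placeholder model are hypothetical; a real detector would load actual weights in _load_model.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExampleResult:
    label: str
    confidence: float


class ExampleDetector:
    """Hypothetical detector following the package's pattern."""

    def __init__(self, model_name: str = "example/model", device: Optional[str] = None):
        self.model_name = model_name
        self.device = device or "cpu"
        self._model = None
        self._initialized = False

    def _load_model(self):
        # Placeholder: a real detector would load model weights here
        return lambda image: {"label": "unknown", "confidence": 0.0}

    def _ensure_initialized(self) -> None:
        if self._initialized:
            return
        self._model = self._load_model()
        self._initialized = True

    def detect(self, image: Any) -> ExampleResult:
        self._ensure_initialized()  # Lazy load on first use
        raw = self._model(image)
        return ExampleResult(label=raw["label"], confidence=raw["confidence"])

    def get_info(self) -> dict[str, Any]:
        return {
            "model": self.model_name,
            "device": self.device,
            "initialized": self._initialized,
        }
```

Keeping model loading in its own method also makes the class easy to test, since the loader can be replaced without touching the rest of the logic.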

Adding New Analyzers

Same pattern as detectors, in analyzers/ directory.

Adding API Endpoints

  1. Add response model in api/service.py
  2. Create endpoint function
  3. Use lazy initialization pattern
  4. Add to OpenAPI docs

Testing Strategy

Unit Tests

  • Test each component in isolation
  • Mock model loading for speed
  • Test edge cases
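
Mocking model loading might look like the following. The Detector class is a minimal stand-in with the same lazy-loading shape; the _load_model method name is an assumption about where the real loading happens.

```python
from unittest import mock


class Detector:
    """Minimal stand-in mirroring the lazy-loading shape of a detector."""

    def __init__(self):
        self._classifier = None
        self._initialized = False

    def _load_model(self):
        raise RuntimeError("real model loading is too slow for unit tests")

    def _ensure_initialized(self):
        if not self._initialized:
            self._classifier = self._load_model()
            self._initialized = True

    def classify(self, image):
        self._ensure_initialized()
        return self._classifier(image)


def test_classify_loads_model_once():
    fake_model = mock.Mock(return_value={"label": "sfw", "score": 0.9})
    with mock.patch.object(Detector, "_load_model", return_value=fake_model) as load:
        detector = Detector()
        assert detector.classify("img-1") == {"label": "sfw", "score": 0.9}
        detector.classify("img-2")
    load.assert_called_once()          # Lazy init ran exactly once
    assert fake_model.call_count == 2  # The fake model served both calls
```

The test never touches real weights, so it runs in milliseconds while still verifying the load-once behavior.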

Integration Tests

  • Test with real models
  • Verify GPU/CPU paths
  • Test API endpoints

Performance Tests

  • Benchmark model loading
  • Measure inference time
  • Track memory usage
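
A small benchmarking helper built on the standard library is sketched below. The name benchmark is hypothetical, and tracemalloc only sees Python heap allocations, so GPU memory would need a separate measurement such as torch.cuda.max_memory_allocated().

```python
import time
import tracemalloc
from typing import Any, Callable


def benchmark(fn: Callable[[], Any], repeats: int = 5) -> dict[str, float]:
    """Time fn over several runs and record peak Python heap usage."""
    tracemalloc.start()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return {
        "best_s": min(timings),
        "mean_s": sum(timings) / len(timings),
        "peak_bytes": float(peak),
    }
```

Wrapping model loading and a single classify call in separate benchmark runs keeps load time and inference time apart in the results.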