Analyzers
Analyzers return detailed, structured results about an image: depth, color palette, composition, and scene type.
DepthAnalyzer
Monocular depth estimation using transformer-based models.
Models
| Model | Description |
|---|---|
| `depth-anything-v2-small` | Fast, good accuracy (default) |
| `depth-anything-v2-base` | Better accuracy, slower |
| `midas-small` | Intel MiDaS hybrid model |
Basic Usage
from PIL import Image
from lilith_content_understanding import DepthAnalyzer
analyzer = DepthAnalyzer()
image = Image.open("photo.jpg")
result = analyzer.estimate(image)
# Inspect depth map dimensions and range
print(f"Size: {result.width}x{result.height}")
print(f"Depth range: {result.min_depth:.2f} to {result.max_depth:.2f}")
# Save visualization
result.save_visualization("depth.png", colormap="magma")
# Get PIL Image
depth_image = result.to_pil(colormap="viridis")
# Query specific point
depth_at_center = result.get_depth_at(0.5, 0.5) # Normalized coords
# Get foreground mask
foreground = result.get_foreground_mask(threshold=0.3)
# Segment into depth layers
layers = result.get_depth_layers(num_layers=3)
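The point query above takes normalized coordinates. The following is a minimal sketch of how such a lookup could map onto array indices, assuming `(x, y)` lie in `[0, 1]` with the origin at the top-left corner. `depth_at` and `toy_map` are hypothetical names used for illustration; the library's own `get_depth_at` may behave differently (for example, it may interpolate).

```python
def depth_at(depth_map, x, y):
    """Look up a depth value using normalized (x, y) coordinates in [0, 1].

    Hypothetical helper: assumes a row-major 2D list and nearest-pixel lookup.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("normalized coordinates must lie in [0, 1]")
    height = len(depth_map)
    width = len(depth_map[0])
    # Scale to pixel indices and clamp so that 1.0 maps to the last pixel.
    col = min(int(x * width), width - 1)
    row = min(int(y * height), height - 1)
    return depth_map[row][col]


# Toy 3x3 depth map standing in for a real result.
toy_map = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
]
center = depth_at(toy_map, 0.5, 0.5)
```

Normalized coordinates keep the query independent of the depth map's resolution, which can differ from the input image's resolution.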
Configuration
analyzer = DepthAnalyzer(
    model_name="depth-anything-v2-small",  # Model to use
    device="cuda",                         # Force GPU
)
DepthResult Fields
| Field | Type | Description |
|---|---|---|
| `depth_map` | `NDArray[float32]` | 2D array of normalized depths (0-1) |
| `width` | `int` | Depth map width |
| `height` | `int` | Depth map height |
| `min_depth` | `float` | Original minimum depth value |
| `max_depth` | `float` | Original maximum depth value |
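The fields above suggest that the raw model output is rescaled to the 0-1 range while the original extremes are kept in `min_depth` and `max_depth`. The sketch below shows one way this could work, assuming plain linear min-max scaling (the source does not state the exact formula). `normalize_depth` and `denormalize` are hypothetical helpers, and nested lists stand in for the NumPy array.

```python
def normalize_depth(raw):
    """Rescale a 2D grid of raw depths to [0, 1]; also return the original extremes.

    Assumption: linear min-max scaling.
    """
    flat = [value for row in raw for value in row]
    min_depth, max_depth = min(flat), max(flat)
    span = max_depth - min_depth
    if span == 0:
        # A constant map has no range to scale; return all zeros.
        return [[0.0 for _ in row] for row in raw], min_depth, max_depth
    normalized = [[(value - min_depth) / span for value in row] for row in raw]
    return normalized, min_depth, max_depth


def denormalize(value, min_depth, max_depth):
    """Map a normalized depth back to the original scale (inverse of the above)."""
    return min_depth + value * (max_depth - min_depth)


raw = [[2.0, 4.0], [6.0, 10.0]]
normalized, lo, hi = normalize_depth(raw)
```

Under this assumption, a value read from `depth_map` can be mapped back to the model's original scale with `denormalize(value, result.min_depth, result.max_depth)`.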
Colormaps
Available colormaps for visualization:
- `magma` (default) - Good for depth
- `viridis` - Perceptually uniform
- `plasma` - High contrast
- `inferno` - Alternative to magma
ColorAnalyzer
Color palette extraction using k-means clustering.
Basic Usage
from PIL import Image
from lilith_content_understanding import ColorAnalyzer
analyzer = ColorAnalyzer()
image = Image.open("photo.jpg")
result = analyzer.extract_palette(image, num_colors=5)
# Get colors
print(f"Hex colors: {result.hex_colors}")
print(f"Dominant: {result.dominant_color.hex}")
# Color analysis
print(f"Harmony: {result.harmony_type}")
print(f"Mood: {result.mood}")
print(f"Avg saturation: {result.average_saturation:.1f}%")
print(f"Avg lightness: {result.average_lightness:.1f}%")
# Individual colors
for color in result.colors:
    print(f"{color.name}: {color.hex} ({color.percentage:.1f}%)")
    print(f"  RGB: {color.rgb}")
    print(f"  HSL: H={color.hsl[0]:.0f} S={color.hsl[1]:.0f} L={color.hsl[2]:.0f}")
# Generate CSS gradient
css = result.to_css_gradient()
# Create swatch image
swatch = result.to_swatch_image(width=500, height=100)
swatch.save("palette.png")
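Palette extraction is described above as k-means clustering. The following is a small pure-Python sketch of that technique on RGB tuples, to show where the colors and their percentages come from. It is not the library's implementation: `kmeans_palette`, the evenly spaced seeding, and the fixed iteration count are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_palette(pixels, num_colors, iterations=10):
    """Cluster RGB pixels; return (centroid, share) pairs, largest cluster first."""
    # Deterministic seeding (an assumption): evenly spaced pixels from the input.
    step = max(len(pixels) // num_colors, 1)
    centroids = [pixels[min(i * step, len(pixels) - 1)] for i in range(num_colors)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for pixel in pixels:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(pixel, centroids[i]),
            )
            clusters[nearest].append(pixel)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            tuple(sum(channel) / len(members) for channel in zip(*members))
            if members
            else centroids[i]
            for i, members in enumerate(clusters)
        ]
    palette = [
        (centroids[i], len(members) / len(pixels))
        for i, members in enumerate(clusters)
        if members
    ]
    return sorted(palette, key=lambda entry: entry[1], reverse=True)


# Toy input: four reddish pixels and two bluish pixels.
pixels = [
    (250, 10, 10), (10, 10, 250), (240, 20, 20),
    (20, 20, 240), (255, 0, 0), (245, 15, 15),
]
palette = kmeans_palette(pixels, num_colors=2)
```

The cluster share plays the role of `color.percentage` in the usage example, and the largest cluster corresponds to the dominant color.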
Configuration
analyzer = ColorAnalyzer(
    resize_max=200,       # Resize for faster analysis
    min_saturation=0.05,  # Filter out grays
)
Color Harmony Types
| Type | Description |
|---|---|
| `monochromatic` | Single hue variations |
| `complementary` | Opposite hues |
| `analogous` | Adjacent hues |
| `triadic` | Three equidistant hues |
| `split-complementary` | Complement + neighbors |
| `compound` | Complex relationship |
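The harmony types above describe relationships between hues on the color wheel. As an illustration, the sketch below generates the hues each named harmony implies for a given base hue, using conventional color-wheel offsets (30° for neighbors, 120° for triads, 180° for complements). How the library actually detects a harmony type, and which tolerances it uses, is not specified in this document; `harmony_hues` is a hypothetical helper.

```python
# Conventional hue offsets in degrees (illustrative, not taken from the library).
HARMONY_OFFSETS = {
    "monochromatic": [0],
    "complementary": [0, 180],
    "analogous": [-30, 0, 30],
    "triadic": [0, 120, 240],
    "split-complementary": [0, 150, 210],
}


def harmony_hues(base_hue, harmony):
    """Return the hues (0-359 degrees) that a harmony implies for a base hue."""
    return [(base_hue + offset) % 360 for offset in HARMONY_OFFSETS[harmony]]


complementary = harmony_hues(30, "complementary")
analogous = harmony_hues(10, "analogous")
triadic = harmony_hues(300, "triadic")
```

The modulo keeps hues on the wheel, so neighbors of a hue near 0° wrap around to values near 360°.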
Mood Detection
| Mood | Characteristics |
|---|---|
| `airy` | Light, low saturation |
| `dark` | Low lightness |
| `vibrant` | High saturation |
| `muted` | Low saturation |
| `warm` | Red/orange hues |
| `cool` | Blue hues |
| `energetic` | Yellow/green hues |
| `natural` | Green hues |
| `neutral` | No dominant character |
Palette Comparison
palette1 = analyzer.extract_palette(image1)
palette2 = analyzer.extract_palette(image2)
similarity = analyzer.compare_palettes(palette1, palette2)
print(f"Overall: {similarity['overall']:.1%}")
print(f"Hue: {similarity['hue']:.1%}")
print(f"Saturation: {similarity['saturation']:.1%}")
print(f"Lightness: {similarity['lightness']:.1%}")
CompositionAnalyzer
Scores an image's composition: rule of thirds, symmetry, balance, complexity, negative space, and focal points.
Basic Usage
from PIL import Image
from lilith_content_understanding import CompositionAnalyzer
analyzer = CompositionAnalyzer()
image = Image.open("photo.jpg")
result = analyzer.analyze(image)
# Composition scores
print(f"Rule of thirds: {result.rule_of_thirds_score:.2f}")
print(f"Horizontal symmetry: {result.symmetry_score:.2f}")
print(f"Vertical symmetry: {result.vertical_symmetry_score:.2f}")
print(f"Balance: {result.balance_score:.2f} ({result.balance_type})")
# Visual complexity
print(f"Complexity: {result.complexity_score:.2f}")
print(f"Negative space: {result.negative_space_ratio:.1%}")
# Visual weight center
x, y = result.visual_weight_center
print(f"Weight center: ({x:.2f}, {y:.2f})")
# Focal points
for fp in result.focal_points:
    print(f"Focal point at ({fp.x:.2f}, {fp.y:.2f})")
    print(f"  Strength: {fp.strength:.2f}")
    print(f"  Quadrant: {fp.quadrant}")
    print(f"  On thirds: {fp.on_thirds_intersection}")
# Improvement suggestions
for suggestion in result.suggestions:
    print(f"- {suggestion}")
# Quick check
print(f"Well composed: {result.is_well_composed}")
print(f"Primary focal point: {result.primary_focal_point}")
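Each focal point above reports whether it sits on a rule-of-thirds intersection. A minimal sketch of such a check, assuming normalized coordinates and a distance tolerance of 0.08 (both the tolerance and the helper names are hypothetical; the library's criterion is not documented here):

```python
import math

# The four intersections of the horizontal and vertical third lines.
THIRDS_POINTS = [(x, y) for x in (1 / 3, 2 / 3) for y in (1 / 3, 2 / 3)]


def nearest_thirds_distance(x, y):
    """Distance from a normalized point to the closest thirds intersection."""
    return min(math.hypot(x - px, y - py) for px, py in THIRDS_POINTS)


def is_on_thirds(x, y, tolerance=0.08):
    """True if the point lies within `tolerance` of a thirds intersection."""
    return nearest_thirds_distance(x, y) <= tolerance


on_intersection = is_on_thirds(0.35, 0.65)  # close to (1/3, 2/3)
dead_center = is_on_thirds(0.5, 0.5)        # equidistant from all four points
```

A centered subject is about 0.24 away from every intersection, which is why centered compositions tend to score low on this criterion.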
Configuration
analyzer = CompositionAnalyzer(
    resize_max=400,  # Resize for faster analysis
)
Composition Scores
| Score | Description | Good Value |
|---|---|---|
| `rule_of_thirds_score` | Subject alignment with thirds | > 0.6 |
| `symmetry_score` | Horizontal symmetry | > 0.7 |
| `balance_score` | Visual weight distribution | > 0.7 |
| `complexity_score` | Visual complexity (0=simple) | 0.3-0.7 |
| `negative_space_ratio` | Empty area ratio | 0.2-0.5 |
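The "Good Value" column can be turned into a simple checklist. The sketch below encodes the thresholds from the table and reports which scores fall outside them. `GOOD_VALUE_CHECKS` and `scores_outside_good_range` are hypothetical names, and a plain dictionary stands in for a real result object.

```python
# Thresholds copied from the Composition Scores table.
GOOD_VALUE_CHECKS = {
    "rule_of_thirds_score": lambda v: v > 0.6,
    "symmetry_score": lambda v: v > 0.7,
    "balance_score": lambda v: v > 0.7,
    "complexity_score": lambda v: 0.3 <= v <= 0.7,
    "negative_space_ratio": lambda v: 0.2 <= v <= 0.5,
}


def scores_outside_good_range(scores):
    """Return the names of scores that miss their 'good value', sorted by name."""
    return sorted(
        name
        for name, is_good in GOOD_VALUE_CHECKS.items()
        if name in scores and not is_good(scores[name])
    )


# Example values standing in for a real analysis result.
example_scores = {
    "rule_of_thirds_score": 0.72,
    "symmetry_score": 0.40,
    "balance_score": 0.81,
    "complexity_score": 0.85,
    "negative_space_ratio": 0.30,
}
weak_spots = scores_outside_good_range(example_scores)
```

With a real result, the dictionary could be built as `{name: getattr(result, name) for name in GOOD_VALUE_CHECKS}`, since these names match the attributes used in the usage example.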
Balance Types
| Type | Description |
|---|---|
| `symmetric` | Even weight distribution |
| `asymmetric` | Intentionally uneven but balanced |
| `unbalanced` | Poor weight distribution |
SceneClassifier
Scene type classification using CLIP zero-shot classification.
Models
| Model | Description |
|---|---|
| `clip-vit-base` | Fast, good accuracy (default) |
| `clip-vit-large` | Better accuracy, slower |
Basic Usage
from PIL import Image
from lilith_content_understanding import SceneClassifier
classifier = SceneClassifier()
image = Image.open("photo.jpg")
result = classifier.classify(image)
# Scene type
print(f"Scene: {result.scene_type} ({result.scene_confidence:.1%})")
print(f"Environment: {result.environment}")
# Context (outdoor only)
if result.is_outdoor:
    print(f"Time of day: {result.time_of_day}")
    print(f"Weather: {result.weather}")
# Tags and suggestions
print(f"Tags: {result.tags}")
print(f"Suggested styles: {result.suggested_styles}")
# All scores
for scene, score in sorted(result.all_scores.items(), key=lambda x: -x[1]):
    print(f"  {scene}: {score:.1%}")
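The per-scene scores printed above come from CLIP zero-shot classification. In that technique, the image embedding is compared against the embedding of one text prompt per label, and the scaled similarities are passed through a softmax. The sketch below shows the arithmetic with toy 3-dimensional vectors in place of real CLIP embeddings. The vectors, the `temperature` value, and the function names are illustrative assumptions, not the library's internals.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_scores(image_embedding, label_embeddings, temperature=10.0):
    """Softmax over scaled cosine similarities; returns {label: probability}."""
    logits = {
        label: temperature * cosine(image_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    # Subtract the maximum logit for numerical stability.
    peak = max(logits.values())
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Toy embeddings; real ones would come from CLIP's image and text encoders.
label_embeddings = {
    "landscape": [1.0, 0.0, 0.0],
    "portrait": [0.0, 1.0, 0.0],
    "food": [0.0, 0.0, 1.0],
}
scores = zero_shot_scores([0.9, 0.1, 0.0], label_embeddings)
best = max(scores, key=scores.get)
```

Because the scores are a softmax over a fixed label set, they sum to 1 and are relative to the labels offered. An image that matches none of the categories still receives a top label.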
Configuration
classifier = SceneClassifier(
    model_name="clip-vit-base",  # CLIP model
    device="cuda",               # Force GPU
)
Scene Categories
| Category | Examples |
|---|---|
| `portrait` | Headshots, selfies, people |
| `landscape` | Mountains, valleys, vistas |
| `urban` | Cities, streets, architecture |
| `interior` | Rooms, indoor spaces |
| `nature` | Forests, gardens, plants |
| `water` | Ocean, lakes, rivers |
| `sky` | Clouds, sunsets, stars |
| `food` | Meals, dishes, cooking |
| `animal` | Pets, wildlife |
| `abstract` | Patterns, textures |
| `fantasy` | Magical, mythical |
| `scifi` | Futuristic, space |
Environment Detection
| Environment | Description |
|---|---|
| `outdoor` | Outside scenes |
| `indoor` | Interior spaces |
| `studio` | Controlled studio setting |
Time of Day (Outdoor)
- `day` - Daytime
- `night` - Nighttime
- `sunset` - Sunset/dusk
- `sunrise` - Sunrise/dawn
Weather (Outdoor)
- `sunny` - Clear, bright
- `cloudy` - Overcast
- `rainy` - Rain, storms
- `snowy` - Snow, winter
- `foggy` - Fog, mist
Performance Tips
GPU Acceleration
All analyzers auto-detect CUDA:
print(f"GPU enabled: {analyzer.is_gpu_enabled}")
analyzer = DepthAnalyzer(device="cuda") # Force GPU
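Auto-detection of this kind is commonly written as a fallback chain: honor an explicit request, otherwise use CUDA when available, otherwise fall back to the CPU. The sketch below assumes PyTorch as the backend, which this document does not confirm. `pick_device` is a hypothetical helper, and it degrades to `"cpu"` when PyTorch is not installed.

```python
def pick_device(preferred=None):
    """Choose a device string: explicit choice first, then CUDA, then CPU."""
    if preferred is not None:
        # An explicit request (for example "cuda") is passed through unchanged.
        return preferred
    try:
        import torch  # Assumption: the analyzers run on PyTorch.
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


auto_device = pick_device()
forced_device = pick_device("cuda")
```

Passing an explicit device skips detection entirely, which matches the "Force GPU" comment above. On a machine without a GPU, a forced `"cuda"` would be expected to fail when the model loads rather than fall back silently.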
Lazy Loading
Models load on first use:
analyzer = DepthAnalyzer() # Fast
result = analyzer.estimate(image) # Model loads here
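The lazy-loading behavior described above follows a common pattern: construction stores only configuration, and the expensive load happens on first access and is then cached. A generic sketch of that pattern follows. `LazyModel` and its `loader` callable are hypothetical and unrelated to the library's actual classes.

```python
class LazyModel:
    """Defer an expensive load until the model is first needed, then cache it."""

    def __init__(self, loader):
        self._loader = loader  # Callable that performs the expensive load.
        self._model = None
        self.load_count = 0    # Tracked only to demonstrate single loading.

    @property
    def model(self):
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model


lazy = LazyModel(loader=lambda: {"weights": "loaded"})
loads_after_init = lazy.load_count  # Nothing has been loaded yet.
first = lazy.model                  # Triggers the load.
second = lazy.model                 # Served from the cache.
```

The practical consequence is that the first call to an analyzer is slower than later ones, so a warm-up call is worth considering before timing anything.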
Resize for Speed
Analyzers resize internally, but you can control it:
# Smaller = faster, less accurate
analyzer = ColorAnalyzer(resize_max=100)
analyzer = CompositionAnalyzer(resize_max=200)
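To make the trade-off concrete, the sketch below computes the working resolution for a given `resize_max`, assuming the parameter caps the longest side while preserving aspect ratio (the document does not spell out the exact semantics). `fit_within` is a hypothetical helper.

```python
def fit_within(width, height, resize_max):
    """Scale (width, height) down so the longest side is at most resize_max."""
    longest = max(width, height)
    if longest <= resize_max:
        # Already small enough; never upscale.
        return width, height
    scale = resize_max / longest
    # Keep at least one pixel per side for extreme aspect ratios.
    return max(round(width * scale), 1), max(round(height * scale), 1)


working_size = fit_within(4000, 3000, 200)
unchanged = fit_within(100, 50, 200)
```

Under this assumption, a 4000x3000 photo analyzed at `resize_max=200` is reduced from 12 million pixels to 30,000, which is where most of the speedup comes from.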
Health Checks
info = analyzer.get_info()