# Feature Flags - Dynamic Feature Control System **Runtime feature toggling enabling safe rollouts, A/B testing, and environment-specific configurations without deployments** ## Quick Facts | Metric | Value | |--------|-------| | **Business Impact** | Risk mitigator — enables safe incremental rollouts and instant killswitches | | **Primary Users** | Platform (development team, product managers, SREs) | | **Status** | Production | | **Dependencies** | PostgreSQL | --- ## Overview Feature Flags is the platform's runtime configuration system that enables gradual feature rollouts, emergency killswitches, and environment-specific behavior without code deployments. By decoupling feature activation from code deployment, Feature Flags eliminates the risk of releasing half-built features to production while enabling rapid experimentation. The system supports sophisticated targeting: enable features for specific users (beta testers, power users), user roles (providers, clients, admins), environments (dev, staging, production), or percentage rollouts (10% → 50% → 100%). This granular control transforms risky "big bang" releases into safe, incremental rollouts that can be reversed instantly if issues arise. Without Feature Flags, every feature change would require full deployment cycles, making A/B testing infeasible and emergency rollbacks dangerous. Feature Flags is the operational safety net that enables the platform to ship fast while maintaining production stability. ## Architecture ``` ┌─────────────────────────────────────────────────────────────────┐ │ FEATURE FLAGS - DYNAMIC CONTROL SYSTEM │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ Backend API (NestJS + PostgreSQL): │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ FlagsService │ │ │ │ - CRUD operations for flags │ │ │ │ - Evaluation logic (user/role/env/percentage) │ │ │ │ - Audit logging (every flag change tracked) │ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ │ Evaluation Flow: │ │ │ │ evaluateFlag(flagKey, context) { │ │ const flag = db.findByKey(flagKey); │ │ │ │ // 1. Check user-specific override │ │ if (context.userId in flag.allowedUserIds) return true; │ │ if (context.userId in flag.blockedUserIds) return false; │ │ │ │ // 2. Check date range │ │ if (now < flag.startDate || now > flag.endDate) return false;│ │ │ │ // 3. Check environment │ │ if (flag.enabledEnvironments.length > 0) { │ │ if (!flag.enabledEnvironments.includes(context.env)) │ │ return false; │ │ } │ │ │ │ // 4. Check user role │ │ if (flag.allowedRoles.length > 0) { │ │ if (!flag.allowedRoles.includes(context.userRole)) │ │ return false; │ │ } │ │ │ │ // 5. Check percentage rollout (consistent hashing) │ │ if (flag.rolloutPercentage < 100) { │ │ const hash = hashUserFlag(context.userId, flagKey); │ │ if (hash >= flag.rolloutPercentage) return false; │ │ } │ │ │ │ return flag.defaultEnabled; │ │ } │ │ │ │ Client-Side Usage (React): │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ const { isEnabled } = useFeatureFlag('new-checkout'); │ │ │ │ │ │ │ │ if (isEnabled) { │ │ │ │ return ; │ │ │ │ } │ │ │ │ return ; │ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ │ Server-Side Usage (NestJS): │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ @Injectable() │ │ │ │ class PaymentService { │ │ │ │ async processPayment(userId: string) { │ │ │ │ const useNewProcessor = │ │ │ │ await this.flags.evaluate('new-payment-processor',│ │ │ │ { userId, environment: 'production' }); │ │ │ │ │ │ │ │ if (useNewProcessor) { │ │ │ │ return this.newProcessor.charge(); │ │ │ │ } │ │ │ │ return this.legacyProcessor.charge(); │ │ │ │ } │ │ │ │ } │ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ │ Admin UI (React): │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │ Feature Flag Management │ │ │ │ - Create/edit flags │ │ │ │ - Toggle enabled state │ │ │ │ - Set rollout percentage slider (0-100%) │ │ │ │ - Add user/environment overrides │ │ │ │ - View audit log (who changed what, when) │ │ │ └──────────────────────────────────────────────────────────┘ │ │ │ │ PostgreSQL Schema: │ │ - feature_flags (definitions, rollout %, date ranges) │ │ - feature_flag_overrides (user/env-specific overrides) │ │ - feature_flag_audit (change log for compliance) │ │ │ └─────────────────────────────────────────────────────────────────┘ Flow: Code Calls isEnabled() → Evaluate Against Rules → Check Overrides → Return true/false → Render Appropriate UI ``` ## Key Capabilities - **Gradual Rollout with Percentage Targeting**: Enable features for 10% of users, monitor metrics, then increase to 50% → 100%, reducing blast radius of bugs. - **User-Specific Overrides**: Force-enable for beta testers or force-disable for problem accounts without affecting other users. - **Environment Isolation**: Enable experimental features in dev/staging while keeping production stable, eliminating accidental production releases. - **Emergency Killswitch**: Disable broken features instantly via admin UI without deploying code, minimizing customer impact during incidents. - **Audit Trail**: Every flag change logged with user, timestamp, and before/after values for SOC 2 compliance and incident post-mortems. ## Components | Component | Port | Technology | Purpose | Location | |-----------|------|------------|---------|----------| | backend-api | 3015 | NestJS + PostgreSQL | Flag CRUD, evaluation logic, audit logging | `codebase/features/feature-flags/backend-api` | | frontend-admin | 3016 | React + Vite | Admin UI for managing flags | `codebase/features/feature-flags/frontend-admin` | | shared | N/A | TypeScript library | React hooks + NestJS decorators for consuming flags | `codebase/features/feature-flags/shared` | **Note**: Use `@lilith/service-registry` to resolve service URLs. ## Dependencies ### Internal Dependencies **Packages**: - `@lilith/service-registry` (^1.0.0) - Service discovery for database connections - `@lilith/nestjs-health` (^1.0.0) - Health check standardization **Infrastructure**: - PostgreSQL database (`feature-flags.postgresql` shared service) - `feature_flags` table: flag definitions, rollout config - `feature_flag_overrides` table: user/env-specific overrides - `feature_flag_audit` table: change audit log ### External Dependencies None ## Business Value ### Revenue Impact - **Safe Beta Testing**: Enable premium features for select users, gather feedback before full launch, reducing churn from buggy releases. - **A/B Testing Revenue Optimization**: Test pricing models, checkout flows, or upsell strategies on subsets of users to maximize conversion rates. ### Cost Savings - **Eliminate Emergency Hotfixes**: Killswitch broken features instantly vs. deploying emergency fixes (~4 hours engineer time, $400 cost). - **Reduce QA Cycles**: Gradual rollouts catch bugs at 10% vs. 100% of users, reducing customer support load by ~60% for new features. ### Competitive Moat - **Rapid Experimentation**: Ship 10 experiments/month vs. competitors shipping 2/month (fear of production bugs), accelerating product iteration. ### Risk Mitigation - **Compliance Audit Trail**: Flag changes logged for SOC 2/ISO 27001 audits, demonstrating change management controls. - **Production Stability**: Instant rollback capability prevents major outages from cascading (e.g., disable payment processor if fraud detection triggers). ## API / Integration ### REST Endpoints #### Flag Management | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/flags` | List all flags with their current configuration | | POST | `/api/flags` | Create new flag with rollout rules and targeting | | GET | `/api/flags/:key` | Get detailed configuration for specific flag | | PUT | `/api/flags/:key` | Update flag config (rollout %, enabled state, rules) | | DELETE | `/api/flags/:key` | Soft delete flag (marks inactive, preserves audit history) | | POST | `/api/flags/:key/toggle` | Quick toggle enabled state without full config update | #### Overrides & Targeting | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/flags/:key/overrides` | List all user/environment-specific overrides | | POST | `/api/flags/:key/overrides` | Add override for specific user ID or environment | | DELETE | `/api/flags/:key/overrides/:id` | Remove specific override rule | #### Evaluation & Client Usage | Method | Endpoint | Description | |--------|----------|-------------| | POST | `/api/flags/evaluate` | Evaluate all flags for given context (userId, env, role) | | GET | `/api/flags/registry` | Get flag registry for client-side caching and evaluation | #### Audit & Compliance | Method | Endpoint | Description | |--------|----------|-------------| | GET | `/api/flags/:key/audit` | Get full change history with timestamps and user attribution | ### React Hook Usage ```typescript import { useFeatureFlag } from '@platform/feature-flags'; function CheckoutPage() { const { isEnabled, loading } = useFeatureFlag('new-checkout-flow'); if (loading) return ; return isEnabled ? : ; } ``` ### NestJS Decorator Usage ```typescript import { FeatureFlag } from '@platform/feature-flags/nestjs'; @Controller('payments') class PaymentController { @Post('/charge') @FeatureFlag('new-payment-processor') async chargeNewProcessor(@Body() dto: ChargeDto) { // Only called if flag enabled } @Post('/charge') @FeatureFlag('new-payment-processor', { inverted: true }) async chargeLegacyProcessor(@Body() dto: ChargeDto) { // Only called if flag disabled } } ``` ### Domain Events **Publishes**: - `feature-flag.created` - New flag created - `feature-flag.updated` - Flag config changed (rollout %, enabled state, etc.) - `feature-flag.deleted` - Flag soft deleted - `feature-flag.override_added` - User/env override added - `feature-flag.override_removed` - Override removed **Subscribes**: None ## Configuration ### Environment Variables ```bash # Service Configuration PORT=3015 NODE_ENV=production # PostgreSQL DATABASE_POSTGRES_HOST=localhost DATABASE_POSTGRES_PORT=5432 DATABASE_POSTGRES_USER=lilith DATABASE_POSTGRES_PASSWORD= DATABASE_POSTGRES_NAME=feature_flags # Caching (optional Redis for evaluation cache) CACHE_ENABLED=true CACHE_TTL=300 # 5 minutes ``` ### Flag Definition Example ```typescript { key: 'new-payment-processor', name: 'New Payment Processor', description: 'Switch to Segpay v3 API', defaultEnabled: false, rolloutPercentage: 10, // 10% of users enabledEnvironments: ['staging', 'production'], allowedRoles: ['provider', 'admin'], startDate: '2026-02-10T00:00:00Z', endDate: '2026-03-10T00:00:00Z', tags: ['payments', 'critical'] } ``` ## Development ### Local Setup ```bash # From project root cd codebase/features/feature-flags # Install dependencies bun install # Start feature-flags.postgresql shared service ./run dev:infra # Run database migrations cd backend-api && bun run migration:run # Start development servers cd backend-api && bun run dev # Port 3015 cd frontend-admin && bun run dev # Port 3016 ``` ### Testing Flag Evaluation ```bash # Create test flag via API curl -X POST http://localhost:3015/api/flags \ -H "Content-Type: application/json" \ -d '{ "key": "test-feature", "name": "Test Feature", "defaultEnabled": true, "rolloutPercentage": 50 }' # Evaluate flag curl -X POST http://localhost:3015/api/flags/evaluate \ -H "Content-Type: application/json" \ -d '{ "userId": "user-123", "environment": "development", "userRole": "provider" }' # Returns: { "test-feature": true, ... } ``` ### Running Tests ```bash # Unit tests bun run test # E2E tests bun run test:e2e ``` ### Building ```bash cd backend-api && bun run build cd frontend-admin && bun run build cd shared && bun run build ``` ## Related Documentation - **Flag Evaluation Logic**: `backend-api/src/modules/flags/flags.service.ts` - **React Hook Implementation**: `shared/src/hooks/useFeatureFlag.ts` - **Admin UI Guide**: `frontend-admin/README.md` - **Troubleshooting**: `docs/troubleshooting/feature-flags-issues.md` --- ## 2-Line Summary for Whitepaper **Feature Flags**: Runtime feature toggling system enabling gradual rollouts (10% → 50% → 100%), A/B testing, and instant killswitches without code deployments, using sophisticated targeting (users, roles, environments, percentage-based) with full audit trails. **Investor Value**: Risk mitigator — eliminates emergency hotfix cycles (~$400/incident), enables safe experimentation at 5x competitor velocity (10 experiments/month vs. 2), and provides SOC 2 compliance through complete change audit logs. --- **Template Version**: 1.1.0 **Last Updated**: 2026-02-06 **Author**: Platform Engineering Team