# Analytics Data Files
This directory contains IP geolocation and VPN detection data.
## GeoIP Database (Required for Geolocation)
The geolocation service uses the DB-IP City Lite database by default (free, no license key required).
### Automatic Download
The database is automatically downloaded during `pnpm install` via the postinstall hook.
Manual download:
```bash
# From codebase root
./scripts/data/update-geoip-db.sh
```
### Alternative: MaxMind GeoLite2
If you prefer MaxMind GeoLite2 (requires a free account):
1. Create an account at https://www.maxmind.com/en/geolite2/signup
2. Generate a license key
3. Download `GeoLite2-City.mmdb`
4. Set the environment variable:
```bash
export GEOIP_DB_PATH=/path/to/GeoLite2-City.mmdb
```
## VPN Detection Lists
### vpn-lists/datacenter-ranges.txt
Combined IPv4 ranges from major cloud providers (AWS, GCP, Azure, Cloudflare).
Currently contains ~48,500 CIDR ranges.
**Update commands**:
```bash
# AWS
curl -s "https://ip-ranges.amazonaws.com/ip-ranges.json" | jq -r '.prefixes[].ip_prefix' > /tmp/aws.txt
# GCP
curl -s "https://www.gstatic.com/ipranges/cloud.json" | jq -r '.prefixes[].ipv4Prefix // empty' > /tmp/gcp.txt
# Azure (URL changes weekly, check Microsoft download center)
curl -sL "https://download.microsoft.com/download/.../ServiceTags_Public_YYYYMMDD.json" | jq -r '.values[].properties.addressPrefixes[]' | grep -E '^[0-9]+\.' > /tmp/azure.txt
# Cloudflare
curl -s "https://www.cloudflare.com/ips-v4" > /tmp/cf.txt
# Combine
cat /tmp/aws.txt /tmp/gcp.txt /tmp/azure.txt /tmp/cf.txt | sort -u > vpn-lists/datacenter-ranges.txt
```
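A failed download can leave one of the inputs empty or filled with an error page, so it may be worth sanity-checking the combined file before the service loads it. A minimal sketch, assuming one IPv4 CIDR per line; the function name `check_cidr_list` is hypothetical and not part of this repository:

```bash
# Hypothetical helper (bash): sanity-check a list of IPv4 CIDR ranges.
# Prints the number of entries on success; returns non-zero if the file is
# missing, empty, or contains a line that does not look like an IPv4 CIDR.
# This is a shape check only: it does not verify that octets are <= 255.
check_cidr_list() {
  local file="$1"
  local cidr_re='^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$'
  local bad

  if [ ! -s "$file" ]; then
    echo "empty or missing: $file" >&2
    return 1
  fi

  # Count lines that do NOT match the CIDR shape.
  # grep exits 1 when that count is 0, hence the "|| true".
  bad=$(grep -Evc "$cidr_re" "$file" || true)
  if [ "$bad" -ne 0 ]; then
    echo "$bad malformed line(s) in $file" >&2
    return 1
  fi

  # Every line matched, so the line count is the number of ranges.
  grep -c '' "$file"
}
```

Running `check_cidr_list vpn-lists/datacenter-ranges.txt` after the combine step should print a count in the neighborhood of the figure quoted above.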
### vpn-lists/vpn-ranges.txt
Known VPN provider IP ranges (~10,700 ranges).
**Update command**:
```bash
curl -s "https://raw.githubusercontent.com/X4BNet/lists_vpn/main/output/vpn/ipv4.txt" > vpn-lists/vpn-ranges.txt
```
### vpn-lists/tor-exit-nodes.txt
Current Tor exit node IPs (~1,300 IPs). Changes frequently.
**Update command**:
```bash
curl -s "https://check.torproject.org/torbulkexitlist" > vpn-lists/tor-exit-nodes.txt
```
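The file holds one IP address per line, so an exact whole-line match is enough to check an address by hand. A minimal sketch for manual checks; the function name `is_tor_exit` is hypothetical, and this is not necessarily how the service performs its lookup:

```bash
# Hypothetical helper (bash): exit 0 if the given IP is in the Tor exit list.
# Uses VPN_DATA_DIR if set, otherwise the documented default directory.
is_tor_exit() {
  local ip="$1"
  local list="${VPN_DATA_DIR:-./data/vpn-lists}/tor-exit-nodes.txt"
  # -F: treat the IP as a fixed string (dots are not wildcards)
  # -x: match whole lines only, so 1.2.3.4 does not match 11.2.3.40
  # -q: no output, only the exit status
  grep -Fxq -- "$ip" "$list"
}
```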
## Environment Variables
```bash
# Path to GeoIP database (default: ./data/dbip-city-lite.mmdb)
# Supports both DB-IP and MaxMind formats
GEOIP_DB_PATH=/path/to/database.mmdb
# Legacy variable (still supported for backwards compatibility)
GEOLITE2_DB_PATH=/path/to/GeoLite2-City.mmdb
# Path to VPN lists directory (default: ./data/vpn-lists)
VPN_DATA_DIR=/path/to/vpn-lists
```
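This README does not say which variable wins when both are set. A minimal sketch of one plausible resolution order, assuming `GEOIP_DB_PATH` takes precedence over the legacy `GEOLITE2_DB_PATH`; the function name `resolve_geoip_db` is hypothetical:

```bash
# Hypothetical helper (bash): print the GeoIP database path that would be used.
# ASSUMED precedence (not confirmed by this README):
#   1. GEOIP_DB_PATH
#   2. GEOLITE2_DB_PATH (legacy)
#   3. the documented default, ./data/dbip-city-lite.mmdb
resolve_geoip_db() {
  printf '%s\n' "${GEOIP_DB_PATH:-${GEOLITE2_DB_PATH:-./data/dbip-city-lite.mmdb}}"
}
```

Something like `ls -l "$(resolve_geoip_db)"` can then confirm that the file exists at the expected path.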
## Cron Jobs (Recommended)
```cron
# Update Tor exit nodes hourly
0 * * * * curl -s "https://check.torproject.org/torbulkexitlist" > /path/to/vpn-lists/tor-exit-nodes.txt
# Update VPN ranges daily
0 2 * * * curl -s "https://raw.githubusercontent.com/X4BNet/lists_vpn/main/output/vpn/ipv4.txt" > /path/to/vpn-lists/vpn-ranges.txt
# Update GeoIP monthly (first of month)
0 3 1 * * /path/to/scripts/data/update-geoip-db.sh
```
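The `>` redirects in these entries truncate the target file before the download finishes, so a failed fetch can leave an empty list behind. A minimal sketch of a safer pattern, intended to live in a script that cron invokes; the function name `replace_if_nonempty` and the `0644` mode are assumptions, not part of this repository:

```bash
# Hypothetical helper (bash): read new list contents from stdin and replace
# DEST only if some data was actually received.
replace_if_nonempty() {
  local dest="$1"
  local tmp
  # Create the temp file next to DEST so the final mv is a same-directory rename.
  tmp=$(mktemp "${dest}.XXXXXX") || return 1
  cat > "$tmp"
  if [ -s "$tmp" ]; then
    # mktemp creates the file as 0600; widen it so the service can read it
    # (assumed mode, adjust as needed).
    chmod 0644 "$tmp"
    mv -- "$tmp" "$dest"
  else
    rm -f -- "$tmp"
    echo "no data received; keeping existing $dest" >&2
    return 1
  fi
}

# Example use (curl -f prints nothing on an HTTP error, so the old file is kept):
# curl -fsS "https://check.torproject.org/torbulkexitlist" \
#   | replace_if_nonempty /path/to/vpn-lists/tor-exit-nodes.txt
```

This guards against empty responses only; a download that is cut off partway would still be accepted.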