# Log Management

Centralized log collection, real-time streaming, and powerful search capabilities for monitoring your Supascale infrastructure and Supabase instances.

Supascale's centralized logging system provides comprehensive visibility into your infrastructure operations, instance performance, and system events. This guide covers log collection, analysis, and monitoring capabilities.
## Overview
The Supascale logging system offers:
- Centralized Collection: Logs from all agents and instances in one place
- Real-time Streaming: Live log tailing and monitoring
- Powerful Search: Filter and search across all log data
- Log Retention: Configurable retention policies
- Alert Integration: Log-based alerting and notifications
- Export Capabilities: Download and integrate with external tools
## Log Sources

### Agent Logs
The Supascale agent generates logs for all operations:
Agent Operations:
- Agent startup and shutdown
- API communication and polling
- Command execution and results
- Resource monitoring activities
- Error conditions and recovery
Command Execution:
- Instance deployment progress
- Start/stop/restart operations
- Configuration changes
- Backup operations
- System maintenance tasks
Example Agent Log Entry:
{ "timestamp": "2025-08-02T10:30:00Z", "level": "INFO", "source": "agent", "server_id": "srv_abc123", "message": "Command executed successfully", "details": { "command_id": "cmd_123", "command_type": "deploy_instance", "duration_ms": 45000, "instance_id": "inst_xyz789" } }
### Instance Logs
Logs from all Supabase services within instances:
Database Logs (PostgreSQL):
- Connection events
- Query execution
- Error conditions
- Performance warnings
- Security events
API Logs (PostgREST):
- HTTP requests and responses
- Authentication events
- Query execution times
- Error responses
- Rate limiting events
Authentication Logs (GoTrue):
- User login/logout events
- Registration attempts
- Password resets
- Token generation
- Security violations
Storage Logs (Storage API):
- File upload/download operations
- Access control events
- Storage quota warnings
- Error conditions
Realtime Logs (Realtime Server):
- WebSocket connections
- Subscription events
- Broadcasting operations
- Connection errors
Example Instance Log Entry:
{ "timestamp": "2025-08-02T10:30:15Z", "level": "WARN", "source": "database", "instance_id": "inst_xyz789", "service": "postgresql", "message": "Slow query detected", "details": { "query_duration_ms": 5000, "query": "SELECT * FROM large_table WHERE...", "client_ip": "192.168.1.100", "user": "authenticated_user" } }
### System Logs

Infrastructure and system-level events (a sample entry follows the lists below):
Server Events:
- System resource alerts
- Docker service events
- Network connectivity issues
- Disk space warnings
- Security events
Supascale Platform:
- User actions from dashboard
- API requests and responses
- Billing events
- System maintenance
- Security audits
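For illustration, a hypothetical system-level entry in the same format as the agent and instance examples above (field values are placeholders, not captured output):

```json
{
  "timestamp": "2025-08-02T10:31:00Z",
  "level": "WARN",
  "source": "system",
  "server_id": "srv_abc123",
  "message": "Disk space warning",
  "details": {
    "mount_point": "/var/lib/docker",
    "used_percent": 87
  }
}
```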
## Accessing Logs

### Dashboard Interface
1. **Navigate to Logs**
   - Go to Dashboard → Logs
   - View the real-time log stream
   - Use filters to narrow results

2. **Log View Options**

   Real-time Stream:
   - Live updating log entries
   - Auto-scroll to newest entries
   - Pause/resume streaming
   - Customizable refresh intervals

   Historical Search:
   - Search through archived logs
   - Date range filtering
   - Advanced search capabilities
   - Export filtered results

3. **Log Entry Details**
   - Click any log entry for full details
   - View structured data and metadata
   - Copy log entries or specific fields
   - Link to related events (see the example query below)
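One way to follow a thread of related events is to search on a shared identifier. For example, using the `details.command_id` field from the agent log example above (`cmd_123` is a placeholder value):

```
details.command_id:cmd_123
```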
### Filtering and Search
Quick Filters:
```
# Filter by log level
level:ERROR

# Filter by source
source:database

# Filter by instance
instance_id:inst_xyz789

# Filter by server
server_id:srv_abc123

# Filter by time range
timestamp:[2025-08-02T10:00:00Z TO 2025-08-02T11:00:00Z]
```
Advanced Search Queries:
```
# Combine multiple filters
level:ERROR AND source:database AND instance_id:inst_xyz789

# Text search in messages
message:"connection failed"

# Search in structured data
details.query_duration_ms:>5000

# Wildcard searches
message:*timeout* OR message:*connection*

# Regular expressions
message:/error.*connection.*database/i
```
Saved Searches:
```yaml
saved_searches:
  critical_errors:
    query: "level:ERROR OR level:CRITICAL"
    description: "All critical errors across infrastructure"
  slow_queries:
    query: "source:database AND details.query_duration_ms:>1000"
    description: "Database queries taking longer than 1 second"
  authentication_issues:
    query: "source:auth AND (level:ERROR OR level:WARN)"
    description: "Authentication service issues"
```
## Log Analysis

### Performance Analysis
Query Performance Tracking:
```
# Find slow database queries
source:database AND details.query_duration_ms:>5000

# API response time analysis
source:api AND details.response_time_ms:>2000

# Authentication performance
source:auth AND details.duration_ms:>1000
```
Resource Usage Patterns:
```
# Memory warnings
message:*memory* AND level:WARN

# Disk space issues
message:*disk* AND (level:WARN OR level:ERROR)

# Connection pool exhaustion
message:*connection* AND message:*pool*
```
### Error Analysis
Error Pattern Detection:
```
# Database connection errors
source:database AND message:*connection* AND level:ERROR

# API errors by endpoint
source:api AND level:ERROR AND details.endpoint:"/auth/v1/token"

# Instance deployment failures
source:agent AND details.command_type:"deploy_instance" AND level:ERROR
```
Error Correlation:
```yaml
error_correlation:
  # Group related errors
  - timespan: "5m"
    conditions:
      - "source:database AND level:ERROR"
      - "source:api AND level:ERROR"
    description: "Database errors affecting API"

  # Cascade failure detection
  - timespan: "10m"
    conditions:
      - "source:agent AND message:*docker*"
      - "instance_id:* AND level:ERROR"
    description: "Docker issues causing instance failures"
```
### Security Analysis
Security Event Monitoring:
```
# Failed authentication attempts
source:auth AND message:*failed* AND details.attempt_count:>3

# Suspicious API access
source:api AND (details.status_code:401 OR details.status_code:403)

# Unusual database access patterns
source:database AND message:*unauthorized*
```
Audit Trail:
```
# User management actions
source:auth AND (message:*user_created* OR message:*user_deleted*)

# Configuration changes
source:agent AND details.command_type:*config*

# Data access patterns
source:database AND (details.query:*DELETE* OR details.query:*UPDATE*)
```
## Log-based Alerting

### Alert Configuration
Create alerts based on log patterns:
Error Rate Alerts:
```yaml
alerts:
  high_error_rate:
    query: "level:ERROR"
    condition:
      count: ">10"
      timespan: "5m"
    severity: "warning"
    message: "High error rate detected: {{count}} errors in 5 minutes"

  critical_database_errors:
    query: "source:database AND level:ERROR"
    condition:
      count: ">5"
      timespan: "1m"
    severity: "critical"
    message: "Critical database errors detected"
```
Performance Alerts:
```yaml
alerts:
  slow_query_alert:
    query: "source:database AND details.query_duration_ms:>10000"
    condition:
      count: ">3"
      timespan: "10m"
    severity: "warning"
    message: "Multiple slow queries detected"

  api_response_time:
    query: "source:api AND details.response_time_ms:>5000"
    condition:
      count: ">10"
      timespan: "5m"
    severity: "warning"
    message: "API response time degradation"
```
Security Alerts:
```yaml
alerts:
  brute_force_detection:
    query: "source:auth AND message:*failed_login*"
    condition:
      count: ">20"
      timespan: "10m"
      group_by: "details.client_ip"
    severity: "critical"
    message: "Potential brute force attack from {{details.client_ip}}"

  unusual_database_activity:
    query: "source:database AND (details.query:*DROP* OR details.query:*TRUNCATE*)"
    condition:
      count: ">1"
      timespan: "1m"
    severity: "critical"
    message: "Potentially dangerous database operations detected"
```
### Alert Notifications
Notification Channels:
```yaml
notifications:
  email:
    enabled: true
    recipients:
      - "ops-team@company.com"
      - "security@company.com"
    alert_types:
      - "critical"
      - "warning"
    template: |
      Alert: {{alert_name}}
      Severity: {{severity}}
      Query: {{query}}
      Count: {{count}} events in {{timespan}}
      Recent events:
      {{#each recent_events}}
      - {{timestamp}}: {{message}}
      {{/each}}

  slack:
    enabled: true
    webhook_url: "https://hooks.slack.com/services/..."
    channel: "#alerts"
    message_format: |
      🚨 *{{alert_name}}*
      Severity: {{severity}}
      {{count}} events matching: `{{query}}`
      <{{dashboard_url}}|View in Dashboard>

  webhook:
    enabled: true
    url: "https://your-api.com/alerts"
    headers:
      Authorization: "Bearer token"
    payload: |
      {
        "alert": "{{alert_name}}",
        "severity": "{{severity}}",
        "query": "{{query}}",
        "count": {{count}},
        "timespan": "{{timespan}}",
        "events": {{recent_events}}
      }
```
## Log Retention and Management

### Retention Policies
Configure how long logs are stored:
Retention Configuration:
```yaml
retention:
  # Agent logs
  agent:
    duration: "30d"
    compression: true
    archive_location: "s3://logs-archive/agent/"

  # Instance logs by service
  instance:
    database: "90d"   # Keep DB logs longer for analysis
    api: "30d"        # API logs for debugging
    auth: "180d"      # Auth logs for security compliance
    storage: "30d"    # Storage operation logs
    realtime: "7d"    # Realtime logs (high volume)

  # System logs
  system:
    duration: "60d"
    critical_events: "1y"  # Keep critical events longer
```
Automatic Cleanup:
```yaml
cleanup:
  enabled: true
  schedule: "0 2 * * *"  # Daily at 2 AM
  policies:
    # Compress old logs
    - age: "7d"
      action: "compress"
      compression: "gzip"

    # Archive to cold storage
    - age: "30d"
      action: "archive"
      destination: "s3://logs-archive/"

    # Delete very old logs
    - age: "365d"
      action: "delete"
      confirm: true
```
### Log Export
Bulk Export:
```bash
# Export logs for a specific time range
supascale logs export \
  --start "2025-08-01T00:00:00Z" \
  --end "2025-08-02T00:00:00Z" \
  --format "json" \
  --output "/tmp/logs-export.json.gz"

# Export specific log types
supascale logs export \
  --source "database" \
  --level "ERROR" \
  --instance "inst_xyz789" \
  --format "csv"
```
Streaming Export:
```bash
# Stream logs to an external system
supascale logs stream \
  --query "level:ERROR" \
  --destination "syslog://logs.company.com:514"

# Real-time export to a file
supascale logs stream \
  --follow \
  --output "/var/log/supascale.log"
```
## Integration with External Tools

### Log Shipping
Elasticsearch Integration:
```yaml
integrations:
  elasticsearch:
    enabled: true
    hosts:
      - "https://elasticsearch.company.com:9200"
    authentication:
      type: "api_key"
      api_key: "base64-encoded-key"
    index_template: "supascale-logs-{date}"
    mapping:
      timestamp: "@timestamp"
      level: "log.level"
      source: "service.name"
      message: "message"
```
Splunk Integration:
```yaml
integrations:
  splunk:
    enabled: true
    hec_endpoint: "https://splunk.company.com:8088/services/collector"
    hec_token: "your-hec-token"
    source_type: "supascale"
    index: "infrastructure"
    metadata:
      environment: "production"
      datacenter: "us-west-2"
```
Datadog Integration:
```yaml
integrations:
  datadog:
    enabled: true
    api_key: "your-datadog-api-key"
    site: "datadoghq.com"
    tags:
      - "environment:production"
      - "service:supascale"
    log_processing:
      multiline: true
      pipeline: "supascale-logs"
```
### Syslog Integration
Syslog Configuration:
```yaml
syslog:
  enabled: true
  facility: "local0"
  severity_mapping:
    DEBUG: "debug"
    INFO: "info"
    WARN: "warning"
    ERROR: "err"
    CRITICAL: "crit"
  destinations:
    - protocol: "tcp"
      host: "syslog.company.com"
      port: 514
      format: "rfc5424"
    - protocol: "udp"
      host: "backup-syslog.company.com"
      port: 514
      format: "rfc3164"
```
## Troubleshooting with Logs

### Common Debugging Scenarios
Instance Deployment Issues:
```
# Find deployment errors
source:agent AND details.command_type:"deploy_instance" AND level:ERROR

# Check a specific instance deployment
source:agent AND details.instance_id:"inst_xyz789" AND details.command_type:"deploy_instance"

# Docker-related deployment issues
source:agent AND message:*docker* AND level:ERROR
```
Performance Issues:
```
# Database performance problems
source:database AND (details.query_duration_ms:>5000 OR message:*slow*)

# Memory pressure indicators
message:*memory* AND (level:WARN OR level:ERROR)

# Connection pool issues
message:*connection* AND message:*pool* AND level:ERROR
```
Authentication Problems:
```
# Login failures
source:auth AND message:*login* AND level:ERROR

# Token validation issues
source:auth AND message:*token* AND (level:WARN OR level:ERROR)

# JWT-related problems
source:auth AND message:*jwt* AND level:ERROR
```
### Log Analysis Best Practices
Structured Logging:
- Use consistent log formats across services
- Include relevant context in log entries
- Use appropriate log levels for different events
- Include correlation IDs for request tracing (see the sketch below)
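These practices combine in entries like the hypothetical one below, which follows the same structured format as the examples above and carries a correlation ID (`req_4f2a9c` is a placeholder) so a single request can be traced across services:

```json
{
  "timestamp": "2025-08-02T10:32:00Z",
  "level": "INFO",
  "source": "api",
  "instance_id": "inst_xyz789",
  "message": "Request completed",
  "details": {
    "correlation_id": "req_4f2a9c",
    "endpoint": "/auth/v1/token",
    "response_time_ms": 120
  }
}
```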
Search Optimization:
- Use specific queries instead of broad searches
- Combine multiple filters for precise results
- Use time ranges to limit search scope
- Save frequently used search patterns
Performance Monitoring:
- Monitor log ingestion rates
- Track search performance
- Set up alerts for log volume spikes (example after this list)
- Regular cleanup of old logs
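As a sketch, a volume-spike alert can reuse the alert schema shown earlier. The match-all query and the threshold below are assumptions to tune for your environment:

```yaml
alerts:
  log_volume_spike:
    query: "*"              # assumed match-all; adjust to your query syntax
    condition:
      count: ">10000"       # placeholder threshold; tune to your baseline volume
      timespan: "5m"
    severity: "warning"
    message: "Log volume spike: {{count}} entries in {{timespan}}"
```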
## API Access

### Logs API
Programmatic access to log data:
Search Logs:
`GET /api/v1/logs/search`

```json
{
  "query": "level:ERROR AND source:database",
  "start_time": "2025-08-02T10:00:00Z",
  "end_time": "2025-08-02T11:00:00Z",
  "limit": 100,
  "sort": "-timestamp"
}
```
Stream Logs:
`GET /api/v1/logs/stream`

```json
{
  "query": "instance_id:inst_xyz789",
  "follow": true,
  "buffer_size": 1000
}
```
Export Logs:
`POST /api/v1/logs/export`

```json
{
  "query": "source:agent",
  "start_time": "2025-08-01T00:00:00Z",
  "end_time": "2025-08-02T00:00:00Z",
  "format": "json",
  "compression": "gzip"
}
```
Next: Explore Advanced Use Cases for complex deployment scenarios.