Production Checklist
Before deploying OpenGate IAM to production, complete every item in this checklist.
Security
- Replace in-memory RSA keys with Vault-backed key storage
- Change all default passwords — PostgreSQL, Redis, admin account
- Enable Redis AUTH — set a strong Redis password (`requirepass`)
- TLS/HTTPS everywhere — terminate TLS at the load balancer or gateway
- Network isolation — microservices must not be publicly accessible; route through gateway only
- Disable Vault dev mode — use a production Vault cluster with proper unsealing
- Enable Kafka TLS + SASL — encrypt Kafka traffic
- Rate limiting — configure Redis-based rate limiting on the gateway
- Remove `localhost` from CORS — replace with your actual domain
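As a rough illustration of the Redis AUTH and Kafka TLS + SASL items above, a service's client-side settings might look like the sketch below. It assumes the services are Spring Boot 3.x applications (Boot 2.x uses `spring.redis.*` instead of `spring.data.redis.*`) and that SCRAM is the chosen SASL mechanism; the hostnames, file path, and environment variable names are hypothetical placeholders, not OpenGate defaults.

```yaml
# Sketch only: assumes Spring Boot 3.x property names.
# Hostnames, paths, and environment variable names are hypothetical.
spring:
  data:
    redis:
      host: redis.internal               # hypothetical internal hostname
      password: ${REDIS_PASSWORD}        # must match `requirepass` on the Redis server
  kafka:
    bootstrap-servers: kafka.internal:9093   # hypothetical TLS listener
    security:
      protocol: SASL_SSL                 # TLS encryption plus SASL authentication
    ssl:
      trust-store-location: file:/etc/opengate/kafka-truststore.jks   # hypothetical path
      trust-store-password: ${KAFKA_TRUSTSTORE_PASSWORD}
    properties:
      sasl.mechanism: SCRAM-SHA-512      # assumed mechanism; use whatever your brokers accept
      sasl.jaas.config: >-
        org.apache.kafka.common.security.scram.ScramLoginModule required
        username="${KAFKA_USERNAME}" password="${KAFKA_PASSWORD}";
```

Reading the secrets from environment variables (or from Vault) keeps them out of the committed configuration file.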
Never expose actuator publicly
Always restrict /actuator/** to internal networks. Use a separate management port (management.server.port: 9090) locked to private subnets.
Configuration
- Set CORS origins — replace `localhost:3000` with your actual domain
- Configure SMTP — set real SMTP credentials per realm
- Token lifespans — access token: 5 min, refresh token: 7 days (adjust per requirements)
- BCrypt cost factor — default 12; increase to 13–14 for stronger security (each increment roughly doubles hashing time, so measure login latency first)
- MFA enforcement — enable `mfaRequired: true` on sensitive realms
- `server.forward-headers-strategy: framework` — required when behind a proxy
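The CORS and proxy items above could be expressed roughly as follows. This is a sketch that assumes the gateway is built on Spring Cloud Gateway, which this checklist does not confirm; the `globalcors` property prefix also differs between Spring Cloud Gateway releases, so check it against the version you run. The domain `iam.example.com` is a placeholder.

```yaml
# Sketch only: the globalcors block assumes a Spring Cloud Gateway based gateway.
server:
  forward-headers-strategy: framework    # honor Forwarded / X-Forwarded-* headers set by the proxy
spring:
  cloud:
    gateway:
      globalcors:
        cors-configurations:
          '[/**]':
            allowedOrigins:
              - "https://iam.example.com"   # placeholder: your actual domain, not localhost:3000
            allowedMethods:
              - GET
              - POST
              - PUT
              - DELETE
            allowedHeaders:
              - Authorization
              - Content-Type
```

Listing explicit origins, rather than a wildcard, keeps the policy in line with the "remove `localhost` from CORS" item in the Security section.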
Infrastructure
- PostgreSQL replicas — set up read replicas for user-service and realm-service
- Redis Sentinel or Cluster — for session store high availability
- Kafka replication factor — set `replication.factor=3` for all production topics
- Persistent volumes — ensure all Docker / K8s volumes are backed by persistent storage
- Automated backups — daily `pg_dump`, plus point-in-time recovery (PITR) enabled via WAL archiving
- PgBouncer — connection pooling in front of PostgreSQL
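For Kubernetes deployments, the daily `pg_dump` item could be scheduled with a CronJob along the lines of the sketch below. All resource names, the hostname, the database name, the Secret, and the PersistentVolumeClaim are hypothetical. Note that `pg_dump` produces a logical backup only; PITR additionally needs base backups and WAL archiving, which this sketch does not cover.

```yaml
# Sketch only: every name below (CronJob, Secret, PVC, host, database) is hypothetical.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: opengate-pg-dump
spec:
  schedule: "0 2 * * *"          # once a day at 02:00
  concurrencyPolicy: Forbid      # never run two dumps at the same time
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:16   # should match your server's major version
              command: ["/bin/sh", "-c"]
              args:
                # "$$(" is the Kubernetes escape for a literal "$(", so the shell runs date
                - pg_dump --format=custom --file=/backups/opengate-$$(date +%F).dump
              env:
                - name: PGHOST
                  value: postgres.internal
                - name: PGDATABASE
                  value: opengate
                - name: PGUSER
                  value: backup
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: opengate-postgres
                      key: password
              volumeMounts:
                - name: backups
                  mountPath: /backups
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: opengate-backups
```

Backing `/backups` with a PersistentVolumeClaim also satisfies the persistent volumes item; in practice the dumps should then be copied to storage outside the cluster.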
Observability
- Centralized logging — ship logs to ELK stack or Loki
- Metrics — Prometheus scraping all services on `/actuator/prometheus`
- Distributed tracing — OpenTelemetry + Jaeger / Tempo
- Alerting — alert on login failure spikes, service downtime, high P95 latency
- Audit trail — Kafka audit topics consumed and stored for compliance
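The metrics item above might translate into a Prometheus scrape job like this sketch. It assumes the services expose actuator on the internal management port 9090 recommended in this checklist; the job name and target hostnames are hypothetical, and a real deployment would more likely use service discovery than a static list.

```yaml
# prometheus.yml fragment. Sketch only: job name and hostnames are hypothetical.
scrape_configs:
  - job_name: opengate-services
    metrics_path: /actuator/prometheus   # Spring Boot Actuator's Prometheus endpoint
    scrape_interval: 15s
    static_configs:
      - targets:
          - user-service.internal:9090    # management port, reachable on private subnets only
          - realm-service.internal:9090
```

Because the targets use the management port, Prometheus itself has to run inside the private network that can reach it.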
Monitoring Endpoints
| Endpoint | Description |
|---|---|
| `GET /actuator/health` | Service health + all dependencies |
| `GET /actuator/metrics` | JVM, HTTP, DB metrics |
| `GET /actuator/prometheus` | Prometheus-format metrics |
| `GET /actuator/info` | Build info, version |
application.yml (production)

```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus,info
  server:
    port: 9090 # expose actuator on internal port only
  endpoint:
    health:
      show-details: never
```