Backup & Restore
Procedures for backing up and restoring the database.
Automated Backups (RDS)
Configuration
Staging:
- Backup window: 03:00-04:00 UTC
- Retention: 7 days
- Point-in-time recovery: No
Production:
- Backup window: 03:00-04:00 UTC (low traffic)
- Retention: 30 days
- Point-in-time recovery: Yes (5-minute granularity)
- Cross-region backup: No (consider for DR)
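The settings above can be checked against a live instance. The following is a minimal sketch, assuming the instance description comes from boto3's `describe_db_instances`, which reports `BackupRetentionPeriod` and `PreferredBackupWindow`. The `EXPECTED` table and the `check_backup_config` name are illustrative and not part of the existing tooling.

```python
# Expected settings per environment, copied from the configuration above.
EXPECTED = {
    "staging": {"retention_days": 7, "window": "03:00-04:00"},
    "production": {"retention_days": 30, "window": "03:00-04:00"},
}


def check_backup_config(env, instance):
    """Return a list of mismatches between an instance description and EXPECTED.

    `instance` is one entry of describe_db_instances()["DBInstances"].
    """
    expected = EXPECTED[env]
    problems = []
    if instance.get("BackupRetentionPeriod") != expected["retention_days"]:
        problems.append(
            f"retention is {instance.get('BackupRetentionPeriod')} days, "
            f"expected {expected['retention_days']}"
        )
    if instance.get("PreferredBackupWindow") != expected["window"]:
        problems.append(
            f"backup window is {instance.get('PreferredBackupWindow')}, "
            f"expected {expected['window']}"
        )
    return problems
```

An empty list means the instance matches the documented policy; anything else is worth a look before relying on the backups.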
Verify Backups
# List automated backups
aws rds describe-db-snapshots \
  --db-instance-identifier prod-db \
  --snapshot-type automated
# Show the most recent backup (results are not guaranteed to be sorted, so sort by creation time)
aws rds describe-db-snapshots \
  --db-instance-identifier prod-db \
  --snapshot-type automated \
  --query 'sort_by(DBSnapshots, &SnapshotCreateTime)[-1].[DBSnapshotIdentifier,SnapshotCreateTime,Status]'
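The same check can be scripted. This is a sketch only, assuming a boto3 RDS client is passed in as `rds`; the function names are hypothetical. It paginates through the automated snapshots and picks the newest one that is actually usable, rather than trusting the order in which the API returns them.

```python
def list_automated_snapshots(rds, instance_id):
    """Collect all automated snapshots for an instance using the boto3 paginator."""
    paginator = rds.get_paginator("describe_db_snapshots")
    pages = paginator.paginate(
        DBInstanceIdentifier=instance_id,
        SnapshotType="automated",
    )
    return [snapshot for page in pages for snapshot in page["DBSnapshots"]]


def latest_available_snapshot(snapshots):
    """Return the newest snapshot whose Status is 'available', or None."""
    usable = [
        s for s in snapshots
        if s.get("Status") == "available" and "SnapshotCreateTime" in s
    ]
    if not usable:
        return None
    return max(usable, key=lambda s: s["SnapshotCreateTime"])
```

A `None` result for production is itself an alert condition, since it means no completed automated backup exists.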
Manual Snapshots
Create Snapshot
# Before a migration or other major change
SNAPSHOT_ID="prod-db-pre-migration-$(date +%Y%m%d-%H%M%S)"
aws rds create-db-snapshot \
  --db-instance-identifier prod-db \
  --db-snapshot-identifier "$SNAPSHOT_ID" \
  --tags Key=Purpose,Value=PreMigration Key=CreatedBy,Value=Manual
# Wait for it to complete; only report success if the wait succeeds
aws rds wait db-snapshot-completed \
  --db-snapshot-identifier "$SNAPSHOT_ID" \
  && echo "✅ Snapshot created successfully"
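When snapshots are created from a script, it helps to build and validate the identifier in one place. A minimal sketch follows, assuming UTC timestamps (the shell example uses the local clock) and the RDS identifier rules: letters, digits, and hyphens only, starting with a letter, with no trailing or doubled hyphens. The `snapshot_identifier` helper is hypothetical.

```python
import re
from datetime import datetime, timezone

# A letter, then alphanumerics optionally preceded by a single hyphen.
# This rejects trailing hyphens and two consecutive hyphens.
_VALID_ID = re.compile(r"[A-Za-z](?:-?[A-Za-z0-9])*")


def snapshot_identifier(instance_id, purpose, now=None):
    """Build '<instance>-<purpose>-<YYYYmmdd-HHMMSS>' and validate it."""
    now = now or datetime.now(timezone.utc)
    identifier = f"{instance_id}-{purpose}-{now:%Y%m%d-%H%M%S}"
    if len(identifier) > 255 or not _VALID_ID.fullmatch(identifier):
        raise ValueError(f"invalid RDS snapshot identifier: {identifier!r}")
    return identifier
```

Returning the identifier lets the caller reuse the exact same value for the create call and the subsequent wait.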
List Snapshots
# All manual snapshots
aws rds describe-db-snapshots \
  --db-instance-identifier prod-db \
  --snapshot-type manual \
  --query 'DBSnapshots[].[DBSnapshotIdentifier,SnapshotCreateTime]' \
  --output table
Delete Snapshot
# Delete an old snapshot (reduces storage cost)
aws rds delete-db-snapshot \
  --db-snapshot-identifier prod-db-old-snapshot
Restore from Snapshot
Restore to a New Instance
# Restore the snapshot to a new instance
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier prod-db-restored \
  --db-snapshot-identifier prod-db-pre-migration-20260120 \
  --db-instance-class db.t3.medium \
  --vpc-security-group-ids sg-xxxxx \
  --db-subnet-group-name prod-subnet-group
# Wait until the instance is available (about 15-20 minutes)
aws rds wait db-instance-available \
  --db-instance-identifier prod-db-restored
# Get the endpoint
aws rds describe-db-instances \
  --db-instance-identifier prod-db-restored \
  --query 'DBInstances[0].Endpoint.Address'
Validate the Restore
# Connect to the restored instance
psql -h prod-db-restored.xxxxx.rds.amazonaws.com \
  -U app_user -d app_db
-- Verify the data (run inside psql)
SELECT count(*) FROM users;
SELECT max(created_at) FROM orders;
-- If everything checks out, the instance can be promoted to primary
-- (requires a connection string change and downtime)
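The two validation queries can be turned into a pass/fail check. This sketch assumes orders arrive fairly continuously, so the newest order in a healthy restore should sit shortly before the snapshot time. The `validate_restore` name and the one-hour `max_gap` default are assumptions to tune, not values from this runbook.

```python
from datetime import datetime, timedelta


def validate_restore(user_count, latest_order_at, snapshot_time,
                     max_gap=timedelta(hours=1)):
    """Return a list of problems found in a restored database (empty means OK).

    user_count       -- result of SELECT count(*) FROM users
    latest_order_at  -- result of SELECT max(created_at) FROM orders
    snapshot_time    -- when the restored snapshot was taken
    max_gap          -- assumed longest normal pause between orders
    """
    problems = []
    if not user_count:
        problems.append("users table is empty")
    if latest_order_at is None:
        problems.append("orders table is empty")
    elif latest_order_at > snapshot_time:
        problems.append("latest order is newer than the snapshot")
    elif snapshot_time - latest_order_at > max_gap:
        problems.append("latest order is older than expected for this snapshot")
    return problems
```

Both timestamps must use the same convention (both naive UTC or both timezone-aware), otherwise the comparison raises a `TypeError`.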
Point-in-Time Recovery
To restore to a specific point in time:
# Restore to 1 hour ago (GNU date syntax)
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier prod-db \
  --target-db-instance-identifier prod-db-pitr-restored \
  --restore-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)"
# Or use the latest restorable time
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier prod-db \
  --target-db-instance-identifier prod-db-pitr-restored \
  --use-latest-restorable-time
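A restore time outside the retained window is rejected by RDS, so it can be worth validating the timestamp before calling the API. Below is a portable sketch that does not depend on GNU `date`. It assumes the caller supplies the window bounds as timezone-aware datetimes (for example the `LatestRestorableTime` reported by `describe-db-instances` as the upper bound); `restore_time_arg` is a hypothetical helper.

```python
from datetime import datetime, timezone


def restore_time_arg(target, earliest, latest):
    """Format `target` for --restore-time after checking the restorable window.

    All three arguments must be timezone-aware datetimes.
    """
    if target.tzinfo is None:
        raise ValueError("target must be timezone-aware")
    target = target.astimezone(timezone.utc)
    if not earliest <= target <= latest:
        raise ValueError(
            f"{target.isoformat()} is outside the restorable window "
            f"{earliest.isoformat()} .. {latest.isoformat()}"
        )
    # Same format as the shell example: ISO 8601 in UTC with a trailing Z.
    return target.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Failing early here avoids waiting on a restore request that was never going to be accepted.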
Export to S3
For long-term backup or analytics:
# Export a snapshot to S3
aws rds start-export-task \
  --export-task-identifier prod-export-20260120 \
  --source-arn arn:aws:rds:us-east-1:123456:snapshot:prod-db-snapshot \
  --s3-bucket-name app-db-exports \
  --s3-prefix exports/2026/01/20/ \
  --iam-role-arn arn:aws:iam::123456:role/RDSExportRole \
  --kms-key-id arn:aws:kms:us-east-1:123456:key/xxxxx
# Format: Parquet (optimized for analytics)
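For scheduled exports, the date-dependent pieces (task identifier and S3 prefix) are easy to get out of sync when typed by hand. This sketch builds the keyword arguments for boto3's `start_export_task` from a single date. The naming scheme mirrors the command above; `export_task_params` and its `name` default are assumptions.

```python
from datetime import date


def export_task_params(snapshot_arn, bucket, role_arn, kms_key_arn, day, name="prod"):
    """Build keyword arguments for rds.start_export_task() for a given day."""
    return {
        "ExportTaskIdentifier": f"{name}-export-{day:%Y%m%d}",
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        "S3Prefix": f"exports/{day:%Y/%m/%d}/",
        "IamRoleArn": role_arn,
        "KmsKeyId": kms_key_arn,
    }
```

The result can be passed as `rds.start_export_task(**params)` on a boto3 RDS client.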
Backup Testing
Validate Backups Regularly
# Monthly: restore a backup into a test environment
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier test-restore-$(date +%Y%m) \
  --db-snapshot-identifier <latest-snapshot>
# Check integrity
python scripts/validate_backup.py --host test-restore-...
# Delete after validation
aws rds delete-db-instance \
  --db-instance-identifier test-restore-... \
  --skip-final-snapshot
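The monthly drill can be wrapped so the test instance is cleaned up even when validation fails. This is a sketch under assumptions: `rds` is a boto3 RDS client, and `validate` is any callable that takes the endpoint address and returns a boolean (for instance a wrapper around the validation script). The `run_restore_drill` name is hypothetical.

```python
def run_restore_drill(rds, snapshot_id, instance_id, validate):
    """Restore a snapshot to a throwaway instance, validate it, then delete it.

    rds       -- a boto3 RDS client, e.g. boto3.client("rds")
    validate  -- callable taking the instance endpoint address, returning bool
    """
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=instance_id,
        DBSnapshotIdentifier=snapshot_id,
    )
    try:
        # Block until the restored instance accepts connections.
        rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier=instance_id)
        described = rds.describe_db_instances(DBInstanceIdentifier=instance_id)
        endpoint = described["DBInstances"][0]["Endpoint"]["Address"]
        return validate(endpoint)
    finally:
        # Always attempt cleanup so a failed drill does not leave a billable instance.
        rds.delete_db_instance(
            DBInstanceIdentifier=instance_id,
            SkipFinalSnapshot=True,
        )
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, without touching AWS.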
Disaster Recovery
Cross-Region Replication
# Copy a snapshot to another region (the request is sent to the destination region)
aws rds copy-db-snapshot \
  --source-db-snapshot-identifier arn:aws:rds:us-east-1:123456:snapshot:prod-snapshot \
  --target-db-snapshot-identifier prod-snapshot-dr \
  --source-region us-east-1 \
  --region us-west-2 \
  --kms-key-id arn:aws:kms:us-west-2:123456:key/xxxxx
DR Procedure
RTO (Recovery Time Objective): 1 hora
RPO (Recovery Point Objective): 5 minutos
- Identify failure
- Promote read replica (if available) or restore from snapshot
- Update connection strings
- Verify application works
- Communicate to team
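After an incident or a drill, the recovery can be scored against the objectives above. A minimal sketch, using the RTO and RPO values stated in this section; the `evaluate_recovery` function and its argument names are illustrative.

```python
from datetime import datetime, timedelta

RTO = timedelta(hours=1)     # Recovery Time Objective from this runbook
RPO = timedelta(minutes=5)   # Recovery Point Objective from this runbook


def evaluate_recovery(failure_at, restored_to, service_back_at):
    """Compare one recovery (or drill) against the RTO and RPO.

    failure_at       -- when the outage started
    restored_to      -- the point in time the restored data reflects
    service_back_at  -- when the application was verified working again
    """
    downtime = service_back_at - failure_at
    data_loss = failure_at - restored_to
    return {
        "downtime": downtime,
        "data_loss": data_loss,
        "rto_met": downtime <= RTO,
        "rpo_met": data_loss <= RPO,
    }
```

Recording these numbers for every drill shows whether the chosen recovery path (replica promotion versus snapshot restore) can realistically meet the targets.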
Monitoring
Backup Alarms
BackupAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: rds-backup-failed
    MetricName: BackupRetentionPeriodStorageUsed
    Namespace: AWS/RDS
    Statistic: Average
    Period: 86400 # 24 hours
    EvaluationPeriods: 1
    Threshold: 0
    ComparisonOperator: LessThanOrEqualToThreshold
Backup Age
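Alert if the latest backup is more than 25 hours old:
from datetime import datetime, timezone
import boto3

rds = boto3.client('rds')

def check_backup_age():
    # MaxRecords cannot be below 20 and result order is not guaranteed,
    # so fetch the snapshots and pick the newest one explicitly.
    snapshots = rds.describe_db_snapshots(
        DBInstanceIdentifier='prod-db',
        SnapshotType='automated'
    )
    latest = max(snapshots['DBSnapshots'], key=lambda s: s['SnapshotCreateTime'])
    # SnapshotCreateTime is timezone-aware, so compare against an aware "now"
    age_hours = (datetime.now(timezone.utc) - latest['SnapshotCreateTime']).total_seconds() / 3600
    if age_hours > 25:
        send_alert(f"Last backup is {age_hours:.1f} hours old!")  # send_alert is defined elsewhere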
Cost Optimization
- Manual snapshots incur storage costs
- Delete old snapshots that are no longer needed
- Use lifecycle policies
- Export to S3 Glacier for long-term retention
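Cleaning up old manual snapshots can be reduced to a selection rule. The sketch below picks snapshots older than a cutoff while always keeping the newest ones. It assumes the input is the `DBSnapshots` list from `describe_db_snapshots` with `SnapshotType='manual'`; the `snapshots_to_delete` name and the `keep_latest` safeguard are assumptions, not an existing policy.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_delete(snapshots, keep_days, now=None, keep_latest=1):
    """Return identifiers of manual snapshots older than keep_days.

    The newest `keep_latest` snapshots are always kept, whatever their age.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=keep_days)
    # Newest first, so the slice below skips the snapshots that must be kept.
    ordered = sorted(snapshots, key=lambda s: s["SnapshotCreateTime"], reverse=True)
    candidates = ordered[keep_latest:]
    return [
        s["DBSnapshotIdentifier"]
        for s in candidates
        if s["SnapshotCreateTime"] < cutoff
    ]
```

Reviewing the returned identifiers before passing them to `delete-db-snapshot` keeps the deletion step deliberate.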