Deployment Guide
Comprehensive guide for deploying DocuStack infrastructure using Terragrunt/Terraform with Terrateam.
Overview
Deployment Philosophy
DocuStack follows a GitOps-first approach to infrastructure management:
| Principle | Implementation | Why |
|---|---|---|
| Infrastructure as Code | 100% Terragrunt/Terraform | Reproducible, auditable, version-controlled |
| PR-Based Changes | All changes go through pull requests | Review before deploy, automated validation |
| Environment Isolation | Separate AWS accounts per environment | Blast radius containment, security boundaries |
| Automated Validation | Plans run automatically, applies require approval | Catch issues early, prevent accidents |
| Audit Trail | Git history + Terrateam logs | Compliance, debugging, accountability |
Deployment Methods
| Method | Use Case | Speed |
|---|---|---|
| Terrateam (Primary) | All standard infrastructure changes | 5-15 min |
| Manual | Bootstrap, debugging, emergency fixes | Varies |
| Scheduled | Nightly teardown, weekend savings | Automated |
Workflow Overview
Developer → Feature Branch → PR → Auto Plan → Review → Apply → Merge
│
Terrateam posts
plan to PR comments
Prerequisites
AWS Credentials
Configure AWS credentials using IAM Identity Center (SSO):
# Configure SSO profile
aws configure sso
# Profile name: docustack-dev (or staging, prod)
# SSO start URL: https://your-org.awsapps.com/start
# SSO Region: us-east-1
# Account ID: <target-account-id>
# Role: AdministratorAccess (or appropriate role)
# Login before running commands
aws sso login --profile docustack-dev
# Set default profile
export AWS_PROFILE=docustack-dev
Required Tools
| Tool | Version | Installation |
|---|---|---|
| Terraform | >= 1.5.0 | brew install terraform |
| Terragrunt | >= 0.50.0 | brew install terragrunt |
| AWS CLI | >= 2.0 | brew install awscli |
| Git | >= 2.0 | brew install git |
Verify installations:
terraform version # >= 1.5.0
terragrunt --version # >= 0.50.0
aws --version # >= 2.0
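The version checks above can also be scripted. The following is a minimal sketch, assuming a `sort` that supports `-V` (GNU coreutils or a recent BSD sort). The helper names `version_ge`, `first_version`, and `check_tool` are made up for illustration, and extracting the first `x.y.z` string from each tool's output is an assumption about its format.

```shell
# version_ge A B: succeeds when version A >= version B (hypothetical helper).
version_ge() {
  # sort -V orders version strings; if B sorts first (or ties), then A >= B.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# first_version: print the first x.y.z string found on stdin.
first_version() {
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1
}

# check_tool NAME MINIMUM COMMAND...: run COMMAND and compare its reported version.
check_tool() {
  name=$1; minimum=$2; shift 2
  found=$("$@" 2>&1 | first_version)
  if [ -n "$found" ] && version_ge "$found" "$minimum"; then
    echo "ok: $name $found (>= $minimum)"
  else
    echo "FAIL: $name ${found:-not found} (need >= $minimum)"
    return 1
  fi
}

# Usage (not run automatically):
#   check_tool terraform 1.5.0 terraform version
#   check_tool terragrunt 0.50.0 terragrunt --version
#   check_tool aws 2.0 aws --version
```

Comparing with `sort -V` avoids the common mistake of comparing versions as plain strings, where `1.10.0` would sort before `1.5.0`.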
GitHub Access
For Terrateam PR-based deployments:
- Repository access: Push access to the docustack-infrastructure-live repository
- Team membership: Member of the infra team (for applies)
- Leads team: Member of the leads team (for production applies)
Initial Deployment
Deployment Order
Initial deployment must follow this order due to dependencies:
1. Bootstrap (State Backend)
↓
2. Management Account (SCPs, CloudTrail)
↓
3. Dev Environment
├── VPC
├── ECS Cluster
├── RDS
├── Temporal
└── Terrateam
↓
4. Staging Environment (same order as Dev)
↓
5. Production Environment (same order as Dev)
Step 1: Bootstrap State Backend
The bootstrap module creates the S3 bucket and DynamoDB table that store and lock Terraform state.
cd docustack-infrastructure-live/bootstrap
# Update account_id in terragrunt.hcl if needed
vim terragrunt.hcl
# Initialize and apply (first run uses local state)
terragrunt init
terragrunt apply
# After apply, migrate state to S3
# Uncomment remote_state block in terragrunt.hcl, then:
terragrunt init -migrate-state
Expected resources:
- S3 bucket: docustack-terraform-state-<account-id>
- DynamoDB table: docustack-terraform-locks
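One way to confirm that both resources exist after the bootstrap apply is sketched below. It assumes the AWS CLI is authenticated against the target account; the function name `verify_state_backend` is hypothetical, and the resource names are the ones listed above.

```shell
# verify_state_backend ACCOUNT_ID: check that the state bucket and lock table exist.
verify_state_backend() {
  bucket="docustack-terraform-state-$1"
  table="docustack-terraform-locks"
  # head-bucket exits non-zero if the bucket is missing or inaccessible.
  aws s3api head-bucket --bucket "$bucket" >/dev/null \
    || { echo "missing bucket: $bucket"; return 1; }
  # describe-table exits non-zero if the table does not exist.
  aws dynamodb describe-table --table-name "$table" >/dev/null \
    || { echo "missing table: $table"; return 1; }
  echo "state backend ok: $bucket, $table"
}

# Usage (not run automatically):
#   verify_state_backend <account-id>
```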
Step 2: Deploy Dev Environment
Deploy modules in dependency order:
export AWS_PROFILE=docustack-dev
cd docustack-infrastructure-live/dev/us-east-1
# 1. VPC (no dependencies)
cd vpc && terragrunt apply
# 2. ECS Cluster (depends on VPC)
cd ../ecs-cluster && terragrunt apply
# 3. RDS (depends on VPC)
cd ../rds && terragrunt apply
# 4. Temporal (depends on VPC, ECS, RDS)
cd ../temporal && terragrunt apply
Alternative: Deploy all at once
cd docustack-infrastructure-live/dev/us-east-1
terragrunt run-all apply
run-all respects dependencies defined in terragrunt.hcl files.
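To see the order these dependencies imply, the relationships from the Step 2 comments can be fed to the standard `tsort` utility. This only illustrates dependency ordering; it is not how Terragrunt resolves its graph internally, and the edge list is transcribed by hand from Step 2 rather than read from the terragrunt.hcl files.

```shell
# Each line is "prerequisite dependent", taken from the Step 2 comments.
dependency_edges() {
  cat <<'EOF'
vpc ecs-cluster
vpc rds
vpc temporal
ecs-cluster temporal
rds temporal
EOF
}

# tsort prints one valid order in which every prerequisite comes first.
apply_order() {
  dependency_edges | tsort
}

apply_order
```

The output always starts with vpc and ends with temporal. The relative order of ecs-cluster and rds may vary because neither depends on the other.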
PR-Based Deployment (Terrateam)
Overview
Terrateam is the primary deployment method for all infrastructure changes after initial setup.
Step-by-Step Guide
1. Create Feature Branch
git checkout main
git pull origin main
git checkout -b feature/add-new-service
2. Make Infrastructure Changes
# Edit Terraform/Terragrunt files
vim docustack-infrastructure-live/dev/us-east-1/new-service/terragrunt.hcl
3. Push and Open PR
git add .
git commit -m "feat: add new service infrastructure"
git push origin feature/add-new-service
Open a Pull Request on GitHub.
4. Review Auto-Generated Plan
Terrateam automatically runs terragrunt plan and posts output:
## Terrateam Plan
**Directory:** `dev/us-east-1/new-service`
### Plan Output
Terraform will perform the following actions:
# aws_ecs_service.new_service will be created
+ resource "aws_ecs_service" "new_service" {
+ cluster = "arn:aws:ecs:us-east-1:123456789012:cluster/docustack-dev"
+ desired_count = 2
+ name = "new-service"
...
}
Plan: 1 to add, 0 to change, 0 to destroy.
5. Apply Changes
After PR approval, comment to apply:
terrateam apply
For specific directories:
terrateam apply dev/us-east-1/new-service
6. Merge PR
After successful apply, merge the PR to main.
Terrateam Commands
| Command | Description |
|---|---|
| terrateam plan | Re-run plan for all changed directories |
| terrateam plan <dir> | Plan specific directory |
| terrateam apply | Apply all planned changes |
| terrateam apply <dir> | Apply specific directory |
| terrateam unlock | Release state lock |
| terrateam help | Show available commands |
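Because these commands are ordinary PR comments, they could also be posted from a terminal. The sketch below assumes the GitHub CLI (`gh`) is installed and authenticated for the repository. The wrapper name `terrateam_comment` and the PR number in the example are hypothetical, and it is an assumption that Terrateam handles comments posted this way the same as comments typed in the web UI.

```shell
# terrateam_comment PR_NUMBER COMMAND [DIR]: post a Terrateam command as a PR comment.
terrateam_comment() {
  pr=$1; cmd=$2; dir=${3:-}
  # Build "terrateam <command>" with an optional directory argument.
  body="terrateam $cmd${dir:+ $dir}"
  gh pr comment "$pr" --body "$body"
}

# Usage (not run automatically; 42 is a placeholder PR number):
#   terrateam_comment 42 plan dev/us-east-1/new-service
#   terrateam_comment 42 apply
```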
Access Control
| Environment | Plan | Apply |
|---|---|---|
| Dev | Anyone | infra team |
| Staging | Anyone | infra team |
| Production | infra team | leads team |
| Management | infra team | leads team |
Manual Deployment
When to Use Manual Deployment
- Initial bootstrap (before Terrateam exists)
- Debugging Terrateam issues
- Emergency fixes
- Local development/testing
Single Module Deployment
cd docustack-infrastructure-live/dev/us-east-1/vpc
# Preview changes
terragrunt plan
# Apply changes
terragrunt apply
# Apply with auto-approve (use with caution)
terragrunt apply -auto-approve
All Modules Deployment
cd docustack-infrastructure-live/dev/us-east-1
# Plan all modules
terragrunt run-all plan
# Apply all modules (respects dependencies)
terragrunt run-all apply
Targeted Deployment
Apply specific resources within a module:
cd docustack-infrastructure-live/dev/us-east-1/ecs-cluster
# Target specific resource
terragrunt apply -target=aws_ecs_cluster.main
# Target multiple resources
terragrunt apply \
-target=aws_ecs_cluster.main \
-target=aws_iam_role.ecs_task_execution
Environment Promotion
Promotion Workflow
DEV STAGING PRODUCTION
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Feature │──────▶│ Integration│─────────▶│ Live │
│ Testing │ │ Testing │ │ Workloads │
└─────────────┘ └─────────────┘ └─────────────┘
Automated Automated Manual Approval
on merge on merge to Required
staging branch
Dev → Staging
- Test in Dev: Verify changes work in the dev environment
- Create PR: Open a PR targeting the staging branch
- Auto Plan: Terrateam plans staging changes
- Review: Team reviews plan output
- Apply: Comment terrateam apply
- Merge: Merge to the staging branch
Staging → Production
- Integration Testing: Run the full test suite against staging
- Create PR: Open a PR targeting main with production changes
- Auto Plan: Terrateam plans production changes
- Security Review: Security team reviews (if applicable)
- Approval: Requires leads team approval
- Apply: Comment terrateam apply
- Merge: Merge to main
Approval Requirements
| Environment | Approvers | Minimum Approvals |
|---|---|---|
| Dev | Any team member | 0 (auto-approve allowed) |
| Staging | infra team | 1 |
| Production | leads team | 1 |
Rollback Procedures
Git Revert + Apply (Recommended)
The safest rollback method:
# 1. Identify the commit to revert
git log --oneline -- dev/us-east-1/ecs-cluster
# 2. Create a branch and revert the commit on it
git checkout -b revert-branch
git revert <commit-hash>
# 3. Push and open PR
git push origin revert-branch
# 4. Terrateam will plan the revert
# 5. Review and apply via PR
Terraform State Rollback
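If an apply leaves the state file corrupted or out of sync, you can restore an earlier version from S3 object versioning. This restores Terraform's record only; it does not change the deployed resources.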
# List state versions (S3 versioning)
aws s3api list-object-versions \
--bucket docustack-terraform-state-<account-id> \
--prefix dev/us-east-1/ecs-cluster/terraform.tfstate
# Download previous state version
aws s3api get-object \
--bucket docustack-terraform-state-<account-id> \
--key dev/us-east-1/ecs-cluster/terraform.tfstate \
--version-id <previous-version-id> \
previous-state.tfstate
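# Push previous state (use with extreme caution)
# An older state has a lower serial than the remote copy, so the push is rejected without -force
terragrunt state push -force previous-state.tfstate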
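Picking the previous version ID by eye from the listing is error-prone, so the lookup can be narrowed with a `--query` filter. This is a sketch under two assumptions: the prefix matches exactly one object key, and S3 returns versions newest first (its documented ordering within a key). The function name `previous_state_version` is hypothetical.

```shell
# previous_state_version BUCKET KEY: print the newest non-current version ID.
previous_state_version() {
  # The JMESPath filter drops the current version, then takes the first remaining one.
  aws s3api list-object-versions \
    --bucket "$1" \
    --prefix "$2" \
    --query 'Versions[?IsLatest==`false`] | [0].VersionId' \
    --output text
}

# Usage (not run automatically):
#   previous_state_version docustack-terraform-state-<account-id> \
#     dev/us-east-1/ecs-cluster/terraform.tfstate
```

If the object has no earlier versions, the AWS CLI prints `None` in text output mode, so check the result before passing it to `get-object`.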
Emergency Procedures
For critical production issues:
Option 1: Immediate Manual Rollback
# 1. Authenticate to production
export AWS_PROFILE=docustack-prod
# 2. Navigate to affected module
cd docustack-infrastructure-live/prod/us-east-1/ecs-cluster
# 3. Restore this module's files from the previous commit (assumes the faulty change is the latest commit)
git checkout HEAD~1 -- .
# 4. Apply immediately
terragrunt apply -auto-approve
# 5. Create PR to sync main branch
git checkout -b emergency/rollback-ecs
git add .
git commit -m "emergency: rollback ECS cluster changes"
git push origin emergency/rollback-ecs
Option 2: AWS Console (Last Resort)
For immediate mitigation while preparing a proper rollback:
- ECS: Update service desired count to 0, then restore
- RDS: Restore from automated backup
- ALB: Update target group health checks
Document all console changes and apply via Terraform afterward to prevent drift.
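To confirm whether console changes left a module out of sync, a plan can be run with Terraform's `-detailed-exitcode` flag, which exits with 2 when the plan contains changes. The wrapper below is a sketch; the function name `check_drift` is hypothetical, and it assumes credentials for the target account are already active.

```shell
# check_drift DIR: run a plan in DIR and report whether it found changes.
check_drift() {
  rc=0
  # With -detailed-exitcode, plan exits 0 (no changes), 1 (error), or 2 (changes).
  (cd "$1" && terragrunt plan -detailed-exitcode >/dev/null 2>&1) || rc=$?
  case $rc in
    0) echo "no drift: $1" ;;
    2) echo "drift detected: $1"; return 2 ;;
    *) echo "plan failed: $1"; return 1 ;;
  esac
}

# Usage (not run automatically):
#   check_drift docustack-infrastructure-live/prod/us-east-1/ecs-cluster
```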
Rollback Checklist
- Identify the problematic change
- Assess impact and urgency
- Choose rollback method (state, git revert, emergency)
- Notify team via Slack
#infra-alerts - Execute rollback
- Verify services are healthy
- Document incident and root cause
- Create follow-up PR if needed
Monitoring Deployments
Terrateam PR Comments
All deployment activity is visible in PR comments:
## Terrateam Apply
**Directory:** `dev/us-east-1/ecs-cluster`
**Status:** ✅ Success
### Apply Output
Apply complete! Resources: 3 added, 1 changed, 0 destroyed.
Outputs:
cluster_arn = "arn:aws:ecs:us-east-1:123456789012:cluster/docustack-dev"
cluster_name = "docustack-dev"
CloudWatch Logs
| Component | Log Group |
|---|---|
| Orchestrator Lambda | /aws/lambda/infra-orchestrator |
| Stop Resources | /aws/lambda/stop-resources |
| Start Resources | /aws/lambda/start-resources |
| Slack Bot | /ecs/slack-bot |
Viewing Logs
# View recent orchestrator logs
aws logs tail /aws/lambda/infra-orchestrator --since 1h
# Filter for errors
aws logs filter-log-events \
--log-group-name /aws/lambda/infra-orchestrator \
--filter-pattern "ERROR"
Slack Notifications
Deployment notifications are sent to #infra-alerts:
| Event | Notification |
|---|---|
| Plan started | Plan started for dev/us-east-1/... |
| Plan completed | Plan completed: 3 to add, 1 to change |
| Apply started | Apply started for dev/us-east-1/... |
| Apply completed | Apply completed successfully |
| Apply failed | Apply failed: [error message] |
| Drift detected | Drift detected in prod/us-east-1/... |
Troubleshooting
Plan Shows Unexpected Changes
Symptom: Plan shows changes you didn't make
Causes:
- Manual console changes (drift)
- Another PR applied changes
- Provider version differences
Resolution:
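# Detect drift without modifying state or resources
terragrunt plan -refresh-only
# Import manually created resources (ECS services are imported as <cluster-name>/<service-name>)
terragrunt import aws_ecs_service.main <cluster-name>/<service-name>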
State Lock Issues
Symptom: Error acquiring the state lock
Resolution:
# Option 1: Via Terrateam (preferred)
# Comment in PR:
terrateam unlock
# Option 2: Manual unlock
terragrunt force-unlock <lock-id>
# Option 3: DynamoDB direct (last resort)
aws dynamodb delete-item \
--table-name docustack-terraform-locks \
--key '{"LockID": {"S": "docustack-terraform-state-<account-id>/dev/us-east-1/vpc/terraform.tfstate"}}'
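Before deleting a lock item directly, it is worth checking who holds it. The sketch below builds the same key as the command above and reads the item instead of deleting it. The function names are hypothetical, and it assumes the lock item stores its metadata in an `Info` string attribute, as the Terraform S3 backend does when locking through DynamoDB.

```shell
# lock_key_json BUCKET STATE_PATH: build the DynamoDB key for a state lock item.
lock_key_json() {
  printf '{"LockID": {"S": "%s/%s"}}' "$1" "$2"
}

# show_lock BUCKET STATE_PATH: print the lock metadata (holder, operation, creation time).
show_lock() {
  aws dynamodb get-item \
    --table-name docustack-terraform-locks \
    --key "$(lock_key_json "$1" "$2")" \
    --query 'Item.Info.S' \
    --output text
}

# Usage (not run automatically):
#   show_lock docustack-terraform-state-<account-id> dev/us-east-1/vpc/terraform.tfstate
```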
Permission Denied
Symptom: AccessDenied or UnauthorizedOperation errors
Resolution:
# Re-authenticate
aws sso login --profile docustack-dev
# Verify identity
aws sts get-caller-identity
# Check profile
echo $AWS_PROFILE
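A common cause of these errors is being logged in to the wrong account. The check below is a sketch that compares the active account against the one you expect; the function name `assert_account` is hypothetical.

```shell
# assert_account EXPECTED_ID: fail unless the active credentials belong to EXPECTED_ID.
assert_account() {
  # An expired or missing session leaves the account ID empty.
  actual=$(aws sts get-caller-identity --query Account --output text 2>/dev/null) || actual=""
  if [ "$actual" = "$1" ]; then
    echo "ok: authenticated to account $actual"
  else
    echo "wrong or missing credentials: expected $1, got ${actual:-none}"
    return 1
  fi
}

# Usage (not run automatically):
#   assert_account <target-account-id> && terragrunt plan
```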
Terrateam Not Responding
Symptom: No plan posted to PR
Causes:
- GitHub webhook not delivered
- File patterns not matching
- Terrateam service issues
Resolution:
# Check webhook deliveries in GitHub
# Settings > Webhooks > Recent Deliveries
# Verify file patterns in .terrateam/config.yml
Cost Summary
| Configuration | Monthly Cost |
|---|---|
| Minimal (VPC + core) | ~$100 |
| With services (8 hrs/day) | ~$115 |
| With services (24/7) | ~$145 |
See Infrastructure Teardown for cost optimization strategies.