Deployment Guide

Comprehensive guide for deploying DocuStack infrastructure using Terragrunt/Terraform with Terrateam.

Overview

Deployment Philosophy

DocuStack follows a GitOps-first approach to infrastructure management:

| Principle | Implementation | Why |
|---|---|---|
| Infrastructure as Code | 100% Terragrunt/Terraform | Reproducible, auditable, version-controlled |
| PR-Based Changes | All changes go through pull requests | Review before deploy, automated validation |
| Environment Isolation | Separate AWS accounts per environment | Blast radius containment, security boundaries |
| Automated Validation | Plans run automatically, applies require approval | Catch issues early, prevent accidents |
| Audit Trail | Git history + Terrateam logs | Compliance, debugging, accountability |

Deployment Methods

| Method | Use Case | Speed |
|---|---|---|
| Terrateam (Primary) | All standard infrastructure changes | 5-15 min |
| Manual | Bootstrap, debugging, emergency fixes | Varies |
| Scheduled | Nightly teardown, weekend savings | Automated |

Workflow Overview

Developer → Feature Branch → PR → Auto Plan → Review → Apply → Merge

At the Auto Plan step, Terrateam posts the plan to the PR comments.

Prerequisites

AWS Credentials

Configure AWS credentials using IAM Identity Center (SSO):

# Configure SSO profile
aws configure sso
# Profile name: docustack-dev (or staging, prod)
# SSO start URL: https://your-org.awsapps.com/start
# SSO Region: us-east-1
# Account ID: <target-account-id>
# Role: AdministratorAccess (or appropriate role)

# Login before running commands
aws sso login --profile docustack-dev

# Set default profile
export AWS_PROFILE=docustack-dev

Required Tools

| Tool | Version | Installation |
|---|---|---|
| Terraform | >= 1.5.0 | brew install terraform |
| Terragrunt | >= 0.50.0 | brew install terragrunt |
| AWS CLI | >= 2.0 | brew install awscli |
| Git | >= 2.0 | brew install git |

Verify installations:

terraform version     # >= 1.5.0
terragrunt --version  # >= 0.50.0
aws --version         # >= 2.0
git --version         # >= 2.0
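
The version checks above can also be scripted. The following is a minimal sketch, assuming each tool prints a dotted version number somewhere in its version output. The helper names version_ge, extract_version, and check_tool are hypothetical and not part of the DocuStack repositories.

```shell
#!/bin/sh
# Hypothetical helpers (not part of DocuStack): compare installed tool
# versions against the minimums listed in the Required Tools table.

# version_ge ACTUAL MINIMUM: succeeds when ACTUAL >= MINIMUM (numeric dotted versions).
version_ge() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}
    y=${b%%.*}
    # Drop the leading component; clear the string once it is exhausted.
    if [ "$a" = "$x" ]; then a=""; else a=${a#*.}; fi
    if [ "$b" = "$y" ]; then b=""; else b=${b#*.}; fi
    x=${x:-0}
    y=${y:-0}
    if [ "$x" -gt "$y" ]; then return 0; fi
    if [ "$x" -lt "$y" ]; then return 1; fi
  done
  return 0
}

# extract_version TEXT: prints the first X.Y or X.Y.Z number found in TEXT.
extract_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# check_tool NAME MINIMUM COMMAND...: runs COMMAND and reports OK, TOO OLD, or MISSING.
check_tool() {
  name=$1
  minimum=$2
  shift 2
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$name: MISSING"
    return 1
  fi
  found=$(extract_version "$("$@" 2>&1)")
  if version_ge "$found" "$minimum"; then
    echo "$name: OK ($found)"
  else
    echo "$name: TOO OLD ($found < $minimum)"
    return 1
  fi
}
```

For example, running check_tool Terraform 1.5.0 terraform version followed by check_tool Terragrunt 0.50.0 terragrunt --version reports each tool on its own line and returns a non-zero status for any tool that is missing or too old.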

GitHub Access

For Terrateam PR-based deployments:

  1. Repository access: Push access to docustack-infrastructure-live repository
  2. Team membership: Member of infra team (for applies)
  3. Leads team: Member of leads team (for production applies)

Initial Deployment

Deployment Order

Initial deployment must follow this order due to dependencies:

1. Bootstrap (State Backend)

2. Management Account (SCPs, CloudTrail)

3. Dev Environment
├── VPC
├── ECS Cluster
├── RDS
├── Temporal
└── Terrateam

4. Staging Environment (same order as Dev)

5. Production Environment (same order as Dev)

Step 1: Bootstrap State Backend

The bootstrap module creates the S3 bucket and DynamoDB table that store and lock Terraform state.

cd docustack-infrastructure-live/bootstrap

# Update account_id in terragrunt.hcl if needed
vim terragrunt.hcl

# Initialize and apply (first run uses local state)
terragrunt init
terragrunt apply

# After apply, migrate state to S3
# Uncomment remote_state block in terragrunt.hcl, then:
terragrunt init -migrate-state

Expected resources:

  • S3 bucket: docustack-terraform-state-<account-id>
  • DynamoDB table: docustack-terraform-locks
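
To confirm that both resources exist after the bootstrap apply, a read-only check along these lines may help. This is a sketch, not part of the bootstrap module: the function names and the AWS_CLI override variable are hypothetical, and it assumes the active AWS profile can read the bucket and table.

```shell
#!/bin/sh
# Hypothetical helpers: derive the bootstrap resource names and confirm they exist.

# state_bucket_name ACCOUNT_ID: prints the state bucket name for an account.
state_bucket_name() {
  printf 'docustack-terraform-state-%s\n' "$1"
}

# verify_bootstrap ACCOUNT_ID: checks the bucket and lock table with read-only AWS calls.
# AWS_CLI can be overridden (for example in tests); it defaults to the real aws CLI.
verify_bootstrap() {
  bucket=$(state_bucket_name "$1")
  ${AWS_CLI:-aws} s3api head-bucket --bucket "$bucket" >/dev/null || {
    echo "missing or inaccessible bucket: $bucket" >&2
    return 1
  }
  ${AWS_CLI:-aws} dynamodb describe-table --table-name docustack-terraform-locks >/dev/null || {
    echo "missing or inaccessible table: docustack-terraform-locks" >&2
    return 1
  }
  echo "bootstrap resources found"
}
```

Calling verify_bootstrap with the target account ID prints "bootstrap resources found" only when both lookups succeed.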

Step 2: Deploy Dev Environment

Deploy modules in dependency order:

export AWS_PROFILE=docustack-dev
cd docustack-infrastructure-live/dev/us-east-1

# 1. VPC (no dependencies)
cd vpc && terragrunt apply

# 2. ECS Cluster (depends on VPC)
cd ../ecs-cluster && terragrunt apply

# 3. RDS (depends on VPC)
cd ../rds && terragrunt apply

# 4. Temporal (depends on VPC, ECS, RDS)
cd ../temporal && terragrunt apply
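
The sequence above continues even if an earlier apply fails, because each line is run independently. A minimal sketch of a wrapper that applies modules in the given order and stops at the first failure is shown below. The function name deploy_in_order and the TG override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: apply modules in the order given, stopping at the first failure.
# TG can be overridden (for example in tests); it defaults to terragrunt.

# deploy_in_order ROOT MODULE...: runs "terragrunt apply" in ROOT/MODULE for each MODULE.
deploy_in_order() {
  root=$1
  shift
  for module in "$@"; do
    echo "==> applying $module"
    ( cd "$root/$module" && ${TG:-terragrunt} apply ) || {
      echo "apply failed in $module; stopping" >&2
      return 1
    }
  done
}
```

With the dev layout used in this guide, the call would be deploy_in_order docustack-infrastructure-live/dev/us-east-1 vpc ecs-cluster rds temporal.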

Alternative: Deploy all at once

cd docustack-infrastructure-live/dev/us-east-1
terragrunt run-all apply

Note: run-all respects the dependencies defined in each module's terragrunt.hcl file, so modules are applied in dependency order.

PR-Based Deployment (Terrateam)

Overview

Terrateam is the primary deployment method for all infrastructure changes after initial setup.

Step-by-Step Guide

1. Create Feature Branch

git checkout main
git pull origin main
git checkout -b feature/add-new-service

2. Make Infrastructure Changes

# Edit Terraform/Terragrunt files
vim docustack-infrastructure-live/dev/us-east-1/new-service/terragrunt.hcl

3. Push and Open PR

git add .
git commit -m "feat: add new service infrastructure"
git push origin feature/add-new-service

Open a Pull Request on GitHub.

4. Review Auto-Generated Plan

Terrateam automatically runs terragrunt plan and posts output:

## Terrateam Plan

**Directory:** `dev/us-east-1/new-service`

### Plan Output

Terraform will perform the following actions:

  # aws_ecs_service.new_service will be created
  + resource "aws_ecs_service" "new_service" {
      + cluster       = "arn:aws:ecs:us-east-1:123456789012:cluster/docustack-dev"
      + desired_count = 2
      + name          = "new-service"
      ...
    }

Plan: 1 to add, 0 to change, 0 to destroy.
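
When reviewing a plan, the summary line is the quickest signal of a destructive change. As a sketch, the following helpers extract the destroy count from plan text saved without color codes. The function names are hypothetical, and the pattern assumes the summary format shown above.

```shell
#!/bin/sh
# plan_destroy_count: reads plan text on stdin and prints the "to destroy" count
# from the first summary line (prints nothing if no summary line is present).
plan_destroy_count() {
  sed -n 's/^Plan: .*[^0-9]\([0-9][0-9]*\) to destroy\..*$/\1/p' | head -n 1
}

# plan_is_destructive: succeeds when the plan on stdin would destroy resources.
plan_is_destructive() {
  count=$(plan_destroy_count)
  [ -n "$count" ] && [ "$count" -gt 0 ]
}
```

For example, piping a saved plan through plan_is_destructive and printing a warning on success gives reviewers an explicit prompt to look for destroyed resources before approving.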

5. Apply Changes

After PR approval, comment to apply:

terrateam apply

For specific directories:

terrateam apply dev/us-east-1/new-service

6. Merge PR

After successful apply, merge the PR to main.

Terrateam Commands

| Command | Description |
|---|---|
| terrateam plan | Re-run plan for all changed directories |
| terrateam plan <dir> | Plan specific directory |
| terrateam apply | Apply all planned changes |
| terrateam apply <dir> | Apply specific directory |
| terrateam unlock | Release state lock |
| terrateam help | Show available commands |

Access Control

| Environment | Plan | Apply |
|---|---|---|
| Dev | Anyone | infra team |
| Staging | Anyone | infra team |
| Production | infra team | leads team |
| Management | infra team | leads team |

Manual Deployment

When to Use Manual Deployment

  • Initial bootstrap (before Terrateam exists)
  • Debugging Terrateam issues
  • Emergency fixes
  • Local development/testing

Single Module Deployment

cd docustack-infrastructure-live/dev/us-east-1/vpc

# Preview changes
terragrunt plan

# Apply changes
terragrunt apply

# Apply with auto-approve (use with caution)
terragrunt apply -auto-approve

All Modules Deployment

cd docustack-infrastructure-live/dev/us-east-1

# Plan all modules
terragrunt run-all plan

# Apply all modules (respects dependencies)
terragrunt run-all apply

Targeted Deployment

Apply specific resources within a module:

cd docustack-infrastructure-live/dev/us-east-1/ecs-cluster

# Target specific resource
terragrunt apply -target=aws_ecs_cluster.main

# Target multiple resources
terragrunt apply \
-target=aws_ecs_cluster.main \
-target=aws_iam_role.ecs_task_execution

Environment Promotion

Promotion Workflow

      DEV                  STAGING                PRODUCTION
┌─────────────┐        ┌─────────────┐          ┌─────────────┐
│   Feature   │───────▶│ Integration │─────────▶│    Live     │
│   Testing   │        │   Testing   │          │  Workloads  │
└─────────────┘        └─────────────┘          └─────────────┘
   Automated              Automated             Manual Approval
   on merge              on merge to               Required
                       staging branch

Dev → Staging

  1. Test in Dev: Verify changes work in dev environment
  2. Create PR: Open PR targeting staging branch
  3. Auto Plan: Terrateam plans staging changes
  4. Review: Team reviews plan output
  5. Apply: Comment terrateam apply
  6. Merge: Merge to staging branch

Staging → Production

  1. Integration Testing: Run full test suite against staging
  2. Create PR: Open PR targeting main with production changes
  3. Auto Plan: Terrateam plans production changes
  4. Security Review: Security team reviews (if applicable)
  5. Approval: Requires leads team approval
  6. Apply: Comment terrateam apply
  7. Merge: Merge to main

Approval Requirements

| Environment | Approvers | Minimum Approvals |
|---|---|---|
| Dev | Any team member | 0 (auto-approve allowed) |
| Staging | infra team | 1 |
| Production | leads team | 1 |

Rollback Procedures

The safest rollback method is a Git revert applied through the normal PR workflow:

# 1. Identify the commit to revert
git log --oneline -- dev/us-east-1/ecs-cluster

# 2. Create a branch and a revert commit
git checkout -b revert-branch
git revert <commit-hash>

# 3. Push and open PR
git push origin revert-branch

# 4. Terrateam will plan the revert
# 5. Review and apply via PR

Terraform State Rollback

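If an apply leaves Terraform state corrupted or inconsistent, you can restore a previous state version from S3. This restores only Terraform's record of the infrastructure, not the deployed resources, so run terragrunt plan afterward to see what differs.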

# List state versions (S3 versioning)
aws s3api list-object-versions \
--bucket docustack-terraform-state-<account-id> \
--prefix dev/us-east-1/ecs-cluster/terraform.tfstate

# Download previous state version
aws s3api get-object \
--bucket docustack-terraform-state-<account-id> \
--key dev/us-east-1/ecs-cluster/terraform.tfstate \
--version-id <previous-version-id> \
previous-state.tfstate

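# Back up the current state before overwriting it
terragrunt state pull > current-state-backup.tfstate

# Push previous state (use with extreme caution). An older version has a lower
# serial than the remote state, so Terraform rejects the push unless -force is set.
terragrunt state push -force previous-state.tfstate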
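
Before pushing, it can help to confirm that the downloaded file really is older than the current state by comparing serial numbers. The sketch below assumes the pretty-printed JSON layout that Terraform writes, where "serial" sits on its own line. The function names are hypothetical.

```shell
#!/bin/sh
# state_serial FILE: prints the top-level "serial" value of a state file.
# Assumes "serial" appears on its own line, as in Terraform's pretty-printed JSON.
state_serial() {
  sed -n 's/^[[:space:]]*"serial":[[:space:]]*\([0-9][0-9]*\),\{0,1\}[[:space:]]*$/\1/p' "$1" | head -n 1
}

# is_older_state CANDIDATE CURRENT: succeeds when CANDIDATE has a lower serial than CURRENT.
is_older_state() {
  candidate=$(state_serial "$1")
  current=$(state_serial "$2")
  [ -n "$candidate" ] && [ -n "$current" ] && [ "$candidate" -lt "$current" ]
}
```

For example, after saving the live state with terragrunt state pull, is_older_state previous-state.tfstate followed by the saved file's name succeeds only when the candidate predates the live state, which is the expected situation for a rollback.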

Emergency Procedures

For critical production issues:

Option 1: Immediate Manual Rollback

# 1. Authenticate to production
export AWS_PROFILE=docustack-prod

# 2. Navigate to affected module
cd docustack-infrastructure-live/prod/us-east-1/ecs-cluster

# 3. Revert to previous commit locally
git checkout HEAD~1 -- .

# 4. Apply immediately
terragrunt apply -auto-approve

# 5. Create PR to sync main branch
git checkout -b emergency/rollback-ecs
git add .
git commit -m "emergency: rollback ECS cluster changes"
git push origin emergency/rollback-ecs

Option 2: AWS Console (Last Resort)

For immediate mitigation while preparing proper rollback:

  1. ECS: Update service desired count to 0, then restore
  2. RDS: Restore from automated backup
  3. ALB: Update target group health checks

Warning: Document all console changes and apply them via Terraform afterward to prevent drift.

Rollback Checklist

  • Identify the problematic change
  • Assess impact and urgency
  • Choose rollback method (state, git revert, emergency)
  • Notify team via Slack #infra-alerts
  • Execute rollback
  • Verify services are healthy
  • Document incident and root cause
  • Create follow-up PR if needed

Monitoring Deployments

Terrateam PR Comments

All deployment activity is visible in PR comments:

## Terrateam Apply

**Directory:** `dev/us-east-1/ecs-cluster`
**Status:** ✅ Success

### Apply Output

Apply complete! Resources: 3 added, 1 changed, 0 destroyed.

Outputs:
cluster_arn = "arn:aws:ecs:us-east-1:123456789012:cluster/docustack-dev"
cluster_name = "docustack-dev"

CloudWatch Logs

| Component | Log Group |
|---|---|
| Orchestrator Lambda | /aws/lambda/infra-orchestrator |
| Stop Resources | /aws/lambda/stop-resources |
| Start Resources | /aws/lambda/start-resources |
| Slack Bot | /ecs/slack-bot |

Viewing Logs

# View recent orchestrator logs
aws logs tail /aws/lambda/infra-orchestrator --since 1h

# Filter for errors
aws logs filter-log-events \
--log-group-name /aws/lambda/infra-orchestrator \
--filter-pattern "ERROR"
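
Unlike aws logs tail, filter-log-events searches the whole log group unless a time range is given, and its --start-time option takes milliseconds since the Unix epoch. A minimal sketch of a time-bounded error search follows. The function names and the AWS_CLI override variable are hypothetical.

```shell
#!/bin/sh
# hours_ago_ms HOURS: prints the Unix time HOURS ago in milliseconds,
# the unit that filter-log-events expects for --start-time.
hours_ago_ms() {
  now=$(date +%s)
  echo $(( ($now - $1 * 3600) * 1000 ))
}

# recent_errors LOG_GROUP HOURS: prints ERROR messages from the last HOURS hours.
# AWS_CLI can be overridden (for example in tests); it defaults to the real aws CLI.
recent_errors() {
  ${AWS_CLI:-aws} logs filter-log-events \
    --log-group-name "$1" \
    --filter-pattern "ERROR" \
    --start-time "$(hours_ago_ms "$2")" \
    --query 'events[].message' \
    --output text
}
```

For example, recent_errors /aws/lambda/infra-orchestrator 1 limits the search to the last hour of orchestrator logs.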

Slack Notifications

Deployment notifications are sent to #infra-alerts:

| Event | Notification |
|---|---|
| Plan started | Plan started for dev/us-east-1/... |
| Plan completed | Plan completed: 3 to add, 1 to change |
| Apply started | Apply started for dev/us-east-1/... |
| Apply completed | Apply completed successfully |
| Apply failed | Apply failed: [error message] |
| Drift detected | Drift detected in prod/us-east-1/... |

Troubleshooting

Plan Shows Unexpected Changes

Symptom: Plan shows changes you didn't make

Causes:

  1. Manual console changes (drift)
  2. Another PR applied changes
  3. Provider version differences

Resolution:

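# Detect drift without changing state (terraform refresh is deprecated in favor of -refresh-only)
terragrunt plan -refresh-only

# Import manually created resources (ECS services are imported as <cluster-name>/<service-name>)
terragrunt import aws_ecs_service.main <cluster-name>/<service-name>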
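
To check a module for unexpected changes from a script, the -detailed-exitcode flag of terraform plan is useful: it exits with 0 when there are no changes, 2 when changes are present, and 1 on error. The sketch below wraps that convention. The function names and the TG override variable are hypothetical, and "drift" here means any pending change, so it is only meaningful on a checkout that matches what was last applied.

```shell
#!/bin/sh
# drift_status EXIT_CODE: interprets the exit code of "plan -detailed-exitcode"
# (0 = no changes, 2 = changes present, anything else = error).
drift_status() {
  case $1 in
    0) echo "in-sync" ;;
    2) echo "drift" ;;
    *) echo "error" ;;
  esac
}

# check_drift: runs a plan in the current module directory and prints the status.
# TG can be overridden (for example in tests); it defaults to terragrunt.
check_drift() {
  rc=0
  ${TG:-terragrunt} plan -detailed-exitcode >/dev/null 2>&1 || rc=$?
  drift_status "$rc"
}
```

Running check_drift inside a module directory such as dev/us-east-1/ecs-cluster prints one of in-sync, drift, or error.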

State Lock Issues

Symptom: Error acquiring the state lock

Resolution:

# Option 1: Via Terrateam (preferred)
# Comment in PR:
terrateam unlock

# Option 2: Manual unlock
terragrunt force-unlock <lock-id>

# Option 3: DynamoDB direct (last resort)
aws dynamodb delete-item \
--table-name docustack-terraform-locks \
--key '{"LockID": {"S": "docustack-terraform-state-123/dev/us-east-1/vpc/terraform.tfstate"}}'
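
Before deleting a lock item directly, it is worth reading it first: the lock record's Info attribute holds a JSON string naming who acquired the lock and when. The following sketch reads that attribute without modifying the table. The function names and the AWS_CLI override variable are hypothetical.

```shell
#!/bin/sh
# lock_key_json BUCKET STATE_PATH: prints the DynamoDB key for a state lock item.
lock_key_json() {
  printf '{"LockID": {"S": "%s/%s"}}\n' "$1" "$2"
}

# show_lock BUCKET STATE_PATH: prints the lock item's Info attribute (lock holder
# and creation time) without modifying anything.
# AWS_CLI can be overridden (for example in tests); it defaults to the real aws CLI.
show_lock() {
  ${AWS_CLI:-aws} dynamodb get-item \
    --table-name docustack-terraform-locks \
    --key "$(lock_key_json "$1" "$2")" \
    --query 'Item.Info.S' \
    --output text
}
```

For example, show_lock docustack-terraform-state-<account-id> dev/us-east-1/vpc/terraform.tfstate shows whether the lock belongs to a run that is still in progress, in which case it should not be removed.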

Permission Denied

Symptom: AccessDenied or UnauthorizedOperation errors

Resolution:

# Re-authenticate
aws sso login --profile docustack-dev

# Verify identity
aws sts get-caller-identity

# Check profile
echo $AWS_PROFILE
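
The re-authentication steps can be combined so that a login is triggered only when the current session is no longer valid. This is a minimal sketch; the function name ensure_session and the AWS_CLI override variable are hypothetical.

```shell
#!/bin/sh
# ensure_session PROFILE: reuses a valid SSO session, or starts a login when the
# identity check fails.
# AWS_CLI can be overridden (for example in tests); it defaults to the real aws CLI.
ensure_session() {
  if ${AWS_CLI:-aws} sts get-caller-identity --profile "$1" >/dev/null 2>&1; then
    echo "session valid for $1"
  else
    echo "session expired or missing for $1; logging in"
    ${AWS_CLI:-aws} sso login --profile "$1"
  fi
}
```

For example, ensure_session docustack-dev can be run at the top of a deployment script before any terragrunt command.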

Terrateam Not Responding

Symptom: No plan posted to PR

Causes:

  1. GitHub webhook not delivered
  2. File patterns not matching
  3. Terrateam service issues

Resolution:

# Check webhook deliveries in GitHub
# Settings > Webhooks > Recent Deliveries

# Verify file patterns in .terrateam/config.yml

Cost Summary

| Configuration | Monthly Cost |
|---|---|
| Minimal (VPC + core) | ~$100 |
| With services (8 hrs/day) | ~$115 |
| With services (24/7) | ~$145 |
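
The three figures are consistent with a simple linear model: a baseline of about $100 plus about $45 of service cost that scales with daily uptime. This is an inference from the table rather than a published pricing formula, and the function name monthly_cost is hypothetical. As a sketch:

```shell
#!/bin/sh
# monthly_cost HOURS_PER_DAY: estimated monthly cost in whole dollars, assuming a
# $100 baseline plus $45 of service cost scaled by uptime (integer arithmetic).
monthly_cost() {
  echo $(( 100 + 45 * $1 / 24 ))
}

monthly_cost 0    # prints 100
monthly_cost 8    # prints 115
monthly_cost 24   # prints 145
```

Under this assumption, running services 8 hours a day instead of around the clock saves roughly $30 per month.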

See Infrastructure Teardown for cost optimization strategies.