Cost Management Modules
These modules automate cost optimization by stopping non-critical resources during off-hours and providing unified infrastructure lifecycle management.
Overview
DocuStack uses a two-tier approach to cost management:
| Tier | Method | Duration | Savings | Use Case |
|---|---|---|---|---|
| Tier 1 | Lambda stop/start | ~30 seconds | 60-70% (ECS/RDS only) | Daily off-hours |
| Tier 2 | Terrateam teardown | ~10-15 minutes | 100% | Weekends, extended periods |
Nightly Scheduler
EventBridge-scheduled Lambda functions for automated stop/start of non-critical AWS resources.
Why This Module?
Running dev/staging environments 24/7 wastes money:
- ECS tasks running overnight with no users
- RDS instances idle during off-hours
- EC2 instances sitting unused
This module automatically stops resources at night and restarts them in the morning.
Architecture
┌─────────────────────────────────────────────────────────────┐
│               Nightly Scheduler Architecture                │
├─────────────────────────────────────────────────────────────┤
│  ┌──────────────────────┐                                   │
│  │ EventBridge Schedule │                                   │
│  │   (7 AM UTC daily)   │                                   │
│  └──────────┬───────────┘                                   │
│             │ trigger                                       │
│             v                                               │
│  ┌──────────────────────┐         ┌──────────────────┐      │
│  │ Stop Lambda          │────────►│ ECS Services     │      │
│  │                      │         │ (desired=0)      │      │
│  │ - Find resources     │         └──────────────────┘      │
│  │ - Save state         │                                   │
│  │ - Stop resources     │         ┌──────────────────┐      │
│  │ - Send notifications │────────►│ RDS Instances    │      │
│  └──────────────────────┘         │ (stopped)        │      │
│                                   └──────────────────┘      │
│  ┌──────────────────────┐                                   │
│  │ EventBridge Schedule │                                   │
│  │  (10 PM UTC daily)   │                                   │
│  └──────────┬───────────┘                                   │
│             │ trigger                                       │
│             v                                               │
│  ┌──────────────────────┐                                   │
│  │ Start Lambda         │                                   │
│  │                      │                                   │
│  │ - Find resources     │                                   │
│  │ - Restore state      │                                   │
│  │ - Start resources    │                                   │
│  └──────────────────────┘                                   │
└─────────────────────────────────────────────────────────────┘
Usage
module "nightly_scheduler" {
source = "git::git@github.com:docustackapp/docustack-infrastructure-modules.git//modules/nightly-scheduler?ref=v1.0.0"
name = "docustack-dev"
# ECS services to stop/start
ecs_services_to_stop = [
{
cluster_name = "docustack-dev"
service_name = "temporal-dev"
},
{
cluster_name = "docustack-dev"
service_name = "slack-bot-dev"
}
]
# RDS instances to stop/start
rds_instances_to_stop = [
"docustack-dev-postgres"
]
# Schedule (UTC times)
schedule_stop = "cron(0 7 ? * * *)" # 7 AM UTC = 2 AM CDT (1 AM CST)
schedule_start = "cron(0 22 ? * * *)" # 10 PM UTC = 5 PM CDT (4 PM CST)
lambda_source_dir = "${path.root}/../../../docustack-mono/services/lambdas/nightly-scheduler"
# Optional: Manual trigger URLs
enable_manual_trigger = true
}
Tag-Based Discovery
Instead of explicit lists, discover resources by tags:
module "nightly_scheduler" {
# ... basic config ...
enable_discovery_mode = true
ecs_cluster_names = ["docustack-dev"]
manage_ecs = true
manage_rds = true
manage_ec2 = true
# Skip resources with this tag
skip_tag_key = "NightlyTeardown"
skip_tag_value = "skip"
}
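As a rough illustration of what discovery could involve on the ECS side, the sketch below lists every service in the configured clusters using boto3's `list_services` paginator. The function name `discover_ecs_services` is hypothetical, and the module's actual Lambda may discover resources differently.

```python
from typing import Dict, List


def discover_ecs_services(ecs, cluster_names: List[str]) -> Dict[str, List[str]]:
    """Map each cluster name to the ARNs of the services it contains.

    `ecs` is a boto3 ECS client, e.g. boto3.client("ecs").
    """
    paginator = ecs.get_paginator("list_services")
    discovered: Dict[str, List[str]] = {}
    for cluster in cluster_names:
        arns: List[str] = []
        # list_services is paginated, so walk every page for the cluster.
        for page in paginator.paginate(cluster=cluster):
            arns.extend(page["serviceArns"])
        discovered[cluster] = arns
    return discovered
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without touching a real AWS account.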
Tag resources to exclude:
resource "aws_ecs_service" "critical" {
tags = {
NightlyTeardown = "skip" # Won't be stopped
}
}
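The skip-tag rule can be pictured as a simple filter over discovered resources. This is a minimal sketch, assuming each resource is represented as a dict with a `tags` mapping; that shape and the name `filter_stoppable` are illustrative and not taken from the module's Lambda code.

```python
from typing import Dict, List

# Defaults mirror skip_tag_key / skip_tag_value from the module configuration.
SKIP_TAG_KEY = "NightlyTeardown"
SKIP_TAG_VALUE = "skip"


def filter_stoppable(
    resources: List[Dict],
    skip_key: str = SKIP_TAG_KEY,
    skip_value: str = SKIP_TAG_VALUE,
) -> List[Dict]:
    """Return only the resources that do not carry the skip tag.

    Each resource is a hypothetical dict: {"name": str, "tags": {key: value}}.
    """
    return [r for r in resources if r.get("tags", {}).get(skip_key) != skip_value]


discovered = [
    {"name": "temporal-dev", "tags": {}},
    {"name": "critical", "tags": {"NightlyTeardown": "skip"}},
]
print([r["name"] for r in filter_stoppable(discovered)])  # ['temporal-dev']
```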
State Preservation
The stop Lambda saves ECS desired counts before stopping:
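import boto3

ecs = boto3.client("ecs")
cluster_name, service_name = "docustack-dev", "temporal-dev"  # example values
# Before stopping: look up the service and its current desired count
service = ecs.describe_services(cluster=cluster_name, services=[service_name])["services"][0]
service_arn = service["serviceArn"]
desired_count = service["desiredCount"]
# Tag the service with its original count so the start Lambda can restore it
ecs.tag_resource(
    resourceArn=service_arn,
    tags=[{"key": "OriginalDesiredCount", "value": str(desired_count)}],
)
# Stop the service by scaling it to zero tasks
ecs.update_service(cluster=cluster_name, service=service_name, desiredCount=0)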
The start Lambda reads the OriginalDesiredCount tag and restores the service to that count.
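The restore step might look roughly like the sketch below. It assumes the count was saved in the `OriginalDesiredCount` tag as described above; the function name and the `default_count` fallback for untagged services are assumptions for illustration, not the module's confirmed behavior.

```python
def restore_desired_count(ecs, cluster: str, service_name: str,
                          service_arn: str, default_count: int = 1) -> int:
    """Scale a service back to the count saved in its OriginalDesiredCount tag.

    `ecs` is a boto3 ECS client, e.g. boto3.client("ecs").
    `default_count` is an assumed fallback for services that were never tagged.
    """
    tags = ecs.list_tags_for_resource(resourceArn=service_arn).get("tags", [])
    saved = next(
        (t["value"] for t in tags if t["key"] == "OriginalDesiredCount"), None
    )
    count = int(saved) if saved is not None else default_count
    ecs.update_service(cluster=cluster, service=service_name, desiredCount=count)
    return count
```

Storing the count on the service itself means the start function needs no separate state store, at the cost of one extra tagging call per service.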
RDS Auto-Restart Handling
AWS automatically restarts stopped RDS instances after 7 days. The module handles this:
rds_auto_restart_check_enabled = true # Default
A daily check re-stops any RDS instances that auto-started.
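A check of this kind could be sketched as follows. It assumes the check only runs while the environment is expected to be stopped, so any instance reporting `available` is treated as auto-restarted; the function name is hypothetical.

```python
from typing import List


def restop_auto_started(rds, instance_ids: List[str]) -> List[str]:
    """Stop any listed RDS instance that is running again; return their IDs.

    `rds` is a boto3 RDS client, e.g. boto3.client("rds").
    Assumes the caller only invokes this during the off-hours window.
    """
    restopped = []
    for instance_id in instance_ids:
        response = rds.describe_db_instances(DBInstanceIdentifier=instance_id)
        status = response["DBInstances"][0]["DBInstanceStatus"]
        if status == "available":
            rds.stop_db_instance(DBInstanceIdentifier=instance_id)
            restopped.append(instance_id)
    return restopped
```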
Manual Triggering
# Via Lambda Function URLs
STOP_URL=$(terragrunt output -raw manual_stop_url)
curl -X POST "$STOP_URL"
# Via AWS CLI
aws lambda invoke --function-name docustack-dev-stop-resources /tmp/response.json
# Via Slack
/infra stop dev
Cost Savings
Example: Dev Environment
| Resource | 24/7 Cost | 8hrs/day Cost | Savings |
|---|---|---|---|
| ECS Fargate (1 task) | ~$25/month | ~$8/month | 68% |
| RDS db.t3.medium | ~$60/month | ~$20/month | 67% |
| Total | ~$85/month | ~$28/month | 67% |
Annual Savings: ~$684/year per environment
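The percentages and the annual figure follow directly from the monthly costs in the table. A short sketch of the arithmetic, using only the table's numbers:

```python
def savings_pct(full_cost: float, reduced_cost: float) -> float:
    """Percentage saved when cost drops from full_cost to reduced_cost."""
    return (full_cost - reduced_cost) / full_cost * 100


# Monthly (24/7 cost, 8 hrs/day cost) pairs taken from the table above.
monthly_costs = {
    "ECS Fargate (1 task)": (25, 8),
    "RDS db.t3.medium": (60, 20),
}

full_total = sum(full for full, _ in monthly_costs.values())
reduced_total = sum(reduced for _, reduced in monthly_costs.values())
annual_savings = (full_total - reduced_total) * 12

for name, (full, reduced) in monthly_costs.items():
    print(f"{name}: {savings_pct(full, reduced):.0f}%")
print(f"Total: {savings_pct(full_total, reduced_total):.0f}%")
print(f"Annual savings: ${annual_savings}")  # $684
```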
Lambda Code Location
Source: docustack-mono/services/lambdas/nightly-scheduler/
Infra Orchestrator
Lambda function that orchestrates infrastructure teardown and spinup operations, routing between Tier 1 (Lambda) and Tier 2 (Terrateam).
Why This Module?
Different situations call for different approaches:
- Daily off-hours: Quick stop/start (Tier 1)
- Weekends: Complete teardown (Tier 2)
- Cost emergencies: Immediate teardown (Tier 2)
This module provides a unified interface that routes to the appropriate tier.
Architecture
┌─────────────────────────────────────────────────────────────┐
│          Infrastructure Orchestrator Architecture           │
├─────────────────────────────────────────────────────────────┤
│      Slack Bot / API                                        │
│             │                                               │
│             │  /infra stop dev     → Tier 1 (Lambda)        │
│             │  /infra start dev    → Tier 1 (Lambda)        │
│             │  /infra teardown dev → Tier 2 (Terrateam)     │
│             │  /infra spinup dev   → Tier 2 (Terrateam)     │
│             v                                               │
│  ┌──────────────────────┐                                   │
│  │  Infra Orchestrator  │                                   │
│  │        Lambda        │                                   │
│  │                      │                                   │
│  │ Routes to:           │                                   │
│  │ - Stop Lambda        │──────► ECS/RDS stop               │
│  │ - Start Lambda       │──────► ECS/RDS start              │
│  │ - Terrateam API      │──────► Full teardown/spinup       │
│  └──────────────────────┘                                   │
└─────────────────────────────────────────────────────────────┘
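The routing shown in the diagram amounts to a lookup from action to tier. The sketch below assumes the `{"action": ..., "environment": ...}` payload shape used in the CLI examples; the tier labels and the `route` function are illustrative names, and the `status` action is left out because it does not map to a tier.

```python
TIER_1 = "tier1-lambda"      # hypothetical label for the stop/start Lambdas
TIER_2 = "tier2-terrateam"   # hypothetical label for the Terrateam workflow

ACTION_TIERS = {
    "stop": TIER_1,
    "start": TIER_1,
    "teardown": TIER_2,
    "spinup": TIER_2,
}


def route(event: dict) -> dict:
    """Decide which tier should handle an orchestrator request."""
    action = event.get("action")
    if action not in ACTION_TIERS:
        raise ValueError(f"Unsupported action: {action!r}")
    return {
        "action": action,
        "environment": event.get("environment"),
        "tier": ACTION_TIERS[action],
    }


print(route({"action": "teardown", "environment": "dev"})["tier"])  # tier2-terrateam
```

Keeping the mapping in one table makes it straightforward to reject unknown actions before anything destructive is triggered.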
Usage
module "infra_orchestrator" {
source = "git::git@github.com:docustackapp/docustack-infrastructure-modules.git//modules/infra-orchestrator?ref=v1.0.0"
name = "docustack-dev"
environment = "dev"
# Tier 1: Lambda stop/start
stop_lambda_arn = module.nightly_scheduler.lambda_stop_arn
start_lambda_arn = module.nightly_scheduler.lambda_start_arn
# Tier 2: Terrateam Cloud
terrateam_secret_name = "terrateam/api-token"
slack_webhook_secret_name = "slack/infra-alerts-webhook"
github_repo = "docustackapp/docustack-infrastructure-live"
lambda_source_dir = "${path.root}/../../../docustack-mono/services/lambdas/infra-orchestrator"
}
Operations
Tier 1: Stop/Start (Fast)
Duration: ~30 seconds
What happens: ECS services set to 0 tasks, RDS instances stopped
# Via Slack
/infra stop dev
/infra start dev
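# Via AWS CLI (v2 needs --cli-binary-format to accept a raw JSON payload)
aws lambda invoke \
  --function-name docustack-dev-infra-orchestrator \
  --cli-binary-format raw-in-base64-out \
  --payload '{"action":"stop","environment":"dev"}' \
  /tmp/response.json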
Tier 2: Teardown/Spinup (Complete)
Duration: ~10-15 minutes
What happens: All Terraform resources destroyed/recreated via Terrateam
# Via Slack
/infra teardown dev
/infra spinup dev
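# Via AWS CLI (v2 needs --cli-binary-format to accept a raw JSON payload)
aws lambda invoke \
  --function-name docustack-dev-infra-orchestrator \
  --cli-binary-format raw-in-base64-out \
  --payload '{"action":"teardown","environment":"dev"}' \
  /tmp/response.json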
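The same invocation can be made from Python with boto3. This is a sketch only: it reuses the function name and payload shape from the CLI example, and it assumes the orchestrator returns a JSON body, which the source does not confirm.

```python
import json


def invoke_orchestrator(lambda_client, action: str, environment: str,
                        function_name: str = "docustack-dev-infra-orchestrator"):
    """Invoke the orchestrator Lambda and return its decoded response, if any.

    `lambda_client` is a boto3 Lambda client, e.g. boto3.client("lambda").
    """
    payload = json.dumps({"action": action, "environment": environment})
    response = lambda_client.invoke(
        FunctionName=function_name,
        Payload=payload.encode("utf-8"),
    )
    body = response["Payload"].read()
    # Assumption: the function responds with JSON; an empty body yields None.
    return json.loads(body) if body else None
```

For example, `invoke_orchestrator(boto3.client("lambda"), "teardown", "dev")` would request a Tier 2 teardown of the dev environment.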
Status Check
/infra status dev
Returns current state of ECS services, RDS instances, and EC2 instances.
Secrets Configuration
# Terrateam API token
aws secretsmanager create-secret \
--name terrateam/api-token \
--secret-string "ttc_your_api_token_here"
# Slack webhook
aws secretsmanager create-secret \
--name slack/infra-alerts-webhook \
--secret-string "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
Lambda Code Location
Source: docustack-mono/services/lambdas/infra-orchestrator/
Cost Comparison
Tier 1 vs Tier 2
| Aspect | Tier 1 (Stop/Start) | Tier 2 (Teardown) |
|---|---|---|
| Duration | ~30 seconds | ~10-15 minutes |
| Savings | 60-70% on ECS/RDS (load balancers keep running) | 100% while torn down |
| Data preserved | Yes | No (recreated) |
| Load balancers | Still running | Destroyed |
| Best for | Daily off-hours | Weekends, extended |
Monthly Cost Breakdown
Dev Environment (Tier 1 - 16 hrs stopped daily):
| Resource | Full Cost | With Scheduler | Savings |
|---|---|---|---|
| ECS Fargate | $25 | $8 | $17 |
| RDS | $60 | $20 | $40 |
| ALB | $16 | $16 | $0 |
| NLB | $16 | $16 | $0 |
| Total | $117 | $60 | $57 (49%) |
Dev Environment (Tier 2 - Weekend teardown):
| Resource | Full Cost | With Teardown | Savings |
|---|---|---|---|
| ECS Fargate | $25 | $18 | $7 |
| RDS | $60 | $43 | $17 |
| ALB | $16 | $11 | $5 |
| NLB | $16 | $11 | $5 |
| Total | $117 | $83 | $34 (29%) |
Combined (Tier 1 + Tier 2):
- Weekdays: Tier 1 (16 hrs stopped)
- Weekends: Tier 2 (full teardown)
- Total savings: ~65-70%
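A back-of-the-envelope estimate of the combined figure can be derived from the two tables above. The sketch assumes a 5-day/2-day split, Tier 1 costs on weekdays, and zero cost for the whole weekend; under those simplifying assumptions it comes out at roughly 63%, just under the quoted range. The real figure depends on when teardown and spinup actually run.

```python
def combined_savings_pct(full_monthly: float, tier1_monthly: float,
                         weekdays: int = 5, weekend_days: int = 2) -> float:
    """Blend Tier 1 weekdays with fully torn-down weekends."""
    total_days = weekdays + weekend_days
    # Weekdays cost the Tier 1 rate; weekend days are assumed to cost nothing.
    blended_monthly = tier1_monthly * weekdays / total_days
    return (full_monthly - blended_monthly) / full_monthly * 100


# $117 full cost and $60 Tier 1 cost come from the Tier 1 table above.
print(f"{combined_savings_pct(117, 60):.1f}%")  # 63.4%
```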
Best Practices
- Use Tier 1 for daily operations - Faster, preserves data
- Use Tier 2 for weekends - Maximum savings
- Tag critical resources - Exclude from nightly scheduler
- Monitor costs - Set up AWS Budgets alerts
- Review skip tags - Ensure nothing critical is being stopped
- Test spinup - Verify Tier 2 spinup works before relying on it
Troubleshooting
Resources Not Stopping
# Check schedule status
aws scheduler get-schedule --name stop-resources --group-name docustack-dev
# Check Lambda logs
aws logs tail /aws/lambda/docustack-dev-stop-resources --since 1h
# Check for skip tags
aws ecs list-tags-for-resource --resource-arn "$SERVICE_ARN"
RDS Auto-Restarted
# Check RDS status
aws rds describe-db-instances \
--db-instance-identifier docustack-dev-postgres \
--query 'DBInstances[0].DBInstanceStatus'
# Check auto-restart check logs
aws logs tail /aws/lambda/docustack-dev-stop-resources --since 1d --filter-pattern '"auto-restart"'
Terrateam Workflow Not Triggered
# Check orchestrator logs
aws logs tail /aws/lambda/docustack-dev-infra-orchestrator --since 10m
# Verify the Terrateam API token secret exists (describe-secret does not print the token)
aws secretsmanager describe-secret --secret-id terrateam/api-token