
Cost Management Modules

These modules automate cost optimization by stopping non-critical resources during off-hours and providing unified infrastructure lifecycle management.

Overview

DocuStack uses a two-tier approach to cost management:

| Tier | Method | Duration | Savings | Use Case |
|---|---|---|---|---|
| Tier 1 | Lambda stop/start | ~30 seconds | 60-70% | Daily off-hours |
| Tier 2 | Terrateam teardown | ~10-15 minutes | 100% | Weekends, extended periods |

Nightly Scheduler

EventBridge-scheduled Lambda functions for automated stop/start of non-critical AWS resources.

Why This Module?

Running dev/staging environments 24/7 wastes money:

  • ECS tasks running overnight with no users
  • RDS instances idle during off-hours
  • EC2 instances sitting unused

This module automatically stops resources at night and restarts them in the morning.

Architecture

┌─────────────────────────────────────────────────────────────┐
│               Nightly Scheduler Architecture                │
├─────────────────────────────────────────────────────────────┤
│  ┌──────────────────────┐                                   │
│  │ EventBridge Schedule │                                   │
│  │ (7 AM UTC daily)     │                                   │
│  └──────────┬───────────┘                                   │
│             │ trigger                                       │
│             v                                               │
│  ┌──────────────────────┐         ┌──────────────────┐      │
│  │ Stop Lambda          │────────►│ ECS Services     │      │
│  │                      │         │ (desired=0)      │      │
│  │  - Find resources    │         └──────────────────┘      │
│  │  - Save state        │                                   │
│  │  - Stop resources    │         ┌──────────────────┐      │
│  │  - Send notifications│────────►│ RDS Instances    │      │
│  └──────────────────────┘         │ (stopped)        │      │
│                                   └──────────────────┘      │
│  ┌──────────────────────┐                                   │
│  │ EventBridge Schedule │                                   │
│  │ (10 PM UTC daily)    │                                   │
│  └──────────┬───────────┘                                   │
│             │ trigger                                       │
│             v                                               │
│  ┌──────────────────────┐                                   │
│  │ Start Lambda         │                                   │
│  │                      │                                   │
│  │  - Find resources    │                                   │
│  │  - Restore state     │                                   │
│  │  - Start resources   │                                   │
│  └──────────────────────┘                                   │
└─────────────────────────────────────────────────────────────┘

Usage

module "nightly_scheduler" {
  source = "git::git@github.com:docustackapp/docustack-infrastructure-modules.git//modules/nightly-scheduler?ref=v1.0.0"

  name = "docustack-dev"

  # ECS services to stop/start
  ecs_services_to_stop = [
    {
      cluster_name = "docustack-dev"
      service_name = "temporal-dev"
    },
    {
      cluster_name = "docustack-dev"
      service_name = "slack-bot-dev"
    }
  ]

  # RDS instances to stop/start
  rds_instances_to_stop = [
    "docustack-dev-postgres"
  ]

  # Schedule (UTC times)
  schedule_stop  = "cron(0 7 ? * * *)"  # 7 AM UTC = 2 AM CDT (1 AM CST)
  schedule_start = "cron(0 22 ? * * *)" # 10 PM UTC = 5 PM CDT (4 PM CST)

  lambda_source_dir = "${path.root}/../../../docustack-mono/services/lambdas/nightly-scheduler"

  # Optional: expose Lambda Function URLs for manual triggering
  enable_manual_trigger = true
}
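
Because the schedule is expressed in UTC, the Central Time equivalents in the comments shift by an hour with daylight saving. A minimal sketch for checking the local hour of a UTC schedule, assuming fixed offsets for US Central Time (UTC-5 during CDT, UTC-6 during CST). The helper name `utc_hour_to_local` is illustrative and not part of the module:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Central Time (assumption: no tz database lookup).
CDT = timezone(timedelta(hours=-5), "CDT")
CST = timezone(timedelta(hours=-6), "CST")


def utc_hour_to_local(hour_utc: int, tz: timezone) -> int:
    """Return the local hour (0-23) for a schedule that fires at hour_utc UTC."""
    # The date is arbitrary because the offsets above are fixed.
    fired = datetime(2024, 1, 1, hour_utc, 0, tzinfo=timezone.utc)
    return fired.astimezone(tz).hour


for label, hour in (("schedule_stop", 7), ("schedule_start", 22)):
    print(label, "CDT:", utc_hour_to_local(hour, CDT), "CST:", utc_hour_to_local(hour, CST))
```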

Tag-Based Discovery

Instead of explicit lists, discover resources by tags:

module "nightly_scheduler" {
  # ... basic config ...

  enable_discovery_mode = true
  ecs_cluster_names     = ["docustack-dev"]

  manage_ecs = true
  manage_rds = true
  manage_ec2 = true

  # Skip resources with this tag
  skip_tag_key   = "NightlyTeardown"
  skip_tag_value = "skip"
}

Tag resources to exclude:

resource "aws_ecs_service" "critical" {
  # ... service configuration ...

  tags = {
    NightlyTeardown = "skip" # Won't be stopped
  }
}
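
A minimal sketch of how a skip-tag check might look inside the Lambda. It assumes tags arrive as the lists of dicts that boto3 returns: ECS uses lowercase `key`/`value`, while RDS and EC2 use `Key`/`Value`. The function name `should_skip` is hypothetical; the module's actual implementation is not shown in this document.

```python
def should_skip(tags, skip_key="NightlyTeardown", skip_value="skip"):
    """Return True if the tag list marks the resource as excluded."""
    for tag in tags or []:
        # ECS returns 'key'/'value'; RDS and EC2 return 'Key'/'Value'.
        key = tag.get("key", tag.get("Key"))
        value = tag.get("value", tag.get("Value"))
        if key == skip_key and value == skip_value:
            return True
    return False


ecs_tags = [{"key": "NightlyTeardown", "value": "skip"}]
rds_tags = [{"Key": "Environment", "Value": "dev"}]
print(should_skip(ecs_tags), should_skip(rds_tags))
```

Defaulting the key and value to the ones used in the configuration above keeps the call sites short; a real handler would more likely read them from the module's `skip_tag_key` and `skip_tag_value` settings.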

State Preservation

The stop Lambda saves ECS desired counts before stopping:

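import boto3

ecs = boto3.client("ecs")
cluster, service = "docustack-dev", "temporal-dev"

# Before stopping: read the current desired count
svc = ecs.describe_services(cluster=cluster, services=[service])["services"][0]
desired_count = svc["desiredCount"]

# Tag the service with its original count so it can be restored later
ecs.tag_resource(
    resourceArn=svc["serviceArn"],
    tags=[{"key": "OriginalDesiredCount", "value": str(desired_count)}],
)

# Stop the service by scaling it to zero tasks
ecs.update_service(cluster=cluster, service=service, desiredCount=0)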

The start Lambda restores the original count.
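
A minimal sketch of that restore step, assuming the count is read back from the `OriginalDesiredCount` tag and that `ecs` is a boto3 ECS client. The function name, the `default` fallback, and the handling of a missing tag are assumptions, not the module's documented behavior.

```python
def restore_desired_count(ecs, cluster, service_arn, default=1):
    """Scale a service back to the count stored in its OriginalDesiredCount tag."""
    tags = ecs.list_tags_for_resource(resourceArn=service_arn).get("tags", [])
    saved = next(
        (t.get("value") for t in tags if t.get("key") == "OriginalDesiredCount"),
        None,
    )
    # Fall back to `default` if the tag is missing or not a number (assumption).
    count = int(saved) if saved is not None and saved.isdigit() else default
    # The service name is the last path segment of the service ARN.
    service_name = service_arn.rsplit("/", 1)[-1]
    ecs.update_service(cluster=cluster, service=service_name, desiredCount=count)
    return count
```

It would be called as `restore_desired_count(boto3.client("ecs"), "docustack-dev", service_arn)`. Passing the client in keeps the function easy to exercise with a stub.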

RDS Auto-Restart Handling

AWS automatically restarts stopped RDS instances after 7 days. The module handles this:

rds_auto_restart_check_enabled = true  # Default

A daily check re-stops any RDS instances that auto-started.
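
A minimal sketch of such a check, assuming it runs while the instances are supposed to be stopped, so that any instance reporting `available` is treated as auto-restarted. How the module actually tells an auto-restart from a deliberate start is not described here; `rds` is a boto3 RDS client and the function name is hypothetical.

```python
def restop_auto_started(rds, instance_ids):
    """Stop every listed RDS instance that is currently 'available'.

    Returns the identifiers that were re-stopped.
    """
    restopped = []
    for instance_id in instance_ids:
        resp = rds.describe_db_instances(DBInstanceIdentifier=instance_id)
        status = resp["DBInstances"][0]["DBInstanceStatus"]
        if status == "available":
            rds.stop_db_instance(DBInstanceIdentifier=instance_id)
            restopped.append(instance_id)
    return restopped
```

Instances in transitional states such as `starting` are left alone in this sketch and would be picked up by the next daily run.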

Manual Triggering

# Via Lambda Function URLs
STOP_URL=$(terragrunt output -raw manual_stop_url)
curl -X POST "$STOP_URL"

# Via AWS CLI
aws lambda invoke --function-name docustack-dev-stop-resources /tmp/response.json

# Via Slack
/infra stop dev

Cost Savings

Example: Dev Environment

| Resource | 24/7 Cost | 8 hrs/day Cost | Savings |
|---|---|---|---|
| ECS Fargate (1 task) | ~$25/month | ~$8/month | 68% |
| RDS db.t3.medium | ~$60/month | ~$20/month | 67% |
| Total | ~$85/month | ~$28/month | 67% |

Annual Savings: ~$684/year per environment

Lambda Code Location

Source: docustack-mono/services/lambdas/nightly-scheduler/


Infra Orchestrator

Lambda function that orchestrates infrastructure teardown and spinup operations, routing between Tier 1 (Lambda) and Tier 2 (Terrateam).

Why This Module?

Different situations call for different approaches:

  • Daily off-hours: Quick stop/start (Tier 1)
  • Weekends: Complete teardown (Tier 2)
  • Cost emergencies: Immediate teardown (Tier 2)

This module provides a unified interface that routes to the appropriate tier.

Architecture

┌─────────────────────────────────────────────────────────────┐
│          Infrastructure Orchestrator Architecture           │
├─────────────────────────────────────────────────────────────┤
│      Slack Bot / API                                        │
│             │                                               │
│             │  /infra stop dev      → Tier 1 (Lambda)       │
│             │  /infra start dev     → Tier 1 (Lambda)       │
│             │  /infra teardown dev  → Tier 2 (Terrateam)    │
│             │  /infra spinup dev    → Tier 2 (Terrateam)    │
│             v                                               │
│  ┌──────────────────────┐                                   │
│  │ Infra Orchestrator   │                                   │
│  │ Lambda               │                                   │
│  │                      │                                   │
│  │ Routes to:           │                                   │
│  │  - Stop Lambda       │──────► ECS/RDS stop               │
│  │  - Start Lambda      │──────► ECS/RDS start              │
│  │  - Terrateam API     │──────► Full teardown/spinup       │
│  └──────────────────────┘                                   │
└─────────────────────────────────────────────────────────────┘
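
The routing shown in the diagram can be sketched as a small dispatch function. This is an illustration only: it assumes the payload shape used by the CLI examples on this page (`action` and `environment` keys), the `target` labels are made up, and the `status` action is left out.

```python
TIER1_ACTIONS = {"stop", "start"}        # handled by the nightly-scheduler Lambdas
TIER2_ACTIONS = {"teardown", "spinup"}   # handled through Terrateam


def route(event):
    """Map an orchestrator payload to the tier that should handle it."""
    action = event.get("action")
    environment = event.get("environment")
    if not environment:
        raise ValueError("payload must include 'environment'")
    if action in TIER1_ACTIONS:
        return {"tier": 1, "target": f"{action}-lambda", "environment": environment}
    if action in TIER2_ACTIONS:
        return {"tier": 2, "target": "terrateam", "environment": environment}
    raise ValueError(f"unsupported action: {action!r}")


print(route({"action": "stop", "environment": "dev"}))
print(route({"action": "teardown", "environment": "dev"}))
```

Rejecting unknown actions up front means a typo in a Slack command fails loudly instead of silently doing nothing.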

Usage

module "infra_orchestrator" {
  source = "git::git@github.com:docustackapp/docustack-infrastructure-modules.git//modules/infra-orchestrator?ref=v1.0.0"

  name        = "docustack-dev"
  environment = "dev"

  # Tier 1: Lambda stop/start
  stop_lambda_arn  = module.nightly_scheduler.lambda_stop_arn
  start_lambda_arn = module.nightly_scheduler.lambda_start_arn

  # Tier 2: Terrateam Cloud
  terrateam_secret_name     = "terrateam/api-token"
  slack_webhook_secret_name = "slack/infra-alerts-webhook"
  github_repo               = "docustackapp/docustack-infrastructure-live"

  lambda_source_dir = "${path.root}/../../../docustack-mono/services/lambdas/infra-orchestrator"
}

Operations

Tier 1: Stop/Start (Fast)

Duration: ~30 seconds
What happens: ECS services set to 0 tasks, RDS instances stopped

# Via Slack
/infra stop dev
/infra start dev

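# Via AWS CLI (v2 needs --cli-binary-format to accept a raw JSON payload; omit it on v1)
aws lambda invoke \
  --function-name docustack-dev-infra-orchestrator \
  --cli-binary-format raw-in-base64-out \
  --payload '{"action":"stop","environment":"dev"}' \
  /tmp/response.json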

Tier 2: Teardown/Spinup (Complete)

Duration: ~10-15 minutes
What happens: All Terraform resources destroyed/recreated via Terrateam

# Via Slack
/infra teardown dev
/infra spinup dev

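# Via AWS CLI (v2 needs --cli-binary-format to accept a raw JSON payload; omit it on v1)
aws lambda invoke \
  --function-name docustack-dev-infra-orchestrator \
  --cli-binary-format raw-in-base64-out \
  --payload '{"action":"teardown","environment":"dev"}' \
  /tmp/response.json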
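
The same invocation can be made from Python. A minimal sketch, assuming `lambda_client` is a boto3 Lambda client and that the orchestrator returns a JSON body (the response format is not documented here); the helper name `invoke_orchestrator` is hypothetical.

```python
import json


def invoke_orchestrator(lambda_client, action, environment,
                        function_name="docustack-dev-infra-orchestrator"):
    """Synchronously invoke the orchestrator and return its decoded JSON response."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        Payload=json.dumps({"action": action, "environment": environment}).encode(),
    )
    # boto3 returns the function's output as a streaming body under "Payload".
    return json.loads(response["Payload"].read())
```

For example, `invoke_orchestrator(boto3.client("lambda"), "teardown", "dev")` mirrors the CLI call above.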

Status Check

/infra status dev

Returns current state of ECS services, RDS instances, and EC2 instances.

Secrets Configuration

# Terrateam API token
aws secretsmanager create-secret \
  --name terrateam/api-token \
  --secret-string "ttc_your_api_token_here"

# Slack webhook
aws secretsmanager create-secret \
  --name slack/infra-alerts-webhook \
  --secret-string "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

Lambda Code Location

Source: docustack-mono/services/lambdas/infra-orchestrator/


Cost Comparison

Tier 1 vs Tier 2

| Aspect | Tier 1 (Stop/Start) | Tier 2 (Teardown) |
|---|---|---|
| Duration | ~30 seconds | ~10-15 minutes |
| Savings | 60-70% | 100% |
| Data preserved | Yes | No (recreated) |
| Load balancers | Still running | Destroyed |
| Best for | Daily off-hours | Weekends, extended |

Monthly Cost Breakdown

Dev Environment (Tier 1 - 16 hrs stopped daily):

| Resource | Full Cost | With Scheduler | Savings |
|---|---|---|---|
| ECS Fargate | $25 | $8 | $17 |
| RDS | $60 | $20 | $40 |
| ALB | $16 | $16 | $0 |
| NLB | $16 | $16 | $0 |
| Total | $117 | $60 | $57 (49%) |

Dev Environment (Tier 2 - Weekend teardown):

| Resource | Full Cost | With Teardown | Savings |
|---|---|---|---|
| ECS Fargate | $25 | $18 | $7 |
| RDS | $60 | $43 | $17 |
| ALB | $16 | $11 | $5 |
| NLB | $16 | $11 | $5 |
| Total | $117 | $83 | $34 (29%) |

Combined (Tier 1 + Tier 2):

  • Weekdays: Tier 1 (16 hrs stopped)
  • Weekends: Tier 2 (full teardown)
  • Total savings: ~65-70%
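
The arithmetic behind these breakdowns can be reproduced with a small model. This sketch assumes that Tier 1 only affects ECS and RDS (running 8 of 24 hours), that a weekend teardown removes everything for exactly 2 days out of 7, and that cost scales linearly with running time. Under those assumptions the combined schedule comes out near 63%; a longer teardown window, such as Friday evening to Monday morning, would push it higher. Per-row rounding in the tables accounts for small differences from the totals shown.

```python
# Monthly full-time costs in dollars, taken from the breakdown tables.
FULL_COST = {"ECS Fargate": 25, "RDS": 60, "ALB": 16, "NLB": 16}
# Tier 1 only stops these; load balancers keep running (and billing).
STOPPABLE = {"ECS Fargate", "RDS"}


def monthly_cost(running_fraction=1.0, weekend_teardown=False):
    """Estimate monthly cost under a linear cost model.

    running_fraction: share of hours stoppable resources run on active days.
    weekend_teardown: if True, everything is destroyed 2 days out of 7.
    """
    active_days = 5 / 7 if weekend_teardown else 1.0
    total = 0.0
    for name, cost in FULL_COST.items():
        share = running_fraction if name in STOPPABLE else 1.0
        total += cost * share * active_days
    return total


full = monthly_cost()
for label, cost in [
    ("Tier 1 only", monthly_cost(running_fraction=8 / 24)),
    ("Tier 2 only", monthly_cost(weekend_teardown=True)),
    ("Combined", monthly_cost(running_fraction=8 / 24, weekend_teardown=True)),
]:
    print(f"{label}: ${cost:.0f}/month, {1 - cost / full:.0%} saved")
```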

Best Practices

  1. Use Tier 1 for daily operations - Faster, preserves data
  2. Use Tier 2 for weekends - Maximum savings
  3. Tag critical resources - Exclude from nightly scheduler
  4. Monitor costs - Set up AWS Budgets alerts
  5. Review skip tags - Ensure nothing critical is being stopped
  6. Test spinup - Verify Tier 2 spinup works before relying on it

Troubleshooting

Resources Not Stopping

# Check schedule status
aws scheduler get-schedule --name stop-resources --group-name docustack-dev

# Check Lambda logs
aws logs tail /aws/lambda/docustack-dev-stop-resources --since 1h

# Check for skip tags
aws ecs list-tags-for-resource --resource-arn "$SERVICE_ARN"

RDS Auto-Restarted

# Check RDS status
aws rds describe-db-instances \
  --db-instance-identifier docustack-dev-postgres \
  --query 'DBInstances[0].DBInstanceStatus'

# Check the auto-restart check's logs (a hyphenated term must be double-quoted inside the pattern)
aws logs tail /aws/lambda/docustack-dev-stop-resources --filter-pattern '"auto-restart"'

Terrateam Workflow Not Triggered

# Check orchestrator logs
aws logs tail /aws/lambda/docustack-dev-infra-orchestrator --since 10m

# Verify Terrateam API token
aws secretsmanager get-secret-value --secret-id terrateam/api-token