
Architecture Overview

DocuStack is a HIPAA-compliant document processing platform built on AWS with a focus on security, cost optimization, and operational simplicity.

System Components

┌─────────────────────────────────────────────────────────────────┐
│                       DocuStack Platform                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Web App   │  │  Slack Bot  │  │   Temporal Workflows    │  │
│  │  (Next.js)  │  │ (ECS/Farg.) │  │      (ECS/Fargate)      │  │
│  └──────┬──────┘  └──────┬──────┘  └───────────┬─────────────┘  │
│         │                │                     │                │
│         └────────────────┼─────────────────────┘                │
│                          │                                      │
│  ┌───────────────────────┴───────────────────────────────────┐  │
│  │                   VPC (HIPAA-Compliant)                   │  │
│  │  ┌─────────────┐   ┌─────────────┐   ┌─────────────────┐  │  │
│  │  │   Public    │   │   Private   │   │    Database     │  │  │
│  │  │   Subnets   │   │   Subnets   │   │     Subnets     │  │  │
│  │  │  (ALB/NLB)  │   │ (ECS Tasks) │   │  (RDS/Aurora)   │  │  │
│  │  └─────────────┘   └─────────────┘   └─────────────────┘  │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Repository Structure

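DocuStack follows the Gruntwork two-repository pattern for infrastructure (reusable modules and live configuration), alongside a monorepo for application code: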

| Repository | Purpose |
| --- | --- |
| docustack-mono | Application code (frontend, backend, Lambdas) |
| docustack-infrastructure-modules | Reusable Terraform modules |
| docustack-infrastructure-live | Environment-specific configurations |

Why This Pattern?

  1. Separation of concerns: Infrastructure modules are versioned independently
  2. Reusability: Modules can be used across environments with different configs
  3. Safety: Changes to modules don't automatically affect production
  4. Auditability: Clear version history for infrastructure changes
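The safety point rests on the live repository referencing modules at a fixed version rather than a moving branch. As a minimal sketch of that idea, the check below extracts the `ref` from a Terraform Git module source string and accepts it only if it looks like a release tag. The example URL, the module path, and the `vMAJOR.MINOR.PATCH` tag scheme are assumptions for illustration; this page does not state how DocuStack references its modules.

```python
import re
from typing import Optional

# Terraform Git sources may carry a "ref" query argument selecting a revision.
_REF = re.compile(r"[?&]ref=([^&]+)")
# Assumed tag scheme: vMAJOR.MINOR.PATCH (hypothetical, not confirmed by the docs).
_RELEASE_TAG = re.compile(r"v\d+\.\d+\.\d+")


def pinned_version(source: str) -> Optional[str]:
    """Return the release tag a module source is pinned to, or None if unpinned."""
    match = _REF.search(source)
    if match is None:
        return None  # no ref at all, e.g. a local path or a bare Git URL
    ref = match.group(1)
    # A branch name such as "main" moves over time, so it does not count as pinned.
    return ref if _RELEASE_TAG.fullmatch(ref) else None


# Hypothetical source strings, as they might appear in the live repository.
PINNED = "git::https://example.com/org/docustack-infrastructure-modules.git//vpc?ref=v1.4.0"
FLOATING = "git::https://example.com/org/docustack-infrastructure-modules.git//vpc?ref=main"

print(pinned_version(PINNED))
print(pinned_version(FLOATING))
```

A check like this could run in CI against the live repository, so that a module change reaches an environment only when someone bumps the tag deliberately.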

Key Architectural Decisions

Lambda Code Location

Lambda function code lives in docustack-mono/services/lambdas/, not in the infrastructure modules. This follows the principle:

  • Modules define HOW to deploy (Terraform resources)
  • Application code defines WHAT to deploy (Lambda function code)

See Lambda Code Location for the full ADR.
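One way to picture the split: the monorepo produces a deployment artifact, and the Terraform module only needs to know where that artifact is. The sketch below zips a function directory into an archive. The `package_lambda` helper and the zip-based packaging are assumptions for illustration; the ADR, not this sketch, describes how DocuStack actually builds and ships Lambda code.

```python
import zipfile
from pathlib import Path


def package_lambda(source_dir: Path, output_zip: Path) -> list[str]:
    """Zip every file under source_dir and return the archive member names."""
    # Collect files first so the archive itself is never picked up mid-write.
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    members = [p.relative_to(source_dir).as_posix() for p in files]
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, name in zip(files, members):
            # Store paths relative to the function root, as a Lambda zip expects.
            archive.write(path, name)
    return members
```

Usage would look like `package_lambda(Path("services/lambdas/<name>"), Path("build/<name>.zip"))`, where `<name>` stands for a function directory in the monorepo. The infrastructure module would then reference the resulting archive without containing any function code itself.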

Cost Optimization

  • Fargate Spot: up to 70% cost reduction for non-critical, interruption-tolerant workloads
  • Nightly Scheduler: Automatic stop/start of dev resources
  • On-demand Bastion: No always-on bastion hosts
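To see how the first two measures compound, here is a small worked estimate. The hourly rate and the 12-hours-a-day, 5-days-a-week dev schedule are hypothetical inputs chosen to keep the arithmetic simple, and the 70% Fargate Spot figure is treated as an upper bound rather than a guaranteed discount.

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours for an always-on task


def weekly_cost(hourly_rate: float, hours_running: float,
                spot_discount: float = 0.0) -> float:
    """Cost of one task for a week, given running hours and any Spot discount."""
    if not 0.0 <= spot_discount < 1.0:
        raise ValueError("spot_discount must be in [0, 1)")
    return hourly_rate * hours_running * (1.0 - spot_discount)


# Hypothetical inputs, not real AWS prices or DocuStack's actual schedule.
RATE = 0.10           # USD per task-hour (assumed)
DEV_HOURS = 12 * 5    # scheduler keeps dev running 12 h/day, 5 days/week (assumed)
SPOT_DISCOUNT = 0.70  # upper bound for Fargate Spot

always_on = weekly_cost(RATE, HOURS_PER_WEEK)
scheduled = weekly_cost(RATE, DEV_HOURS)
scheduled_spot = weekly_cost(RATE, DEV_HOURS, SPOT_DISCOUNT)

for label, cost in [("always on", always_on),
                    ("nightly scheduler", scheduled),
                    ("scheduler + Spot", scheduled_spot)]:
    saved = 1 - cost / always_on
    print(f"{label:<20} ${cost:6.2f}/week  ({saved:.0%} saved)")
```

Under these assumptions the scheduler alone removes roughly 64% of the weekly cost, and adding Spot brings the saving to roughly 89%. Real numbers depend on actual task sizes, the schedule, and Spot availability.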

HIPAA Compliance

All infrastructure is designed for HIPAA compliance:

  • Encryption at rest and in transit
  • VPC endpoints for AWS services (traffic to AWS APIs does not traverse the public internet)
  • Audit logging via CloudTrail
  • AWS Config conformance pack

Infrastructure Layers

Deployment follows a strict dependency order:

| Layer | Components | Dependencies |
| --- | --- | --- |
| 0 | Bootstrap (S3, DynamoDB) | None |
| 1 | VPC, GitHub OIDC | Bootstrap |
| 2 | ECR, ECS Cluster, RDS | VPC |
| 3 | Bastion, IP Whitelist, Nightly Scheduler | VPC, ECS |
| 4 | DB Init Lambda, Temporal | RDS, ECS |
| 5 | Slack Bot, Orchestrators | All above |
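The layer table is a dependency graph, so a valid deployment order can be derived mechanically with a topological sort. The sketch below transcribes the table into a mapping and uses Python's standard `graphlib` module. The hyphenated component identifiers are made up for illustration, and "All above" is interpreted as every component in layers 0 through 4, which is an assumption about what the table means.

```python
from graphlib import TopologicalSorter

# Component -> components that must be deployed first (layers 0 to 4).
DEPENDENCIES: dict[str, set[str]] = {
    "bootstrap": set(),
    "vpc": {"bootstrap"},
    "github-oidc": {"bootstrap"},
    "ecr": {"vpc"},
    "ecs-cluster": {"vpc"},
    "rds": {"vpc"},
    "bastion": {"vpc", "ecs-cluster"},
    "ip-whitelist": {"vpc", "ecs-cluster"},
    "nightly-scheduler": {"vpc", "ecs-cluster"},
    "db-init-lambda": {"rds", "ecs-cluster"},
    "temporal": {"rds", "ecs-cluster"},
}

# Layer 5 depends on "All above": read here as every component defined so far.
_all_above = set(DEPENDENCIES)
DEPENDENCIES["slack-bot"] = set(_all_above)
DEPENDENCIES["orchestrators"] = set(_all_above)


def deployment_order(deps: dict[str, set[str]]) -> list[str]:
    """Return one valid order; graphlib raises CycleError if deps contain a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = deployment_order(DEPENDENCIES)
print(" -> ".join(order))
```

Several orders are valid, since components within a layer do not depend on each other. What any valid order must guarantee is that each component appears after everything it depends on, with Bootstrap first and the Slack Bot and Orchestrators last.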

Architecture Documentation

| Document | Description |
| --- | --- |
| Lambda Code Location | ADR: Why Lambda code lives in the monorepo |
| Multi-Account Strategy | AWS Organizations structure and SCPs |
| Terraform State Strategy | Per-account state storage for security |