
OpenSearch Domain

Cost-optimized OpenSearch domain for development and testing environments.

Overview

OpenSearch Domain provides a managed OpenSearch cluster with predictable costs and full control over instance sizing. It is the recommended option for dev and staging environments, where cost savings of roughly 95% over OpenSearch Serverless justify the added operational overhead.

Key Features

  • Cost-optimized - $31/month instance cost vs $691/month for Serverless (96% savings, or 95% once storage is included)
  • Full-text search - Traditional keyword-based search with analyzers
  • Vector search (k-NN) - Semantic similarity search using embeddings
  • VPC-based - Private subnet deployment, no public access
  • HIPAA compliant - Encryption at rest/transit, fine-grained access control
  • OpenSearch 3.3 - Latest stable version with all features

Use Cases

  • Development environments - Cost-effective search for local development
  • Staging environments - Pre-production testing with production-like setup
  • Small workloads - Predictable, low-volume search requirements
  • Learning/experimentation - Full OpenSearch features at minimal cost

When to Use Domain vs Serverless

| Factor | Domain (t3.small) | Serverless |
| --- | --- | --- |
| Cost | $31/month (fixed) | $691/month (minimum 4 OCUs) |
| Best for | Dev, staging, small workloads | Production, variable workloads |
| Scaling | Manual (change instance type) | Automatic |
| Availability | Single-AZ (dev) or Multi-AZ | Multi-AZ by default |
| Management | Requires version upgrades | Fully managed |
| Startup time | 10-15 minutes | Instant |

Recommendation:

  • Dev/Staging: Use OpenSearch Domain (this module)
  • Production: Use OpenSearch Serverless for auto-scaling and high availability

Architecture

┌──────────────────────────────────────────────────────────────────┐
│                  OpenSearch Domain Architecture                  │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Developer Laptop                                                │
│           │                                                      │
│           │ aws ssm start-session (port forward)                 │
│           v                                                      │
│  ┌─────────────────┐                                             │
│  │  Bastion Host   │                                             │
│  │    (Private)    │                                             │
│  └────────┬────────┘                                             │
│           │                                                      │
│           │ HTTPS (443)                                          │
│           v                                                      │
│  ┌─────────────────────────┐        ┌──────────────────┐         │
│  │  OpenSearch Domain      │───────►│  Security        │         │
│  │  (Private Subnet)       │        │  - Fine-grained  │         │
│  │                         │        │    access control│         │
│  │  - t3.small.search      │        │  - Security group│         │
│  │  - 50GB gp3 storage     │        │  - Resource      │         │
│  │  - OpenSearch 3.3       │        │    policy        │         │
│  └────────┬────────────────┘        └──────────────────┘         │
│           │                                                      │
│           │                         ┌──────────────────┐         │
│           └────────────────────────►│  KMS Encryption  │         │
│                                     │  (AWS-managed)   │         │
│                                     └──────────────────┘         │
│                                                                  │
│  ┌─────────────────────────┐        ┌──────────────────┐         │
│  │  SSM Parameter Store    │        │  CloudWatch      │         │
│  │  - Endpoint             │        │  - Metrics       │         │
│  │  - Master password      │        │  - Logs          │         │
│  │  - Security group ID    │        │  - Alarms        │         │
│  └─────────────────────────┘        └──────────────────┘         │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

Prerequisites

Before deploying OpenSearch Domain:

  1. VPC Module - Deployed with private subnets
  2. Bastion Orchestrator - For local development access
  3. AWS CLI - With Session Manager plugin
  4. SSM Parameter Store - For credential management

Deployment

Step 1: Deploy OpenSearch Domain

The module is already configured in infrastructure-live/_envcommon/opensearch-domain.hcl:

terraform {
  source = "git::git@github.com:docustackapp/docustack-infrastructure-modules.git//modules/opensearch-domain?ref=v1.3.1"
}

inputs = {
  project_name = local.project_name
  environment  = local.environment

  # Instance configuration
  instance_type  = "t3.small.search" # 2 vCPU, 2GB RAM
  instance_count = 1                 # Single node for dev

  # Storage configuration
  ebs_volume_size = 50 # GB
  ebs_volume_type = "gp3"
  ebs_iops        = 3000
  ebs_throughput  = 125 # MB/s

  # OpenSearch version
  engine_version = "OpenSearch_3.3"

  # Network configuration
  vpc_id              = dependency.vpc.outputs.vpc_id
  subnet_ids          = [dependency.vpc.outputs.private_subnet_ids[0]] # Single AZ
  allowed_cidr_blocks = [dependency.vpc.outputs.vpc_cidr_block]

  # Security
  create_master_user = true
  master_username    = "admin"
}

Deploy to dev:

cd infrastructure-live/dev/us-east-1/opensearch-domain
terragrunt plan
terragrunt apply

Deployment time: ~10-15 minutes
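
Because the apply can take 10-15 minutes, it may help to poll the domain before moving on to the next step. The following is a minimal sketch, not part of the module: the function name, retry count, and delay are assumptions, and the default domain name is the example value listed under SSM Parameters.

```shell
# Poll the domain until AWS reports that no configuration change is in flight.
# The default domain name comes from the example value in this guide; the
# attempt count and delay are illustrative assumptions.
wait_for_domain() {
  local domain="${1:-docustack-dev-search}"
  local region="${2:-us-east-1}"
  local attempts="${3:-40}"
  local delay="${4:-30}"
  local i=0 processing

  while [ "$i" -lt "$attempts" ]; do
    # DomainStatus.Processing is true while a change is being applied.
    processing=$(aws opensearch describe-domain \
      --domain-name "$domain" \
      --region "$region" \
      --query 'DomainStatus.Processing' \
      --output json) || return 1
    if [ "$processing" = "false" ]; then
      echo "Domain ${domain} is ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done

  echo "Timed out waiting for ${domain}" >&2
  return 1
}

# Usage: wait_for_domain docustack-dev-search us-east-1
```

With the defaults the function gives up after about 20 minutes, which leaves some margin over the expected deployment time.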

Step 2: Retrieve Credentials

The master password is stored in SSM Parameter Store:

aws ssm get-parameter \
  --name /docustack/dev/opensearch/master-password \
  --with-decryption \
  --region us-east-1 \
  --query 'Parameter.Value' \
  --output text
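
Printing the password to the terminal leaves it in scrollback, so one option is to capture it in a variable instead. This is a sketch under stated assumptions: the function name and the OS_PASSWORD variable are hypothetical, and the /docustack/<env>/opensearch/ path pattern is generalized from the dev path shown above.

```shell
# Fetch the master password without echoing it to the terminal.
# Assumption: other environments follow the same path pattern as dev.
get_master_password() {
  local env="${1:-dev}"
  local region="${2:-us-east-1}"
  aws ssm get-parameter \
    --name "/docustack/${env}/opensearch/master-password" \
    --with-decryption \
    --region "$region" \
    --query 'Parameter.Value' \
    --output text
}

# Usage:
#   OS_PASSWORD="$(get_master_password dev)"
#   curl -k -u "admin:${OS_PASSWORD}" https://localhost:9200/_cluster/health
```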

Step 3: Access Locally

See OpenSearch Local Access for the complete guide.

Quick start:

# 1. Create bastion via Slack
/infra bastion create

# 2. Port forward to OpenSearch
aws ssm start-session \
  --target i-BASTION_ID \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters '{"host":["OPENSEARCH_ENDPOINT"],"portNumber":["443"],"localPortNumber":["9200"]}' \
  --region us-east-1

# 3. Access cluster
curl -k -u admin:PASSWORD https://localhost:9200/_cluster/health
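
The OPENSEARCH_ENDPOINT placeholder can be filled in from SSM rather than by hand. The sketch below shows one way to do that; both function names are hypothetical, and it assumes the endpoint parameter holds a bare hostname, as the example in the SSM Parameters table suggests.

```shell
# Build the --parameters JSON for AWS-StartPortForwardingSessionToRemoteHost.
port_forward_params() {
  local host="$1"
  local remote_port="${2:-443}"
  local local_port="${3:-9200}"
  printf '{"host":["%s"],"portNumber":["%s"],"localPortNumber":["%s"]}' \
    "$host" "$remote_port" "$local_port"
}

# Look up the domain endpoint in SSM, then open the tunnel via the bastion.
start_opensearch_tunnel() {
  local bastion_id="$1"
  local region="${2:-us-east-1}"
  local endpoint
  endpoint=$(aws ssm get-parameter \
    --name /docustack/dev/opensearch/endpoint \
    --region "$region" \
    --query 'Parameter.Value' \
    --output text) || return 1
  aws ssm start-session \
    --target "$bastion_id" \
    --document-name AWS-StartPortForwardingSessionToRemoteHost \
    --parameters "$(port_forward_params "$endpoint")" \
    --region "$region"
}

# Usage: start_opensearch_tunnel i-BASTION_ID
```

Keeping the JSON builder separate makes it easy to forward a different local port when 9200 is already in use.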

Configuration

Instance Types

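| Instance Type | vCPU | RAM | Storage | Cost/Month | Use Case |
| --- | --- | --- | --- | --- | --- |
| t3.small.search | 2 | 2GB | EBS | $31 | Dev, small workloads |
| t3.medium.search | 2 | 4GB | EBS | $62 | Staging, medium workloads |
| r5.large.search | 2 | 16GB | EBS | $146 | Production, memory-intensive |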

Storage Options

| Volume Type | IOPS | Throughput | Cost/GB/Month | Use Case |
| --- | --- | --- | --- | --- |
| gp3 | 3,000-16,000 | 125-1,000 MB/s | $0.08 | General purpose (recommended) |
| gp2 | 3 IOPS/GB | 250 MB/s | $0.10 | Legacy |
| io1 | Up to 64,000 | 1,000 MB/s | $0.125 + IOPS | High performance |

Recommendation: Use gp3 for cost-effectiveness and performance.

Multi-AZ Deployment

For staging/production, enable multi-AZ:

inputs = {
  instance_count = 3                                        # Minimum for multi-AZ
  subnet_ids     = dependency.vpc.outputs.private_subnet_ids # All AZs

  zone_awareness_enabled  = true
  availability_zone_count = 3
}

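Cost impact: instance and storage costs both scale with node count. Three t3.small nodes cost $93/month for instances, or $105/month including 50GB of gp3 storage per node.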

Security

Fine-Grained Access Control

The module enables fine-grained access control with:

  • Master user: admin (password in SSM)
  • Internal database: OpenSearch's built-in user database
  • Role-based access: Create additional users/roles in Dashboards
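
Roles can also be created through the security plugin's REST API instead of Dashboards, which is easier to script. The sketch below builds a read-only role for one index pattern; the function names, the OS_PASSWORD variable, and the example pattern are assumptions, and it presumes the port forward from the Deployment section is open.

```shell
# Request body for a read-only role limited to one index pattern.
# "read" is one of the security plugin's default action groups.
readonly_role_body() {
  local pattern="$1"
  printf '{"index_permissions":[{"index_patterns":["%s"],"allowed_actions":["read"]}]}' "$pattern"
}

# Create or update the role over the local tunnel as the master user.
create_readonly_role() {
  local role="$1" pattern="$2"
  curl -k -sS -u "admin:${OS_PASSWORD}" \
    -X PUT "https://localhost:9200/_plugins/_security/api/roles/${role}" \
    -H 'Content-Type: application/json' \
    -d "$(readonly_role_body "$pattern")"
}

# Usage: create_readonly_role logs_reader 'logs-*'
```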

Network Security

  • VPC-only: No public endpoint
  • Security group: Allows HTTPS (443) from VPC CIDR only
  • Resource policy: Allows all ES actions from within VPC
  • Encryption in transit: TLS 1.2+ enforced

Encryption

  • At rest: AWS-managed KMS key
  • In transit: Node-to-node encryption enabled
  • HTTPS: Required for all API calls

Access Patterns

| Access Method | Authentication | Use Case |
| --- | --- | --- |
| Port forwarding via bastion | Master user (admin) | Local development |
| VPC endpoint (future) | IAM roles | Application access |
| OpenSearch Dashboards | Master user or IAM | Administration |

Operations

Monitoring

CloudWatch metrics available:

  • Cluster health: Green/Yellow/Red status
  • CPU utilization: Instance CPU usage
  • JVM memory pressure: Heap usage
  • Storage: Free space, IOPS, throughput
  • Search performance: Query latency, indexing rate

Backups

Automated snapshots:

inputs = {
  automated_snapshot_start_hour = 3 # 3 AM UTC
}

Snapshots are stored in an AWS-managed S3 bucket at no additional charge.
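
To confirm that automated snapshots are being taken, the snapshot repository can be listed over the port forward. This is a sketch only: the function name and OS_PASSWORD variable are hypothetical, and the default repository name cs-automated-enc is the one AWS documents for domains with encryption at rest, so treat it as an assumption and confirm it with GET /_snapshot.

```shell
# List snapshots in a repository over the local tunnel.
# Assumption: the automated repository is named cs-automated-enc.
list_snapshots() {
  local repo="${1:-cs-automated-enc}"
  curl -k -sS -u "admin:${OS_PASSWORD}" \
    "https://localhost:9200/_cat/snapshots/${repo}?v"
}

# Usage: list_snapshots
```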

Upgrades

OpenSearch version upgrades:

inputs = {
  engine_version = "OpenSearch_3.4" # Update version
}

Important: Test upgrades in dev before applying to staging/prod.
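
Before changing engine_version, it can be worth checking which target versions AWS will accept for the running domain. The sketch below wraps aws opensearch get-compatible-versions; the function names are hypothetical and the default domain name is the example value from the SSM Parameters table.

```shell
# Print the versions the domain can be upgraded to, tab-separated.
list_upgrade_targets() {
  local domain="${1:-docustack-dev-search}"
  local region="${2:-us-east-1}"
  aws opensearch get-compatible-versions \
    --domain-name "$domain" \
    --region "$region" \
    --query 'CompatibleVersions[].TargetVersions[]' \
    --output text
}

# Return 0 when the requested version is among the supported targets.
can_upgrade_to() {
  local target="$1"
  shift
  # -F: fixed string, -x: whole-line match, -q: quiet.
  list_upgrade_targets "$@" | tr '\t' '\n' | grep -Fqx "$target"
}

# Usage: can_upgrade_to OpenSearch_3.4 && terragrunt plan
```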

Scaling

Vertical Scaling (Instance Type)

inputs = {
  instance_type = "t3.medium.search" # Upgrade from t3.small
}

Downtime: ~10-15 minutes during instance replacement.

Horizontal Scaling (Node Count)

inputs = {
  instance_count = 3 # Scale from 1 to 3 nodes
}

Downtime: None (rolling deployment).

Storage Scaling

inputs = {
  ebs_volume_size = 100 # Increase from 50GB
}

Downtime: None (online resize).

Cost Analysis

Dev Environment (Current)

| Resource | Specification | Cost/Month |
| --- | --- | --- |
| OpenSearch instance | t3.small.search | $31.00 |
| EBS storage | 50GB gp3 | $4.00 |
| Total | | $35.00 |

Comparison with Serverless

| Metric | Domain (t3.small) | Serverless |
| --- | --- | --- |
| Monthly cost | $35 | $691 |
| Savings | $656/month | - |
| Savings % | 95% | - |
| Annual savings | $7,872 | - |
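
The comparison figures follow from simple arithmetic, sketched below so they can be recomputed when prices or sizes change. The function names are hypothetical; the sketch assumes gp3 storage at $0.08/GB-month as listed under Storage Options, and that every node carries its own EBS volume.

```shell
# Monthly cost in whole dollars: nodes x (instance price + gp3 storage).
domain_monthly_cost() {
  local nodes="$1" instance_usd="$2" storage_gb="$3"
  # gp3 at $0.08/GB-month, held in cents to stay in integer arithmetic.
  echo $(( nodes * (instance_usd * 100 + storage_gb * 8) / 100 ))
}

# Savings against a baseline, printed as "<dollars> <rounded percent>".
savings_vs() {
  local baseline="$1" cost="$2"
  local saved=$(( baseline - cost ))
  echo "$saved $(( (saved * 100 + baseline / 2) / baseline ))"
}

# Usage:
#   domain_monthly_cost 1 31 50   # dev: one t3.small with 50GB
#   savings_vs 691 35             # against the Serverless minimum
```

For the dev configuration, domain_monthly_cost 1 31 50 prints 35 and savings_vs 691 35 prints "656 95", matching the table; multiplying the monthly savings by 12 gives the $7,872 annual figure.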

Scaling Costs

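| Configuration | Monthly Cost | Use Case |
| --- | --- | --- |
| 1x t3.small (dev) | $35 | Development |
| 1x t3.medium (staging) | $66 | Staging |
| 3x t3.medium (multi-AZ) | $198 | Production (small) |
| 3x r5.large (multi-AZ) | $450 | Production (large) |

Each figure includes 50GB of gp3 storage ($4/month) per node.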

Every configuration above remains cheaper than the $691/month Serverless minimum, which makes a domain the better fit for predictable workloads.

Troubleshooting

Cluster Status Yellow

Cause: Unassigned replica shards (expected for single-node clusters).

Solution: For dev, this is normal. For production, use multi-AZ with 3+ nodes.
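
If a green status is preferred in dev, replicas can be turned off on existing indices, since a single node has nowhere to place them. The following sketch is one way to do it over the port forward; the function names and OS_PASSWORD variable are assumptions, and indices created later still get the default replica count unless an index template overrides it.

```shell
# Settings body that disables replicas.
no_replicas_body() {
  printf '{"index":{"number_of_replicas":0}}'
}

# Apply to one index, or to a pattern such as "*", over the local tunnel.
set_no_replicas() {
  local index="${1:-my-index}"
  curl -k -sS -u "admin:${OS_PASSWORD}" \
    -X PUT "https://localhost:9200/${index}/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(no_replicas_body)"
}

# Usage: set_no_replicas my-index
```

Do not carry this setting into multi-node environments, where replicas are what keep the cluster available when a node is lost.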

Out of Memory Errors

Cause: JVM heap pressure too high.

Solution:

  1. Check CloudWatch metric JVMMemoryPressure
  2. If consistently >75%, upgrade instance type
  3. Consider reducing shard count or index size
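
The check in steps 1 and 2 can be scripted against CloudWatch. The sketch below is illustrative rather than part of the module: the function names are hypothetical, the caller supplies the AWS account ID and an ISO 8601 time window, and the 75% threshold is the guideline from the list above.

```shell
# Return 0 when a reading is above the threshold (default 75).
jvm_pressure_high() {
  local reading="$1" threshold="${2:-75}"
  # awk compares numerically, so fractional readings such as 82.5 work.
  awk -v r="$reading" -v t="$threshold" 'BEGIN { exit !(r > t) }'
}

# Fetch the peak JVMMemoryPressure for a time window from CloudWatch.
max_jvm_pressure() {
  local domain="$1" account_id="$2" start="$3" end="$4"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/ES \
    --metric-name JVMMemoryPressure \
    --dimensions "Name=DomainName,Value=${domain}" "Name=ClientId,Value=${account_id}" \
    --start-time "$start" --end-time "$end" \
    --period 300 --statistics Maximum \
    --query 'max(Datapoints[].Maximum)' \
    --output text
}

# Usage:
#   peak="$(max_jvm_pressure docustack-dev-search ACCOUNT_ID START END)"
#   jvm_pressure_high "$peak" && echo "Consider a larger instance type"
```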

Slow Queries

Cause: Insufficient resources or inefficient queries.

Solution:

  1. Check CloudWatch metric SearchLatency
  2. Review slow query logs in CloudWatch Logs
  3. Optimize queries or upgrade instance type
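
Slow query logs only contain entries once thresholds are set on the index. The sketch below sets search slow log thresholds over the port forward; the function names, OS_PASSWORD variable, and the 5s and 2s values are assumptions, and it presumes slow log publishing to CloudWatch Logs is already enabled on the domain.

```shell
# Settings body with warn and info thresholds for the search slow log.
slowlog_body() {
  local warn="${1:-5s}" info="${2:-2s}"
  printf '{"index.search.slowlog.threshold.query.warn":"%s","index.search.slowlog.threshold.query.info":"%s"}' \
    "$warn" "$info"
}

# Apply the thresholds to one index over the local tunnel.
enable_search_slowlog() {
  local index="${1:-my-index}"
  curl -k -sS -u "admin:${OS_PASSWORD}" \
    -X PUT "https://localhost:9200/${index}/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(slowlog_body "$2" "$3")"
}

# Usage: enable_search_slowlog my-index 5s 2s
```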

Cannot Connect via Port Forwarding

Cause: Security group or bastion issues.

Solution: See OpenSearch Local Access - Troubleshooting

Migration from Serverless

If migrating from OpenSearch Serverless:

Step 1: Export Data

# Export indices from serverless collection
elasticdump \
  --input=https://SERVERLESS_ENDPOINT/my-index \
  --output=my-index.json \
  --type=data

Step 2: Deploy Domain

cd infrastructure-live/dev/us-east-1/opensearch-domain
terragrunt apply

Step 3: Import Data

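# Import to the domain through the local port forward.
# TLS verification is disabled because the certificate does not match
# localhost (the same trade-off as curl -k in the Deployment section).
NODE_TLS_REJECT_UNAUTHORIZED=0 elasticdump \
  --input=my-index.json \
  --output=https://admin:PASSWORD@localhost:9200/my-index \
  --type=data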
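
After the import, a document count comparison is a cheap sanity check before switching the application over. This is a minimal sketch: the function names and OS_PASSWORD variable are hypothetical, and the source count is assumed to have been recorded from the Serverless collection before export.

```shell
# Extract the "count" field from a _count API response without jq.
parse_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Document count of an index on the domain, via the local tunnel.
domain_doc_count() {
  curl -k -sS -u "admin:${OS_PASSWORD}" \
    "https://localhost:9200/${1}/_count" | parse_count
}

# Return 0 when the source and target counts are equal.
counts_match() {
  [ "$1" -eq "$2" ]
}

# Usage:
#   counts_match "$SOURCE_COUNT" "$(domain_doc_count my-index)" \
#     || echo "Document counts differ; do not destroy the source yet" >&2
```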

Step 4: Update Application

Update the application to use the new endpoint from SSM:

aws ssm get-parameter \
  --name /docustack/dev/opensearch/endpoint \
  --region us-east-1 \
  --query 'Parameter.Value' \
  --output text

Step 5: Destroy Serverless

cd infrastructure-live/dev/us-east-1/opensearch-serverless
terragrunt destroy

SSM Parameters

The module creates these SSM parameters for cross-module reference:

| Parameter | Description | Example |
| --- | --- | --- |
| /docustack/dev/opensearch/endpoint | Domain endpoint | vpc-docustack-dev-search-*.es.amazonaws.com |
| /docustack/dev/opensearch/domain-name | Domain name | docustack-dev-search |
| /docustack/dev/opensearch/master-password | Master user password | (SecureString) |
| /docustack/dev/opensearch/kibana-endpoint | Dashboards URL | https://vpc-docustack-dev-search-*/_dashboards |
| /docustack/dev/opensearch/security-group-id | Security group ID | sg-* |
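
To check which of these parameters exist in an environment, they can be listed by path. The sketch below prints names and types only, so the SecureString value is never decrypted; the function name is hypothetical, and the path pattern for environments other than dev is an assumption.

```shell
# List the OpenSearch parameters for an environment (names and types only).
list_opensearch_params() {
  local env="${1:-dev}" region="${2:-us-east-1}"
  aws ssm get-parameters-by-path \
    --path "/docustack/${env}/opensearch" \
    --region "$region" \
    --query 'Parameters[].[Name,Type]' \
    --output text
}

# Usage: list_opensearch_params dev
```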

Module Source

docustack-infrastructure-modules/modules/opensearch-domain/
├── main.tf              # Domain resource
├── variables.tf         # Input variables
├── outputs.tf           # Outputs
├── data.tf              # Data sources
├── locals.tf            # Local values
├── versions.tf          # Provider requirements
├── security-groups.tf   # Security group
├── ssm-parameters.tf    # SSM parameters
├── README.md            # Module documentation
└── examples/
    └── basic/
        └── terragrunt.hcl   # Example usage

Current version: v1.3.1

Repository: docustack-infrastructure-modules