AWS Infrastructure Setup

Detailed AWS resource specifications and setup guide for ReptiDex using ECS Fargate serverless container orchestration, supporting 1,000-5,000 users across development and production environments.

Architecture Update: This guide reflects the current ECS Fargate-based deployment. All services run as serverless containers on ARM64 Graviton2 processors with host-based routing via Application Load Balancer. Infrastructure is fully managed via CloudFormation templates.

Resource Overview

Environment Architecture

Development Environment

  • Purpose: Development, testing, and staging
  • Target Load: 10-50 concurrent users
  • Uptime Requirement: 95% (weekdays only)
  • Data Retention: 30 days

Production Environment

  • Purpose: Live application serving customers
  • Target Load: 1,000-5,000 registered users, 200-500 concurrent
  • Uptime Requirement: 99.9%
  • Data Retention: Full backup and archival
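
To make the uptime targets concrete, the sketch below converts each percentage into a downtime budget. The 30-day month and the reading of "weekdays only" as 5 x 24 hours per week are assumptions made for illustration, not figures from this guide.

```python
def allowed_downtime_minutes(uptime_pct: float, period_hours: float) -> float:
    """Minutes of downtime permitted in a period at the given uptime target."""
    return period_hours * 60 * (1 - uptime_pct / 100)


# Production: 99.9% measured over an assumed 30-day month (720 hours).
prod_budget = allowed_downtime_minutes(99.9, 30 * 24)

# Development: 95% measured on weekdays only (assumed 5 days x 24 hours).
dev_budget = allowed_downtime_minutes(95.0, 5 * 24)

print(f"production: {prod_budget:.1f} min/month")
print(f"development: {dev_budget:.0f} min/week")
```

Under these assumptions production can be down for roughly 43 minutes per month, while development tolerates about 6 hours per working week.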

Infrastructure Components Summary

Development Environment Resources

Compute (ECS Fargate):
  • 10x Fargate tasks (4 frontend + 6 backend services)
  • ARM64 architecture (Graviton2)
  • 0.25 vCPU / 512MB per frontend task
  • 0.5 vCPU / 1024MB per backend task
Container Registry:
  • 10x ECR repositories (one per service)
  • Image scanning enabled
Database:
  • 1x db.t4g.micro PostgreSQL (ARM64)
  • 1x cache.t4g.micro ElastiCache Redis (ARM64)
Storage & CDN:
  • 3x S3 buckets (media, assets, logs)
  • CloudFront distribution (optional)
Networking:
  • 1x VPC with 4 subnets (2 AZs)
  • Application Load Balancer
  • 2x NAT Gateways
  • Complete security group matrix
Estimated Monthly Cost: $120-160

Compute Resources

ECS Fargate Task Specifications

| Service | Development | Production | Justification |
| --- | --- | --- | --- |
| Frontend Apps (4) | 0.25 vCPU / 512 MB, 1 task per app | 0.5 vCPU / 1024 MB, 2 tasks per app (auto-scale to 4) | Static React apps served via nginx. ARM64 Graviton2 provides 20% cost savings. Minimal resources needed for static content. |
| Backend Services (6) | 0.5 vCPU / 1024 MB, 1 task per service | 1.0 vCPU / 2048 MB, 2 tasks per service (auto-scale to 6) | Python FastAPI services with database connections. Higher memory for connection pooling and async operations. Each service scales independently. |
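
Fargate only accepts certain CPU/memory pairings, so it can help to check the sizes above before writing task definitions. The sketch below is a minimal check plus a baseline capacity total; the lookup table covers only the smaller Fargate sizes, and the `SIZES` structure and function names are illustrative, not part of the ReptiDex templates.

```python
# Subset of valid Fargate CPU units -> memory (MB) combinations for Linux tasks.
FARGATE_MEMORY_MB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}

# (cpu_units, memory_mb, service_count, baseline_tasks_per_service) per tier.
SIZES = {
    "dev": {"frontend": (256, 512, 4, 1), "backend": (512, 1024, 6, 1)},
    "prod": {"frontend": (512, 1024, 4, 2), "backend": (1024, 2048, 6, 2)},
}


def is_valid_fargate_size(cpu_units: int, memory_mb: int) -> bool:
    """True when the pair is one of the combinations listed above."""
    return memory_mb in FARGATE_MEMORY_MB.get(cpu_units, [])


def baseline_totals(env: str) -> tuple[int, float, int]:
    """Return (task count, total vCPU, total memory in MB) before auto scaling."""
    tasks = cpu = memory = 0
    for cpu_units, memory_mb, services, per_service in SIZES[env].values():
        if not is_valid_fargate_size(cpu_units, memory_mb):
            raise ValueError(f"unsupported size: {cpu_units} CPU / {memory_mb} MB")
        count = services * per_service
        tasks += count
        cpu += count * cpu_units
        memory += count * memory_mb
    return tasks, cpu / 1024, memory


for env in SIZES:
    print(env, baseline_totals(env))
```

With the table's values, development comes to 10 tasks, 4 vCPU and 8192 MB, and production starts at 20 tasks, 16 vCPU and 32768 MB before any scale-out.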

ECS Service Architecture

Serverless Container Orchestration

  • Cluster Name: dev-reptidex-cluster / prod-reptidex-cluster
  • Launch Type: Fargate (serverless)
  • Platform Version: LATEST (1.4.0+)
  • Container Insights: Enabled
  • CloudWatch Logs: All tasks log to /ecs/[service-name]
  • Runtime Platform: Linux/ARM64
  • Network Mode: awsvpc (required for Fargate)
  • Task Execution Role: ECS agent permissions
  • Task Role: Application AWS service access
  • Health Checks: Container-level health monitoring
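
As a rough illustration of how these settings map onto an ECS task definition, the helper below builds the request body in the shape accepted by the ECS `RegisterTaskDefinition` API. The function name, the image URI and the `ecs` stream prefix are placeholders chosen for this sketch; the execution and task role ARNs are left out because this guide does not list them.

```python
def fargate_task_definition(
    service_name: str,
    cpu_units: int,
    memory_mb: int,
    container_port: int,
    image_uri: str,
    region: str = "us-east-1",
) -> dict:
    """Build a Fargate task definition body for a single-container service."""
    return {
        "family": service_name,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # required for Fargate
        "runtimePlatform": {
            "cpuArchitecture": "ARM64",
            "operatingSystemFamily": "LINUX",
        },
        "cpu": str(cpu_units),
        "memory": str(memory_mb),
        "containerDefinitions": [
            {
                "name": service_name,
                "image": image_uri,
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "protocol": "tcp"}
                ],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": f"/ecs/{service_name}",
                        "awslogs-region": region,
                        "awslogs-stream-prefix": "ecs",  # assumed prefix
                    },
                },
            }
        ],
    }


# Placeholder image URI; a real one would point at the service's ECR repository.
definition = fargate_task_definition(
    "repti-core", 512, 1024, 8000, "example.invalid/repti-core:latest"
)
```

The resulting dictionary could be passed to boto3's `register_task_definition(**definition)` once the role ARNs are added.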

Auto Scaling Configuration

Development Scaling

ECS Service Auto Scaling: Disabled (fixed capacity)
  • Reason: Predictable load, cost optimization
  • Desired Count: 1 task per service (10 total tasks)
  • Manual scaling: Update desired count for load testing
  • Monitoring: Basic CloudWatch metrics
Task Management:
  • Deployment Type: Rolling update
  • Minimum Healthy Percent: 100%
  • Maximum Percent: 200%
  • Task Placement: Spread across AZs
Cost Optimization:
  • Fargate Spot: Not used (reliability over cost)
  • Task Scheduling: Run 24/7 for development access
  • Resource Right-Sizing: Match actual usage patterns
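
The rolling-update percentages translate into task counts as follows: ECS rounds the minimum healthy count up and the maximum count down. The short sketch below works through that arithmetic; the function name is illustrative.

```python
import math


def rolling_update_bounds(
    desired_count: int, min_healthy_pct: int, max_pct: int
) -> tuple[int, int]:
    """Return (minimum healthy tasks, maximum running tasks) during a deployment."""
    lower = math.ceil(desired_count * min_healthy_pct / 100)
    upper = math.floor(desired_count * max_pct / 100)
    return lower, upper


# Development services: desired count 1, minimum healthy 100%, maximum 200%.
print(rolling_update_bounds(1, 100, 200))
```

With a desired count of 1, the old task keeps running until a second, new task is healthy, so each deployment briefly doubles the service's footprint but never drops below one healthy task.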

Load Balancer Setup

Application Load Balancer (Required for ECS Fargate)

  • Type: Application Load Balancer
  • Scheme: Internet-facing
  • IP Address Type: IPv4
  • Availability Zones: 2 (dev) / 3 (prod)
  • Routing: Host-based (subdomains)
  • Target Type: IP (required for Fargate)
  • Protocol: HTTP
  • Frontend Health Check: /health (port 80)
  • Backend Health Check: /api/v1/health (port 8000)
  • Deregistration Delay: 30 seconds

HTTP Listener (Port 80): Redirects all traffic to HTTPS

HTTPS Listener (Port 443): Host-based routing to services

  • dev.reptidex.com → web-public target group
  • admin-dev.reptidex.com → web-admin target group
  • breeder-dev.reptidex.com → web-breeder target group
  • embed-dev.reptidex.com → web-embed target group
  • api-dev.reptidex.com → repti-core target group
  • animal-api-dev.reptidex.com → repti-animal target group
  • commerce-api-dev.reptidex.com → repti-commerce target group
  • media-api-dev.reptidex.com → repti-media target group
  • community-api-dev.reptidex.com → repti-community target group
  • ops-api-dev.reptidex.com → repti-ops target group
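
The host-to-target-group mapping above can be expressed as data and turned into listener rules. The sketch below builds rule bodies in the shape used by the Elastic Load Balancing v2 `CreateRule` API; the priority numbering and the placeholder ARNs are assumptions made for illustration.

```python
HOST_TO_TARGET_GROUP = {
    "dev.reptidex.com": "web-public",
    "admin-dev.reptidex.com": "web-admin",
    "breeder-dev.reptidex.com": "web-breeder",
    "embed-dev.reptidex.com": "web-embed",
    "api-dev.reptidex.com": "repti-core",
    "animal-api-dev.reptidex.com": "repti-animal",
    "commerce-api-dev.reptidex.com": "repti-commerce",
    "media-api-dev.reptidex.com": "repti-media",
    "community-api-dev.reptidex.com": "repti-community",
    "ops-api-dev.reptidex.com": "repti-ops",
}


def build_listener_rules(target_group_arns: dict[str, str]) -> list[dict]:
    """Create one host-header forward rule per subdomain."""
    rules = []
    for priority, (host, group) in enumerate(HOST_TO_TARGET_GROUP.items(), start=1):
        rules.append(
            {
                "Priority": priority,  # assumed ordering; hosts do not overlap
                "Conditions": [
                    {"Field": "host-header", "HostHeaderConfig": {"Values": [host]}}
                ],
                "Actions": [
                    {"Type": "forward", "TargetGroupArn": target_group_arns[group]}
                ],
            }
        )
    return rules


# Placeholder ARNs; real values come from the created target groups.
placeholder_arns = {
    name: f"arn:placeholder:targetgroup/{name}"
    for name in HOST_TO_TARGET_GROUP.values()
}
rules = build_listener_rules(placeholder_arns)
```

Each rule could then be passed to boto3's `elbv2.create_rule(ListenerArn=..., **rule)` against the HTTPS listener.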

Database Setup

PostgreSQL RDS Configuration

| Configuration | Development | Production |
| --- | --- | --- |
| Engine Version | PostgreSQL 15.7 | PostgreSQL 15.7 |
| Instance Class | db.t4g.micro (ARM64) | db.t4g.small (ARM64) |
| vCPU / Memory | 2 vCPU / 1 GB | 2 vCPU / 2 GB |
| Storage | 20 GB gp3 (auto-scale to 100 GB) | 100 GB gp3 (auto-scale to 1 TB) |
| Multi-AZ | No | Yes |
| Backup Retention | 7 days | 30 days |
| Monitoring | Basic | Enhanced (1 min) |
| Architecture | Graviton2 (ARM64) | Graviton2 (ARM64) |
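
One way to keep the two environments in step is to derive both from a single parameter builder. The sketch below uses parameter names from the RDS `CreateDBInstance` API; the instance identifier pattern is hypothetical, "1 TB" is taken as 1000 GiB, and required fields such as credentials, subnet group and the Enhanced Monitoring role are omitted.

```python
RDS_SETTINGS = {
    "dev": {
        "DBInstanceClass": "db.t4g.micro",
        "AllocatedStorage": 20,
        "MaxAllocatedStorage": 100,
        "MultiAZ": False,
        "BackupRetentionPeriod": 7,
        "MonitoringInterval": 0,  # 0 disables Enhanced Monitoring (basic only)
    },
    "prod": {
        "DBInstanceClass": "db.t4g.small",
        "AllocatedStorage": 100,
        "MaxAllocatedStorage": 1000,  # assumption: 1 TB treated as 1000 GiB
        "MultiAZ": True,
        "BackupRetentionPeriod": 30,
        "MonitoringInterval": 60,  # Enhanced Monitoring at 1-minute granularity
    },
}


def rds_params(env: str) -> dict:
    """Merge shared engine settings with the per-environment values."""
    shared = {
        "DBInstanceIdentifier": f"reptidex-{env}-postgres",  # hypothetical name
        "Engine": "postgres",
        "EngineVersion": "15.7",
        "StorageType": "gp3",
    }
    return {**shared, **RDS_SETTINGS[env]}


print(rds_params("prod")["DBInstanceClass"])
```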

Redis ElastiCache Configuration

Development Redis Setup

Instance Configuration:
  • Node Type: cache.t4g.micro (ARM64 Graviton2)
  • vCPU/Memory: 2 vCPU / 0.5 GB
  • Network Performance: Up to 5 Gbps
Cluster Configuration:
  • Engine: Redis 7.0
  • Parameter Group: Default redis7.x
  • Port: 6379
  • Subnet Group: Private subnets only
  • Cluster Mode: Disabled
Security & Backup:
  • At-Rest Encryption: Yes
  • In-Transit Encryption: No (development)
  • Auth Token: No (development)
  • Automatic Backups: Disabled
  • Snapshot Retention: N/A

Database Sizing Estimates

Database Growth Projections

  • Year 1 (1K users): ~2 GB (50 MB user data + 1.5 GB taxonomy)
  • Year 2 (5K users): ~8 GB (250 MB user data + 7.5 GB animal records)
  • Year 3 (15K users): ~25 GB (750 MB user data + 24 GB breeding records)
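
The projections can be checked with a few lines of arithmetic. The sketch below sums each year's components and derives the per-user figure implied by the user-data numbers; decimal units (1 GB = 1000 MB) are assumed.

```python
# (label, users, user data in MB, other data in GB), taken from the projections.
PROJECTIONS = [
    ("Year 1", 1_000, 50, 1.5),
    ("Year 2", 5_000, 250, 7.5),
    ("Year 3", 15_000, 750, 24.0),
]

totals_gb = {}
per_user_kb = {}
for label, users, user_mb, other_gb in PROJECTIONS:
    totals_gb[label] = round(user_mb / 1000 + other_gb, 2)
    per_user_kb[label] = user_mb * 1000 / users

for label in totals_gb:
    print(f"{label}: {totals_gb[label]} GB total, {per_user_kb[label]:.0f} KB per user")
```

The sums come to 1.55 GB, 7.75 GB and 24.75 GB, which round to the headline figures, and the user-data component works out to a constant 50 KB per user across all three years.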

Storage & Content Delivery

S3 Bucket Configuration

| Bucket Purpose | Development | Production | Storage Class |
| --- | --- | --- | --- |
| Application Assets | reptidex-dev-assets | reptidex-prod-assets | Standard |
| User Uploads | reptidex-dev-uploads | reptidex-prod-uploads | Standard → IA (30d) |
| Database Backups | N/A | reptidex-prod-backups | Standard → Glacier (7d) |
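
The storage-class transitions in the table correspond to S3 lifecycle rules. The sketch below builds them in the shape expected by the S3 `PutBucketLifecycleConfiguration` API; the rule IDs are hypothetical, and the empty prefix (apply to every object) is an assumption.

```python
def lifecycle_rule(rule_id: str, days: int, storage_class: str) -> dict:
    """One transition rule applied to every object in the bucket."""
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # assumption: rule covers the whole bucket
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


LIFECYCLE_CONFIGS = {
    "reptidex-dev-uploads": {
        "Rules": [lifecycle_rule("uploads-to-ia", 30, "STANDARD_IA")]
    },
    "reptidex-prod-uploads": {
        "Rules": [lifecycle_rule("uploads-to-ia", 30, "STANDARD_IA")]
    },
    "reptidex-prod-backups": {
        "Rules": [lifecycle_rule("backups-to-glacier", 7, "GLACIER")]
    },
}

for bucket, config in LIFECYCLE_CONFIGS.items():
    transition = config["Rules"][0]["Transitions"][0]
    print(bucket, transition["Days"], transition["StorageClass"])
```

Each entry could be applied with boto3's `put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=config)`. The asset buckets stay in Standard, so they need no rule.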

CloudFront Distribution

Development CDN Setup

Distribution Configuration:
  • Price Class: Use Only US, Canada and Europe
  • Alternate Domain Names: dev.reptidex.com
  • SSL Certificate: ACM Certificate
  • Minimum TLS Version: TLSv1.2
Origin Configuration:
  • Origin Domain: ALB DNS name
  • Origin Protocol: HTTPS Only
  • Origin Path: /
  • Custom Headers: None required
Cache Behaviors:
  • Default: TTL 1 hour, Forward all headers
  • /api/*: TTL 0 (no caching)
  • /assets/*: TTL 7 days, Gzip compression

Network Architecture

VPC Configuration

Virtual Private Cloud Setup

Development VPC:
  • CIDR Block: 10.1.0.0/16
  • Availability Zones: 2 (us-east-1a, us-east-1b)
  • Public Subnets: 2 (10.1.1.0/24, 10.1.2.0/24)
  • Private Subnets: 2 (10.1.10.0/24, 10.1.20.0/24)
  • Internet Gateway: Yes
  • NAT Gateway: 2 (one per AZ, matching the resource summary and cost breakdown)
Production VPC:
  • CIDR Block: 10.0.0.0/16
  • Availability Zones: 3 (us-east-1a, us-east-1b, us-east-1c)
  • Public Subnets: 3 (10.0.1.0/24, 10.0.2.0/24, 10.0.3.0/24)
  • Private Subnets: 3 (10.0.10.0/24, 10.0.20.0/24, 10.0.30.0/24)
  • Internet Gateway: Yes
  • NAT Gateway: 3 (one per AZ for high availability, matching the cost breakdown)
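
A quick sanity check on the address plan is to confirm that every subnet sits inside its VPC CIDR and that no two subnets overlap. The sketch below does this with the standard `ipaddress` module; the expansion of the shorthand production ranges into individual /24 blocks is an interpretation of the list above.

```python
import ipaddress
from itertools import combinations

VPC_LAYOUTS = {
    "dev": {
        "cidr": "10.1.0.0/16",
        "public": ["10.1.1.0/24", "10.1.2.0/24"],
        "private": ["10.1.10.0/24", "10.1.20.0/24"],
    },
    "prod": {
        "cidr": "10.0.0.0/16",
        # Assumed expansion of the shorthand 10.0.1-3.0/24 and 10.0.10-30.0/24.
        "public": ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"],
        "private": ["10.0.10.0/24", "10.0.20.0/24", "10.0.30.0/24"],
    },
}

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves five addresses in every subnet


def usable_addresses(layout: dict) -> int:
    """Validate the layout and return the total usable IPs across its subnets."""
    vpc = ipaddress.ip_network(layout["cidr"])
    subnets = [ipaddress.ip_network(c) for c in layout["public"] + layout["private"]]
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            raise ValueError(f"{subnet} is outside {vpc}")
    for first, second in combinations(subnets, 2):
        if first.overlaps(second):
            raise ValueError(f"{first} overlaps {second}")
    return sum(s.num_addresses - AWS_RESERVED_PER_SUBNET for s in subnets)


for env, layout in VPC_LAYOUTS.items():
    print(env, usable_addresses(layout))
```

Because every Fargate task takes its own ENI and private IP, the usable-address count is also the ceiling on tasks plus other ENI consumers per VPC.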

Security Groups Matrix

| Security Group | Inbound Rules | Outbound Rules | Applied To |
| --- | --- | --- | --- |
| ALB-SG | HTTP (80) from 0.0.0.0/0; HTTPS (443) from 0.0.0.0/0 | HTTP (80) to ECS-SG (frontend); HTTP (8000-8006) to ECS-SG (backend) | Application Load Balancer |
| ECS-SG | HTTP (80) from ALB-SG (frontend); HTTP (8000-8006) from ALB-SG (backend) | HTTPS (443) to 0.0.0.0/0; PostgreSQL (5432) to RDS-SG; Redis (6379) to Redis-SG | ECS Fargate Tasks (all services) |
| RDS-SG | PostgreSQL (5432) from ECS-SG | None (default deny) | RDS PostgreSQL |
| Redis-SG | Redis (6379) from ECS-SG | None (default deny) | ElastiCache Redis |
Security Best Practice: ECS Fargate tasks use awsvpc network mode, which assigns each task its own elastic network interface (ENI) with a private IP address. Security groups are applied directly to tasks, not to instances. All inbound traffic must go through the ALB for security.
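
The inbound side of the matrix can be modeled as data to confirm the intended traffic paths, for example that only the load balancer is reachable from the internet. This is a simplified sketch: the tuple layout and helper names are illustrative, and it checks the documented rules rather than anything deployed.

```python
# Inbound rules per security group: (source, first port, last port).
INBOUND_RULES = {
    "ALB-SG": [("0.0.0.0/0", 80, 80), ("0.0.0.0/0", 443, 443)],
    "ECS-SG": [("ALB-SG", 80, 80), ("ALB-SG", 8000, 8006)],
    "RDS-SG": [("ECS-SG", 5432, 5432)],
    "Redis-SG": [("ECS-SG", 6379, 6379)],
}


def allows(source: str, target_group: str, port: int) -> bool:
    """True when target_group accepts traffic from source on the given port."""
    return any(
        src == source and low <= port <= high
        for src, low, high in INBOUND_RULES[target_group]
    )


def internet_facing_groups() -> list[str]:
    """Security groups with at least one rule open to 0.0.0.0/0."""
    return sorted(
        group
        for group, rules in INBOUND_RULES.items()
        if any(src == "0.0.0.0/0" for src, _, _ in rules)
    )


print(internet_facing_groups())
print(allows("ECS-SG", "RDS-SG", 5432), allows("ALB-SG", "RDS-SG", 5432))
```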

Monitoring & Observability

CloudWatch Configuration

Essential Monitoring Setup

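ECS Fargate Task Metrics:
  • CPU Utilization: Per-service metric from ECS and Container Insights
  • Memory Utilization: Per-service metric from ECS and Container Insights (no CloudWatch Agent needed on Fargate)
  • Ephemeral Storage: Task-level usage via Container Insights
  • Network I/O: Bytes received/transmitted per task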
Application Load Balancer:
  • Request Count: Total requests per minute
  • Target Response Time: Average response time
  • HTTP 4xx/5xx Errors: Error rate monitoring
  • Healthy/Unhealthy Hosts: Target health status
RDS PostgreSQL:
  • Database Connections: Current connection count
  • CPU Utilization: Database server CPU usage
  • Free Storage Space: Available disk space
  • Read/Write Latency: Query performance metrics
ElastiCache Redis:
  • CPU Utilization: Redis server CPU usage
  • Memory Utilization: Used memory percentage
  • Cache Hit Ratio: Cache effectiveness
  • Network Bytes In/Out: Redis traffic volume

Security Configuration

IAM Roles and Policies

Security Best Practices

ECS Task Execution Role:
  • ECR pull access for Docker images
  • CloudWatch Logs write access
  • Secrets Manager read access for credentials
  • SSM Parameter Store read access for config
  • Required for ECS agent to start tasks
ECS Task Role:
  • S3 read/write for media buckets only
  • CloudWatch logs and custom metrics write
  • SES send email permissions (backend only)
  • No database credentials (connection via Secrets Manager)
  • Service-specific permissions only
Developer Access:
  • MFA required for AWS console access
  • Development environment full access
  • Production read-only (deployments via CI/CD only)
  • CloudTrail logging all API calls
  • Session timeout after 4 hours
  • No direct access to Secrets Manager production secrets

SSL/TLS Configuration

SSL Certificate Setup

AWS Certificate Manager (ACM):
  • Domain: *.reptidex.com (wildcard certificate)
  • Validation: DNS validation via Route 53
  • Renewal: Automatic (ACM begins renewal 60 days before expiration)
  • Key Algorithm: RSA-2048 or ECDSA P-256
Certificate Usage:
  • CloudFront: Primary SSL termination
  • Application Load Balancer: Backend SSL
  • Development: Self-signed or Let’s Encrypt
Security Headers:
  • HSTS: max-age=31536000; includeSubDomains
  • X-Frame-Options: DENY
  • X-Content-Type-Options: nosniff
  • Referrer-Policy: strict-origin-when-cross-origin

Cost Analysis

Monthly Cost Breakdown

Development Environment

| Item | Monthly Cost |
| --- | --- |
| ECS Fargate (10 tasks) | $28/month |
| Application Load Balancer | $18/month |
| RDS PostgreSQL (t4g.micro) | $12/month |
| ElastiCache Redis (t4g.micro) | $11/month |
| NAT Gateway (2 AZs) | $66/month |
| S3 Storage (100GB) | $3/month |
| CloudWatch & Logs | $10/month |
| ECR Storage | $2/month |
| Total Development | $150/month |

Production Environment

| Item | Monthly Cost |
| --- | --- |
| ECS Fargate (20-40 tasks) | $85/month |
| Application Load Balancer | $22/month |
| RDS PostgreSQL Multi-AZ (t4g.small) | $52/month |
| ElastiCache Redis (t4g.small) | $38/month |
| NAT Gateway (3 AZs) | $99/month |
| S3 Storage (1TB) | $25/month |
| CloudFront CDN | $35/month |
| CloudWatch & Monitoring | $20/month |
| ECR Storage & Transfer | $8/month |
| Total Production | $384/month |

Cost Savings with ARM64: Using t4g (Graviton2) instances for RDS and ElastiCache provides approximately 20% cost savings compared to equivalent t3 instances. ECS Fargate on ARM64 provides similar savings over AMD64.
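
The line items can be totaled programmatically to confirm the stated sums and to see which item dominates each bill. The sketch below uses only the figures from the breakdown; the dictionary keys are shortened labels.

```python
DEV_COSTS = {
    "ECS Fargate": 28, "Application Load Balancer": 18, "RDS PostgreSQL": 12,
    "ElastiCache Redis": 11, "NAT Gateway": 66, "S3 Storage": 3,
    "CloudWatch & Logs": 10, "ECR Storage": 2,
}
PROD_COSTS = {
    "ECS Fargate": 85, "Application Load Balancer": 22, "RDS PostgreSQL": 52,
    "ElastiCache Redis": 38, "NAT Gateway": 99, "S3 Storage": 25,
    "CloudFront CDN": 35, "CloudWatch & Monitoring": 20, "ECR Storage & Transfer": 8,
}


def summarize(costs: dict[str, int]) -> tuple[int, str, float]:
    """Return (monthly total, largest line item, that item's share of the total)."""
    total = sum(costs.values())
    largest = max(costs, key=costs.get)
    return total, largest, costs[largest] / total


for name, costs in (("development", DEV_COSTS), ("production", PROD_COSTS)):
    total, largest, share = summarize(costs)
    print(f"{name}: ${total}/month, largest item {largest} ({share:.0%})")
```

The totals match the tables ($150 and $384). In both environments the NAT Gateways are the largest single item, at 44% of the development bill and about 26% of production.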

Cost Optimization Strategies

Launch Phase Optimizations

Compute Savings Plans:
  • Purchase Compute Savings Plans for predictable Fargate usage
  • Potential Savings: Up to 50% on Fargate compute costs
  • Implementation: After 2-3 months of stable usage patterns
  • Commitment: 1-year term with flexible instance families
Resource Right-Sizing:
  • Start with smaller task definitions, scale based on metrics
  • Monitor: CPU, memory utilization via Container Insights
  • Adjust: Monthly reviews and optimization
  • Target: 60-70% average CPU/memory utilization
Development Environment:
  • ECS Service Scheduling: Scale desired count to 0 nights/weekends
  • EventBridge Rules: Automated start/stop scheduling
  • Potential Savings: 60-70% on development Fargate costs
  • NAT Gateway: Consider single NAT gateway for development
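
The night and weekend scale-down can be sketched as a small decision function that a scheduled job would call. The 07:00-19:00 weekday window is an assumption chosen for illustration; this guide does not specify the schedule.

```python
from datetime import datetime

WORK_START_HOUR = 7   # assumed start of the development day
WORK_END_HOUR = 19    # assumed end of the development day


def desired_count_at(now: datetime, running_count: int = 1) -> int:
    """Tasks a development service should run at the given local time."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    in_hours = WORK_START_HOUR <= now.hour < WORK_END_HOUR
    return running_count if is_weekday and in_hours else 0


def weekly_savings_fraction() -> float:
    """Share of the week the tasks are stopped under this schedule."""
    hours_on = 5 * (WORK_END_HOUR - WORK_START_HOUR)
    return 1 - hours_on / (7 * 24)


print(f"{weekly_savings_fraction():.0%} of weekly Fargate hours avoided")
```

With this window the tasks run 60 of 168 hours, avoiding about 64% of Fargate task hours, which falls inside the 60-70% range above. A scheduled job triggered by EventBridge could apply the result with boto3's `ecs.update_service(cluster=..., service=..., desiredCount=...)`.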

12-Month Cost Projection

Annual Infrastructure Budget

  • Months 1-3 (Dev + Basic Prod): $1,200
  • Months 4-9 (Full Production): $4,800
  • Months 10-12 (Optimized with RI): $1,800
  • Total Year 1: $7,800 (average $650/month)
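
The yearly total and monthly average follow directly from the three phases. The short sketch below also derives the implied monthly run rate per phase, which the projection does not state explicitly.

```python
# (phase, number of months, phase budget in USD)
PHASES = [
    ("Months 1-3", 3, 1200),
    ("Months 4-9", 6, 4800),
    ("Months 10-12", 3, 1800),
]

total = sum(budget for _, _, budget in PHASES)
months = sum(count for _, count, _ in PHASES)
average = total / months
run_rates = {label: budget / count for label, count, budget in PHASES}

print(f"total ${total}, average ${average:.0f}/month")
for label, rate in run_rates.items():
    print(f"{label}: ${rate:.0f}/month")
```

The implied run rates are $400, $800 and $600 per month for the three phases.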

Deployment Strategy

Infrastructure as Code

Infrastructure Templates

Template Structure:
# infrastructure/
├── templates/
│   ├── vpc.yaml              # Network foundation
│   ├── security.yaml         # Security groups & IAM
│   ├── compute.yaml          # ECS Fargate & load balancer
│   ├── database.yaml         # RDS & ElastiCache
│   ├── storage.yaml          # S3 & CloudFront
│   └── monitoring.yaml       # CloudWatch & alarms
├── parameters/
│   ├── dev.json             # Development parameters
│   └── prod.json            # Production parameters
└── deploy.sh                # Deployment script
Deployment Order:
  1. VPC & Networking: Foundation infrastructure
  2. Security: IAM roles and security groups
  3. Database: RDS and ElastiCache clusters
  4. Compute: ECS Fargate services and load balancers
  5. Storage: S3 buckets and CloudFront
  6. Monitoring: CloudWatch dashboards and alarms
Benefits:
  • Version Control: Track infrastructure changes
  • Reproducibility: Consistent environments
  • Rollback: Easy rollback on deployment issues
  • Documentation: Self-documenting infrastructure
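
The deployment order can be captured in a small script that emits one `aws cloudformation deploy` command per template. This is a sketch of what `deploy.sh` might do, not its actual contents: the stack naming pattern is hypothetical, and wiring in the `parameters/*.json` files is left out because their format is not shown here.

```python
# Templates in dependency order, matching the deployment order listed above.
DEPLOY_ORDER = ["vpc", "security", "database", "compute", "storage", "monitoring"]


def deploy_commands(env: str) -> list[list[str]]:
    """Build one AWS CLI invocation per stack, in dependency order."""
    if env not in ("dev", "prod"):
        raise ValueError(f"unknown environment: {env}")
    commands = []
    for name in DEPLOY_ORDER:
        commands.append(
            [
                "aws", "cloudformation", "deploy",
                "--template-file", f"infrastructure/templates/{name}.yaml",
                "--stack-name", f"{env}-reptidex-{name}",  # hypothetical naming
                "--capabilities", "CAPABILITY_NAMED_IAM",
            ]
        )
    return commands


for command in deploy_commands("dev"):
    print(" ".join(command))
```

Each command list could be run with `subprocess.run(command, check=True)` so that a failed stack stops the sequence before dependent stacks are attempted.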

Environment Provisioning Timeline

Implementation Timeline

  1. Week 1 (Development Environment): Basic VPC, single ECS Fargate service, RDS micro, S3 bucket
  2. Week 2 (Production Foundation): Multi-AZ VPC, security groups, IAM roles, SSL certificates
  3. Week 3 (Production Services): RDS Multi-AZ, ElastiCache, ECS Service Auto Scaling, Load Balancer
  4. Week 4 (CDN & Monitoring): CloudFront distribution, CloudWatch dashboards, alerting setup

This comprehensive AWS infrastructure setup provides ReptiDex with a scalable, secure, and cost-effective foundation for initial launch. The architecture supports growth from 1,000 to 5,000+ users while maintaining high availability and performance standards. Regular monitoring and optimization will ensure efficient resource utilization and cost management as the platform scales.