Current Architecture: ReptiDex uses GitHub Actions for CI/CD, deploying ARM64 Docker containers to AWS ECS Fargate via ECR. All infrastructure is managed through CloudFormation templates with rolling deployments and automated health checks.

CI/CD Pipeline & Deployment

Complete continuous integration and deployment strategy for ReptiDex, covering GitHub Actions workflows, Docker containerization, automated testing, deployment pipelines, and release management across development and production environments.

CI/CD Pipeline Overview

Pipeline Architecture Strategy

Development Flow

Feature Development
  • Feature branch creation
  • Automated testing on PR
  • Code quality checks
  • Staging deployment

Integration Flow

Continuous Integration
  • Automated builds
  • Comprehensive test suites
  • Security scanning
  • Artifact creation

Deployment Flow

Continuous Deployment
  • ECR image push
  • ECS service updates
  • Rolling deployment
  • Automated rollback via deployment circuit breaker
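
The automated rollback step relies on the ECS deployment circuit breaker. A minimal CloudFormation sketch of how it is typically enabled on a service (resource names here are illustrative, not taken from the actual templates):

WebService:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref EcsCluster
    TaskDefinition: !Ref WebTaskDefinition
    DesiredCount: 2
    DeploymentConfiguration:
      MaximumPercent: 200
      MinimumHealthyPercent: 100
      DeploymentCircuitBreaker:
        Enable: true     # stop deployments that cannot reach a steady state
        Rollback: true   # automatically roll back to the last healthy deployment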

Pipeline Stages and Gates

  • Development Stage
  • Integration Stage
  • Deployment Stage

Feature Development Workflow

Branch Strategy:
  • Main Branch: Production-ready code only
  • Development Branch: Integration of completed features
  • Feature Branches: Individual feature development
  • Hotfix Branches: Critical production fixes
Pull Request Process:
  1. Automated Checks: Linting, type checking, security scan
  2. Test Execution: Unit, integration, and e2e tests
  3. Code Review: Peer review requirements (minimum 2 reviewers)
  4. Staging Deploy: Automatic deployment to staging environment
  5. QA Validation: Manual testing and acceptance criteria
  6. Merge Approval: Final approval and merge to development
Quality Gates:
  • Code Coverage: Minimum 80% test coverage required
  • Security Scan: No critical or high vulnerabilities
  • Performance: No regressions in key performance metrics
  • Documentation: Updated documentation for new features
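
Branch protection typically requires each of these gates as a passing status check. One common pattern is a single aggregate job that branch protection can require instead of listing every check individually; a sketch of such a job fragment, assuming the code-quality and unit-tests jobs defined in the PR workflow below:

  quality-gate:
    needs: [code-quality, unit-tests]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - name: Fail if any required job did not succeed
        run: |
          if [ "${{ contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}" = "true" ]; then
            echo "One or more required quality gates failed."
            exit 1
          fi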

GitHub Actions Workflows

Core Workflow Configuration

  • Pull Request Workflow
  • Main Branch Workflow
  • Hotfix Workflow

PR Validation Pipeline

name: Pull Request Validation

on:
  pull_request:
    branches: [development, main]
    types: [opened, synchronize, reopened]

jobs:
  code-quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
          cache: 'pip'
          
      - name: Install dependencies
        run: |
          npm ci
          pip install -r requirements.txt
          
      - name: Run linting
        run: |
          npm run lint
          ruff check .
          black --check .
          
      - name: Type checking
        run: |
          npm run type-check
          mypy .
          
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: javascript,python

      - name: Security scanning
        uses: github/codeql-action/analyze@v3

  unit-tests:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
          POSTGRES_DB: reptidex_test
        ports:
          - 5432:5432  # publish so tests running on the runner can connect
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
          
      redis:
        image: redis:7
        ports:
          - 6379:6379  # publish so tests running on the runner can connect
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup test environment
        run: |
          npm ci
          pip install -r requirements-test.txt
          
      - name: Run frontend tests
        run: npm run test:coverage
        
      - name: Run backend tests
        run: |
          pytest --cov=src --cov-report=xml --cov-fail-under=80
          
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage.xml,./coverage/lcov.info

  build-and-deploy-staging:
    needs: [code-quality, unit-tests]
    runs-on: ubuntu-latest
    if: github.event.pull_request.head.ref != 'main'
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
          
      - name: Build and push Docker images
        run: |
          # Build all service images
          docker-compose -f docker-compose.staging.yml build
          
          # Push to ECR
          aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
          docker-compose -f docker-compose.staging.yml push
          
      - name: Deploy to staging
        run: |
          # Update ECS service with new task definitions
          aws ecs update-service --cluster reptidex-staging --service repti-web --force-new-deployment
          
      - name: Run integration tests
        run: |
          # Wait for deployment to complete
          aws ecs wait services-stable --cluster reptidex-staging --services repti-web
          
          # Run integration test suite against staging
          npm run test:integration -- --baseUrl=https://staging.reptidex.com
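
Main Branch Pipeline

The main-branch and hotfix workflows reuse the same building blocks, with production as the target. A condensed sketch of the main-branch trigger and production deploy (job contents abbreviated; compose file, service, and registry names are assumptions):

name: Main Branch Deployment

on:
  push:
    branches: [main]

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Build, tag, and push production images
        run: |
          aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
          docker-compose -f docker-compose.production.yml build
          docker-compose -f docker-compose.production.yml push

  deploy-production:
    needs: build-and-push
    runs-on: ubuntu-latest
    environment: production  # optional manual approval gate via GitHub environments
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Update ECS service (rolling deployment)
        run: |
          aws ecs update-service --cluster reptidex-production --service repti-web --force-new-deployment
          aws ecs wait services-stable --cluster reptidex-production --services repti-web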

Workflow Optimization

Performance Optimizations

  • Dependency Caching: npm and pip package caching
  • Docker Layer Caching: Multi-stage build optimization
  • Test Result Caching: Skip unchanged test suites
  • Build Artifact Caching: Reuse compiled assets
  • Matrix Builds: Multiple Node/Python versions
  • Parallel Tests: Split test suites across runners
  • Service Independence: Parallel service builds
  • Geographic Deployment: Multi-region parallel deployment
  • Path-based Triggers: Run only affected service tests
  • Change Detection: Skip unchanged components
  • Draft PR Handling: Lightweight validation for drafts
  • Scheduled Optimization: Dependency updates and cleanup
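
Two of the optimizations above in concrete form, path-based triggers and Docker layer caching with Buildx (paths, tags, and contexts are illustrative):

on:
  pull_request:
    paths:
      - 'frontend/**'
      - 'package*.json'

jobs:
  build-frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build image with layer caching
        uses: docker/build-push-action@v5
        with:
          context: ./frontend
          push: false
          tags: reptidex-web:pr-${{ github.event.pull_request.number }}
          cache-from: type=gha
          cache-to: type=gha,mode=max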

Docker Containerization Strategy

Container Architecture

  • Service Containers
  • Docker Compose
  • Container Optimization

Individual Service Dockerfiles

Frontend (Web) Container:
# Multi-stage build for frontend optimization
FROM node:20-alpine AS builder

WORKDIR /app
COPY package*.json ./
# Full install in the builder stage: the build step below needs devDependencies
RUN npm ci && npm cache clean --force

COPY . .
RUN npm run build

# Production stage
FROM nginx:alpine AS production

# Copy custom nginx configuration
COPY nginx.conf /etc/nginx/nginx.conf
COPY --from=builder /app/dist /usr/share/nginx/html

# Add health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD curl -f http://localhost/health || exit 1

EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Backend API Container:
FROM python:3.12-slim AS base

# System dependencies
RUN apt-get update && apt-get install -y \
    postgresql-client \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Python environment
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONPATH=/app

WORKDIR /app

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code
COPY . .

# Create non-root user
RUN adduser --disabled-password --gecos '' appuser && \
    chown -R appuser:appuser /app
USER appuser

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:8000/api/health || exit 1

EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "4", "app.wsgi:application"]
Worker Container:
FROM python:3.12-slim AS worker

# System dependencies for worker tasks
RUN apt-get update && apt-get install -y \
    imagemagick \
    ffmpeg \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    C_FORCE_ROOT=1

WORKDIR /app

# Install dependencies
COPY requirements.txt requirements-worker.txt ./
RUN pip install --no-cache-dir -r requirements.txt -r requirements-worker.txt

COPY . .

# Worker-specific health check
HEALTHCHECK --interval=60s --timeout=15s --start-period=60s --retries=3 \
  CMD celery -A app.celery inspect ping || exit 1

CMD ["celery", "-A", "app.celery", "worker", "--loglevel=info", "--concurrency=4"]

Deployment Strategy

Environment Management

Environment Strategy

Development
  • Purpose: Local development and debugging
  • Data: Synthetic test data
  • Scale: Single instance services
  • Updates: Continuous deployment

Staging
  • Purpose: Integration and QA testing
  • Data: Production-like test data
  • Scale: Reduced production replica
  • Updates: Automatic from feature branches

Pre-Production
  • Purpose: Final validation before release
  • Data: Anonymized production data
  • Scale: Full production scale
  • Updates: Manual deployment gates

Production
  • Purpose: Live customer-facing application
  • Data: Real customer data
  • Scale: High availability, multi-AZ
  • Updates: Scheduled maintenance windows
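
One way these per-environment differences can be encoded in the CloudFormation templates is a mappings block keyed by an Environment parameter (a sketch; keys and values are illustrative):

Mappings:
  EnvironmentConfig:
    development:
      DesiredCount: 1
      Cpu: 256
    staging:
      DesiredCount: 2
      Cpu: 512
    production:
      DesiredCount: 4
      Cpu: 1024

Resources:
  WebService:
    Type: AWS::ECS::Service
    Properties:
      DesiredCount: !FindInMap [EnvironmentConfig, !Ref Environment, DesiredCount]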

Deployment Strategies

  • Blue-Green Deployment
  • Rolling Deployment
  • Canary Deployment

Zero-Downtime Deployment Strategy

Blue-Green Infrastructure:
#!/bin/bash
# Blue-Green deployment script

# Configuration
CLUSTER="reptidex-production"
SERVICE_NAME="reptidex-web"
NEW_VERSION=$1

# Determine the active environment from the listener's current default target group
ACTIVE_TG_ARN=$(aws elbv2 describe-listeners \
  --listener-arns "$LISTENER_ARN" \
  --query 'Listeners[0].DefaultActions[0].TargetGroupArn' \
  --output text)

if [[ "$ACTIVE_TG_ARN" == "$BLUE_TG_ARN" ]]; then
    ACTIVE_ENV="blue"
    INACTIVE_ENV="green"
    INACTIVE_TG_ARN=$GREEN_TG_ARN
else
    ACTIVE_ENV="green"
    INACTIVE_ENV="blue"
    INACTIVE_TG_ARN=$BLUE_TG_ARN
fi

echo "Deploying version $NEW_VERSION to $INACTIVE_ENV environment"

# Deploy to inactive environment
aws ecs update-service \
  --cluster $CLUSTER \
  --service "${SERVICE_NAME}-${INACTIVE_ENV}" \
  --task-definition "${SERVICE_NAME}:${NEW_VERSION}" \
  --force-new-deployment

# Wait for deployment to stabilize
echo "Waiting for deployment to stabilize..."
aws ecs wait services-stable \
  --cluster $CLUSTER \
  --services "${SERVICE_NAME}-${INACTIVE_ENV}"

# Health check on inactive environment
echo "Running health checks..."
./scripts/health_check.sh "https://${INACTIVE_ENV}.reptidex.com"

if [ $? -eq 0 ]; then
    echo "Health checks passed. Switching traffic..."
    
    # Switch load balancer target group
    aws elbv2 modify-listener \
      --listener-arn $LISTENER_ARN \
      --default-actions Type=forward,TargetGroupArn=$INACTIVE_TG_ARN
      
    echo "Traffic switched to $INACTIVE_ENV environment"
    
    # Run post-deployment validation
    ./scripts/validate_production.sh
    
    if [ $? -eq 0 ]; then
        echo "Deployment successful!"
        
        # Scale down old environment
        aws ecs update-service \
          --cluster $CLUSTER \
          --service "${SERVICE_NAME}-${ACTIVE_ENV}" \
          --desired-count 0
          
    else
        echo "Post-deployment validation failed. Rolling back..."
        # Rollback logic here
        exit 1
    fi
else
    echo "Health checks failed. Deployment aborted."
    exit 1
fi
Health Check Scripts:
#!/bin/bash
# health_check.sh

BASE_URL=$1
MAX_ATTEMPTS=30
SLEEP_TIME=10

for i in $(seq 1 $MAX_ATTEMPTS); do
    echo "Health check attempt $i/$MAX_ATTEMPTS"
    
    # Check application health endpoint
    HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "$BASE_URL/api/health")
    
    if [ $HTTP_CODE -eq 200 ]; then
        echo "✅ Health check endpoint passed"
        
        # Check database connectivity
        DB_CHECK=$(curl -s "$BASE_URL/api/health/db" | jq -r '.status')
        if [ "$DB_CHECK" = "healthy" ]; then
            echo "✅ Database connectivity check passed"
            
            # Check Redis connectivity
            REDIS_CHECK=$(curl -s "$BASE_URL/api/health/cache" | jq -r '.status')
            if [ "$REDIS_CHECK" = "healthy" ]; then
                echo "✅ All health checks passed"
                exit 0
            fi
        fi
    fi
    
    echo "❌ Health check failed, retrying in ${SLEEP_TIME}s..."
    sleep $SLEEP_TIME
done

echo "❌ Health checks failed after $MAX_ATTEMPTS attempts"
exit 1
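
Canary Traffic Shifting (planned)

Canary deployment, listed as a planned capability, could reuse the same load balancer by sending a small weighted share of traffic to a canary target group. A CloudFormation sketch of a weighted listener action (resource names and weights are illustrative):

ProductionListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref ApplicationLoadBalancer
    Port: 443
    Protocol: HTTPS
    Certificates:
      - CertificateArn: !Ref TlsCertificate
    DefaultActions:
      - Type: forward
        ForwardConfig:
          TargetGroups:
            - TargetGroupArn: !Ref StableTargetGroup
              Weight: 90   # majority of traffic stays on the current version
            - TargetGroupArn: !Ref CanaryTargetGroup
              Weight: 10   # small share routed to the new version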

Database Migration Strategy

Database Migration Management

  • Backward Compatibility: Migrations must not break existing code
  • Rollback Plans: Every migration has a tested rollback script
  • Data Preservation: No data loss during schema changes
  • Performance Impact: Large table migrations during maintenance windows
  • Staging First: All migrations tested on staging data
  • Automated Execution: Migrations run automatically in pipeline
  • Health Checks: Post-migration validation of data integrity
  • Monitoring: Real-time monitoring during migration execution
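
Automated execution in the pipeline typically means running migrations as a one-off task before the service update. A sketch as a workflow job, assuming a dedicated migration task definition (task, subnet, and security group names are placeholders):

  run-migrations:
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Run database migrations as a one-off ECS task
        run: |
          # reptidex-migrate is a hypothetical task definition whose container runs the migration command
          aws ecs run-task \
            --cluster reptidex-staging \
            --task-definition reptidex-migrate \
            --launch-type FARGATE \
            --network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNET_IDS],securityGroups=[$MIGRATION_SG_ID],assignPublicIp=DISABLED}"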

Current Implementation Status: ReptiDex currently implements a subset of the strategies documented above.
Active:
  • GitHub Actions workflows for CI/CD
  • Docker containerization with multi-stage builds for ARM64
  • Automated testing (unit, integration, linting)
  • ECR container registry
  • ECS Fargate rolling deployment strategy
  • Deployment circuit breaker with automatic rollback
  • CloudFormation-managed infrastructure
Planned/Future:
  • Blue-Green deployment (infrastructure exists but not actively used)
  • Canary deployment capabilities
  • Comprehensive E2E test automation
  • Advanced database migration tooling
See Current Deployment Guide for the active deployment architecture.