Current Architecture: ReptiDex uses GitHub Actions for CI/CD, deploying ARM64 Docker containers to AWS ECS Fargate via ECR. All infrastructure is managed through CloudFormation templates with rolling deployments and automated health checks.

CI/CD Pipeline & Deployment

Complete continuous integration and deployment strategy for ReptiDex, covering GitHub Actions workflows, Docker containerization, automated testing, deployment pipelines, and release management across development and production environments.

CI/CD Pipeline Overview

Pipeline Architecture Strategy

Development Flow

Feature Development
  • Feature branch creation
  • Automated testing on PR
  • Code quality checks
  • Staging deployment

Integration Flow

Continuous Integration
  • Automated builds
  • Comprehensive test suites
  • Security scanning
  • Artifact creation

Deployment Flow

Continuous Deployment
  • ECR image push
  • ECS service updates
  • Rolling deployment
  • Automated rollback via deployment circuit breaker
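
The automated rollback step relies on the ECS deployment circuit breaker. A minimal CloudFormation sketch of how it is typically enabled on a service (resource names here are illustrative, not taken from the actual templates):

WebService:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref EcsCluster
    TaskDefinition: !Ref WebTaskDefinition
    DesiredCount: 2
    DeploymentConfiguration:
      MaximumPercent: 200
      MinimumHealthyPercent: 100
      DeploymentCircuitBreaker:
        Enable: true     # stop deployments that cannot reach a steady state
        Rollback: true   # automatically roll back to the last healthy deployment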

Pipeline Stages and Gates

  • Development Stage
  • Integration Stage
  • Deployment Stage

Feature Development Workflow

Branch Strategy:
  • Main Branch: Production-ready code only
  • Development Branch: Integration of completed features
  • Feature Branches: Individual feature development
  • Hotfix Branches: Critical production fixes
Pull Request Process:
  1. Automated Checks: Linting, type checking, security scan
  2. Test Execution: Unit, integration, and e2e tests
  3. Code Review: Peer review requirements (minimum 2 reviewers)
  4. Staging Deploy: Automatic deployment to staging environment
  5. QA Validation: Manual testing and acceptance criteria
  6. Merge Approval: Final approval and merge to development
Quality Gates:
  • Code Coverage: Minimum 80% test coverage required
  • Security Scan: No critical or high vulnerabilities
  • Performance: No regressions in key performance metrics
  • Documentation: Updated documentation for new features
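
Branch protection typically requires each of these gates as a passing status check. One common pattern is a single aggregate job that branch protection can require instead of listing every check individually; a sketch of such a job fragment, assuming the code-quality and unit-tests jobs defined in the PR workflow below:

  quality-gate:
    needs: [code-quality, unit-tests]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - name: Fail if any required job did not succeed
        run: |
          if [ "${{ contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}" = "true" ]; then
            echo "One or more required quality gates failed."
            exit 1
          fi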

GitHub Actions Workflows

Core Workflow Configuration

  • Pull Request Workflow
  • Main Branch Workflow
  • Hotfix Workflow

PR Validation Pipeline

name: Pull Request Validation

on:
  pull_request:
    branches: [development, main]
    types: [opened, synchronize, reopened]

jobs:
  code-quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
          cache: 'pip'
          
      - name: Install dependencies
        run: |
          npm ci
          pip install -r requirements.txt
          
      - name: Run linting
        run: |
          npm run lint
          ruff check .
          black --check .
          
      - name: Type checking
        run: |
          npm run type-check
          mypy .
          
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: javascript,python

      - name: Security scanning
        uses: github/codeql-action/analyze@v3

  unit-tests:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
          POSTGRES_DB: reptidex_test
        ports:
          - 5432:5432  # publish so tests running on the runner can connect
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
          
      redis:
        image: redis:7
        ports:
          - 6379:6379  # publish so tests running on the runner can connect
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup test environment
        run: |
          npm ci
          pip install -r requirements-test.txt
          
      - name: Run frontend tests
        run: npm run test:coverage
        
      - name: Run backend tests
        run: |
          pytest --cov=src --cov-report=xml --cov-fail-under=80
          
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage.xml,./coverage/lcov.info

  build-and-deploy-staging:
    needs: [code-quality, unit-tests]
    runs-on: ubuntu-latest
    if: github.event.pull_request.head.ref != 'main'
    
    steps:
      - uses: actions/checkout@v4
      
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
          
      - name: Build and push Docker images
        run: |
          # Build all service images
          docker-compose -f docker-compose.staging.yml build
          
          # Push to ECR
          aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
          docker-compose -f docker-compose.staging.yml push
          
      - name: Deploy to staging
        run: |
          # Update ECS service with new task definitions
          aws ecs update-service --cluster reptidex-staging --service repti-web --force-new-deployment
          
      - name: Run integration tests
        run: |
          # Wait for deployment to complete
          aws ecs wait services-stable --cluster reptidex-staging --services repti-web
          
          # Run integration test suite against staging
          npm run test:integration -- --baseUrl=https://staging.reptidex.com
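
Main Branch Pipeline

The main-branch and hotfix workflows reuse the same building blocks, with production as the target. A condensed sketch of the main-branch trigger and production deploy (job contents abbreviated; compose file, service, and registry names are assumptions):

name: Main Branch Deployment

on:
  push:
    branches: [main]

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Build, tag, and push production images
        run: |
          aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
          docker-compose -f docker-compose.production.yml build
          docker-compose -f docker-compose.production.yml push

  deploy-production:
    needs: build-and-push
    runs-on: ubuntu-latest
    environment: production  # optional manual approval gate via GitHub environments
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Update ECS service (rolling deployment)
        run: |
          aws ecs update-service --cluster reptidex-production --service repti-web --force-new-deployment
          aws ecs wait services-stable --cluster reptidex-production --services repti-web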

Workflow Optimization

Performance Optimizations

  • Dependency Caching: npm and pip package caching
  • Docker Layer Caching: Multi-stage build optimization
  • Test Result Caching: Skip unchanged test suites
  • Build Artifact Caching: Reuse compiled assets
  • Matrix Builds: Multiple Node/Python versions
  • Parallel Tests: Split test suites across runners
  • Service Independence: Parallel service builds
  • Geographic Deployment: Multi-region parallel deployment
  • Path-based Triggers: Run only affected service tests
  • Change Detection: Skip unchanged components
  • Draft PR Handling: Lightweight validation for drafts
  • Scheduled Optimization: Dependency updates and cleanup
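
Two of the optimizations above in concrete form, path-based triggers and Docker layer caching with Buildx (paths, tags, and contexts are illustrative):

on:
  pull_request:
    paths:
      - 'frontend/**'
      - 'package*.json'

jobs:
  build-frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build image with layer caching
        uses: docker/build-push-action@v5
        with:
          context: ./frontend
          push: false
          tags: reptidex-web:pr-${{ github.event.pull_request.number }}
          cache-from: type=gha
          cache-to: type=gha,mode=max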

Docker Containerization Strategy

Container Architecture

  • Service Containers
  • Docker Compose
  • Container Optimization

Individual Service Dockerfiles

Frontend (Web) Container:
# Multi-stage build for frontend optimization
FROM node:20-alpine AS builder

WORKDIR /app
COPY package*.json ./
# Full install in the builder stage: the build step below needs devDependencies
RUN npm ci && npm cache clean --force

COPY . .
RUN npm run build

# Production stage
FROM nginx:alpine AS production

# Copy custom nginx configuration
COPY nginx.conf /etc/nginx/nginx.conf
COPY --from=builder /app/dist /usr/share/nginx/html

# Add health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD curl -f http://localhost/health || exit 1

EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Backend API Container:
FROM python:3.12-slim AS base

# System dependencies
RUN apt-get update && apt-get install -y \
    postgresql-client \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Python environment
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONPATH=/app

WORKDIR /app

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code
COPY . .

# Create non-root user
RUN adduser --disabled-password --gecos '' appuser && \
    chown -R appuser:appuser /app
USER appuser

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:8000/api/health || exit 1

EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "4", "app.wsgi:application"]
Worker Container:
FROM python:3.12-slim AS worker

# System dependencies for worker tasks
RUN apt-get update && apt-get install -y \
    imagemagick \
    ffmpeg \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    C_FORCE_ROOT=1

WORKDIR /app

# Install dependencies
COPY requirements.txt requirements-worker.txt ./
RUN pip install --no-cache-dir -r requirements.txt -r requirements-worker.txt

COPY . .

# Worker-specific health check
HEALTHCHECK --interval=60s --timeout=15s --start-period=60s --retries=3 \
  CMD celery -A app.celery inspect ping || exit 1

CMD ["celery", "-A", "app.celery", "worker", "--loglevel=info", "--concurrency=4"]

Deployment Strategy

Environment Management

Environment Strategy

Development
  • Purpose: Local development and debugging
  • Data: Synthetic test data
  • Scale: Single instance services
  • Updates: Continuous deployment

Staging
  • Purpose: Integration and QA testing
  • Data: Production-like test data
  • Scale: Reduced production replica
  • Updates: Automatic from feature branches

Pre-Production
  • Purpose: Final validation before release
  • Data: Anonymized production data
  • Scale: Full production scale
  • Updates: Manual deployment gates

Production
  • Purpose: Live customer-facing application
  • Data: Real customer data
  • Scale: High availability, multi-AZ
  • Updates: Scheduled maintenance windows
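
One way these per-environment differences can be encoded in the CloudFormation templates is a mappings block keyed by an Environment parameter (a sketch; keys and values are illustrative):

Mappings:
  EnvironmentConfig:
    development:
      DesiredCount: 1
      Cpu: 256
    staging:
      DesiredCount: 2
      Cpu: 512
    production:
      DesiredCount: 4
      Cpu: 1024

Resources:
  WebService:
    Type: AWS::ECS::Service
    Properties:
      DesiredCount: !FindInMap [EnvironmentConfig, !Ref Environment, DesiredCount]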

Deployment Strategies

  • Blue-Green Deployment
  • Rolling Deployment
  • Canary Deployment

Zero-Downtime Deployment Strategy

Blue-Green Infrastructure:
#!/bin/bash
# Blue-Green deployment script

# Configuration
CLUSTER="reptidex-production"
SERVICE_NAME="reptidex-web"
NEW_VERSION=$1

# Determine the active environment from the listener's current default target group
ACTIVE_TG_ARN=$(aws elbv2 describe-listeners \
  --listener-arns "$LISTENER_ARN" \
  --query 'Listeners[0].DefaultActions[0].TargetGroupArn' \
  --output text)

if [[ "$ACTIVE_TG_ARN" == "$BLUE_TG_ARN" ]]; then
    ACTIVE_ENV="blue"
    INACTIVE_ENV="green"
    INACTIVE_TG_ARN=$GREEN_TG_ARN
else
    ACTIVE_ENV="green"
    INACTIVE_ENV="blue"
    INACTIVE_TG_ARN=$BLUE_TG_ARN
fi

echo "Deploying version $NEW_VERSION to $INACTIVE_ENV environment"

# Deploy to inactive environment
aws ecs update-service \
  --cluster $CLUSTER \
  --service "${SERVICE_NAME}-${INACTIVE_ENV}" \
  --task-definition "${SERVICE_NAME}:${NEW_VERSION}" \
  --force-new-deployment

# Wait for deployment to stabilize
echo "Waiting for deployment to stabilize..."
aws ecs wait services-stable \
  --cluster $CLUSTER \
  --services "${SERVICE_NAME}-${INACTIVE_ENV}"

# Health check on inactive environment
echo "Running health checks..."
./scripts/health_check.sh "https://${INACTIVE_ENV}.reptidex.com"

if [ $? -eq 0 ]; then
    echo "Health checks passed. Switching traffic..."
    
    # Switch load balancer target group
    aws elbv2 modify-listener \
      --listener-arn $LISTENER_ARN \
      --default-actions Type=forward,TargetGroupArn=$INACTIVE_TG_ARN
      
    echo "Traffic switched to $INACTIVE_ENV environment"
    
    # Run post-deployment validation
    ./scripts/validate_production.sh
    
    if [ $? -eq 0 ]; then
        echo "Deployment successful!"
        
        # Scale down old environment
        aws ecs update-service \
          --cluster $CLUSTER \
          --service "${SERVICE_NAME}-${ACTIVE_ENV}" \
          --desired-count 0
          
    else
        echo "Post-deployment validation failed. Rolling back..."
        # Rollback logic here
        exit 1
    fi
else
    echo "Health checks failed. Deployment aborted."
    exit 1
fi
Health Check Scripts:
#!/bin/bash
# health_check.sh

BASE_URL=$1
MAX_ATTEMPTS=30
SLEEP_TIME=10

for i in $(seq 1 $MAX_ATTEMPTS); do
    echo "Health check attempt $i/$MAX_ATTEMPTS"
    
    # Check application health endpoint
    HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" "$BASE_URL/api/health")
    
    if [ $HTTP_CODE -eq 200 ]; then
        echo "✅ Health check endpoint passed"
        
        # Check database connectivity
        DB_CHECK=$(curl -s "$BASE_URL/api/health/db" | jq -r '.status')
        if [ "$DB_CHECK" = "healthy" ]; then
            echo "✅ Database connectivity check passed"
            
            # Check Redis connectivity
            REDIS_CHECK=$(curl -s "$BASE_URL/api/health/cache" | jq -r '.status')
            if [ "$REDIS_CHECK" = "healthy" ]; then
                echo "✅ All health checks passed"
                exit 0
            fi
        fi
    fi
    
    echo "❌ Health check failed, retrying in ${SLEEP_TIME}s..."
    sleep $SLEEP_TIME
done

echo "❌ Health checks failed after $MAX_ATTEMPTS attempts"
exit 1
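
Canary Traffic Shifting (planned)

Canary deployment, listed as a planned capability, could reuse the same load balancer by sending a small weighted share of traffic to a canary target group. A CloudFormation sketch of a weighted listener action (resource names and weights are illustrative):

ProductionListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref ApplicationLoadBalancer
    Port: 443
    Protocol: HTTPS
    Certificates:
      - CertificateArn: !Ref TlsCertificate
    DefaultActions:
      - Type: forward
        ForwardConfig:
          TargetGroups:
            - TargetGroupArn: !Ref StableTargetGroup
              Weight: 90   # majority of traffic stays on the current version
            - TargetGroupArn: !Ref CanaryTargetGroup
              Weight: 10   # small share routed to the new version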

Database Migration Strategy

Database Migration Management

  • Backward Compatibility: Migrations must not break existing code
  • Rollback Plans: Every migration has a tested rollback script
  • Data Preservation: No data loss during schema changes
  • Performance Impact: Large table migrations during maintenance windows
  • Staging First: All migrations tested on staging data
  • Automated Execution: Migrations run automatically in pipeline
  • Health Checks: Post-migration validation of data integrity
  • Monitoring: Real-time monitoring during migration execution
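
Automated execution in the pipeline typically means running migrations as a one-off task before the service update. A sketch as a workflow job, assuming a dedicated migration task definition (task, subnet, and security group names are placeholders):

  run-migrations:
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Run database migrations as a one-off ECS task
        run: |
          # reptidex-migrate is a hypothetical task definition whose container runs the migration command
          aws ecs run-task \
            --cluster reptidex-staging \
            --task-definition reptidex-migrate \
            --launch-type FARGATE \
            --network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNET_IDS],securityGroups=[$MIGRATION_SG_ID],assignPublicIp=DISABLED}"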

Current Implementation Status: ReptiDex currently implements a subset of the strategies documented above.
Active:
  • GitHub Actions workflows for CI/CD
  • Docker containerization with multi-stage builds for ARM64
  • Automated testing (unit, integration, linting)
  • ECR container registry
  • ECS Fargate rolling deployment strategy
  • Deployment circuit breaker with automatic rollback
  • CloudFormation-managed infrastructure
Planned/Future:
  • Blue-Green deployment (infrastructure exists but not actively used)
  • Canary deployment capabilities
  • Comprehensive E2E test automation
  • Advanced database migration tooling
See Current Deployment Guide for the active deployment architecture.