Database Architecture & Management
Detailed database architecture and management strategy for ReptiDex, covering PostgreSQL primary storage, Redis caching, schema design, performance optimization, backup strategies, and scaling approaches.
Quick Navigation
Database Overview
Architecture & strategy
PostgreSQL
Primary database setup
Redis Caching
Cache & session storage
Backup & Recovery
Data protection strategy
Database Architecture Overview
Multi-Database Strategy
PostgreSQL Primary
Relational Data Storage
- User accounts and authentication
- Animal records and breeding data
- Vivarium organizations and memberships
- Transactional commerce data
- Taxonomy and reference data
Redis Cache Layer
High-Performance Caching
- Session storage and management
- Frequently accessed data caching
- Real-time features and pub/sub
- Rate limiting and temporary data
Database Technology Decisions
- PostgreSQL Choice
- Redis Choice
- Alternative Considerations
Why PostgreSQL for Primary Storage
Technical Advantages:- ACID Compliance: Full transactional integrity for commerce and breeding records
- JSON Support: Native JSON/JSONB for flexible animal attribute storage
- Advanced Indexing: GIN, GiST indexes for complex queries and search
- Full-Text Search: Built-in search capabilities for animal and breeder discovery
- AWS RDS Integration: Managed service with Multi-AZ, automated backups
- Mature Ecosystem: Extensive Python/SQLAlchemy support
- Performance: Excellent performance for read-heavy workloads
- Scaling Options: Read replicas, connection pooling, partitioning
- Hierarchical Data: Support for nested taxonomies and lineage trees
- Geographic Queries: PostGIS extension for location-based features
- Array Support: Native array types for tag and morph listings
- Custom Types: Enum types for species, morphs, and breeding statuses
PostgreSQL Configuration
Database Schema Architecture
Schema Organization by Service
- • repti_core: Users, authentication, profiles, vivariums (orgs)
- • repti_animal: Animals, records, breeding, lineage
- • repti_commerce: Listings, transactions, payments
- • repti_community: Messages, forums, notifications
- • repti_media: File storage, image metadata, processing
- • repti_ops: System logs, metrics, configuration
- • taxonomy: Species, morphs, traits (shared reference)
- • audit: Change tracking, compliance, security logs
Data Relationships and Schema Organization
The database schemas are organized by service boundaries with clear data relationships:Key Entity Relationships
- • Users → Animals (ownership relationship)
- • Users → Vivariums (organization membership)
- • Vivariums → Animals (organizational grouping)
- • Animals → Breeding Events (genetic tracking)
- • Animals → Listings (marketplace integration)
- • Species/Morphs → Animals (taxonomic classification)
- • Foreign keys enforce referential integrity
- • Shared taxonomy tables across all schemas
- • Audit trails for all transactional data
- • Media references linked to primary entities
- • Geographic data for location-based features
Detailed schema designs are maintained within their respective component documentation to ensure single-source maintenance and prevent synchronization issues. Refer to individual service documentation for complete table structures and relationships.
Database Performance Optimization
Performance Tuning Strategy
- • Primary Keys: UUID with sequential generation
- • Foreign Keys: All relationships indexed
- • Query Patterns: Composite indexes for common filters
- • JSON Fields: GIN indexes for JSONB columns
- • Full-Text Search: GIN indexes on tsvector columns
- • Connection Pooling: PgBouncer for connection management
- • Query Analysis: Regular EXPLAIN ANALYZE reviews
- • Pagination: Cursor-based for large result sets
- • Aggregations: Materialized views for complex reports
- • Bulk Operations: COPY and batch inserts
- • Slow Query Log: Track queries >1 second
- • Index Usage: Monitor unused and duplicate indexes
- • VACUUM: Automated maintenance scheduling
- • Statistics: Regular ANALYZE for query planner
- • Bloat Monitoring: Table and index bloat tracking
Redis Cache Configuration
Cache Architecture Strategy
- Session Management
- Query Result Caching
- Real-time Features
- Rate Limiting
Redis Performance Configuration
Redis Optimization Settings
- • Max Memory: 80% of available RAM
- • Eviction Policy: allkeys-lru for cache data
- • Memory Sampling: 5 samples for LRU decisions
- • Compression: Hash and list compression enabled
- • Key Expiration: Active expiry of 10% keys per cycle
- • RDB Snapshots: Every 15 minutes if >1 change
- • AOF: Enabled with everysec fsync policy
- • AOF Rewrite: Auto-rewrite when 100% size growth
- • Backup: Daily snapshots to S3
- • Multi-AZ: ElastiCache automatic failover
Backup & Recovery Strategy
Comprehensive Data Protection
- PostgreSQL Backups
- Redis Persistence
- Disaster Recovery
RDS Automated Backup Strategy
Automated Daily Backups:- Backup Window: 3:00-4:00 AM EST (low traffic period)
- Retention Period: 30 days for production, 7 days for development
- Point-in-Time Recovery: Available within retention period
- Cross-Region Backup: Automated replication to secondary region
- User Data: Weekly export of user accounts and profiles
- Transaction Data: Daily export of commerce transactions
- Animal Records: Weekly export of breeding and lineage data
- System Configuration: Monthly export of settings and configurations
- Automated Testing: Weekly restore tests to staging environment
- Data Integrity: Checksums and row count validation
- Recovery Time Testing: Monthly full recovery time measurement
- Cross-Region Validation: Quarterly disaster recovery drills
Backup Monitoring and Alerting
Backup Health Monitoring
- • Backup Completion: Success/failure alerts
- • Size Validation: Backup size consistency checks
- • Time Tracking: Backup duration monitoring
- • Storage Usage: Backup storage capacity alerts
- • Checksum Validation: Automatic data integrity verification
- • Row Count Verification: Table count consistency checks
- • Foreign Key Validation: Referential integrity testing
- • Sample Data Testing: Random data sampling verification
- • Restore Testing: Weekly automated restore validation
- • Performance Baseline: Recovery time benchmarking
- • Dependency Verification: External service availability
- • Documentation Updates: Automated procedure updates
Database Scaling Strategy
Horizontal and Vertical Scaling
- PostgreSQL Scaling
- Redis Scaling
- Database Sharding
Read Replica Strategy
Read Replica Configuration:- Geographic Distribution: Read replicas in user-dense regions
- Query Routing: Automatic read/write splitting at application level
- Lag Monitoring: < 5 second replication lag tolerance
- Failover Capability: Promote replica to primary if needed
- Time-based Partitioning: Care logs and system logs by month
- Hash Partitioning: Large tables by user_id hash
- Range Partitioning: Transactions by date ranges
- Automated Maintenance: Partition creation and cleanup scripts
- CPU Utilization: Scale up when >80% for 10 minutes
- Memory Usage: Scale up when >85% for 5 minutes
- Connection Count: Scale up when >80% of max connections
- IOPS Utilization: Scale up when >70% of provisioned IOPS
Database Maintenance Automation
Automated Maintenance Tasks
- • VACUUM: Weekly full vacuum, daily auto-vacuum
- • ANALYZE: Statistics update after significant data changes
- • REINDEX: Monthly index rebuilding for large tables
- • Log Rotation: Daily cleanup of old log files
- • Connection Monitoring: Idle connection cleanup
- • Bloat Detection: Weekly table and index bloat reports
- • Memory Defragmentation: Weekly active defragmentation
- • Key Expiration: Automated cleanup of expired keys
- • AOF Rewrite: Automatic AOF compaction when needed
- • Snapshot Management: Automated snapshot cleanup
- • Performance Monitoring: Key distribution analysis
- • Cache Hit Rate: Monitoring and optimization alerts
This comprehensive database architecture provides ReptiDex with a robust, scalable foundation for data storage and retrieval. The combination of PostgreSQL for transactional data and Redis for high-performance caching ensures optimal performance while maintaining data integrity and providing clear scaling paths for future growth.

