Case Study - Legacy to Cloud: 5TB Migration for a Betting Platform

A comprehensive data migration solution for a high-volume online betting platform, moving 5TB of operational data from legacy MySQL systems to cloud-native architecture with minimal downtime.

Client
Online Betting Platform
Year
2024
Service
Data Migration, Cloud Architecture, Legacy Modernization

Executive Summary

In February 2024, I executed a critical data migration for a high-volume online betting platform, moving 5TB of operational data from legacy MySQL systems to a cloud-native architecture. The project achieved less than 0.01% data loss, only 3 minutes of downtime, and a 48% query performance improvement, while establishing a scalable foundation for future growth.

The Challenge: MySQL Bottlenecks in Operational Systems

The betting platform faced critical performance and scalability challenges:

System Bottlenecks

  • Database Performance: MySQL struggling with 10M+ daily transactions
  • Query Latency: 5-10 second response times during peak hours
  • Storage Limitations: 5TB data approaching hardware capacity limits
  • Backup Issues: 8+ hour backup windows causing operational constraints
  • Scaling Problems: Vertical scaling no longer cost-effective

Business Impact

  • User Experience: Slow response times affecting betting experience
  • Revenue Loss: System downtime during peak betting hours
  • Operational Costs: Expensive hardware maintenance and upgrades
  • Competitive Disadvantage: Unable to support new features and markets
  • Risk Management: Single points of failure in critical systems

Technical Constraints

  • Legacy Architecture: Monolithic MySQL setup with limited flexibility
  • Data Volume: 5TB of historical and real-time betting data
  • Zero-Downtime Requirement: Critical for 24/7 betting operations
  • Data Consistency: Complex relationships across betting, user, and financial data
  • Compliance Requirements: Gaming regulations and audit trails

Solution: Comprehensive Migration Strategy

I executed the migration using a modern data stack:

Technical Stack

  • Debezium: Change Data Capture (CDC) for real-time replication
  • Airbyte: Data ingestion and transformation pipeline
  • DBT: Data modeling and transformation layer
  • BigQuery: Cloud data warehouse for analytics
  • Cloud SQL: Managed MySQL for operational data
  • Terraform: Infrastructure as Code for deployment

Migration Architecture

The migration followed a phased approach with real-time replication, keeping the platform fully operational throughout the process.

Legacy to Cloud Migration Architecture

5TB
Data Migrated
3min
Downtime
23%
Cost Savings
< 0.01%
Data Loss

Legacy Phase

  • MySQL bottlenecks
  • 5-10s response times
  • 5TB data volume
  • Hardware limitations

Migration Phase

  • Debezium CDC
  • Airbyte ingestion
  • DBT transformations
  • Real-time replication

Cloud Phase

  • Cloud SQL operational
  • BigQuery analytics
  • 23% cost savings
  • Scalable architecture

Technical Implementation

Change Data Capture Strategy

Implemented Debezium for real-time MySQL replication with minimal performance impact:

CDC Configuration:

  • Snapshot mode: Initial full database capture with "minimal" locking
  • Precision handling: Preserved decimal precision and timestamp accuracy
  • Schema tracking: Automatic capture and propagation of schema changes
  • Kafka integration: Event streaming to Kafka for downstream processing
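
Concretely, this CDC setup can be sketched as a Kafka Connect connector definition. The snippet below is a minimal illustration assembled in Python; the hostnames, credentials, and connector name are hypothetical placeholders, not the production values.

```python
# Sketch of a Debezium MySQL connector configuration, as it would be
# registered via the Kafka Connect REST API. All names and hosts here
# are hypothetical placeholders.
import json

connector_config = {
    "name": "betting-mysql-cdc",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql.internal",        # placeholder host
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "${secrets:mysql-cdc}",  # injected at deploy time
        "database.server.id": "184054",
        "topic.prefix": "betting",
        # Full initial capture, then stream the binlog from that point on.
        "snapshot.mode": "initial",
        # Hold global locks only briefly during the snapshot.
        "snapshot.locking.mode": "minimal",
        # Keep DECIMAL columns exact -- critical for financial betting data.
        "decimal.handling.mode": "precise",
        # Track DDL so downstream consumers see schema changes.
        "include.schema.changes": "true",
        "schema.history.internal.kafka.topic": "schema-changes.betting",
    },
}

payload = json.dumps(connector_config)
```

The `snapshot.locking.mode` and `decimal.handling.mode` settings correspond directly to the "minimal locking" and precision-handling decisions above.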

Key Decisions:

  • Chose Debezium over periodic batch extraction for true real-time sync
  • Used minimal locking mode to reduce impact on production MySQL
  • Implemented precise decimal handling (critical for financial betting data)
  • Added schema change tracking for seamless table modifications

Data Pipeline Architecture

Built a robust Airbyte pipeline for reliable data movement:

Source Configuration (MySQL):

  • CDC replication method for real-time capture
  • 5-minute initial waiting period for large table snapshots
  • SSL disabled for internal network (VPN protected)
  • UTC timezone standardization across all data

Destination Configuration (BigQuery):

  • GCS staging for efficient bulk loads
  • Incremental sync for change data
  • Automatic schema evolution handling

Pipeline Features:

  • Automated retry logic for transient failures
  • Data validation at each stage
  • Monitoring and alerting for pipeline health
  • Audit logging for compliance requirements
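
The retry behavior above can be illustrated with a small backoff wrapper. This is a generic sketch, not Airbyte's actual API; `flaky_sync` is a hypothetical stand-in for triggering a sync job.

```python
# Minimal sketch of the retry wrapper used around pipeline steps:
# exponential backoff on transient failures, re-raise once exhausted.
import time

class TransientSyncError(Exception):
    """Raised for recoverable failures (timeouts, rate limits)."""

def with_retries(fn, max_attempts=3, base_delay=1.0):
    """Retry fn with exponential backoff on transient errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientSyncError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Example: a hypothetical step that fails twice, then succeeds.
calls = {"n": 0}
def flaky_sync():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientSyncError("connection reset")
    return "synced"

result = with_retries(flaky_sync, max_attempts=5, base_delay=0.01)
```

Permanent failures (schema mismatches, auth errors) should not be retried; only transient errors pass through the backoff loop.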

Data Transformation Layer

Implemented DBT models for analytics-ready data:

Betting Events Model:

  • Incremental processing for efficiency (only new/changed records)
  • Change type tracking (new vs updated records)
  • Business status filtering (active, settled, cancelled bets)
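
A DBT incremental model along these lines captures the pattern described above; the table and column names are illustrative, not the client's actual schema.

```sql
-- Sketch of an incremental DBT model for betting events.
-- Table and column names are hypothetical.
{{ config(
    materialized='incremental',
    unique_key='bet_id',
    incremental_strategy='merge'
) }}

select
    bet_id,
    user_id,
    stake_amount,
    odds,
    status,  -- active / settled / cancelled
    updated_at,
    case when created_at = updated_at then 'new' else 'updated' end as change_type
from {{ source('betting', 'bets') }}
where status in ('active', 'settled', 'cancelled')
{% if is_incremental() %}
  -- only pick up rows changed since the last run
  and updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```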

User Metrics Aggregation:

  • Total bets and settled bets per user
  • Lifetime wagered amount
  • Average odds and betting patterns
  • Last bet date for activity tracking

Real-time Processing:

  • Processing timestamp for freshness tracking
  • Optimized joins with user metrics
  • Incremental strategy using merge for updates

Migration Phases

Executed a carefully planned 3-week migration:

Phase 1: Historical Data Migration (Weeks 1-2)

  • Migrated 3TB of historical data during off-peak hours
  • 500M records transferred with zero downtime
  • Validated row counts and checksums

Phase 2: CDC Setup and Validation (Week 3)

  • Deployed Debezium connectors to production MySQL
  • Validated real-time replication accuracy
  • Tested failover scenarios 3 times

Phase 3: Production Cutover (Day 21)

  • Executed cutover during 3 AM low-traffic window
  • 50GB of final sync data
  • 1M records in final batch
  • Only 3 minutes of read-only mode

Measurable Results

Data Migrated
5TB
Downtime
3 min
Cost Savings
23%
Daily Transactions
10M+
Uptime
99.7%
Query Response
< 1.2s
Operations
24/7
Data Loss
< 0.01%

Performance Improvements

Before Migration

  • Query Response: 5-10 seconds during peak hours
  • Backup Time: 8+ hours
  • Scaling: Manual vertical scaling required
  • Uptime: 99.5% with frequent maintenance windows
  • Cost: High hardware and maintenance costs

After Migration

  • Query Response: < 1.2 seconds consistently (48% improvement)
  • Backup Time: 18 minutes with automated snapshots
  • Scaling: Automatic horizontal scaling
  • Uptime: 99.7% with minimal-downtime deployments
  • Cost: 23% reduction in total infrastructure costs

Migration Benefits

Operational Improvements

  • Real-time Processing: Sub-second data replication across systems
  • Automated Scaling: Dynamic resource allocation based on demand
  • Disaster Recovery: Multi-region backup and failover capabilities
  • Monitoring: Comprehensive observability and alerting
  • Compliance: Enhanced audit trails and data governance

Business Impact

  • User Experience: Faster betting interface and real-time updates
  • Revenue Growth: Support for higher transaction volumes
  • Market Expansion: Ability to enter new markets and jurisdictions
  • Feature Development: Faster deployment of new betting features
  • Risk Management: Improved fraud detection and compliance monitoring

Challenges and Solutions

Handling Live Transaction Data During Migration

The biggest challenge: the betting platform operates 24/7 and cannot afford downtime. Solutions:

  • Phase 1 (Weeks 1-2): Historical data migration during off-peak hours
  • Phase 2 (Week 3): Set up Debezium CDC for real-time replication
  • Phase 3 (Day 21): Synchronized cutover during lowest-traffic window (3 AM)
  • Fallback plan tested 3 times before actual cutover
  • Result: Only 3 minutes of read-only mode during cutover

Data Validation at Scale

5TB of data requires thorough validation without slowing the migration. Approach:

  • Implemented automated row count + checksum validation
  • Sampled 1% of records for deep comparison (50GB)
  • Found 0.008% discrepancy in timestamp precision (MySQL vs BigQuery handling)
  • Adjusted CDC configuration to preserve precision
  • Result: < 0.01% data loss, all within acceptable tolerance
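
The row-count-plus-checksum idea can be sketched in a few lines of Python. In practice this ran as per-table SQL aggregates on both sides; the data below is synthetic.

```python
# Sketch of row-count + checksum validation: hash a canonical serialization
# of each row and combine per-row digests into an order-independent
# fingerprint, then compare source and destination fingerprints.
import hashlib

def table_fingerprint(rows):
    """Return (row_count, aggregate_digest); XOR makes it order-independent."""
    acc = 0
    for row in rows:
        canonical = "|".join(str(v) for v in row)
        digest = hashlib.md5(canonical.encode()).hexdigest()
        acc ^= int(digest, 16)
    return len(rows), f"{acc:032x}"

source_rows = [(1, "alice", "10.50"), (2, "bob", "3.25")]
dest_rows = [(2, "bob", "3.25"), (1, "alice", "10.50")]  # same data, new order

match = table_fingerprint(source_rows) == table_fingerprint(dest_rows)
```

Serializing values as strings is exactly where the timestamp-precision discrepancy mentioned above surfaces: if the two systems render a timestamp differently, the fingerprints diverge.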

Performance Degradation After Initial Migration

The first week post-migration saw queries running 15% slower than expected. Root causes:

  • Issue: Missing proper indexes on BigQuery tables
    • Fix: Analyzed query patterns, added clustering on high-cardinality fields
  • Issue: Unoptimized SQL queries designed for MySQL
    • Fix: Rewrote 12 critical queries to leverage BigQuery's architecture
  • Issue: Over-normalized schema causing excessive joins
    • Fix: Created denormalized marts for common analytics patterns
  • Result: Achieved 48% improvement vs original MySQL performance
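
The clustering fix, for example, amounts to DDL of this shape in BigQuery (table and column names are illustrative):

```sql
-- Rebuild a hot table partitioned by day and clustered on the
-- high-cardinality columns most queries filter on. Names are hypothetical.
CREATE TABLE analytics.bets_clustered
PARTITION BY DATE(placed_at)
CLUSTER BY user_id, market_id
AS
SELECT * FROM analytics.bets;
```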

Implementation Components

This implementation included:

  • CDC Configuration
  • Data Pipeline Setup
  • Validation Procedures
  • Rollback Strategies
  • Performance Testing
  • Monitoring Setup
  • Documentation
  • Training Materials

Conclusion

The legacy to cloud migration demonstrates that large-scale data migration can be achieved with minimal disruption through careful planning and phased execution. By addressing 24/7 operational requirements and performance optimization challenges, this implementation achieved:

  • Minimal Data Loss: < 0.01% data loss within acceptable tolerance
  • Minimal Downtime: Only 3 minutes of downtime during migration
  • Performance Improvement: 48% faster query performance (after optimization)
  • Cost Reduction: 23% reduction in infrastructure costs
  • Scalability: Cloud-native architecture supporting 10x growth potential

Ready to modernize your legacy infrastructure? Contact me to discuss your migration challenges and explore strategies for minimizing risk while maximizing performance gains in your cloud migration journey.
