MySQL Replication: Master-Slave Setup

Key Insights

• MySQL replication provides high availability and read scalability by maintaining synchronized copies of data across multiple servers, with the master handling writes and slaves serving read traffic. • Setting up replication requires configuring binary logging on the master, creating a dedicated replication user with appropriate privileges, and establishing the replication connection on each slave server. • Monitoring replication lag and implementing proper error handling are critical for maintaining data consistency and identifying issues before they impact production systems.

Understanding MySQL Replication Architecture

MySQL replication works through binary logs that record all data modifications on the master server. Slave servers connect to the master, read these binary log events, and replay them locally to maintain an identical dataset. This asynchronous process allows slaves to lag slightly behind the master but provides significant benefits for read-heavy workloads.

The master server maintains a binary log position pointer, while each slave tracks its progress through the master’s binary log. This mechanism enables slaves to resume replication after interruptions without data loss.

Configuring the Master Server

First, enable binary logging and assign a unique server ID in the MySQL configuration file (/etc/mysql/my.cnf or /etc/my.cnf):

[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
binlog_do_db = production_db
max_binlog_size = 100M
expire_logs_days = 7

The server-id must be unique across your replication topology. The binlog_format setting determines how changes are logged—ROW format provides the most reliable replication for complex queries. The binlog_do_db directive limits replication to specific databases.

Restart MySQL to apply these changes:

sudo systemctl restart mysql

Create a dedicated replication user with appropriate privileges:

CREATE USER 'repl_user'@'%' IDENTIFIED BY 'strong_password_here';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;

Lock the database and obtain the binary log position for initial slave setup:

FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;

This returns output similar to:

+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 |      154 | production_db|                  |
+------------------+----------+--------------+------------------+

Record the File and Position values—you’ll need them for slave configuration.

Creating the Initial Data Snapshot

While the master is locked, export the database:

mysqldump -u root -p --databases production_db \
  --master-data=2 \
  --single-transaction \
  --routines \
  --triggers \
  --events > master_snapshot.sql

The --master-data=2 option includes the binary log position as a comment in the dump file. The --single-transaction flag ensures consistency for InnoDB tables without requiring a lock.

Unlock the master to resume normal operations:

UNLOCK TABLES;

Transfer the snapshot to your slave server:

scp master_snapshot.sql user@slave-server:/tmp/

Configuring the Slave Server

Configure the slave server with a unique server ID in its MySQL configuration:

[mysqld]
server-id = 2
relay-log = /var/log/mysql/mysql-relay-bin
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
read_only = 1

The relay-log stores binary log events received from the master. Setting read_only = 1 prevents accidental writes to the slave, except for the replication thread and users with SUPER privilege.

Restart the slave MySQL service:

sudo systemctl restart mysql

Import the master snapshot:

mysql -u root -p < /tmp/master_snapshot.sql

Configure the slave to connect to the master:

CHANGE MASTER TO
  MASTER_HOST='192.168.1.100',
  MASTER_USER='repl_user',
  MASTER_PASSWORD='strong_password_here',
  MASTER_LOG_FILE='mysql-bin.000003',
  MASTER_LOG_POS=154;

Use the exact MASTER_LOG_FILE and MASTER_LOG_POS values from the master status check.

Start the replication process:

START SLAVE;

Verifying Replication Status

Check the slave status to confirm proper operation:

SHOW SLAVE STATUS\G

Key fields to monitor:

Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
Last_IO_Error: 
Last_SQL_Error:

Both Slave_IO_Running and Slave_SQL_Running must show “Yes” for healthy replication. The Seconds_Behind_Master indicates replication lag—values consistently above zero suggest performance issues.

Test replication by creating a test table on the master:

-- On master
USE production_db;
CREATE TABLE replication_test (
  id INT PRIMARY KEY AUTO_INCREMENT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO replication_test VALUES (NULL, NOW());

Verify the table appears on the slave:

-- On slave
USE production_db;
SELECT * FROM replication_test;

Handling Common Replication Errors

When replication breaks, SHOW SLAVE STATUS\G reveals the error in Last_SQL_Error. Common issues include duplicate key errors or missing tables.

For non-critical errors, skip a single problematic statement:

STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

For persistent issues, resync the slave by repeating the snapshot process. Alternatively, use pt-table-sync from Percona Toolkit for selective synchronization:

pt-table-sync --execute --sync-to-master \
  h=slave-server,D=production_db,t=problematic_table

Implementing Replication Monitoring

Create a monitoring script to track replication health:

import pymysql
import sys

def check_replication(host, user, password):
    try:
        conn = pymysql.connect(host=host, user=user, password=password)
        cursor = conn.cursor(pymysql.cursors.DictCursor)
        cursor.execute("SHOW SLAVE STATUS")
        status = cursor.fetchone()
        
        if not status:
            print("ERROR: Not configured as slave")
            sys.exit(2)
        
        io_running = status['Slave_IO_Running']
        sql_running = status['Slave_SQL_Running']
        lag = status['Seconds_Behind_Master']
        
        if io_running != 'Yes' or sql_running != 'Yes':
            print(f"ERROR: Replication stopped - IO:{io_running} SQL:{sql_running}")
            sys.exit(2)
        
        if lag and lag > 300:
            print(f"WARNING: Replication lag {lag} seconds")
            sys.exit(1)
        
        print(f"OK: Replication healthy, lag {lag} seconds")
        sys.exit(0)
        
    except Exception as e:
        print(f"ERROR: {str(e)}")
        sys.exit(2)

check_replication('slave-server', 'monitor', 'monitor_pass')

Schedule this with cron or integrate with monitoring systems like Nagios or Prometheus.

Performance Optimization

For high-traffic environments, enable parallel replication in MySQL 5.7+:

[mysqld]
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK

This allows multiple worker threads to apply relay log events concurrently, reducing replication lag.

Adjust the binary log cache size for large transactions:

binlog_cache_size = 1M
max_binlog_cache_size = 2G

Monitor replication throughput and adjust these values based on your workload characteristics. Proper configuration ensures your master-slave setup remains reliable and performant under production load.