[MySQL]: HA and Scalability
Replication:
Replication enables data from one MySQL database server (the master) to be replicated to one or more MySQL database servers (the slaves). Replication is asynchronous – slaves need not be connected permanently to receive updates from the master. Depending on the configuration, you can replicate all databases, selected databases, or even selected tables within a database.
The target uses for replication in MySQL include:
Scale-out solutions – spreading the load among multiple slaves to improve performance. In this environment, all writes and updates must take place on the master server. Reads, however, may take place on one or more slaves. This model can improve the performance of writes (since the master is dedicated to updates), while dramatically increasing read speed across an increasing number of slaves.
Data security – because data is replicated to the slave, and the slave can pause the replication process, it is possible to run backup services on the slave without corrupting the corresponding master data.
Analytics – live data can be created on the master, while the analysis of the information can take place on the slave without affecting the performance of the master.
Replication in MySQL features support for one-way, asynchronous replication, in which one server acts as the master, while one or more other servers act as slaves. This is in contrast to the synchronous replication which is a characteristic of MySQL Cluster.
Advantages:
- It makes backing up the database easier and safer.
- Gives more performance with load balancing. we can split the load between two servers by directing reads to the slave server and writes to the master server.
- We can replicate from one storage engine to another storage engine.
- We can use this slave server for data analysis, reports etc….instead on impacting the production server we can do this in slave server and even we can shutdown the slave server.
Disadvantages:
- The main disadvantage of replication is there is no guaranty of data synchronization between two servers.
- No automatic fail over technique in case if master fails. This causes a little down time.
- User defined variables and temporary tables may not work.
- If replication is set with multiple slaves it may cause some load on master server to updated the statement to all the slaves.
Replication Formats:
Replication is of three formats:
- STATEMENT BASED
- ROW BASED
- MIXED-FORMAT (Combination of statement based and row based)
Statement Based: All statements propagates from master to slave and executes the statements on slave.
Advantages of Statement-Based Replication:
- Less data written to log files. This results less storage space for log files.
- Log files contains all the changes made to the database. so they can be used to track the database changes.
Disadvantages of Statement-Based Replication:
- Queries using function like NOW(), RAND(), CURDATE(), UUID() etc are unsafe.
- DELETE and UPDATE queries using a LIMIT clause and without using ORDER BY clause are unsafe.
- Queries requires more number of locks on tables.
Row Based: Row-based binary logging logs changes in individual table rows.
Advantages of Row-Based Replication:
- All queries can be replicated safely.
- Queries requires less number of locks on tables.
Disadvantages of Row-Based Replication:
- Large data written to log files. This results large storage space for log files.
This would take a little longer time when statements like BLOB and TEXT values are encountered.
- We cannot track the logs to find the changes committed to the database.
Mixed Based: When the mixed format is in effect, statement-based logging is used by default, but automatically switches to row-based logging in particular cases
Note: MIXED-FORMAT replication was introduced from MySQL Version 5.1.8.
Replication Configuration:
Replication between servers in MySQL is based on the binary logging mechanism. The MySQL instance operating as the master writes updates and changes as “events” to the binary log. The information in the binary log is stored in different logging formats according to the database changes being recorded. Slaves are configured to read the binary log from the master and to execute the events in the binary log on the slave’s local database.
Once binary logging has been enabled, all statements are recorded in the binary log. Each slave receives a copy of the entire contents of the binary log. It is the responsibility of the slave to decide which statements in the binary log should be executed; you cannot configure the master to log only certain events. If you do not specify otherwise, all events in the master binary log are executed on the slave. If required, you can configure the slave to process only events that apply to particular databases or tables.
Each slave keeps a record of the binary log coordinates: The file name and position within the file that it has read and processed from the master. This means that multiple slaves can be connected to the master and executing different parts of the same binary log. Because the slaves control this process, individual slaves can be connected and disconnected from the server without affecting the master’s operation. Also, because each slave remembers the position within the binary log, it is possible for slaves to be disconnected, reconnect and then “catch up” by continuing from the recorded position.
Both the master and each slave must be configured with a unique ID. In addition, each slave must be configured with information about the master host name, log file name, and position within that file. These details can be controlled from within a MySQL session using the CHANGE MASTER TO statement on the slave. The details are stored within the slave’s master.info file.