Over several releases, MySQL replication has become steadily more stable, and its performance has improved considerably. When a master-slave replication relationship is set up, two threads are created on the slave: the Slave IO thread and the Slave SQL thread. Each has its own role.

Slave IO thread: This thread talks to the master. It sends a COM_BINLOG_DUMP request to the master, and the master creates a binlog dump thread that dumps new log events as they are produced and sends them to the slave IO thread. The slave IO thread receives the binary log events from the master safely through net_safe_read(), writes them to the relay logs on the slave, and then updates the slave's master.info file.
Slave SQL thread: This thread reads the events in the relay logs, parses them, and executes them.
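
The division of labor between the two threads can be sketched in a few lines of Python. This is only an illustration of the idea, not MySQL's implementation: the relay log is modeled as an in-memory queue, master.info as a dictionary, and the names io_thread, sql_thread, replicate and read_pos are all made up for this example.

```python
import queue
import threading

def io_thread(master_events, relay_log, master_info):
    """Stand-in for the slave IO thread: copy events into the relay log."""
    for position, event in enumerate(master_events, start=1):
        relay_log.put(event)                # append the event to the relay log
        master_info["read_pos"] = position  # record how far we have read
    relay_log.put(None)                     # sentinel: the master has no more events

def sql_thread(relay_log, applied):
    """Stand-in for the slave SQL thread: read relay log events and apply them."""
    while True:
        event = relay_log.get()             # blocks until the IO thread writes something
        if event is None:
            break
        applied.append(event)               # "executing" is just recording the event here

def replicate(master_events):
    """Run both threads over a list of events and return what the slave ended up with."""
    relay_log = queue.Queue()               # hypothetical in-memory relay log
    master_info = {"read_pos": 0}           # hypothetical stand-in for master.info
    applied = []
    io = threading.Thread(target=io_thread, args=(master_events, relay_log, master_info))
    sql = threading.Thread(target=sql_thread, args=(relay_log, applied))
    io.start()
    sql.start()
    io.join()
    sql.join()
    return applied, master_info
```

Calling replicate(["INSERT 1", "UPDATE 2", "DELETE 3"]) returns the three events in their original order together with a read_pos of 3. The point of the sketch is that the IO thread never waits for an event to be executed before fetching the next one.
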
MySQL replication can also be cascaded: a slave can in turn act as the master of another slave, but the log_slave_updates parameter must be enabled on the intermediate slave. The architecture is shown in the figure:

Why 2 threads?
In MySQL 3.23, we had only one thread on the slave, which did the whole job: read one event from the connection to the master, executed it, read another event, executed it, etc.
In MySQL 4.0.2 we split the job into two threads, using a relay log file to exchange between them.
This makes code more complicated. We have to deal with the relay log being written at the end, read at another position, at the same time. Plus handling the detection of EOF on the relay log, switching to the new relay log. Also the SQL thread must do different reads, depending on how the relay log file it is reading is being used:
- If the file is being written to by the I/O thread, the relay log is partly in memory, not all on disk, and mutexes are needed to avoid confusion between threads.
- If the file has already been rotated (the I/O thread is not writing to it anymore), it is a normal file that no other threads touch.
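
A rough Python sketch of the two read paths described above, under simplifying assumptions: the relay log file is a plain list, and the class RelayLogFile, its methods, and the END_OF_FILE marker are hypothetical names for this illustration, not MySQL's actual data structures.

```python
import threading

END_OF_FILE = object()  # marker: this file is finished, switch to the next relay log

class RelayLogFile:
    def __init__(self):
        self.events = []
        self.rotated = False            # set once the IO thread stops writing to this file
        self.lock = threading.Lock()

    def append(self, event):
        """Called by the IO thread while the file is still being written."""
        with self.lock:
            if self.rotated:
                raise ValueError("cannot append to a rotated relay log")
            self.events.append(event)

    def rotate(self):
        """Called by the IO thread when it moves on to a new relay log file."""
        with self.lock:
            self.rotated = True

    def read(self, index):
        """Called by the SQL thread; the read path depends on how the file is used."""
        if self.rotated:
            # Rotated file: no other thread touches it, so no mutex is needed.
            if index < len(self.events):
                return self.events[index]
            return END_OF_FILE
        # File still being written: take the mutex to get a consistent view.
        with self.lock:
            if index < len(self.events):
                return self.events[index]
            if self.rotated:            # rotation happened while we waited for the lock
                return END_OF_FILE
            return None                 # nothing new yet; the caller should wait and retry
```

In this sketch a None result means "wait for the IO thread", while END_OF_FILE tells the SQL thread to move on to the next relay log.
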
The advantages of having 2 threads instead of one:
- Helps having a more up-to-date slave. Reading a statement is fast, executing it is slow. If the master dies (burns), there are good chances that the I/O thread has caught almost all updates issued on the master, and saved them in the relay log, for use by the SQL thread.
- Reduces the required master-slave connection time. If the slave has not been connected for a long time, it is very late compared to the master. It means the SQL thread will have a lot of executing to do. So with the single-thread read-execute-read-execute technique, the slave will have to be connected for a long time to be able to fetch all updates from the master. Which is stupid, as for a significant part of the time, the connection will be idle, because the single thread is busy executing the statement. Whereas with 2 threads, the I/O thread will fetch the binlogs from the master in a shorter time. Then the connection is not needed anymore, and the SQL thread can continue executing the relay log.
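
Both advantages can be checked with a little arithmetic. The sketch below is a deliberately simple model, not a measurement: it assumes every event takes the same time to fetch (fetch_ms) and to execute (exec_ms), that all events are already waiting on the master, and that in the two-thread case the IO thread is never slowed down by the SQL thread. The function and parameter names are invented for this example.

```python
def connection_time_ms(n_events, fetch_ms, exec_ms, two_threads):
    """How long the slave must stay connected to fetch n_events from the master."""
    if n_events == 0:
        return 0
    if two_threads:
        # The IO thread fetches back to back; execution happens elsewhere.
        return n_events * fetch_ms
    # Single thread: every fetch except the first waits for the previous execution.
    return (n_events - 1) * (fetch_ms + exec_ms) + fetch_ms

def events_saved(n_events, fetch_ms, exec_ms, two_threads, master_dies_at_ms):
    """How many events have reached the slave when the master dies (fetch_ms > 0)."""
    if two_threads:
        fetched = master_dies_at_ms // fetch_ms
    elif master_dies_at_ms < fetch_ms:
        fetched = 0
    else:
        # The k-th fetch completes at (k - 1) * (fetch_ms + exec_ms) + fetch_ms.
        fetched = (master_dies_at_ms - fetch_ms) // (fetch_ms + exec_ms) + 1
    return min(n_events, fetched)
```

With 1000 events, 1 ms to fetch and 10 ms to execute each, the single-thread slave needs the connection for 10990 ms, while the two-thread slave needs it for 1000 ms. If the master dies after 500 ms, the single-thread slave has received 46 events and the two-thread slave has 500 of them in its relay log.
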
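Although the new replication implementation makes the code far more complex than before, the performance gain is significant, and when something goes wrong the data gap between master and slave is much smaller than with the old approach, so the data is better protected. One more point: had MySQL replication performance not improved substantially, and had the lag between slave and master stayed large, today's read/write splitting setups built on MySQL would run into serious problems, and this architecture would not be as popular as it is now.
Using two threads to do the work that a single thread used to do: have we ever borrowed and applied this design idea in our own programs?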