Monday, December 11, 2017

MySQL 5.7 reads all your binlog files more often than you think

After upgrading some of our slaves to latest 5.7, I have found  what looks like a serious regression introduced in MySQL 5.7.
A couple weeks ago I noticed that the error log file of one of our clusters, where I had implemented my in place transparent compression of binary logs,  was literally flooded by the following error:

[ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL

In the above setup this is  an harmless error, and it should only happen at server startup, where mysqld opens and reads all available binary log files.  The error is due to the fact that since files are now compressed, mysqld doesn't recognize them as valid - not an issue, as only older files are compressed, and only after ensuring all slaves already got them.

But I couldn't understand why the error was now hitting every minute or so, after upgrading from 5.6 to 5.7.

After a short investigation I was able to relate the errors to the exact moment where a binary log was closed and a new one created, this normally happens when the binary log file size reaches the value you have set in max_binlog_size variable. But this indicated that mysqld was continuously trying to open and read all the available binary logs, each time a file was rotated.

Ouch! This wasn't a nice finding.

On this server we have in excess of 3,500 binary log files, and I know about installations which have much more (mind you, we have a retention of 4 days only here). We like to keep binlog files small, we have them sized at 100 MB as this makes easier to peruse them when troubleshooting with mysqlbinlog. Anyways this wouldn't change anything, I mean, having 3,500 100M files or 350 1G files would generate exactly the same amount of I/O when you wanted to read them all....

Anyways, I had just found that every time a binary log is rotated, mysqld needs to open all of the files. What a waste of I/O and performances, I thought.  So I decided to have a look at the code to understand what mysqld was actually trying to do with those binlogs.

It wasn't too difficult to narrow down the binlog scan with the function  purge_logs_before_date()
that expires logs based on time, as set by the expire_logs_days variable.
But reading the binary log files shouldn't be really needed when you're only looking to expire them, so this looked weird to me.

Further digging into the source code shows that mysqld is actually looking for "lost GTIDs" in the binlog files, and does it every time a binary log is closed for rotation. I haven't looked deeper into what a "lost" GTID means in the code, however whatever the reason for this search is, for sure it shouldn't be done repeatedly at every binlog rotation.

The weird thing is, we aren't even using GTIDs!
But still mysqld opens and reads ALL the available binary log files all the time. This is a serious regression compared to 5.6 where this wasn't happening, and it can hit you hard when you have a large number of binary log files. Caveat emptor.

I have raised a bug for the above, and really hope Oracle gives it the needed attention.


2 comments:

  1. If your OS File Descriptors are not enough, this will be a serious problem!

    ReplyDelete
    Replies
    1. Well no... it opens them one at a time luckily. Still a serious problem I/O wise...

      Delete