Wednesday, December 14, 2016

In place transparent compression of MySQL binary logs

Any DBA who has administered a busy master knows how fast the disk space occupied by binary logs may grow. DBAs have no control over this: the growth depends on the workload, and the workload depends on many factors, e.g.:

- application changes (the applications start writing more due to code changes)
- traffic changes (the peak season arrives and your workload doubles in size)
- infrastructure changes (the devops add more servers)
- business changes (new business flows add to the existing workload)

So either you have been thoughtful and have planned in advance for a large enough storage space (to handle the increase in the number of binary logs), or, sooner or later, you will face the usual dilemma - how many retention days do you dare give up to accommodate the binlog growth?

In my case, I was very thoughtful, but the boss didn't listen and gave me servers with very limited binlog storage space and, more importantly, using internal disks, so there was no way to add more disk space on demand. Pity, as I had previously found a way to move binlogs to a different filesystem without downtime. But unfortunately, there is no "different filesystem" this time. I'm stuck with what I have.

So the obvious answer at first was to reduce the retention of binary logs via the expire_logs_days variable. Not the ideal option: in my situation, I had to reduce it to a mere 2 days, not even enough to survive a weekend. At the same time, I didn't like the idea of keeping more days and being woken up at 3am by a filesystem full alert.
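
For the record, expire_logs_days is a dynamic variable in 5.6, so it can be lowered on the fly (just remember to also persist the value in my.cnf so it survives a restart):

mysql -e 'SET GLOBAL expire_logs_days = 2;'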

So the idea of compressing the binary logs came to my mind, but I wanted to do it in a transparent way, preserving the existing binlog retention mechanisms (the expire_logs_days variable and the PURGE BINARY LOGS command should continue to work as before) and without breaking the MySQL server and/or replication. Ideally, the timestamps of the files themselves should also be preserved.

I decided to try and compress the binary logs in place. In place means that when you look at your binlogs folder, you don't notice anything unusual. Here's part of the listing of one of our servers with compression active:

-rw-rw---- 1 mysql mysql  13631245 Dec 14 14:17 dbmd01.079826
-rw-rw---- 1 mysql mysql  13909321 Dec 14 14:23 dbmd01.079827
-rw-rw---- 1 mysql mysql  13656948 Dec 14 14:30 dbmd01.079828
-rw-rw---- 1 mysql mysql  13915222 Dec 14 14:37 dbmd01.079829
-rw-rw---- 1 mysql mysql 104883640 Dec 14 14:44 dbmd01.079830
-rw-rw---- 1 mysql mysql 104898032 Dec 14 14:50 dbmd01.079831
-rw-rw---- 1 mysql mysql 104861122 Dec 14 14:57 dbmd01.079832

You can spot which files are compressed by looking at the file size, but everything else is preserved. Looking at them with the file utility, here's what's in each:

dbmd01.079826: gzip compressed data
dbmd01.079827: gzip compressed data
dbmd01.079828: gzip compressed data
dbmd01.079829: gzip compressed data
dbmd01.079830: MySQL replication log
dbmd01.079831: MySQL replication log
dbmd01.079832: MySQL replication log

Basically, after compression, the compressed file retains the original file name. This is to keep MySQL happy - you know, these files are tracked in the index file and read when mysqld starts.
Speaking of which, here's what happens when you start mysqld with in-place compression:

2016-10-26 03:06:13 0 [Note] /usr/sbin/mysqld (mysqld 5.6.34-79.1-log) starting as process 1407 ...
[ ... ]
2016-10-26 03:06:13 1407 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.34-79.1 started; log sequence number 2722915
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL

Although these entries are flagged as ERROR, mysqld actually just ignores the compressed files and, more importantly, does not touch them.

This way, compressed binlog files can be managed exactly like standard, uncompressed ones. Of course, I need to remember that if I want to peruse their content, I have to uncompress them before feeding them to the mysqlbinlog utility.
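
For example, something along these lines works (the file name is just one from the listing above):

zcat dbmd01.079826 > /tmp/dbmd01.079826    # zcat doesn't mind the missing .gz suffix
mysqlbinlog /tmp/dbmd01.079826 | less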

Last but not least, in order to ensure that we do not inadvertently compress binlogs that haven't been downloaded by all slaves yet, the following extra safety rules are enforced:

- do not compress any binlog that is less than 1 hour old
- only compress binlogs with a sequence number lower than that of the binlog currently being read by the most lagging slave

The script runs every 10 minutes: it extracts the list of binary logs that aren't compressed yet, applies the above rules, then proceeds to compress them.
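
To give an idea, here is a minimal sketch of that logic. This is not the actual script: the path, base name and hardcoded slave position below are placeholders, and the real script derives the latter from the slaves themselves.

#!/bin/bash
# minimal sketch - NOT the real compress_binlog.sh
BINLOG_DIR=/var/lib/mysql    # placeholder: binlog location
BASENAME=dbmd01              # placeholder: binlog base name

# sequence number of the binlog currently being read by the most lagging
# slave; the real script obtains this from the slaves (SHOW SLAVE STATUS)
OLDEST_NEEDED=79830          # placeholder value

for f in "$BINLOG_DIR/$BASENAME".[0-9]*; do
    # skip files that are already compressed
    file "$f" | grep -q 'gzip compressed' && continue
    # rule 1: skip anything modified less than 1 hour ago
    [ -n "$(find "$f" -mmin -60)" ] && continue
    # rule 2: skip binlogs the most lagging slave may still need
    seq=$((10#${f##*.}))     # sequence number, leading zeros stripped
    [ "$seq" -ge "$OLDEST_NEEDED" ] && continue
    # compress to a temp file, copy the original timestamp onto it,
    # then swap it in under the original name
    gzip -c "$f" > "$f.tmp" && touch -r "$f" "$f.tmp" && mv "$f.tmp" "$f"
done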

I have set expire_logs_days back to 7 days and now have plenty of disk space for extra growth.



You can find the compress_binlog.sh script which I crafted for the above on my GitHub page.

One caveat - the script hasn't been tested with binlog sequence numbers greater than 999,999. I am actually curious to see what happens to the binlog file name when we reach one million; I'd need to either be very patient, or have a look at the source code... :-)

Enjoy!

Rick

9 comments:

  1. I wouldn't call this transparent. However, I think this can be useful. Note that you might have to edit the binlog index file if you want to make the binlogs available again after uncompressing them. Another option is to use mysqlbinlog on the decompressed files.

    Note that Tencent has a patch to get MySQL to compress the binlog files.
    https://bugs.mysql.com/bug.php?id=48396
    https://github.com/TencentDBA/TMySQL

    1. Hi Daniel,

      thank you for your comment.

      By "transparent" I meant to say that MySQL has no idea that we have compressed the binary logs and therefore handles them, well, transparently, including automatic retention.

      I saw the bug report and the patch from Tencent, but I am using binary distributions and am not planning on recompiling from source and/or adapting this patch to new releases at this time.

      My simple script actually does the trick and follows the KISS principle, which is something I strongly believe in.

      If I were to uncompress the binary logs I would do it in place again - no need to edit the index file.
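
      Something like this, I suppose (same trick in reverse; the file name is just an example):

      zcat dbmd01.079826 > dbmd01.079826.tmp    # decompress to a temp file
      touch -r dbmd01.079826 dbmd01.079826.tmp  # copy the original timestamp
      mv dbmd01.079826.tmp dbmd01.079826        # swap it back under the original name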

      Thanks

      Rick

  2. It'd be cool if you could store the binlogs on a compressed filesystem so they're still readable by MySQL. Even cooler if you could somehow set up tiered storage so recent binlogs are on fast storage and older ones get rotated out to the compressed filesystem.

    I am curious: with your current gzip design, what's the benefit of keeping gzipped binlogs on the master if MySQL can't read them? Archiving, perhaps? I'd love to hear your use case.

    1. Hi Dane,

      a compressed filesystem would be nice, but it could also be a burden if you have many slaves competing to read a binlog file and a busy master with a write-intensive workload. Remember that the current binlog is kept open and has one writer and many readers; I feel that the overhead of a compressed filesystem wouldn't play nice with the contention on this single binlog file. And unfortunately these servers aren't connected to a Storage Area Network, so we don't have fancy options here; hence we built this simple compression setup, which actually works well for this particular situation.

      Your question about what benefit we get makes sense: why don't we just delete them if we are short on space, you say.

      Well, to us, the reason for keeping one full week of binlogs is twofold: first, to have a last-resort backup if for some reason the "official" daily backup comes up corrupt and we are hit by a disaster; and second, to be able to inspect statements that have been issued against the master in the last week, in order to troubleshoot application issues (it happens quite often that we are asked questions like "who put that value in that column?"). For this last purpose, we just have to uncompress them before feeding them to the mysqlbinlog utility.

      Thanks,
      Rick

    2. I forgot to mention one more reason (maybe the most important one) for keeping one week of binlogs (although I mentioned it briefly in the post): if replication breaks, e.g. during the weekend, and you don't notice (alerts fail, for example), you may find yourself on Monday morning with an unrecoverable slave because the needed binlogs are no longer on the master.

    3. I'm not sure that would happen in practice with, say, ZFS. If a file (actually a block) is hot, it will be in the ARC cache, or at worst the L2ARC. Depending on the ZFS version, this may be kept compressed in RAM or not (older versions not). However, it has been shown that in most cases it's still quicker to read compressed data from RAM than from disk, and compressed from disk than uncompressed. There are obviously edge cases, so it's always vital to benchmark it yourself to verify your situation. Also bear in mind that you will never get the same compression ratios as, say, gzip on the file itself, even if you use the same settings. This is due to the filesystem compressing at the block level as opposed to the whole-file level, which means the compression window is much smaller, and hence there are fewer opportunities to compress.

      One good thing about zfs compression is you can toggle it on and off seamlessly, so if you start to notice problems you can revert it.
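
      For example (dataset name made up):

      zfs set compression=lz4 tank/mysql   # turn it on
      zfs set compression=off tank/mysql   # revert; only newly written blocks are affected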

      In theory the same should hold on btrfs, but I'm not as confident there, as it's not as mature yet.

      https://blogs.oracle.com/observatory/entry/zfs_compression_a_win_win
      https://wiki.illumos.org/display/illumos/LZ4+Compression

  3. Thanks for taking the time to respond! It sounds like gzip meets your needs. Nice!

    1. You're welcome.
      Actually, you can use whatever you want. I used bzip2 at first, but it was taking too much time to compress each file (we have them set to grow to 100MB before they rotate), so I switched to pigz (parallel gzip).
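
      pigz is a drop-in replacement for gzip, so it's just a matter of swapping the command, e.g. (illustrative):

      pigz -p4 -c dbmd01.079826 > dbmd01.079826.tmp   # -p4 = use 4 compression threads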

  4. My approach would use one of my favorite tools: do not keep the binlogs on the master, but on a Binlog Server. A smart Binlog Server should be able to transparently manage compressed and uncompressed binlogs. And an even smarter Binlog Server should be able to store recent binlogs on local disks, and older ones in S3 or HDFS.
