Wednesday, May 6

Secrets of Redo Logging in InnoDB

Why InnoDB engine need Redo Log Files ?

InnoDB is a general-purpose storage engine that balances high reliability and high performance. It is a transactional storage engine and is fully ACID compliant, as would be expected from any relational database. The durability guarantee provided by InnoDB is made possible by the redo logs.


How it is generated  ?

InnoDB (.ibd files) are considered to be a sequence of equal sized pages. These pages are uniquely identified within the InnoDB system using the space_id, page_no combination. If we want to read or modify any page, it needs to be loaded into memory. So there are two copies of the pages - on disk and in memory.

1. Any changes to a page is first done to the in-memory copy of the page. The page that is modified in memory and not yet flushed to disk is marked as the dirty page in memory.
2. An associated redo log is generated in memory, in the local mini transaction (mtr) buffer. This will then be transferred to the global in-memory redo log buffer.
3. The redo log record is written from the redo log buffer in memory to the redo log file on disk. This is subsequently flushed. These two steps are considered separate - writing to the redo log file and flushing the redo log file to the disk. This is to account for the file buffering done by the operating system.
4. The dirty page is flushed from memory to disk at some later point of time as part of the checkpointing operation.The order of these steps are important. The redo log record of a change must be flushed to disk before flushing the corresponding dirty page to the disk. This is the concept of write-ahead logging (WAL).

How it works ?

By default, InnoDB creates two redo log files (or just log files) ib_logfile0 and ib_logfile1 within the data directory of MySQL. In MySQL versions 5.6.8 and above, the default size of each redo log file is 48 MB each. This can be configured by the user by making use of innodb_log_file_size server option. The number of log files is controlled byinnodb_log_files_in_group server option.
A log group consists of a number of log files, each of same size. As of MySQL 5.6, InnoDB supports only one log group. So I'll not discuss this further.
The redo log files are used in a circular fashion. This means that the redo logs are written from the beginning to end of first redo log file, then it is continued to be written into the next log file, and so on till it reaches the last redo log file. Once the last redo log file has been written, then redo logs are again written from the first redo log file.
The log files are viewed as a sequence of blocks called "log blocks" whose size is given by OS_FILE_LOG_BLOCK_SIZE which is equal to 512 bytes. Each log file has a header whose size is given by LOG_FILE_HDR_SIZE, which is defined as 4*OS_FILE_LOG_BLOCK_SIZE.

Terminology 

Redo Log File Header

Each redo log file contains a header occupying four log blocks with the following information:
The first 4 bytes contain the log group number to which the log file belongs.
The next 8 bytes contain the lsn of the start of data in this log file.
First checkpoint field located in the beginning of the second log block.
Second checkpoint field located in the beginning of the fourth log block.

Log Blocks 

A redo log file can be viewed as a sequence of log blocks. All log blocks, except the ones belonging to the log file header, contain a header and a footer.
The log block header contains the following information:
The log block number. This field is of 4 bytes.
Number of bytes of log written to this block. This field is of 2 bytes.
Offset of the first start of an mtr log record group in this log block or 0 if none
The checkpoint number to which this log block belongs
The log block trailer contains checksum of the log block contents.

Log Sequence Number (LSN)

It’s a most talked terminology while among Database Admins :) The LSN is an offset into the redo log file. Within InnoDB the log sequence number is represented by the type lsn_t, which is an 8-byte unsigned integer. The LSN is what that links the dirty page, the redo log record, and the redo log file. Each redo log record when it is copied to the in-memory log buffer, it gets an associated LSN. When each database page is modified, redo log records are generated. So each database page is also associated to an LSN. The page lsn is stored in a header for each page. The page lsn gives the lsn upto which the redo log file must be flushed before flushing the page.

Global In-memory Redo Log Buffer

The in-memory redo log buffer is global and all redo logs generated by user transactions will be written into this buffer. The size of this buffer is configurable and is given by the innodb_log_buffer_size. The default size of this redo log buffer is 8MB.

  • When it comes into picture ?
  • Running transaction is modifying database contents
  • This log buffer will be written or flushed to the log file, either when the transaction commits
  • When the log buffer gets full


Mini transaction (mtr)

A mini transaction (mtr) must be used to generate all the redo log records. A mini transaction contains a local buffer (called the mini transaction buffer) into which the generated redo log records will be stored. If we need to generate a group of redo log records such that either all make it to the redo log file or none makes it, then we need to put them in a single mini transaction. Apart from the redo log records, the mini transaction also maintains a list of pages that has been modified (dirty pages).

Redo Log Record Types

When we modify a database page, a redo log record is generated. This redo log record either contains what information has been changed in the page (physical redo log) or how to perform the change again (logical redo log). InnoDB uses a combination of physical redo logs and logical redo logs.

Life cycle of a redo log record

The life cycle of a redo log record is as follows:


  • The redo log record is first created by a mini transaction and stored in the mini transaction buffer. It has information necessary to redo the same operation again in the time of database recovery.
  • When mtr_commit() is done, the redo log record is transferred to the global in-memory redo log buffer. If necessary, the redo log buffer will be flushed to the redo log file, to make space for the new redo log record.
  • The redo log record has a specific lsn associated with it. This association is established during mtr_commit(), when the redo log record is transferred from mtr buffer to log buffer. Once the lsn is established for a redo log record, then its location in the redo log file is also established.
  • The redo log record is then transferred from the log buffer to redo log file when it is written + flushed. This means that the redo log record is now durably stored on the disk.
  • Each redo log record has an associated list of dirty pages. This relationship is established via LSN. A redo log record must be flushed to disk, before its associated dirty pages. A redo log record can be discarded only after all the associated dirty pages are flushed to the disk.
  • A redo log record can be used to re-create its associated list of dirty pages. This happens during database recovery.


It should be helpful to understand concept of redo logging which makes InnoDB a dependent transaction safe engine to keep our data in safe hands .

No comments:

Post a Comment