How does a table get fragmented in MySQL

MySQL tables, including those using MyISAM and InnoDB, two of the most common storage engines, can become fragmented. To understand fragmentation, consider how MySQL stores a table's data on disk to make it persistent. The data is stored in files that are divided into fixed-size extents, pages, or blocks. The number of files created for a single table depends on the table's storage engine and on the settings of related system variables.

Hence, part of a table's fragmentation depends on its storage engine and on how that engine stores the table's data. Another part is external to MySQL and depends on the platform and operating system.

Table fragmentation means gaps in the data stored for the table. It arises for the following reasons:
  • Logical fragmentation
    • Some index types store records in the order of their index key columns; a B-Tree, for example, stores records in ascending order of key column values. Logical fragmentation occurs when, because of a page split, the logical ordering of the index pages (Next Page <==> Previous Page) no longer matches their physical ordering on disk.
    • MySQL has reserved many extra extents for the table for future use, because of the type of operations being performed on the table.
    • Users insert rows in a way that leaves pages partly unused, update variable-length columns, and/or delete rows from the table.
  • Physical fragmentation
    • The physical extents, pages, or blocks of the file that stores the table's data are not laid out contiguously on the physical disk.
A page split is an operation that can lead to both logical and physical fragmentation of a table.
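To make the idea of logical fragmentation concrete, here is a minimal sketch in Python. It is not how MySQL measures anything internally; the function name and the input format (a list of on-disk page numbers given in key order) are assumptions made purely for illustration.

```python
def out_of_order_links(physical_pages):
    """Fraction of next-page links that do not point to the adjacent page.

    physical_pages lists the on-disk page number of each index page,
    in logical (key) order. This is an illustrative metric only.
    """
    if len(physical_pages) < 2:
        return 0.0
    broken = sum(
        1
        for cur, nxt in zip(physical_pages, physical_pages[1:])
        if nxt != cur + 1
    )
    return broken / (len(physical_pages) - 1)


# Pages laid out on disk in key order: no logical fragmentation.
print(out_of_order_links([1, 2, 3, 4]))

# After page 1 was split, its new sibling was allocated as page 5,
# so the logical chain jumps forward and then back again.
print(out_of_order_links([1, 5, 2, 3, 4]))
```

A value of 0 means that following the Next Page links reads the disk sequentially; the closer the value gets to 1, the more a range scan has to jump around.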

INSERT

When you insert a record into a table:
  • If the record does not fully use the page, a few bytes are left unused.
  • If the record is out of order with respect to the index key, it may cause a page split.
The following example shows logical fragmentation caused by a page split when a record is inserted out of index key order.




Imagine an InnoDB table with a clustered index whose data occupies two pages. The data is ordered by primary key (PK) and the pages are full, as shown in the figure above. A new row with a primary key of "5" needs to be inserted and, because the index is clustered, it must be inserted in key order. Because the target page is too full for the new row to fit, InnoDB splits the page roughly in half (a page split) and inserts the new data on the new page, as shown in the next figure.

Logical fragmentation: The logical order of the index pages no longer matches their physical order on disk, and Page 1 now has holes that may or may not be filled in the future.

Physical fragmentation: This occurs if the OS cannot allocate the new page at a contiguous location on disk, which can happen for many reasons.
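The split described in this example can be imitated with a toy model in Python. This is a simplified sketch, not InnoDB's actual algorithm: the capacity of four rows per page, the key values, the dictionary layout, and the function name insert_key are all assumptions chosen to keep the example small.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical rows per page, tiny for illustration


def insert_key(pages, key):
    """Insert key into a list of pages kept in logical (key) order.

    Each page is a dict: {"no": physical page number, "keys": sorted keys}.
    Returns True if the insert caused a page split.
    """
    # Target the first page whose largest key is >= key, else the last page.
    target = len(pages) - 1
    for i, page in enumerate(pages):
        if page["keys"] and key <= page["keys"][-1]:
            target = i
            break
    page = pages[target]
    bisect.insort(page["keys"], key)
    if len(page["keys"]) <= PAGE_CAPACITY:
        return False  # the row fitted, no split needed

    # Overflow: move the upper half of the keys to a newly allocated page.
    mid = len(page["keys"]) // 2
    new_page = {
        "no": max(p["no"] for p in pages) + 1,  # next free physical page
        "keys": page["keys"][mid:],
    }
    page["keys"] = page["keys"][:mid]
    pages.insert(target + 1, new_page)  # link it in logical order
    return True


pages = [
    {"no": 1, "keys": [2, 4, 6, 8]},
    {"no": 2, "keys": [10, 12, 14, 16]},
]
split = insert_key(pages, 5)
print(split, [(p["no"], p["keys"]) for p in pages])
```

After the insert, the logical chain of pages is 1, 3, 2: the new page 3 sits between pages 1 and 2 in key order but after them on disk, and page 1 is left half empty.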

UPDATE

When you update a variable-length column with new data that is:
  • smaller than the old data, you create a hole in the page.
  • larger than the old data, you may trigger a page split, depending on the fill factor setting and possibly other factors.
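The two update cases can be sketched with a simplified page model in Python. The 100-byte page size, the row sizes, and the function name update_row are hypothetical; a real storage engine tracks free space in a far more detailed way and may reorganize a page before splitting it.

```python
PAGE_SIZE = 100  # hypothetical page size in bytes, tiny for illustration


def update_row(row_sizes, row_id, new_size):
    """Resize one row on a single page and report what happened.

    row_sizes maps row id -> bytes the row currently uses on the page.
    Returns "hole", "unchanged", "grew in place", or "split needed".
    """
    old_size = row_sizes[row_id]
    free = PAGE_SIZE - sum(row_sizes.values())
    if new_size <= old_size:
        row_sizes[row_id] = new_size
        # A shrinking row leaves (old_size - new_size) unused bytes behind.
        return "hole" if new_size < old_size else "unchanged"
    if new_size - old_size <= free:
        row_sizes[row_id] = new_size
        return "grew in place"
    # Not enough free space on the page: the engine would have to split.
    return "split needed"


rows = {1: 40, 2: 40, 3: 15}        # 95 of 100 bytes used
print(update_row(rows, 2, 25))      # row 2 shrinks
print(update_row(rows, 1, 70))      # row 1 wants 30 more bytes
print(update_row(rows, 3, 30))      # row 3 wants 15 more bytes
```

In this model the first update leaves a 15-byte hole, the second cannot fit in the 20 free bytes and would need a split, and the third fits into the space freed by the first.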

DELETE

MySQL has quite a few storage engines for storing table data. Whenever MySQL deletes rows from a table, the space they occupied is left behind as empty holes and is not automatically returned to the operating system. Over time, with many deletes, this empty space can grow larger than the used space in the table. When MySQL scans the data, it scans up to the table's high water mark, that is, the highest point at which data has been added. If new inserts occur, MySQL will try to reuse that space, but gaps will nevertheless persist.

This extra fragmented space can make reads against the table less efficient than they might otherwise be.
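The effect of deletes on scan cost can be illustrated with a small Python sketch. It models a table as a flat list of row slots, which is an assumption made for illustration and not the real on-disk format; the function names are hypothetical as well.

```python
def delete_rows(slots, ids_to_delete):
    """Return a copy of slots where deleted rows have become holes (None)."""
    doomed = set(ids_to_delete)
    return [None if s in doomed else s for s in slots]


def slots_scanned(slots):
    """A full scan reads every slot up to the high water mark, holes included."""
    return len(slots)


def live_rows(slots):
    """Number of slots that still hold a row."""
    return sum(1 for s in slots if s is not None)


def insert_row(slots, row_id):
    """Reuse the first hole if there is one, else extend the high water mark."""
    for i, s in enumerate(slots):
        if s is None:
            slots[i] = row_id
            return i
    slots.append(row_id)
    return len(slots) - 1


table = list(range(1, 11))                       # rows 1..10, no holes
table = delete_rows(table, [2, 3, 4, 6, 7, 9])   # six rows deleted
print(live_rows(table), slots_scanned(table))    # few live rows, same scan length

insert_row(table, 11)                            # fills one hole
print(live_rows(table), slots_scanned(table))    # other holes remain
```

Only four of the ten slots hold data after the deletes, yet a scan still walks all ten; a new insert fills one hole but leaves the others in place.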



Fragmentation is something to keep an eye on, but not necessarily to obsess over. If you see a few tables with a very large data_free value or a lot of unused space, it can be worthwhile to run OPTIMIZE TABLE on them, as read performance on those tables will usually improve afterward.
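One way to spot such tables is to compare data_free with the total space each table occupies. The Python sketch below only does the arithmetic on rows that have already been fetched; the query text uses columns that exist in information_schema.tables, but the thresholds, the function name, and the sample figures are assumptions, and how data_free is reported depends on the storage engine and tablespace setup.

```python
# Query a monitoring script might run; fetching the rows is left out here.
QUERY = """
SELECT table_schema, table_name, data_length, index_length, data_free
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
"""

MB = 1024 * 1024


def fragmentation_candidates(rows, min_free_bytes=100 * MB, min_ratio=0.25):
    """Return (schema, table, free ratio) for tables worth a closer look.

    rows: iterable of dicts keyed by the column names used in QUERY.
    Both thresholds are hypothetical defaults, not MySQL recommendations.
    """
    candidates = []
    for r in rows:
        total = r["data_length"] + r["index_length"] + r["data_free"]
        if total == 0:
            continue  # empty table, nothing to measure
        ratio = r["data_free"] / total
        if r["data_free"] >= min_free_bytes and ratio >= min_ratio:
            candidates.append((r["table_schema"], r["table_name"], round(ratio, 2)))
    return candidates


# Made-up sample rows standing in for the query result.
sample = [
    {"table_schema": "shop", "table_name": "orders",
     "data_length": 600 * MB, "index_length": 100 * MB, "data_free": 300 * MB},
    {"table_schema": "shop", "table_name": "users",
     "data_length": 50 * MB, "index_length": 10 * MB, "data_free": 4 * MB},
]
print(fragmentation_candidates(sample))
```

With these sample numbers only the orders table is reported, because 30% of its space is free and the free space exceeds the 100 MB floor; the small users table is ignored.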

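But note that OPTIMIZE TABLE can lock the table while doing its work. For MyISAM tables the lock is held for the whole operation; for InnoDB tables, MySQL 5.6.17 and later rebuild the table using online DDL and take an exclusive lock only briefly, although the rebuild still costs time and I/O. For small tables this works fine because the whole table can be read and written quickly. For very large tables, it can take a very long time and interrupt or reduce availability to your application.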

Also, whether a table actually gets defragmented by OPTIMIZE TABLE depends on the type of fragmentation and on what the operation does. For example, suppose the table is physically fragmented because its extents are not contiguous. If you run OPTIMIZE TABLE on it and there are not enough contiguous blocks or extents available on the disk, the operation will be of no use.

Related article: How to check fragmentation in MySQL
