Difference in size of table between InnoDB and MEMORY engines

Posted on

Question :

Question

The same table stored on hard drive (InnoDB engine) takes 930.72 MB when in Ram (MEMORY engine) it takes 1538.54 MB.

  • For what reason?
  • How to calculate space needed to save data in memory? Are there any general rules?

Example

Text below is not question, but allow reconstruct my database state.

We create database:

DROP DATABASE IF EXISTS test;
CREATE DATABASE IF NOT EXISTS test;
USE test;

Create table innoDB

CREATE TABLE main(
  id INTEGER UNIQUE NOT NULL AUTO_INCREMENT PRIMARY KEY,
  value INTEGER
);

Define procedure to load data

delimiter #
create procedure load_data(IN _max INTEGER)
begin

declare counter int unsigned default 0;

  truncate table main;
  start transaction;
  while counter < _max do
    set counter=counter+1;
    insert into main (value) values (counter);
  end while;
  commit;
end #

delimiter ;

Call procedure

call load_data(25000);

Create table MEMORY

-- SET max_heap_table_size = 16*1024*1024;

CREATE TABLE memory_main (
  id INTEGER UNIQUE NOT NULL AUTO_INCREMENT PRIMARY KEY,
  value INTEGER
) ENGINE=MEMORY
AS SELECT * FROM main;

We measure size of both tables and this size is not the same.

SELECT
    table_name AS `Table`,
    round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
FROM information_schema.TABLES
WHERE table_schema = "test"
    AND table_name LIKE "%main%";

Answer :

InnoDB stores the data in a BTree, with the PRIMARY KEY clustered. Your 8 bytes of data will expand to an average of about 30 bytes due to record overhead, block overhead, BTree characteristics, etc, etc. The BTree blocks are 16KB each. After some point in the growth of the table, chunks of 8MB are allocated, so you will find the table growing is big increments.

MEMORY stores data in a Hash. So the PK is hashed to build the data structure. I am not familiar with the details, but they are significantly different than InnoDB’s BTree structure. A common technique is to double the hash size when needed, thereby also leading to significant increments at seemingly random times.

What is the goal of this exercise?

Leave a Reply

Your email address will not be published. Required fields are marked *