I recently discovered that MySQL has a “memory” engine that I wasn’t aware of (most of my database work is for hobby projects so I learn what I need as I go). It seems like this option should give me drastically improved performance, so I’m wondering if there are any drawbacks that go with it. The two that I know of are:
- I need to have enough RAM to hold the table(s) in question.
- The tables are lost if the machine shuts down.
I believe #1 shouldn’t be an issue since I’m using AWS EC2 and can move to an instance type with more memory if needed. I believe I can mitigate #2 by dumping back to disk as needed.
What other issues are there? Can the memory engine ever give worse performance than either MyISAM or InnoDB? I think I read somewhere that indexes behave differently with this engine; is this something I need to worry about?
Looking at the feature availability list at http://dev.mysql.com/doc/refman/5.1/en/memory-storage-engine.html two possible problems jump out:
- No transaction or FK support, meaning you will have to manage transactional integrity and referential integrity in your own code where needed (which could end up being far less efficient than letting the DB do it for you, though that very much depends on your app's expected behaviour patterns).
- Table-level locking only: this could be a significant barrier to scalability if your app needs multiple concurrent writers to the same set of tables, or if your read operations use locks to ensure consistent data is read. In such cases a disk-based table that supports much finer lock granularity will perform far better, as long as enough of its content is currently cached in RAM.
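To illustrate the FK point: the MEMORY engine parses `FOREIGN KEY` clauses but does not enforce them (only InnoDB enforces them among the engines discussed here). A minimal sketch, with hypothetical table names:

```sql
CREATE TABLE parent (id INT PRIMARY KEY) ENGINE=MEMORY;

CREATE TABLE child (
    id        INT PRIMARY KEY,
    parent_id INT,
    FOREIGN KEY (parent_id) REFERENCES parent(id)  -- parsed but not enforced by MEMORY
) ENGINE=MEMORY;

-- Succeeds even though no parent row with id 999 exists:
INSERT INTO child VALUES (1, 999);
```

Any referential integrity you rely on would therefore have to be checked in application code.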
Other than that, assuming you have enough RAM, a memory based table should be faster than a disk based one. Obviously you need to factor in taking snapshots to disk to address the issue of what happens when the server instance is reset, which is likely to completely negate the performance benefit overall if the data needs capturing often (if you can live with losing a day of data in such an instance you could just take a backup once per day, but in most cases that would not be acceptable).
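A periodic snapshot can be as simple as copying the MEMORY table into a disk-backed twin on a schedule (via cron or the MySQL event scheduler). A sketch, assuming hypothetical table names and a primary key on the table:

```sql
-- One-time setup: create a disk-backed copy of the MEMORY table's structure
CREATE TABLE sessions_disk LIKE sessions_mem;
ALTER TABLE sessions_disk ENGINE=InnoDB;

-- Run on a schedule; REPLACE upserts by primary key
REPLACE INTO sessions_disk SELECT * FROM sessions_mem;
```

Note this only captures inserts and updates; rows deleted from the MEMORY table since the last snapshot would need separate handling.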
An alternative might be to:
- Use disk based tables, but ensure that you have more than enough RAM to hold them all in RAM at any given time (and “enough RAM” might be more than you think as you need to account for any other processes on the machine, OS IO buffers/cache and so forth)
- Scan the entire contents (all data and index pages) of each table on startup to preload the content into memory:

```sql
SELECT * FROM <table> ORDER BY <pkey fields>;                  -- for each table
SELECT <indexed fields> FROM <table> ORDER BY <index fields>;  -- for each index
```
This way all your data is in RAM, and you only have to worry about I/O performance for write operations. If your app's common working set is much smaller than the whole DB (which is usually the case — in most applications most users will only be looking at the most recent data most of the time) you might be better off being more selective about how much you scan to preload into memory, allowing the rest to be loaded from disk on demand.
There are plenty of reasons not to use the memory storage engine — and cases where InnoDB will be faster. You just need to think about concurrency, not trivial single-threaded tests.
If you have a large enough buffer pool, then InnoDB will become entirely memory resident for read operations as well. Databases have caches. They warm themselves up!
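For example, a buffer pool sized so the working set fits in RAM (the 8 GB figure below is illustrative, not a recommendation; `innodb_buffer_pool_size` can be set dynamically on MySQL 5.7 and later, or in `my.cnf`):

```sql
-- Illustrative size; pick a value based on your data and available RAM
SET GLOBAL innodb_buffer_pool_size = 8589934592;  -- 8 GB

-- Check how warm the pool is (free vs. data pages):
SHOW STATUS LIKE 'Innodb_buffer_pool_pages%';
```

Once the pool is warm, reads are served from memory and the MEMORY engine's main advantage largely disappears.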
Also – do not underestimate the value of row-level locking and MVCC (readers don’t block writers). It may be “slower” when writes have to persist to disk. But at least you won’t be blocking during that write operation like you would be on a memory table (no MVCC; table level locking).
For the record: I tested MySQL MEMORY tables for storing some information, and I tested PHP's APC (APCu) for storing the same information.
For 58,000 records (varchar + integer + date):
- Original information: 24 MB in text (CSV) format.
- PHP's APC uses 44.7 MB of RAM.
- MySQL's MEMORY table uses 575 MB of RAM.
The table only has a single index, so I don't think that is the main factor.
A MEMORY table is not an option for "big" tables because it uses too much memory.
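You can check the actual footprint of your own MEMORY tables from `information_schema` (note that MEMORY allocates fixed-length rows, which is why VARCHAR-heavy tables blow up like this):

```sql
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 1) AS data_mb,
       ROUND(index_length / 1024 / 1024, 1) AS index_mb
FROM information_schema.tables
WHERE engine = 'MEMORY';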
The other disadvantage of MEMORY-based tables is that they can't be referred to multiple times in the same query — at least that behaviour was present up to v5.4. Now, with CTEs (since v8.x), there is no need to use memory-based intermediate tables for complex procedures.
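A CTE sidesteps that restriction because, unlike a temporary table, it can appear multiple times in one statement. A sketch with hypothetical names:

```sql
WITH totals AS (
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
)
-- The CTE is referenced twice in the same query, which a
-- temporary intermediate table could not do:
SELECT a.customer_id
FROM totals a
JOIN totals b ON b.total > a.total;
```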
According to the MySQL and MariaDB manuals, BLOB and CLOB (the various TEXT types) are not supported by the MEMORY engine. For our purposes, this made the MEMORY storage engine almost useless.
MEMORY tables cannot contain BLOB or TEXT columns. Variable-length types such as VARCHAR can be used, however.
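You hit the restriction immediately at table creation time; a quick sketch (table name hypothetical):

```sql
-- Fails: the MEMORY engine doesn't support BLOB/TEXT columns
CREATE TABLE notes (id INT, body TEXT) ENGINE=MEMORY;

-- Works: a bounded VARCHAR is accepted instead
CREATE TABLE notes (id INT, body VARCHAR(1024)) ENGINE=MEMORY;
```

Whether a bounded VARCHAR is an acceptable substitute depends entirely on your data.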
When trying to convert only part of the database to the MEMORY engine, I found that foreign keys between storage engines are not supported. So all tables that need foreign key references to the tables containing BLOB/CLOB must also use a non-memory storage engine (at least, this affects InnoDB child tables).
MEMORY tables are not intended for persistent storage, particularly of large subsets of data or anything where retention is critical. Their best purpose, in my experience, is to house transient records during the creation and population of temporary tables in complex procedures. This performs significantly faster than most other table types, provided your key buffer threshold for the engine is set high enough not to incur a disk write. It can run an order of magnitude faster than MyISAM or InnoDB because there is no disk I/O, and for a table encapsulated within a specific procedure, indexing and relations do not matter as much as they would where persistence is expected.
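The pattern described above can be sketched as a MEMORY-backed temporary table scoped to a procedure (all names are hypothetical):

```sql
-- Transient working set; dropped automatically when the session ends
CREATE TEMPORARY TABLE work_set (
    id    INT PRIMARY KEY,
    score DECIMAL(10,2)
) ENGINE=MEMORY;

INSERT INTO work_set
SELECT id, score
FROM big_table
WHERE score > 0.9;

-- ... join against work_set in subsequent statements ...

DROP TEMPORARY TABLE work_set;
```

Because the table lives only for the duration of the procedure, the persistence drawbacks of the MEMORY engine don't apply.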