Some months ago I helped out in another project in which they had some performance problems. They had a very big table and the index of the table was bigger than the table itself. As every change in the table causes MySQL to recalculate/reload the index, this can take some seconds for such big tables.
So, I thought it would be a good idea to split that big table into very small ones. This should reduce the overhead of reloading big indices and instead reload only very small parts. And the next thought was: Is it possible to use the “new” MySQL-Partitioning for that?
Short overview to partitioning in MySQL 5.1
First of all lets have a quick look over the partitioning features:
- Main aspect is the physical storage of the data which can be set for every partition extra
- There are some partitioning methods which define the rules of which data is stored in which partition
- Only horizontal partitioning possible
There are four “partition types” which can be seen as the rules for “which data is stored in which partition”:
The definitions of that can be found here.
After reading through all that stuff I discovered that MySQL Partitioning is very suitable for partitioning by date. There are really several and useful functions which enables you to part tables by date. But there aren’t enough functions to calculate a partition-id from a string.
There is a limitation of 1024 partitions. In my case I had more than 400 partitions (and the project is growing), already 40 % of the limit.
I found it also very irritating, that the row, a partition expression is based on, must be part of the primary key or any other unique key. For me and in the above mentioned case this doesn’t make sense at all as this is exactly what I want to avoid with partitioning into smaller parts: I want split my problem into many smaller problems and not to share the problem. (See here Partitioning Keys, Primary Keys, and Unique Keys in partitions: “Every unique key on the table must use every column in the table’s partitioning expression.“)
Especially, this last limitation was the reason why I write now this article: for the kind of partition the MySQL-developers had in mind when implementing partitions, this limitation makes sense. But – as so often in MySQL – the idea is not obvious for application-developers like me. At first glance into documentation you may think that MySQL can handle direct partitions on a key. But it can’t. This limitation is for the chosen example so performance-limiting, that it should be written in bold at the first page of partition-documentation!
Let’s assume we have a table of “adresses of the world”:
CREATE TABLE adr (
PRIMARY KEY (country, zipcode, street)
Very simple, just to illustrate the problem. We can assume, that this table can be very, very big, if you managed to get all adresses of the world in it.
The interesting thing is now: The index of this table gets bigger, than the table itself! And there are many thousand changes in it per day, maybe 1,000 per second. For every change the index must be recalculated. And this is a real problem, if you also have many thousand requests on this table per second.
The moment, where direct partitioning over a key begins to make sense
- SELECT * FROM adr WHERE street LIKE '%muster%'; makes no sense – with more than 1 billion records this can take days…
- Creating more indices to search faster: This new index will nearly double the total index-size necessary, doubles the required space in memory, the time to reload the index, etc.
- Analysis show also: The select for “all streets like” is very seldom. Most queries are looking like SELECT * FROM adr WHERE country = 'de' AND zipcode = '12345%' and street like '%muster%'. This kind of query runs very fast if the table is warmed up and if there are no changes to the table.
- No matter what we optimize, the basic problem remains: If you want to speed up this table, we need to have much more RAM, more CPU, and extremly faster disks. And at some point, when the number of records increases faster than Moores Law, even this doesn’t help anymore. So the problem is: How to speed up the table for fast SELECTs WHILE MUCH INSERT/UPDATEs?
- Important logical step: As developer we are now able to say “We need to know WHERE, before we search, because otherwise it just makes no sense. Searching only for streets needs a completely different solution, but I’m currently not focused on that problem. I’m focused to search the adr-table.”
- Next important step: At this moment, we can also say “I’ve made sure, that I know at least the country code before I begin to search.“.
- With this prerequisites we are able to split our problem in as many parts as countries exists.
- This standpoint has also many other advantages:
- The total index is much smaller, because we save one “index-level”. We need less RAM and can store more records.
- An update is much smaller: Only operations to this part must wait, other selects aren’t influenced.
- Read is much faster. We read only those parts of the index from where we know that the results must be inside it.
- We can make many parallel requests; the probability, that two queries go to the same part at the same time is very low.
- This is perfectly scalable. If one machine is too slow, we can just put half of the parts on another machine and the whole application should work nearly twice as fast.
A partition is in this case maybe not a number, it is just the value of the key and not every partition must be generated manually.
This is useful, because if your application makes only SELECTs of known keys, the partition-manager doesn’t need to calculate which the correct partition is. It can directly look into the right part. No need to calculate an unnecessary index (= in which partition is my record?).
The borders in which this idea work
- Very big number of records. (So big, that a full table scan takes some time.) A search to the table only on non-indexed fields is not possible and so forbidden for this table.
- Many reads and changes to the same tables (update/delete). (I call this “real-life-data”, because in real-life everything changes at every time. :)
- The index should fit completely into the available RAM; only then we can search fast.
- The index for our search criteria needs not be unique.
- For every new partition-key, a new partition partition is created automatically.
In those cases, it makes sense to split/partition the big table into smaller pieces just by a key. This may result in not equal big partitions but this doesn’t matter much: The most important thing is, that the biggest partitioned table is much (!) smaller than the original. The pieces must be so small, that it takes in the worst case at the utmost one second to load the index of this partition into memory.
To test this kind of concept, I created a partitioned table, which implements just that. With some tricks it really worked that for every key in the table, the record was stored in another partition. In the end I had over 400 partitions. The results were really bad, so bad, that I haven’t kept them.
But this test brought me to the idea to make partitioning myself: I created as many tables as partitions before and stored every record in it’s correct table. Afterwards I made again some tests.
The results were overwhelmingly good!
- The machine was a dual quadcore with only 8 GiB RAM and some good raid on it.
- I created over 400 tables (MyISAM), with a total number of 200 million records (more was not possible due to the available amount of RAM at this time).
- The biggest table had then about 1,000,000 records, the smallest only about 5,000.
- I warmed the database up with about 100,000 SELECTs (10 minutes).
- I also created benchmarks with random SELECTs with 4 threads and one thread which INSERTs/UPDATEs random records in random tables.
- With a first test over this, I reached more than 500 Inserts/Selects per second. That was quite good. One rule I found was: The smaller the tables, the faster the access.
- With some other tricks (more partitioning for INSERT/UPDATE), I came to 4,000 INSERTs/UPDATEs and 6,000 SELECTs per second. Yes, per second! Yes, at the same time on one machine. That where the lower average values. In average and top speed it was much faster.
- Direct partitioning by a key value is a fascinating thing, which can solve problems with very big “real-life”-data-tables and it should be further looked into.
- MySQL concentrates the partitioning feature too much on “WHERE is the data stored” and not on “WHERE does the data belong to“.
A storage has always physical limitations, a logical information of data which belongs together, however, doesn’t have any physical limitations; the storage itself can keep this information.
- MySQL isn’t currently able to handle partitions on a direct key in the table. The limitations forbid it, and even if you go over them: they are really slow.
- This was the reason why I wrote this article: it is of course possible to use the partitions in MySQL like that, but it makes no sense, because there is this need for this unique key, which creates more problems than it solves for such big tables.
- This feature should not be put together with the current “PARTITION”-implementation. I think, this is something different. Maybe we can call it “AUTO-MERGE”?
- Even if MySQL doesn’t support that kind of partition: It’s not so complicated to change your application. Make partitioning yourself! It’s easier than you think.
- It makes actually so much sense, that I want to raise my hand and say “I want this”. :) – If MySQL would support that, it would help a lot.
- Testing large partitions
- Partition by value of column list
- Partitioning on Application Side. The author here suggests to do this kind of partition in with MySQL-Proxy. I think, this is not the right place for that, because this is not parallelizable on one server.
- How MySQL does table locking in partitions.