Question :
The database is not a flat structure like a file system, and generally has three to four levels of index levels. If you simply use COW, you will have obvious write amplification every time you modify the leaf nodes.
How do the mainstream databases use COW? In which scenarios will COW be used? Are there optimizations for COW’s write amplification?
Answer :
In my experience, COW is done at the OS level. The database (MySQL, at least) does not know or care about it.
One use of COW is via LVM. When you take a “snapshot” of the disk (whole partition), it starts using COW to efficiently have 2 copies of the entire disk. The second copy grows as blocks are modified, so it is wise to copy the copy off the drive and get rid of the extra volume quickly.
Yes, a database does a lot of writing to disk. On the other hand, because disk I/O is often the most time-consuming task, the database engine works hard to avoid doing extra writes. This can include delaying writes by holding stuff in a RAM-based cache for as long as practical.
On the other hand, a database engine will have to do extra writes to assure “ACID”. At some level, this requires writing everything twice to disk. (Of course, it avoids truly doing 2x the work.
COW kicks in only once per block. Immediately after the snapshot, there are two sets of pointers to all the blocks in all the files. When a block is modified, COW kicks in and makes two copies of the block, changing one of the pointers. After that, that block is not copied again. Meanwhile, the database thinks it is updating a particular block on the disk it is assigned to. The database is oblivious to the other disk volume.