Data Persistence in Various Databases

This blog post will cover the varying levels of data caching and data persistence provided by various operating systems and database systems.

Cache - A device placed in front of another storage device that is used for temporary storage of data.  The use of a hardware cache is often transparent to the end user, and the cache is generally smaller, faster, and less persistent than the backed storage device.  Rather than reading/writing from/to a slower storage device (e.g. a hard disk), we read/write from/to the cache (e.g. RAM, NVRAM, etc.).

Reading from a cache is simple.  If the datum we requested is in the cache, we simply read it from the cache.  If not, we need to retrieve the datum from the storage device and place it into the cache (often a much slower operation).
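The read path above can be sketched in a few lines of Python.  This is a minimal illustration, not real cache hardware; the class and attribute names are made up, and two plain dicts stand in for RAM and the disk:

```python
# A read-through cache sketch: check the fast store first, fall back to
# the slow backed storage device on a miss, and populate the cache.
class ReadThroughCache:
    def __init__(self, backing_store):
        self.backing_store = backing_store  # the slow device (e.g. a hard disk)
        self.cache = {}                     # the fast device (e.g. RAM)

    def read(self, key):
        if key in self.cache:
            return self.cache[key]          # cache hit: fast path
        value = self.backing_store[key]     # cache miss: slow retrieval
        self.cache[key] = value             # keep it around for next time
        return value

disk = {"a": 1}
cache = ReadThroughCache(disk)
cache.read("a")   # first read misses and populates the cache
cache.read("a")   # second read hits the cache
```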

There are different policies when we write to the cache, though...

Write-Through Cache Policy - When using this policy, writing to the cache also writes the data to the backed storage device... synchronously and simultaneously.  In the event of a catastrophic power failure, for example, the data written to the cache is guaranteed to exist on the backed storage device, as well.

Write-Back Cache Policy - When using this policy, writing to the cache marks the cache block as "dirty."  The backed storage device is not updated until the dirty block is evicted -- typically when the cache fills up and new data needs to be retrieved from the disk into the cache.  At this point, to make room in the cache, one of the dirty cache blocks is written back to the backed storage device.  In the event of a catastrophic power failure, the dirty data might or might not have time to be transferred to the backed storage device.  While the write-back cache policy often performs better than a write-through cache policy, it is also often a source of data integrity and data durability concerns.
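The contrast between the two write policies can be sketched the same way.  Again, this is a hypothetical illustration (made-up class names, dicts standing in for hardware), not an actual cache implementation:

```python
# Write-through: every write hits both the cache and the backed storage
# device, so the disk is always current.
class WriteThroughCache:
    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value
        self.backing_store[key] = value   # synchronous write to the disk

# Write-back: writes only touch the cache and mark the block dirty; the
# disk isn't updated until the dirty block is evicted.  A power failure
# before eviction loses the dirty data.
class WriteBackCache:
    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)               # disk update is deferred

    def evict(self, key):
        if key in self.dirty:
            self.backing_store[key] = self.cache[key]  # flush dirty block
            self.dirty.discard(key)
        del self.cache[key]
```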

What happens when you write to a file?

No matter what operating system you use, no matter what database you use, when you modify a record in a database, something like this happens:

  1. The database writes to some sort of cache in memory
  2. If data persistence is needed (and it normally is), the database "lazily" tells the operating system to write the data to the hard disk.  The operating system then determines when the data is actually written to the disk.
  3. If the database determines that it needs to be less lazy and force the disk to flush, it will tell the disk controller to do so.  Many database systems force a flush every so often (say once a second).  It is also common to flush to the disk when transactions are complete or when sensitive data has just been written to the database.  When I say "flushing to disk," I'm talking about calling fsync().  If you're running Windows, I'm talking about writing to a file that was opened with the FILE_FLAG_WRITE_THROUGH flag.
At this point, your data has been given to the disk controller to be written to the hard disk at its convenience.  Some disk controllers have caches, some don't.  And, many hard disks have internal caches.

Now, we all know that it's much more complicated than that.  Databases often queue batches of changes to improve performance, and to ensure data integrity, they also write queued database changes to some sort of append-only log file or journal.  Nevertheless, whether writing data to a file or to a database, you eventually want to write to the disk before telling the client that their data is "safe."

Unfortunately, because of varying levels of caching when flushing data to the disk (as we did in step #3 above), there is still a chance that our data will remain unsafe.

Possible Data Persistence Problems

Problem #1 - If your hard disk drive has a write-back cache, your data is at the mercy of the hard disk.  Most hard disk caches are supported by no more than a small capacitor, and there is often not enough time to write the entire cache to disk in the event of a power failure; therefore, power failure to the hard disk could mean corruption/loss of data.  Typical workstations often have hard disks with write-back caches.  Using hard disks with write-through caches solves this problem, or on some disks, you can disable the cache entirely within the operating system.

Problem #2 - If your hard disk controller has a write-back cache, your data is at the mercy of the hard disk controller.  This problem is similar to #1 above.  Fortunately, some disk controllers have battery-backed write-back caches.  So long as these batteries don't die, your data remains safe, and you get the added performance benefit of a write-back cache.

Problem #3 - If your database doesn't force a flush to disk (using fsync() or other OS-specific methods), then the data can be cached by the operating system and is susceptible to loss.

Problem #4 - None of this really matters if your hard disk drive takes a crap.  In this situation, data loss/corruption is almost guaranteed unless your data is mirrored on other hard disks.

Solutions for each Problem:
  1. Ensure that your disk drives don't use write-back caching policies.  Write-through caching or no caching at all is okay.  Remember, many commodity disk drives use write-back caching.
  2. Ensure that your disk controllers use a write-through cache policy or a battery-backed write-back cache.  Even in the event of a disk controller failure, a well-designed battery-backed write-back cache card can be placed into another disk controller.
  3. Force your database to flush to the disk as often as necessary.  If data persistence is extraordinarily important, you should fsync() to the disk each time the database is modified.  It should be noted that the performance penalty associated with constantly syncing is significant; yet, many databases like SQL Server, PostgreSQL, etc. will flush to the disk quite frequently.  Databases like Redis allow you to tweak how frequently an fsync() should occur.
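As a concrete example of solution #3, Redis exposes this trade-off directly in redis.conf via its append-only file settings (the directives below are real Redis options; the comments are mine):

```
# redis.conf -- controls how often Redis fsync()s its append-only file
appendonly yes        # enable the append-only log (AOF)
appendfsync everysec  # fsync once per second; other options: always, no
```

`always` gives the strongest durability at a significant performance cost; `everysec` bounds your potential loss to about one second of writes; `no` leaves flushing entirely to the operating system.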
  4. Ensure that your data is mirrored across multiple hard disk drives to ensure that the failure of a few hard disk drives won't destroy your data.

Additional Resources:

http://www.postgresql.org/docs/8.3/static/wal-reliability.html - my favorite article on the topic; discusses when PostgreSQL flushes to disk
http://www.humboldt.co.uk/2009/03/fsync-across-platforms.html - compares fsync() and other flushing methods for various operating systems
http://support.microsoft.com/kb/78363 - when SQL Server flushes dirty pages to disk
http://redis.io/topics/persistence - when Redis flushes to disk
http://www.slideshare.net/iammutex/what-every-data-programmer-needs-to-know-about-disks - A slideshow that provides some insight into this problem
