In a lengthy discussion this past week I was reminded that January 2011 is when the hard drive manufacturers agreed to focus on drives with 4K sector sizes. I have read the latest materials about this over the past week and you can too: search for 512e or Advanced Format sector sizes and you will find the same articles I read. I concentrated on articles by Seagate, Western Digital and other manufacturers.
Why am I talking about this on a SQL Server blog? The change has an impact on your SQL Servers. There are two areas you need to be aware of: PERFORMANCE and DATA INTEGRITY.
PERFORMANCE: All the articles outline the performance implications of 512e (512 byte sector size emulation mode). This is important to you because when SQL Server creates a database it makes the Windows API calls to determine the sector size. When 512e is enabled the operating system reports 512 bytes and SQL Server aligns the log file I/O requests on 512 byte boundaries. This means that placing a database on a 512e enabled drive will trigger the drive's RMW (Read-Modify-Write) behavior for log writes, and you could see elongated I/O times when writing log records. This may only be a millisecond or two per write but it can accumulate quickly.
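To make the alignment point concrete, here is a minimal sketch in Python. It is a simplified model, not SQL Server's actual log writer: it assumes each log write is padded to the sector size the operating system reported and appended right after the previous write, and it counts how many of those writes fail to cover whole 4K physical sectors. The names `round_up`, `needs_rmw` and `simulate_log_writes` are made up for the illustration.

```python
PHYSICAL_SECTOR = 4096  # physical sector size of a 4K (Advanced Format) drive


def round_up(nbytes, sector_size):
    """Round a write size up to the next multiple of sector_size."""
    return -(-nbytes // sector_size) * sector_size


def needs_rmw(offset, length, physical=PHYSICAL_SECTOR):
    """True when a write does not start and end on physical sector boundaries."""
    return offset % physical != 0 or length % physical != 0


def simulate_log_writes(record_sizes, reported_sector):
    """Count the log writes that would force a read-modify-write on the drive.

    Assumption: every write is padded to reported_sector and placed
    immediately after the previous write in the log file.
    """
    offset = 0
    rmw_count = 0
    for size in record_sizes:
        length = round_up(size, reported_sector)
        if needs_rmw(offset, length):
            rmw_count += 1
        offset += length
    return rmw_count


if __name__ == "__main__":
    sizes = [300, 700, 5000, 60]  # hypothetical log write sizes in bytes
    print("512e reported:", simulate_log_writes(sizes, 512))
    print("4K reported:  ", simulate_log_writes(sizes, 4096))
```

With the 512 byte sector size reported, every write in this sample lands on a partial physical sector. With 4K reported, none do, at the cost of more padding in the log.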
DATA INTEGRITY: When I point this out I am not indicating that the 4K sector based drives are inherently any better or worse than 512 byte sector drives. In fact, many of the designs for the 4K sector drives allow an enhanced ECC mechanism so in some respects the drives could be considered more resilient to media failure conditions than the 512 byte sector formats.
What I am warning about is the Read-Modify-Write behavior that takes place under the 512e mode. When SQL Server thinks the drive is handling 512 byte sectors, the log I/O is aligned on 512 byte boundaries, so a partial 4K write could be encountered at the drive level. Some specifications say that the drive may bundle these writes until the 4K sector is filled before flushing to the platter media; others are not so detailed in their information. If the drive holds the 512 byte write in disk cache (not battery backed) but reports the write complete to SQL Server, SQL Server can flush the data page because it thinks it has met the WAL (write-ahead logging) protocol requirement of writing the log record before the data page. If a crash occurs at this point and the disk cache does not have time to flush, you have missing log records that recovery won't know about.
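The sequence is easier to see as a toy simulation. This is only a sketch under stated assumptions: `VolatileCacheDrive` is a hypothetical model of a drive that acknowledges a write while still holding it in non-battery-backed cache, and the two strings stand in for a log record and a data page. It does not model any real drive firmware or SQL Server internals.

```python
class VolatileCacheDrive:
    """Toy drive that reports writes complete even when they sit in volatile cache."""

    def __init__(self):
        self.cache = []  # acknowledged, but not yet on the platter
        self.media = []  # durable contents

    def write(self, item, held_in_cache):
        # The drive reports success in both cases; the caller cannot tell them apart.
        (self.cache if held_in_cache else self.media).append(item)
        return True

    def power_loss(self):
        # Anything still in the volatile cache is gone.
        self.cache.clear()


def crash_scenario():
    drive = VolatileCacheDrive()
    # The partial 4K log write is held in cache but acknowledged.
    log_acked = drive.write("log record for page 1", held_in_cache=True)
    # WAL appears satisfied, so the data page is written and reaches the media.
    if log_acked:
        drive.write("data page 1", held_in_cache=False)
    drive.power_loss()
    return drive.media


if __name__ == "__main__":
    print(crash_scenario())
```

After the simulated power loss the media holds the data page but not the log record that describes the change, which is exactly the state recovery cannot detect.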
SNIPPETS
Here are a few snippets from the articles I read.
A drawback to the current r/m/w operation is that a power loss during the r/m/w operation can cause unrecoverable data loss. This possibility occurs during every r/m/w operation, at the point where the two part-modified sectors at the start and end of the logical blocks (i.e., the "boundary" sectors) are being written to the media.
In modern computing applications, data such as documents, pictures and video streams are much larger than 512 bytes. Therefore, hard drives can store these write requests in cache until there are enough sequential 512-byte blocks to build a 4K sector.
Read-Modify-Write Prevention
As described above, a read-modify-write condition occurs when the hard drive is issued a write command for a block of data that is smaller, or misaligned, to the 4K sectors. These write requests are called runts since they result in a request smaller than 4K. There are two primary root causes for runts in 512-byte emulation.
1. Write requests that are misaligned because of logical to physical partition misalignment
2. Write requests smaller than 4K in size
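The two root causes in that snippet can be expressed as a small check. This is an illustrative sketch only: `classify_write` and its parameters are hypothetical names, offsets are in bytes, and the partition start is added to the write offset to get the position on the physical drive.

```python
def classify_write(offset_in_partition, length, partition_start=0, physical=4096):
    """Return the reasons (if any) a write would be a runt on a 4K sector drive."""
    reasons = []
    absolute_offset = partition_start + offset_in_partition
    # Cause 1: the write does not start on a physical sector boundary,
    # for example because the partition itself starts at a misaligned offset.
    if absolute_offset % physical != 0:
        reasons.append("misaligned")
    # Cause 2: the write is smaller than one physical sector.
    if length < physical:
        reasons.append("smaller than 4K")
    return reasons


if __name__ == "__main__":
    print(classify_write(0, 4096))                            # aligned, full sector
    print(classify_write(0, 512))                             # small write
    print(classify_write(0, 4096, partition_start=63 * 512))  # misaligned partition
```

A partition that starts at sector 63 (a common legacy layout) makes every write misaligned, even when the application issues perfectly sized 4K requests.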
RECOMMENDATION
For SQL Server the best recommendation is to work with the hardware manufacturer to make sure the 512e mode is disabled on drives that hold the SQL Server database and log files and that the Windows API is reporting 4K sector sizes. SQL Server will then align the log writes on 4K boundaries and avoid the emulation behavior.
Moving Databases
SQL Server stores the initial sector size with the database metadata and may prevent you from attaching or restoring the database to a drive of different sector size. Going from a 4K to a 512 byte drive can lead to torn write behavior. Going from a 512 byte to a 4K drive can lead to Read-Modify-Write behavior.
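The two directions can be summarized in a few lines. This is just a sketch of the rule stated above, not a SQL Server API: `move_risk` is a hypothetical helper that compares the sector size stored with the database against the sector size of the target drive.

```python
def move_risk(stored_sector_size, target_sector_size):
    """Describe the risk of moving a database between drives of different sector sizes."""
    if stored_sector_size == target_sector_size:
        return "none"
    if stored_sector_size > target_sector_size:
        # e.g. 4K to 512: one log write spans several smaller sectors
        return "torn writes"
    # e.g. 512 to 4K: writes can cover only part of a physical sector
    return "read-modify-write"


if __name__ == "__main__":
    for stored, target in [(4096, 512), (512, 4096), (4096, 4096)]:
        print(stored, "->", target, ":", move_risk(stored, target))
```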
Bob Dorr - Principal SQL Server Escalation Engineer