Testing SATA Express And Why We Need Faster SSDs
by Kristian Vättö on March 13, 2014 7:00 AM EST
Posted in: Storage, SSDs, Asus, SATA, SATA Express
During the hard drive era, the Serial ATA International Organization (SATA-IO) had no problems keeping up with the bandwidth requirements. The performance increases that new hard drives provided were always quite moderate because ultimately the speed of the hard drive was limited by its platter density and spindle speed. Given that increasing the spindle speed wasn't really a viable option for mainstream drives due to power and noise issues, increasing the platter density was left as the only source of performance improvement. Increasing density is always a tough job and it's rare that we see any sudden breakthroughs, which is why density increases have only given us small speed bumps every once in a while. Even most of today's hard drives can't fully saturate the SATA 1.5Gbps link, so it's obvious that the SATA-IO didn't have much to worry about. However, that all changed when SSDs stepped into the game.
SSDs no longer relied on rotational media for storage but used NAND, a form of non-volatile storage, instead. With NAND the performance was no longer dictated by the laws of rotational physics because we were dealing with all solid-state storage, which introduced dramatically lower latencies and opened the door for much higher throughputs, putting pressure on SATA-IO to increase the interface bandwidth. To illustrate how fast NAND really is, let's do a little calculation.
It takes 115 microseconds to read 16KB (one page) from IMFT's 20nm 128Gbit NAND. That works out to be roughly 140MB/s of throughput per die. In a 256GB SSD you would have sixteen of these, which works out to over 2.2GB/s. That's about four times the maximum bandwidth of SATA 6Gbps. This is all theoretical of course—it's one thing to dump data into a register but transferring it over an interface requires more work. However, the NAND interfaces have also caught up in the last couple of years and we are now looking at up to 400MB/s per channel (both ONFI 3.x and Toggle-Mode 2.0). With most client platforms being 8-channel designs, the potential NAND-to-controller bandwidth is up to 3.2GB/s, meaning it's no longer a bottleneck.
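The arithmetic above can be reproduced with a short sketch. The constants come from the paragraph itself, with two assumptions: "16KB" is read as 16 x 1024 bytes, and ~560MB/s is used as the effective SATA 6Gbps rate for the comparison. The function and variable names are only illustrative.

```python
# Figures quoted in the article, plus two labeled assumptions.
PAGE_BYTES = 16 * 1024   # assumption: "16KB" interpreted as 16 x 1024 bytes
PAGE_READ_S = 115e-6     # 115 microseconds to read one page
DIE_GBIT = 128           # capacity of one NAND die, in gigabits
DRIVE_GB = 256           # capacity of the example SSD, in gigabytes
CHANNEL_MBPS = 400       # per-channel rate (ONFI 3.x / Toggle-Mode 2.0)
CHANNELS = 8             # typical client controller design
SATA_6G_MBPS = 560       # assumption: effective SATA 6Gbps data rate


def die_read_mbps(page_bytes: int, read_s: float) -> float:
    """Throughput of one die reading pages back to back, in MB/s."""
    return page_bytes / read_s / 1e6


dies = DRIVE_GB * 8 // DIE_GBIT            # gigabytes -> gigabits, then per die
per_die = die_read_mbps(PAGE_BYTES, PAGE_READ_S)
aggregate = per_die * dies                 # all dies reading in parallel
interface_limit = CHANNEL_MBPS * CHANNELS  # NAND-to-controller ceiling

print(f"dies in drive:        {dies}")
print(f"per-die read:         {per_die:.0f} MB/s")
print(f"aggregate NAND read:  {aggregate / 1000:.2f} GB/s")
print(f"vs. SATA 6Gbps:       {aggregate / SATA_6G_MBPS:.1f}x")
print(f"NAND interface limit: {interface_limit / 1000:.1f} GB/s")
```

This is a theoretical ceiling that assumes every die reads in parallel with no controller or interface overhead, which is the same caveat the paragraph makes.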
Given the speed of NAND, it's not a surprise that the SATA interface quickly became a bottleneck. When Intel finally integrated SATA 6Gbps into its chipsets in early 2011, SandForce immediately came out with its SF-2000 series controllers and said, "Hey, we are already maxing out SATA 6Gbps; give us something faster!" The SATA-IO went back to the drawing board and realized that upping the SATA interface to 12Gbps would require several years of development and the cost of such rapid development would end up being very high. Another major issue was power; increasing the SATA protocol to 12Gbps would have meant a noticeable increase in power consumption, which is never good.
Therefore the SATA-IO had to look elsewhere in order to provide a fast yet cost-efficient standard in a timely manner. Due to these restrictions, it was best to look at already existing interfaces, more specifically PCI Express, to speed up time to market as well as cut costs.
Interface | Serial ATA 2.0 | Serial ATA 3.0 | PCI Express 2.0 | PCI Express 3.0
Link Speed | 3Gbps | 6Gbps | 8Gbps (x2), 16Gbps (x4) | 16Gbps (x2), 32Gbps (x4)
Effective Data Rate | ~275MBps | ~560MBps | ~780MBps (x2), ~1560MBps (x4) | ~1560MBps (x2), ~3120MBps (x4) (?)
PCI Express makes a ton of sense. It's already integrated into all major platforms and thanks to scalability it offers the room for future bandwidth increases when needed. In fact, PCIe is already widely used in the high-end enterprise SSD market because the SATA/SAS interface was never enough to satisfy the enterprise performance needs in the first place.
Even a PCIe 2.0 x2 link offers about a 40% increase in maximum throughput over SATA 6Gbps. Like most interfaces, PCIe 2.0 isn't 100% efficient: based on our internal tests the bandwidth efficiency is around 78-79%, so in the real world you should expect to get ~780MB/s out of a PCIe 2.0 x2 link. Remember that SATA 6Gbps isn't 100% efficient either (around 515MB/s is the typical maximum we see). The currently available PCIe SSD controller designs are all 2.0 based, but we should start to see some PCIe 3.0 drives next year. We don't have efficiency numbers for 3.0 yet, but I would expect to see nearly twice the bandwidth of 2.0, making 1GB/s and above the norm.
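As a rough sketch of where these numbers come from, the snippet below derives the post-encoding bandwidth of a PCIe link and applies the measured efficiency. The per-lane signaling rates and line encodings (5GT/s with 8b/10b for PCIe 2.0, 8GT/s with 128b/130b for PCIe 3.0) are taken from the PCIe specifications rather than from this article. The 78% efficiency, the ~560MB/s SATA figure and the 515MB/s observed SATA maximum are the article's own numbers, and the helper name is hypothetical.

```python
def post_encoding_mbps(gt_per_s: float, lanes: int,
                       payload_bits: int, total_bits: int) -> float:
    """Link bandwidth in MB/s after line-encoding overhead is removed."""
    bits_per_s = gt_per_s * 1e9 * lanes * payload_bits / total_bits
    return bits_per_s / 8 / 1e6


# PCIe 2.0: 5 GT/s per lane, 8b/10b encoding (from the PCIe spec).
pcie2_x2 = post_encoding_mbps(5.0, 2, 8, 10)
# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding (from the PCIe spec).
pcie3_x2 = post_encoding_mbps(8.0, 2, 128, 130)

OBSERVED_EFFICIENCY = 0.78   # article's internal test result for PCIe 2.0
SATA_6G_TABLE_MBPS = 560     # effective SATA 6Gbps rate from the table
SATA_6G_OBSERVED_MBPS = 515  # typical maximum the article reports seeing

pcie2_x2_real = pcie2_x2 * OBSERVED_EFFICIENCY
gain_vs_table = pcie2_x2_real / SATA_6G_TABLE_MBPS - 1
gain_vs_observed = pcie2_x2_real / SATA_6G_OBSERVED_MBPS - 1
gen3_over_gen2 = pcie3_x2 / pcie2_x2  # before any protocol overhead

print(f"PCIe 2.0 x2 after encoding: {pcie2_x2:.0f} MB/s")
print(f"PCIe 2.0 x2 real world:     {pcie2_x2_real:.0f} MB/s")
print(f"gain over SATA (table):     {gain_vs_table:.0%}")
print(f"gain over SATA (observed):  {gain_vs_observed:.0%}")
print(f"PCIe 3.0 / 2.0 ratio:       {gen3_over_gen2:.2f}x")
```

No efficiency factor is applied to PCIe 3.0 here because, as noted above, there are no measured numbers for it yet. The ratio only shows that "nearly twice" is what the raw link rates suggest.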
But what exactly is SATA Express? Hop on to the next page to read more!
131 Comments
SirKnobsworth - Thursday, March 13, 2014
Thunderbolt 2 is really PCIe x4 + DisplayPort in disguise, and you don't need DisplayPort to your SSD.
MrSpadge - Thursday, March 13, 2014
Couldn't you build a nice M.2 to SATAe adapter in a 2.5" form factor and thereby reuse your existing M.2 designs for SATAe?
Kristian Vättö - Thursday, March 13, 2014
Technically yes, but the problem is that M.2 is shaped differently. You could certainly fit a small M.2 drive with only few NAND packages in there but the longer, faster ones don't really fit inside 2.5".
Kevin G - Thursday, March 13, 2014
"At 24 frames per second, uncompressed 4K video (3840x2160, 12-bit RGB color) requires about 450MB/s of bandwidth, which is still (barely) within the limits of SATA 6Gbps."
This is incorrect:
3840 * 2160 * 12 bit per channel * 3 channels / 8 bits per byte * 24 fps ~ 896 MByte/s
And that figure is with good byte packing. For raw recording, the algorithm may pack the 12 bits into two bytes for speed purposes meaning you'd need about 1.2 Gbyte/s of bandwidth. Jumping to 4096 x 2160 resolution at 12 bit color and 30 fps, the bandwidth need grows to about 1.6 Gbyte/s.
The other thing worth noting is that uncompressed recording is going to take a lot of storage. A modern phone recording at the highest quality settings with 64 GB of storage would last less than 40 seconds before running out.
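The figures in the comment above can be checked with a small sketch. It assumes decimal units (1 MB = 10^6 bytes, 64GB = 64 x 10^9 bytes) and three color channels with no alpha. The function name is only illustrative.

```python
def video_mbps(width: int, height: int, fps: int,
               bytes_per_channel: float, channels: int = 3) -> float:
    """Uncompressed video data rate in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_frame = width * height * channels * bytes_per_channel
    return bytes_per_frame * fps / 1e6


# 12 bits per channel, tightly packed: 12 / 8 = 1.5 bytes per channel.
uhd_packed = video_mbps(3840, 2160, 24, 12 / 8)
# 12 bits per channel padded out to two bytes for faster processing.
uhd_padded = video_mbps(3840, 2160, 24, 2)
# 4096 x 2160 at 30 fps, again padded to two bytes per channel.
dci_padded = video_mbps(4096, 2160, 30, 2)

# Recording time on a nominal 64GB device at the highest rate above.
seconds_on_64gb = 64e9 / (dci_padded * 1e6)

print(f"3840x2160 @ 24 fps, packed: {uhd_packed:.0f} MB/s")
print(f"3840x2160 @ 24 fps, padded: {uhd_padded / 1000:.2f} GB/s")
print(f"4096x2160 @ 30 fps, padded: {dci_padded / 1000:.2f} GB/s")
print(f"64GB fills in roughly {seconds_on_64gb:.0f} s")
```

The nominal 64GB works out to roughly 40 seconds; usable capacity on a real device is lower, so the actual recording time would be somewhat shorter.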
Kristian Vättö - Thursday, March 13, 2014
Oh, you're absolutely right. I used the below calculator to calculate the bandwidth but accidentally left "interlaced" box ticked, which screwed up the results. Thanks for the heads up, fixing...
Kristian Vättö - Thursday, March 13, 2014
And the calculator... http://web.forret.com/tools/video_fps.asp?width=38...
JarredWalton - Thursday, March 13, 2014
Aren't there *four* channels, though? RGB and Alpha? Or is Alpha not used with 12-bit?
Kevin G - Thursday, March 13, 2014
No real way to record with an Alpha channel value to my knowledge. Cameras and scanners etc all presume a flattened image as if everything were solid. The only exception to this would be direct frame buffer capture from video memory which can independently process an Alpha channel.
Input media would generally be 36 bit. During the editing phase an Alpha channel can be added as part of compositing pipeline bringing the total bit depth to 48 bit. Final rendering can be done to a 48 bit RGBA file. Display output on screen will be reduced to 36 bit due to compositing for the frame buffer.
Nightraptor - Thursday, March 13, 2014
When I saw the daughterboard Asus provided my instant thought was actually using this (in pcie 3.0 form) to somehow provide the option to add an external GPU to a tablet. I may be the outlier, but my dream would be to have an 11.6" 16:10 1920 x 1200 tablet with the ability to connect a keyboard dock to function as a laptop, or another dock with a discrete graphics card to function as a desktop for occasional gaming (1080p at high settings would be all I'd ask for - so pcie 3.0 4x should be sufficient). If you could somehow get a SATAe cable on a tablet I think this would do it.
vladman - Thursday, March 13, 2014
If you want speed from storage, get a nice Areca PCIe RAID controller, attach 4 or more fast SSDs, do RAID 0, and you've got anywhere from 1.7 to 2GB/s of data transfer. Done deal.