Background: I have two problems with my computer that I'd like to solve.
The CPU is a pain point. It's a quad-core i7, but it spends a lot of time at 100%. (Test suites, transcoding media, crunching weather models, etc.) I would upgrade, but none of the newer laptops are materially faster. I've been holding out for a breakthrough of some kind, and I got one when the new Mac Pros were announced. That'll fix the CPU problem nicely.
Storage is the other pain point. I have a 256 GB SSD, which is stupidly fast and holds my typical working set, but I pretty frequently reference piles of data that won't fit onboard. Photo editing, for example: the Lightroom catalog and thumbnails live on my internal SSD, while the original RAWs are pulled over the network from my NAS as needed. This works, but it takes about as long to copy a file as it does to render it, with Gigabit Ethernet being the bottleneck. What's more, I'll still be dealing with this even if I get a Mac Pro down the line, since its PCIe SSD won't be especially large either.
I figured I had to wait on getting more CPU, but maybe I could do something about storage now. Specifically, I wanted:
- Fast streaming I/O. It should be significantly faster than Gigabit Ethernet, i.e. well over 100 MB/s reading or writing large files.
- High capacity. It should be big enough that I don't have to worry about exactly how big it is. Some of the scientific datasets I use today are in the 5 TB range; let's shoot for double that.
- Redundant. So far this sounds like I'll be using hard drives, and hard drives die. I don't want total data loss when that happens.
- Luggable. I want to be able to pack up and move my work station in five minutes, storage included.
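For anyone who wants to check the streaming I/O requirement on their own hardware, here is a minimal sketch of a sequential write test. The function name, the 64 MB default size, and the 1 MB chunk size are all assumptions made for illustration. It only measures large sequential writes and says nothing about reads or random I/O.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mb=64, chunk_mb=1):
    """Write total_mb of data sequentially into `directory`; return MB/s.

    Illustrative helper, not a rigorous benchmark. MB means 10**6 bytes.
    """
    # Random bytes so that a compressing layer cannot inflate the result.
    chunk = os.urandom(chunk_mb * 1_000_000)
    chunks = total_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            # Ask the OS to push the data to the device before stopping the clock.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    # Replace the temp directory with the mount point of the volume under test.
    rate = measure_write_throughput(tempfile.gettempdir())
    print(f"sequential write: {rate:.0f} MB/s")
```

A small test file mostly measures caches, so a real run should write several gigabytes. Even then, treat the number as a ballpark figure.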
I settled on a Drobo 5D. It's a 5-bay direct-attached disk subsystem offering Thunderbolt (10 Gbps) and USB 3 (5 Gbps) host interfaces. I figured five hard drives over Thunderbolt would let it firmly outperform Gigabit Ethernet, and I was right.
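To put those interface numbers in the same units as the 100 MB/s target, here is a rough back-of-the-envelope sketch. It converts raw line rates to MB/s (ignoring protocol and encoding overhead, so real throughput will be lower) and estimates how long a 5 TB dataset takes to move at a given sustained rate. The function names are made up for illustration.

```python
def line_rate_mb_per_s(gbps):
    """Raw line rate in MB/s: 1 Gbps = 1000 Mbps, 8 bits per byte."""
    return gbps * 1000 / 8


def transfer_hours(size_tb, mb_per_s):
    """Hours needed to move size_tb terabytes at a sustained mb_per_s."""
    return size_tb * 1_000_000 / mb_per_s / 3600


# Line rates for the three interfaces mentioned in this post.
links = {"Gigabit Ethernet": 1, "USB 3": 5, "Thunderbolt": 10}

for name, gbps in links.items():
    print(f"{name}: {line_rate_mb_per_s(gbps):.0f} MB/s raw")

# Sustained rates to compare: roughly Gigabit-limited vs. twice that.
for rate in (100, 200):
    print(f"5 TB at {rate} MB/s: {transfer_hours(5, rate):.1f} hours")
```

The takeaway is that Gigabit Ethernet tops out at 125 MB/s before overhead, while either host interface on the Drobo leaves plenty of headroom above what five spinning drives can deliver.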
I filled it with 3 TB WD Red drives. They're not the fastest drives (spinning at about 5400 RPM), but they ought to be fast enough, they're built for RAID enclosures, they don't produce a lot of heat, and so on. I purchased the drives on Amazon because about half of the WD Reds I've ordered from NewEgg were dead on arrival.
Drobo uses hybrid RAID levels and defaults to the one resembling RAID-5, meaning these five 3 TB drives yield 12 TB of usable capacity. This configuration means I could lose one drive without data loss*. Single-drive redundancy is acceptable for my use case.
* This assumes that every bit on every remaining disk can be read back successfully, which is often not true with large disk arrays. RAID-6 is usually more appropriate since it allows one to recover from a lost drive even if unreadable sectors are found during the rebuild.
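The arithmetic behind the 12 TB figure and the footnote's worry can be sketched as follows. This is a simplified model under loud assumptions: it treats the array as plain single-parity RAID (Drobo's own scheme may behave differently), it assumes an unrecoverable read error rate of 1 per 10^14 bits (a commonly quoted figure for consumer drives, not something measured here), and it treats every bit as failing independently, which tends to be pessimistic.

```python
import math


def single_parity_usable_tb(drive_count, drive_tb):
    """Usable capacity when one drive's worth of space holds parity."""
    return (drive_count - 1) * drive_tb


def rebuild_read_error_probability(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of at least one unreadable bit while reading every surviving drive.

    ure_per_bit is an ASSUMED error rate; errors are modeled as independent.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


print(single_parity_usable_tb(5, 3), "TB usable")
p = rebuild_read_error_probability(surviving_drives=4, drive_tb=3)
print(f"chance of a read error during rebuild: {p:.0%}")
```

Under those assumptions, a rebuild that has to read four full 3 TB drives is more likely than not to hit at least one unreadable sector. In practice the drives are rarely full and real error rates are often better than the spec sheet, so read this as an upper-end estimate.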
In addition to the hard drives, the Drobo 5D has an mSATA port that accepts an SSD. It can use this SSD for write-back and read-through caching, which sounded like a good idea, so I got one. And... yep, it was. Random writes are as fast as streaming writes since they're going to the Drobo's SSD. Read caching is most noticeable when accessing filesystem metadata, but I'm sure I've been using it on regular data too without even knowing.
I'm pleased with the Drobo 5D.
That said, there is a catch: the Drobo 5D is not particularly smart. Some computations – RAID parity, I'm guessing – are done in software on the host CPU by the Drobo device driver. (Remember how I said this was already a pain point?) The Drobo driver just about fully consumes one of my CPU cores when writing at 200+ MB/s.
I guess that's okay. (Anything that's writing that fast can't be using all of my CPU, right?) Still, I wish I had known that in advance, instead of wondering "why is my computer so hot?!" as I was initially copying over data.