We recently had a customer who wanted to secure a very large amount of unstructured data with our security platform. Terabytes and terabytes of data. This made us realize that we should have a specific blog post on CipherStor performance. Our highest performing plan is our Enterprise Plan, with the Large Plan closely trailing it.
We’ve captured numbers off our test machine, which is a 4th Gen quad-core Core i7-4870HQ @ 2.5 GHz (hardware AES support) with 16 GB RAM and Solid State Drives. The processor is hyper-threaded, meaning 8 logical cores. The OS was Windows 10 inside a VM provisioned with 4 logical cores and 8 GB RAM. All tests use AES256 in GCM mode. To truly test the performance of CipherStor, we had to ensure that network and disk IO were out of the picture, so these tests were conducted in-memory.
So what do the raw performance numbers look like?
Multi-threaded performance
This fires up 4 threads (actually 4 C# Tasks) to keep all 4 logical cores busy. CPU utilization was about 65%, not surprising since the AES in our higher tier plans is hardware accelerated.
Iteration | MBytes/sec |
1 | 2600 |
2 | 2599 |
3 | 2569 |
Average | 2589 |
As you can see, with just 4 logical cores on a laptop, we’re getting almost 2.6 Gigabytes/sec (GBps, not Gbps or Gigabits/sec) of throughput. This is blazing fast, and throughput scales as you add more cores to your application. If all 8 logical cores were available, we estimate about 3.7 Gigabytes/sec of raw performance.
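For readers who want to reproduce this style of measurement, here is a minimal sketch of an in-memory, multi-worker throughput test. It is not the CipherStor benchmark: the `process_buffer` workload is a stand-in (a SHA-256 hash from Python's standard library) because the CipherStor API is not shown in this post, and the function names, buffer sizes, and the choice of 1 MByte = 1,000,000 bytes are all assumptions made for illustration.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def process_buffer(buf: bytes) -> int:
    """Stand-in workload (assumption): hash the buffer instead of encrypting it."""
    hashlib.sha256(buf).digest()
    return len(buf)


def measure_throughput(workers: int, buf_size: int, rounds: int) -> float:
    """Return MBytes/sec achieved by `workers` threads over in-memory buffers."""
    buf = bytes(buf_size)  # zero-filled buffer held in RAM: no disk or network IO
    jobs = [buf] * (workers * rounds)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total_bytes = sum(pool.map(process_buffer, jobs))
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1_000_000  # assumes 1 MByte = 10**6 bytes


if __name__ == "__main__":
    # Three iterations with 4 workers, mirroring the layout of the table above.
    for iteration in range(1, 4):
        rate = measure_throughput(workers=4, buf_size=4 * 1024 * 1024, rounds=4)
        print(iteration, round(rate))
```

The numbers this prints depend entirely on the stand-in workload and the machine, so they are not comparable to the figures in the table; the point is only the shape of the test: data held in memory, one worker per logical core, and wall-clock timing around the whole batch.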
Next up, a very simple, single-threaded case.
Single-threaded performance
A single thread doing bulk stream encryption kept the processor about 30% busy.
Iteration | MBytes/sec |
1 | 1189 |
2 | 1194 |
3 | 1151 |
Average | 1178 |
Yet we’re still seeing over 1.1 Gigabytes/sec even in this simple case.
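To put the two tables side by side, the short calculation below recomputes both averages and the resulting speedup from the figures reported above. The "efficiency" line divides by the 4 logical cores provisioned to the VM; that is our own framing for illustration, not a metric from the original tests.

```python
# Figures copied from the two tables above (MBytes/sec).
multi_threaded = [2600, 2599, 2569]
single_threaded = [1189, 1194, 1151]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


multi_avg = mean(multi_threaded)
single_avg = mean(single_threaded)

# How much faster 4 workers are than 1, and how much of the ideal 4x that is.
speedup = multi_avg / single_avg
efficiency = speedup / 4  # 4 logical cores in the test VM

print(round(multi_avg), round(single_avg), round(speedup, 2), round(efficiency, 2))
```

Going from one thread to four gives roughly a 2.2x gain rather than 4x. Sub-linear scaling like this is not unusual when logical cores share physical cores, though we did not isolate the cause here.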
Intel vs AMD
We also ran tests on an AMD FX-8970E @ 3.1 GHz desktop processor. It clocked in at a constant 750 Megabytes/sec regardless of how many hardware or software threads we threw at it. So while 750MBytes/sec is fast enough to handle most applications, we were still surprised to see AMD’s desktop performance trail Intel’s laptop performance by that much, especially when it’s got 125 Watts of power to spare as compared to the 45 Watts available to the Intel processor. Still, we should point out that the AMD processor is less than half the price of the Intel. Given the price-power-performance results, we recommend Intel processors for their security performance.
System design
So with about 3 GBytes/sec of AES256 performance on a single physical processor, you’d be tempted to think you could drive over 1,000 4K UHD streams at 20Mbit/sec each (Netflix does 15Mbit/sec). After all, 3 Gigabytes/sec = 24 Gigabits/sec, which divided by 20Mbit/sec is 1,200 streams. That's quite a bit! But with the performance seen above, the bottlenecks in a practical application will most likely be other systems, like the SSD IO or network interface rate. So as a system designer or architect, you should be mindful of that.
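The stream arithmetic, and the effect of a slower component in the chain, can be sketched as follows. The `streams_supported` helper and the 10 Gbit/sec network link are hypothetical values chosen for illustration; only the 3 GBytes/sec, 20Mbit/sec and 15Mbit/sec figures come from the text above.

```python
def streams_supported(crypto_gbytes_per_sec, link_gbits_per_sec, stream_mbits_per_sec):
    """Number of whole streams the slowest stage can carry.

    crypto_gbytes_per_sec: encryption throughput in Gigabytes/sec
    link_gbits_per_sec:    network (or disk) rate in Gigabits/sec
    stream_mbits_per_sec:  bitrate of one stream in Megabits/sec
    """
    crypto_gbits = crypto_gbytes_per_sec * 8           # bytes -> bits
    bottleneck_gbits = min(crypto_gbits, link_gbits_per_sec)
    return int(bottleneck_gbits * 1000 // stream_mbits_per_sec)


# Encryption alone, ignoring every other component:
print(streams_supported(3, float("inf"), 20))  # 1200
print(streams_supported(3, float("inf"), 15))  # 1600

# Same encryption rate behind a hypothetical 10 Gbit/sec network interface:
print(streams_supported(3, 10, 20))            # 500
```

With the hypothetical 10 Gbit/sec link, the network rather than the encryption sets the limit, which is the design point made above.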
Upcoming improvements
Our Large and Enterprise customers already have the above performance available. Our Developer, Startup and Medium plans don’t currently support hardware acceleration or multi-threading. But the good news is that in our next CipherStor SDK release (after 2015.11.xx), we’ll be enabling the multi-threading feature for ALL customers, including those on the free Developer plan.
If you have any questions, just reach out to us. We’re here to help!