FinTech: Optimize the Data Pipeline to Process More Ticks with ScaleFlux SSDs

  • JB Baker 
  • 4 min read

Utilize hardware-based compression to alleviate performance, cost & power challenges

So you need to process ever-expanding tick data volumes. Data access needs to be fast, which pushes you to SSDs to meet the IO demands. The data set is growing from single TBs to 10s or 100s of TBs, adding pressure on the cost, power, and physical space required for storage. Compressing the data would make sense to offset the physical space, cost per effective TB, and power… but data compression in the CPU can introduce latency, cause hiccups in analytics application performance, and end up increasing total power consumption.

Enter the CSD 3000, an NVMe SSD designed specifically for demanding workloads that integrates a unique capability to compress (and decompress) data in the drive, resulting in lower latency, higher IO, and better performance per watt: a win-win-win situation.

Host CPUs and GPUs are much better at analyzing data than compressing it to save space, so why burden them with software compression in KDB+? Don’t rob your host CPUs of application processing cycles when you can offload compression/decompression to an advanced NVMe SSD with on-board processors. Specialized processors (aka “domain-specific compute”) are already deployed in various use cases: AI cycles go to GPUs, TCP traffic to SmartNICs, and video codecs to transcoders. So why not use an SSD with built-in compression/decompression processors that can give you 4x capacity, 9x endurance, and 2x+ performance over ordinary enterprise NVMe SSDs?

With hardware-based compression in the drives, you can improve efficiency, TCO, and application responsiveness by:

  • Analyzing more data in the same server footprint
  • Utilizing compression to reduce storage costs and improve latency & IO (instead of trading off between storage costs and latency & IO penalties)
  • Extending the lifespan of flash storage to match server refresh cycles
  • Avoiding installing any new drivers or software since the drives use standard NVMe drivers & commands
  • Scaling compression throughput with each drive you install instead of over-buying CPU cores to handle the potential future workload

ScaleFlux SSDs’ in-line compression enables you to analyze more data in the same footprint, reducing server and storage sprawl, all without adding complexity. ScaleFlux SSDs eliminate the need to use the host CPU for compression/decompression, so you get more analytics capability without having to add more servers or faster CPUs.

The CSD’s compression function is transparent to the application. That means there is nothing you need to do to trigger the “compress on write” or “decompress on read” functions. It also means zero application changes. So, less downtime with no risk.

The CSD’s compression engines operate at line rate, enabling sustained write speeds up to 6.2 GB/s. Because the data is compressed, the drive programs fewer NAND cells to store it initially and maintains more free space, which results in fewer background write operations from garbage collection (that is, lower “write amplification”). All this translates to 2x or more the I/O performance, consistently lower latency, and up to 9x higher endurance in comparison to other enterprise NVMe SSDs.
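To make the endurance reasoning concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a deliberately simplified model in which NAND bytes written equal host bytes divided by the compression ratio, multiplied by the write amplification factor. The function names and the example numbers are illustrative assumptions, not ScaleFlux measurements or a description of the drive's internals.

```python
def nand_bytes_written(host_bytes: float, compression_ratio: float,
                       write_amplification: float) -> float:
    """Simplified model: bytes physically programmed to NAND for a given host write volume."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    if write_amplification < 1:
        raise ValueError("write_amplification cannot be below 1 in this model")
    return host_bytes / compression_ratio * write_amplification


def endurance_gain(compression_ratio: float, waf_without: float, waf_with: float) -> float:
    """How many times more host data can be written before the same NAND wear is reached."""
    baseline = nand_bytes_written(1.0, 1.0, waf_without)
    compressed = nand_bytes_written(1.0, compression_ratio, waf_with)
    return baseline / compressed


if __name__ == "__main__":
    # Illustrative values only: 2:1 compression, and write amplification
    # dropping from 3.0 to 1.5 because the drive keeps more free space.
    print(endurance_gain(compression_ratio=2.0, waf_without=3.0, waf_with=1.5))  # 4.0
```

In this model the two effects multiply: halving the data and halving the write amplification together give a 4x reduction in NAND writes. Real drives will deviate from it depending on workload and fill level.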

Tick data can be very compressible, with reported compression ratios upwards of 4:1. Packet capture data is also highly compressible, consistently measuring from 2:1 to 2.1:1. To be sure about your own data, contact ScaleFlux for access to our tools that let you evaluate your data’s compressibility. You can also schedule an evaluation to try ScaleFlux SSDs for yourself. Installation is as simple as installing any ordinary NVMe SSD in either a U.2 or U.3 slot. Evaluation devices are available in 4 TB, 8 TB, and 16 TB capacities.
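For a quick first look before a formal evaluation, a software compressor can give a rough indication of how compressible a data sample is. The sketch below is not the ScaleFlux tool. It uses Python's standard zlib as a stand-in and compresses the data in independent 4 KiB chunks, on the assumption that a drive compresses small blocks rather than whole files. Both the chunk size and the compressor are assumptions, so the ratio a drive actually achieves may differ.

```python
import os
import zlib


def estimate_compression_ratio(data: bytes, chunk_size: int = 4096, level: int = 6) -> float:
    """Return original size / compressed size, compressing each chunk independently."""
    if not data:
        raise ValueError("data must not be empty")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    compressed_total = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Assumption: an incompressible chunk is stored raw, so it never
        # costs more than its own size.
        compressed_total += min(len(zlib.compress(chunk, level)), len(chunk))
    return len(data) / compressed_total


if __name__ == "__main__":
    # Synthetic, made-up tick-like records (timestamp, symbol, price, size).
    ticks = b"".join(
        b"2024-01-02T09:30:00.%06d,ACME,%d.%02d,%d\n"
        % (i, 100 + i % 3, i % 100, 100 * (1 + i % 5))
        for i in range(20000)
    )
    print("synthetic ticks: %.2f:1" % estimate_compression_ratio(ticks))
    print("random bytes:    %.2f:1" % estimate_compression_ratio(os.urandom(1 << 20)))
```

To try it on real data, read a representative sample of a tick file into `data` instead of using the synthetic records.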

Using CSDs in your tick analytics systems can get you:

  • 2x higher QPS
  • 0x lower 99.9% latency*
  • 2x lower $/TBe
  • 9x SSD endurance
  • No new drivers
  • No application modifications

*Write latency in a mixed 70% read / 30% write random workload
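The “$/TBe” figure (cost per effective terabyte) follows from simple arithmetic: effective capacity is raw capacity multiplied by the compression ratio your data achieves. A minimal Python sketch, with a made-up drive price and an assumed 2:1 ratio used purely for illustration:

```python
def cost_per_effective_tb(drive_price: float, raw_capacity_tb: float,
                          compression_ratio: float) -> float:
    """Price divided by effective capacity (raw capacity x compression ratio)."""
    if raw_capacity_tb <= 0 or compression_ratio <= 0:
        raise ValueError("capacity and compression ratio must be positive")
    effective_tb = raw_capacity_tb * compression_ratio
    return drive_price / effective_tb


if __name__ == "__main__":
    price = 1000.0  # hypothetical price, not a ScaleFlux list price
    print(cost_per_effective_tb(price, 8.0, 1.0))  # no compression: 125.0 per TB
    print(cost_per_effective_tb(price, 8.0, 2.0))  # assumed 2:1 ratio: 62.5 per TBe
```

This ignores any limit a drive places on how much logical capacity it exposes, so treat the result as an upper bound on the saving.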

JB Baker

JB Baker is a successful technology business leader with a 20+ year track record of driving top and bottom line growth through new products for enterprise and data center storage. He joined ScaleFlux in 2018 to lead Product Planning & Marketing as we expand the capabilities of Computational Storage and its adoption in the marketplace.