
Maximizing IT Infrastructure Efficiency: Where to Compress (Part 3)

  • JB Baker 
  • 6 min read

In part 1, we began navigating the depths of how compression can alleviate the pains of growing demands on IT infrastructure, with an overview of compression algorithms and granularity tradeoffs. Then, in part 2, we moved on to the costs and benefits of hardware vs. software compression. Now, as we reach the last stage of the journey, we’ll tackle the final questions – Where should you compress? How do you handle encryption? And how do you manage effective capacity once compression is in use?

There are a number of options for where (or when, depending on your perspective) to perform the compression/decompression functions.

  • As discussed in the previous section, the choice to use SW compression on the CPU is bigger than simply asking yourself, “do I have spare CPU cycles?” Using the CPU creates tradeoffs in performance, scalability, and power as well.
  • For those using CPUs for compression, the common compromise is a lightweight algorithm, such as LZ4 or Snappy, which sacrifices space savings to keep the burden on the CPU lighter. This may be acceptable for deployments with extremely light workloads or minimal writes. But if your deployment has lots of “free CPU cycles,” you may need to reassess the configuration overall: those cycles may be available, but they didn’t come for free! You paid a pretty penny for those high-performance processors and for the memory, power supplies, and all the other components and software licenses that come with them.
  • I/O alignment further erodes the space savings achieved by SW compression. If the application or the disk requires a 4K or 16K minimum write, then a significant part of the CPU’s compression effort can be wasted. For example, a 16K chunk may compress to 9K. However, to align with the 4K disk I/O, the system or the drive will have to “bloat” the 9K back up to 12K to create three I/Os that are each 4K. In databases with 8K or 16K I/O alignment, this “bloat” becomes very pronounced.
  • Compressing in the CPU puts the task of managing the metadata about each compression block onto the filesystem. Or, if there is no filesystem, then the CPU and host memory take on the overhead of managing the metadata.
  • Utilizing an accelerator card removes the burden of the compression function from the CPU. Compression/decompression accelerator cards use hardware state-machines (in SoC, ASIC, or FPGA) to perform the compression function at high speed with very low latency. An accelerator “card” may be deployed as any PCIe form factor and does not have to be a traditional “add-in card” form factor. The accelerators can be designed into U.2, U.3, E3 or E1 form factors as well (each comes with its own tradeoffs on potential performance!).
  • Using an accelerator card to offload the compression will require some integration with the filesystem and/or installation of vendor-unique software to direct the data through the accelerator card and to manage the metadata.
  • A dedicated compression accelerator card consumes additional space and power in the system – taking up either a card slot or a drive bay. In either case, you may be giving up functionality or storage and IO density to enable the compression function.
  • If used for compression to local drives, an accelerator card can become a bottleneck, since all read and write traffic between the disks and the host must pass through it. Just 2-4 NVMe SSDs can overwhelm a single accelerator card with their throughput capabilities. The card also adds another “bump in the wire,” increasing latency between the host and the storage.
  • On the plus side, a compression accelerator card is drive-agnostic: any SSD may be used behind it.
  • Compression in SmartNICs or DPUs addresses the goal of reducing network bandwidth. A hardware compression engine can be scaled to match the network interface or the PCIe interface of the SmartNIC.
  • Compression in a SmartNIC does not address the goals of storing more data locally or reducing local storage costs.
  • Performing compression in the drives removes the burdens of compression from the CPU. Additionally, when the drive’s Flash Translation Layer (FTL) is designed to handle tracking the variable-length blocks of data that result from compression, the drive also offloads the task of managing the metadata from the filesystem or CPU.
  • With a variable-length mapping FTL, the drive can perform the compression & decompression without requiring any filesystem or application integration, without installing any custom software or drivers, and without requiring any action from the user to trigger the compression / decompression function.
  • Since the FTL is managing the variable length chunks, it can eliminate the need for padding of unaligned I/Os. It avoids the “bloat” that CPU-based compression can run into – achieving compression ratios higher than those ordinarily achieved with a similar compression algorithm running in the CPU.
  • The compression state-machines can be designed to match the PCIe or Flash throughputs. This alignment between the compressor throughput and the drive throughput allows you to scale the compression capability precisely with the number of drives in the system.
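The lightweight-vs-heavyweight tradeoff described above can be sketched with Python’s built-in zlib, using its lowest and highest effort levels as stand-ins for a fast LZ4/Snappy-class codec and a heavier one (the synthetic data and level choices are illustrative assumptions, not a benchmark of those codecs):

```python
import zlib

# Synthetic log-like data: repetitive, so compression has something to find.
data = b"".join(
    b"ts=%08d level=INFO msg=request served bytes=%05d\n" % (i, i * 37 % 90000)
    for i in range(20000)
)

def ratio(level):
    """Compression ratio (original size / compressed size) at a given effort level."""
    return len(data) / len(zlib.compress(data, level))

fast = ratio(1)  # low effort: quick, but smaller space savings (the LZ4/Snappy role)
best = ratio(9)  # high effort: spends more CPU time for a better ratio
print(f"level 1 ratio: {fast:.2f}, level 9 ratio: {best:.2f}")
```

The higher level buys a better ratio at the cost of CPU time — exactly the cycles-vs-capacity tradeoff the bullet describes.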
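The alignment “bloat” from the list above is easy to quantify. A minimal sketch (the helper name is my own, not from any library): rounding the compressed size up to the drive’s minimum I/O size reproduces the 16K-to-9K example, which still costs 12K on a 4K-aligned disk and a full 16K on an 8K-aligned one:

```python
import math

def aligned_size(compressed_kib, align_kib):
    """Round a compressed chunk up to the drive's minimum I/O size."""
    return math.ceil(compressed_kib / align_kib) * align_kib

# The article's example: a 16 KiB chunk compresses to 9 KiB.
print(aligned_size(9, 4))   # three 4 KiB I/Os -> 12 KiB actually written
print(aligned_size(9, 8))   # 8 KiB alignment -> 16 KiB: savings fully erased
print(f"nominal ratio {16 / 9:.2f}, effective ratio {16 / aligned_size(9, 4):.2f}")
```

A nominal 1.78:1 ratio shrinks to an effective 1.33:1 after padding — which is why a variable-length mapping FTL, which needs no padding, recovers those savings.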

Compression has to be done prior to encryption if you hope to derive any value from the compression effort. Encrypted data appears as random data, with no discernible patterns, while compression relies on patterns in the data to substitute shorter representations of the original. If you encrypt before compressing, the “compressed” data will be the same size as, or potentially larger than, the original – so don’t waste the effort. If you compress before encrypting, you still gain the space reduction benefit of compression while securing your data via the encryption.
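The ordering argument can be demonstrated directly. Below is a minimal sketch using zlib and a toy SHA-256 XOR keystream as a stand-in for real encryption (illustration only – not actual cryptography):

```python
import hashlib
import zlib

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher -- for illustration only, NOT real cryptography."""
    out = bytearray()
    for off in range(0, len(data), 32):
        keystream = hashlib.sha256(key + off.to_bytes(8, "big")).digest()
        out += bytes(a ^ b for a, b in zip(data[off:off + 32], keystream))
    return bytes(out)

plain = b"customer,region,status,OPEN\n" * 500   # highly repetitive data

# Compress-then-encrypt: compression sees the patterns first.
good = toy_encrypt(b"k", zlib.compress(plain))

# Encrypt-then-compress: the ciphertext looks random, so there is nothing to find.
bad = zlib.compress(toy_encrypt(b"k", plain))

print(len(plain), len(good), len(bad))
```

Compress-then-encrypt shrinks the repetitive input dramatically; encrypt-then-compress leaves it at (or slightly above) its original size.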

Many enterprise deployments use Self-Encrypting Drives (SEDs) complying with TCG Opal standards to encrypt data at rest. Because the SED encrypts data only after it reaches the drive, compression performed upstream in an accelerator card, or in the drive itself ahead of the encryption step, still operates on unencrypted, compressible data. Using drives with this feature keeps both the accelerator-card and in-drive compression options open.

Summary Table: Comparing Compression Options & Tradeoffs

| Option | Strengths | Tradeoffs |
| --- | --- | --- |
| Software on the CPU | No extra hardware; flexible choice of algorithm | Consumes CPU cycles and power; alignment “bloat”; filesystem or host manages metadata |
| Accelerator card | Offloads the CPU; high speed, low latency; works with any SSD | Needs filesystem integration or vendor software; takes a slot or bay; potential throughput bottleneck |
| SmartNIC / DPU | Reduces network bandwidth; scales with the network or PCIe interface | Does not reduce local storage cost or store more data locally |
| In-drive (computational storage) | Offloads CPU and metadata management; transparent to software; no padding bloat; scales with drive count | Requires drives with a variable-length mapping FTL |
It’s apparent that each option brings its own particular strengths and weaknesses. The final selection depends on the problem(s) you are trying to solve… and you may need to deploy more than one solution to fully optimize your infrastructure.

Navigating the tangled jungle of obstacles that Infrastructure & Operations teams encounter is no easy task! They must balance the need for increased capability and utilization of their IT and data center infrastructure against tight constraints on budget, space, power, and skill sets. Data compression is a well-established technology that can aid in finding that balance… and if compression is deployed strategically, taking into account the goals of compression as well as the tradeoffs between algorithms, between hardware and software, and between locations, then compression can become a true treasure! 🪙🪙🪙

What other factors do you take into consideration?

JB Baker

JB Baker is a successful technology business leader with a 20+ year track record of driving top- and bottom-line growth through new products for enterprise and data center storage. He joined ScaleFlux in 2018 to lead Product Planning & Marketing as the company expands the capabilities of Computational Storage and its adoption in the marketplace.