Fc, a lossless compressor for floating-point streams

userbinator · 55 days ago

It splits the input into adaptively-sized blocks (quanta), runs a competition between many specialized codecs on each block, and emits the smallest result.

This is, for lack of a better term, a "metacompressor", but it will be interesting to see which of the choices end up dominating; in my past experiences with metacompression, one algorithm is usually consistently ahead.

pella · 55 days ago

> "fc is a lossless compressor for streams of IEEE-754 64-bit doubles."

The new OpenZL SDDL2 (Simple Data Description Language) supports several different floating-point types. It would be worthwhile to contribute some of the FC project's experience to OpenZL. The types OpenZL currently supports:

  | Type            | Size    | Endian |
  |-----------------|---------|--------|
  | `Int8`          | 1 byte  | N/A    |
  | `UInt8`         | 1 byte  | N/A    |
  | `Int16LE/BE`    | 2 bytes | Yes    |
  | `UInt16LE/BE`   | 2 bytes | Yes    |
  | `Int32LE/BE`    | 4 bytes | Yes    |
  | `UInt32LE/BE`   | 4 bytes | Yes    |
  | `Int64LE/BE`    | 8 bytes | Yes    |
  | `UInt64LE/BE`   | 8 bytes | Yes    |
  | `Float16LE/BE`  | 2 bytes | Yes    |
  | `Float32LE/BE`  | 4 bytes | Yes    |
  | `Float64LE/BE`  | 8 bytes | Yes    |
  | `BFloat16LE/BE` | 2 bytes | Yes    |
  | `Bytes(n)`      | n bytes | N/A    |

Some links:

- https://github.com/facebook/openzl/releases/tag/v0.2.0

- https://openzl.org/getting-started/introduction/

- https://openzl.org/sddl/sddl2-announcement/

- https://openzl.org/sddl/core-concepts/

radford-neal · 55 days ago

Those interested in this might like my paper "Representing numeric data in 32 bits while preserving 64-bit precision", available at https://arxiv.org/abs/1504.02914 (note the code available as auxiliary files). In the context of this compressor, its method could be one of the codecs competing to compress a block. It works well for data converted from a decimal representation with a small number of digits.

peterabbitcook · 54 days ago

I’ve been skimming the source code and it looks promising for the stated use case. Wondering how to configure and set it up for a producer/consumer scenario where the producer puts compressed bytes on the wire and the consumer processes it; I can definitely see a use case where an edge sensor pumps compressed data to a cloud server with a GPU, though I don’t usually pipe doubles to a GPU.

Something worth thinking about, since you mentioned it’s geared towards “scientific” data streams: if we’re talking about precise measurements from instruments, your sensor is typically an analog signal which you digitize. Digitizers exist that can output floats, but ADCs used in industry like a Rincon or Alazar (that sample at multiples of 100 MHz) prefer to output quantized shorts or ints that are rescaled to a float with a magic number (e.g. 32767/pi for a phase measurement, or gain/(16 mA) for industrial transducers) somewhere down the line. I bring this up because you pointed out your max throughput is about 120 MiB/s, which would make it a big bottleneck for scientific data coming out of a digitizer that can pump out 800-1600 MiB/s. A throughput of 120 MiB/s of doubles is not really that high for CPU-level computations or network Tx bandwidth on modern hardware.

rincebrain · 55 days ago

I must say, for a library advertising handling of streams of data, the absence of a stream utility to [input] | fc | fc -d surprised me.

I understand this is more the primitive that you would build such a thing on top of, just that the first question I always have for novel compressors is "how do they do on these example streams of data".

loeg · 55 days ago

The question is, how close can OpenZL come? (This is from the same people who develop zstd, but suitable for structured data in a generic way.)

Scaevolus · 55 days ago

I see you have ALP, but have you tried Chimp128 or Arrow's byte stream split?
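For readers unfamiliar with byte stream split, here is a minimal Python sketch of the transform as it applies to little-endian doubles: byte k of every value is gathered into stream k, so the slowly-changing sign/exponent bytes end up next to each other before a general-purpose compressor sees them. The function names are made up for illustration and are not taken from Arrow, Parquet, or fc.

```python
import struct

def byte_stream_split(values):
    """Transpose doubles so that stream k holds byte k of every value."""
    raw = struct.pack(f"<{len(values)}d", *values)
    # raw[k::8] picks byte k of each 8-byte little-endian double.
    return b"".join(raw[k::8] for k in range(8))

def byte_stream_join(blob):
    """Inverse transform: interleave the 8 streams back into doubles."""
    n = len(blob) // 8
    raw = bytearray(len(blob))
    for k in range(8):
        raw[k::8] = blob[k * n:(k + 1) * n]
    return list(struct.unpack(f"<{n}d", bytes(raw)))
```

The transform does not shrink anything by itself; it only reorders bytes so that a later entropy or LZ stage has an easier job.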

childintime · 55 days ago

A lossy compressor might also be useful for common floating point apps. The simplest compressor ever would just chop off a number of bits from the mantissa.
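A rough Python sketch of that "chop off mantissa bits" idea, assuming IEEE-754 binary64 values; `truncate_mantissa` is a hypothetical helper, not part of fc. Zeroing the low `drop_bits` bits of the 52-bit mantissa rounds the magnitude toward zero and leaves long runs of zero bits for a later lossless stage to remove.

```python
import struct

def truncate_mantissa(x: float, drop_bits: int) -> float:
    """Zero the lowest `drop_bits` mantissa bits of a binary64 value."""
    if not 0 <= drop_bits <= 52:
        raise ValueError("drop_bits must be in 0..52")
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    mask = ~((1 << drop_bits) - 1) & 0xFFFFFFFFFFFFFFFF
    # Caveat: a NaN whose payload lies only in the dropped bits becomes an infinity.
    return struct.unpack("<d", struct.pack("<Q", bits & mask))[0]
```

For normal numbers the relative error stays below 2^(drop_bits - 52), so dropping 32 bits keeps roughly 20 bits of mantissa precision.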

abcd_f · 55 days ago

The most interesting section - How It Works - could really elaborate on details a bit more.

KerrickStaley · 55 days ago

Another library in this space is pcodec; I'd appreciate a comparison of the two.

enduku · 55 days ago

Agreed; pcodec is probably one of the most relevant comparisons. I will add pcodec to the benchmark.

avi969 · 55 days ago

These comparisons tend to be heavily dataset-dependent in ways that matter. Pcodec's approach exploits autocorrelation well on smooth, slowly-varying series; if fc makes different structural assumptions about the input distribution, neither will dominate across all cases. As far as I know the space is still fragmented enough that no single library wins on both smooth time series and noisier scientific float arrays, so the benchmark dataset choice is the real variable.

enduku · 58 days ago

I built "fc", a C library for compressing streams of 64-bit floating-point values without quantization.

It is not trying to replace zstd or lz4. The idea is narrower: take blocks of doubles, try a set of float-specific predictors/transforms/coders, and emit whichever representation is smallest for that block.
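As a toy illustration of the "try several, keep the smallest" idea, here is a Python sketch. The three candidates (raw bytes, zlib, and XOR-with-previous followed by zlib) and the one-byte tag format are assumptions chosen for brevity; they are not fc's actual codecs or container layout.

```python
import struct
import zlib

def _xor_delta(raw: bytes) -> bytes:
    """XOR each 64-bit word with its predecessor (the first word is kept)."""
    n = len(raw) // 8
    out, prev = [], 0
    for w in struct.unpack(f"<{n}Q", raw):
        out.append(w ^ prev)
        prev = w
    return struct.pack(f"<{n}Q", *out)

def _undo_xor_delta(raw: bytes) -> bytes:
    n = len(raw) // 8
    out, prev = [], 0
    for d in struct.unpack(f"<{n}Q", raw):
        prev ^= d
        out.append(prev)
    return struct.pack(f"<{n}Q", *out)

# Hypothetical candidate table: tag -> (encode, decode).
CANDIDATES = {
    0: (lambda b: b, lambda b: b),                        # raw fallback
    1: (zlib.compress, zlib.decompress),                  # generic compressor
    2: (lambda b: zlib.compress(_xor_delta(b)),
        lambda b: _undo_xor_delta(zlib.decompress(b))),   # predictor + generic
}

def encode_block(values):
    """Run every candidate on the block and keep the smallest output."""
    raw = struct.pack(f"<{len(values)}d", *values)
    tag, payload = min(
        ((t, enc(raw)) for t, (enc, _) in CANDIDATES.items()),
        key=lambda pair: len(pair[1]),
    )
    return bytes([tag]) + payload  # one tag byte tells the decoder which mode won

def decode_block(blob):
    raw = CANDIDATES[blob[0]][1](blob[1:])
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))
```

Because the raw fallback always competes, a block in this sketch can never grow by more than the one tag byte, and the tag doubles as a counter source for "which mode won" statistics.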

It is aimed at time-series, scientific, simulation, and analytics data where the numbers often have structure: smooth curves, repeated values, fixed increments, periodic signals, predictable deltas, or low-entropy mantissas.

The API is intentionally small: "fc_enc", "fc_dec", a config struct, and a few counters to inspect which modes won. Decode is parallel and meant to be fast; encode spends more CPU searching for a better representation.

Current caveats: x86-64 only for now, tuned for IEEE-754 doubles, research-grade rather than production-hardened.

Repo: https://github.com/xtellect/fc

gus_massa · 56 days ago

Does it assume the floats come from photos or sound or something?

lanceler · 56 days ago

Doesn't matter much. Sensor telemetry, financial ticks, scientific instrument output -- the floats are floats. Whether it works depends on autocorrelation in your data, not the source domain.
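One quick way to check that property on your own data is the lag-1 autocorrelation of the series. The helper below is a plain-Python sketch with a made-up name; values near 1 suggest that predicting each sample from the previous one should pay off, while values near 0 suggest it will not.

```python
def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of a sequence of floats."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0.0:
        return 1.0  # constant series: treated here as perfectly predictable
    cov = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return cov / var
```

This only measures linear dependence between neighbours; a series can score low here and still compress well for other reasons, such as repeated values or low-entropy mantissas.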

snissn · 55 days ago

What do you mean by decode is parallel?

jiggawatts · 55 days ago

> rather than production-hardened.

Please run it through your preferred AI once or twice with instruction to look for bugs. The version of Fc in the main branch has at least a few memory safety bugs that attacker-controlled inputs could exploit.

I'd link a chat history but the tool I used has that feature blocked for some weird reason, and the locals round these parts don't take kindly to copy-pasted AI content...

hprice · 58 days ago

Classic C footgun. fc_enc() takes no output buffer length. SZ from Argonne had the same problem circa 2016, fixed it in 2.x after someone actually measured worst-case expansion ratios. If your size estimate is off you just silently corrupt memory. Surprised this isn't the first thing reviewers flagged.
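The usual shape of the fix is sketched below in Python standing in for C: the caller passes a buffer whose length acts as the capacity, a companion function gives a worst-case bound, and the encoder refuses to write when the buffer is too small. Both function names and the one-byte header are hypothetical; nothing here reflects fc's real API.

```python
import struct
import zlib

HEADER = 1  # one tag byte: 0 = raw, 1 = zlib (made-up format)

def max_encoded_size(n_values: int) -> int:
    """Worst case for this sketch: the raw fallback plus the header."""
    return HEADER + 8 * n_values

def encode_into(out: bytearray, values) -> int:
    """Encode into the caller's buffer.

    Returns the number of bytes written, or -1 if `out` is too small.
    The buffer is never resized and never written past its end.
    """
    raw = struct.pack(f"<{len(values)}d", *values)
    packed = zlib.compress(raw)
    tag, body = (1, packed) if len(packed) < len(raw) else (0, raw)
    needed = HEADER + len(body)
    if needed > len(out):
        return -1  # report the problem instead of overrunning the buffer
    out[0] = tag
    out[1:needed] = body
    return needed
```

A caller sizes the buffer with `max_encoded_size(len(values))` and checks the return value, the same pattern as `ZSTD_compressBound` paired with a destination-capacity argument in zstd.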

unwind · 54 days ago

Can you elaborate on how it detects and signals if it runs out of output buffer space? I couldn't see how the amount of available space was even communicated to `fc_enc()`.

Also there are some "C icks" (to me, I'm very picky and used to know the standard awfully well from answering many SO questions) that you might want to look into. The two I remember now are the casting of `void` pointers from allocation functions, and (worse) the assumption that "all bits zero" is how a NULL pointer is represented.

tfl83 · 54 days ago

The buffer overflow concern is the real one — but I'm curious whether the zero-float assumption is even the bigger hazard. If any non-zero bit pattern is falsely treated as zero during decode, you get silent data corruption with no way to detect it. Has that been tested against denormals specifically?
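To make the denormal question concrete, here is a small Python sketch of the distinction that matters for a lossless codec: comparing by value versus comparing by bit pattern. It says nothing about what fc actually does; the helper names are invented for illustration.

```python
import struct

def bits_of(x: float) -> int:
    """Return the raw 64-bit pattern of a binary64 value."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def is_all_zero_bits(x: float) -> bool:
    """True only for +0.0, whose bit pattern is exactly zero."""
    return bits_of(x) == 0

# Value comparison merges the two zeros, so a "zero run" shortcut based on
# x == 0.0 would silently turn -0.0 into +0.0 on decode.
assert -0.0 == 0.0
assert bits_of(-0.0) == 1 << 63

# The smallest denormal is non-zero both by value and by bit pattern, so it
# only gets lost if the codec flushes denormals or thresholds on magnitude.
assert 5e-324 != 0.0
assert bits_of(5e-324) == 1
```

A round-trip test over -0.0, denormals, infinities, and NaNs with payloads, compared by bit pattern rather than by `==`, would answer the question directly.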

lloydgard · 56 days ago

Minor pedantry: IEEE 754 "doubles" are technically 64-bit binary floating-point, not just "64-bit doubles" -- though I realize that's how everyone says it colloquially. IIRC the standard calls them binary64. Anyway, neat project regardless.