zfs-comphist
A read-only ZFS analysis tool that reports block-level compression usage by algorithm across all datasets in a pool.
The problem
ZFS reports an aggregate compression ratio for a pool or dataset, but it does not expose which compression algorithms were used for the blocks actually stored on disk. When a dataset's compression property changes (from lzjb to lz4, or from gzip to zstd), existing blocks are not rewritten. They remain on disk in their original compressed form until they are overwritten or the dataset is rewritten explicitly.
On pools that have existed through multiple OpenZFS generations, this means data may be distributed across several compression algorithms simultaneously. There is no built-in ZFS command to answer "how much of this dataset is still using legacy compression?" zfs-comphist answers that question by reading block pointers directly.
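To illustrate what "reading block pointers directly" involves, here is a minimal sketch of decoding the compression field from a block pointer's property word. It is not taken from the zfs-comphist source. The helper names (bits, bp_compress, bp_lsize, bp_psize, comp_name) are hypothetical, and the bit positions and algorithm IDs are assumptions based on the block pointer layout and enum zio_compress in OpenZFS's headers, so check them against the headers you build with.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout of the blk_prop word of a regular (non-embedded) blkptr_t:
 *   bits  0-15  logical size, in 512-byte sectors, stored minus one
 *   bits 16-31  physical size, same encoding
 *   bits 32-38  compression algorithm ID
 * Embedded block pointers (bit 39 set) encode sizes differently and are
 * not handled by this sketch.
 */
static uint64_t bits(uint64_t word, unsigned low, unsigned len)
{
    assert(len > 0 && len < 64);
    return (word >> low) & ((UINT64_C(1) << len) - 1);
}

/* Compression algorithm ID stored in the block pointer. */
unsigned bp_compress(uint64_t blk_prop)
{
    return (unsigned)bits(blk_prop, 32, 7);
}

/* Logical (uncompressed) size in bytes. */
uint64_t bp_lsize(uint64_t blk_prop)
{
    return (bits(blk_prop, 0, 16) + 1) << 9;
}

/* Physical (on-disk, compressed) size in bytes. */
uint64_t bp_psize(uint64_t blk_prop)
{
    return (bits(blk_prop, 16, 16) + 1) << 9;
}

/* Map an algorithm ID to a report column; IDs are assumed, see above. */
const char *comp_name(unsigned id)
{
    if (id >= 5 && id <= 13)   /* gzip-1 through gzip-9 */
        return "gzip";
    switch (id) {
    case 2:  return "off";
    case 3:  return "lzjb";
    case 15: return "lz4";
    case 16: return "zstd";
    default: return "other";
    }
}
```

With a property word whose compression field holds 15, bp_compress returns 15 and comp_name maps it to "lz4". All gzip levels are folded into one name here because the report groups them in a single column.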
What it does
- Walks all datasets in a pool using the same read-only traversal interface as zdb
- Inspects the compression field in each block pointer (blkptr_t)
- Produces a per-dataset histogram of block counts by algorithm (off, lzjb, gzip, lz4, zstd)
- Reports compression ratios per algorithm and per dataset
- Identifies datasets with legacy compression that are candidates for rewriting
- Outputs results as plain text or JSON
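The per-dataset histogram and per-algorithm ratios listed above can be sketched as a small accumulator. This is an illustrative sketch only, not the tool's actual data structure. The names comp_class, comp_hist, hist_add, hist_ratio, hist_legacy_blocks and hist_print are hypothetical, and which algorithms count as "legacy" is left to the caller as a bitmask rather than assumed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical report columns, in the order the histogram lists them. */
enum comp_class { CC_OFF, CC_LZJB, CC_GZIP, CC_LZ4, CC_ZSTD, CC_COUNT };

static const char *const cc_names[CC_COUNT] = {
    "off", "lzjb", "gzip", "lz4", "zstd"
};

/* One histogram per dataset: block counts and byte totals per algorithm. */
struct comp_hist {
    uint64_t blocks[CC_COUNT];
    uint64_t lbytes[CC_COUNT];   /* logical (uncompressed) bytes */
    uint64_t pbytes[CC_COUNT];   /* physical (on-disk) bytes */
};

/* Record one block with its logical and physical sizes. */
void hist_add(struct comp_hist *h, enum comp_class c,
              uint64_t lsize, uint64_t psize)
{
    assert((unsigned)c < CC_COUNT);
    h->blocks[c]++;
    h->lbytes[c] += lsize;
    h->pbytes[c] += psize;
}

/* Compression ratio for one algorithm; 0.0 when it has no blocks. */
double hist_ratio(const struct comp_hist *h, enum comp_class c)
{
    assert((unsigned)c < CC_COUNT);
    if (h->pbytes[c] == 0)
        return 0.0;
    return (double)h->lbytes[c] / (double)h->pbytes[c];
}

/* Count blocks in the classes selected by legacy_mask (bit i = class i). */
uint64_t hist_legacy_blocks(const struct comp_hist *h, unsigned legacy_mask)
{
    uint64_t n = 0;
    for (unsigned c = 0; c < CC_COUNT; c++)
        if (legacy_mask & (1u << c))
            n += h->blocks[c];
    return n;
}

/* Plain-text report: one line per algorithm that has at least one block. */
void hist_print(FILE *out, const char *dataset, const struct comp_hist *h)
{
    fprintf(out, "%s\n", dataset);
    for (unsigned c = 0; c < CC_COUNT; c++) {
        if (h->blocks[c] == 0)
            continue;
        fprintf(out, "  %-5s %12" PRIu64 " blocks  %.2fx\n",
                cc_names[c], h->blocks[c],
                hist_ratio(h, (enum comp_class)c));
    }
}
```

A caller would zero-initialize one struct comp_hist per dataset, call hist_add for every block pointer visited, and flag the dataset as a rewrite candidate when hist_legacy_blocks returns a nonzero count for a mask such as (1u << CC_LZJB) | (1u << CC_GZIP). Keeping byte totals per algorithm, rather than only block counts, is what makes the per-algorithm ratio possible.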
Technical details
Written in C. Links against libzfslinux (or an included OpenZFS submodule for builds where distribution headers are unavailable). Read-only — no transactions are opened, no pool state is modified. Safe to run on live, mounted pools.
Tested on Debian 13, Ubuntu 24.04, FreeBSD 15, and Proxmox VE 9.1.4.