batchcorder

A Rust-backed Python library for caching Arrow record-batch streams so they can be replayed multiple times from a source that can only be read once.

An Arrow RecordBatchReader is single-use: once consumed, it cannot be read again. batchcorder wraps any Arrow stream source and stores each batch in a memory or disk cache, so that multiple independent readers can each replay the stream from any position.
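
The caching idea described above can be sketched in plain Python. This is only an illustration of the concept, not batchcorder's API: the ReplayCache class, its reader() method, and the single_use_source() generator are hypothetical names, plain lists stand in for Arrow record batches, and the cache here is in-memory and single-threaded only.

```python
from typing import Generic, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class ReplayCache(Generic[T]):
    """Hypothetical sketch: cache items from a one-shot source for replay."""

    def __init__(self, source: Iterable[T]) -> None:
        self._source = iter(source)   # can only be walked once
        self._cache: List[T] = []     # every item pulled so far
        self._exhausted = False

    def _fill_to(self, index: int) -> bool:
        # Pull from the source until position `index` is cached
        # or the source runs out. Returns True if `index` is available.
        while len(self._cache) <= index and not self._exhausted:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                self._exhausted = True
        return index < len(self._cache)

    def reader(self, start: int = 0) -> Iterator[T]:
        # Each call returns an independent reader with its own position.
        index = start
        while self._fill_to(index):
            yield self._cache[index]
            index += 1


def single_use_source() -> Iterator[List[int]]:
    # Stand-in for a one-shot stream; each list plays the role of a batch.
    for i in range(3):
        yield [i * 10, i * 10 + 1]


cache = ReplayCache(single_use_source())
first = list(cache.reader())          # drains the source, filling the cache
second = list(cache.reader())         # replayed entirely from the cache
tail = list(cache.reader(start=2))    # replay from a later position
```

In this sketch the source is read lazily: a batch is pulled only when some reader first asks for a position that has not been cached yet, and every later reader gets that batch from the cache.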

