Zarr

Zarr is a binary file format for chunked, compressed, N-dimensional arrays. It is used throughout the PyData ecosystem, especially in climate and biological science applications.

Zarr-Python is the official Python package for reading and writing Zarr arrays. Its main feature is a NumPy-like array that transparently translates array operations into file I/O. KvikIO provides a GPU backend for Zarr-Python that enables GPUDirect Storage (GDS).

If the optional zarr-python dependency is installed, the kvikio.zarr module is available. KvikIO supports zarr-python 3.x.
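Because kvikio.zarr depends on an optional package, it can help to check the environment before importing it. The following is a minimal sketch using only the standard library. The helper names module_available and major_version are hypothetical and not part of KvikIO or Zarr-Python, and the sketch assumes zarr-python is installed under the distribution name "zarr".

```python
import importlib.util
from importlib import metadata


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for a missing parent package or a broken module spec.
        return False


def major_version(distribution: str):
    """Return the installed major version as an int, or None if not installed."""
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None
    return int(version.split(".")[0])


# Assumption: zarr-python is distributed under the name "zarr".
if module_available("kvikio") and major_version("zarr") == 3:
    print("kvikio and zarr-python 3.x found; kvikio.zarr should be importable")
else:
    print("install kvikio and zarr-python 3.x to use kvikio.zarr")
```

The check only inspects package metadata, so it does not confirm that a GPU or GDS is usable at runtime.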

Usage

Zarr-Python includes native support for reading Zarr chunks into device memory if you configure Zarr to use GPUs. You can use any store, but KvikIO provides kvikio.zarr.GDSStore to efficiently load data directly into GPU memory.

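>>> import zarr
>>> from kvikio.zarr import GDSStore
>>> # Configure Zarr-Python to return array data in GPU (device) memory.
>>> zarr.config.enable_gpu()
>>> # Create a store rooted at data.zarr.
>>> store = GDSStore(root="data.zarr")
>>> # Create a 100x100 float32 array split into 10x10 chunks.
>>> z = zarr.create_array(
...     store=store, shape=(100, 100), chunks=(10, 10), dtype="float32", overwrite=True
... )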
>>> type(z[:10, :10])
<class 'cupy.ndarray'>