This guide explains how to open Zarr archives stored as zip files in remote object storage (MinIO/S3) using fsspec's FSMap with Zarr v2.
Zarr v2's native ZipStore only accepts local file paths, not remote files or file-like objects. To access a .zip file stored in MinIO/S3, we need a different approach that bridges the gap between remote storage and Zarr's storage interface.
The solution uses a chain of abstractions:
MinIO/S3 Storage
↓ (accessed via)
S3FileSystem (treats bucket as filesystem)
↓ (opens file)
S3File object (file-like interface to remote zip)
↓ (wrapped by)
ZipFileSystem (treats zip contents as filesystem)
↓ (mapped to)
FSMap (key-value store interface)
↓ (consumed by)
Zarr (reads arrays and metadata)
S3FileSystem from the s3fs library provides a Python filesystem interface (fsspec) to S3/MinIO:
    import s3fs

    fs = s3fs.S3FileSystem(
        client_kwargs={
            'endpoint_url': 'https://minio.example.com',
            'verify': '/path/to/ca.crt',
        },
        key='access_key',
        secret='secret_key',
        use_ssl=True,
    )
This object lets you interact with MinIO buckets using familiar filesystem operations like fs.open(), fs.ls(), etc.
ZipFileSystem from fsspec.implementations.zip takes a file object (which can be remote) and exposes the zip archive's internal structure as a filesystem:
    from fsspec.implementations.zip import ZipFileSystem

    # Open the remote zip file
    remote_file = fs.open('bucket/path/archive.zip', 'rb')

    # Create a filesystem view into the zip
    zip_fs = ZipFileSystem(fo=remote_file)
The fo parameter accepts any file-like object, including remote files from S3FileSystem. Now zip_fs treats the zip's contents as if they were a directory tree.
FSMap from fsspec.mapping implements Python's MutableMapping interface (dict-like behavior) on top of any fsspec filesystem:
    from fsspec.mapping import FSMap

    # Create a mapping store
    store = FSMap(root='', fs=zip_fs)
The root='' parameter means “start at the zip's root directory”. FSMap now translates dictionary-style access (store[key]) into filesystem operations (zip_fs.open(key)).
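To make the dictionary-to-filesystem translation concrete, here is a minimal sketch of the same idea using only the standard library. It is not fsspec's implementation: the class name ZipMapping and the sample keys are made up for illustration, and an in-memory io.BytesIO stands in for the remote S3File.

```python
import io
import zipfile
from collections.abc import Mapping


class ZipMapping(Mapping):
    """Read-only, dict-like view of a zip archive (illustrative stand-in for FSMap)."""

    def __init__(self, fo, root=""):
        # fo can be any seekable binary file-like object, local or remote.
        self._zf = zipfile.ZipFile(fo, mode="r")
        # Normalise root so that '' means "the top of the archive".
        stripped = root.strip("/")
        self._prefix = stripped + "/" if stripped else ""

    def __getitem__(self, key):
        try:
            return self._zf.read(self._prefix + key)
        except KeyError:
            # Re-raise with the key the caller actually asked for.
            raise KeyError(key) from None

    def __iter__(self):
        for name in self._zf.namelist():
            if name.startswith(self._prefix) and not name.endswith("/"):
                yield name[len(self._prefix):]

    def __len__(self):
        return sum(1 for _ in self)


# Build a tiny archive in memory; BytesIO plays the role of the remote file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr(".zgroup", '{"zarr_format": 2}')
    zf.writestr("array_name/0.0.0", b"\x00\x01\x02\x03")

demo_store = ZipMapping(buffer, root="")
group_metadata = demo_store[".zgroup"]     # bytes of the root metadata
chunk = demo_store["array_name/0.0.0"]     # bytes of one chunk
keys = sorted(demo_store)
```

Each lookup is a plain key access on the mapping, while the zip handling stays hidden behind it. FSMap and ZipFileSystem split the same responsibilities across two objects.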
Zarr v2 expects stores to implement the MutableMapping interface, which FSMap provides. When you open a Zarr group:
    import zarr

    root = zarr.open(store, mode='r')
Zarr performs operations like:
store['.zgroup'] → reads the root metadata
store['array_name/.zarray'] → reads array metadata
store['array_name/0.0.0'] → reads a specific chunk
Each of these translates through the chain:
FSMap → ZipFileSystem → S3File → MinIO/S3 HTTP request
Putting the pieces together, the complete example looks like this:
    import s3fs
    import zarr
    from fsspec.implementations.zip import ZipFileSystem
    from fsspec.mapping import FSMap

    # 1. Configure S3/MinIO access
    fs = s3fs.S3FileSystem(
        client_kwargs={
            'endpoint_url': 'https://minio.example.com',
            'verify': '/path/to/ca.crt',
        },
        key='your_access_key',
        secret='your_secret_key',
        use_ssl=True,
    )

    # 2. Open the remote zip file as a filesystem
    s3_path = 'my-bucket/data/experiment.zarr.zip'
    zip_fs = ZipFileSystem(fo=fs.open(s3_path, 'rb'))

    # 3. Create a mapping store for Zarr
    store = FSMap(root='', fs=zip_fs)

    # 4. Open with Zarr
    root = zarr.open(store, mode='r')

    # 5. Use the Zarr group normally
    print(root.tree())
    array = root['my_array'][:]
Under the hood, a Zarr zip archive contains files like:
    .zgroup                 # Root group metadata
    array_name/.zarray      # Array metadata
    array_name/0.0.0        # Chunk at position (0,0,0)
    array_name/0.0.1        # Chunk at position (0,0,1)
    subgroup/.zgroup        # Nested group metadata
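The layout can be reproduced with the standard library alone. The sketch below builds a small in-memory zip with the same kind of entries and shows how a chunk's grid index maps to its key. The helper chunk_key is hypothetical, the "." separator assumes Zarr v2's default dimension separator, and the .zarray content is abbreviated (a real one carries more fields).

```python
import io
import json
import zipfile


def chunk_key(array_path, chunk_index, separator="."):
    """Return the store key for a chunk, e.g. ('array_name', (0, 0, 1)) -> 'array_name/0.0.1'."""
    return f"{array_path}/{separator.join(str(i) for i in chunk_index)}"


# Abbreviated, hypothetical metadata for a 3-D array split into two chunks.
zarray = {"zarr_format": 2, "shape": [1, 1, 2], "chunks": [1, 1, 1], "dtype": "|u1"}

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr(".zgroup", json.dumps({"zarr_format": 2}))
    zf.writestr("array_name/.zarray", json.dumps(zarray))
    zf.writestr(chunk_key("array_name", (0, 0, 0)), b"\x01")
    zf.writestr(chunk_key("array_name", (0, 0, 1)), b"\x02")

# Re-open the archive and inspect it the way a store would.
with zipfile.ZipFile(buffer) as zf:
    names = zf.namelist()
    shape = json.loads(zf.read("array_name/.zarray"))["shape"]

print(names)
print(shape)
```

Listing the entries of a real archive in the same way is a quick check that it actually contains a Zarr hierarchy before handing it to Zarr.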
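When Zarr does store['array_name/0.0.0']:
FSMap performs the equivalent of zip_fs.open('array_name/0.0.0', 'rb').read()
ZipFileSystem reads the corresponding bytes from the S3File, which requests them from MinIO/S3
This happens lazily: only when Zarr actually accesses specific data.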
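The lazy behaviour can be observed without any remote storage. The following sketch uses only the standard library: CountingBytesIO is a hypothetical stand-in for the remote file that counts the bytes read from it, and the archive contents are invented. A real S3File fetches byte ranges over HTTP and may read ahead, so the exact numbers will differ there.

```python
import io
import zipfile


class CountingBytesIO(io.BytesIO):
    """In-memory file that records how many bytes have been read from it."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        data = super().read(size)
        self.bytes_read += len(data)
        return data


# Build an archive with eight uncompressed 100 kB "chunks".
raw = io.BytesIO()
with zipfile.ZipFile(raw, mode="w", compression=zipfile.ZIP_STORED) as zf:
    for i in range(8):
        zf.writestr(f"array_name/0.0.{i}", bytes([i]) * 100_000)
archive_bytes = raw.getvalue()

# Re-open it through the counting wrapper, as if it were a remote file.
remote_like = CountingBytesIO(archive_bytes)
with zipfile.ZipFile(remote_like) as zf:
    after_open = remote_like.bytes_read    # only the zip directory so far
    chunk = zf.read("array_name/0.0.3")    # one chunk is fetched on demand
    after_chunk = remote_like.bytes_read

print(f"archive size:         {len(archive_bytes)} bytes")
print(f"read while opening:   {after_open} bytes")
print(f"read after one chunk: {after_chunk} bytes")
```

Opening the archive only touches the zip directory at the end of the file, and reading one chunk touches only that chunk's bytes. This is why a seekable remote file is sufficient and the archive never has to be downloaded in full.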
For read-only access (mode='r'), this approach works seamlessly.
For write operations, limitations apply:
ZipFileSystem in write mode requires recreating the entire zip
Recommendation: Use this approach for read-only access to pre-created Zarr zip archives.
s3fs has built-in caching; configure it with the cache_type parameter.
You tried to pass a file object directly to zarr.ZipStore. Use ZipFileSystem + FSMap instead.
Add the CA certificate path to client_kwargs={'verify': '/path/to/cert.pem'}.
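A missing or mistyped certificate path tends to surface as an opaque SSL error, so it can help to validate it before creating the filesystem. The helper below is a hypothetical sketch (build_s3fs_kwargs is not part of s3fs); it only assembles the keyword arguments shown earlier and checks that the certificate file exists.

```python
import tempfile
from pathlib import Path


def build_s3fs_kwargs(endpoint_url, key, secret, ca_cert_path=None):
    """Assemble keyword arguments for s3fs.S3FileSystem, checking the CA file first."""
    client_kwargs = {"endpoint_url": endpoint_url}
    if ca_cert_path is not None:
        cert = Path(ca_cert_path)
        if not cert.is_file():
            raise FileNotFoundError(f"CA certificate not found: {cert}")
        client_kwargs["verify"] = str(cert)
    return {"client_kwargs": client_kwargs, "key": key, "secret": secret, "use_ssl": True}


# Demonstration with a throwaway file standing in for a real certificate.
with tempfile.NamedTemporaryFile(suffix=".pem") as tmp:
    kwargs = build_s3fs_kwargs(
        "https://minio.example.com", "access_key", "secret_key", tmp.name
    )

# With s3fs installed and the endpoint reachable, the result would be used as:
# fs = s3fs.S3FileSystem(**kwargs)
```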
Check that root='' in FSMap is correct. If the zip has a subdirectory, use root='subdirectory/path'.
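If it is unclear which root to use, the archive's entry names show where the group metadata lives. The sketch below is a standard-library illustration: find_zarr_roots is a hypothetical helper, and the sample names are invented. It returns every directory that contains a .zgroup, with the shallowest first.

```python
from pathlib import PurePosixPath


def find_zarr_roots(zip_names):
    """Return directories containing a .zgroup entry, shallowest first ('' = archive root)."""
    roots = set()
    for name in zip_names:
        path = PurePosixPath(name)
        if path.name == ".zgroup":
            parent = str(path.parent)
            # PurePosixPath('.zgroup').parent is '.', i.e. the top of the archive.
            roots.add("" if parent == "." else parent)
    return sorted(roots, key=lambda r: (len(r.split("/")) if r else 0, r))


# Entry names as they might come from zipfile.ZipFile(...).namelist().
names = [
    "experiment.zarr/.zgroup",
    "experiment.zarr/array_name/.zarray",
    "experiment.zarr/array_name/0.0.0",
    "experiment.zarr/sub/.zgroup",
]

candidates = find_zarr_roots(names)
print(candidates)
```

In this invented example the first candidate, 'experiment.zarr', is the value to try for root; an archive that was zipped without a wrapping directory would yield '' instead.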
You can also use zarr.storage.FSStore instead of FSMap:
    store = zarr.storage.FSStore(url='', fs=zip_fs)
    root = zarr.open(store, mode='r')
Both FSStore and FSMap provide the same MutableMapping interface. FSMap is more lightweight and part of core fsspec.
Execute the following with 'uv run'; the dependencies are resolved automatically. This assumes you have our internal PyPI registry set up with uv.
    # /// script
    # requires-python = ">=3.8"
    # dependencies = [
    #     "boto3>=1.40.49",
    #     "python-dotenv>=0.9.9",
    #     "packaging>=25.0",
    #     "rkns==0.6.2",
    #     "s3fs[boto3]>=2023.12.0",
    #     "typing-extensions>=4.15.0",
    # ]
    # ///
    import os
    from pathlib import Path

    from dotenv import load_dotenv
    from fsspec.implementations.zip import ZipFileSystem
    from fsspec.mapping import FSMap
    import s3fs
    import zarr
    import rkns

    # load credentials from .env file
    load_dotenv()
    access_key_id = os.getenv("STORAGE_ACCESS_KEY")
    secret_access_key = os.getenv("STORAGE_SECRET_KEY")
    endpoint_url = os.getenv("ENDPOINT")
    endpoint_url_full = os.getenv("ENDPOINT_FULL")

    # Specify the path to your custom CA certificate
    ca_cert_path = "ca.crt.cer"
    assert Path(ca_cert_path).is_file()

    # Create s3fs filesystem with custom cert
    fs = s3fs.S3FileSystem(
        client_kwargs={"endpoint_url": endpoint_url_full, "verify": str(ca_cert_path)},
        key=access_key_id,
        secret=secret_access_key,
        use_ssl=True,
    )

    s3_path = "rekonas-dataset-shhs-rkns/sub-shhs200001_ses-01_task-sleep_eeg.rkns"
    zip_fs = ZipFileSystem(fo=fs.open(s3_path, "rb"))
    store = zarr.storage.FSStore(url='', fs=zip_fs)

    rkns_obj = rkns.from_RKNS(store)
    print(rkns_obj.tree)