Writing mzMLb Documents
- class psims.mzmlb.writer.MzMLbWriter(h5_file, close=None, vocabularies=None, missing_reference_is_error=False, vocabulary_resolver=None, id=None, accession=None, h5_compression='blosc', h5_compression_options=None, h5_blocksize: int = 1048576, buffer_blocks: int = 10, **kwargs)
A high level API for generating mzMLb HDF5 files from simple Python objects.
This class’s public interface is identical to that of IndexedMzMLWriter, except for the parameters related to HDF5 compression described below.
Note
Although h5py can read and write through Python file-like objects, any such object must be opened in read+write mode so that the file can be partially re-read when an existing block is updated.
- h5_compression
A valid HDF5 compressor ID, a compression scheme name, or None. Available compression schemes are “gzip”/“zlib” and, if hdf5plugin is installed, “blosc”, “blosc:lz4”, “blosc:lz4hc”, “blosc:zlib”, and “blosc:zstd”. All Blosc-based compressors enable byte shuffling.
- Type
- h5_compression_options
The options to provide to the compressor designated by h5_compression. For “gzip”, this is a single integer setting the compression level, while Blosc takes a tuple of integers.
- h5_blocksize
The number of bytes to include in a single HDF5 data block. Smaller blocks improve random access speed at the expense of compression efficiency and space. Defaults to 2 ** 20 bytes (1 MiB).
- Type: int
- buffer_blocks
The number of array blocks to buffer in memory before syncing to disk, which reduces the number of resize operations. This applies to each array independently. Defaults to 10.
- Type: int
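To get a feel for how h5_blocksize and buffer_blocks interact, the following back-of-the-envelope sketch computes how many values fit in one block and roughly how much data one array could hold in memory before a flush. The helper names are hypothetical and not part of psims, and the arithmetic assumes that each array buffers exactly buffer_blocks full blocks, which is an inference from the descriptions above rather than a documented guarantee.

```python
import struct

# Hypothetical helpers for illustration only; not part of the psims API.


def values_per_block(h5_blocksize: int, itemsize: int) -> int:
    """Number of fixed-width values that fit in one HDF5 data block."""
    return h5_blocksize // itemsize


def buffered_bytes(h5_blocksize: int, buffer_blocks: int) -> int:
    """Approximate bytes held in memory per array before syncing to disk.

    Assumes the buffer holds exactly `buffer_blocks` full blocks.
    """
    return h5_blocksize * buffer_blocks


FLOAT32_SIZE = struct.calcsize("<f")  # size in bytes of one 32-bit float
DEFAULT_BLOCKSIZE = 2 ** 20           # documented default for h5_blocksize
DEFAULT_BUFFER_BLOCKS = 10            # documented default for buffer_blocks

print(values_per_block(DEFAULT_BLOCKSIZE, FLOAT32_SIZE))
print(buffered_bytes(DEFAULT_BLOCKSIZE, DEFAULT_BUFFER_BLOCKS))
```

Under these assumptions, the defaults give 262,144 32-bit floats per block and about 10 MiB of buffered data per array.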
- create_array(data, name, last=None, dtype=<class 'numpy.float32'>, chunks=True)
Store a typed data array as a named dataset in the HDF5 file.
Note
The array should not be textual unless it has already been translated into a byte array with terminal null bytes.
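As an illustration of the note above, this minimal sketch turns a piece of text into a byte array that ends with a null byte. The helper name is hypothetical and not part of psims, and whether a single trailing null byte is the exact layout create_array expects is an assumption.

```python
# Hypothetical helper for illustration only; not part of the psims API.


def to_null_terminated_bytes(text: str, encoding: str = "utf-8") -> bytearray:
    """Encode text as a byte array that ends with a null byte."""
    data = bytearray(text.encode(encoding))
    # Append a terminator only if the text does not already end with one.
    if not data.endswith(b"\x00"):
        data.append(0)
    return data


encoded = to_null_terminated_bytes("<mzML/>")
print(len(encoded), encoded[-1])
```

A buffer like this could then be wrapped in an unsigned 8-bit array before being stored.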
mzMLb Compression Methods
mzMLb can use any compression method that HDF5 supports. On its own, h5py includes only the “zlib” (or “gzip”) compressor, which is then used by default. If hdf5plugin is installed, several additional compression options are available as well.
Note
Default Compressor
If hdf5plugin is installed, the default compressor will be "blosc"; otherwise, it will be "gzip".
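The selection rule in this note can be sketched as follows. The function name and its plugin_available parameter are hypothetical, and psims may detect hdf5plugin in a different way; this only mirrors the documented behavior.

```python
import importlib.util
from typing import Optional

# Hypothetical helper for illustration only; psims may detect the plugin differently.


def default_compressor(plugin_available: Optional[bool] = None) -> str:
    """Return "blosc" when hdf5plugin is importable, otherwise "gzip"."""
    if plugin_available is None:
        # find_spec returns None when the top-level package is not installed.
        plugin_available = importlib.util.find_spec("hdf5plugin") is not None
    return "blosc" if plugin_available else "gzip"


print(default_compressor())
```

Passing plugin_available explicitly makes the rule easy to check without installing or removing hdf5plugin.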
Compressor Name | Default Options | Available |
---|---|---|
blosc | (0, 0, 0, 0, 5, 1, 1) | Required |
blosc:lz4 | (0, 0, 0, 0, 5, 1, 1) | Required |
blosc:lz4hc | (0, 0, 0, 0, 5, 1, 2) | Required |
blosc:zlib | (0, 0, 0, 0, 5, 1, 4) | Required |
blosc:zstd | (0, 0, 0, 0, 5, 1, 5) | Required |
gzip | 4 | Built-In to |
zlib | 4 | Built-In to |
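The seven-integer Blosc tuples in the table above appear to follow the option layout of the HDF5 Blosc filter, in which the first four slots are reserved and the last three carry the compression level, the shuffle flag, and a compressor code. That reading is an assumption, not something this page states. The sketch below decodes a tuple on that basis; the helper name is hypothetical, and the code-to-name mapping is inferred only from the rows of the table.

```python
from typing import Dict, Tuple, Union

# Compressor codes inferred from the table rows above (an assumption).
COMPRESSOR_CODES = {1: "lz4", 2: "lz4hc", 4: "zlib", 5: "zstd"}


def describe_blosc_options(
    options: Tuple[int, ...]
) -> Dict[str, Union[int, bool, str]]:
    """Decode a seven-integer Blosc options tuple into named fields.

    Assumes slots 0-3 are reserved and slots 4-6 hold the compression
    level, the shuffle flag, and the compressor code, in that order.
    """
    if len(options) != 7:
        raise ValueError("expected a tuple of seven integers")
    level, shuffle, code = options[4], options[5], options[6]
    return {
        "level": level,
        "shuffle": bool(shuffle),
        "compressor": COMPRESSOR_CODES.get(code, "unknown"),
    }


print(describe_blosc_options((0, 0, 0, 0, 5, 1, 5)))
```

Read this way, every Blosc row uses level 5 with shuffling enabled, which agrees with the statement that all Blosc-based compressors enable byte shuffling.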