Writing mzML
Using the psims library, ms_deisotope.output.mzml can write an mzML file with all associated metadata, including deconvoluted peak arrays, chromatograms, and data transformations. The MzMLSerializer class handles all facets of this process.
This module also contains a specialized version of MzMLLoader, ProcessedMzMLLoader, which can directly reconstruct each deconvoluted peak list and provides fast access to an extended index of metadata that MzMLSerializer writes to an external file.
import ms_deisotope
from ms_deisotope.test.common import datafile
from ms_deisotope.output.mzml import MzMLSerializer

reader = ms_deisotope.MSFileLoader(datafile("small.mzML"))
with open("small.deconvoluted.mzML", 'wb') as fh:
    writer = MzMLSerializer(fh, n_spectra=len(reader))
    # Copy document-level metadata before any scan is saved
    writer.copy_metadata_from(reader)
    for bunch in reader:
        # Pick peaks and deconvolute the precursor scan, then each product scan
        bunch.precursor.pick_peaks()
        bunch.precursor.deconvolute()
        for product in bunch.products:
            product.pick_peaks()
            product.deconvolute()
        writer.save(bunch)
    # Write any pending metadata and finish the document
    writer.close()
- class ms_deisotope.output.mzml.MzMLSerializer(handle, n_spectra=200000, compression=None, deconvoluted=True, sample_name=None, build_extra_index=True, data_encoding=None, include_software_entry=True, close=None)
Write ms_deisotope data structures to a file in mzML format.
- base_peak_chromatogram_tracker
Accumulated mapping of scan time to base peak intensity. This is used to write the base peak chromatogram.
- Type
OrderedDict
- chromatogram_queue
Accumulated chromatogram data structures, which are written out after all spectra have been written to file.
- Type
list
- compression
The compression type to use for binary data arrays. Should be one of "zlib", "none", or None.
- Type
str
- data_encoding
The binary encoding to use for numeric data arrays. It is passed to write_spectrum() and related methods.
- Type
dict or int or numpy.dtype or str
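To make the compression and data_encoding options more concrete, here is a minimal sketch of how a numeric array is typically stored in an mzML binary data array: packed as little-endian floats, optionally zlib-compressed, then base64-encoded. The helpers encode_array and decode_array are hypothetical names used only for illustration; they rely on the standard library alone and are not the psims or ms_deisotope implementation.

```python
import base64
import struct
import zlib


def encode_array(values, type_code="d", compression="zlib"):
    """Pack numbers as little-endian binary, optionally compress, then base64-encode.

    Hypothetical helper: type_code follows the struct module ("d" is a 64-bit
    float, "f" is a 32-bit float) and stands in for a data_encoding choice.
    """
    raw = struct.pack("<%d%s" % (len(values), type_code), *values)
    if compression == "zlib":
        raw = zlib.compress(raw)
    elif compression not in ("none", None):
        raise ValueError("unsupported compression: %r" % (compression,))
    return base64.b64encode(raw).decode("ascii")


def decode_array(text, type_code="d", compression="zlib"):
    """Reverse encode_array: base64-decode, decompress if needed, unpack."""
    raw = base64.b64decode(text)
    if compression == "zlib":
        raw = zlib.decompress(raw)
    count = len(raw) // struct.calcsize("<" + type_code)
    return list(struct.unpack("<%d%s" % (count, type_code), raw))


# Toy m/z values; 64-bit floats round-trip exactly.
mz = [204.0867, 366.1395, 528.1923]
compressed = encode_array(mz, type_code="d", compression="zlib")
uncompressed = encode_array(mz, type_code="d", compression="none")
restored = decode_array(compressed, type_code="d", compression="zlib")
```

Choosing a 32-bit type code instead would halve the size of the raw array at the cost of precision, which is the usual trade-off behind a per-array encoding setting.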
- data_processing_list
List of packaged DataProcessingInformation to write out
- Type
list
- deconvoluted
Indicates whether the translation should include extra deconvolution information
- Type
bool
- file_contents_list
List of terms to include in the <fileContents> tag
- Type
list
- handle
The file-like object being written to
- Type
file-like
- indexer
The external index builder
- Type
ExtendedScanIndex
- instrument_configuration_list
List of packaged InstrumentInformation to write out
- Type
list
- n_spectra
The number of spectra used to size the <spectrumList>
- Type
int
- processing_parameters
List of additional terms to include in a newly created DataProcessingInformation
- Type
list
- sample_list
List of SampleRun objects to write out
- Type
list
- sample_name
Default sample name
- Type
str
- sample_run
Description
- Type
SampleRun
- software_list
List of packaged Software objects to write out
- Type
list
- source_file_list
List of packaged SourceFile objects to write out
- Type
list
- total_ion_chromatogram_tracker
Accumulated mapping of scan time to total intensity. This is used to write the total ion chromatogram.
- Type
OrderedDict
- writer
The lower level writer implementation
- Type
MzMLWriter
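The two tracker attributes are described as ordered mappings from scan time to an intensity summary. As a rough illustration of that idea, the sketch below accumulates a total ion value and a base peak value per scan. The function track_scan and the toy scan data are hypothetical; this is not the serializer's own code, only one plausible way to fill such mappings.

```python
from collections import OrderedDict

# Stand-ins for the two tracker attributes described above.
total_ion_chromatogram_tracker = OrderedDict()
base_peak_chromatogram_tracker = OrderedDict()


def track_scan(scan_time, intensities):
    """Record the summed and the maximum intensity of one scan under its scan time."""
    total_ion_chromatogram_tracker[scan_time] = sum(intensities)
    base_peak_chromatogram_tracker[scan_time] = max(intensities) if intensities else 0.0


# Toy scans: (scan time in minutes, peak intensities). The last scan is empty.
toy_scans = [
    (0.10, [10.0, 250.0, 40.0]),
    (0.25, [5.0, 80.0]),
    (0.40, []),
]
for scan_time, intensities in toy_scans:
    track_scan(scan_time, intensities)

# Insertion order is preserved, so keys and values line up as chromatogram axes.
times = list(total_ion_chromatogram_tracker.keys())
tic = list(total_ion_chromatogram_tracker.values())
bpc = list(base_peak_chromatogram_tracker.values())
```

Because the mappings keep insertion order, the time and intensity arrays of each chromatogram can be produced in a single pass once the last spectrum has been saved.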
- add_data_processing(data_processing_description: Union[ms_deisotope.data_source.metadata.data_transformation.DataProcessingInformation, ms_deisotope.data_source.metadata.data_transformation.ProcessingMethod])
Add a new DataProcessingInformation or ProcessingMethod.
Creates a new <dataProcessing> entry describing one or more <processingMethod> elements for a single referenced Software instance.
- Parameters
data_processing_description (DataProcessingInformation or ProcessingMethod) – Data manipulation sequence to add to the document
- add_file_contents(file_contents: Union[str, collections.abc.Mapping, ms_deisotope.data_source.metadata.file_information.FileContent])
Add a key to the resulting <fileDescription> of the output document.
- Parameters
file_contents (str or Mapping) – The parameter to add
- add_file_information(file_information: ms_deisotope.data_source.metadata.file_information.FileInformation)
Add the information of a FileInformation to the output document.
- Parameters
file_information (FileInformation) – The information to add.
- add_instrument_configuration(configuration: ms_deisotope.data_source.metadata.instrument_components.InstrumentInformation)
Add an InstrumentInformation object to the output document.
- Parameters
configuration (InstrumentInformation) – The instrument configuration to add
- add_processing_parameter(name: str, value: Optional[Union[str, int, float]] = None)
Add a new processing method to the writer's own <dataProcessing> element.
- Parameters
name (str) – The processing technique's name
value (obj) – The processing technique's value, if any
- add_software(software_description: ms_deisotope.data_source.metadata.software.Software)
Add a Software object to the output document.
- Parameters
software_description (Software) – The software description to add
- add_source_file(source_file: ms_deisotope.data_source.metadata.file_information.SourceFile)
Add the SourceFile to the output document.
- Parameters
source_file (SourceFile) – The source file to add
- close()
Finish writing scan data, write any pending metadata and close the file stream.
May call complete().
- save(bunch: Union[ms_deisotope.data_source.scan.scan.Scan, ms_deisotope.data_source.scan.base.ScanBunch], **kwargs)
Save any scan information in bunch.
This method can handle ScanBunch or ScanBase instances, dispatching to the appropriate logic.
- Parameters
bunch (ScanBunch or ScanBase) – The scan data to save. May be a collection of related scans or a single scan.
See also
save_scan(), save_scan_bunch()
- save_scan(scan: ms_deisotope.data_source.scan.base.ScanBase, **kwargs)
Write a Scan to the output document as a collection of related <spectrum> tags.
Note
If no spectra have been written to the output document yet, this method will call _add_spectrum_list() and write all of the metadata lists out. After this point, no new document-level metadata can be added.
- Parameters
scan (Scan) – The scan to write.
deconvoluted (bool) – Whether the scan to write out should include deconvolution information
- save_scan_bunch(bunch: ms_deisotope.data_source.scan.base.ScanBunch, **kwargs)
Write a ScanBunch to the output document as a collection of related <spectrum> tags.
Note
If no spectra have been written to the output document yet, this method will call _add_spectrum_list() and write all of the metadata lists out. After this point, no new document-level metadata can be added.
- Parameters
bunch (ScanBunch) – The scan set to write.
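The notes on save_scan() and save_scan_bunch() imply an ordering rule: document-level metadata has to be added before the first scan is saved. The sketch below illustrates that rule together with the dispatch performed by save(). ToyWriter, ToyScan, and ToyScanBunch are hypothetical stand-ins, and raising an error on a late addition is an assumption made for the illustration; the real serializer may behave differently.

```python
class ToyScan:
    def __init__(self, scan_id):
        self.id = scan_id


class ToyScanBunch:
    def __init__(self, precursor, products):
        self.precursor = precursor
        self.products = products


class ToyWriter:
    """Illustrative stand-in, not the MzMLSerializer implementation."""

    def __init__(self):
        self.software_list = []
        self.written = []
        self._spectrum_list_started = False

    def add_software(self, software):
        # Assumption: late metadata is rejected once spectra are being written.
        if self._spectrum_list_started:
            raise RuntimeError("metadata must be added before the first scan is saved")
        self.software_list.append(software)

    def save(self, bunch):
        # Dispatch on the kind of object received.
        if isinstance(bunch, ToyScanBunch):
            self.save_scan_bunch(bunch)
        else:
            self.save_scan(bunch)

    def save_scan_bunch(self, bunch):
        if bunch.precursor is not None:
            self.save_scan(bunch.precursor)
        for product in bunch.products:
            self.save_scan(product)

    def save_scan(self, scan):
        # The first write marks the point where the metadata lists are flushed.
        self._spectrum_list_started = True
        self.written.append(scan.id)


writer = ToyWriter()
writer.add_software("toy-software 1.0")
writer.save(ToyScanBunch(ToyScan("ms1"), [ToyScan("ms2a"), ToyScan("ms2b")]))
writer.save(ToyScan("ms1-only"))

try:
    writer.add_software("too-late 0.1")
    late_addition_rejected = False
except RuntimeError:
    late_addition_rejected = True
```

In practice this means calls such as copy_metadata_from(), add_software(), or add_processing_parameter() belong before the loop that saves scans.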
- class ms_deisotope.output.mzml.ProcessedMzMLLoader(source_file, use_index=True, use_extended_index=True)
Extends MzMLLoader to support deserializing preprocessed data and to provide indexing information.
- extended_index
Holds the additional indexing information that may have been generated with the data file being accessed.
- Type
ExtendedIndex
- sample_run
- Type
SampleRun
- get_index_information_by_scan_id(scan_id: str) -> dict
Get the scan description from the extended index.
- has_index_file() -> bool
Checks if an extended index file exists for this reader.
- Return type
bool
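The extended index is described as metadata written to an external file and looked up by scan id. The sketch below shows the general pattern of such a sidecar index using only the standard library. The file suffix, the JSON layout, and the record fields are all assumptions chosen for illustration, and the functions merely borrow the method names above; they do not reflect the actual on-disk format used by ms_deisotope.

```python
import json
import os
import tempfile


def index_path_for(data_path):
    # Hypothetical naming scheme for the sidecar index file.
    return data_path + ".index.json"


def write_index(data_path, entries):
    # Store one metadata record per scan id next to the data file.
    with open(index_path_for(data_path), "w") as fh:
        json.dump(entries, fh)


def has_index_file(data_path):
    # True when the sidecar file exists for this data file.
    return os.path.exists(index_path_for(data_path))


def get_index_information_by_scan_id(data_path, scan_id):
    # Load the sidecar file and return the record for one scan.
    with open(index_path_for(data_path)) as fh:
        return json.load(fh)[scan_id]


workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "toy.mzML")

found_before = has_index_file(data_path)
write_index(data_path, {
    "scan=1": {"ms_level": 1, "scan_time": 0.10},
    "scan=2": {"ms_level": 2, "scan_time": 0.12},
})
found_after = has_index_file(data_path)
info = get_index_information_by_scan_id(data_path, "scan=2")
```

Keeping the index in a separate file lets a reader answer metadata queries without parsing the spectra themselves, which is the benefit the extended index is meant to provide.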