Writing mzML
Using the psims library, ms_deisotope.output.mzml can write an mzML
file with all associated metadata, including deconvoluted peak arrays, chromatograms,
and data transformations. The MzMLSerializer class handles all facets of
this process.
This module also contains a specialized version of MzMLLoader,
ProcessedMzMLLoader, which can directly reconstruct each
deconvoluted peak list and provides fast access to an extended index of
metadata that MzMLSerializer writes to an external file.
import ms_deisotope
from ms_deisotope.test.common import datafile
from ms_deisotope.output.mzml import MzMLSerializer

# Open the bundled example file for reading.
reader = ms_deisotope.MSFileLoader(datafile("small.mzML"))
with open("small.deconvoluted.mzML", 'wb') as fh:
    writer = MzMLSerializer(fh, n_spectra=len(reader))
    # Copy the source file's metadata before any spectra are written.
    writer.copy_metadata_from(reader)
    for bunch in reader:
        # Pick peaks and deconvolute the precursor scan, then each product scan.
        bunch.precursor.pick_peaks()
        bunch.precursor.deconvolute()
        for product in bunch.products:
            product.pick_peaks()
            product.deconvolute()
        writer.save(bunch)
    # Finish writing scan data and any pending metadata.
    writer.close()
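The loop above calls methods on bunch.precursor unconditionally. A minimal sketch of a more defensive variant follows, assuming a bunch's precursor may be None (for example, when product scans appear without a preceding MS1 scan). The helper name process_bunch is hypothetical and not part of ms_deisotope; it relies only on the pick_peaks() and deconvolute() calls used in the example.

```python
def process_bunch(bunch):
    """Pick peaks and deconvolute every scan in a bunch.

    `bunch` is expected to look like the objects yielded by the reader:
    it has a `precursor` attribute (assumed here to possibly be None)
    and a `products` list.
    """
    scans = []
    if bunch.precursor is not None:
        scans.append(bunch.precursor)
    scans.extend(bunch.products)
    for scan in scans:
        scan.pick_peaks()
        scan.deconvolute()
    # Return the same object so the call can be nested in writer.save(...).
    return bunch
```

Inside the loop, writer.save(process_bunch(bunch)) could then stand in for the explicit per-scan calls.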
- class ms_deisotope.output.mzml.MzMLSerializer(handle, n_spectra=200000, compression=None, deconvoluted=True, sample_name=None, build_extra_index=True, data_encoding=None, include_software_entry=True, close=None)
Write ms_deisotope data structures to a file in mzML format.
- base_peak_chromatogram_tracker
Accumulated mapping of scan time to base peak intensity. This is used to write the base peak chromatogram.
- Type
OrderedDict
- chromatogram_queue
Accumulate chromatogram data structures which will be written out after all spectra have been written to file.
- Type
list
- compression
The compression type to use for binary data arrays. Should be one of "zlib", "none", or None.
- Type
str
- data_encoding
The encoding specification for the binary encoding of numeric data arrays, passed to write_spectrum() and related methods.
- Type
dict or int or numpy.dtype or str
- data_processing_list
List of packaged DataProcessingInformation to write out.
- Type
list
- deconvoluted
Indicates whether the translation should include extra deconvolution information
- Type
bool
- file_contents_list
List of terms to include in the <fileContents> tag.
- Type
list
- handle
The file-like object being written to
- Type
file-like
- indexer
The external index builder
- Type
ExtendedScanIndex
- instrument_configuration_list
List of packaged InstrumentInformation to write out.
- Type
list
- n_spectra
The number of spectra to declare as the size of the <spectrumList>.
- Type
int
- processing_parameters
List of additional terms to include in a newly created DataProcessingInformation.
- Type
list
- sample_list
List of SampleRun objects to write out.
- Type
list
- sample_name
Default sample name
- Type
str
- sample_run
Description
- Type
SampleRun
- software_list
List of packaged Software objects to write out.
- Type
list
- source_file_list
List of packaged SourceFile objects to write out.
- Type
list
- total_ion_chromatogram_tracker
Accumulated mapping of scan time to total intensity. This is used to write the total ion chromatogram.
- Type
OrderedDict
- writer
The lower level writer implementation
- Type
MzMLWriter
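The two tracker attributes above are described as ordered mappings from scan time to an intensity summary. A minimal sketch of that bookkeeping, using only the standard library, is shown below. It is an illustration of the idea and not the library's implementation: the helper track_scan and the sample values are hypothetical, and details such as which scans contribute are assumptions.

```python
from collections import OrderedDict


def track_scan(tic_tracker, bpc_tracker, scan_time, intensities):
    """Record one spectrum's contribution to both chromatograms."""
    intensities = list(intensities)
    # Total ion chromatogram: sum of all peak intensities at this scan time.
    tic_tracker[scan_time] = float(sum(intensities))
    # Base peak chromatogram: the single most intense peak (0.0 if empty).
    bpc_tracker[scan_time] = float(max(intensities)) if intensities else 0.0


total_ion_chromatogram_tracker = OrderedDict()
base_peak_chromatogram_tracker = OrderedDict()

# Hypothetical (scan time, peak intensities) pairs in acquisition order.
example_scans = [(0.10, [5.0, 20.0, 1.0]), (0.25, [3.0, 4.0]), (0.40, [])]
for scan_time, intensities in example_scans:
    track_scan(total_ion_chromatogram_tracker,
               base_peak_chromatogram_tracker,
               scan_time, intensities)

# Insertion order is preserved, so keys and values line up as the
# time and intensity arrays of each chromatogram.
tic_times = list(total_ion_chromatogram_tracker.keys())
tic_values = list(total_ion_chromatogram_tracker.values())
bpc_values = list(base_peak_chromatogram_tracker.values())
```

Keeping the mapping ordered means the chromatogram arrays can be produced directly once all spectra have been seen, which matches the description of chromatograms being written after the spectra.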
- add_data_processing(data_processing_description: Union[ms_deisotope.data_source.metadata.data_transformation.DataProcessingInformation, ms_deisotope.data_source.metadata.data_transformation.ProcessingMethod])
Add a new DataProcessingInformation or ProcessingMethod.
Creates a new <dataProcessing> entry describing one or more <processingMethod> elements for a single referenced Software instance.
- Parameters
data_processing_description (DataProcessingInformation or ProcessingMethod) – Data manipulation sequence to add to the document
- add_file_contents(file_contents: Union[str, collections.abc.Mapping, ms_deisotope.data_source.metadata.file_information.FileContent])
Add a key to the resulting <fileDescription> of the output document.
- Parameters
file_contents (str or Mapping) – The parameter to add
- add_file_information(file_information: ms_deisotope.data_source.metadata.file_information.FileInformation)
Add the information of a FileInformation to the output document.
- Parameters
file_information (FileInformation) – The information to add.
- add_instrument_configuration(configuration: ms_deisotope.data_source.metadata.instrument_components.InstrumentInformation)
Add an InstrumentInformation object to the output document.
- Parameters
configuration (InstrumentInformation) – The instrument configuration to add
- add_processing_parameter(name: str, value: Optional[Union[str, int, float]] = None)
Add a new processing method to the writer’s own <dataProcessing> element.
- Parameters
name (str) – The processing technique’s name
value (obj) – The processing technique’s value, if any
- add_software(software_description: ms_deisotope.data_source.metadata.software.Software)
Add a Software object to the output document.
- Parameters
software_description (Software) – The software description to add
- add_source_file(source_file: ms_deisotope.data_source.metadata.file_information.SourceFile)
Add the SourceFile to the output document.
- Parameters
source_file (SourceFile) – The source file to add
- close()
Finish writing scan data, write any pending metadata and close the file stream.
May call complete().
- save(bunch: Union[ms_deisotope.data_source.scan.scan.Scan, ms_deisotope.data_source.scan.base.ScanBunch], **kwargs)
Save any scan information in bunch.
This method can handle ScanBunch or ScanBase instances, dispatching to the appropriate logic.
- Parameters
bunch (ScanBunch or ScanBase) – The scan data to save. May be a collection of related scans or a single scan.
- save_scan(scan: ms_deisotope.data_source.scan.base.ScanBase, **kwargs)
Write a Scan to the output document as a collection of related <spectrum> tags.
Note
If no spectra have been written to the output document yet, this method will call _add_spectrum_list() and write all of the metadata lists out. After this point, no new document-level metadata can be added.
- Parameters
scan (Scan) – The scan to write.
deconvoluted (bool) – Whether the scan to write out should include deconvolution information
- save_scan_bunch(bunch: ms_deisotope.data_source.scan.base.ScanBunch, **kwargs)
Write a ScanBunch to the output document as a collection of related <spectrum> tags.
Note
If no spectra have been written to the output document yet, this method will call _add_spectrum_list() and write all of the metadata lists out. After this point, no new document-level metadata can be added.
- Parameters
bunch (ScanBunch) – The scan set to write.
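The notes on save_scan() and save_scan_bunch() imply an ordering constraint: document-level metadata, including anything registered through add_processing_parameter(), has to be added before the first spectrum is saved. A minimal sketch of that ordering follows. The function write_with_parameters and its arguments are hypothetical, and writer is assumed to be an MzMLSerializer-like object exposing the add_processing_parameter(), save(), and close() methods documented above.

```python
def write_with_parameters(writer, bunches, parameters):
    """Register processing parameters, then write every bunch.

    parameters: mapping of parameter name -> value (None for a bare term).
    Returns the number of bunches saved.
    """
    # Metadata first: once the first spectrum is written, the metadata
    # lists have already been emitted and cannot be extended.
    for name, value in parameters.items():
        writer.add_processing_parameter(name, value)

    count = 0
    for bunch in bunches:
        writer.save(bunch)
        count += 1

    # Finish the document after the last spectrum.
    writer.close()
    return count
```

Passing the writer in, instead of constructing it inside the helper, keeps the sketch independent of how the output handle is opened.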
- class ms_deisotope.output.mzml.ProcessedMzMLLoader(source_file, use_index=True, use_extended_index=True)
Extends MzMLLoader to support deserializing preprocessed data and to provide indexing information.
- extended_index
Holds the additional indexing information that may have been generated with the data file being accessed.
- Type
ExtendedScanIndex
- sample_run
- Type
SampleRun
- get_index_information_by_scan_id(scan_id: str) → dict
Get the scan description from the extended index.
- has_index_file() → bool
Checks if an extended index file exists for this reader.
- Return type
bool
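To illustrate reading a file back, here is a hedged sketch that opens a previously written file with ProcessedMzMLLoader and counts the deconvoluted peaks per scan. The function name summarize_processed_file is hypothetical. The sketch assumes that iterating the loader yields bunches with precursor and products attributes, as in the writing example, and that each scan exposes id and deconvoluted_peak_set attributes. The import is deferred so the function can be defined without ms_deisotope installed.

```python
def summarize_processed_file(path):
    """Return (has_index_file, [(scan id, deconvoluted peak count), ...])."""
    # Deferred import: only needed when the function is actually called.
    from ms_deisotope.output.mzml import ProcessedMzMLLoader

    reader = ProcessedMzMLLoader(path)
    # Report whether an extended index file accompanies the data file.
    has_index = reader.has_index_file()

    summary = []
    for bunch in reader:
        scans = list(bunch.products)
        # Assumption: the precursor may be absent for some bunches.
        if bunch.precursor is not None:
            scans.insert(0, bunch.precursor)
        for scan in scans:
            peaks = scan.deconvoluted_peak_set
            summary.append((scan.id, len(peaks) if peaks is not None else 0))
    return has_index, summary
```

A call such as summarize_processed_file("small.deconvoluted.mzML") would apply this to the file produced by the writing example at the top of this page.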