File Description

A mass spectrometry data file contains heterogeneous types of information derived from one or more sources. Some formats, like mzML, track this provenance. It can be queried so that an object selects behavior appropriate to its contents, and to provide a clear chain of evidence back to the original raw data.

This module defines types for describing different kinds of mass spectrometry data files and their contents, and a database of controlled vocabulary terms for them.

class ms_deisotope.data_source.metadata.file_information.FileInformation(contents=None, source_files=None)[source]

Describes the type of data found in this file and the source files that contributed to it.

Implements the MutableMapping Interface

contents

A mapping between controlled vocabulary names or user-defined names and an optional value. For standard controlled names see content_keys.

Type

dict

source_files

The set of files which either define the current file, or were used to create the current file if recorded.

Type

list of SourceFile objects

add_content(key, value=None)[source]

Adds a new key-value pair to contents with an optional value

Parameters
  • key (str or FileContent) – The content name, either a CV-term or a user-defined name

  • value (object, optional) – The optional value, which should be any type of object whose meaning makes sense given the definition of key

add_file(source, check=True)[source]

Add a new file to source_files

If source is a string, it will be interpreted as a path and an instance of SourceFile will be created using SourceFile.from_path(). Otherwise, it is assumed to be an instance of SourceFile.

Parameters
  • source (str or SourceFile) – Either the path to a file to be added to the source file collection, or an instance of SourceFile

  • check (bool, optional) – Whether or not to check and validate that a path points to a real file

Raises

ValueError – If check is true and the path does not point to a real file
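The dispatch described for add_file can be sketched without the library. The block below is a minimal stand-in under stated assumptions, not the library's implementation: SourceFileSketch and the free function add_file are hypothetical names, and using os.path.exists as the validity check is an assumption about what "a real file" means.

```python
import os
from dataclasses import dataclass


@dataclass
class SourceFileSketch:
    """Hypothetical stand-in for SourceFile holding only name and location."""
    name: str
    location: str

    @classmethod
    def from_path(cls, path):
        # Split an absolute path into its base name and containing directory.
        path = os.path.abspath(path)
        return cls(name=os.path.basename(path), location=os.path.dirname(path))


def add_file(source_files, source, check=True):
    """Append `source` to `source_files`, converting path strings first."""
    if isinstance(source, str):
        # Assumption: validation means the path must exist on disk.
        if check and not os.path.exists(source):
            raise ValueError("Source file %r does not exist" % (source,))
        source = SourceFileSketch.from_path(source)
    # Anything that is not a string is stored as given.
    source_files.append(source)
    return source


files = []
add_file(files, "missing_run.mzML", check=False)         # path is not validated
add_file(files, SourceFileSketch("run02.raw", "/data"))  # already an object
```

Passing check=False is useful when the file list describes data that was recorded on another machine and is no longer reachable from the current one.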

copy()[source]

Create a deep copy of this object

Returns

Return type

FileInformation

get_content(key)[source]

Retrieve the value of key from contents.

This method is aliased to __getitem__()

Parameters

key (str or FileContent) –

Returns

Return type

object

has_content(key)[source]

Check whether key is found in contents.

Parameters

key (str or FileContent) –

Returns

Return type

bool

remove_content(key)[source]

Remove a key from contents.

Parameters

key (str or FileContent) – The content key to remove
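The four content methods above line up with the mapping protocol. As a rough illustration of how a class can expose them through the MutableMapping interface, here is a self-contained sketch. ContentMapping is a hypothetical stand-in written for this example, and only the aliasing of get_content to __getitem__() is stated in the documentation above; the other two aliases are assumptions.

```python
from collections.abc import MutableMapping


class ContentMapping(MutableMapping):
    """Hypothetical stand-in for FileInformation, restricted to `contents`."""

    def __init__(self, contents=None):
        self.contents = dict(contents or {})

    def add_content(self, key, value=None):
        self.contents[key] = value

    def get_content(self, key):
        return self.contents[key]

    def has_content(self, key):
        return key in self.contents

    def remove_content(self, key):
        del self.contents[key]

    # MutableMapping needs these five methods; get(), pop(), update(),
    # __contains__() and the rest are derived from them.
    __getitem__ = get_content
    __setitem__ = add_content      # assumed alias
    __delitem__ = remove_content   # assumed alias

    def __iter__(self):
        return iter(self.contents)

    def __len__(self):
        return len(self.contents)


info = ContentMapping()
info.add_content("MS1 spectrum")      # CV term name with no value
info["centroid spectrum"] = None      # same operation via item assignment
```

With this layout, `"MS1 spectrum" in info` and `info.has_content("MS1 spectrum")` answer the same question, so callers can use whichever reads better.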

class ms_deisotope.data_source.metadata.file_information.SourceFile(name, location, id=None, id_format=None, file_format=None, parameters=None)[source]

Represents a single raw data file which either defines or contributed data to another data file, the “reference file”

file_format

The name of a data file format. See file_formats

Type

FileFormat

id

The unique identifier for this file, among files which contributed to the reference file

Type

str

id_format

The name of a formal identifier schema. See id_formats

Type

IDFormat

location

The path to the directory containing this file on the machine where it was last read while contributing to or defining the reference file

Type

str

name

The base name of this file

Type

str

parameters

A set of key-value pairs associated with this file, either encoding extra metadata annotations, or precomputed hash checksums

Type

dict

classmethod from_path(path)[source]

Construct a new SourceFile from a path to a real file on the local file system.

Parameters

path (str) – The path to the file to describe

Returns

Return type

SourceFile
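To make the relationship between name, location, and parameters concrete, the following sketch derives all three from a path using only the standard library. It is an illustration under assumptions, not what from_path() is documented to do: describe_path is a hypothetical helper, and storing the checksum under the key "SHA-1" is an assumed convention.

```python
import hashlib
import os
import tempfile


def describe_path(path, block_size=2 ** 20):
    """Return name, location and a checksum parameter for a file on disk."""
    path = os.path.abspath(path)
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in blocks so large raw files are never loaded whole.
        for chunk in iter(lambda: handle.read(block_size), b""):
            digest.update(chunk)
    return {
        "name": os.path.basename(path),        # base name of the file
        "location": os.path.dirname(path),     # directory containing it
        "parameters": {"SHA-1": digest.hexdigest()},  # assumed key name
    }


# Demonstrate on a throwaway file so the example does not need real data.
with tempfile.TemporaryDirectory() as tmp:
    example = os.path.join(tmp, "run01.mzML")
    with open(example, "wb") as handle:
        handle.write(b"<mzML/>")
    record = describe_path(example)
```

A stored checksum lets a later reader confirm that a file found at the recorded location is the same one that contributed to the reference file.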

ms_deisotope.data_source.metadata.file_information.id_formats

These are the recognized formats for encoding scan identifiers from raw and processed mass spectrometry data files, derived from the HUPO PSI-MS controlled vocabulary.

ms_deisotope.data_source.metadata.file_information.file_formats

These are the recognized formats for storing raw and processed mass spectrometry data, derived from the HUPO PSI-MS controlled vocabulary.

ms_deisotope.data_source.metadata.file_information.content_keys

These terms are commonly used to describe the contents of a mass spectrometry data file, derived from the HUPO PSI-MS controlled vocabulary.
