libgunshotmatch.peak

Classes representing peaks, and functions for peak filtering.

Classes:

PeakList([iterable])

Represents a list of peaks.

QualifiedPeak([rt, ms, minutes, outlier, …])

A Peak that has been identified using NIST MS Search and contains a list of possible identities.

QualifiedPeakList([iterable])

Represents a list of qualified peaks.

Functions:

align_peaks(peaks[, rt_modulation, …])

Perform peak alignment.

base_peak_mass(peak)

Returns the mass of the largest fragment in the peak’s mass spectrum.

filter_aligned_peaks(alignment[, …])

Filter aligned peaks by minimum average peak area, and keep only the n largest peaks.

filter_peaks(peak_list, tic[, noise_filter, …])

Filter a list of peaks to remove noise and peaks due to e.g. column bleed.

peak_from_dict(d)

Construct a Peak from a dictionary.

write_alignment(alignment, project_name, …)

Write the alignment data (retention times, peak areas, mass spectra) to disk.

class PeakList(iterable=(), /)[source]

Bases: List[Peak]

Represents a list of peaks.

Attributes:

datafile_name

String identifier for the datafile the peaks were detected in.

Methods:

to_list()

Return a list of pure-Python dictionaries representing the peaks and their mass spectra.

datafile_name = None

Type:    Optional[str]

String identifier for the datafile the peaks were detected in.

to_list()[source]

Return a list of pure-Python dictionaries representing the peaks and their mass spectra.

Return type

List[Dict[str, Any]]
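The relationship between the list, its datafile_name attribute and to_list() can be pictured with a stand-in class. This is a minimal sketch using only the standard library: SketchPeak and SketchPeakList are hypothetical names, and the dictionary keys are assumptions rather than the library's actual output format.

```python
from typing import Any, Dict, List, Optional


class SketchPeak:
    """Hypothetical stand-in for a peak: a retention time plus a mass spectrum."""

    def __init__(self, rt: float, masses: List[float], intensities: List[float]):
        self.rt = rt
        self.masses = masses
        self.intensities = intensities

    def to_dict(self) -> Dict[str, Any]:
        # Key names are illustrative assumptions, not the library's schema.
        return {
            "rt": self.rt,
            "masses": list(self.masses),
            "intensities": list(self.intensities),
        }


class SketchPeakList(List[SketchPeak]):
    """Hypothetical list subclass that carries a datafile identifier."""

    datafile_name: Optional[str] = None

    def to_list(self) -> List[Dict[str, Any]]:
        # One plain dictionary per peak, in list order.
        return [peak.to_dict() for peak in self]


peaks = SketchPeakList([SketchPeak(61.5, [73.0, 147.0], [100.0, 40.0])])
peaks.datafile_name = "sample_1"
as_dicts = peaks.to_list()
```

The same pattern would apply to a list of qualified peaks, with each dictionary additionally carrying the peak's hits and peak number.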

class QualifiedPeak(rt=0.0, ms=None, minutes=False, outlier=False, hits=None, peak_number=None)[source]

Bases: Peak

A Peak that has been identified using NIST MS Search and contains a list of possible identities.

Parameters

Methods:

from_dict(d)

Construct a QualifiedPeak from a dictionary.

from_peak(peak)

Construct a QualifiedPeak from a Peak.

to_dict()

Returns a dictionary representation of this peak.

Attributes:

hits

List of possible identities for the peak.

peak_number

Optional numerical identifier for the peak, such as in an Alignment.

classmethod from_dict(d)[source]

Construct a QualifiedPeak from a dictionary.

Parameters

d (Mapping[str, Any])

Return type

QualifiedPeak

classmethod from_peak(peak)[source]

Construct a QualifiedPeak from a Peak.

The resulting QualifiedPeak will not have hits or peak_number set, but those attributes can be set after calling this method.

Parameters

peak (Peak)

Return type

QualifiedPeak

hits

Type:    List[SearchResult]

List of possible identities for the peak.

peak_number

Type:    Optional[int]

Optional numerical identifier for the peak, such as in an Alignment.

to_dict()[source]

Returns a dictionary representation of this peak.

All keys are native, JSON-serializable Python objects.

Return type

Dict[str, Any]
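Because the dictionary holds only native objects, it should survive a round trip through the standard json module. The sketch below uses a hand-written dictionary as a stand-in for a to_dict() result; its keys and values (including "name" and "match_factor") are assumptions chosen for illustration, not the library's schema.

```python
import json

# Hypothetical dictionary in the spirit of a to_dict() result.
peak_dict = {
    "rt": 61.5,
    "peak_number": 3,
    "hits": [{"name": "example compound", "match_factor": 850}],
}

# Serialise to a JSON string and parse it back again.
encoded = json.dumps(peak_dict)
decoded = json.loads(encoded)
```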

class QualifiedPeakList(iterable=(), /)[source]

Bases: List[QualifiedPeak]

Represents a list of qualified peaks.

Attributes:

datafile_name

String identifier for the datafile the peaks were detected in.

Methods:

to_list()

Return a list of pure-Python dictionaries representing the peaks and their mass spectra.

datafile_name = None

Type:    Optional[str]

String identifier for the datafile the peaks were detected in.

to_list()[source]

Return a list of pure-Python dictionaries representing the peaks and their mass spectra.

Return type

List[Dict[str, Any]]

align_peaks(peaks, rt_modulation=2.5, gap_penalty=0.3, min_peaks=1)[source]

Perform peak alignment.

Parameters
  • peaks (List[PeakList]) – List of lists of identified peaks. Each PeakList must have its datafile_name attribute set.

  • rt_modulation (float) – Retention time tolerance parameter for pairwise alignments. Default 2.5.

  • gap_penalty (float) – Gap parameter for pairwise alignments. Default 0.3.

  • min_peaks (int) – Minimum number of peaks required for the alignment position to survive filtering. If set to -1, the number of repeats in the project is used. Default 1.

Return type

Alignment
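The min_peaks rule, including the -1 sentinel, can be illustrated without the library. This is a minimal sketch under stated assumptions: resolve_min_peaks and surviving_positions are hypothetical helpers, and an alignment is modelled as rows of retention times with None marking a repeat in which the peak was not found.

```python
from typing import List, Optional

Row = List[Optional[float]]


def resolve_min_peaks(min_peaks: int, n_repeats: int) -> int:
    """Map the -1 sentinel to the number of repeats; otherwise return min_peaks."""
    return n_repeats if min_peaks == -1 else min_peaks


def surviving_positions(positions: List[Row], min_peaks: int = 1) -> List[Row]:
    """Keep alignment positions with at least min_peaks non-missing peaks."""
    n_repeats = len(positions[0]) if positions else 0
    threshold = resolve_min_peaks(min_peaks, n_repeats)
    return [row for row in positions if sum(rt is not None for rt in row) >= threshold]


# Each row is one alignment position across three repeats.
positions = [
    [5.01, 5.02, 5.00],
    [7.40, None, 7.42],
    [None, None, 9.10],
]

kept_default = surviving_positions(positions, min_peaks=1)
kept_two = surviving_positions(positions, min_peaks=2)
kept_all = surviving_positions(positions, min_peaks=-1)
```

With min_peaks=-1 only positions found in every repeat remain, which is the strictest setting.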

base_peak_mass(peak)[source]

Returns the mass of the largest fragment in the peak’s mass spectrum.

Parameters

peak (Peak)

New in version v0.11.0.

Return type

float
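The idea of a base peak can be shown on plain sequences. The sketch below is an assumption about the concept, not the library's code: sketch_base_peak_mass is a hypothetical helper that takes parallel lists of masses and intensities instead of a Peak.

```python
from typing import Sequence


def sketch_base_peak_mass(masses: Sequence[float], intensities: Sequence[float]) -> float:
    """Return the mass with the largest intensity (the first one wins on ties)."""
    if not masses or len(masses) != len(intensities):
        raise ValueError("masses and intensities must be non-empty and of equal length")
    # Index of the most intense fragment.
    index_of_max = max(range(len(intensities)), key=intensities.__getitem__)
    return float(masses[index_of_max])


base = sketch_base_peak_mass([41.0, 73.0, 147.0], [120.0, 999.0, 310.0])
```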

filter_aligned_peaks(alignment, top_n_peaks=80, min_peak_area=0)[source]

Filter aligned peaks by minimum average peak area, and keep only the n largest peaks.

Parameters
  • alignment (Alignment)

  • top_n_peaks (int) – Filter to the largest n peaks. If 0 all peaks are included. Default 80.

  • min_peak_area (float) – Exclude aligned peaks with an average peak area below this threshold. Default 0.

Return type

DataFrame

Returns

pandas.DataFrame giving the retention times of the aligned peaks.
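The two filters described above can be sketched on plain dictionaries. This is an illustration under assumptions, not the library's implementation: sketch_filter is a hypothetical helper, areas are keyed by peak number, missing values are not handled, and returning the result in descending order of average area is a choice made here for clarity.

```python
from statistics import mean
from typing import Dict, List


def sketch_filter(
    areas: Dict[int, List[float]],
    top_n_peaks: int = 80,
    min_peak_area: float = 0,
) -> List[int]:
    """Return peak numbers passing the area threshold, limited to the n largest."""
    averages = {number: mean(values) for number, values in areas.items()}
    # Drop peaks whose average area is below the threshold.
    passing = [number for number, avg in averages.items() if avg >= min_peak_area]
    # Largest average area first.
    passing.sort(key=lambda number: averages[number], reverse=True)
    # A value of 0 means "no limit".
    if top_n_peaks:
        passing = passing[:top_n_peaks]
    return passing


# Peak number -> peak areas across two repeats (made-up values).
areas = {
    0: [100.0, 120.0],
    1: [900.0, 1100.0],
    2: [10.0, 20.0],
    3: [500.0, 400.0],
}

top_two = sketch_filter(areas, top_n_peaks=2)
above_100 = sketch_filter(areas, top_n_peaks=0, min_peak_area=100)
everything = sketch_filter(areas, top_n_peaks=0)
```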

filter_peaks(peak_list, tic, noise_filter=True, noise_threshold=2, base_peak_filter=(73, 147), rt_range=None)[source]

Filter a list of peaks to remove noise and peaks due to e.g. column bleed.

Parameters
  • peak_list (List[Peak])

  • tic (IonChromatogram) – The TIC of the GC-MS data from which these peaks were identified.

  • noise_filter (bool) – Whether to perform automatic noise filtering of the peak list. Default True.

  • noise_threshold (int) – The minimum number of ions that must have intensities above the noise floor, otherwise the peak is excluded. Default 2.

  • base_peak_filter (Collection[int]) – Peaks whose base peak is at one of the listed masses (m/z) are excluded. Default (73, 147).

  • rt_range (Optional[Sequence[float]]) – Optional retention time range (in minutes) to filter the peak list to. Default None.

Return type

PeakList
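The base peak and retention time filters can be sketched on simplified records. The example below makes several assumptions: sketch_filter_peaks is a hypothetical helper, each peak is reduced to a (retention time in minutes, base peak m/z) tuple, the range bounds are treated as inclusive, and the TIC-based noise filter is left out entirely.

```python
from typing import Collection, List, Optional, Sequence, Tuple

# (retention time in minutes, base peak m/z); a hypothetical simplified record.
SketchPeak = Tuple[float, int]


def sketch_filter_peaks(
    peaks: List[SketchPeak],
    base_peak_filter: Collection[int] = (73, 147),
    rt_range: Optional[Sequence[float]] = None,
) -> List[SketchPeak]:
    """Drop peaks with an excluded base peak or outside the retention time range."""
    kept = []
    for rt, base_mz in peaks:
        if base_mz in base_peak_filter:
            continue  # base peak is at one of the excluded masses
        if rt_range is not None and not (rt_range[0] <= rt <= rt_range[1]):
            continue  # outside the requested retention time window
        kept.append((rt, base_mz))
    return kept


peaks = [(3.2, 73), (4.5, 91), (10.1, 147), (12.8, 105), (25.0, 57)]

default_filtered = sketch_filter_peaks(peaks)
windowed = sketch_filter_peaks(peaks, rt_range=(4.0, 15.0))
```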

peak_from_dict(d)[source]

Construct a Peak from a dictionary.

Parameters

d (Dict[str, Any])

Return type

Peak

write_alignment(alignment, project_name, output_dir, require_all_datafiles=False)[source]

Write the alignment data (retention times, peak areas, mass spectra) to disk.

The output files are as follows:

  • {project_name}_alignment_rt.csv, containing the aligned retention times.

  • {project_name}_alignment_area.csv, containing the peak areas for the corresponding aligned retention times.

  • {project_name}_alignment_rt.json, containing the aligned retention times.

  • {project_name}_alignment_area.json, containing the peak areas for the corresponding aligned retention times.

  • {project_name}_alignment_ms.json, containing the mass spectra for the corresponding aligned retention times.

Parameters
  • alignment (Alignment)

  • project_name (str) – The name of the project. Prefixed to all filenames.

  • output_dir (Union[str, Path, PathLike]) – Directory to store the output files in.

  • require_all_datafiles (bool) – Whether the peak must be present in all experiments to be included in the data frame. Default False.
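The output file naming scheme listed above can be reproduced with pathlib. This sketch only builds the paths and writes nothing; sketch_alignment_paths and its dictionary keys are hypothetical names introduced for illustration.

```python
from pathlib import Path
from typing import Dict


def sketch_alignment_paths(project_name: str, output_dir: str) -> Dict[str, Path]:
    """Build the five output paths, each prefixed with the project name."""
    base = Path(output_dir)
    return {
        "rt_csv": base / f"{project_name}_alignment_rt.csv",
        "area_csv": base / f"{project_name}_alignment_area.csv",
        "rt_json": base / f"{project_name}_alignment_rt.json",
        "area_json": base / f"{project_name}_alignment_area.json",
        "ms_json": base / f"{project_name}_alignment_ms.json",
    }


paths = sketch_alignment_paths("demo", "output")
```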