libgunshotmatch.peak
Classes representing peaks, and functions for peak filtering.
Classes:
PeakList – Represents a list of peaks.
QualifiedPeak – A Peak that has been identified using NIST MS Search and contains a list of possible identities.
QualifiedPeakList – Represents a list of qualified peaks.
Functions:
align_peaks – Perform peak alignment.
base_peak_mass – Returns the mass of the largest fragment in the peak’s mass spectrum.
filter_aligned_peaks – Filter aligned peaks by minimum average peak area, and to the top n largest peaks.
filter_peaks – Filter a list of peaks to remove noise and peaks due to e.g. column bleed.
Construct a
write_alignment – Write the alignment data (retention times, peak areas, mass spectra) to disk.

class PeakList(iterable=(), /)
Represents a list of peaks.
Attributes:
datafile_name – String identifier for the datafile the peaks were detected in.
Methods:
to_list() – Return a list of pure-Python dictionaries representing the peaks and their mass spectra.

datafile_name = None
String identifier for the datafile the peaks were detected in.

class QualifiedPeak(rt=0.0, ms=None, minutes=False, outlier=False, hits=None, peak_number=None)
Bases: Peak
A Peak that has been identified using NIST MS Search and contains a list of possible identities.
Parameters:
rt (float) – Retention time. Default 0.0.
ms (Optional[MassSpectrum]) – The mass spectrum at the apex of the peak. Default None.
minutes (bool) – Retention time units flag. If True, retention time is in minutes; if False, retention time is in seconds. Default False.
outlier (bool) – Whether the peak is an outlier. Default False.
hits (Optional[List[SearchResult]]) – List of possible identities for the peak. Default None.
peak_number (Optional[int]) – Optional numerical identifier for the Peak, such as in an Alignment. Default None.
Methods:
from_dict(d) – Construct a QualifiedPeak from a dictionary.
from_peak(peak) – Construct a QualifiedPeak from a Peak.
to_dict() – Returns a dictionary representation of this peak.
Attributes:
hits – List of possible identities for the peak.
peak_number – Optional numerical identifier for the peak, such as in an Alignment.

classmethod from_dict(d)
Construct a QualifiedPeak from a dictionary.

classmethod from_peak(peak)
Construct a QualifiedPeak from a Peak.
The resulting QualifiedPeak will not have hits or peak_number set, but those attributes can be set after calling this method.
Parameters:
peak (Peak)

hits
Type: List[SearchResult]
List of possible identities for the peak.

class QualifiedPeakList(iterable=(), /)
Bases: List[QualifiedPeak]
Represents a list of qualified peaks.
Attributes:
datafile_name – String identifier for the datafile the peaks were detected in.
Methods:
to_list() – Return a list of pure-Python dictionaries representing the peaks and their mass spectra.

datafile_name = None
String identifier for the datafile the peaks were detected in.

align_peaks(peaks, rt_modulation=2.5, gap_penalty=0.3, min_peaks=1)
Perform peak alignment.
Parameters:
peaks (List[PeakList]) – List of lists of identified peaks. Each PeakList must have its datafile_name attribute set.
rt_modulation (float) – Retention time tolerance parameter for pairwise alignments. Default 2.5.
gap_penalty (float) – Gap parameter for pairwise alignments. Default 0.3.
min_peaks (int) – Minimum number of peaks required for the alignment position to survive filtering. If set to -1, the number of repeats in the project is used. Default 1.
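The min_peaks rule described above can be illustrated with a small stand-alone sketch. The names resolve_min_peaks, position_survives and n_repeats are hypothetical and not part of libgunshotmatch; the code only mirrors the documented behaviour that -1 means "use the number of repeats", and it assumes a missing peak is represented by None.

```python
from typing import Optional, Sequence


def resolve_min_peaks(min_peaks: int, n_repeats: int) -> int:
    """Hypothetical helper: turn the documented ``min_peaks`` value into a threshold."""
    # -1 is the documented sentinel for "use the number of repeats in the project"
    if min_peaks == -1:
        return n_repeats
    return min_peaks


def position_survives(areas: Sequence[Optional[float]], threshold: int) -> bool:
    """Return True if enough datafiles contain a peak at this aligned position."""
    # Assumption: one entry per datafile, with None marking a missing peak
    present = sum(1 for area in areas if area is not None)
    return present >= threshold


threshold = resolve_min_peaks(-1, n_repeats=3)
print(position_survives([1200.0, None, 950.0], threshold))   # prints False
print(position_survives([1200.0, 800.0, 950.0], threshold))  # prints True
```

With the default min_peaks=1, a position found in a single datafile is retained under this reading.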

base_peak_mass(peak)
Returns the mass of the largest fragment in the peak’s mass spectrum.
Parameters:
peak (Peak)
New in version v0.11.0.
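As an illustration of what "largest fragment" means here, the sketch below finds the base peak of a spectrum given as two plain sequences. It assumes "largest" refers to the most intense ion, which is the usual meaning of base peak. The function name base_peak_mass_sketch and its arguments are hypothetical stand-ins, not the library's implementation.

```python
from typing import Sequence


def base_peak_mass_sketch(mass_list: Sequence[float], intensity_list: Sequence[float]) -> float:
    """Return the m/z value whose intensity is highest (the base peak)."""
    if not mass_list or len(mass_list) != len(intensity_list):
        raise ValueError("mass_list and intensity_list must be non-empty and of equal length")
    # Index of the most intense ion in the spectrum
    apex_index = max(range(len(intensity_list)), key=intensity_list.__getitem__)
    return mass_list[apex_index]


masses = [73, 147, 207, 281]
intensities = [120.0, 45.0, 980.0, 310.0]
print(base_peak_mass_sketch(masses, intensities))  # prints 207
```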

filter_aligned_peaks(alignment, top_n_peaks=80, min_peak_area=0)
Filter aligned peaks by minimum average peak area, and to the top n largest peaks.
Returns: pandas.DataFrame giving the retention times of the aligned peaks.
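The two filtering criteria can be sketched without the library. The example below works on a plain dictionary rather than an Alignment, and returns a list rather than a pandas.DataFrame. The function name, the input layout, the inclusive threshold and the ordering of the result are all assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List


def filter_by_area_sketch(
    areas: Dict[float, List[float]],
    top_n_peaks: int = 80,
    min_peak_area: float = 0,
) -> List[float]:
    """Keep the aligned retention times that pass both criteria.

    ``areas`` maps an aligned retention time to the peak areas measured
    in each datafile (hypothetical layout).
    """
    averages = {rt: mean(values) for rt, values in areas.items()}
    # Criterion 1: average area must reach the minimum (inclusive here, by assumption)
    kept = [rt for rt, average in averages.items() if average >= min_peak_area]
    # Criterion 2: keep only the n peaks with the largest average area
    kept.sort(key=averages.__getitem__, reverse=True)
    kept = kept[:top_n_peaks]
    # Report in retention time order
    return sorted(kept)


areas = {5.2: [100, 120], 7.8: [900, 1100], 9.1: [10, 14], 12.4: [400, 380]}
print(filter_by_area_sketch(areas, top_n_peaks=2, min_peak_area=50))  # prints [7.8, 12.4]
```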

filter_peaks(peak_list, tic, noise_filter=True, noise_threshold=2, base_peak_filter=(73, 147), rt_range=None)
Filter a list of peaks to remove noise and peaks due to e.g. column bleed.
Parameters:
tic (IonChromatogram) – The TIC of the GC-MS data from which these peaks were identified.
noise_filter (bool) – Whether to perform automatic noise filtering of the peak list. Default True.
noise_threshold (int) – The minimum number of ions that must have intensities above the noise floor, otherwise the peak is excluded. Default 2.
base_peak_filter (Collection[int]) – Peaks whose base peak is at one of the listed masses (m/z) are excluded. Default (73, 147).
rt_range (Optional[Sequence[float]]) – Optional retention time range (in minutes) to filter the peak list to. Default None.
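The base_peak_filter and rt_range parameters can be illustrated with a simplified stand-in. The SimplePeak class and filter_peaks_sketch function are hypothetical. The sketch assumes retention times are stored in seconds, that rt_range is a (minimum, maximum) pair with inclusive bounds, and it leaves out the noise filtering step entirely.

```python
from dataclasses import dataclass
from typing import Collection, List, Optional, Sequence


@dataclass
class SimplePeak:
    """Hypothetical stand-in for a peak."""
    rt: float        # retention time in seconds (assumption)
    base_peak: int   # m/z of the most intense ion


def filter_peaks_sketch(
    peaks: Sequence[SimplePeak],
    base_peak_filter: Collection[int] = (73, 147),
    rt_range: Optional[Sequence[float]] = None,
) -> List[SimplePeak]:
    kept = []
    for peak in peaks:
        # Exclude peaks whose base peak is at one of the listed masses
        if peak.base_peak in base_peak_filter:
            continue
        if rt_range is not None:
            low, high = rt_range
            # rt_range is given in minutes, so convert before comparing
            if not (low <= peak.rt / 60 <= high):
                continue
        kept.append(peak)
    return kept


peaks = [SimplePeak(300, 73), SimplePeak(600, 91), SimplePeak(1500, 105)]
print(filter_peaks_sketch(peaks, rt_range=(3, 20)))
# prints [SimplePeak(rt=600, base_peak=91)]
```

The first peak is dropped because its base peak is m/z 73, and the third because 25 minutes falls outside the requested range.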

write_alignment(alignment, project_name, output_dir, require_all_datafiles=False)
Write the alignment data (retention times, peak areas, mass spectra) to disk.
The output files are as follows:
{project_name}_alignment_rt.csv, containing the aligned retention times.
{project_name}_alignment_area.csv, containing the peak areas for the corresponding aligned retention times.
{project_name}_alignment_rt.json, containing the aligned retention times.
{project_name}_alignment_area.json, containing the peak areas for the corresponding aligned retention times.
{project_name}_alignment_ms.json, containing the mass spectra for the corresponding aligned retention times.
Parameters:
alignment (Alignment)
project_name (str) – The name of the project. Prefixed to all filenames.
output_dir (Union[str, Path, PathLike]) – Directory to store the output files in.
require_all_datafiles (bool) – Whether the peak must be present in all experiments to be included in the data frame. Default False.
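
The naming scheme listed above can be reproduced with a short helper, which may be useful for checking that all five files exist after a run. The function expected_alignment_files is a hypothetical convenience and not part of libgunshotmatch; only the filename patterns come from the list above.

```python
import os
from pathlib import Path
from typing import List, Union

# Filename endings taken from the documented output list
ALIGNMENT_SUFFIXES = (
    "alignment_rt.csv",
    "alignment_area.csv",
    "alignment_rt.json",
    "alignment_area.json",
    "alignment_ms.json",
)


def expected_alignment_files(project_name: str, output_dir: Union[str, os.PathLike]) -> List[Path]:
    """Return the paths the documented naming scheme would produce."""
    directory = Path(output_dir)
    return [directory / f"{project_name}_{suffix}" for suffix in ALIGNMENT_SUFFIXES]


for path in expected_alignment_files("demo", "output"):
    # After a real run, path.is_file() could be used to check each output
    print(path.name)
```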