libgunshotmatch.peak
Classes representing peaks, and functions for peak filtering.
Classes:
PeakList – Represents a list of peaks.
QualifiedPeak – A Peak that has been identified using NIST MS Search and contains a list of possible identities.
QualifiedPeakList – Represents a list of qualified peaks.
Functions:
align_peaks – Perform peak alignment.
base_peak_mass – Returns the mass of the largest fragment in the peak’s mass spectrum.
filter_aligned_peaks – Filter aligned peaks by minimum average peak area, and to the top n largest peaks.
filter_peaks – Filter a list of peaks to remove noise and peaks due to e.g. column bleed.
Construct a
write_alignment – Write the alignment data (retention times, peak areas, mass spectra) to disk.
write_project_alignment – Write the alignment data (retention times, peak areas, mass spectra) to disk.
class PeakList(iterable=(), /)
Represents a list of peaks.
Attributes:
datafile_name – String identifier for the datafile the peaks were detected in.
Methods:
to_list() – Return a list of pure-Python dictionaries representing the peaks and their mass spectra.
datafile_name = None
String identifier for the datafile the peaks were detected in.
class QualifiedPeak(rt=0.0, ms=None, minutes=False, outlier=False, hits=None, peak_number=None)
Bases: Peak
A Peak that has been identified using NIST MS Search and contains a list of possible identities.
- Parameters
rt (float) – Retention time. Default 0.0.
ms (Optional[MassSpectrum]) – The mass spectrum at the apex of the peak. Default None.
minutes (bool) – Retention time units flag. If True, retention time is in minutes; if False, retention time is in seconds. Default False.
outlier (bool) – Whether the peak is an outlier. Default False.
hits (Optional[List[SearchResult]]) – List of possible identities for the peak. Default None.
peak_number (Optional[int]) – Optional numerical identifier for the Peak, such as in an Alignment. Default None.
Methods:
from_dict(d) – Construct a QualifiedPeak from a dictionary.
from_peak(peak) – Construct a QualifiedPeak from a Peak.
to_dict() – Returns a dictionary representation of this peak.
Attributes:
hits – List of possible identities for the peak.
peak_number – Optional numerical identifier for the peak, such as in an Alignment.
classmethod from_dict(d)
Construct a QualifiedPeak from a dictionary.
- Parameters
- Return type
QualifiedPeak
classmethod from_peak(peak)
Construct a QualifiedPeak from a Peak.
The resulting QualifiedPeak will not have hits or peak_number set, but those attributes can be set after calling this method.
- Parameters
peak (Peak)
- Return type
QualifiedPeak
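The from_peak() pattern described above can be illustrated with a small, self-contained sketch. SketchPeak and SketchQualifiedPeak are hypothetical stand-ins written only for this example and are not the library's classes. Representing "not set" as an empty list for hits and None for peak_number is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SketchPeak:
    """Hypothetical stand-in for a peak: a retention time plus (m/z, intensity) pairs."""

    rt: float = 0.0
    ms: Optional[List[Tuple[int, float]]] = None


@dataclass
class SketchQualifiedPeak(SketchPeak):
    """Hypothetical stand-in for a peak that also carries candidate identities."""

    hits: List[str] = field(default_factory=list)
    peak_number: Optional[int] = None

    @classmethod
    def from_peak(cls, peak: SketchPeak) -> "SketchQualifiedPeak":
        # Copy only what the plain peak knows; hits and peak_number stay unset.
        return cls(rt=peak.rt, ms=peak.ms)


plain = SketchPeak(rt=312.4, ms=[(73, 100.0), (91, 250.0)])
fresh = SketchQualifiedPeak.from_peak(plain)

# The identity-related attributes can be assigned after construction.
qualified = SketchQualifiedPeak.from_peak(plain)
qualified.hits = ["candidate A", "candidate B"]
qualified.peak_number = 7
```

The alternative constructor keeps the conversion in one place, so callers do not need to know which fields of the plain peak are carried over.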
hits
Type: List[SearchResult]
List of possible identities for the peak.
class QualifiedPeakList(iterable=(), /)
Bases: List[QualifiedPeak]
Represents a list of qualified peaks.
Attributes:
datafile_name – String identifier for the datafile the peaks were detected in.
Methods:
to_list() – Return a list of pure-Python dictionaries representing the peaks and their mass spectra.
datafile_name = None
String identifier for the datafile the peaks were detected in.
align_peaks(peaks, rt_modulation=2.5, gap_penalty=0.3, min_peaks=1)
Perform peak alignment.
- Parameters
peaks (List[PeakList]) – List of lists of identified peaks. Each PeakList must have its datafile_name attribute set.
rt_modulation (float) – Retention time tolerance parameter for pairwise alignments. Default 2.5.
gap_penalty (float) – Gap parameter for pairwise alignments. Default 0.3.
min_peaks (int) – Minimum number of peaks required for the alignment position to survive filtering. If set to -1, the number of repeats in the project is used. Default 1.
- Return type
-
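The effect of min_peaks can be sketched without the library. The helper below is hypothetical and is not how align_peaks() is implemented. It assumes each alignment position is a row with one retention time per repeat, with None marking a repeat in which the peak was not found.

```python
from typing import List, Optional

Row = List[Optional[float]]


def filter_positions_sketch(positions: List[Row], min_peaks: int = 1) -> List[Row]:
    """Keep alignment positions observed in at least ``min_peaks`` repeats (hypothetical helper)."""
    if not positions:
        return []
    # Assumption: every row has one entry per repeat, so the row length is the repeat count.
    n_repeats = len(positions[0])
    # -1 means "require the peak in every repeat".
    threshold = n_repeats if min_peaks == -1 else min_peaks
    return [row for row in positions if sum(rt is not None for rt in row) >= threshold]


positions = [
    [3.21, 3.22, 3.20],  # found in all three repeats
    [4.87, None, 4.88],  # found in two repeats
    [None, None, 6.05],  # found in one repeat only
]
at_least_two = filter_positions_sketch(positions, min_peaks=2)
in_every_repeat = filter_positions_sketch(positions, min_peaks=-1)
```

With the default of 1, every position that contains at least one peak survives.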
base_peak_mass(peak)
Returns the mass of the largest fragment in the peak’s mass spectrum.
- Parameters
peak (Peak)
New in version v0.11.0.
- Return type
-
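As a rough illustration of the idea, the sketch below finds the base peak from two plain lists. It assumes that "largest fragment" means the most intense ion, which is the usual meaning of "base peak". The helper name and the list-based spectrum are hypothetical and do not reflect the library's data structures.

```python
from typing import Sequence


def base_peak_mass_sketch(mass_list: Sequence[float], intensity_list: Sequence[float]) -> float:
    """Return the m/z value with the highest intensity (hypothetical helper)."""
    if not mass_list or len(mass_list) != len(intensity_list):
        raise ValueError("mass_list and intensity_list must be non-empty and of equal length")
    # Index of the most intense ion; ties resolve to the first occurrence.
    apex_index = max(range(len(intensity_list)), key=intensity_list.__getitem__)
    return mass_list[apex_index]


masses = [57, 73, 91, 147]
intensities = [120.0, 980.0, 310.0, 45.0]
result = base_peak_mass_sketch(masses, intensities)
```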
filter_aligned_peaks(alignment, top_n_peaks=80, min_peak_area=0)
Filter aligned peaks by minimum average peak area, and to the top n largest peaks.
- Parameters
- Return type
- Returns
pandas.DataFrame giving the retention times of the aligned peaks.
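The two filtering steps can be sketched in plain Python. This is a hypothetical helper operating on a dictionary of peak areas, not the library's implementation. Ignoring missing values when averaging and ranking by average area are both assumptions.

```python
from statistics import mean
from typing import Dict, List, Optional


def filter_by_area_sketch(
    areas: Dict[int, List[Optional[float]]],
    top_n_peaks: int = 80,
    min_peak_area: float = 0,
) -> List[int]:
    """Return the ids of aligned peaks that pass both filters (hypothetical helper)."""
    average_area = {}
    for peak_id, values in areas.items():
        # Assumption: repeats where the peak is missing (None) are left out of the average.
        present = [value for value in values if value is not None]
        if present:
            average_area[peak_id] = mean(present)
    # Step 1: drop aligned peaks whose average area is below the minimum.
    kept = [peak_id for peak_id, avg in average_area.items() if avg >= min_peak_area]
    # Step 2: keep only the top_n_peaks largest by average area.
    kept.sort(key=average_area.__getitem__, reverse=True)
    return kept[:top_n_peaks]


areas = {
    0: [1000.0, 1200.0, 1100.0],
    1: [50.0, None, 70.0],
    2: [5000.0, 4800.0, 5200.0],
    3: [300.0, 280.0, 320.0],
}
selected = filter_by_area_sketch(areas, top_n_peaks=2, min_peak_area=100)
```

Here peak 1 falls below the minimum average area, and peak 3 is dropped because only the two largest remaining peaks are kept.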
filter_peaks(peak_list, tic, noise_filter=True, noise_threshold=2, base_peak_filter=(73, 147), rt_range=None)
Filter a list of peaks to remove noise and peaks due to e.g. column bleed.
- Parameters
tic (IonChromatogram) – The TIC of the GC-MS data from which these peaks were identified.
noise_filter (bool) – Whether to perform automatic noise filtering of the peak list. Default True.
noise_threshold (int) – The minimum number of ions that must have intensities above the noise floor, otherwise the peak is excluded. Default 2.
base_peak_filter (Collection[int]) – Peaks whose base peak is at one of the listed masses (m/z) are excluded. Default (73, 147).
rt_range (Optional[Sequence[float]]) – Optional retention time range (in minutes) to filter the peak list to. Default None.
- Return type
-
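Two of the filters described above, base_peak_filter and rt_range, can be sketched as follows. The helper and the dictionary-based peaks are hypothetical. Noise filtering is left out, and treating the retention time bounds as inclusive is an assumption.

```python
from typing import Collection, Dict, List, Optional, Sequence


def filter_peaks_sketch(
    peaks: List[Dict[str, float]],
    base_peak_filter: Collection[int] = (73, 147),
    rt_range: Optional[Sequence[float]] = None,
) -> List[Dict[str, float]]:
    """Apply a base-peak filter and an optional retention time window (hypothetical helper)."""
    kept = []
    for peak in peaks:
        # Exclude peaks whose base peak sits at one of the listed m/z values.
        if peak["base_peak"] in base_peak_filter:
            continue
        # Exclude peaks outside the retention time window, given in minutes.
        if rt_range is not None:
            low, high = rt_range
            if not (low <= peak["rt_minutes"] <= high):
                continue
        kept.append(peak)
    return kept


peaks = [
    {"rt_minutes": 2.1, "base_peak": 91},
    {"rt_minutes": 5.4, "base_peak": 73},
    {"rt_minutes": 9.8, "base_peak": 169},
    {"rt_minutes": 14.2, "base_peak": 147},
    {"rt_minutes": 21.7, "base_peak": 105},
]
filtered = filter_peaks_sketch(peaks, rt_range=(3.0, 20.0))
unwindowed = filter_peaks_sketch(peaks)
```

With the window applied, only the peak at 9.8 minutes remains; without it, the peaks at 2.1 and 21.7 minutes are kept as well.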
write_alignment(alignment, project_name, output_dir, require_all_datafiles=False)
Write the alignment data (retention times, peak areas, mass spectra) to disk.
The output files are as follows:
{project_name}_alignment_rt.csv, containing the aligned retention times.
{project_name}_alignment_area.csv, containing the peak areas for the corresponding aligned retention times.
{project_name}_alignment_rt.json, containing the aligned retention times.
{project_name}_alignment_area.json, containing the peak areas for the corresponding aligned retention times.
{project_name}_alignment_ms.json, containing the mass spectra for the corresponding aligned retention times.
- Parameters
alignment (Alignment)
project_name (str) – The name of the project. Prefixed to all filenames.
output_dir (Union[str, Path, PathLike]) – Directory to store the output files in.
require_all_datafiles (bool) – Whether the peak must be present in all experiments to be included in the data frame. Default False.
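The naming scheme listed above can be reproduced with a short helper, which may be handy for locating the files afterwards. The function name alignment_output_paths is hypothetical. It only builds the expected paths and does not write anything.

```python
from pathlib import Path
from typing import List


def alignment_output_paths(project_name: str, output_dir: str) -> List[Path]:
    """Build the five expected output paths for a project (hypothetical helper)."""
    # Suffixes taken from the list of output files documented above.
    suffixes = ["rt.csv", "area.csv", "rt.json", "area.json", "ms.json"]
    return [Path(output_dir) / f"{project_name}_alignment_{suffix}" for suffix in suffixes]


paths = alignment_output_paths("demo_project", "output")
names = [path.name for path in paths]
```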
write_project_alignment(project, output_dir, require_all_datafiles=False)
Write the alignment data (retention times, peak areas, mass spectra) to disk.
The output files are as follows:
{project.name}_alignment_rt.csv, containing the aligned retention times.
{project.name}_alignment_area.csv, containing the peak areas for the corresponding aligned retention times.
{project.name}_alignment_rt.json, containing the aligned retention times.
{project.name}_alignment_area.json, containing the peak areas for the corresponding aligned retention times.
{project.name}_alignment_ms.json, containing the mass spectra for the corresponding aligned retention times.
- Parameters
New in version 0.12.0: Added as an alternative to write_alignment(). This function sorts the columns to match the order of project.datafile_data.
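The column sorting mentioned in the version note can be pictured with a small sketch. It assumes a table held as a dictionary of columns keyed by datafile name. The helper and the example names are hypothetical and do not reflect how the library stores its data.

```python
from typing import Dict, List


def reorder_columns_sketch(
    table: Dict[str, List[float]],
    datafile_order: List[str],
) -> Dict[str, List[float]]:
    """Return the table with its columns in ``datafile_order`` (hypothetical helper)."""
    # Dictionaries preserve insertion order, so inserting in the wanted order fixes the columns.
    return {name: table[name] for name in datafile_order if name in table}


area_table = {
    "repeat_3": [5200.0, 320.0],
    "repeat_1": [5000.0, 300.0],
    "repeat_2": [4800.0, 280.0],
}
ordered = reorder_columns_sketch(area_table, ["repeat_1", "repeat_2", "repeat_3"])
column_names = list(ordered)
```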