libgunshotmatch.consolidate
Functions for combining peak identifications across aligned peaks into a single set of results.
Classes:
ConsolidatedPeak – A Peak that has been produced by consolidating the properties and search results of several qualified peaks.
ConsolidatedPeakFilter – Class to filter a list of consolidated peaks to exclude peaks by hit name, match factor etc.
ConsolidatedSearchResult – Represents a candidate compound for a peak.
InvertedFilter – Inverted version of ConsolidatedPeakFilter.
Functions:
Sum the intensities across all mass spectra in the given peak.
match_counter – Find the most likely compound for each peak.
Between Samples Spectra Comparison.

class ConsolidatedPeak(rt_list, area_list, ms_list, *, minutes=False, hits=None, ms_comparison=None, meta=None)
A Peak that has been produced by consolidating the properties and search results of several qualified peaks.
Parameters:
rt_list (List[float]) – List of retention times of the aligned peaks.
area_list (List[float]) – List of peak areas for the aligned peaks.
ms_list (MutableSequence[Optional[MassSpectrum]]) – List of mass spectra for the aligned peaks.
minutes (bool) – Retention time units flag. If True, retention time is in minutes; if False, retention time is in seconds. Default False.
hits (Optional[List[ConsolidatedSearchResult]]) – Optional list of possible identities for this peak. Default None.
ms_comparison (Union[Mapping[str, float], Series, None]) – Mapping or Pandas Series giving pairwise mass spectral comparison scores. Default None.
meta (Optional[Dict[str, Any]]) – Optional dictionary for storing e.g. peak number or whether the peak should be hidden. Default None.
Methods:
__len__() – How many instances of the peak make up this ConsolidatedPeak.
from_dict(d) – Construct a ConsolidatedPeak from a dictionary.
to_dict() – Returns a dictionary representation of this ConsolidatedPeak.
Attributes:
The average peak area across the aligned peaks.
List of peak areas for the aligned peaks.
The standard deviation of the peak area across the aligned peaks.
The average of the pairwise mass spectral comparison scores.
Optional list of possible identities for this peak.
Optional dictionary for storing e.g. peak number or whether the peak should be hidden.
Pairwise mass spectral comparison scores.
The standard deviation of the pairwise mass spectral comparison scores.
List of mass spectra for the aligned peaks.
The average retention time across the aligned peaks.
List of retention times of the aligned peaks.
The standard deviation of the retention time across the aligned peaks.
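
The average and standard deviation attributes summarise the per-sample lists passed to the constructor. The following is a minimal sketch of that kind of summary, not the library's implementation: the helper name `summarise` and the sample values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption.

```python
import statistics
from typing import List, Tuple


def summarise(values: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) for a list of values."""
    mean = statistics.mean(values)
    # statistics.stdev needs at least two values; use 0.0 for a single peak.
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, stdev


# Hypothetical retention times (seconds) and areas for three aligned peaks.
rt_list = [301.2, 301.5, 300.9]
area_list = [1200.0, 1500.0, 1350.0]

rt_mean, rt_stdev = summarise(rt_list)
area_mean, area_stdev = summarise(area_list)

print(f"rt: {rt_mean:.2f} +/- {rt_stdev:.2f} s")
print(f"area: {area_mean:.1f} +/- {area_stdev:.1f}")
```

A large standard deviation relative to the mean suggests the aligned peaks differ noticeably between samples.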

__len__()
How many instances of the peak make up this ConsolidatedPeak.
Return type: int

property area_stdev
The standard deviation of the peak area across the aligned peaks.
Return type: float

property average_ms_comparison
The average of the pairwise mass spectral comparison scores.
Return type: float

classmethod from_dict(d)
Construct a ConsolidatedPeak from a dictionary.
Return type: ConsolidatedPeak

hits
Type: List[ConsolidatedSearchResult]
Optional list of possible identities for this peak.

meta
Optional dictionary for storing e.g. peak number or whether the peak should be hidden.

property ms_comparison_stdev
The standard deviation of the pairwise mass spectral comparison scores.
Return type: float

ms_list
Type: MutableSequence[Optional[MassSpectrum]]
List of mass spectra for the aligned peaks.

property rt_stdev
The standard deviation of the retention time across the aligned peaks.
Return type: float

class ConsolidatedPeakFilter(name_filter=[], min_match_factor=600, min_appearances=-1, verbose=False)
Class to filter a list of consolidated peaks to exclude peaks by hit name, match factor etc.
New in version 0.2.0.
Parameters:
name_filter (Iterable[str]) – List of glob-style matches for compound names. Consolidated peaks matching any of these will be excluded. Default [].
min_match_factor (int) – Minimum average match factor. Consolidated peaks with an average match factor below this will be excluded. Default 600.
min_appearances (int) – Number of times the hit must appear across the individual aligned peaks. Consolidated peaks where the most common hit appears fewer times than this will be excluded. If set to -1, the number of instances of the peak in the project is used. Default -1.
verbose (bool) – If True, details of excluded peaks will be printed. Default False.
Methods:
filter(consolidated_peaks) – Filter a list of consolidated peaks.
from_method(method) – Construct a ConsolidatedPeakFilter from a ConsolidateMethod.
print_skip_reason(peak, reason) – Print the reason for skipping a peak, if ConsolidatedPeakFilter.verbose is True.
should_filter_peak(peak) – Returns True if the peak should be excluded based on the current filter options.
Attributes:
min_appearances – Number of times the hit must appear across the individual aligned peaks.
min_match_factor – Minimum average match factor.
name_filter – List of glob-style matches for compound names.
verbose – If True, details of excluded peaks will be printed.

filter(consolidated_peaks)
Filter a list of consolidated peaks.
Parameters: consolidated_peaks (List[ConsolidatedPeak])

classmethod from_method(method)
Construct a ConsolidatedPeakFilter from a ConsolidateMethod.
Parameters: method (ConsolidateMethod)
Return type: ConsolidatedPeakFilter

min_appearances
Type: int
Number of times the hit must appear across the individual aligned peaks. Consolidated peaks where the most common hit appears fewer times than this will be excluded. If set to -1, the number of instances of the peak in the project is used.
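
One way to read the -1 default is "the most common hit must appear in every instance of the peak". The sketch below illustrates that reading under the assumption that the "number of instances" is the length of the consolidated peak. Both function names are hypothetical and are not part of libgunshotmatch.

```python
def resolve_min_appearances(min_appearances: int, n_instances: int) -> int:
    """Return the effective threshold, treating -1 as 'every instance'."""
    if min_appearances == -1:
        return n_instances
    return min_appearances


def too_few_appearances(hit_count: int, min_appearances: int, n_instances: int) -> bool:
    """True if the most common hit appears fewer times than required."""
    return hit_count < resolve_min_appearances(min_appearances, n_instances)


# A peak aligned across 5 samples whose top hit was found in 4 of them.
print(too_few_appearances(hit_count=4, min_appearances=-1, n_instances=5))
print(too_few_appearances(hit_count=4, min_appearances=3, n_instances=5))
```

With the default, a hit missing from even one sample is enough to exclude the peak; an explicit threshold relaxes that.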

min_match_factor
Type: int
Minimum average match factor. Consolidated peaks with an average match factor below this will be excluded.

name_filter
List of glob-style matches for compound names. Consolidated peaks matching any of these will be excluded.
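
Glob-style matching of compound names can be illustrated with the standard library's `fnmatch` module. This is a sketch only: whether the library matches case-sensitively, and which name of a peak it tests, are assumptions here, and `name_is_filtered` and the patterns are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def name_is_filtered(name: str, name_filter: Iterable[str]) -> bool:
    """True if the compound name matches any glob pattern (case-sensitive)."""
    return any(fnmatchcase(name, pattern) for pattern in name_filter)


# Illustrative patterns: "*" matches any run of characters.
patterns = ["*silane*", "Cyclo*siloxane*"]

for compound in ["Trimethylsilane", "Cyclotrisiloxane, hexamethyl-", "Diphenylamine"]:
    print(compound, name_is_filtered(compound, patterns))
```

An empty `name_filter` (the default) matches nothing, so no peak is excluded by name.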

print_skip_reason(peak, reason)
Print the reason for skipping a peak, if ConsolidatedPeakFilter.verbose is True.
Parameters:
peak (ConsolidatedPeak) – The peak being skipped.
reason (str) – The reason for skipping the peak.

should_filter_peak(peak)
Returns True if the peak should be excluded based on the current filter options.
Parameters: peak (ConsolidatedPeak)
Return type: bool

class InvertedFilter(name_filter=[], min_match_factor=600, min_appearances=-1, verbose=False)
Bases: libgunshotmatch.consolidate.ConsolidatedPeakFilter
Inverted version of ConsolidatedPeakFilter. Returns peaks which would be excluded by a ConsolidatedPeakFilter.
New in version 0.10.0.
Parameters:
name_filter (Iterable[str]) – List of glob-style matches for compound names. Consolidated peaks matching any of these will be excluded. Default [].
min_match_factor (int) – Minimum average match factor. Consolidated peaks with an average match factor below this will be excluded. Default 600.
min_appearances (int) – Number of times the hit must appear across the individual aligned peaks. Consolidated peaks where the most common hit appears fewer times than this will be excluded. If set to -1, the number of instances of the peak in the project is used. Default -1.
verbose (bool) – If True, details of excluded peaks will be printed. Default False.
Methods:
filter(consolidated_peaks) – Filter a list of consolidated peaks.
print_skip_reason(peak, reason) – Print the reason for skipping a peak, if ConsolidatedPeakFilter.verbose is True.

filter(consolidated_peaks)
Filter a list of consolidated peaks.
Parameters: consolidated_peaks (List[ConsolidatedPeak])

print_skip_reason(peak, reason)
Print the reason for skipping a peak, if ConsolidatedPeakFilter.verbose is True.
Parameters:
peak (ConsolidatedPeak) – The peak being skipped.
reason (str) – The reason for skipping the peak.
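
The relationship between the two filters can be pictured as splitting one list with a single predicate: the regular filter keeps the items the predicate rejects, and the inverted filter keeps the ones it accepts. The sketch below uses plain numbers as stand-ins for peaks and a hypothetical predicate mirroring `min_match_factor=600`. It assumes `filter()` returns the retained peaks as a list, and none of these helper names exist in the library.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def keep_passing(items: List[T], should_filter: Callable[[T], bool]) -> List[T]:
    """Regular behaviour: drop every item the predicate says to exclude."""
    return [item for item in items if not should_filter(item)]


def keep_excluded(items: List[T], should_filter: Callable[[T], bool]) -> List[T]:
    """Inverted behaviour: return only the items that would be excluded."""
    return [item for item in items if should_filter(item)]


def below_600(match_factor: float) -> bool:
    """Stand-in predicate: exclude average match factors below 600."""
    return match_factor < 600


# Stand-in "peaks", represented only by their average match factor.
match_factors = [850, 590, 720, 410]

kept = keep_passing(match_factors, below_600)
excluded = keep_excluded(match_factors, below_600)
print(kept, excluded)
```

Together the two results cover the original list exactly once, which makes the inverted form handy for reviewing what a given set of filter options throws away.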

class ConsolidatedSearchResult(name, cas, mf_list=[], rmf_list=[], hit_numbers=[], reference_data=None)
Represents a candidate compound for a peak. This is determined from a set of SearchResults for a set of aligned peaks.
Parameters:
name (str) – The name of the candidate compound.
cas (str) – The CAS number of the compound.
mf_list (List[int]) – List of Match Factors comparing the mass spectrum of the peak with the reference spectrum in each aligned peak. Will contain NaN where the compound was not in the hit list for a peak. Default [].
rmf_list (List[int]) – List of Reverse Match Factors comparing the reference spectrum with the spectrum for each aligned peak. Will contain NaN where the compound was not in the hit list for a peak. Default [].
hit_numbers (List[int]) – List of “hit” numbers from NIST MS Search. Lower is better. Will contain NaN where the compound was not in the hit list for a peak. Default [].
reference_data (Union[Dict, ReferenceData, None]) – The reference mass spectrum for the compound from the NIST library. Default None.
Methods:
__len__() – The number of aligned peaks the compound appeared in the hit list for.
from_dict(d) – Construct a ConsolidatedSearchResult from a dictionary.
to_dict() – Returns a dictionary representation of this ConsolidatedSearchResult.
Attributes:
average_hit_number – The average hit number.
cas – The CAS number of the compound.
hit_number_stdev – The standard deviation of the hit numbers.
hit_numbers – List of “hit” numbers from NIST MS Search.
match_factor – The average match factor.
match_factor_stdev – The standard deviation of the match factors.
mf_list – List of Match Factors comparing the mass spectrum of the peak with the reference spectrum in each aligned peak.
name – The name of the candidate compound.
reference_data – The reference mass spectrum for the compound from the NIST library.
reverse_match_factor – The average reverse match factor.
reverse_match_factor_stdev – The standard deviation of the reverse match factors.
rmf_list – List of Reverse Match Factors comparing the reference spectrum with the spectrum for each aligned peak.

__len__()
The number of aligned peaks the compound appeared in the hit list for.
Return type: int

property average_hit_number
The average hit number. Missing values (where the compound was not in the hit list for a peak) are excluded from the calculation.
Return type: float

classmethod from_dict(d)
Construct a ConsolidatedSearchResult from a dictionary.
Return type: ConsolidatedSearchResult

property hit_number_stdev
The standard deviation of the hit numbers. Missing values (where the compound was not in the hit list for a peak) are excluded from the calculation.
Return type: float

hit_numbers
List of “hit” numbers from NIST MS Search. Lower is better. Will contain NaN where the compound was not in the hit list for a peak.

property match_factor
The average match factor. Missing values (where the compound was not in the hit list for a peak) are excluded from the calculation.
Return type: float

property match_factor_stdev
The standard deviation of the match factors. Missing values (where the compound was not in the hit list for a peak) are excluded from the calculation.
Return type: float

mf_list
List of Match Factors comparing the mass spectrum of the peak with the reference spectrum in each aligned peak. Will contain NaN where the compound was not in the hit list for a peak.

reference_data
Type: Optional[ReferenceData]
The reference mass spectrum for the compound from the NIST library.

property reverse_match_factor
The average reverse match factor. Missing values (where the compound was not in the hit list for a peak) are excluded from the calculation.
Return type: float

property reverse_match_factor_stdev
The standard deviation of the reverse match factors. Missing values (where the compound was not in the hit list for a peak) are excluded from the calculation.
Return type: float

rmf_list
List of Reverse Match Factors comparing the reference spectrum with the spectrum for each aligned peak. Will contain NaN where the compound was not in the hit list for a peak.
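
Because the lists hold NaN wherever the compound was missing from a peak's hit list, the averages and standard deviations have to skip those entries. The following sketch shows one way to do that with the standard library. It is not the library's code: the helper names and data are hypothetical, and the sample (n - 1) standard deviation is an assumption.

```python
import math
import statistics
from typing import List, Tuple


def nan_mean_stdev(values: List[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation, ignoring NaN entries."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        # No peak contained the compound, so there is nothing to average.
        return math.nan, math.nan
    mean = statistics.mean(present)
    stdev = statistics.stdev(present) if len(present) > 1 else 0.0
    return mean, stdev


def count_appearances(values: List[float]) -> int:
    """Number of aligned peaks in which the compound actually appeared."""
    return sum(1 for v in values if not math.isnan(v))


# Hypothetical match factors for four aligned peaks; the compound was
# absent from the hit list of the second peak.
mf_list = [800.0, math.nan, 900.0, 850.0]

mean_mf, stdev_mf = nan_mean_stdev(mf_list)
print(mean_mf, stdev_mf, count_appearances(mf_list))
```

Skipping NaN rather than treating it as zero stops a compound from being penalised twice: its absence already shows up in the appearance count.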

match_counter(engine, peak_numbers, qualified_peaks, ms_comp_data)
Find the most likely compound for each peak.