nomspectra.spectrum module
- class nomspectra.spectrum.Spectrum(table: Optional[pandas.core.frame.DataFrame] = None, metadata: Optional[Dict] = None)[source]
Bases: object
A class used to represent a mass spectrum
- __init__(table: Optional[pandas.core.frame.DataFrame] = None, metadata: Optional[Dict] = None) pandas.core.frame.DataFrame[source]
- Parameters
table (pandas DataFrame) – Optional. Contains the spectrum (mass and intensity of peaks) and all calculated parameters such as brutto formulas, calculated mass, and relative error.
metadata (Dict) – Optional. Default None. Data to add to the spectrum metadata.
- ai() nomspectra.spectrum.Spectrum[source]
Calculate AI (aromaticity index)
Add column “AI” to self.table
- Return type
nomspectra.spectrum.Spectrum
References
Koch, Boris P., and T. Dittmar. “From mass to structure: An aromaticity index for high resolution mass data of natural organic matter.” Rapid communications in mass spectrometry 20.5 (2006): 926-932.
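As a worked illustration, the sketch below computes AI for a single formula. It assumes that AI is the ratio DBE_AI / CAI, using the expressions documented for dbe_ai() and cai() in this module. The helper name `aromaticity_index` is hypothetical and not part of nomspectra, and the library may treat edge cases (such as CAI <= 0) differently.

```python
def aromaticity_index(c, h, o=0, n=0, s=0, p=0):
    """Return AI = DBE_AI / CAI for one formula (assumed definition)."""
    dbe_ai = 1 + c - o - s - 0.5 * (h + n + p)
    cai = c - o - n - s - p
    if cai <= 0:
        # Assumption: AI is treated as undefined when the denominator is not positive.
        return None
    return dbe_ai / cai


# Benzoic acid, C7H6O2: DBE_AI = 3, CAI = 5
ai_benzoic = aromaticity_index(c=7, h=6, o=2)
print(ai_benzoic)
```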
- assign(brutto_dict: Optional[dict] = None, generated_bruttos_table: Optional[pandas.core.frame.DataFrame] = None, rel_error: Optional[float] = None, abs_error: Optional[float] = None, sign: str = '-', mass_min: Optional[float] = None, mass_max: Optional[float] = None, intensity_min: Optional[float] = None, intensity_max: Optional[float] = None, charge_max: int = 1) nomspectra.spectrum.Spectrum[source]
Assign brutto formulas to signals by mass
- Parameters
brutto_dict (dict) – Optional. Default None. Custom dictionary for generating the brutto table. Example: {'C': (4, 51), 'H': (4, 101), 'O': (0, 26), 'N': (0, 4), 'C_13': (0, 3)}
generated_bruttos_table (pandas DataFrame) – Optional. Contains a ‘mass’ column and element columns, and should be sorted by ‘mass’. Can be generated by the function brutto_generator.brutto_gen(). If None, a table is generated with the default elements and ranges C: 4-50, H: 4-100, O: 0-25, N: 0-3, S: 0-2.
rel_error (float) – Optional. Default 0.5. Permissible error in ppm when assigning masses to brutto formulas.
abs_error (float) – Optional. Default None. Permissible absolute error when assigning masses to brutto formulas.
sign (str) – Optional. Default ‘-’. Mode in which the mass spectrum was acquired: ‘-’ for negative mode, ‘+’ for positive mode, ‘0’ for neutral.
mass_min (float) – Optional. Default None. Minimum mass for assignment.
mass_max (float) – Optional. Default None. Maximum mass for assignment.
intensity_min (float) – Optional. Default None. Minimum intensity for assignment.
intensity_max (float) – Optional. Default None. Maximum intensity for assignment.
charge_max (int) – Maximum charge in m/z. Default 1.
- Return type
nomspectra.spectrum.Spectrum
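The idea of matching a measured mass against a table sorted by mass, within a ppm tolerance, can be sketched as follows. This is only an illustration of the tolerance check, not the nomspectra implementation: the helper `match_by_mass` and the candidate values are hypothetical, and ionization mode (sign) and charge are ignored.

```python
from bisect import bisect_left


def match_by_mass(measured, candidates, rel_error=0.5):
    """Return the formula closest to `measured` within `rel_error` ppm, else None.

    `candidates` is a list of (mass, formula) tuples sorted by mass.
    """
    masses = [mass for mass, _ in candidates]
    pos = bisect_left(masses, measured)
    best = None
    # Only the neighbours on either side of the insertion point can be closest.
    for i in (pos - 1, pos):
        if 0 <= i < len(candidates):
            mass, formula = candidates[i]
            ppm = abs(measured - mass) / mass * 1e6
            if ppm <= rel_error and (best is None or ppm < best[0]):
                best = (ppm, formula)
    return None if best is None else best[1]


# Hypothetical candidate table (values chosen for illustration only).
table = [(200.0000, "A"), (200.0010, "B"), (300.0000, "C")]
hit = match_by_mass(200.00005, table)   # about 0.25 ppm away from "A"
miss = match_by_mass(250.0, table)      # nothing within 0.5 ppm
```

Keeping the table sorted is what makes the binary search possible, which is presumably why generated_bruttos_table is required to be sorted by ‘mass’.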
- brutto() nomspectra.spectrum.Spectrum[source]
Calculate the brutto formula string from the assignment table
Add column “brutto” to self.table
- Return type
nomspectra.spectrum.Spectrum
- cai() nomspectra.spectrum.Spectrum[source]
Calculate CAI (C - O - N - S - P)
Add column “CAI” to self.table
- Return type
nomspectra.spectrum.Spectrum
- calc_all_metrics() nomspectra.spectrum.Spectrum[source]
Calculate all available metrics
- Return type
nomspectra.spectrum.Spectrum
- calc_error(sign: Optional[str] = None) nomspectra.spectrum.Spectrum[source]
Calculate relative and absolute error of assigned peaks from measured and calculated masses
Add columns “abs_error” and “rel_error” to self.table
- Parameters
sign ({'-', '+', '0'}) – Optional. Default None, in which case the sign is taken from metadata or determined from the spectrum itself. Mode in which the mass spectrum was acquired: ‘-’ for negative mode, ‘+’ for positive mode, ‘0’ for neutral.
- Return type
nomspectra.spectrum.Spectrum
- calc_mass() nomspectra.spectrum.Spectrum[source]
Calculate mass from assigned brutto formulas and elements exact masses
Add column “calc_mass” to self.table
- Return type
nomspectra.spectrum.Spectrum
- copy() nomspectra.spectrum.Spectrum[source]
Deep copy of this Spectrum object
- Return type
nomspectra.spectrum.Spectrum
- cram() nomspectra.spectrum.Spectrum[source]
Mark rows that fit CRAM conditions (carboxylic-rich alicyclic molecules)
Add column “CRAM” to self.table
- Return type
nomspectra.spectrum.Spectrum
References
Hertkorn, N. et al. Characterization of a major refractory component of marine dissolved organic matter. Geochimica et Cosmochimica Acta 70, 2990-3010 (2006)
- dbe() nomspectra.spectrum.Spectrum[source]
Calculate DBE (1 + C - 0.5 * (H + N))
Add column “DBE” to self.table
- Return type
nomspectra.spectrum.Spectrum
- dbe_ai() nomspectra.spectrum.Spectrum[source]
Calculate DBE_AI (1 + C - O - S - 0.5 * (H + N + P))
Add column “DBE_AI” to self.table
- Return type
nomspectra.spectrum.Spectrum
- dbe_o() nomspectra.spectrum.Spectrum[source]
Calculate DBE - O
Add column “DBE-O” to self.table
- Return type
nomspectra.spectrum.Spectrum
- dbe_oc() nomspectra.spectrum.Spectrum[source]
Calculate (DBE - O) / C
Add column “DBE-OC” to self.table
- Return type
nomspectra.spectrum.Spectrum
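The DBE-family formulas documented for dbe(), dbe_ai(), dbe_o() and dbe_oc() can be checked by hand for a single formula. The sketch below evaluates them with plain Python; the helper `dbe_metrics` is hypothetical and only mirrors the expressions given in the docstrings above.

```python
def dbe_metrics(c, h, o=0, n=0, s=0, p=0):
    """Evaluate the documented DBE-type expressions for one formula."""
    dbe = 1 + c - 0.5 * (h + n)
    return {
        "DBE": dbe,
        "DBE_AI": 1 + c - o - s - 0.5 * (h + n + p),
        "DBE-O": dbe - o,
        "DBE-OC": (dbe - o) / c,
    }


# Benzoic acid, C7H6O2
metrics = dbe_metrics(c=7, h=6, o=2)
print(metrics)
```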
- drop_unassigned() nomspectra.spectrum.Spectrum[source]
Drop rows that have no assigned brutto formula
- Return type
nomspectra.spectrum.Spectrum
Caution
Risk of data loss: this operation excludes data that may be useful.
- filter_by_C13(rel_error: float = 0.5, remove: bool = False) nomspectra.spectrum.Spectrum[source]
Check whether peaks have a counterpart with the same brutto formula containing the C13 isotope
- Parameters
rel_error (float) – Optional. Default 0.5. Allowable error in ppm when checking for the C13 isotope peak.
remove (bool) – Optional. Default False. Drop unassigned peaks and peaks without a C13 isotope peak.
- Return type
nomspectra.spectrum.Spectrum
- find_elements() Sequence[str][source]
Find elements from columns of mass spectrum table.
For example, column ‘C’ will be recognised as carbon 12C, column ‘C_13’ as 13C
- Return type
list
- get_dbe_vs_o(olim: Optional[Tuple[int, int]] = None, draw: bool = True, ax: Optional[matplotlib.pyplot.axes] = None, **kwargs) Tuple[float, float][source]
Calculate DBE vs nO by linear fit
- Parameters
olim (tuple of two int) – Limits for nO. Default None.
draw (bool) – Draw a scatter plot of DBE vs nO and show the fit.
ax (matplotlib axes) – Axes of an external plot to draw on. Default None.
**kwargs (dict) – Additional keyword arguments passed to the matplotlib scatter call.
- Returns
a and b in fit DBE = a * nO + b
- Return type
(float, float)
References
Bae, E., Yeo, I. J., Jeong, B., Shin, Y., Shin, K. H., & Kim, S. (2011). Study of double bond equivalents and the numbers of carbon and oxygen atom distribution of dissolved organic matter with negative-mode FT-ICR MS. Analytical chemistry, 83(11), 4193-4199.
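The returned pair (a, b) describes a straight line DBE = a * nO + b. A minimal ordinary least-squares sketch is shown below. It assumes a plain fit over (nO, DBE) points; whether nomspectra averages DBE per oxygen number before fitting is not stated here, and the helper `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Made-up points lying exactly on DBE = 2 * nO + 1
n_o = [2, 3, 4, 5]
dbe = [5.0, 7.0, 9.0, 11.0]
a, b = fit_line(n_o, dbe)
```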
- get_mol_class(how_average: str = 'weight', how: Optional[str] = None) pandas.core.frame.DataFrame[source]
Get molecular class density
- Parameters
how_average ({'weight', 'count'}) – How to average the density. Default “weight” (weighted by intensity). Can also be “count”.
how ({'kellerman', 'perminova', 'laszakovits'}) – How to divide formulas into classes. Optional. Default ‘laszakovits’.
- Return type
pandas Dataframe
References
Laszakovits, J. R., & MacKay, A. A. Journal of the American Society for Mass Spectrometry, 2021, 33(1), 198-202.
Kellerman, A. M., Dittmar, T., Kothawala, D. N., & Tranvik, L. J. Nat. Commun. 2014, 5, 3804.
Perminova, I. V. Pure and Applied Chemistry. 2019. Vol. 91, № 5. P. 851-864.
- get_mol_metrics(metrics: Optional[Sequence[str]] = None, func: Optional[str] = None) pandas.core.frame.DataFrame[source]
Get average metrics
- Parameters
metrics (Sequence[str]) – Optional. Default None. Metrics to report.
func ({'weight', 'mean', 'median', 'max', 'min', 'std'}) – How to calculate the average. May be “weight” (default, average weighted by intensity), “mean”, “median”, “max”, “min”, or “std” (standard deviation).
- Return type
pandas DataFrame
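The default “weight” option is an intensity-weighted average. The sketch below shows the arithmetic on two made-up peaks; the helper `weighted_mean` is hypothetical and is not the library's code.

```python
def weighted_mean(values, intensities):
    """Average of `values` weighted by peak intensity."""
    total = sum(intensities)
    return sum(v * w for v, w in zip(values, intensities)) / total


# Two peaks with H/C of 1.0 and 2.0; the first peak is three times more intense.
hc = [1.0, 2.0]
intensity = [3.0, 1.0]
avg = weighted_mean(hc, intensity)  # (1.0 * 3 + 2.0 * 1) / 4
```

With func="mean" both peaks would count equally, so the two options differ whenever intensities are uneven.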
- get_squares_vk(how_average: str = 'weight', ax: Optional[matplotlib.pyplot.axes] = None, draw: bool = False) pandas.core.frame.DataFrame[source]
Calculate density in Van Krevelen diagram divided into 20 squares
Square indices in the Van Krevelen diagram, with H/C as rows and O/C as columns:
[[5, 10, 15, 20],
[4, 9, 14, 19],
[3, 8, 13, 18],
[2, 7, 12, 17],
[1, 6, 11, 16]]
H/C is divided into [0-0.6, 0.6-1, 1-1.4, 1.4-1.8, 1.8-2.2].
O/C is divided into [0-0.25, 0.25-0.5, 0.5-0.75, 0.75-1.0].
- Parameters
how_average ({'weight', 'count'}) – How to calculate the average. May be “count” or “weight” (default).
ax (matplotlib ax) – Optional. External axes to draw on.
draw (bool) – Optional. Default False. Plot a heatmap.
- Return type
Pandas Dataframe
References
Perminova I. V. From green chemistry and nature-like technologies towards ecoadaptive chemistry and technology // Pure and Applied Chemistry. 2019. Vol. 91, № 5. P. 851-864.
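The square numbering can be reproduced from the bin edges listed in the description. The sketch below assumes that the index matrix is drawn like the plot, so that square 1 is the lowest H/C and lowest O/C bin, and that a value lying exactly on an edge falls into the upper bin. The helper `vk_square` is hypothetical.

```python
from bisect import bisect_right

HC_EDGES = [0.6, 1.0, 1.4, 1.8]   # inner edges of the five H/C bins
OC_EDGES = [0.25, 0.5, 0.75]      # inner edges of the four O/C bins


def vk_square(hc, oc):
    """Return the square index (1..20), or None outside the diagram."""
    if not (0 <= hc <= 2.2 and 0 <= oc <= 1.0):
        return None
    row = bisect_right(HC_EDGES, hc)   # 0..4, from low to high H/C
    col = bisect_right(OC_EDGES, oc)   # 0..3, from low to high O/C
    return col * 5 + row + 1


square = vk_square(hc=1.2, oc=0.3)   # third H/C bin, second O/C bin
```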
- hc_oc() nomspectra.spectrum.Spectrum[source]
Calculate H/C and O/C
Add columns “H/C” and “O/C” to self.table
- Return type
nomspectra.spectrum.Spectrum
- head(num: Optional[int] = None) pandas.core.frame.DataFrame[source]
Show the head of the mass spectrum table
- Parameters
num (int) – Optional. Number of rows to show from the head of the table.
- Return type
Pandas Dataframe
- intens_sub(other: nomspectra.spectrum.Spectrum) nomspectra.spectrum.Spectrum[source]
Subtract another spectrum from this one by intensity
The result contains only peaks whose intensity is higher than in other. The intensity of these peaks is the difference between self and other.
- Parameters
other (Spectrum object) – the other mass spectrum
- Return type
nomspectra.spectrum.Spectrum
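The subtraction rule can be illustrated with two small peak lists held as dictionaries. This is a sketch under stated assumptions: peaks are matched by an identical key, and a peak missing from other is treated as having zero intensity there. The helper `subtract_intensities` is hypothetical.

```python
def subtract_intensities(own, other):
    """own, other: dicts mapping a peak key (for example mass) to intensity."""
    result = {}
    for key, intensity in own.items():
        # Assumption: a peak absent from `other` counts as intensity 0.
        diff = intensity - other.get(key, 0.0)
        if diff > 0:
            result[key] = diff
    return result


a = {200.1: 10.0, 300.2: 5.0, 400.3: 2.0}
b = {200.1: 4.0, 300.2: 8.0}
diff = subtract_intensities(a, b)
```

Here the peak at 300.2 is dropped because it is weaker in a than in b.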
- kendrick() nomspectra.spectrum.Spectrum[source]
Calculate Kendrick mass and Kendrick mass defect
Add columns “Ke” and “KMD” to self.table
- Return type
nomspectra.spectrum.Spectrum
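For orientation, the Kendrick mass rescales a mass so that CH2 weighs exactly 14, and members of a CH2 homologous series then share one mass defect. The sketch below uses one common sign convention (nominal Kendrick mass minus Kendrick mass); the convention nomspectra uses for “KMD” is not stated here, and the helper `kendrick` is hypothetical.

```python
CH2_EXACT = 14.01565  # exact mass of CH2 used for the Kendrick scale


def kendrick(mass):
    """Return (Kendrick mass, Kendrick mass defect) for one mass."""
    ke = mass * 14.0 / CH2_EXACT
    kmd = round(ke) - ke   # assumed sign convention
    return ke, kmd


# Two masses that differ by exactly one CH2 unit
ke1, kmd1 = kendrick(200.1000)
ke2, kmd2 = kendrick(200.1000 + CH2_EXACT)
```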
- merge_duplicates() nomspectra.spectrum.Spectrum[source]
Merge duplicates with the same calculated mass, summing their intensities
- Return type
nomspectra.spectrum.Spectrum
- merge_isotopes() nomspectra.spectrum.Spectrum[source]
Merge isotopes.
For example, if the spectrum table has ‘C’ and ‘C_13’ columns, they will be summed into the ‘C’ column.
- Return type
nomspectra.spectrum.Spectrum
Caution
Risk of data loss: this operation excludes data that may be useful.
- mol_class(how: Optional[str] = None) nomspectra.spectrum.Spectrum[source]
Assign molecular class for formulas
Add column “class” to self.table
- Parameters
how ({'kellerman', 'perminova', 'laszakovits'}) – How to divide formulas into classes. Optional. Default ‘laszakovits’.
- Return type
nomspectra.spectrum.Spectrum
References
Laszakovits, J. R., & MacKay, A. A. Journal of the American Society for Mass Spectrometry, 2021, 33(1), 198-202.
Kellerman, A. M., Dittmar, T., Kothawala, D. N., & Tranvik, L. J. Nat. Commun. 2014, 5, 3804.
Perminova, I. V. Pure and Applied Chemistry. 2019. Vol. 91, № 5. P. 851-864.
- noise_filter(force: float = 1.5, intensity: Optional[float] = None, quantile: Optional[float] = None) nomspectra.spectrum.Spectrum[source]
Remove noise from spectrum
- Parameters
intensity (float) – Cut by minimum intensity. Default None (not applied).
quantile (float) – Cut by quantile. For example, 0.1 means that the 10% of peaks with the lowest intensity are removed. Default None (not applied).
force (float) – How aggressively to cut peaks when the noise level is found automatically. Default 1.5 means that peaks with intensity below noise level * 1.5 are removed.
- Return type
nomspectra.spectrum.Spectrum
Caution
There is a risk of losing data, so use this cautiously. The noise level may be determined incorrectly. Plot and inspect the spectrum.
- normalize(how: str = 'sum') nomspectra.spectrum.Spectrum[source]
Normalize peak intensities
- Parameters
how ({'sum', 'max', 'median', 'mean'}) – ‘sum’ normalizes by the sum of the intensities of all peaks (default). ‘max’ normalizes by the intensity of the highest peak. ‘median’ normalizes by the median peak intensity. ‘mean’ normalizes by the mean peak intensity.
- Return type
nomspectra.spectrum.Spectrum
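The four options differ only in the divisor applied to every intensity. A minimal sketch on a plain list follows; the standalone function `normalize` here is hypothetical and is not the Spectrum method itself.

```python
from statistics import mean, median


def normalize(intensities, how="sum"):
    """Divide every intensity by the chosen summary statistic."""
    divisors = {"sum": sum, "max": max, "median": median, "mean": mean}
    divisor = divisors[how](intensities)
    return [value / divisor for value in intensities]


peaks = [1.0, 3.0, 4.0]
by_sum = normalize(peaks)          # intensities then add up to 1
by_max = normalize(peaks, "max")   # the highest peak becomes 1
```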
- nosc() nomspectra.spectrum.Spectrum[source]
Calculate nominal oxidation state of carbon (NOSC)
Add column “NOSC” to self.table
Notes
> 0 - oxidized state. < 0 - reduced state. 0 - neutral state.
References
Boye, Kristin, et al. “Thermodynamically controlled preservation of organic carbon in floodplains.” Nature Geoscience 10.6 (2017): 415-419.
- Return type
nomspectra.spectrum.Spectrum
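As a worked example, the sketch below evaluates NOSC with the commonly used expression for an uncharged formula, NOSC = 4 - (4C + H - 3N - 2O + 5P - 2S) / C. That nomspectra uses exactly this expression is an assumption, and the standalone function `nosc` is hypothetical.

```python
def nosc(c, h, o=0, n=0, s=0, p=0):
    """Nominal oxidation state of carbon for a neutral formula (assumed expression)."""
    return 4 - (4 * c + h - 3 * n - 2 * o + 5 * p - 2 * s) / c


glucose = nosc(c=6, h=12, o=6)   # C6H12O6
methane = nosc(c=1, h=4)         # CH4, fully reduced carbon
carbon_dioxide = nosc(c=1, h=0, o=2)   # CO2, fully oxidized carbon
```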
- static read_csv(filename: Union[pathlib.Path, str], mapper: Optional[Mapping[str, str]] = None, ignore_columns: Optional[Sequence[str]] = None, take_columns: Optional[Sequence[str]] = None, take_only_mz: bool = False, sep: str = ',', intens_min: Optional[Union[int, float]] = None, intens_max: Optional[Union[int, float]] = None, mass_min: Optional[Union[int, float]] = None, mass_max: Optional[Union[int, float]] = None, assign_mark: bool = False, metadata: Optional[Dict] = None) nomspectra.spectrum.Spectrum[source]
Read mass spectrum table from a CSV (comma-separated values) file
All parameters are optional except filename. The file must have a header and at least two main columns: mass and intensity.
- Parameters
filename (str) – path to mass spectrum table
mapper (dict) – dictionary for recognizing columns in the mass spectrum file. Example: {'m/z': 'mass', 'I': 'intensity'}
ignore_columns (Sequence[str]) – list of column names that will not be loaded. If None, all columns are loaded. Example: [“index”, “s/n”]
take_columns (Sequence[str]) – list of column names that will be loaded; all others are ignored. If None, all columns are loaded. Example: [“mass”, “intensity”, “C”, “H”, “N”, “O”]
take_only_mz (bool) – load only the mass and intensity columns
sep (str) – separator in the mass spectrum table; ‘\t’ for tab.
intens_min (numeric) – lower limit for intensity. Default None (no restriction).
intens_max (numeric) – upper limit for intensity. Default None (no restriction).
mass_min (numeric) – lower limit for m/z. Default None (no restriction).
mass_max (numeric) – upper limit for m/z. Default None (no restriction).
assign_mark (bool) – Default False. Mark peaks as assigned if they have elements. Needed when loading a mass list processed by external software.
metadata (Dict) – Optional. Default None. Dictionary of metadata. If ‘name’ is not in metadata, the name is taken from the filename.
- Return type
nomspectra.spectrum.Spectrum
- static read_json(filename: Union[pathlib.Path, str]) nomspectra.spectrum.Spectrum[source]
Read mass spectrum from a JSON file in the package's own format
- Parameters
filename (str) – path to mass spectrum json file
- Return type
nomspectra.spectrum.Spectrum
- simmilarity(other: nomspectra.spectrum.Spectrum, mode: Union[str, Callable] = 'cosine', func=None) float[source]
Calculate similarity of this spectrum with another spectrum
- Parameters
other (Spectrum object) – second Spectrum object with which to calculate similarity
mode ({"tanimoto", "jaccard", "cosine"} or Function) – one of the simple similarity functions: “tanimoto”, “jaccard”, “cosine”. Default “cosine”. A function can also be passed here; it will be applied to the two pandas DataFrames with Spectrum data.
- Return type
float
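To make the named measures concrete, the sketch below computes cosine and Jaccard similarity for two tiny peak lists. It assumes peaks are aligned by an identical key (here a formula string) and that missing peaks have zero intensity; how nomspectra aligns peaks is not described here, and the helpers `cosine` and `jaccard` are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two intensity vectors aligned by key."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def jaccard(a, b):
    """Share of peaks present in both spectra, ignoring intensity."""
    return len(set(a) & set(b)) / len(set(a) | set(b))


s1 = {"C7H6O2": 3.0, "C6H12O6": 4.0}
s2 = {"C7H6O2": 3.0, "C2H6O": 4.0}
cos_sim = cosine(s1, s2)    # 9 / (5 * 5)
jac_sim = jaccard(s1, s2)   # 1 shared peak out of 3
```

Cosine similarity is sensitive to intensities, while Jaccard only counts which peaks are shared, so the two can rank spectra differently.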
- tail(num: Optional[int] = None) pandas.core.frame.DataFrame[source]
Show tail of Spectrum table
- Parameters
num (int) – Optional. Number of rows to show from the tail of the table.
- Return type
Pandas Dataframe
- to_csv(filename: Union[pathlib.Path, str], sep: str = ',') None[source]
Save Spectrum mass-list to csv file
- Parameters
filename (str) – Path for saving mass spectrum table with calculation
sep (str) – Optional. Separator in saved file. By default it is ‘,’
Caution
Metadata will be lost. To save it as well, use the to_json method.