nomspectra.spectrum module

class nomspectra.spectrum.Spectrum(table: Optional[pandas.core.frame.DataFrame] = None, metadata: Optional[Dict] = None)[source]

Bases: object

A class used to represent a mass spectrum

__init__(table: Optional[pandas.core.frame.DataFrame] = None, metadata: Optional[Dict] = None) pandas.core.frame.DataFrame[source]
Parameters
  • table (pandas DataFrame) – Optional. Contains the spectrum (mass and intensity of peaks) and all calculated parameters, such as brutto formulas, calculated mass and relative error.

  • metadata (Dict) – Optional. Default None. Additional data to store in the spectrum metadata.

ai() nomspectra.spectrum.Spectrum[source]

Calculate AI (aromaticity index)

Add column “AI” to self.table

Return type

Spectrum

References

Koch, Boris P., and T. Dittmar. “From mass to structure: An aromaticity index for high resolution mass data of natural organic matter.” Rapid communications in mass spectrometry 20.5 (2006): 926-932.
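
The following is a minimal sketch of the aromaticity index, not the library's implementation. It assumes AI is the ratio of the DBE_AI and CAI expressions given for dbe_ai() and cai() in this module, and that AI is set to 0 when either term is not positive. The function name and the keyword arguments are illustrative only.

```python
def aromaticity_index(C, H, O=0, N=0, S=0, P=0):
    """Sketch: AI = DBE_AI / CAI for one formula (assumed definition)."""
    dbe_ai = 1 + C - O - S - 0.5 * (H + N + P)
    cai = C - O - N - S - P
    # Assumption: non-positive numerator or denominator gives AI = 0.
    if dbe_ai <= 0 or cai <= 0:
        return 0.0
    return dbe_ai / cai


ai_benzene = aromaticity_index(C=6, H=6)        # aromatic hydrocarbon
ai_glucose = aromaticity_index(C=6, H=12, O=6)  # saturated carbohydrate
```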

assign(brutto_dict: Optional[dict] = None, generated_bruttos_table: Optional[pandas.core.frame.DataFrame] = None, rel_error: Optional[float] = None, abs_error: Optional[float] = None, sign: str = '-', mass_min: Optional[float] = None, mass_max: Optional[float] = None, intensity_min: Optional[float] = None, intensity_max: Optional[float] = None, charge_max: int = 1) nomspectra.spectrum.Spectrum[source]

Assign brutto formulas to signals by mass

Parameters
  • brutto_dict (dict) – Optional. Default None. Custom dictionary used to generate the brutto table. Example: {'C': (4, 51), 'H': (4, 101), 'O': (0, 26), 'N': (0, 4), 'C_13': (0, 3)}

  • generated_bruttos_table (pandas DataFrame) – Optional. Contains the column ‘mass’ and element columns, and should be sorted by ‘mass’. Can be generated by the function brutto_generator.brutto_gen(). If None, a table is generated with the default elements and ranges C: 4-50, H: 4-100, O: 0-25, N: 0-3, S: 0-2.

  • rel_error (float) – Optional. Default 0.5. Permissible error in ppm when assigning a mass to a brutto formula.

  • abs_error (float) – Optional. Default None. Permissible absolute error when assigning a mass to a brutto formula.

  • sign (str) – Optional. Default ‘-’. Mode in which the mass spectrum was acquired: ‘-’ for negative mode, ‘+’ for positive mode, ‘0’ for neutral.

  • mass_min (float) – Optional. Default None. Minimum mass for assignment.

  • mass_max (float) – Optional. Default None. Maximum mass for assignment.

  • intensity_min (float) – Optional. Default None. Minimum intensity for assignment.

  • intensity_max (float) – Optional. Default None. Maximum intensity for assignment.

  • charge_max (int) – Maximum charge in m/z. Default 1.

Return type

Spectrum
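
The idea of matching peaks against a mass-sorted table of candidate formulas within a ppm tolerance can be sketched as follows. This is an illustration under simplifying assumptions, not the method used by assign(): the ionization mode (sign) and charge are ignored, so all masses are treated as neutral masses, and the helper name and data layout are hypothetical.

```python
from bisect import bisect_left


def assign_by_mass(peaks, candidates, rel_error=0.5):
    """Match each mass to the closest candidate within rel_error ppm.

    peaks: list of measured masses.
    candidates: list of (mass, formula) tuples sorted by mass.
    Returns a list with a formula or None for every peak.
    """
    cand_masses = [m for m, _ in candidates]
    result = []
    for mass in peaks:
        i = bisect_left(cand_masses, mass)
        best = None
        # Only the neighbours around the insertion point can be closest.
        for j in (i - 1, i):
            if 0 <= j < len(candidates):
                ppm = abs(cand_masses[j] - mass) / mass * 1e6
                if ppm <= rel_error and (best is None or ppm < best[0]):
                    best = (ppm, candidates[j][1])
        result.append(best[1] if best else None)
    return result


candidates = [(180.06339, "C6H12O6"), (194.04265, "C6H10O7")]
assigned = assign_by_mass([180.06340, 190.0], candidates, rel_error=0.5)
```

Keeping the candidate table sorted by mass is what makes the binary search possible, which is presumably why generated_bruttos_table is required to be sorted.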

brutto() nomspectra.spectrum.Spectrum[source]

Build the brutto formula string from the assigned element columns

Add column “brutto” to self.table

Return type

Spectrum

cai() nomspectra.spectrum.Spectrum[source]

Calculate CAI (C - O - N - S - P)

Add column “CAI” to self.table

Return type

Spectrum

calc_all_metrics() nomspectra.spectrum.Spectrum[source]

Calculate all available metrics

Return type

Spectrum

calc_error(sign: Optional[str] = None) nomspectra.spectrum.Spectrum[source]

Calculate relative and absolute error of assigned peaks from measured and calculated masses

Add columns “abs_error” and “rel_error” to self.table

Parameters

sign ({'-', '+', '0'}) – Optional. Default None, in which case the sign is taken from the metadata or determined from the spectrum itself. Mode in which the mass spectrum was acquired: ‘-’ for negative mode, ‘+’ for positive mode, ‘0’ for neutral.

Return type

Spectrum
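
As an illustration of the two error columns, the sketch below computes an absolute error and a relative error in ppm for a single peak. The sign convention (measured minus calculated) and the choice of the calculated mass as denominator are assumptions, and the ion-mode correction controlled by sign is left out.

```python
def mass_errors(measured, calculated):
    """Sketch: absolute error (Da) and relative error (ppm) for one peak."""
    abs_error = measured - calculated
    rel_error = abs_error / calculated * 1e6
    return abs_error, rel_error


abs_err, rel_err = mass_errors(measured=200.0002, calculated=200.0)
```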

calc_mass() nomspectra.spectrum.Spectrum[source]

Calculate mass from assigned brutto formulas and elements exact masses

Add column “calc_mass” to self.table

Return type

Spectrum
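
A minimal sketch of the calculation: the mass is the sum of element counts multiplied by monoisotopic masses. The mass table below is a small hand-written subset for illustration; the library's own table, precision and column handling may differ.

```python
# Monoisotopic masses in Da (illustrative subset, keyed like table columns).
EXACT_MASS = {
    "C": 12.0,
    "C_13": 13.0033548378,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
}


def calc_mass(counts):
    """Sum of count * exact mass over all elements in the formula."""
    return sum(EXACT_MASS[element] * n for element, n in counts.items())


glucose_mass = calc_mass({"C": 6, "H": 12, "O": 6})
```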

copy() nomspectra.spectrum.Spectrum[source]

Return a deep copy of this Spectrum object

Return type

Spectrum

cram() nomspectra.spectrum.Spectrum[source]

Mark rows that fit CRAM conditions (carboxylic-rich alicyclic molecules)

Add column “CRAM” to self.table

Return type

Spectrum

References

Hertkorn, N. et al. Characterization of a major refractory component of marine dissolved organic matter. Geochimica et Cosmochimica Acta 70, 2990-3010 (2006)

dbe() nomspectra.spectrum.Spectrum[source]

Calculate DBE (1 + C - 0.5 * (H + N))

Add column “DBE” to self.table

Return type

Spectrum

dbe_ai() nomspectra.spectrum.Spectrum[source]

Calculate DBE_AI (1 + C - O - S - 0.5 * (H + N + P))

Add column “DBE_AI” to self.table

Return type

Spectrum

dbe_o() nomspectra.spectrum.Spectrum[source]

Calculate DBE - O

Add column “DBE-O” to self.table

Return type

Spectrum

dbe_oc() nomspectra.spectrum.Spectrum[source]

Calculate (DBE - O) / C

Add column “DBE-OC” to self.table

Return type

Spectrum

drop_unassigned() nomspectra.spectrum.Spectrum[source]

Drop rows that have no assigned brutto formula

Return type

Spectrum

Caution

Risk of losing data: this operation excludes data that may be useful

filter_by_C13(rel_error: float = 0.5, remove: bool = False) nomspectra.spectrum.Spectrum[source]

Check whether peaks have a counterpart with the same brutto formula containing a C13 isotope

Parameters
  • rel_error (float) – Optional. Default 0.5. Allowable error in ppm when checking for the C13 isotope peak.

  • remove (bool) – Optional. Default False. Drop unassigned peaks and peaks without a C13 isotope peak.

Return type

Spectrum
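
One way to picture the isotope check is to look for a second peak shifted by the 13C-12C mass difference within the ppm tolerance. The sketch below does this on raw masses for singly charged ions; it is an assumption-laden illustration, and filter_by_C13() may instead compare assigned formulas.

```python
C13_SHIFT = 13.0033548378 - 12.0  # mass difference between 13C and 12C, Da


def has_c13_partner(mass, all_masses, rel_error=0.5):
    """True if a peak exists at mass + C13_SHIFT within rel_error ppm."""
    target = mass + C13_SHIFT
    tolerance = target * rel_error * 1e-6
    return any(abs(m - target) <= tolerance for m in all_masses)


masses = [180.063388, 181.066743, 250.0]
flags = [has_c13_partner(m, masses) for m in masses]
```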

find_elements() Sequence[str][source]

Find elements from columns of mass spectrum table.

For example, column ‘C’ will be recognised as carbon 12C, column ‘C_13’ as 13C

Return type

list

get_dbe_vs_o(olim: Optional[Tuple[int, int]] = None, draw: bool = True, ax: Optional[matplotlib.pyplot.axes] = None, **kwargs) Tuple[float, float][source]

Calculate DBE vs nO by linear fit

Parameters
  • olim (tuple of two int) – Limits for nO. Default None.

  • draw (bool) – Draw a scatter plot of DBE vs nO together with the fitted line.

  • ax (matplotlib axes) – Axes of an outer plot to draw on. Default None.

  • **kwargs (dict) – Additional keyword arguments passed to the matplotlib scatter call.

Returns

a and b in fit DBE = a * nO + b

Return type

(float, float)

References

Bae, E., Yeo, I. J., Jeong, B., Shin, Y., Shin, K. H., & Kim, S. (2011). Study of double bond equivalents and the numbers of carbon and oxygen atom distribution of dissolved organic matter with negative-mode FT-ICR MS. Analytical chemistry, 83(11), 4193-4199.
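
The returned pair (a, b) is the slope and intercept of a straight line. A plain least-squares fit can be sketched as below; whether the library averages DBE per oxygen number or applies weights before fitting is not stated here, so the input is simply assumed to be paired nO and DBE values.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


n_oxygen = [2, 4, 6, 8]
dbe_values = [4.0, 5.0, 6.0, 7.0]
a, b = linear_fit(n_oxygen, dbe_values)
```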

get_mol_class(how_average: str = 'weight', how: Optional[str] = None) pandas.core.frame.DataFrame[source]

Get molecular class density

Parameters
  • how_average ({'weight', 'count'}) – How to average the density. Default “weight” (weighted by intensity). Can also be “count”.

  • how ({'kellerman', 'perminova', 'laszakovits'}) – How to divide formulas into classes. Optional. Default ‘laszakovits’.

Return type

pandas DataFrame

References

Laszakovits, J. R., & MacKay, A. A. Journal of the American Society for Mass Spectrometry, 2021, 33(1), 198-202.

A. M. Kellerman, T. Dittmar, D. N. Kothawala, L. J. Tranvik. Nat. Commun. 5, 3804 (2014)

Perminova I. V. Pure and Applied Chemistry. 2019. Vol. 91, № 5. P. 851-864

get_mol_metrics(metrics: Optional[Sequence[str]] = None, func: Optional[str] = None) pandas.core.frame.DataFrame[source]

Get average metrics

Parameters
  • metrics (Sequence[str]) – Optional. Default None. Metrics to report.

  • func ({'weight', 'mean', 'median', 'max', 'min', 'std'}) – How to calculate the average. May be “weight” (default, intensity-weighted average), “mean”, “median”, “max”, “min” or “std” (standard deviation).

Return type

pandas DataFrame
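
For the default “weight” option, an intensity-weighted average of one metric can be sketched as follows. The function name and plain-list inputs are illustrative; the library operates on the columns of self.table.

```python
def weighted_mean(values, intensities):
    """Average of values weighted by peak intensity."""
    total = sum(intensities)
    return sum(v * w for v, w in zip(values, intensities)) / total


# Two peaks with O/C of 0.5 and 1.0; the first is three times more intense.
avg_oc = weighted_mean([0.5, 1.0], [3.0, 1.0])
```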

get_squares_vk(how_average: str = 'weight', ax: Optional[matplotlib.pyplot.axes] = None, draw: bool = False) pandas.core.frame.DataFrame[source]

Calculate density in Van Krevelen diagram divided into 20 squares

Square indices in the Van Krevelen diagram, with H/C as rows and O/C as columns:

[[5, 10, 15, 20],
 [4, 9, 14, 19],
 [3, 8, 13, 18],
 [2, 7, 12, 17],
 [1, 6, 11, 16]]

H/C is divided into [0-0.6, 0.6-1, 1-1.4, 1.4-1.8, 1.8-2.2].

O/C is divided into [0-0.25, 0.25-0.5, 0.5-0.75, 0.75-1.0].

Parameters
  • how_average ({'weight', 'count'}) – How to calculate the average. May be “count” or “weight” (default).

  • ax (matplotlib ax) – Optional. External axes to draw on.

  • draw (bool) – Optional. Default False. Plot a heatmap.

Return type

pandas DataFrame

References

Perminova I. V. From green chemistry and nature-like technologies towards ecoadaptive chemistry and technology // Pure and Applied Chemistry. 2019. Vol. 91, № 5. P. 851-864.
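
The mapping from an (H/C, O/C) pair to a square number can be sketched from the index matrix and bin edges given for get_squares_vk(). The sketch assumes the matrix is printed with the highest H/C row first, so square 1 is the low H/C, low O/C corner, and that bins are half-open on the right; neither detail is confirmed by this page.

```python
from bisect import bisect_right

HC_EDGES = [0.0, 0.6, 1.0, 1.4, 1.8, 2.2]
OC_EDGES = [0.0, 0.25, 0.5, 0.75, 1.0]


def vk_square(hc, oc):
    """Return the square number 1-20, or None outside the diagram."""
    inside_hc = HC_EDGES[0] <= hc < HC_EDGES[-1]
    inside_oc = OC_EDGES[0] <= oc < OC_EDGES[-1]
    if not (inside_hc and inside_oc):
        return None
    row = bisect_right(HC_EDGES, hc) - 1  # 0..4, counted from low H/C
    col = bisect_right(OC_EDGES, oc) - 1  # 0..3, counted from low O/C
    return col * 5 + row + 1


squares = [vk_square(0.3, 0.1), vk_square(1.2, 0.3), vk_square(2.0, 0.9)]
```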

hc_oc() nomspectra.spectrum.Spectrum[source]

Calculate H/C and O/C

Add columns “H/C” and “O/C” to self.table

Return type

Spectrum

head(num: Optional[int] = None) pandas.core.frame.DataFrame[source]

Show the head of the mass spectrum table

Parameters

num (int) – Optional. Number of rows to show from the head of the table.

Return type

pandas DataFrame

intens_sub(other: nomspectra.spectrum.Spectrum) nomspectra.spectrum.Spectrum[source]

Subtract the intensities of another spectrum from this one

The result contains only the peaks that are higher than in the other spectrum. The intensity of each such peak is the difference between its intensity in self and in other.

Parameters

other (Spectrum object) – The other mass spectrum

Return type

Spectrum
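
The behaviour described above can be sketched with two dictionaries that map a peak mass to its intensity. Matching peaks by an identical key is an assumption made to keep the example short; the library may match on calculated mass or another column.

```python
def intens_sub(self_peaks, other_peaks):
    """Keep peaks that are higher in self_peaks and store the difference."""
    result = {}
    for mass, intensity in self_peaks.items():
        diff = intensity - other_peaks.get(mass, 0.0)
        if diff > 0:
            result[mass] = diff
    return result


sample = {100.0: 10.0, 200.0: 5.0, 300.0: 2.0}
blank = {100.0: 4.0, 200.0: 8.0}
difference = intens_sub(sample, blank)
```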

kendrick() nomspectra.spectrum.Spectrum[source]

Calculate Kendrick mass and Kendrick mass defect

Add columns “Ke” and “KMD” to self.table

Return type

Spectrum
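
A sketch of the Kendrick transformation with CH2 as the base unit is shown below. It uses one common convention, where the mass defect is the nominal (rounded) Kendrick mass minus the Kendrick mass; the rounding rule and sign used by kendrick() may differ, so treat this as an illustration only.

```python
CH2_EXACT = 14.01565  # exact mass of CH2 in Da (rounded value)


def kendrick(mass):
    """Return (Kendrick mass, Kendrick mass defect) for a CH2 base."""
    ke = mass * 14.0 / CH2_EXACT
    kmd = round(ke) - ke  # assumed convention: nominal minus exact
    return ke, kmd


ke1, kmd1 = kendrick(180.06339)
ke2, kmd2 = kendrick(180.06339 + CH2_EXACT)  # same compound plus one CH2
```

Members of a CH2 homologous series share the same mass defect, which is what makes this representation useful for grouping peaks.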

merge_duplicates() nomspectra.spectrum.Spectrum[source]

Merge duplicates that have the same calculated mass, summing their intensities

Return type

Spectrum

merge_isotopes() nomspectra.spectrum.Spectrum[source]

Merge isotopes.

For example, if the spectrum table has columns ‘C’ and ‘C_13’, they will be summed into the ‘C’ column.

Return type

Spectrum

Caution

Risk of losing data: this operation excludes data that may be useful
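
The column merge can be sketched on a single row of element counts. The sketch assumes isotope columns follow the ‘Element_Isotope’ naming described for find_elements(), and that the input contains only element columns.

```python
def merge_isotopes(row):
    """Sum isotope columns such as 'C_13' into their base element column."""
    merged = {}
    for column, count in row.items():
        base = column.split("_")[0]  # 'C_13' -> 'C', 'H' -> 'H'
        merged[base] = merged.get(base, 0) + count
    return merged


merged_row = merge_isotopes({"C": 5, "C_13": 1, "H": 10, "O": 3})
```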

mol_class(how: Optional[str] = None) nomspectra.spectrum.Spectrum[source]

Assign molecular class for formulas

Add column “class” to self.table

Parameters

how ({'kellerman', 'perminova', 'laszakovits'}) – How to divide formulas into classes. Optional. Default ‘laszakovits’

Return type

Spectrum

References

Laszakovits, J. R., & MacKay, A. A. Journal of the American Society for Mass Spectrometry, 2021, 33(1), 198-202.

A. M. Kellerman, T. Dittmar, D. N. Kothawala, L. J. Tranvik. Nat. Commun. 5, 3804 (2014)

Perminova I. V. Pure and Applied Chemistry. 2019. Vol. 91, № 5. P. 851-864

noise_filter(force: float = 1.5, intensity: Optional[float] = None, quantile: Optional[float] = None) nomspectra.spectrum.Spectrum[source]

Remove noise from spectrum

Parameters
  • intensity (float) – Cut by minimum intensity. Default None, in which case this cut is not applied.

  • quantile (float) – Cut by quantile. For example, 0.1 means that the 10% of peaks with the lowest intensity will be removed. Default None, in which case this cut is not applied.

  • force (float) – Multiplier applied to the automatically detected noise level. Default 1.5 sets the cut-off at noise level * 1.5; peaks below this cut-off are removed.

Return type

Spectrum

Caution

There is a risk of losing data, so use this cautiously. The noise level may be determined incorrectly. Plot the spectrum and inspect it.
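
The two explicit cuts (by intensity and by quantile) can be sketched as below. The automatic noise-level search controlled by force is not reproduced, and the quantile is taken without interpolation, which may differ from what the library does. Peaks are assumed to be (mass, intensity) tuples.

```python
def noise_filter(peaks, intensity=None, quantile=None):
    """Drop low-intensity peaks from a list of (mass, intensity) tuples."""
    if intensity is not None:
        threshold = intensity
    elif quantile is not None:
        ranked = sorted(i for _, i in peaks)
        # Intensity at the requested quantile, without interpolation.
        index = min(int(quantile * len(ranked)), len(ranked) - 1)
        threshold = ranked[index]
    else:
        return list(peaks)
    return [(m, i) for m, i in peaks if i >= threshold]


peaks = [(100.0 + k, float(k)) for k in range(1, 11)]  # intensities 1..10
kept_by_quantile = noise_filter(peaks, quantile=0.1)
kept_by_intensity = noise_filter(peaks, intensity=5.0)
```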

normalize(how: str = 'sum') nomspectra.spectrum.Spectrum[source]

Normalize peak intensities

Parameters

how ({'sum', 'max', 'median', 'mean'}) – ‘sum’ to normalize by the sum of the intensities of all peaks (default). ‘max’ to normalize by the intensity of the highest peak. ‘median’ to normalize by the median peak intensity. ‘mean’ to normalize by the mean peak intensity.

Return type

Spectrum
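
The four options differ only in the scale factor. A sketch on a plain list of intensities, with an illustrative function of the same name:

```python
from statistics import mean, median

SCALERS = {"sum": sum, "max": max, "median": median, "mean": mean}


def normalize(intensities, how="sum"):
    """Divide every intensity by the chosen scale factor."""
    scale = SCALERS[how](intensities)
    return [i / scale for i in intensities]


by_sum = normalize([1.0, 2.0, 3.0, 4.0], how="sum")
by_max = normalize([1.0, 2.0, 3.0, 4.0], how="max")
```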

nosc() nomspectra.spectrum.Spectrum[source]

Calculate nominal oxidation state of carbon (NOSC)

Add column “NOSC” to self.table

Notes

NOSC > 0: oxidized state. NOSC < 0: reduced state. NOSC = 0: neutral state

References

Boye, Kristin, et al. “Thermodynamically controlled preservation of organic carbon in floodplains.” Nature Geoscience 10.6 (2017): 415-419.

Return type

Spectrum
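
This page does not give the formula, so the sketch below uses a commonly used expression for NOSC from the element counts of a formula. Whether nosc() uses exactly this form, including the treatment of charge, is an assumption.

```python
def nosc(C, H, O=0, N=0, S=0, P=0, charge=0):
    """Sketch of a widely used NOSC expression (assumed, not from source)."""
    return 4 - (4 * C + H - 3 * N - 2 * O + 5 * P - 2 * S - charge) / C


nosc_glucose = nosc(C=6, H=12, O=6)  # carbohydrate
nosc_methane = nosc(C=1, H=4)        # fully reduced carbon
nosc_co2 = nosc(C=1, H=0, O=2)       # fully oxidized carbon
```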

static read_csv(filename: Union[pathlib.Path, str], mapper: Optional[Mapping[str, str]] = None, ignore_columns: Optional[Sequence[str]] = None, take_columns: Optional[Sequence[str]] = None, take_only_mz: bool = False, sep: str = ',', intens_min: Optional[Union[int, float]] = None, intens_max: Optional[Union[int, float]] = None, mass_min: Optional[Union[int, float]] = None, mass_max: Optional[Union[int, float]] = None, assign_mark: bool = False, metadata: Optional[Dict] = None) nomspectra.spectrum.Spectrum[source]

Read a mass spectrum table from a CSV (comma-separated values) file

All parameters are optional except filename. The file must have a header and at least two main columns: mass and intensity.

Parameters
  • filename (str) – Path to the mass spectrum table.

  • mapper (dict) – Dictionary for recognizing columns in the mass spectrum file. Example: {'m/z': 'mass', 'I': 'intensity'}

  • ignore_columns (Sequence[str]) – List of names of columns that will not be loaded. If None, all columns are loaded. Example: ["index", "s/n"]

  • take_columns (Sequence[str]) – List of names of columns that will be loaded; the others will be ignored. If None, all columns are loaded. Example: ["mass", "intensity", "C", "H", "N", "O"]

  • take_only_mz (bool) – Load only the mass and intensity columns.

  • sep (str) – Separator used in the mass spectrum table; ‘\t’ for tab.

  • intens_min (numeric) – Lower limit for intensity. Default None, which applies no restriction.

  • intens_max (numeric) – Upper limit for intensity. Default None, which applies no restriction.

  • mass_min (numeric) – Lower limit for m/z. Default None, which applies no restriction.

  • mass_max (numeric) – Upper limit for m/z. Default None, which applies no restriction.

  • assign_mark (bool) – Default False. Mark peaks as assigned if they have elements. Needed when loading a mass list processed by external software.

  • metadata (Dict) – Optional. Default None. Dictionary of metadata. If ‘name’ is not in metadata, the name is taken from the filename.

Return type

Spectrum

static read_json(filename: Union[pathlib.Path, str]) nomspectra.spectrum.Spectrum[source]

Read a mass spectrum from a JSON file in the library's own format

Parameters

filename (str) – path to mass spectrum json file

Return type

Spectrum

simmilarity(other: nomspectra.spectrum.Spectrum, mode: Union[str, Callable] = 'cosine', func=None) float[source]

Calculate the similarity of this spectrum to another spectrum

Parameters
  • other (Spectrum object) – Second Spectrum object to calculate similarity with.

  • mode ({"tanimoto", "jaccard", "cosine"} or Function) – One of the simple similarity functions: “tanimoto”, “jaccard” or “cosine”. Default “cosine”. A function may also be passed here; it will be applied to two pandas DataFrames with Spectrum data.

Return type

float
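
Two of the named measures can be sketched on spectra represented as dictionaries of mass to intensity. How the library aligns peaks between spectra, and how its “tanimoto” mode is defined, are not shown on this page, so the functions below are generic textbook versions with illustrative names.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors aligned by mass."""
    masses = set(a) | set(b)
    dot = sum(a.get(m, 0.0) * b.get(m, 0.0) for m in masses)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Shared peaks divided by all distinct peaks (intensity ignored)."""
    return len(set(a) & set(b)) / len(set(a) | set(b))


spec_a = {100.0: 1.0, 200.0: 1.0}
spec_b = {100.0: 1.0, 300.0: 1.0}
cos_ab = cosine_similarity(spec_a, spec_b)
jac_ab = jaccard_similarity(spec_a, spec_b)
```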

tail(num: Optional[int] = None) pandas.core.frame.DataFrame[source]

Show the tail of the Spectrum table

Parameters

num (int) – Optional. Number of rows to show from the tail of the table.

Return type

pandas DataFrame

to_csv(filename: Union[pathlib.Path, str], sep: str = ',') None[source]

Save the Spectrum mass list to a CSV file

Parameters
  • filename (str) – Path for saving the mass spectrum table with calculations.

  • sep (str) – Optional. Separator in the saved file. Default ‘,’.

Caution

Metadata will be lost. To save it as well, use the to_json method.

to_json(filename: Union[pathlib.Path, str]) None[source]

Save Spectrum mass-list to json format

Parameters

filename (str) – Path for saving mass spectrum table with calculation to json file