identify_split_peaks.Rd
Find split peaks where a single analyte is distributed across two or more peakgroups.
identify_split_peaks(
mzroll_db_con,
clamr_config,
max_rt_deviation = 5,
ic_floor = 2^10,
anticorrelation_co = -0.1,
signal_frac_IQR_co = 0.75,
spectra_corr_co = 0.8,
stratify_regex = NULL,
n_top_spectra_summed = 3L,
quality_weights = c(purity = 2, quality = 1)
)
mzroll_db_con
- a connection to a mzroll database as produced by mzroll_db_sqlite
clamr_config
- a named list of mass spec parameters with special formatting of instrument tolerances, generated by build_clamr_config.
max_rt_deviation
- maximum retention time (RT) deviation between two peakgroups which could be the same analyte.
ic_floor
- floor all low or missing signals to this value for the purpose of calculating anti-correlation in log-space.
anticorrelation_co
- cutoff for what counts as a strong anti-correlation and therefore a sign of peak splitting (when most of the signal falls in group A for some samples and in group B for others, A and B become negatively correlated).
signal_frac_IQR_co
- cutoff for the interquartile range of the fractional signal split between candidate split peak pairs.
spectra_corr_co
- cutoff for the correlation between the consensus spectra of the two peakgroups.
stratify_regex
- NULL for no stratification, or a regular expression string indicating categories which should drive peak splitting (e.g., batch or date).
n_top_spectra_summed
- integer giving the maximum number of spectra to aggregate.
quality_weights
- length 2 named vector with names "purity" and "quality" indicating the relative weight given to precursor purity (i.e., the amount of isolated signal matching the precursorMz) versus peak quality (i.e., good peak shapes).
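To make the stratify_regex argument more concrete, the sketch below shows one way a regular expression can turn sample names into categories such as batches. The sample names and the pattern are hypothetical, and how identify_split_peaks applies the expression internally is not stated here, so treat this only as an illustration of the idea.

```r
# Hypothetical sample names with a batch label embedded in each name.
sample_names <- c("batch1_liver_01", "batch1_liver_02",
                  "batch2_liver_01", "batch2_liver_02")

# Example pattern that could be supplied as stratify_regex (an assumption).
stratify_regex <- "batch[0-9]+"

# Locate the first match in each name; regexpr() returns -1 where nothing matches.
hits <- regexpr(stratify_regex, sample_names)

# Keep the matched substring as the stratum, and NA for names without a match.
strata <- rep(NA_character_, length(sample_names))
strata[hits > 0] <- regmatches(sample_names, hits)

# Number of samples per stratum.
table(strata)
```

Each distinct match becomes one category, so a pattern that captures a date or run identifier would stratify by that instead.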
a tibble containing two variables:
groupId_old
- current groupIds in the mzroll_db_con
groupId_new
- updated groupIds in the mzroll_db_con
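As a minimal sketch of how the returned mapping might be used, the example below relabels a peak table so that split peakgroups share one groupId. The peak table, its column names other than groupId, and the values are hypothetical; only groupId_old and groupId_new come from the documented return value.

```r
# Hypothetical return value: old groups 2 and 3 are merged into group 2.
split_peaks <- data.frame(
  groupId_old = c(1L, 2L, 3L, 4L),
  groupId_new = c(1L, 2L, 2L, 4L)
)

# Hypothetical peak table keyed by groupId (column names are assumptions).
peaks <- data.frame(
  groupId = c(1L, 2L, 3L, 3L, 4L),
  sampleId = c(1L, 1L, 2L, 3L, 1L),
  peakAreaTop = c(5e5, 2e6, 1.8e6, 2.1e6, 7e4)
)

# Look up each peak's current groupId in the mapping.
idx <- match(peaks$groupId, split_peaks$groupId_old)

# Replace with the new groupId; groups absent from the mapping keep their id.
peaks$groupId <- ifelse(is.na(idx), peaks$groupId, split_peaks$groupId_new[idx])
```

The NA guard covers the case where the mapping lists only the groups that changed, which this page does not specify.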
To be conservative, identified split peak pairs must satisfy all of the following conditions:
mass agreement
- within the mass tolerance specified in the clamr_config
retention time agreement
- RTs are within max_rt_deviation of one another
mutual exclusivity
- log-abundances must be anti-correlated beyond anticorrelation_co, and the interquartile range of the signal split fractions (fraction of signal in A versus B: A / (A + B)) must be above signal_frac_IQR_co.
batch-driven [optional]
- if batches are provided (using stratify_regex), then signal fraction variation should be explained by batches.
fragmentation agreement
- fragmentation spectra are correlated above spectra_corr_co
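The mutual exclusivity condition can be sketched in base R as follows. The intensities are made up, and the choice of Pearson correlation on log2 values and of strict inequalities is an assumption; the package may compute these quantities differently.

```r
# Cutoffs taken from the documented defaults.
ic_floor <- 2^10
anticorrelation_co <- -0.1
signal_frac_IQR_co <- 0.75

# Hypothetical intensities of two candidate peakgroups across 8 samples:
# samples 1-4 carry the signal in A, samples 5-8 carry it in B.
A <- c(2e6, 1.8e6, 2.2e6, 1.9e6, NA, 500, 800, NA)
B <- c(NA, 600, 900, NA, 2.1e6, 1.7e6, 2.0e6, 2.3e6)

# Raise missing and low signals to the floor so log-space values are defined.
floor_signal <- function(x, floor_value) {
  x[is.na(x)] <- floor_value
  pmax(x, floor_value)
}
A_f <- floor_signal(A, ic_floor)
B_f <- floor_signal(B, ic_floor)

# Correlation of log-abundances (Pearson is an assumption).
log_cor <- cor(log2(A_f), log2(B_f))

# Fraction of signal in A versus B, and its interquartile range across samples.
split_frac <- A_f / (A_f + B_f)
frac_iqr <- IQR(split_frac)

# Both parts of the condition must hold.
mutually_exclusive <- log_cor < anticorrelation_co && frac_iqr > signal_frac_IQR_co
```

With this toy data each sample has nearly all of its signal in one group, so the split fractions sit near 0 or 1 and the pair passes both cutoffs. A pair whose signal is split evenly in every sample would have a small interquartile range and fail.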