Collapse Metabolites — collapse_metabolites • claman

Combine multiple measurements of the same metabolite into a consensus value to simplify the presentation of results and pathway analysis.

collapse_metabolites(
  mzroll_list,
  preserve_distinct_methods = FALSE,
  preserve_adducts = FALSE
)

Arguments

mzroll_list

Output of process_mzroll or process_mzroll_multi, a list that includes:

features: one row per unique analyte (defined by a unique groupId),
samples: one row per unique sample (defined by a unique sampleId),
measurements: one row per peak (samples x peakgroups)

preserve_distinct_methods

If TRUE, collapse metabolites separately for each method. If FALSE, collapse across methods.

preserve_adducts

If TRUE, different ions (adducts) of the same metabolite are not collapsed.

Value

an mzroll_list

Details

Analytes are first aggregated over peakgroups of the same ion (i.e., the same compoundName and adductName) by retaining the maximum-intensity peak in each sample. This step handles peakgroup splitting. Once measurements are reduced to unique ions, ions can be further aggregated to metabolites by taking the median quant value in each sample; adducts or methods are kept separate when preserve_adducts or preserve_distinct_methods is TRUE.
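The two-step aggregation described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, its values, and the use of aggregate() are assumptions made for illustration, and only the column names compoundName, adductName, groupId, sampleId, and log2_abundance are taken from this page.

```r
# Hypothetical toy data: groupIds 1 and 2 are split peakgroups of the same ion
# (glucose, [M-H]-); groupId 3 is a second adduct of the same metabolite.
toy <- data.frame(
  groupId        = c(1, 1, 2, 2, 3, 3),
  compoundName   = "glucose",
  adductName     = c("[M-H]-", "[M-H]-", "[M-H]-", "[M-H]-", "[M+Cl]-", "[M+Cl]-"),
  sampleId       = c("s1", "s2", "s1", "s2", "s1", "s2"),
  log2_abundance = c(10, 12, 11, 9, 14, 16),
  stringsAsFactors = FALSE
)

# Step 1: within each sample, keep the maximum over peakgroups of the same ion.
ions <- aggregate(
  log2_abundance ~ compoundName + adductName + sampleId,
  data = toy, FUN = max
)

# Step 2: within each sample, take the median over ions of the same metabolite.
metabolites <- aggregate(
  log2_abundance ~ compoundName + sampleId,
  data = ions, FUN = median
)

print(metabolites)
```

In this sketch, step 1 reduces the two [M-H]- peakgroups to one value per sample, and step 2 then combines the two adducts into a single value per sample. Skipping step 2 corresponds roughly to what preserve_adducts = TRUE is described as doing.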

Examples

collapse_metabolites(nplug_mzroll_augmented)
#> $features
#> # A tibble: 106 × 10
#>    groupId compoundName  smiles tagString    mz    rt compoundDB searchTableName
#>    <fct>   <chr>         <chr>  <chr>     <dbl> <dbl> <chr>      <chr>          
#>  1 1       Ribose-P      NA     ""           NA    NA NA         NA             
#>  2 2       OMP           NA     ""           NA    NA NA         NA             
#>  3 3       1,3-diphopsh… NA     ""           NA    NA NA         NA             
#>  4 4       3-hydroxy-3-… NA     ""           NA    NA NA         NA             
#>  5 5       3-phosphogly… NA     ""           NA    NA NA         NA             
#>  6 6       6-phospho-D-… NA     ""           NA    NA NA         NA             
#>  7 7       acetyl-CoA    NA     ""           NA    NA NA         NA             
#>  8 8       aconitate     NA     ""           NA    NA NA         NA             
#>  9 9       adenosine     NA     ""           NA    NA NA         NA             
#> 10 10      ADP           NA     ""           NA    NA NA         NA             
#> # … with 96 more rows, and 2 more variables: label <chr>, pathway <chr>
#> 
#> $samples
#> # A tibble: 136 × 13
#>    sampleId name     filename samples_tbl_row sample_name  month replicate    DR
#>    <fct>    <chr>    <chr>              <int> <chr>        <chr> <chr>     <dbl>
#>  1 1        NH4_0.0… NA                     1 NH4_0.055_f… Jun   B         0.055
#>  2 2        NH4_0.0… NA                     2 NH4_0.055_f… Jun   D         0.055
#>  3 3        NH4_0.0… NA                     3 NH4_0.055_p… Jun   A         0.055
#>  4 4        NH4_0.0… NA                     4 NH4_0.055_p… Jun   C         0.055
#>  5 5        NH4_0.1… NA                     5 NH4_0.173_f… Jun   B         0.173
#>  6 6        NH4_0.1… NA                     6 NH4_0.173_f… Jun   D         0.173
#>  7 7        NH4_0.1… NA                     7 NH4_0.173_p… Jun   A         0.173
#>  8 8        NH4_0.1… NA                     8 NH4_0.173_p… Jun   C         0.173
#>  9 9        NH4_0.2… NA                     9 NH4_0.210_f… Jun   B         0.21 
#> 10 10       NH4_0.2… NA                    10 NH4_0.210_f… Jun   D         0.21 
#> # … with 126 more rows, and 5 more variables: limitation <chr>, exp_ref <chr>,
#> #   extraction <chr>, condition <int>, reference <int>
#> 
#> $measurements
#> # A tibble: 14,416 × 4
#>    groupId sampleId log2_abundance centered_log2_abundance
#>    <fct>   <fct>             <dbl>                   <dbl>
#>  1 1       1                  15.1                    1.21
#>  2 1       2                  14.9                    1.07
#>  3 1       3                  15.4                    1.55
#>  4 1       4                  15.5                    1.59
#>  5 1       5                  15.4                    1.49
#>  6 1       6                  15.6                    1.76
#>  7 1       7                  16.0                    2.13
#>  8 1       8                  16.0                    2.09
#>  9 1       9                  15.4                    1.58
#> 10 1       10                 15.5                    1.65
#> # … with 14,406 more rows
#> 
#> $design
#> $design$features
#> # A tibble: 10 × 2
#>    variable        type               
#>    <chr>           <chr>              
#>  1 groupId         feature_primary_key
#>  2 compoundName    character          
#>  3 smiles          character          
#>  4 tagString       character          
#>  5 mz              numeric            
#>  6 rt              numeric            
#>  7 compoundDB      character          
#>  8 searchTableName character          
#>  9 label           character          
#> 10 pathway         character          
#> 
#> $design$samples
#> # A tibble: 13 × 2
#>    variable        type              
#>    <chr>           <chr>             
#>  1 sampleId        sample_primary_key
#>  2 name            character         
#>  3 filename        character         
#>  4 samples_tbl_row integer           
#>  5 sample_name     character         
#>  6 month           character         
#>  7 replicate       character         
#>  8 DR              numeric           
#>  9 limitation      character         
#> 10 exp_ref         character         
#> 11 extraction      character         
#> 12 condition       integer           
#> 13 reference       integer           
#> 
#> $design$measurements
#> # A tibble: 4 × 2
#>   variable                type               
#>   <chr>                   <chr>              
#> 1 groupId                 feature_primary_key
#> 2 sampleId                sample_primary_key 
#> 3 log2_abundance          numeric            
#> 4 centered_log2_abundance numeric            
#> 
#> $design$feature_pk
#> [1] "groupId"
#> 
#> $design$sample_pk
#> [1] "sampleId"
#> 
#> 
#> attr(,"class")
#> [1] "triple_omic" "tomic"       "mzroll"