Determine taxa whose absolute abundances, per unit volume of the ecosystem (e.g., gut), differ significantly with changes in the covariate of interest (e.g., group). The current version of the ancombc2 function implements Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC2) for cross-sectional and repeated-measures data. In addition to two-group comparisons, ANCOM-BC2 supports testing for continuous covariates and multi-group comparisons, including the global test, pairwise directional test, Dunnett's type of test, and trend test.

Usage

step_ancom(
  rec,
  fix_formula = get_var(rec)[[1]],
  rand_formula = NULL,
  p_adj_method = "holm",
  prv_cut = 0.1,
  lib_cut = 0,
  s0_perc = 0.05,
  group = NULL,
  struc_zero = FALSE,
  neg_lb = FALSE,
  alpha = 0.05,
  n_cl = 1,
  verbose = FALSE,
  global = FALSE,
  pairwise = FALSE,
  dunnet = FALSE,
  trend = FALSE,
  rarefy = FALSE,
  id = rand_id("ancom")
)

# S4 method for class 'Recipe'
step_ancom(
  rec,
  fix_formula = get_var(rec)[[1]],
  rand_formula = NULL,
  p_adj_method = "holm",
  prv_cut = 0.1,
  lib_cut = 0,
  s0_perc = 0.05,
  group = NULL,
  struc_zero = FALSE,
  neg_lb = FALSE,
  alpha = 0.05,
  n_cl = 1,
  verbose = FALSE,
  global = FALSE,
  pairwise = FALSE,
  dunnet = FALSE,
  trend = FALSE,
  rarefy = FALSE,
  id = rand_id("ancom")
)

# S4 method for class 'PrepRecipe'
step_ancom(
  rec,
  fix_formula = get_var(rec)[[1]],
  rand_formula = NULL,
  p_adj_method = "holm",
  prv_cut = 0.1,
  lib_cut = 0,
  s0_perc = 0.05,
  group = NULL,
  struc_zero = FALSE,
  neg_lb = FALSE,
  alpha = 0.05,
  n_cl = 1,
  verbose = FALSE,
  global = FALSE,
  pairwise = FALSE,
  dunnet = FALSE,
  trend = FALSE,
  rarefy = FALSE,
  id = rand_id("ancom")
)

Arguments

rec

A Recipe object. The step will be added to the sequence of operations for this Recipe.

fix_formula

A character string expressing how the microbial absolute abundances for each taxon depend on the fixed effects in metadata. If group is not NULL, make sure the group variable is included in fix_formula.

rand_formula

A character string expressing how the microbial absolute abundances for each taxon depend on the random effects in metadata. ANCOM-BC2 follows the lmerTest package in formulating the random effects. See ?lmerTest::lmer for more details. Default is NULL.

p_adj_method

character. Method used to adjust p-values. Default is "holm". Options include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". See ?stats::p.adjust for more details.
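To give a feel for how the choice of p_adj_method changes the adjusted values, the following minimal sketch applies stats::p.adjust to a made-up vector of raw p-values. The taxon names and numbers are hypothetical and are not ANCOM-BC2 output.

```r
# Hypothetical raw p-values for four taxa (illustrative only)
p_raw <- c(taxon_a = 0.01, taxon_b = 0.02, taxon_c = 0.03, taxon_d = 0.04)

# Holm (the default of this step) controls the family-wise error rate
p_holm <- stats::p.adjust(p_raw, method = "holm")

# Benjamini-Hochberg controls the false discovery rate and is less conservative
p_bh <- stats::p.adjust(p_raw, method = "BH")

# Side-by-side comparison of raw and adjusted values
print(rbind(raw = p_raw, holm = p_holm, BH = p_bh))
```

With alpha = 0.05, the number of taxa that remain significant can differ between the two methods, which is worth keeping in mind when comparing results across settings.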

prv_cut

A numerical fraction between 0 and 1. Taxa with prevalence below prv_cut are excluded from the analysis. For instance, with 100 samples and the default value, a taxon with nonzero counts in fewer than 10 samples will not be analyzed further. Default is 0.10.
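The prevalence rule described above can be sketched in base R as follows. This is only an illustration on a hypothetical count matrix (taxa in rows, samples in columns), not the code that the step runs internally.

```r
# Toy count matrix with 20 samples (hypothetical data)
counts <- rbind(
  taxon_1 = rep(c(3, 0), times = 10),   # nonzero in 10 of 20 samples
  taxon_2 = c(7, rep(0, 19)),           # nonzero in 1 of 20 samples
  taxon_3 = c(rep(2, 3), rep(0, 17))    # nonzero in 3 of 20 samples
)
colnames(counts) <- paste0("sample_", seq_len(ncol(counts)))

prv_cut <- 0.1

# Prevalence = fraction of samples in which the taxon has a nonzero count
prevalence <- rowMeans(counts > 0)

# Drop taxa whose prevalence is below prv_cut
kept <- counts[prevalence >= prv_cut, , drop = FALSE]
print(prevalence)
print(rownames(kept))
```

Here taxon_2 has a prevalence of 0.05 and is dropped, while the other two taxa are retained.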

lib_cut

A numerical threshold for filtering samples based on library sizes. Samples with library sizes below lib_cut are excluded from the analysis. Default is 0, i.e. no samples are discarded.

s0_perc

A numerical fraction between 0 and 1. Inspired by the Significance Analysis of Microarrays (SAM) methodology, a small positive constant is added to the denominator of the ANCOM-BC2 test statistic for each taxon to avoid spurious significance caused by extremely small standard errors, especially for rare taxa. This constant is chosen as the s0_perc quantile of the standard error values for each fixed effect. Default is 0.05 (5th percentile).
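The effect of this constant can be sketched with a few made-up numbers. The snippet below assumes the statistic has the form estimate / (standard error + s0), as the description above suggests; the variable names (lfc, se, w_plain, w_reg) and all values are hypothetical, and this is not the internal ANCOM-BC2 code.

```r
# Hypothetical effect estimates and standard errors for five taxa
lfc <- c(0.80, 0.02, -1.10, 0.50, -0.30)
se  <- c(0.20, 0.001, 0.25, 0.40, 0.15)

s0_perc <- 0.05

# s0 is the s0_perc quantile of the standard errors
s0 <- unname(stats::quantile(se, probs = s0_perc))

# Plain statistic versus the regularized one
w_plain <- lfc / se
w_reg   <- lfc / (se + s0)

print(data.frame(lfc, se, w_plain = round(w_plain, 2), w_reg = round(w_reg, 2)))
```

The second taxon has a tiny effect but an even smaller standard error, so its plain statistic is large; adding s0 to the denominator shrinks it much more than it shrinks the statistics of the other taxa.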

group

character. The name of the group variable in metadata. group should be discrete. Specifying group is required for detecting structural zeros and performing multi-group comparisons (global test, pairwise directional test, Dunnett's type of test, and trend test). Default is NULL. If the group of interest contains only two categories, leave it as NULL.

struc_zero

logical. Whether to detect structural zeros based on group. Default is FALSE. See Details for a more comprehensive discussion on structural zeros.

neg_lb

logical. Whether to classify a taxon as a structural zero using its asymptotic lower bound. Default is FALSE.

alpha

numeric. Level of significance. Default is 0.05.

n_cl

numeric. The number of nodes to be forked. For details, see ?parallel::makeCluster. Default is 1 (no parallel computing).

verbose

logical. Whether to generate verbose output during the ANCOM-BC2 fitting process. Default is FALSE.

global

logical. Whether to perform the global test. Default is FALSE.

pairwise

logical. Whether to perform the pairwise directional test. Default is FALSE.

dunnet

logical. Whether to perform the Dunnett's type of test. Default is FALSE.

trend

logical. Whether to perform the trend test. Default is FALSE.

rarefy

Boolean indicating whether OTU counts should be rarefied. This rarefaction uses the standard R sample function to resample from the abundance values in the otu_table component of the first argument, physeq. Often one of the major goals of this procedure is to achieve parity in the total number of counts between samples, as an alternative to other formal normalization procedures, which is why a single value for sample.size is expected. If 'no_seed', rarefaction is performed without a set seed.
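The idea of rarefying with the base R sample function can be sketched for a single sample as below. The counts, the target depth, and the choice to draw without replacement are assumptions made for illustration; the rarefaction actually applied by the step may differ in these details.

```r
# Hypothetical counts for one sample across four taxa
counts <- c(taxon_1 = 50, taxon_2 = 30, taxon_3 = 15, taxon_4 = 5)
depth  <- 20  # target library size (assumed value)

set.seed(42)  # fix the seed so the subsample is reproducible

# Expand counts into one entry per read, then draw `depth` reads
reads     <- rep(names(counts), times = counts)
subsample <- sample(reads, size = depth, replace = FALSE)

# Tabulate back into a count vector with the same taxa order
rarefied <- table(factor(subsample, levels = names(counts)))
print(rarefied)
```

After this step the sample has exactly `depth` counts in total, which is what puts all samples on an equal footing when the same depth is used for each of them.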

id

A character string that is unique to this step to identify it.

Value

An object of class Recipe

Examples

data(metaHIV_phy)

## Init Recipe
rec <-
  recipe(metaHIV_phy, "RiskGroup2", "Phylum") |>
  step_subset_taxa(tax_level = "Kingdom", taxa = c("Bacteria", "Archaea")) |>
  step_filter_taxa(.f = "function(x) sum(x > 0) >= (0.4 * length(x))")

rec
#> ── DAR Recipe ──────────────────────────────────────────────────────────────────
#> Inputs:
#> 
#>       phyloseq object with 451 taxa and 156 samples 
#>       variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) 
#>       taxonomic level Phylum 
#> 
#> Preporcessing steps:
#> 
#>       step_subset_taxa() id = subset_taxa__Roti_tissue 
#>       step_filter_taxa() id = filter_taxa__Linzer_torte 
#> 
#> DA steps:
#> 

## Define step with default parameters and prep
rec <-
  step_ancom(rec) |>
  prep(parallel = FALSE)
#> Registered S3 methods overwritten by 'proxy':
#>   method               from    
#>   print.registry_field registry
#>   print.registry_entry registry
#> Warning: The number of taxa used for estimating sample-specific biases is: 6
#> A large number of taxa (>50) is required for the consistent estimation of biases
#> Loading required package: foreach
#> Loading required package: rngtools
#> Warning: The number of taxa used for estimating sample-specific biases is: 6
#> A large number of taxa (>50) is required for the consistent estimation of biases
#> Warning: The number of taxa used for estimating sample-specific biases is: 6
#> A large number of taxa (>50) is required for the consistent estimation of biases

rec
#> ── DAR Results ─────────────────────────────────────────────────────────────────
#> Inputs:
#> 
#>       phyloseq object with 76 taxa and 156 samples 
#>       variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) 
#>       taxonomic level Phylum 
#> 
#> Results:
#> 
#>       ancom__Trdelník diff_taxa = 24 
#> 
#>       24 taxa are present in all tested methods 
#> 

## Using rarefaction only for this step
rec <-
  recipe(metaHIV_phy, "RiskGroup2", "Species") |>
  step_ancom(rarefy = TRUE)

rec
#> ── DAR Recipe ──────────────────────────────────────────────────────────────────
#> Inputs:
#> 
#>       phyloseq object with 451 taxa and 156 samples 
#>       variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) 
#>       taxonomic level Species 
#> 
#> Preporcessing steps:
#> 
#> 
#> DA steps:
#> 
#>       step_ancom() id = ancom__Pineapple_cake