Differential Abundance Analysis by Consensus • dar

Introduction

Differential abundance testing in microbiome data challenges both parametric and non-parametric statistical methods because the data are sparse, highly variable and compositional. Microbiome-specific statistical methods often assume classical distribution models or take compositional specifics into account. Their results fall at different points in the specificity vs. sensitivity space, so type I and type II error rates are difficult to ascertain in real microbiome data when a single method is used. A consensus approach based on multiple differential abundance (DA) methods was recently suggested to increase robustness.

With dar, you can build dplyr-like pipeable sequences of DA methods and then apply different consensus strategies to their results. This gives more reliable results in a fast, consistent and reproducible way.
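
To make the idea of a count-based consensus concrete, here is a minimal base R sketch. It is not dar's internal implementation: the method names, taxon names and the `consensus_taxa()` helper are hypothetical, and it assumes that a cutoff means "flagged by at least that many methods".

```r
# Toy illustration of count-based consensus; NOT dar's internal code.
# Hypothetical sets of taxa flagged as differentially abundant by three methods.
flagged <- list(
  method_a = c("taxon_1", "taxon_2", "taxon_3"),
  method_b = c("taxon_2", "taxon_3", "taxon_4"),
  method_c = c("taxon_3", "taxon_5")
)

# Count how many methods flagged each taxon (each method lists a taxon once).
votes <- table(unlist(flagged))

# Hypothetical helper: keep taxa flagged by at least `count_cutoff` methods.
consensus_taxa <- function(votes, count_cutoff) {
  sort(names(votes)[as.vector(votes) >= count_cutoff])
}

consensus_taxa(votes, count_cutoff = 2)  # taxa supported by two or more methods
consensus_taxa(votes, count_cutoff = 3)  # taxa supported by all three methods
```

Raising the cutoff trades sensitivity for specificity: fewer taxa survive, but each is supported by more methods.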

Installation

You can install the development version of dar from GitHub with:

# install.packages("pak")
pak::pkg_install("MicrobialGenomics-IrsicaixaOrg/dar")

Usage

library(dar)
#> Registered S3 methods overwritten by 'vegan':
#>   method         from      
#>   reorder.hclust seriation 
#>   rev.hclust     dendextend
data("metaHIV_phy")

## Define recipe
rec <-
  recipe(metaHIV_phy, var_info = "RiskGroup2", tax_info = "Species") |>
  step_subset_taxa(tax_level = "Kingdom", taxa = c("Bacteria", "Archaea")) |>
  step_filter_taxa(.f = "function(x) sum(x > 0) >= (0.03 * length(x))") |>
  step_maaslin() |>
  step_aldex()

rec
#> ── DAR Recipe ──────────────────────────────────────────────────────────────────
#> Inputs:
#> 
#>      ℹ phyloseq object with 451 taxa and 156 samples 
#>      ℹ variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) 
#>      ℹ taxonomic level Species 
#> 
#> Preporcessing steps:
#> 
#>      ◉ step_subset_taxa() id = subset_taxa__Komaj_sehen 
#>      ◉ step_filter_taxa() id = filter_taxa__Zlebia 
#> 
#> DA steps:
#> 
#>      ◉ step_maaslin() id = maaslin__Mille_feuille 
#>      ◉ step_aldex() id = aldex__Shakarbura

## Prep recipe
da_results <- prep(rec, parallel = TRUE)
da_results
#> ── DAR Results ─────────────────────────────────────────────────────────────────
#> Inputs:
#> 
#>      ℹ phyloseq object with 278 taxa and 156 samples 
#>      ℹ variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) 
#>      ℹ taxonomic level Species 
#> 
#> Results:
#> 
#>      ✔ maaslin__Mille_feuille diff_taxa = 52 
#>      ✔ aldex__Shakarbura diff_taxa = 96 
#> 
#>      ℹ 35 taxa are present in all tested methods

## Consensus strategy
n_methods <- 2
da_results <- bake(da_results, count_cutoff = n_methods)
da_results
#> ── DAR Results ─────────────────────────────────────────────────────────────────
#> Inputs:
#> 
#>      ℹ phyloseq object with 278 taxa and 156 samples 
#>      ℹ variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) 
#>      ℹ taxonomic level Species 
#> 
#> Results:
#> 
#>      ✔ maaslin__Mille_feuille diff_taxa = 52 
#>      ✔ aldex__Shakarbura diff_taxa = 96 
#> 
#>      ℹ 35 taxa are present in all tested methods 
#> 
#> Bakes:
#> 
#>      ◉ 1 -> count_cutoff: 2, weights: NULL, exclude: NULL, id: bake__Birnbrot

## Results
cool(da_results)
#> ℹ Bake for count_cutoff = 2
#> # A tibble: 35 × 2
#>    taxa_id taxa                        
#>    <chr>   <chr>                       
#>  1 Otu_78  Bacteroides_uniformis       
#>  2 Otu_88  Odoribacter_splanchnicus    
#>  3 Otu_119 Alistipes_putredinis        
#>  4 Otu_129 Parabacteroides_merdae      
#>  5 Otu_125 Parabacteroides_distasonis  
#>  6 Otu_82  Barnesiella_intestinihominis
#>  7 Otu_96  Prevotella_copri            
#>  8 Otu_51  Bacteroides_dorei           
#>  9 Otu_332 Catenibacterium_mitsuokai   
#> 10 Otu_62  Bacteroides_ovatus          
#> # ℹ 25 more rows
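
The predicate passed as a string to step_filter_taxa() in the recipe above can be tried on its own. The sketch below assumes the function is applied to each taxon's abundance vector across samples, so that it keeps taxa with non-zero counts in at least 3% of samples. The count matrix and the `keep_taxon` name are hypothetical and only serve the illustration.

```r
# Same predicate as the string given to step_filter_taxa() above.
keep_taxon <- function(x) sum(x > 0) >= (0.03 * length(x))

# Hypothetical count matrix: 3 taxa (rows) by 100 samples (columns).
counts <- rbind(
  taxon_common   = c(rep(5, 40), rep(0, 60)),  # non-zero in 40 samples
  taxon_moderate = c(rep(1, 10), rep(0, 90)),  # non-zero in 10 samples
  taxon_rare     = c(rep(9, 1),  rep(0, 99))   # non-zero in 1 sample
)

# Apply the predicate to every taxon (row) and keep those that pass.
keep <- apply(counts, 1, keep_taxon)
filtered <- counts[keep, , drop = FALSE]
rownames(filtered)
```

With 100 samples the threshold is 3 samples, so only the taxon seen in a single sample is dropped.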

Contributing

If you think you have encountered a bug, please submit an issue.
Either way, learn how to create and share a reprex (a minimal, reproducible example), to clearly communicate about your code.
Working on your first Pull Request? You can learn how from the free series "How to Contribute to an Open Source Project on GitHub".

Code of Conduct

Please note that the dar project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.