Overview

Community diversity analysis was performed with respect to the variable Group using the phyloseq (McMurdie and Holmes, PLoS ONE v8:e61217, 2013, PMID: 23630581) and vegan (Oksanen et al., GitHub and DOI) packages in R. The variable Group has three levels: Control, Alzheimers and AlzCB.



In all aspects of the study, results were considered significant at a false discovery rate (FDR) threshold of 0.25 and highly significant at a threshold of 0.05. Where more than 20 taxa were present at any taxon level, only 20 were considered. The taxon levels in the database were taken to be Kingdom, Phylum, Class, Order, Family, Genus and Species.
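
The report does not state how the FDR values were obtained. As a minimal sketch, assuming a Benjamini-Hochberg adjustment (the default p-value adjustment in DESeq2), the two thresholds could be applied as follows. The function names, the toy p-values and the use of "less than or equal to" at each threshold are assumptions made for illustration, not part of the pipeline.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def label(fdr, significant=0.25, highly_significant=0.05):
    """Classify one FDR value using the two thresholds quoted in the report."""
    if fdr <= highly_significant:   # assumption: boundary counts as passing
        return "highly significant"
    if fdr <= significant:
        return "significant"
    return "not significant"


toy_p = [0.001, 0.04, 0.03, 0.6]    # hypothetical raw p-values
toy_fdr = bh_adjust(toy_p)
for p, q in zip(toy_p, toy_fdr):
    print(p, round(q, 4), label(q))
```

With these toy values the first taxon would be labelled highly significant, the next two significant, and the last not significant.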

The analysis proceeded in several stages outlined below. These included quality control of the data, analysis of relative abundance and differential abundance for each taxon.


Quality Control

Several quality control checks were performed: 1) sanity checks of the variables present in the data object, 2) measurement of counts (ASVs) per sample, 3) read counts per Group and Volunteer, and 4) rarefaction analysis. Rarefaction analysis randomly subsamples successively larger numbers of reads from each sample; full coverage of the taxa available in a sample is indicated when the number of ASVs observed plateaus at higher sampling depths.
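
The idea behind a rarefaction curve can be sketched in a few lines of Python. This is an illustration only, not the R code used for this report: the function name, the seed and the toy count vector are hypothetical, and each depth is subsampled independently without replacement.

```python
import random


def rarefaction_curve(counts, depths, seed=0):
    """Distinct ASVs seen when subsampling reads without replacement."""
    rng = random.Random(seed)
    # Expand the count vector into one entry per read, labelled by ASV index.
    reads = [asv for asv, n in enumerate(counts) for _ in range(n)]
    curve = []
    for depth in depths:
        depth = min(depth, len(reads))      # cannot draw more reads than exist
        subsample = rng.sample(reads, depth)
        curve.append((depth, len(set(subsample))))
    return curve


toy_counts = [500, 300, 150, 40, 8, 2]      # hypothetical reads per ASV
curve = rarefaction_curve(toy_counts, depths=[10, 100, 500, 1000])
for depth, observed in curve:
    print(depth, observed)
```

At the full depth of 1000 reads all six toy ASVs are necessarily observed; a curve that has already reached that value at lower depths is the plateau referred to above.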

Sanity checks

## Checks of the phyloseq dataset
## Sample variables:  fastq_1 fastq_2 strandedness ID Home Volunteer Group Rep sample_id ancom_group_var biol_rep 
## Number of taxa:  7251 
## Taxonomic rank names:  Kingdom Phylum Class Order Family Genus Species

ASVs per sample

Figure QC1 - ASVs per sample. See supplementary pdf 1_seq_depth_histo.pdf

Read counts per sample

Figure QC2 - Read counts per sample. See supplementary pdf 1_seq_depth_by_.pdf

Rarefaction curves

Figure QC3 - Rarefaction curves. See supplementary pdf 1_rarefaction_curves.pdf

Taxon and sample removal, merging at species level

2 taxa were removed by the filter Kingdom_IS_Bacteria,Archaea;Phylum_NOT_Cyanobacteria/Chloroplast;Class_NOT_Chloroplast;Family_NOT_Mitochondria, leaving 7249 taxa. Raw ASVs, taxonomy metadata and counts can be found in supplementary file 1_seqtab_nochim_counts.txt. Accompanying sample metadata is found in supplementary file 1_seqtab_sample_meta.txt.

None of the 69 samples had fewer than 500 remaining counts, so no samples were removed.

69 samples with technical replicates were collapsed into 42 samples.

11 of 7249 original ASVs were removed because of zero counts after low-count sample removal.

7238 original ASVs were merged into 629 taxa at Species level.
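
Merging ASVs into taxa at a given level can be pictured as summing the counts of all ASVs that share the same lineage down to that level. The following Python sketch illustrates this under that assumption; it is not the phyloseq or speedyseq code used here, and the function name, ASV identifiers and placeholder lineage names are hypothetical.

```python
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def agglomerate(asv_counts, asv_taxonomy, depth):
    """Sum per-sample counts of ASVs that share the first `depth` ranks."""
    merged = {}
    for asv, counts in asv_counts.items():
        lineage = tuple(asv_taxonomy[asv][:depth])
        if lineage in merged:
            merged[lineage] = [a + b for a, b in zip(merged[lineage], counts)]
        else:
            merged[lineage] = list(counts)
    return merged


# Placeholder lineage shared by all three toy ASVs down to Genus.
prefix = ["Bacteria", "PhylumA", "ClassA", "OrderA", "FamilyA", "GenusA"]
toy_taxonomy = {
    "ASV1": prefix + ["species1"],
    "ASV2": prefix + ["species1"],
    "ASV3": prefix + ["species2"],
}
toy_counts = {"ASV1": [10, 0], "ASV2": [5, 3], "ASV3": [1, 7]}  # two samples

species = agglomerate(toy_counts, toy_taxonomy, RANKS.index("Species") + 1)
genus = agglomerate(toy_counts, toy_taxonomy, RANKS.index("Genus") + 1)
print(len(species), len(genus))
```

In the toy data three ASVs collapse to two species and one genus, in the same way that 7238 ASVs collapsed to 629 species-level taxa above.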


Data Analysis


Analysis of Relative Abundance

After quality control, taxon relative abundance within each sample was analysed at each level of the taxonomy. The plots for each level are referenced below.
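
Relative abundance here means each taxon's share of the total counts in a sample. A minimal Python sketch, with a hypothetical function name and toy counts, is:

```python
def relative_abundance(counts):
    """Convert one sample's taxon counts to proportions that sum to 1."""
    total = sum(counts)
    if total == 0:
        # An empty sample has no defined proportions; return zeros.
        return [0.0] * len(counts)
    return [c / total for c in counts]


toy_sample = {"taxon_a": 60, "taxon_b": 30, "taxon_c": 10}  # hypothetical
proportions = dict(zip(toy_sample, relative_abundance(list(toy_sample.values()))))
print(proportions)
```

Because each sample is scaled by its own total, samples with very different sequencing depths can be shown side by side in the same plot.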

Species

plot shows all samples by Volunteer; see supplementary pdf 2_rel_abundance_by_taxon.pdf

see supplementary pdf 2_rel_abundance_by_taxon_heatmaps.pdf

Genus

plot shows all samples by Volunteer; see supplementary pdf 2_rel_abundance_by_taxon.pdf

see supplementary pdf 2_rel_abundance_by_taxon_heatmaps.pdf

Family

plot shows all samples by Volunteer; see supplementary pdf 2_rel_abundance_by_taxon.pdf

see supplementary pdf 2_rel_abundance_by_taxon_heatmaps.pdf

Order

plot shows all samples by Volunteer; see supplementary pdf 2_rel_abundance_by_taxon.pdf

see supplementary pdf 2_rel_abundance_by_taxon_heatmaps.pdf

Class

plot shows all samples by Volunteer; see supplementary pdf 2_rel_abundance_by_taxon.pdf

see supplementary pdf 2_rel_abundance_by_taxon_heatmaps.pdf

Phylum

plot shows all samples by Volunteer; see supplementary pdf 2_rel_abundance_by_taxon.pdf

see supplementary pdf 2_rel_abundance_by_taxon_heatmaps.pdf


Alpha Diversity

Alpha diversity measures (Observed, Shannon, InvSimpson) were computed for each sample. Differences in alpha diversity were assessed with respect to each main variable (Group in this case). Since alpha diversity measures are not in general normally distributed, significance was assessed with non-parametric tests: Kruskal-Wallis tests were performed for Group.
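
For reference, the three measures can be written out directly from a sample's count vector. The Python sketch below uses the standard definitions (Shannon with natural logarithm, inverse Simpson as one over the sum of squared proportions); the function names and toy counts are hypothetical, and the values in this report were computed in R rather than with this code.

```python
import math


def observed(counts):
    """Number of taxa with at least one count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inv_simpson(counts):
    """Inverse Simpson index 1 / sum(p^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts)


even = [25, 25, 25, 25, 0]      # hypothetical, perfectly even community
skewed = [97, 1, 1, 1, 0]       # hypothetical, dominated by one taxon
for name, sample in (("even", even), ("skewed", skewed)):
    print(name, observed(sample), round(shannon(sample), 4), round(inv_simpson(sample), 4))
```

Both toy samples have four observed taxa, but the even sample reaches the maximum Shannon (ln 4) and inverse Simpson (4) values for four taxa, whereas the skewed sample scores much lower on both.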

Table of Measures

row_name sample_id Volunteer Group biol_rep uniq_biol_rep counts Observed Shannon InvSimpson
Control__v01_v01 A001-1m v01 Control v01 Control__v01_v01 282414 147 3.396561 16.580762
Control__v02_v02 A002-1m v02 Control v02 Control__v02_v02 137209 138 3.554091 19.138450
Control__v03_v03 A003-1 v03 Control v03 Control__v03_v03 92912 125 3.570713 19.766966
Control__v04_v04 A004-1m v04 Control v04 Control__v04_v04 218270 128 3.038310 9.373335
Alzheimers__v46_v46 A046-1m v46 Alzheimers v46 Alzheimers__v46_v46 185264 164 3.525128 16.991952
Alzheimers__v15_v15 B015-1m v15 Alzheimers v15 Alzheimers__v15_v15 273526 168 3.470184 17.892687
Alzheimers__v16_v16 B016-1m v16 Alzheimers v16 Alzheimers__v16_v16 284720 163 3.449342 17.000746
Control__v07_v07 C007-1 v07 Control v07 Control__v07_v07 27148 104 3.088562 9.926614
Control__v08_v08 D008-1m v08 Control v08 Control__v08_v08 153925 156 3.516417 18.460324
Control__v10_v10 D010-1m v10 Control v10 Control__v10_v10 248717 138 3.664760 21.546652
Control__v11_v11 D011-1m v11 Control v11 Control__v11_v11 206383 189 4.062241 35.853788
Control__v12_v12 D012-1m v12 Control v12 Control__v12_v12 206820 170 3.843848 22.636663
Control__v18_v18 D018-1m v18 Control v18 Control__v18_v18 263635 98 2.833054 8.166896
Control__v13_v13 E013-1m v13 Control v13 Control__v13_v13 121983 209 4.113058 37.329289
Control__v14_v14 E014-1m v14 Control v14 Control__v14_v14 129252 167 3.647164 17.249690
Control__v17_v17 F017-2 v17 Control v17 Control__v17_v17 42763 138 3.669954 19.093633
AlzCB__v24_v24 F024-1m v24 AlzCB v24 AlzCB__v24_v24 235760 186 3.539712 14.350865
Alzheimers__v21_v21 G021-1m v21 Alzheimers v21 Alzheimers__v21_v21 232128 154 3.718337 23.187320
Alzheimers__v22_v22 G022-1m v22 Alzheimers v22 Alzheimers__v22_v22 221400 182 3.668033 20.810292
Alzheimers__v25_v25 G025-1m v25 Alzheimers v25 Alzheimers__v25_v25 189481 176 3.664448 20.120911
Control__v26_v26 H026-1m v26 Control v26 Control__v26_v26 253193 83 2.839464 11.747724
Alzheimers__v29_v29 I029-1 v29 Alzheimers v29 Alzheimers__v29_v29 129762 144 3.802987 30.701727
Alzheimers__v30_v30 I030-1 v30 Alzheimers v30 Alzheimers__v30_v30 108335 155 3.808676 25.977929
Alzheimers__v31_v31 I031-1m v31 Alzheimers v31 Alzheimers__v31_v31 219789 209 4.017790 29.009821
Alzheimers__v32_v32 I032-1 v32 Alzheimers v32 Alzheimers__v32_v32 67662 153 3.777499 25.391709
AlzCB__v33_v33 I033-1m v33 AlzCB v33 AlzCB__v33_v33 291700 191 3.606894 17.095154
Alzheimers__v34_v34 I034-1 v34 Alzheimers v34 Alzheimers__v34_v34 116118 126 3.591972 21.324417
AlzCB__v36_v36 J036-1 v36 AlzCB v36 AlzCB__v36_v36 76322 173 3.796277 23.372410
AlzCB__v37_v37 J037-1 v37 AlzCB v37 AlzCB__v37_v37 99024 167 4.079923 40.764169
AlzCB__v38_v38 J038-1 v38 AlzCB v38 AlzCB__v38_v38 117361 145 3.659899 21.903477
AlzCB__v39_v39 J039-1 v39 AlzCB v39 AlzCB__v39_v39 45884 108 3.512264 18.831273
AlzCB__v40_v40 J040-1m v40 AlzCB v40 AlzCB__v40_v40 221046 166 3.105160 6.159539
Alzheimers__v43_v43 K043-1 v43 Alzheimers v43 Alzheimers__v43_v43 136308 110 3.193978 15.077346
Alzheimers__v49_v49 M049-1m v49 Alzheimers v49 Alzheimers__v49_v49 261718 183 3.494278 13.777032
AlzCB__v50_v50 M050-1 v50 AlzCB v50 AlzCB__v50_v50 106041 122 3.333064 14.022403
Control__v52_v52 N052-1m v52 Control v52 Control__v52_v52 246273 121 3.533196 21.670846
Control__v53_v53 N053-1m v53 Control v53 Control__v53_v53 73896 114 3.410619 17.466268
Control__v54_v54 N054-1 v54 Control v54 Control__v54_v54 113016 128 3.542337 18.185697
Control__v55_v55 N055-1m v55 Control v55 Control__v55_v55 265229 137 3.233525 13.572833
AlzCB__v56_v56 N056-1m v56 AlzCB v56 AlzCB__v56_v56 232633 145 3.036665 8.763989
Alzheimers__v58_v58 N058-1m v58 Alzheimers v58 Alzheimers__v58_v58 283057 158 3.752273 21.492976
AlzCB__v59_v59 N059-1 v59 AlzCB v59 AlzCB__v59_v59 132854 182 3.745094 21.879856

Differential Analysis Plots

Observed

see supplementary pdf 3_alpha_diversity_vs__Group.pdf and accompanying table in 3_alpha_diversity_table.txt

Shannon

see supplementary pdf 3_alpha_diversity_vs__Group.pdf and accompanying table in 3_alpha_diversity_table.txt

InvSimpson

see supplementary pdf 3_alpha_diversity_vs__Group.pdf and accompanying table in 3_alpha_diversity_table.txt
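
The Kruskal-Wallis comparison across Group can be illustrated with a minimal Python sketch. It is not the R code behind this report: the function name and the Shannon-like toy values are hypothetical, and the p-value uses the chi-squared approximation with 2 degrees of freedom, which applies only when exactly three groups are compared.

```python
import math


def kruskal_wallis(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    values = sorted((v, g) for g, grp in enumerate(groups) for v in grp)
    n = len(values)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and values[j][0] == values[i][0]:
            j += 1
        mean_rank = (i + 1 + j) / 2.0       # average of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[values[k][1]] += mean_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(grp) for r, grp in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    # Assumes not all values are identical, otherwise the correction is zero.
    return h / (1.0 - tie_term / (n ** 3 - n))


toy = [[3.1, 3.2, 3.3], [3.4, 3.5, 3.6], [3.7, 3.8, 3.9]]  # hypothetical values
h = kruskal_wallis(toy)
p = math.exp(-h / 2.0)      # chi-squared survival function for df = 2
print(round(h, 3), round(p, 4))
```

The test uses only the ranks of the values, which is why it is suitable when the measures are not normally distributed.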


Beta Diversity

Beta diversity plots were constructed from Principal Coordinates Analysis (PCoA) of Bray-Curtis dissimilarities. The adonis function from the vegan package in R was used to perform permutational multivariate analysis of variance on these dissimilarities.
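
The Bray-Curtis dissimilarity between two samples is the sum of absolute differences in taxon counts divided by the total counts of both samples. The Python sketch below shows the calculation on toy data; the function name and sample vectors are hypothetical, and whether the pipeline applied it to raw counts or to transformed abundances is not stated in this report.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(x) + sum(y)
    return numerator / denominator if denominator else 0.0


toy_samples = {                  # hypothetical counts for three taxa
    "s1": [10, 0, 5],
    "s2": [5, 5, 5],
    "s3": [0, 20, 0],
}
names = list(toy_samples)
# Full pairwise dissimilarity matrix, the input to an ordination such as PCoA.
matrix = [[bray_curtis(toy_samples[a], toy_samples[b]) for b in names] for a in names]
for name, row in zip(names, matrix):
    print(name, [round(v, 3) for v in row])
```

The value is 0 for identical samples and 1 for samples that share no taxa; the toy samples s1 and s3 have no taxon in common and so reach the maximum.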

coloured by Group

see supplementary pdf 4_beta_diversity_vs__Group.pdf

coloured by Volunteer

see supplementary pdf 4_beta_diversity_vs__Group.pdf

adonis analysis of variance

see supplementary pdf 4_beta_diversity_vs__Group.pdf


Differential Abundance Analysis

Differential abundance analysis was carried out at all levels of the taxonomy tree using the DESeq2 (Love et al., Genome Biology v15:550, 2014, PMID: 25516281) and ANCOMBC (Lin & Peddada, Nature Methods v21:83-91, 2024, PMID: 38158428) packages in R, except where fewer than five taxa precluded meaningful analysis. In all such analyses, two filters were applied: taxa were discarded if fewer than 25% of Volunteers had non-zero counts, and taxa were also discarded if fewer than 50% of samples had at least 20 counts.
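
The two filters can be sketched as a single predicate on a taxon's counts. This Python example is illustrative only: the function and taxon names are hypothetical, and it assumes one count per Volunteer (that is, after technical replicates have been collapsed), so that samples and Volunteers coincide.

```python
def keep_taxon(counts, min_prevalence=0.25, min_count=20, min_count_fraction=0.5):
    """Apply both filters to one taxon's counts across samples."""
    n = len(counts)
    nonzero = sum(1 for c in counts if c > 0)
    abundant = sum(1 for c in counts if c >= min_count)
    # Keep only if enough samples are non-zero AND enough reach min_count.
    return nonzero >= min_prevalence * n and abundant >= min_count_fraction * n


toy_table = {                                     # hypothetical, 8 samples
    "taxon_common": [30, 25, 40, 22, 0, 0, 0, 0],
    "taxon_rare": [1, 0, 0, 0, 0, 0, 0, 0],
    "taxon_low": [5, 5, 5, 5, 5, 5, 5, 5],
}
kept = [name for name, counts in toy_table.items() if keep_taxon(counts)]
print(kept)
```

In the toy table only taxon_common survives: taxon_rare fails the prevalence filter and taxon_low is present everywhere but never reaches 20 counts.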

Plots of between-group contrasts

Differential abundance analysis was carried out in two phases. The first phase addressed the response of taxa to the main effects and interactions individually. Volcano plots were constructed for each level of the taxonomy, for all possible contrasts within each main effect and for all possible interaction effects. Note that DESeq2 expresses interaction effects relative to the first (control) level of each variable. All volcano plots have mouse-over tooltips on significantly changed taxa. Follow-up plots of normalised counts were also produced for individual taxa, coloured by the appropriate main variable. For each taxon level, these follow-up plots were sorted in order of decreasing significance and the top 20 are shown in this HTML report. These and all plots of lower significance are present in the indicated pdf files.
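
As an illustration of how such plots are typically assembled, the Python sketch below converts a results table into volcano-plot coordinates and selects the most significant taxa for follow-up. It is a sketch under stated assumptions, not the report's plotting code: the function names, the toy results, the use of adjusted p-values on the y-axis and the 0.25 cut-off for flagging points are all assumptions.

```python
import math


def volcano_points(results, fdr_cutoff=0.25):
    """Map (taxon, log2 fold change, adjusted p) rows to plot coordinates."""
    points = []
    for taxon, log2fc, padj in results:
        points.append({
            "taxon": taxon,
            "x": log2fc,
            "y": -math.log10(padj),        # assumes padj > 0
            "significant": padj <= fdr_cutoff,
        })
    return points


def top_hits(results, n=20):
    """Rows sorted by increasing adjusted p-value, truncated to the top n."""
    return sorted(results, key=lambda row: row[2])[:n]


toy_results = [                            # hypothetical values
    ("taxon_a", 1.8, 0.01),
    ("taxon_b", -0.4, 0.60),
    ("taxon_c", -2.1, 0.20),
]
points = volcano_points(toy_results)
best = [row[0] for row in top_hits(toy_results, n=2)]
print(best, [p["taxon"] for p in points if p["significant"]])
```

Here taxon_a and taxon_c would be flagged and would receive follow-up plots, in that order of significance.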

Species

Volcano plots

Most-significant findings

Note that points in the graphs represent data that have been processed extensively by DESeq2: counts are first normalised for sequencing depth and then, in rare cases, outlier-corrected based on Cook's distance at the 99th percentile. A variance-stabilising transformation, including log2 transformation and regularisation to handle low counts, is then applied, followed by batch correction where relevant.

For all graphs and data, including non-significant, see supplementary files 5_seqtab_nochim_tables.xlsx, 5_diffabund_analyses_grpcontrasts.xlsx, 5_diffabund_analyses_grpcontrasts_volcanos.pdf and 5_diffabund_analyses_grpcontrasts_dotplots.pdf.

Status, warning and error messages

At Species level, 540 of 629 taxa were lost because fewer than 25% of subjects had >0 counts, or because fewer than 50% of samples had at least 20 counts.

Of 89 taxa, DA model fits converged for 89 taxa, so these only were further analysed.

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 89 of 89 taxa have at least 8 Volunteers (= user-spec minimum fraction 0.25 x 32 Volunteers), ok

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 85 of 89 taxa have at least 20 (user-spec minimum) ASV counts in at least 16 samples (= user-spec minimum fraction 0.5 x 32 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 89 of 89 taxa have at least 7 Volunteers (= user-spec minimum fraction 0.25 x 28 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 84 of 89 taxa have at least 20 (user-spec minimum) ASV counts in at least 14 samples (= user-spec minimum fraction 0.5 x 28 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 89 of 89 taxa have at least 6 Volunteers (= user-spec minimum fraction 0.25 x 24 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 88 of 89 taxa have at least 20 (user-spec minimum) ASV counts in at least 12 samples (= user-spec minimum fraction 0.5 x 24 relevant samples), ok
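
The sample numbers and minimum thresholds quoted in these messages follow from the group sizes in the Table of Measures (18 Control, 14 Alzheimers and 10 AlzCB samples after replicate collapse). The Python sketch below reproduces the arithmetic; the function name is hypothetical, and rounding up to a whole number is an assumption that is not exercised here because every product is already an integer.

```python
import math

# Group sizes counted from the Table of Measures (42 samples in total).
group_sizes = {"Control": 18, "Alzheimers": 14, "AlzCB": 10}
contrasts = [("Alzheimers", "Control"), ("AlzCB", "Control"), ("AlzCB", "Alzheimers")]


def contrast_thresholds(a, b, min_prevalence=0.25, min_count_fraction=0.5):
    """Samples in a contrast plus the two minimum-sample thresholds."""
    n = group_sizes[a] + group_sizes[b]
    min_volunteers = math.ceil(min_prevalence * n)     # rounding is an assumption
    min_samples = math.ceil(min_count_fraction * n)
    return n, min_volunteers, min_samples


summary = {f"{a}-{b}": contrast_thresholds(a, b) for a, b in contrasts}
for name, (n, min_volunteers, min_samples) in summary.items():
    print(name, n, min_volunteers, min_samples)
```

This gives 32, 28 and 24 samples for the three contrasts, with minimum Volunteer numbers of 8, 7 and 6 and minimum sample numbers of 16, 14 and 12, matching the messages above.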

Genus

Volcano plots

Most-significant findings

For all graphs and data, including non-significant, see supplementary files 5_seqtab_nochim_tables.xlsx, 5_diffabund_analyses_grpcontrasts.xlsx, 5_diffabund_analyses_grpcontrasts_volcanos.pdf and 5_diffabund_analyses_grpcontrasts_dotplots.pdf.

Status, warning and error messages

At Genus level, 245 of 315 taxa were lost because fewer than 25% of subjects had >0 counts, or because fewer than 50% of samples had at least 20 counts.

Of 70 taxa, DA model fits converged for 70 taxa, so these only were further analysed.

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 70 of 70 taxa have at least 8 Volunteers (= user-spec minimum fraction 0.25 x 32 Volunteers), ok

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 68 of 70 taxa have at least 20 (user-spec minimum) ASV counts in at least 16 samples (= user-spec minimum fraction 0.5 x 32 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 70 of 70 taxa have at least 7 Volunteers (= user-spec minimum fraction 0.25 x 28 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 68 of 70 taxa have at least 20 (user-spec minimum) ASV counts in at least 14 samples (= user-spec minimum fraction 0.5 x 28 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 70 of 70 taxa have at least 6 Volunteers (= user-spec minimum fraction 0.25 x 24 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 70 of 70 taxa have at least 20 (user-spec minimum) ASV counts in at least 12 samples (= user-spec minimum fraction 0.5 x 24 relevant samples), ok

Family

Volcano plots

Most-significant findings

For all graphs and data, including non-significant, see supplementary files 5_seqtab_nochim_tables.xlsx, 5_diffabund_analyses_grpcontrasts.xlsx, 5_diffabund_analyses_grpcontrasts_volcanos.pdf and 5_diffabund_analyses_grpcontrasts_dotplots.pdf.

Status, warning and error messages

At Family level, 86 of 117 taxa were lost because fewer than 25% of subjects had >0 counts, or because fewer than 50% of samples had at least 20 counts.

Of 31 taxa, DA model fits converged for 31 taxa, so these only were further analysed.

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 31 of 31 taxa have at least 8 Volunteers (= user-spec minimum fraction 0.25 x 32 Volunteers), ok

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 31 of 31 taxa have at least 20 (user-spec minimum) ASV counts in at least 16 samples (= user-spec minimum fraction 0.5 x 32 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 31 of 31 taxa have at least 7 Volunteers (= user-spec minimum fraction 0.25 x 28 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 31 of 31 taxa have at least 20 (user-spec minimum) ASV counts in at least 14 samples (= user-spec minimum fraction 0.5 x 28 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 31 of 31 taxa have at least 6 Volunteers (= user-spec minimum fraction 0.25 x 24 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 31 of 31 taxa have at least 20 (user-spec minimum) ASV counts in at least 12 samples (= user-spec minimum fraction 0.5 x 24 relevant samples), ok

Order

Volcano plots

Most-significant findings

For all graphs and data, including non-significant, see supplementary files 5_seqtab_nochim_tables.xlsx, 5_diffabund_analyses_grpcontrasts.xlsx, 5_diffabund_analyses_grpcontrasts_volcanos.pdf and 5_diffabund_analyses_grpcontrasts_dotplots.pdf.

Status, warning and error messages

At Order level, 43 of 62 taxa were lost because fewer than 25% of subjects had >0 counts, or because fewer than 50% of samples had at least 20 counts.

Of 19 taxa, DA model fits converged for 19 taxa, so these only were further analysed.

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 19 of 19 taxa have at least 8 Volunteers (= user-spec minimum fraction 0.25 x 32 Volunteers), ok

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 19 of 19 taxa have at least 20 (user-spec minimum) ASV counts in at least 16 samples (= user-spec minimum fraction 0.5 x 32 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 19 of 19 taxa have at least 7 Volunteers (= user-spec minimum fraction 0.25 x 28 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 19 of 19 taxa have at least 20 (user-spec minimum) ASV counts in at least 14 samples (= user-spec minimum fraction 0.5 x 28 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 19 of 19 taxa have at least 6 Volunteers (= user-spec minimum fraction 0.25 x 24 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 19 of 19 taxa have at least 20 (user-spec minimum) ASV counts in at least 12 samples (= user-spec minimum fraction 0.5 x 24 relevant samples), ok

Class

Volcano plots

Most-significant findings

For all graphs and data, including non-significant, see supplementary files 5_seqtab_nochim_tables.xlsx, 5_diffabund_analyses_grpcontrasts.xlsx, 5_diffabund_analyses_grpcontrasts_volcanos.pdf and 5_diffabund_analyses_grpcontrasts_dotplots.pdf.

Status, warning and error messages

At Class level, 19 of 27 taxa were lost because fewer than 25% of subjects had >0 counts, or because fewer than 50% of samples had at least 20 counts.

Of 8 taxa, DA model fits converged for 8 taxa, so these only were further analysed.

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 8 of 8 taxa have at least 8 Volunteers (= user-spec minimum fraction 0.25 x 32 Volunteers), ok

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 8 of 8 taxa have at least 20 (user-spec minimum) ASV counts in at least 16 samples (= user-spec minimum fraction 0.5 x 32 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 8 of 8 taxa have at least 7 Volunteers (= user-spec minimum fraction 0.25 x 28 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 8 of 8 taxa have at least 20 (user-spec minimum) ASV counts in at least 14 samples (= user-spec minimum fraction 0.5 x 28 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 8 of 8 taxa have at least 6 Volunteers (= user-spec minimum fraction 0.25 x 24 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 8 of 8 taxa have at least 20 (user-spec minimum) ASV counts in at least 12 samples (= user-spec minimum fraction 0.5 x 24 relevant samples), ok

Phylum

Volcano plots

Most-significant findings

For all graphs and data, including non-significant, see supplementary files 5_seqtab_nochim_tables.xlsx, 5_diffabund_analyses_grpcontrasts.xlsx, 5_diffabund_analyses_grpcontrasts_volcanos.pdf and 5_diffabund_analyses_grpcontrasts_dotplots.pdf.

Status, warning and error messages

At Phylum level, 6 of 12 taxa were lost because fewer than 25% of subjects had >0 counts, or because fewer than 50% of samples had at least 20 counts.

Of 6 taxa, DA model fits converged for 6 taxa, so these only were further analysed.

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 6 of 6 taxa have at least 8 Volunteers (= user-spec minimum fraction 0.25 x 32 Volunteers), ok

analysing main effects and interactions (contrast Group Alzheimers-Control): in 32 samples with relevant metadata, 6 of 6 taxa have at least 20 (user-spec minimum) ASV counts in at least 16 samples (= user-spec minimum fraction 0.5 x 32 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 6 of 6 taxa have at least 7 Volunteers (= user-spec minimum fraction 0.25 x 28 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Control): in 28 samples with relevant metadata, 6 of 6 taxa have at least 20 (user-spec minimum) ASV counts in at least 14 samples (= user-spec minimum fraction 0.5 x 28 relevant samples), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 6 of 6 taxa have at least 6 Volunteers (= user-spec minimum fraction 0.25 x 24 Volunteers), ok

analysing main effects and interactions (contrast Group AlzCB-Alzheimers): in 24 samples with relevant metadata, 6 of 6 taxa have at least 20 (user-spec minimum) ASV counts in at least 12 samples (= user-spec minimum fraction 0.5 x 24 relevant samples), ok

Kingdom

Volcano plots

Most-significant findings

For all graphs and data, including non-significant, see supplementary files 5_seqtab_nochim_tables.xlsx, 5_diffabund_analyses_grpcontrasts.xlsx, 5_diffabund_analyses_grpcontrasts_volcanos.pdf and 5_diffabund_analyses_grpcontrasts_dotplots.pdf.

Status, warning and error messages

Differential abundance analysis could not proceed at Kingdom level because there was only 1 taxon before initial taxon agglomeration.

QC on DA pvalues

Library Versions

## R version 4.5.0 (2025-04-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
## [1] C
## 
## time zone: Europe/London
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] doRNG_1.8.6.2               rngtools_1.5.2              foreach_1.5.2               microbiome_1.30.0           ANCOMBC_2.11.1              ggplotify_0.1.2             pheatmap_1.0.13            
##  [8] ggrastr_1.0.2               ggpubr_0.6.0                plotly_4.10.4               ape_5.8-1                   ARTool_0.11.2               RColorBrewer_1.1-3          openxlsx_4.2.8             
## [15] DESeq2_1.48.1               SummarizedExperiment_1.38.1 Biobase_2.68.0              MatrixGenerics_1.20.0       matrixStats_1.5.0           GenomicRanges_1.60.0        GenomeInfoDb_1.44.0        
## [22] IRanges_2.42.0              S4Vectors_0.46.0            BiocGenerics_0.54.0         generics_0.1.4              vegan_2.7-1                 permute_0.9-7               lubridate_1.9.3            
## [29] forcats_1.0.0               stringr_1.5.1               dplyr_1.1.4                 purrr_1.0.4                 readr_2.1.5                 tidyr_1.3.1                 tibble_3.3.0               
## [36] ggplot2_3.5.2               tidyverse_2.0.0             speedyseq_0.5.3.9021        phyloseq_1.52.0            
## 
## loaded via a namespace (and not attached):
##   [1] splines_4.5.0           cellranger_1.1.0        rpart_4.1.24            lifecycle_1.0.4         Rdpack_2.6.4            rstatix_0.7.2           doParallel_1.0.17       lattice_0.22-5         
##   [9] MASS_7.3-65             crosstalk_1.2.1         backports_1.5.0         magrittr_2.0.3          Hmisc_5.2-3             sass_0.4.10             rmarkdown_2.29          jquerylib_0.1.4        
##  [17] yaml_2.3.10             zip_2.3.1               gld_2.6.7               cowplot_1.1.3           minqa_1.2.8             ade4_1.7-23             multcomp_1.4-28         abind_1.4-8            
##  [25] Rtsne_0.17              expm_1.0-0              nnet_7.3-20             yulab.utils_0.2.0       TH.data_1.1-3           sandwich_3.1-1          GenomeInfoDbData_1.2.14 codetools_0.2-20       
##  [33] DelayedArray_0.34.1     energy_1.7-12           tidyselect_1.2.1        UCSC.utils_1.4.0        farver_2.1.2            lme4_1.1-37             gmp_0.7-5               base64enc_0.1-3        
##  [41] jsonlite_2.0.0          multtest_2.64.0         e1071_1.7-16            Formula_1.2-5           survival_3.8-3          iterators_1.0.14        emmeans_1.11.1          tools_4.5.0            
##  [49] DescTools_0.99.60       Rcpp_1.0.14             glue_1.8.0              gridExtra_2.3           SparseArray_1.8.0       xfun_0.52               mgcv_1.9-1              numDeriv_2016.8-1.1    
##  [57] withr_3.0.2             fastmap_1.2.0           boot_1.3-31             rhdf5filters_1.20.0     digest_0.6.37           timechange_0.3.0        R6_2.6.1                gridGraphics_0.5-1     
##  [65] estimability_1.5.1      colorspace_2.1-1        Cairo_1.6-2             gtools_3.9.5            dichromat_2.0-0.1       utf8_1.2.6              data.table_1.17.4       class_7.3-23           
##  [73] CVXR_1.0-15             httr_1.4.7              htmlwidgets_1.6.4       S4Arrays_1.8.1          pkgconfig_2.0.3         gtable_0.3.6            Exact_3.3               Rmpfr_0.9-5            
##  [81] XVector_0.48.0          htmltools_0.5.8.1       carData_3.0-5           biomformat_1.36.0       scales_1.4.0            lmom_3.2                reformulas_0.4.1        knitr_1.50             
##  [89] rstudioapi_0.17.1       tzdb_0.5.0              reshape2_1.4.4          checkmate_2.3.2         coda_0.19-4.1           nlme_3.1-168            nloptr_2.2.1            proxy_0.4-27           
##  [97] cachem_1.1.0            zoo_1.8-14              rhdf5_2.52.1            rootSolve_1.8.2.4       parallel_4.5.0          vipor_0.4.7             foreign_0.8-90          pillar_1.10.2          
## [105] grid_4.5.0              vctrs_0.6.5             car_3.1-2               xtable_1.8-4            cluster_2.1.8.1         htmlTable_2.4.3         beeswarm_0.4.0          evaluate_1.0.3         
## [113] mvtnorm_1.3-3           cli_3.6.5               locfit_1.5-9.8          compiler_4.5.0          rlang_1.1.6             crayon_1.5.3            ggsignif_0.6.4          labeling_0.4.3         
## [121] plyr_1.8.9              fs_1.6.6                ggbeeswarm_0.7.2        stringi_1.8.7           viridisLite_0.4.2       BiocParallel_1.42.1     lmerTest_3.1-3          gsl_2.1-8              
## [129] Biostrings_2.76.0       lazyeval_0.2.2          Matrix_1.7-3            hms_1.1.3               bit64_4.6.0-1           Rhdf5lib_1.30.0         haven_2.5.5             rbibutils_2.3          
## [137] igraph_2.1.4            broom_1.0.5             bslib_0.9.0             bit_4.6.0               readxl_1.4.5