Why have you changed your pipeline?

Amplicon analysis has moved on in recent years. It is now widely accepted that the cluster-based algorithms that formed the basis of QIIME 1, such as UCLUST and UPARSE, tend to have lower sensitivity and can lead to misclassification in amplicon datasets.

Upgrading our pipeline to QIIME 2 and the Divisive Amplicon Denoising Algorithm 2 (DADA2) brings our service in line with the industry-accepted analysis workflow.

What’s DADA2 and why are you changing the nomenclature?

DADA2 is a recently developed algorithm that has quickly become the industry standard for amplicon analysis. It builds an error profile dynamically from the data, uses that profile to identify bases in a read that are likely to be erroneous, and corrects them. This improves the accuracy of amplicon analysis.

In addition, DADA2 is faster at processing large datasets. Its improvements over previously accepted methods have brought a change of nomenclature, from Operational Taxonomic Unit (OTU) to Amplicon Sequence Variant (ASV). The change to ASV has been driven by DADA2’s ability to resolve marker genes such as 16S down to a single-nucleotide difference in the sequence, rather than grouping reads using a 97% similarity threshold. This improves taxonomic classification and reduces misclassifications.

You can read more about DADA2 here: Callahan, Benjamin J., et al. "DADA2: high-resolution sample inference from Illumina amplicon data." Nature Methods 13.7 (2016): 581.
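
To make the OTU versus ASV distinction concrete, here is a minimal Python sketch. It is a toy illustration only, not how UCLUST, UPARSE or DADA2 work internally: the function names are hypothetical, the sequences are invented, and DADA2's error model is not represented at all. It simply shows that two sequences differing by one base fall into the same group under a 97% similarity threshold, but remain separate when exact sequences are kept.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy example only compares equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def same_otu(a: str, b: str, threshold: float = 0.97) -> bool:
    """Toy OTU rule: group two sequences if their identity meets the threshold."""
    return identity(a, b) >= threshold


def same_asv(a: str, b: str) -> bool:
    """Toy ASV rule: two sequences are the same variant only if identical."""
    return a == b


# Invented 100-base sequences that differ at exactly one position.
reference = "ACGT" * 25
variant = "T" + reference[1:]

print(identity(reference, variant))   # 99 of 100 positions match
print(same_otu(reference, variant))   # grouped together at 97%
print(same_asv(reference, variant))   # kept apart as distinct variants
```

In this sketch the two sequences would be reported as a single OTU but as two ASVs, which is the extra resolution the change of nomenclature refers to.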

What else have you changed with the pipeline?

Bacterial classifications have traditionally been completed against GreenGenes. However, since GreenGenes has not been updated recently, we now classify our bacterial datasets using SILVA and NCBI in addition to GreenGenes. Our fungal datasets will continue to be classified against UNITE and NCBI.

We have also changed the way we present the relative abundance figures, which are now generated using QIIME 2. This brings a more user-friendly interface and allows you to visualise your data more easily. The absolute and relative abundances will be provided in a spreadsheet, and the bar charts will be provided in .html file format.

How do I choose between GreenGenes and SILVA?

It depends on the type of downstream analysis you are planning to perform. If you’re after taxonomic resolution, you can use the SILVA database results. If, however, you are performing downstream functional analysis, some analyses are built specifically on the GreenGenes database and may not work with SILVA. In those instances you can use the GreenGenes results.

Why is my .biom file not working?

You will need to update your downstream analysis workflow if it is based on the older (JSON-based) .biom format. The new files are in HDF5 format. Alternatively, you can follow this guide to convert them: https://biom-format.org/documentation/biom_conversion.html
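
If you are unsure which format a given .biom file is in, the following Python sketch may help. It relies only on the standard 8-byte signature that HDF5 files start with, and on the assumption that an older JSON-based .biom file begins with an opening brace. The function names and the example file name are hypothetical, and the check does not validate the file contents.

```python
# Standard 8-byte signature found at the start of an HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def classify_biom_header(head: bytes) -> str:
    """Guess the .biom flavour from the first bytes of the file."""
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    # Assumption: JSON-based .biom files start with '{', possibly after whitespace.
    if head.lstrip().startswith(b"{"):
        return "json"
    return "unknown"


def sniff_biom_format(path: str) -> str:
    """Read the first bytes of a file and classify them."""
    # Assumption: the HDF5 signature sits at offset 0 (no HDF5 user block).
    with open(path, "rb") as handle:
        return classify_biom_header(handle.read(64))


# Hypothetical usage:
# print(sniff_biom_format("feature-table.biom"))
```

A result of "hdf5" suggests the file is in the new format, so a tool that expects the older format will need updating or the file will need converting using the guide above.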

This all sounds great, but what if I’ve completed my work using QIIME 1?

We’ll still maintain our QIIME 1 pipeline for a short period of time. If you specifically require QIIME 1 for an ongoing project, please contact your Account Manager to discuss your options.

Alternatively, if you would like to re-analyse your datasets using QIIME 2, please contact your Account Manager who’ll be able to assist you with special pricing.

Why are there weird alphanumeric IDs in my results instead of “OTUs”?

The alphanumeric IDs are hashes of each amplicon sequence variant (ASV). As we use DADA2, the outputs are now ASVs instead of OTUs. If you are interested in having OTU outputs, please contact your Account Manager and they can guide you.
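
As a rough illustration of where such an ID comes from, the sketch below hashes a sequence with Python's standard hashlib module. It assumes the IDs are MD5 hashes of the ASV sequence, which we understand to be the QIIME 2 DADA2 default. Treat that as an assumption and check your own feature table and representative sequences before relying on it. The function name and the sequences are invented for the example.

```python
import hashlib


def feature_id(sequence: str) -> str:
    """Return a hex digest to use as an ID for a sequence (MD5 is an assumption)."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


# Invented sequences that differ by a single base.
seq_a = "ACGTACGTACGTACGT"
seq_b = "ACGTACGTACGTACGA"

print(feature_id(seq_a))  # 32 hexadecimal characters
print(feature_id(seq_b))  # a different ID for the one-base variant
```

The same sequence always yields the same ID, so under this assumption an ASV can be matched across runs by its ID, whereas OTU numbers depend on the clustering run that produced them.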

Why are there multiple biom files in my results folder?

The pipeline’s default feature table doesn’t contain taxonomy information. As taxonomy information is useful for most downstream analysis pipelines, we have appended the taxonomy information generated from each database to its own separate biom file for your convenience.

Why can’t I see the html plots when I double-click the files after download?

By default, the results folders are zipped for download. When you double-click the zipped file, Windows will show the contents of the folder, but you still need to “unzip” the files in order to view or use them.

To unzip the files, right-click the zipped file, select “7-Zip” (or equivalent software), then choose “Extract files” or “Extract to folder”. Then use the extracted folder to view the html plots.
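
If you prefer to script this step, the following Python sketch does the same thing using only the standard library. The function name and the archive name are hypothetical, and it assumes the download is an ordinary .zip archive.

```python
import zipfile
from pathlib import Path


def extract_results(zip_path, dest_dir=None):
    """Extract a zipped results folder and return the .html files found in it."""
    zip_path = Path(zip_path)
    # By default, extract next to the archive into a folder with the same name.
    dest = Path(dest_dir) if dest_dir else zip_path.with_suffix("")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(dest.rglob("*.html"))


# Hypothetical usage:
# for plot in extract_results("results.zip"):
#     print(plot)
```

Open the listed .html files from the extracted folder, not from inside the zipped file.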