A De Novo Robust Clustering Approach for Amplicon-Based Sequence Data

Abstract: When analyzing microbial communities, an active computational challenge is the categorization of 16S rRNA gene sequences into operational taxonomic units (OTUs). Established clustering tools use a one-pass algorithm to handle large numbers of gene sequences and produce OTUs in reasonable time. However, all current tools are based on a crisp clustering approach, in which each gene sequence is assigned to exactly one cluster. The weak quality of the output, compared with that of more complex clustering algorithms, forces the user to post-process the obtained OTUs. Providing a membership degree when assigning a gene sequence to an OTU will help the user during this post-processing task. Moreover, this membership degree can be used to automatically evaluate the quality of the obtained OTUs. The goal of this work is therefore to propose a new clustering approach that takes uncertainty into account when producing OTUs and improves both the quality and the presentation of the OTU results.
Document type:
Preprint, working paper
2017

Cited literature [15 references]

https://hal-clermont-univ.archives-ouvertes.fr/hal-01447699
Contributor: Alexandre Bazin
Submitted on: Monday, 8 May 2017 - 19:57:00
Last modified on: Wednesday, 14 March 2018 - 16:44:15
Document(s) archived on: Wednesday, 9 August 2017 - 15:34:05

File

ArticleECCB.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01447699, version 2

Citation

Alexandre Bazin, Didier Debroas, Engelbert Mephu Nguifo. A De Novo Robust Clustering Approach for Amplicon-Based Sequence Data. 2017. 〈hal-01447699v2〉


Metrics

Record views

260

File downloads

220