Marcelino V.R. & Verbruggen H. (2017)
Reference datasets of tufA and UPA markers to identify algae in metabarcoding surveys
Data in Brief 11: 273-276


The data presented here relate to the research article “Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae” (Marcelino and Verbruggen, 2016) [1]. We provide reference datasets of the elongation factor Tu (tufA) and Universal Plastid Amplicon (UPA) markers in a format that is ready to use in the QIIME pipeline (Caporaso et al., 2010) [2]. In addition to sequences previously available in GenBank, the datasets include newly discovered endolithic algal lineages identified using both amplicon sequencing (Marcelino and Verbruggen, 2016) [1] and chloroplast genome data (Marcelino et al., 2016; Verbruggen et al., in press) [3,4]. We also provide a script that converts GenBank flatfiles into reference datasets and that can be applied to other markers. The tufA and UPA reference datasets are made publicly available to facilitate biodiversity assessments of microalgal communities.
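
To illustrate the kind of conversion described above, the following is a minimal sketch of turning a GenBank flatfile into a QIIME-style reference pair: a FASTA file of sequences and a tab-separated taxonomy map keyed by the same identifiers. It is not the script distributed with the datasets. The function names (`parse_genbank`, `to_qiime`), the choice of accession numbers as sequence identifiers, the semicolon-joined lineage, and the sample record (a made-up accession and organism) are all assumptions for illustration. The parser uses only the standard library and handles only the simple case of a single-line organism name.

```python
import re


def parse_genbank(text):
    """Yield (accession, taxa, sequence) for each record in a GenBank flatfile.

    Assumes a single-line ORGANISM name followed by lineage lines
    indented by 12 spaces, as in typical GenBank records.
    """
    for record in text.split("\n//"):
        acc = re.search(r"^ACCESSION +(\S+)", record, re.M)
        org = re.search(r"^ +ORGANISM +(.+)\n((?: {12}\S.*\n)*)", record, re.M)
        ori = re.search(r"^ORIGIN.*\n", record, re.M)
        if not (acc and org and ori):
            continue  # skip incomplete records and trailing whitespace
        # Join the wrapped lineage lines and drop the final full stop.
        lineage = " ".join(org.group(2).split()).rstrip(".")
        taxa = [t.strip() for t in lineage.split(";") if t.strip()]
        taxa.append(org.group(1).strip())  # species name as the last rank
        # Sequence lines carry position numbers and spaces; keep letters only.
        seq = re.sub(r"[^A-Za-z]", "", record[ori.end():]).upper()
        yield acc.group(1), taxa, seq


def to_qiime(text):
    """Return (fasta, taxonomy) strings for all records in `text`."""
    fasta, taxonomy = [], []
    for acc, taxa, seq in parse_genbank(text):
        fasta.append(f">{acc}\n{seq}\n")
        taxonomy.append(f"{acc}\t{';'.join(taxa)}\n")
    return "".join(fasta), "".join(taxonomy)


# Hypothetical record: the accession and organism are invented for this sketch.
SAMPLE = (
    "LOCUS       XX000001    8 bp    DNA     linear   PLN 01-JAN-2000\n"
    "ACCESSION   XX000001\n"
    "  ORGANISM  Examplea fictiva\n"
    "            Eukaryota; Viridiplantae;\n"
    "            Chlorophyta.\n"
    "ORIGIN\n"
    "        1 atgcatgc\n"
    "//\n"
)

if __name__ == "__main__":
    fasta, taxonomy = to_qiime(SAMPLE)
    print(fasta, end="")
    print(taxonomy, end="")
```

In practice the two returned strings would be written to separate files and supplied to the taxonomy assignment step as the reference sequences and the identifier-to-taxonomy map. A production script would more likely rely on an established GenBank parser and extract the marker region from annotated features, rather than take the whole record sequence as this sketch does.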

