BalticMicrobeDB is a database containing quantifications of genes from the Baltic Sea Reference Metagenome (BARM). The quantifications are summarized for both taxonomic and functional annotations and can be compared between samples. In addition, a BLAST-based query can be used to compare any given sequence against the genes and annotations present in BARM. Note that only protein-coding genes with either a taxonomic or a functional annotation are included in BalticMicrobeDB.
Three sample sets are currently included: a time series, a spatial transect and a redoxcline sample set. The time series originates from the Linnaeus Microbial Observatory, sampled during 2012, and is named lmo in the database. The spatial transect, named transect in the database, consists of three samples taken at different depths at each of ten stations, with locations spanning the entire Baltic Sea. The redoxcline sample set consists of samples from two stations (Gotland Deep and Boknis Eck) targeting the redoxcline.
The genes in the database originate from a combined assembly of these three datasets, which together comprise 2.6 billion reads. The assembly was performed with Megahit, followed by gene prediction with Prodigal, and resulted in 6.8 million unique gene sequences. Of these, 5.4 million genes received at least one annotation (taxonomic or functional) and were included in BalticMicrobeDB.
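The inclusion rule described above can be illustrated with a minimal Python sketch. The Gene record, its field names and the toy values are hypothetical and chosen only for illustration; they do not reflect the actual BARM file formats or the BalticMicrobeDB schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gene:
    """Hypothetical gene record; field names are assumptions, not the BARM schema."""
    gene_id: str
    taxonomy: Optional[str] = None   # taxonomic annotation, if any
    function: Optional[str] = None   # functional annotation, if any


def is_included(gene: Gene) -> bool:
    """A gene is kept if it has a taxonomic or a functional annotation."""
    return gene.taxonomy is not None or gene.function is not None


# Toy data: placeholder annotations, not real BARM content.
genes = [
    Gene("gene_1", taxonomy="example_taxon", function="example_function"),
    Gene("gene_2", taxonomy="example_taxon"),
    Gene("gene_3", function="example_function"),
    Gene("gene_4"),  # no annotation at all, so it is left out
]

included = [g.gene_id for g in genes if is_included(g)]

# Share of predicted genes that were annotated, using the figures given above
# (5.4 million annotated out of 6.8 million predicted).
annotated_fraction = 5.4e6 / 6.8e6
```

With the figures stated above, roughly 79% of the predicted genes carry at least one annotation and are therefore present in the database.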
All data related to BARM, such as gene sequences, taxonomic annotations, functional annotations and quantifications of individual genes, can be downloaded from Figshare. Read sequences and sample metadata for the transect and redoxcline datasets are available from the European Nucleotide Archive, and those for the lmo dataset from the NCBI Sequence Read Archive.
For more detailed information, and to cite BalticMicrobeDB or BARM when you use them in a publication, please refer to the accompanying manuscript (submitted), "BARM and BalticMicrobeDB: a reference metagenome and interface to meta-omic data for the Baltic Sea".