The UK Barley Sequencing Consortium currently has three main targets in progress.
Barley genome browser
To promote widespread awareness of barley genomic resources, the annotated barley genome reference assembly has been integrated within Ensembl Plants, EBI’s portal for genome-scale data, alongside other cereal and plant genomes.
The EBI has a strong track record in providing high-volume, high-availability data services and is the European partner in many of the leading international database collaborations in molecular biology. It is therefore well placed to provide an integrated service in which the annotated barley genome is served to users alongside the resources used to assemble and annotate it (raw DNA and RNA sequence, reference proteins and protein domains, etc.). The Ensembl software framework is under continuous development and, although originally developed in the context of the human genome project, has since been successfully applied to genomes from across the taxonomic range.
Genome assembly, variant discovery and annotation
Sequencing and assembly of chromosome 2H
A cost-effective, high-throughput, transposase-based pipeline to sequence BACs from a minimal tile path is being used at TGAC. BACs are grown and processed in 384-well format, and a 48 x 48 dual-index barcode scheme is employed so that 2304 BACs can be pooled in a single lane of an Illumina HiSeq 2500. We are using this approach to sequence the minimal tile path of barley chromosome 2H. The individually barcoded, sequenced BACs are passed through a bioinformatics pipeline for assembly and post-filtering. We aim to improve the assemblies further by pooling BAC DNA from each 384-well plate, constructing mate-pair libraries and using these data to scaffold contigs.
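The pooling arithmetic above can be illustrated with a minimal Python sketch. It is not the TGAC pipeline: the index labels (i7_01, i5_01, ...), the plate-by-plate assignment order and the A01 to P24 well naming are assumptions made only for illustration. It shows that 48 x 48 index pairs give 2304 unique barcodes, which is enough to label every well of six 384-well plates.

```python
from itertools import product

N_INDEX_A = 48          # first index set (labels below are hypothetical)
N_INDEX_B = 48          # second index set
WELLS_PER_PLATE = 384   # 16 rows (A-P) x 24 columns
COLUMNS = 24

# Hypothetical index names; the real index sequences are not given in the text.
index_a = [f"i7_{n:02d}" for n in range(1, N_INDEX_A + 1)]
index_b = [f"i5_{n:02d}" for n in range(1, N_INDEX_B + 1)]

# Every combination of one index from each set is a unique dual-index barcode.
pairs = list(product(index_a, index_b))


def well_name(well_index):
    """Convert a 0-based well index into a name such as 'A01' or 'P24'."""
    row, col = divmod(well_index, COLUMNS)
    return f"{chr(ord('A') + row)}{col + 1:02d}"


def assign_barcodes():
    """Map (plate number, well name) to a barcode pair, filling plate by plate."""
    layout = {}
    for n, pair in enumerate(pairs):
        plate, well_index = divmod(n, WELLS_PER_PLATE)
        layout[(plate + 1, well_name(well_index))] = pair
    return layout


layout = assign_barcodes()
n_plates = len(pairs) // WELLS_PER_PLATE
print(f"{len(pairs)} barcode pairs cover {n_plates} plates of {WELLS_PER_PLATE} wells")
```

Because each well carries a distinct index pair, reads from the pooled lane can be assigned back to their BAC of origin before each BAC is assembled separately.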
From exome sequencing to variant discovery in barley
Exome capture reduces the cost of genome sequencing by targeting only the coding portion of the genome. We are using the exome capture developed by Mascher et al. (2013) to sequence the exomes of 400 barley varieties and their wild relatives. The variation identified by this exome-based approach can be used to develop molecular markers for barley breeding and SNP chips for high-throughput genotyping.
Generation of an extensive transcriptome reference atlas
Transcriptome sequencing using second-generation sequencing technology allows gene structures to be annotated by defining coding and non-coding regions, thereby validating gene model predictions. Transcriptional and post-transcriptional variation, such as differences in start site position, alternative splicing and polyadenylation, can be recognised when reads are aligned to a reference genome or assembled de novo.
RNA has been obtained from 16 different tissues and/or stages of development of barley (cultivar Morex) and will provide a reference “atlas” of gene expression. RNAseq data have been generated from triplicate biological samples using Illumina HiSeq technology at TGAC. This extends the published data set (Nature 491, 711-716, 2012) and is the most extensive quantitative set of RNAseq data available for barley. Preliminary differential expression results from this experiment, covering only 8 stages of development, can be seen in the morexGenes database.
Using this dataset, we are characterising RNA splicing events. To validate in silico predictions of alternative splicing events, we are using an established quantitative RT-PCR procedure (Simpson et al., 2008).
These methodologies represent a novel genome-wide component of our analyses, which we will cross-reference to data from other cereal genomes to detect conserved and unique alternative splicing events and other patterns of gene expression.