Work package 4, ‘Data access and analysis’, will advance the data landscape for wheat comparative analysis through new sequencing experiments and the development of an informatics infrastructure that provides a comprehensive, federated and open data portal.
We will generate the underpinning data, and subsequently develop the tools and resources, to form the UK national node of the global Wheat Information System, in order to connect with and bridge the gaps in current applications. Enabling computational analysis and interrogation for breeders and biologists, alongside the capture, curation and integration of reference data, will promote a ‘genomic supermarket’ to federate the wheat datasets required for large scale complex comparative analyses, and the dissemination of these data back to the research community. All data within DFW will be freely available with no restrictions.
Wheat research produces a huge amount of data, not only because wheat has a very large and complex genome, but also because of its importance as a UK and international crop. Doing science with wheat can be difficult if the data are not easy to access and reuse in other crop research, so this programme will make sure researchers can access these data openly and easily, and can make their own data available in the same way. This will provide both scientists and industry professionals with the tools and infrastructure they need to carry out research into improving wheat yields and resistance for the future.
Wheat has been extremely successful at adapting to different environments and is grown throughout the world, occupying more territory than any other crop. Understanding the diversity of cultivated wheat that allowed it to adapt rapidly to diverse environments is key for the future of wheat improvement. Traditionally, breeders select for traits observed at the whole-plant level.
Using modern sequencing techniques and bioinformatics applications to link subcellular regulation of plant development with trait expression will accelerate breeding programmes. Furthermore, linking rapid and precise methods for recording the physical characteristics of a plant using new technology (“phenotyping”) with our genomic landscape of a range of commercially important wheat varieties (“genotyping”) will be vital to the advances needed to supply the global population with food.
We will work with other large-scale data generation projects for wheat, such as those proposed in the Global Challenges ODA. This will involve the coordination of delivery of integrated datasets within proposed timescales, through effectively managed analysis and dissemination. Given the complexity of the wheat genome, new and improved analytical tools are needed to enable researchers and breeders to integrate and interrogate data for the functional analysis of diversity within the Triticeae lineage.
Key objectives for this topic include the generation of genomic resources, information about gene regulation, development of computational biology tools, and the assessment of plant physiology through molecular and in-field phenotyping.
National and international wheat research is generating ever-increasing quantities of data across a diverse landscape of data types, such as multiple genome and transcriptome sequences, markers, phenotyping data, and epigenetic and structural information. As such, there is an increasing need for effective computational analysis and for access to high-performance computing resources to integrate and harness disparate reference data. We will use our Grassroots information infrastructure (https://grassroots/tools/) to form the backbone of the data platform, connecting to vital community resources such as Ensembl Plants, CerealsDB, KnetMiner and SeedStor. This will give wheat researchers a single point of access for DFW wheat data, whilst supporting the development and community spirit of the individual data resources.
Objectives for this topic include data coordination, facilitating integrative analysis and the implementation of a wheat data hub.