Publication details

Naz, S 2017, 'Development of interactive statistical modules and workflows for exploration of diverse sets of crop phenotyping data', MSc thesis, Southern Cross University, Lismore, NSW.

Copyright Crops For the Future (CFF) 2017


The objective of the research project was to develop and refine interactive statistical analysis tools for a generic crop genetics data schema.

A wide range of public domain databases has been generated to manage data describing different aspects of plant characteristics, taxonomy, biology, genetics and genomics. These datasets can be used to underpin cultivar development as well as to assist in genetic resource conservation, breeding and pre-breeding. A shortage of precise tools for phenotypic data analysis is a key limiting factor in rapid crop improvement that makes use of all available germplasm. A parsimonious but comprehensive data structure with rich trait and phenotypic content, along with easy and open access to datasets and interactive tools, is desirable to encourage interaction between researchers and breeders. Moreover, informed access to integrated datasets will help develop a deeper understanding of phenotypic characteristics.

In this project, four tools were developed using the open source software R and Shiny for crop phenotypic data exploration, navigation, comparison and analysis. The CropStoreDB (Love et al. 2012) database was used as an example database in which to implement these tools. CropStoreDB is a MySQL-based database that manages genetics data and holds raw datasets together with relevant metadata, versions and descriptions. The tools are generic and can be implemented with other similar databases and plant species, for example wheat and corn. Open source software was used to develop interactive workflows for plant trait and phenotypic data, which enhances data navigation and offers real-time analysis tools. The interactive analysis toolkit will enable researchers and crop plant breeders to understand how existing varieties compare with the available variation in the underlying genepool, and will contribute to efficient breeding of new varieties with improved quality and adaptation to the growing environment. The tools are intended to serve as an adaptable blueprint for future flexible interfaces which will be developed as part of CropStoreDB (