Knowing the precise anatomical location of gene expression is essential to the study of gene function. In most model organisms, this is usually determined by researchers visually inspecting gene expression images. Computational analysis of gene expression has been developed in several model organisms, notably Drosophila, whose embryos have a uniform shape and outline in the early stages of development. Here we address the challenge of computational analysis of gene expression in Xenopus, where the developmental stages of interest span a wide range of embryo sizes and shapes. Embryo orientation may differ between images and, in addition, embryos have a pigmented epidermis that can mask or confuse underlying gene expression. We report the development of a set of computational tools capable of processing large image sets with variable characteristics. These tools efficiently separate the Xenopus embryo from the background, separately identify histochemically stained and naturally pigmented regions within the embryo, and sort images of the same gene and developmental stage by similarity of gene expression pattern without information about relative orientation. We tested these methods on a large but highly redundant collection of 33,289 in situ hybridization images, which allowed us to select representative images of expression patterns at different embryo orientations and, in turn, to put a much smaller subset of these images into the public domain in an effective manner. The 'isimage' module and the accompanying scripts are implemented in Python and are freely available at https://pypi.python.org/pypi/isimage/.
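To make the idea of comparing expression patterns without knowing relative orientation concrete, the following is a minimal sketch of one possible rotation-invariant descriptor. It is not the 'isimage' API or the method used in the paper: the label values, the function names (`radial_profile`, `profile_distance`, `rank_by_similarity`, `rotate90`) and the ring-based descriptor itself are assumptions made for illustration. It assumes each image has already been segmented into background, unstained embryo and stained embryo pixels.

```python
import math

# Hypothetical label values for a pre-segmented image (assumption, not from isimage):
# 0 = background, 1 = embryo without stain, 2 = embryo with histochemical stain.
EMBRYO, STAINED = 1, 2


def radial_profile(labels, n_bins=3):
    """Fraction of stained pixels in concentric rings around the embryo centroid.

    Distances from the centroid do not change when the embryo is rotated
    in the image plane, so the profile ignores in-plane orientation.
    """
    pixels = [(r, c, v) for r, row in enumerate(labels)
              for c, v in enumerate(row) if v != 0]
    if not pixels:
        raise ValueError("no embryo pixels in label image")
    cy = sum(r for r, _, _ in pixels) / len(pixels)
    cx = sum(c for _, c, _ in pixels) / len(pixels)
    r_max = max(math.hypot(r - cy, c - cx) for r, c, _ in pixels) or 1.0
    total = [0] * n_bins
    stained = [0] * n_bins
    for r, c, v in pixels:
        # Ring index grows with distance; the outermost pixels are clamped
        # into the last ring.
        ring = min(int(n_bins * math.hypot(r - cy, c - cx) / r_max), n_bins - 1)
        total[ring] += 1
        if v == STAINED:
            stained[ring] += 1
    return [s / t if t else 0.0 for s, t in zip(stained, total)]


def profile_distance(p, q):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_by_similarity(reference, candidates):
    """Indices of candidate profiles, most similar to the reference first."""
    return sorted(range(len(candidates)),
                  key=lambda i: profile_distance(reference, candidates[i]))


def rotate90(labels):
    """Rotate a label image by 90 degrees clockwise (used only for the demo)."""
    return [list(row) for row in zip(*labels[::-1])]
```

With this descriptor, a label image and its 90-degree rotation produce the same profile, while an image stained near the embryo margin instead of the centre lies far from both. The sketch only handles in-plane rotation; it does not relate different views of the embryo (for example dorsal versus lateral) and ignores natural pigmentation.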
PLoS Comput Biol
Animals; Computational Biology; Data Curation; Embryo, Nonmammalian; Gene Expression; Gene Expression Profiling; Gene Expression Regulation, Developmental; Image Processing, Computer-Assisted; In Situ Hybridization; In Situ Hybridization, Fluorescence; Software; Transcriptome; Xenopus laevis