The seed storage globulins, found in virtually all spermatophytes, comprise a multi-gene family of proteins with ancient evolutionary origins. The two main groups of storage globulins are the legumins (11S) and the vicilins (7S), both of which play a major role in protein deposition and storage in the seed endosperm. Composed of two cupin domains (bicupin), these proteins have recently been noted not only for the close structural relationship between the two subfamilies (7S and 11S) but also for their similarity to other proteins such as germin-like proteins (GLPs), bacterial oxalate decarboxylases, and other cupin-containing proteins. Previous studies have investigated the evolutionary relationships between the legumin and vicilin groups, as well as their presumed evolutionary link to other cupin-containing proteins; however, each has fallen short of a comprehensive, resolved evolutionary history of the globulin family. This study focuses first on resolving the relationships within the cupin super-family in relation to the storage globulins and the GLPs, which have been postulated to be the single-domain ancestors of the bicupin storage globulins. Nucleotide coding sequences for both the N-terminal and C-terminal cupin domains of the storage globulins, including conserved non-cupin-domain helical repeats and inter-domain spacers, were aligned to a comparably sized set of single-cupin coding sequences (CDS). The phylogenetic relationships among the two globulin domains and the single-cupin genes were elucidated using Bayesian inference of tree likelihoods. Further phylogenetic analysis was performed on the complete CDSs of all storage globulin sequences in the study, using an appropriate out-group of similar overall domain architecture, determined by the overall topology of the cupin super-family.
This globulin multi-gene tree was used, along with an alignment corresponding to structurally resolved portions of the mature globulin peptides, to analyze patterns of selection among the various lineages of cupin-containing globulins. The results of these analyses provide evidence for a common origin of all cupin-containing genes. The GLP and storage globulin domains do not appear to be immediate ancestors of one another, but are grouped with the fungal spherulins as well, suggesting that the single-cupin genes that gave rise to these groups had already diverged prior to the rise of land plants. The storage globulin gene tree supports the notion that true legumins and vicilins were recruited as seed storage proteins independently of one another, after their divergence: they form two separate groups, each with basal non-storage 11S/7S-like proteins. Additional insight into the differentiating selection pressures provides a clearer picture of how similar suites of physicochemical properties came under selection after the recruitment of the 11S and 7S families as seed-specific proteins. Regions under strong destabilizing selection correspond to regions known to be important to the overall structure of storage globulins. Strong destabilizing selection at the pore of the globulin subunit suggests that this region may have undergone more functional diversification among the legumins and vicilins than previously thought.



College and Department

Life Sciences; Plant and Wildlife Sciences



Date Submitted


Document Type





Keywords

domain duplication, gene duplication, globulin, storage protein, molecular evolution, phylogenetics, natural selection