Complex genomic diseases are well known to be affected by environmental factors. Despite the obvious contribution of both Genotype and Environment to Phenotype (G x E = P), current efforts to understand the etiology of complex diseases have focused mainly on G, ignoring the E part. This is not surprising, since environmental variables such as diet, education, and exposure to pollution are not easy to measure, let alone encode in a computationally friendly format.
This difficulty in measuring E, however, does not justify its conspicuous absence from Genome Wide Association Studies (GWAS) and other genomic studies. A proper controlled vocabulary or ontology that describes environmental phenomena in a form amenable to computation should help overcome the barriers to tracking and annotating E.
In the field of Genome Medicine, ontologies have been extremely useful for characterizing phenotypic traits in patients. For example, the London Dysmorphology Database, the Human Phenotype Ontology and the Gene Ontology have all proven invaluable for describing clinical traits, molecular processes, biological function, etc.
The current lack of environmental health factors in genome association studies is comparable to the situation years ago, when diseases were being characterized without formal descriptions of phenotypes. Unfortunately, efforts to include E seem to have been rather limited so far, and there has not been enough drive to encode it formally. A simple Google search shows no ontologies applied to E, nor is E included in the NCBO BioPortal for anything related to health and disease.
Collecting and encoding environmental factors in a formal, structured ontology should help shed light on the missing heritability of complex diseases. Let’s hope that such efforts are soon embraced. Perhaps such an initiative could help reverse the traditional tendency to ignore the E part of the G x E = P equation.