Next Generation Sequencing (NGS) promises to revolutionize our ability to diagnose genetic disorders. Fuelled by the exponential decrease in the cost of sequencing, NGS can now be outsourced, making it accessible to labs with modest budgets. A personal exome (the sum of all coding regions in a genome) is currently priced at $999 by some providers. Although not as comprehensive as whole genome sequencing, exomes can shed light on causative mutations that lie within genes. (by SarahKusala, CC-BY 3.0)
Getting the raw sequence data is the easy part. The challenging part is to extract the genomic variation from the raw data and interpret it clinically. The extraction of variants from raw NGS data is influenced by many factors, such as the sequence read depth, the alignment of reads and the variant calling algorithm. To find the variants that may be of clinical relevance, filtering is required. This filtering may be performed by comparing the patient's variants against the “normal” variation catalogued by the 1000 Genomes Project and dbSNP. Depending on the length of the mutation, there are three main kinds of variants: SNPs, indels and CNVs. SNPs are single point mutations (one DNA base), indels are insertions or deletions of up to about 1 kb, and CNVs are deletions or duplications from 1 kb to many megabases long.
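The size classes and the filtering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Variant` record, the `classify` and `filter_novel` helpers and the toy `known_common` set are hypothetical names invented here, alleles are written out explicitly (large events are often encoded differently in practice), and a real pipeline would load its catalogue from dbSNP or 1000 Genomes files rather than a hand-written set.

```python
from typing import Iterable, List, NamedTuple, Set

KB = 1000  # the approximate indel/CNV boundary quoted in the text


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position of the first reference base
    ref: str  # reference allele
    alt: str  # alternate allele


def classify(v: Variant) -> str:
    """Assign a size class using the rough 1 kb boundary."""
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNP"
    size = abs(len(v.ref) - len(v.alt))
    if size == 0:
        return "other"  # e.g. a multi-base substitution
    return "indel" if size < KB else "CNV"


def filter_novel(variants: Iterable[Variant], known: Set[Variant]) -> List[Variant]:
    """Keep only the variants that are absent from the catalogue of known variation."""
    return [v for v in variants if v not in known]


# Toy data: a hypothetical stand-in for a catalogue of common variation.
known_common = {Variant("1", 1000, "A", "G")}
patient = [
    Variant("1", 1000, "A", "G"),   # present in the catalogue, filtered out
    Variant("2", 500, "C", "T"),    # novel SNP
    Variant("3", 42, "ACGT", "A"),  # novel 3-base deletion
]

candidates = filter_novel(patient, known_common)
for v in candidates:
    print(v.chrom, v.pos, classify(v))
```

Matching on the exact chromosome, position and alleles is the simplest possible comparison; it ignores allele frequency, which a clinical filter would normally take into account.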
It is well known, however, that many SNPs fall far from genes and can still cause phenotypic effects. For coding regions, though, many pieces of software have been developed to predict the effects of small mutations: premature stop codons, missense changes and, for small insertions or deletions, frameshifts.
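To make the idea of effect prediction concrete, here is a minimal sketch of how a single-base change in a coding sequence could be labelled. It is not how any particular prediction tool works: `snp_effect` and `is_frameshift` are hypothetical helpers, the coding sequence is assumed to be in frame and on the forward strand, and only the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def snp_effect(cds: str, offset: int, alt: str) -> str:
    """Label a substitution at a 0-based offset of an in-frame coding sequence."""
    start = offset - offset % 3
    ref_codon = cds[start:start + 3]
    pos_in_codon = offset - start
    alt_codon = ref_codon[:pos_in_codon] + alt + ref_codon[pos_in_codon + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop gained"
    if ref_aa == "*":
        return "stop lost"
    return "missense"


def is_frameshift(ref: str, alt: str) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return (len(ref) - len(alt)) % 3 != 0


cds = "ATGGAAGTGTAA"  # toy sequence encoding Met-Glu-Val-stop
print(snp_effect(cds, 3, "T"))  # GAA -> TAA
print(snp_effect(cds, 4, "T"))  # GAA -> GTA
print(snp_effect(cds, 5, "G"))  # GAA -> GAG
print(is_frameshift("ACGT", "A"), is_frameshift("AC", "A"))
```

Real predictors also have to deal with splice sites, multiple transcripts and strand orientation, none of which this sketch attempts.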
Indels and CNVs are slightly harder to interpret clinically. CNVs can encompass many genes, and their phenotypic effect cannot be clearly established unless several patients have been observed with a similar CNV. It is not uncommon for a healthy individual to carry hundreds of indels and CNVs.
One of the most important challenges in implementing genomics in the clinic will be how to deal with the huge amounts of data produced. A great number of patients will be sequenced, each of them yielding a huge number of genomic features of unknown significance. Because confidently interpreting a rare variant requires evidence from several patients, another big challenge is how this information is going to be shared. A lot more data about a patient means that the chances of personal identification increase, even if the information is anonymized. Even with a few of the tests routinely carried out today, it is possible to uniquely identify a person from only a small number of SNPs. Imagine what is possible when one holds thousands of genomic variants from one patient.
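A back-of-the-envelope calculation gives a rough sense of why so few SNPs can single out a person. The sketch below rests on strong simplifying assumptions that are not from the text: the SNPs are independent and biallelic, every SNP has the same allele frequency, genotypes follow Hardy-Weinberg proportions, and the individuals are unrelated. The function names are hypothetical.

```python
def genotype_match_probability(p: float) -> float:
    """Chance that two unrelated people share a genotype at one biallelic SNP."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)  # Hardy-Weinberg proportions
    return sum(f * f for f in genotype_freqs)


def snps_needed(population: int, p: float = 0.5) -> int:
    """Smallest number of independent SNPs for which the expected number of
    people in the population matching a given profile drops below one."""
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency must be strictly between 0 and 1")
    m = genotype_match_probability(p)
    n = 0
    while population * m ** n >= 1.0:
        n += 1
    return n


print(genotype_match_probability(0.5))
print(snps_needed(7_000_000_000))
```

Because the match probability shrinks geometrically with every added SNP, the required panel grows only with the logarithm of the population size, which is why a profile of thousands of variants is so identifying.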
Moreover, if these data are to be shared, another big challenge is how they will be compared. Different labs have different Quality Control (QC) standards and different platforms. Each sequencing run may have a different read depth and a different level of confidence that a called variant is true. Another issue is how phenotypes will be annotated. Phenotype ontologies such as the Human Phenotype Ontology allow a reasonably complete set of clinical descriptions. Nevertheless, there is no guarantee that phenotypic descriptions, even when they use the same ontology, will have the same level of annotation. All these factors will need consideration when interpreting NGS in the clinic.
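One simple way to make calls from different labs more comparable is to apply the same minimum read depth and quality to all of them before pooling. The sketch below illustrates that idea only; the `Call` record, the helper names and both thresholds are hypothetical choices made for this example, not recommended values.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_DEPTH = 20      # hypothetical shared threshold (reads covering the site)
MIN_QUALITY = 30.0  # hypothetical shared threshold (Phred-scaled)


@dataclass(frozen=True)
class Call:
    variant_id: str
    lab: str
    depth: int      # number of reads covering the variant site
    quality: float  # Phred-scaled confidence in the call


def phred_to_error(quality: float) -> float:
    """Convert a Phred-scaled quality into the probability that the call is wrong."""
    return 10 ** (-quality / 10)


def passes_shared_qc(call: Call) -> bool:
    """Apply the same thresholds to every call, whichever lab produced it."""
    return call.depth >= MIN_DEPTH and call.quality >= MIN_QUALITY


def comparable_calls(calls: Iterable[Call]) -> List[Call]:
    """Keep only the calls that meet the shared thresholds."""
    return [c for c in calls if passes_shared_qc(c)]


calls = [
    Call("var1", "lab_A", depth=55, quality=48.0),
    Call("var1", "lab_B", depth=12, quality=45.0),  # too shallow
    Call("var2", "lab_B", depth=40, quality=18.0),  # too uncertain
]
for c in comparable_calls(calls):
    print(c.variant_id, c.lab, phred_to_error(c.quality))
```

A shared threshold discards data from the less stringent lab rather than correcting it, so it trades sensitivity for comparability.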
One of the main hurdles to bringing NGS into the clinic can also be a country's health system. The UK seems, for now, to have been able to bring many state-funded clinical labs to work together. Unfortunately, this would be unthinkable in countries like Spain, where instead of a single health system there are 17, one for each autonomous region. Sequencing technologies require many different sectors to coordinate in order to set up platforms that guarantee access to the technology, its proper interpretation and the protection of the patient’s privacy.
The other side of the coin is that these technologies are going to become increasingly affordable, not just for rich countries but also for emerging ones. This accessibility will make the technology ubiquitous in labs around the world, not just in those looking to diagnose patients with genomic disorders. Expect sequencing to be performed routinely on cancer tissues and even at birth. Based on current estimates, it is likely that by 2020 there will be hundreds of millions of genomes sequenced.
Sequencing is going to revolutionize clinical practice. The degree to which it does so depends on how we meet the challenges described above. There will be technical problems, but also institutional ones that are harder to solve. The race to harness NGS in the clinical setting is on.