Bastian Greshake is a cofounder of openSNP, a recently launched open source project that aims to collect genotypes from Direct-to-Consumer (DTC) genetic testing. Customers who wish to donate their genetic data for further analysis now have a place to do so. openSNP also lets users add phenotypes and offers some user-friendly search interfaces. Users who donate their data to openSNP automatically release it into the public domain. For a project that was started in June 2011 by two Master's students, the work carried out so far shows great promise. Here readers have a chance to learn how they did it. [Manuel]
Manuel was kind enough to let me write a short blog post about the openSNP project, which I launched together with a friend a couple of weeks ago. Instead of boring you with too many technical details, I want to use this space to give you a short history of how we got the idea for the project and why we pursued it. Personally, I’ve been fascinated with Direct-to-Consumer (DTC) genetic testing for quite a while, namely since 23andMe started, and I always wanted to get my hands on my personal data. During one of their sales in April of this year they dropped the price to a level that made it affordable even to students like myself. So I took the opportunity, and about a month later I got my results back. I immediately started to play around with the raw data that was delivered.
Much like Manuel, who tried to learn about his risks and how they affect his family, I wondered what kind of information I could find about my parents by analyzing my results. So I started to look for my homozygous SNPs and for the heterozygous ones that showed a higher or lower risk of a disease. Next I looked around the web for more DTC data to play with, and found a list of 23andMe genotypes on SNPedia that helped me a lot. I was somewhat disappointed, however, that this list included some broken links. The links had to be checked by hand at regular intervals to find the latest data, and the list had no further information about the people who uploaded the data. And who knows how many people have already published their results without being listed there?
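As a rough illustration of that first step, sorting genotype calls into homozygous and heterozygous ones, here is a minimal Python sketch. It assumes the raw data is a tab-separated text file with the columns rsid, chromosome, position and genotype, and with comment lines starting with `#`. The function names `classify` and `read_genotypes` are made up for this example, and the sample rows are illustrative only, not anyone's real results.

```python
import io
from collections import Counter

# Illustrative sample in the assumed raw-data layout (not real results).
SAMPLE = """\
# rsid\tchromosome\tposition\tgenotype
rs4477212\t1\t82154\tAA
rs3094315\t1\t752566\tAG
rs3131972\t1\t752721\tGG
rs12124819\t1\t776546\t--
rs11240777\t1\t798959\tAG
"""


def classify(genotype):
    """Return 'homozygous', 'heterozygous' or 'other' for one genotype call."""
    if len(genotype) == 2 and genotype.isalpha():
        return "homozygous" if genotype[0] == genotype[1] else "heterozygous"
    # No-calls such as "--" and single-allele calls end up here.
    return "other"


def read_genotypes(handle):
    """Yield (rsid, chromosome, position, genotype) for each data line."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        rsid, chromosome, position, genotype = line.split("\t")
        yield rsid, chromosome, int(position), genotype


counts = Counter(
    classify(genotype)
    for _, _, _, genotype in read_genotypes(io.StringIO(SAMPLE))
)
print(counts)
```

To run this on a real file, replace `io.StringIO(SAMPLE)` with an opened file handle. Linking the heterozygous calls to disease risk would additionally require annotation data, for example from SNPedia, which this sketch does not attempt.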
Because of this, in June of this year I started to work on a small repository that could host DTC raw data, and I found some friends who were willing to help me with the project. While we worked on it we had some more ideas: we added basic functionality for entering phenotypic information, and we decided that it would be great if people had access to the latest literature on the genetic risk factors they have been tested for. So we started to add information from SNPedia, the Public Library of Science, and the crowd-sourced database of Mendeley to enhance the user experience.
About three months later, at the end of September, we published the first version of our application. Our vision for openSNP, “Crowdsourcing Genome Wide Association Studies”, may be a bit over the top right now, as the number of users is still quite small. But that is basically what we want to achieve: to create a public resource of genetic information that ultimately should be used to create new knowledge. We are really happy with how things have turned out so far. In less than four weeks, over 30 people have uploaded their test results, some bugs have been fixed, and work has started on new features that will hopefully be implemented in the near future.
Direct-to-Consumer genetic testing is here to stay. With exome and whole genome sequencing already becoming available as services, there is a great need for platforms that make results and knowledge available, both to the interested public and to scientists. Hopefully we can help to make this happen.