I have the pleasure of presenting a tech talk at the Barcelona NGS’17 conference on the 3rd of April.
Here is the abstract of my presentation:
Repositive: A One-stop Shop for Finding and Accessing Genomics Data
Common practice suggests that genome data of human origin should be deposited in public repositories for reuse. Finding and accessing deposited genome datasets, however, is cumbersome: data and metadata are scattered across the internet, annotated inconsistently, and often not machine-readable. The result is a huge loss of opportunities and a waste of resources and research investment. The Repositive platform is an online portal and user community that facilitates finding, accessing, and sharing published genomic data: a one-stop shop for discovering and exploring the genomic datasets most relevant to a research question. Repositive holds descriptions and metadata for deposited datasets from hundreds of data sources around the world. Its interface uses social networking features to crowdsource the curation of dataset metadata. Repositive currently indexes more than 1 million genomic datasets, covering population studies, microbiomes, methylomes, and other NGS data. Datasets are further classified into curated collections. The Chinese Control Data Collection, for instance, indexes 10 datasets from more than 600,000 individuals of Chinese and other Asian ethnicities, including data from healthy, diseased, and reference individuals; some are open access and some require data access agreements. With the Repositive platform, users can find all published genome datasets and understand the genomic data landscape for a particular disease or condition, which drastically lowers the barrier to genomic data access.