Remapping from NCBI36/hg18 to GRCh37/hg19


Given how often colleagues at work ask me about remapping features to another assembly, here is an adapted guide to remapping a feature from NCBI36/hg18 to GRCh37/hg19 using UCSC’s liftOver tool.

Important:

Please make sure you know in advance which assembly your aberration data is currently mapped to. If you mistakenly run data that is already in GRCh37 through the NCBI36-to-GRCh37 conversion, the tool will still return new coordinates for the region, and those coordinates will be wrong.
UCSC’s Genome Browser provides a web facility to convert coordinates from one assembly into another. To convert coordinates using their liftOver tool do the following:

  1. Make sure that your data is in BED format, e.g. “chr3     100000  999990  myPatientId0000123” (an aberration in NCBI36/hg18).
  2. Note that the fields on each line are separated by tabs and each line ends with a line break. Please follow this strictly or the remapping tool may throw an error.
  3. Add as many lines as aberrations you would like to remap.
  4. Go to the liftOver page.
  5. Select “Original Assembly” Mar. 2006 (NCBI36/hg18) and “New Assembly” Feb. 2009 (GRCh37/hg19).
  6. Leave all other parameters (Minimum ratio of bases that must remap, etc.) at their default values.
  7. Paste your aberrations into the input box labelled “Paste in data” and hit submit.
  8. To get results, scroll down the page and click on the “View Conversions” link.
  9. Here is the result I get:
chr3  125000      1024990     myPatientId0000123
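
If you have many aberrations, typing tab-separated lines by hand is error-prone. Here is a minimal Python sketch that builds strictly tab-separated BED lines ready to paste into the input box. The function names are my own. The last helper assumes you use the command-line version of liftOver with the hg18ToHg19.over.chain.gz chain file, neither of which is covered in the steps above, and the file names are placeholders.

```python
def bed_line(chrom, start, end, name):
    """Return one tab-separated BED line (without the trailing line break)."""
    start, end = int(start), int(end)
    if start < 0 or end <= start:
        raise ValueError(f"invalid interval: {start}-{end}")
    # UCSC chromosome names carry a "chr" prefix, as in the example above.
    if not chrom.startswith("chr"):
        raise ValueError(f"expected a UCSC-style chromosome name, got {chrom!r}")
    return "\t".join([chrom, str(start), str(end), name])


def bed_text(records):
    """Join records into BED text: one aberration per line, each ending in a newline."""
    return "".join(bed_line(*record) + "\n" for record in records)


def liftover_command(bed_in, chain, bed_out, unmapped):
    """Build the argument list for the command-line tool (assumed to be installed).

    Argument order: liftOver oldFile map.chain newFile unMapped
    """
    return ["liftOver", bed_in, chain, bed_out, unmapped]


if __name__ == "__main__":
    records = [("chr3", 100000, 999990, "myPatientId0000123")]
    # Text to paste into the "Paste in data" box (or to save as a .bed file).
    print(bed_text(records), end="")
    # Placeholder file names; adjust to your own paths.
    print(" ".join(liftover_command(
        "aberrations_hg18.bed", "hg18ToHg19.over.chain.gz",
        "aberrations_hg19.bed", "unmapped.bed")))
```

The sketch only formats and sanity-checks the input. It cannot tell you which assembly your coordinates are in, so the warning at the top still applies.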

Please note that your feature may fail to remap because the region is partially or entirely deleted, or split, in GRCh37. In this case I recommend trying a different start or end position, for example the start or end of alternative probes, until you find a region that maps. Another possibility is to look at the genes in the region in the old assembly and select a region in GRCh37 that includes the same genes as in NCBI36. Each of these solutions requires careful deliberation and may not be applicable to your particular case (e.g. if the genes now sit on different chromosomes, remapping based on genes will not work).
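
To see which features failed and why, you can read liftOver's failure output. The Python sketch below assumes that output lists each unmapped record under a comment line starting with “#” that states the reason (for example “#Partially deleted in new”). Please check this against your own output, as the exact wording is an assumption on my part. The sample text in the code is made up for illustration and is not a real result.

```python
def parse_unmapped(text):
    """Pair each unmapped BED record with the '#' reason line that precedes it."""
    failures = []
    reason = "unknown"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Assumed format: the reason comes first, the record follows it.
            reason = line.lstrip("#").strip()
            continue
        chrom, start, end, *rest = line.split("\t")
        failures.append({
            "chrom": chrom,
            "start": int(start),
            "end": int(end),
            "name": rest[0] if rest else "",
            "reason": reason,
        })
        reason = "unknown"  # reset so a reason is never reused for a later record
    return failures


if __name__ == "__main__":
    # Made-up illustrative input, not an actual liftOver result.
    sample = (
        "#Partially deleted in new\n"
        "chr3\t100000\t999990\tmyPatientId0000123\n"
        "#Split in new\n"
        "chr7\t5000\t9000\tmyPatientId0000456\n"
    )
    for failure in parse_unmapped(sample):
        print(failure["name"], failure["reason"], sep="\t")
```

Knowing the reason helps you choose between the two workarounds above: a partially deleted region may map once you move the start or end, whereas a split region is more likely to need the gene-based approach.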

I hope this is helpful.

Follow Manuel Corpas on Twitter

Categories: Bioinformatics, Tutorials
