Setting up a data server for IGV

The Broad’s Integrative Genomics Viewer (IGV) is a popular tool for visualising genomic data. It can be run with Java Web Start or installed locally. See the downloads page for details.

If you have some genome data you want people to be able to view with IGV, you can set up your own data server. Essentially, all you have to do is put your data files online somewhere (IGV understands HTTP basic auth if you need to password-protect the data) and then create registry files to tell IGV where the data is. There are instructions on the Broad website.

This is how I set our IGV server up:

You need a registry file for each genome you want to provide data for. I’ll use hg18 as an example. Create a registry file for hg18 on your webserver. Mine is in

/srv/www/igv/igv_registry_hg18.php

Corresponding to a URL of

http://my.server.com/igv/igv_registry_hg18.php

The reason these registry files are PHP, rather than just plain text, is that they pull in the contents of the Broad’s registry files as well as pointing to any local datasets or annotation.

It should look something like this:

<?php
// the registry is a plain list of URLs, so set the mime type to plain text
$mtype = "text/plain";
header("Content-Type: $mtype");

// add my datasets to the list
echo "http://my.server.com/igv/igv_data/hg18/datasets.php\n";

// add the Broad datasets to the list
$broad_reg = file_get_contents(
"http://www.broadinstitute.org/igvdata/hg18_dataServerRegistry.txt");
echo "$broad_reg\n";
?>

Which produces:

http://my.server.com/igv/igv_data/hg18/datasets.php
http://www.broadinstitute.org/igvdata/annotations/hg18/hg18_annotations.xml
http://igv.broadinstitute.org/data/hg18/tcga/tcga_external.xml
http://www.broadinstitute.org/igvdata/mmgp_hg18.xml
http://www.broadinstitute.org/igvdata/gcm/gcm.xml
http://www.broadinstitute.org/igvdata/encode/hg18/hg18_encode_color.xml
http://www.broadinstitute.org/igvdata/epigenetics/epigenetics_hg18_public.xml
http://www.broadinstitute.org/igvdata/epigenetics/WilmsTumor/wilms_tumor.xml
http://www.broadinstitute.org/igvdata/1KG/1KG.xml
http://www.broadinstitute.org/igvdata/BodyMap/BodyMap.xml
http://www.broadinstitute.org/igvdata/tutorials/tutorials.xml
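
One thing to watch with the registry script above: file_get_contents() returns false if the Broad's server can't be reached, and any blank lines in the upstream file get passed straight through. Here is a minimal sketch of a more defensive version. The helper name build_registry() is made up for illustration and is not part of IGV or of my original script; it just merges the local URLs with whatever came back from the upstream fetch.

```php
<?php
// Hypothetical helper (name and signature are assumptions for this sketch).
// $localUrls: URLs of your own datasets files.
// $upstream:  the text of the Broad registry, or false if the fetch failed.
function build_registry(array $localUrls, $upstream): string
{
    $lines = $localUrls;
    if (is_string($upstream)) {
        // Split on any newline style and skip blank lines.
        foreach (preg_split('/\R/', $upstream) as $line) {
            $line = trim($line);
            if ($line !== '') {
                $lines[] = $line;
            }
        }
    }
    // One URL per line, with a trailing newline.
    return implode("\n", $lines) . "\n";
}

// Possible use in igv_registry_hg18.php (same URLs as in the script above):
//   header("Content-Type: text/plain");
//   echo build_registry(
//       ["http://my.server.com/igv/igv_data/hg18/datasets.php"],
//       @file_get_contents(
//           "http://www.broadinstitute.org/igvdata/hg18_dataServerRegistry.txt")
//   );
```

With this arrangement, if the Broad registry is unavailable the script still serves the local datasets rather than an empty line.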

As you can see, the registry file points to the URL of a datasets.php file:
http://my.server.com/igv/igv_data/hg18/datasets.php

Corresponding to a php file on my server:
/srv/www/igv/igv_data/hg18/datasets.php

This datasets file stores a list of available datasets on my server for the hg18 genome build.

Rather than just serving up one big datasets.xml file describing all of our hg18 datasets, I prefer to keep the XML description of each dataset in that dataset's directory and have a PHP file that collates them all. It’s easier to manage and more readable.

<?php
// set mime type to xml
$mtype = "text/xml";
header("Content-Type: $mtype");
?>

<?php

// include all the dataset xml descriptions you want.
include("some_data_folder/dataset.xml");
include("some_other_data_folder/dataset.xml");

?>
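
If you don't want to add an include line by hand for every new dataset, the collation can be automated. The following is only a sketch under a couple of assumptions: every dataset lives in its own directory one level below the datasets.php file, and its description is always called dataset.xml. The function name collect_datasets() is hypothetical. It reads the files with file_get_contents() rather than include(), so the XML is passed through as-is instead of being handed to the PHP parser.

```php
<?php
// Hypothetical helper: concatenate every dataset.xml found exactly one
// directory below $baseDir, in alphabetical order of path.
function collect_datasets(string $baseDir): string
{
    $files = glob(rtrim($baseDir, '/') . '/*/dataset.xml');
    if ($files === false) {
        $files = [];
    }
    sort($files);

    $xml = '';
    foreach ($files as $file) {
        $contents = file_get_contents($file);
        if ($contents === false) {
            continue; // skip unreadable files rather than emit partial output
        }
        $xml .= rtrim($contents) . "\n";
    }
    return $xml;
}

// Possible use in datasets.php:
//   header("Content-Type: text/xml");
//   echo collect_datasets(__DIR__);
```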

The individual dataset.xml files should look something like:

<Category name="My Dataset">
<Category name="Control">
<Resource name="Control Reads"
path="http://my.server.com/igv/igv_data/hg18/some_dataset/ctrl_reads.bam"/>
<Resource name="Control Peaks"
path="http://my.server.com/igv/igv_data/hg18/some_dataset/ctrl_peaks.bed"/>
</Category>
<Category name="Treatment">
<Resource name="Treatment Reads"
path="http://my.server.com/igv/igv_data/hg18/some_dataset/treat_reads.bam"/>
<Resource name="Treatment Peaks"
path="http://my.server.com/igv/igv_data/hg18/some_dataset/treat_peaks.bed"/>
</Category>
</Category>

and so on.

Categories can be nested multiple times. You can have as many as you like, containing as many Resources as you need.
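
Because the nesting is regular, a dataset.xml like the one above can also be generated rather than written by hand. This is a rough sketch, not something IGV provides: render_category() is a hypothetical helper that takes a nested PHP array, where an array value becomes a nested Category and a string value becomes a Resource whose path is that string. Names and paths are escaped so characters such as & don't break the XML.

```php
<?php
// Hypothetical helper: render a nested array as Category/Resource XML.
// array value  => nested <Category name="key">
// string value => <Resource name="key" path="value"/>
function render_category(string $name, array $children, int $depth = 0): string
{
    $pad = str_repeat('  ', $depth);
    $esc = function (string $s): string {
        return htmlspecialchars($s, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };

    $xml = $pad . '<Category name="' . $esc($name) . '">' . "\n";
    foreach ($children as $childName => $child) {
        if (is_array($child)) {
            // Recurse for nested categories, indenting one level deeper.
            $xml .= render_category((string) $childName, $child, $depth + 1);
        } else {
            $xml .= $pad . '  <Resource name="' . $esc((string) $childName)
                  . '" path="' . $esc((string) $child) . '"/>' . "\n";
        }
    }
    return $xml . $pad . "</Category>\n";
}

// Example call, reusing names from the listing above:
//   echo render_category('My Dataset', [
//       'Control' => [
//           'Control Reads' =>
//               'http://my.server.com/igv/igv_data/hg18/some_dataset/ctrl_reads.bam',
//       ],
//   ]);
```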

To use your new data server in IGV, start IGV and go to View -> Preferences -> Advanced

Click the Edit Server Properties checkbox and replace the Data Registry URL with http://my.server.com/igv/igv_registry_$$.php

Hit OK.

The $$ is a placeholder for the genome name, so you can create more registries, for example igv_registry_mm9.php, and you won’t have to change this setting.
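
To check which registry files need to exist on your server, you can mimic the substitution yourself. This is just an illustration of the placeholder behaviour described above, not IGV's own code, and registry_url() is a made-up name. Note the single quotes around the template, so that PHP doesn't try to interpret the $$ itself.

```php
<?php
// Hypothetical helper: expand the $$ placeholder for a given genome ID.
function registry_url(string $template, string $genome): string
{
    return str_replace('$$', $genome, $template);
}

// Example call:
//   registry_url('http://my.server.com/igv/igv_registry_$$.php', 'mm9');
```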

Now, if you select the hg18 genome and go to File -> Load From Server, you will be able to select from a list of datasets hosted on the BRC-MH servers, as well as those hosted by the Broad.

One response to “Setting up a data server for IGV”

  1. Nice post! Nice way of using PHP for generating a dynamic XML for IGV. Good job! After reading your post, I used PHP and regex to create a dynamic XML of the bam files on my server! Thanks for sharing…
