biomaRt mapping from human Illumina ID to mouse Ensembl ID

Problem:

Dataset 1: ChIP-seq-derived transcription factor binding sites. Mouse. Mapped to nearest Ensembl Gene ID.

Dataset 2: Human Illumina Ref6 expression array data (GPL6097, I think) from various cell lines with varying amounts of said transcription factor.

Question:

What are the targets of the transcription factor doing in the expression datasets?

Quick Mapping

As a quick approximation, map the human Illumina IDs to their human Ensembl IDs, grab the Ensembl IDs of the homologous mouse genes, and filter to include only those annotated as nearest to a binding site. After that it’s easy to pull out the expression of the binding targets (listing the myriad reasons why the results could well be biologically meaningless is left as an exercise for the reader…).

Reason I ❤ biomaRt


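library(biomaRt)

# human.illumina.ids: Illumina probe IDs from the expression array (Dataset 2)
# tfbs.nearest.mouse.ensembl.ids: mouse Ensembl gene IDs nearest to a binding site (Dataset 1)
ensembl.human <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
ensembl.mouse <- useMart("ensembl", dataset="mmusculus_gene_ensembl")

# Link the two marts: human probes on one side, homologous mouse genes on the other,
# keeping only mouse genes that are in the binding-site list
homologs <- getLDS(
    attributes=c('illumina_v1', 'ensembl_gene_id', 'chromosome_name'),
    filters='illumina_v1',
    values=human.illumina.ids,
    mart=ensembl.human,
    attributesL='ensembl_gene_id',
    filtersL='ensembl_gene_id',
    valuesL=tfbs.nearest.mouse.ensembl.ids,
    martL=ensembl.mouse,
    uniqueRows=TRUE
)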
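
For the last step, pulling out the expression of the binding targets, something along these lines should do. This is only a sketch on toy data: the probe names, gene IDs, expression values and the column names of homologs are all made up for illustration (check colnames(homologs) on a real getLDS result for the actual ones), and expr is a hypothetical probes-by-cell-lines matrix.

```r
# Toy stand-in for the getLDS() result. Column names and IDs are invented;
# a real result will have different column names.
homologs <- data.frame(
    illumina.id   = c("probe_A", "probe_B", "probe_C", "probe_C"),
    human.ensembl = c("ENSG_1", "ENSG_2", "ENSG_3", "ENSG_3"),
    mouse.ensembl = c("ENSMUSG_1", "ENSMUSG_2", "ENSMUSG_3", "ENSMUSG_4"),
    stringsAsFactors = FALSE
)

# Toy expression matrix: rows are probes, columns are cell lines (values invented)
expr <- matrix(
    c(5.1, 7.2, 6.3, 8.0,
      5.4, 7.0, 6.1, 8.2),
    nrow = 4,
    dimnames = list(c("probe_A", "probe_B", "probe_C", "probe_D"),
                    c("line_1", "line_2"))
)

# One probe can map to several mouse homologs, so de-duplicate the probe IDs,
# then keep only the probes that are actually present on the array
target.probes <- unique(homologs$illumina.id)
target.probes <- intersect(target.probes, rownames(expr))

# Expression of the putative binding targets only
target.expr <- expr[target.probes, , drop = FALSE]
print(target.expr)
```

The drop = FALSE keeps the result a matrix even if only one probe survives, which saves a surprise further down the line.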
