
 

miRGator Documentation

Contents

   1. Overview

   2. Target-Function-Expression Module

         2.A. Target navigation

               2.A.a. Input GUI and target summary table

               2.A.b. Output GUI and miRNA-target gene table

               2.A.c. Detailed information on miRNA-target mRNA pair

         2.B. Functional enrichment analysis

               2.B.a. Gene ontology

               2.B.b. Pathways (KEGG, GenMAPP, BioCarta)

               2.B.c. IPA disease ontology

         2.C. Correlated expression analysis

   3. 'miRNA with inferred Function' Module

   4. miRNA Expression Module

         4.A. Expression profiling of a miRNA

         4.B. Differential expression studies

   5. Data sources

 

 

1. Overview

    MicroRNAs (miRNAs) constitute an important class of regulators that are involved in various cellular and disease processes. However, the functional significance of each miRNA is mostly unknown due to the difficulty in identifying target genes and the lack of genome-wide expression data combining miRNAs, mRNAs and proteins. We introduce a novel database, miRGator, that integrates the gene expression data, target prediction, functional analysis and genome annotation.

http://genome.ewha.ac.kr/miRGator/miRGator_logo.bmp

<Schematic overview of miRGator system>

    MiRNA function is inferred from the list of target genes predicted by miRanda, PicTar, and TargetScanS programs. Statistical enrichment test of target genes in each term is performed for gene ontology, pathway, and disease annotations. Associated terms may provide valuable insights for the function of each miRNA. For the expression analysis, miRGator integrates public expression data of miRNA with those of mRNA and protein. Expression correlation between miRNA and target mRNA/proteins is evaluated and their expression patterns can be readily compared. Our web implementation supports diverse query types including miRNA name, gene symbol, gene ontology, pathway, and disease terms. Interfaces for exploring common targets or regulatory miRNAs and for profiling compendium expression data have been developed as well. Currently, miRGator supports the human and mouse genomes.

 

2. Target-Function-Expression Module

This is the main interface of miRGator for examining target genes, inferred functions, and the correlated expression through target prediction.

2.A.  Target Navigation

Target prediction methods and statistics

The available target prediction methods and their statistics are summarized in Table 1. The default choice is miRanda 4.0 from miRBase, since it covers the most recent compendium of miRNAs. The other methods are somewhat outdated and have lower coverage (their genome-wide calculations were performed almost 2 years ago), but their target sites are conserved in more species, which may help in filtering out false positives.

 

TABLE 1. Statistics for various target prediction methods

 

Genome                                Human (hg17)   Human (hg17)   Human (hg17)   Human (hg17)   Mouse (mm7)    Mouse (mm7)    Mouse (mm7)
Method                                miRanda        PicTar-4way    PicTar-5way    TargetScanS    miRanda        PicTar-dog     PicTar-chicken

No. of miRNAs                         470            179            131            139            375            269            249
No. of target genes                   15274          9152           3455           7709           14768          6550           1492
No. of binding sites                  284714         154894         28870          22837          241791         106022         8354
Avg. no. of target genes per miRNA    32.5           51.1           26.4           55.5           39.4           24.3           6.0
Avg. no. of binding sites per miRNA   606            865            220            164            645            394            34
Avg. no. of binding sites per gene    18.6           16.9           8.3            3.0            16.4           16.2           5.7

       Cross-species conservation for each prediction method

        - miRanda (version 4.0): conserved in at least two species
        - PicTar-4way: conserved in 4 species (human, mouse, rat, dog)
        - PicTar-5way: conserved in 5 species (human, mouse, rat, dog, chicken)
        - TargetScanS: conserved in 5 species (human, mouse, rat, dog, chicken)
        - PicTar-dog: conserved in 7 species (mouse, rat, rabbit, human, chimp, macaque, dog)
        - PicTar-chicken: conserved in 13 species (7 species + cow, armadillo, elephant, tenrec, opossum, chicken)
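
The "Avg. no. of binding sites per miRNA" row of Table 1 appears to be the ratio of the binding-site total to the number of miRNAs, rounded to the nearest integer. The following sketch recomputes that row under this assumption; the dictionary layout and function name are illustrative only, and the numbers are copied from Table 1.

```python
# Values copied from Table 1:
# (no. of miRNAs, no. of binding sites, reported avg. binding sites per miRNA)
TABLE1 = {
    "human/miRanda": (470, 284714, 606),
    "human/PicTar-4way": (179, 154894, 865),
    "human/PicTar-5way": (131, 28870, 220),
    "human/TargetScanS": (139, 22837, 164),
    "mouse/miRanda": (375, 241791, 645),
    "mouse/PicTar-dog": (269, 106022, 394),
    "mouse/PicTar-chicken": (249, 8354, 34),
}


def sites_per_mirna(n_mirnas, n_sites):
    """Average number of binding sites per miRNA, rounded to an integer."""
    return round(n_sites / n_mirnas)


# Recompute the row and compare it with the reported values.
recomputed = {
    method: sites_per_mirna(n_mirnas, n_sites)
    for method, (n_mirnas, n_sites, _reported) in TABLE1.items()
}
matches = {method: recomputed[method] == TABLE1[method][2] for method in TABLE1}
```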

 

2.A.a. Input GUI (graphical user interface)

 

Selecting target prediction method(s)

-   Table 1 summarizes the miRNA coverage and cross-species conservation for each method. The average numbers of target genes and binding sites may help in estimating the reliability of each prediction method.

-   The 'Target Summary Table' shows the number of target genes for all miRNAs in miRGator. This can be useful for selecting suitable methods for the miRNAs of interest. Clicking a number in the table displays the list of target genes.

                                <Target Summary Table>

-   Target genes from different methods may be combined using Boolean logic (AND, OR, NOT). For example, one may want to examine only the targets common to miRanda, PicTar, and TargetScanS to reduce false targets.
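
The Boolean combination of target lists maps naturally onto set operations. The sketch below is illustrative only: the gene names are placeholders, and the dictionary is a hypothetical stand-in for the per-method target lists of a single miRNA, not miRGator's actual data structure.

```python
# Hypothetical target-gene sets for one miRNA (placeholder gene names).
targets = {
    "miRanda": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "PicTar": {"GENE_B", "GENE_C", "GENE_E"},
    "TargetScanS": {"GENE_B", "GENE_C", "GENE_F"},
}

# AND: genes predicted by all three methods (fewest false targets).
common = targets["miRanda"] & targets["PicTar"] & targets["TargetScanS"]

# OR: genes predicted by at least one method (highest coverage).
any_method = targets["miRanda"] | targets["PicTar"] | targets["TargetScanS"]

# NOT: genes predicted by miRanda but by neither of the other methods.
miranda_only = targets["miRanda"] - targets["PicTar"] - targets["TargetScanS"]
```

Intersection trades coverage for specificity, which is why it suits the false-target reduction mentioned above.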

Main Search

-  The main search can be initiated either with miRNA(s) or with target gene(s). Gene names may be given as HUGO gene symbols or RefSeq IDs.

-  Multiple genes or miRNAs may be specified. Currently, up to 50 gene names or 5 miRNAs may be selected.

 

2.A.b. Output GUI for target navigation

miRNA-target gene table

                <Output screen and the miRNA-target gene table>

 

-   The main output consists of three major parts, as indicated under 'Quick Links': the miRNA-target gene table, functional enrichment analysis of target genes, and expression correlation analysis of target genes.

-   Binding information is always displayed in the miRNA-target gene table, where the number of binding sites is indicated. Buttons for sorting target genes by the number of binding sites are available, allowing users to focus first on mRNAs with multiple binding sites.

-   Link-outs are provided for detailed information on target genes. Current link-outs include the Entrez Gene, OMIM, UniProt, and Stanford SOURCE databases.

-   Clicking a number in the table leads to detailed information on target binding and the expression correlation pattern for the corresponding miRNA-mRNA pair. A sample screenshot of the output page for miRNA-target mRNA binding and expression correlation is shown below ("Details of miRNA-target mRNA pair").

 

2.A.c. Detailed information on miRNA-target mRNA pair

-  This output page contains the most detailed information on miRNA binding to a specific target mRNA. It consists of four parts: miRNA, target mRNA, miRNA binding site, and expression correlation among miRNA, mRNA, and protein.

-   The link to the UCSC genome browser displays all miRNA binding sites as custom tracks. This feature facilitates comparison of the miRanda result with other predictions. UCSC tracks in the comparative genomics category are also helpful for examining cross-species conservation.

-  Other miRNAs targeting the same mRNA are also summarized, as shown below. Numbers in parentheses are target binding scores.

-  Expression correlation is calculated for the available data types. We use the Pearson correlation coefficient, which ranges from +1 (perfect positive correlation) to -1 (perfect negative correlation). The bar plot below indicates the expression level in various tissues/organs. See the "Correlated expression" section for more details.

 

2.B.  Functional Enrichment Analysis

Functional enrichment analysis of target genes can be performed in three categories: GO, pathway, and disease terms. A simple hypergeometric test of over-representation was carried out for every term in the GO, pathway, and disease classification systems. The output page summarizes the nodes that are significant at a given p-value, and these can be sorted by various criteria.

-   Functional analysis is performed for the target genes of a single miRNA obtained by one prediction method. Note, therefore, that the enrichment test is performed not on the genes listed in the output screen but on those in the target summary table. In other words, all target genes of the selected miRNA predicted by the chosen algorithm are collected and used for the statistical enrichment test. This limitation was unavoidable because all combinations were pre-calculated and stored in the database to speed up the response.

-   The option to exclude terms of single occurrence removes annotation terms that include only one target gene, even if they are statistically significant. With this option, every output term includes at least two target genes.
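
The filtering described above (a p-value cutoff plus optional exclusion of single-occurrence terms) could be sketched as follows. The record type, field names, default cutoff, and sample values are all hypothetical; miRGator's pre-calculated result format is not documented here.

```python
from typing import NamedTuple


class TermResult(NamedTuple):
    """One pre-calculated enrichment result (hypothetical layout)."""
    term: str
    n_targets: int    # number of target genes annotated with this term
    p_value: float


def filter_terms(results, p_cutoff=0.05, exclude_single=True):
    """Keep significant terms, optionally dropping single-occurrence terms."""
    kept = [
        r for r in results
        if r.p_value <= p_cutoff and not (exclude_single and r.n_targets < 2)
    ]
    # Default ordering: most significant term first.
    return sorted(kept, key=lambda r: r.p_value)


results = [
    TermResult("term_1", 14, 1e-4),
    TermResult("term_2", 1, 1e-3),   # significant, but only one target gene
    TermResult("term_3", 5, 0.2),    # not significant
]
kept = filter_terms(results)
```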

 

2.B.a. Gene Ontology

Gene-to-GO mapping was achieved by combining the UCSC kgXref table (known gene to UniProt ID) and GOA association table (UniProt ID to GO nodes) from the GO web site. All GO nodes were tested for statistical enrichment.

  <Partial output of GO enrichment for miR-16 targets predicted by miRanda>

 Output terms can be sorted according to various criteria. Default is the P-value.

 -   Frequency in target gene group (14 out of 922): miRanda predicted 589 target mRNAs (not shown on this page). Those 589 genes were mapped to UniProt proteins (numbers not shown). Those proteins have 922 GO associations (i.e. 922 GO-protein relationships), 14 of which belong to GO node 0000074 (regulation of progression through cell cycle).

 -   Background distribution from all human genes is 632 out of 165128 GO associations. Since one protein may have multiple GO associations, this number significantly exceeds the total number of human proteins.

 -   P-value: obtained from Fisher's exact test on a 2x2 contingency table

 -   Fold ratio: comparison of target genes with the whole-genome background

 Clicking on the blue number (14 out of 922) gives the list of 14 target proteins annotated in this specific GO node. For the pathway and disease nodes, this number indicates the number of target genes, not proteins, since those annotations are based on gene symbols.
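
The numbers in the miR-16 example can be used to illustrate how a fold ratio and a one-sided over-representation p-value might be computed. This is a minimal sketch, not miRGator's implementation: it assumes the fold ratio is the target frequency divided by the background frequency, and it treats the 922 target associations as a draw without replacement from the 165128 background associations (the hypergeometric upper tail, which equals the one-sided Fisher's exact test). The function name is illustrative.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n items are drawn without replacement from N items,
    K of which belong to the node of interest."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


# Numbers from the miR-16 / GO node 0000074 example above.
k, n = 14, 922          # target associations in the node / all target associations
K, N = 632, 165128      # background associations in the node / all background associations

fold_ratio = (k / n) / (K / N)                 # target frequency vs. background frequency
p_value = hypergeom_upper_tail(k, n, K, N)     # one-sided over-representation p-value
```

Exact integer arithmetic via `math.comb` avoids the underflow that a naive floating-point product would hit at these sizes.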

 

2.B.b. Pathway Databases

Genes in the KEGG/GenMAPP/BioCarta pathways were obtained from the ArrayXPath database, which included 65 KEGG, 322 GenMAPP, and 45 BioCarta pathways for human. KEGG and GenMAPP pathways are available for mouse. Note that the pathway databases are not hierarchical, unlike the other two ontology-based annotations.

Output format is almost identical to the GO analysis. Sorting and listing gene names are also supported.

 

2.B.c. Disease Ontology from Ingenuity Pathway Analysis

IPA's gene to disease mapping from Ingenuity Systems was used to test disease enrichment of miRNA targets. IPA's disease classification system consists of more than 7000 terms organized in 3 hierarchical levels of depth. Terms in top nodes are of general nature such as genetic disorder or inflammatory disease. Thus, we performed the enrichment test at the leaf nodes to obtain the specific disease information.

 

2.C.  Correlated Expression Analysis

Expression correlation can be a direct or indirect consequence of miRNA binding/targeting. A reciprocal expression pattern is expected for genuine targets, whereas high correlation between a miRNA and an apparent non-target may indicate indirect targeting. We calculate the Pearson correlation coefficient between each miRNA and its target mRNA, and between the miRNA and the target protein when the data are available. Expression data for the correlation study are strictly limited to those from identical samples.

-   Mouse: Hughes and coworkers generated a series of genome-wide expression data for the mouse genome using homogeneous samples. MiRNA microarray data are available for 78 miRNAs in 17 tissues. Their mRNA expression data cover 55 tissues, and the proteomic data include 4768 proteins in 6 organs. We found 4 common organs for which all expression data (miRNA, mRNA, protein) are available and use those values to evaluate the expression correlation.

-   Human: Golub and coworkers published expression profiles of mRNA and 217 miRNAs in 334 samples. To the best of our knowledge, no global proteomic data in multiple tissues are available for human, so we simply compared the expression profiles of miRNA and mRNA in 11 common tissues.

-   Thus, the expression correlation analysis for mouse covers miRNA, mRNA, and proteins, whereas only the correlation between miRNA and mRNA is available for human. However, the human data cover more diverse tissue types.
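
The correlation over common tissues can be sketched as below. The tissue names and expression values are made up for illustration, and the function names are hypothetical; only the use of the Pearson coefficient restricted to shared tissues comes from the text above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def correlate_common(mirna_expr, mrna_expr):
    """Correlate two tissue -> expression maps over their shared tissues only."""
    tissues = sorted(set(mirna_expr) & set(mrna_expr))
    return pearson([mirna_expr[t] for t in tissues],
                   [mrna_expr[t] for t in tissues])


# Made-up profiles: the mRNA falls as the miRNA rises (a reciprocal pattern).
mirna = {"brain": 8.0, "liver": 2.0, "heart": 4.0, "kidney": 6.0, "testis": 1.0}
mrna = {"brain": 1.0, "liver": 7.0, "heart": 5.0, "kidney": 3.0, "lung": 9.0}

r = correlate_common(mirna, mrna)  # uses brain, heart, kidney, liver only
```

A coefficient near -1, as in this toy case, is the reciprocal pattern expected for a genuine target.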

 

Correlated expression table

Target genes can be sorted by correlation coefficient in descending or ascending order. A link to detailed information on target binding and the correlated expression pattern is also provided for each miRNA-mRNA pair. See the "Detailed information on miRNA-target mRNA pair" section (2.A.c) for more details.

 

3. 'miRNA with inferred function' Module

This is a simple utility for reciprocal search of miRNAs that are statistically enriched in specific nodes of functional annotation categories. Our implementation supports any node id in the GO classification and all pathways. As for the disease search, we support just 29 representative terms as provided in the drop-down box.
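
Conceptually, the reciprocal search inverts the miRNA-to-term enrichment results into a term-to-miRNA index. The sketch below assumes a hypothetical mapping with placeholder miRNA names and term IDs; it does not reflect how miRGator stores its pre-calculated results.

```python
from collections import defaultdict

# Hypothetical enrichment results: miRNA -> set of significantly enriched terms.
enriched_terms = {
    "miR-A": {"TERM_1", "TERM_2"},
    "miR-B": {"TERM_2"},
    "miR-C": {"TERM_2", "TERM_3"},
}


def build_reverse_index(enriched):
    """Invert miRNA -> terms into term -> miRNAs for reciprocal lookup."""
    index = defaultdict(set)
    for mirna, terms in enriched.items():
        for term in terms:
            index[term].add(mirna)
    return dict(index)


reverse_index = build_reverse_index(enriched_terms)
mirnas_for_term_2 = sorted(reverse_index.get("TERM_2", set()))
```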

 

4. miRNA Expression Module

Expression profiling is an important part of functional annotation. The purpose of the miRNA expression profiling module is to visualize miRNA expression and to obtain information on differential regulation in various situations. We integrated various miRNA-related expression data from the GEO database and built a compendium of miRNA expression data, similar in fashion to the Oncomine cancer profiling database. Each data set was analyzed for differentially regulated miRNAs after quantile normalization. A simple interface was developed to visualize the expression profiles of miRNAs and to explore differential regulation in various situations. The current data cover 12 datasets, 106 comparison studies, and 566 samples, most of them human.

 -  This utility is under active development. Documentation below may not reflect recent improvements and changes.
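
For readers unfamiliar with quantile normalization, the following is a minimal sketch of the standard procedure: each sample's values are replaced by the mean of the values that share the same rank across samples. It is not miRGator's code; the function name is illustrative, and ties are broken by position rather than averaged, which is a simplifying assumption.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (each a list of expression values).

    All samples must contain the same number of values, e.g. one per miRNA.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same length")

    # Mean of the i-th smallest value across samples gives the target distribution.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]

    normalized = []
    for s in samples:
        # Indices of this sample's values from smallest to largest.
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Three made-up samples with three miRNAs each.
raw = [[5, 2, 3], [4, 1, 6], [3, 8, 7]]
normalized = quantile_normalize(raw)
```

After normalization every sample has the same distribution of values, while the ranking of miRNAs within each sample is preserved.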

4.A.  Expression profiling of a miRNA

 -  A search by miRNA name leads to the list of comparison studies that include the miRNA of interest. A link to the list of miRNAs with expression profiling information is provided for user convenience. Partial output is shown below for hsa-miR-155.

 

 -  Samples were divided into two classes in each comparison study. We provide two types of plots, the box plot and the bar plot, as shown below.

        

 -  In the box plot, the box spans the lower and upper quartiles (the 25th and 75th percentiles), and the median is indicated inside the box. Refer to Wikipedia for more details (www.wikipedia.org).

 -  The bar plot shows the normalized expression value of all samples. Samples in the two classes are colored differently (red and green). Each vertical bar represents a sample.

 -  The box and bar plots are available whenever a miRNA and two classes of samples are specified.
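
The quantities drawn in a box plot can be computed per sample class as sketched below. The expression values are made up, the function name is illustrative, and the quartile convention (`method="inclusive"`) is an assumption; the convention used by miRGator's plots is not stated.

```python
from statistics import quantiles


def box_stats(values):
    """Minimum, quartiles, median, and maximum for one class of samples."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


# Made-up normalized expression values for the two classes of one study.
class_a = [1, 2, 3, 4, 5]
class_b = [4, 6, 8, 10, 12]

stats_a = box_stats(class_a)
stats_b = box_stats(class_b)
```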

 

4.B. Differential expression studies

All comparison studies were classified into three types of differential expression.

                           

   <tissue/organ>                               <cell type>                             <condition>

Tissue/organ-dependent miRNAs

 -   The tissue/organ group comprises differential expression studies that compare tissues or organs. The currently available tissues/organs are listed in the drop-down box.

 -   Choosing 'lung' shows the list of lung-related comparison studies, as shown below.

 -   Two lung-related differential expression studies were found. The first study compared poorly differentiated lung tumors with a pool of other tissues. The second study compared normal and cancerous lung tissues.

 -   We provide links to tables of up-regulated miRNAs, down-regulated miRNAs, and differentially regulated (up + down) miRNAs. The sample output, shown below, includes links to the box and bar plots, where the expression is visualized in more detail.

 

Condition-dependent miRNAs

 -   These are of particular interest since they may elucidate cancer-related miRNAs. Seven studies (bladder, breast, colon, eye, kidney, lung, and uterus) are available for comparing cancer versus normal tissues.

    

 -   Links to tables and plots are identical to other differential expression studies.

 

 

5. Data Sources

Genome maps: UCSC genome maps of NCBI Build 35 (hg17) and NCBI Build 35 (mm7) were used for the human and mouse genomes, respectively.

 miRNA target predictions

-    Results from three programs (miRanda, PicTar, TargetScanS) are available. Refer to Table 1 for miRNA coverage and statistics.

 -   Genome-wide prediction results from the PicTar and TargetScanS programs were obtained from the UCSC genome browser database. Note that these calculations were performed about 2 years ago and do not cover recently identified miRNAs.

 -   Target genes from miRanda 4.0 were obtained from the miRBase website (Sep. 2007), where the most up-to-date information was available. Targets downloaded on the current genomes (hg18 & mm8) were lifted back to the previous genomes (hg17 & mm7) for consistency in genome version.

 Functional annotations

 -   Gene Ontology (GO) mapping: the GOA association table (UniProt ID to GO nodes) obtained from the GO web site was used for GO enrichment analysis. Target mRNAs were mapped to proteins (UniProt ID) using the UCSC kgXref table.

 -   Genes in the KEGG/GenMAPP/BioCarta pathways were obtained from the ArrayXPath database, which collected these data largely by hand.

 -   Gene-to-disease associations were obtained from the annotation data of Ingenuity Pathway Analysis (IPA). IPA's disease classification system consists of more than 7000 terms organized in 3 hierarchical levels of depth.

 Correlated expression data

 -   Human (Golub and coworkers): mRNA expression and 217 miRNAs in 334 samples. No global proteomic data in multiple tissues are available for human yet. Thus only the expression correlation between miRNA and mRNA is available for human.

 -   Mouse (Hughes and coworkers): MiRNA microarray data are available for 78 miRNAs in 17 tissues. Their mRNA expression data cover 55 tissues and the proteomic data include 4768 proteins in 6 organs.

 -   Refer to the manuscript for references to these expression data.

 miRNA expression profiling

 -   12 GEO datasets for miRNA expression profiles (566 samples)

 -   106 comparison studies were set up for differential expression studies.