Chapter 9

Microarray-Based MicroRNA Expression Data Analysis with Bioconductor
Emilio Mastriani, Rihong Zhai, and Songling Zhu

Abstract
MicroRNAs (miRNAs) are small, noncoding RNAs that regulate the expression of targeted
mRNAs. Thousands of miRNAs have been identified; however, only a few of them have been functionally
annotated. Microarray-based expression analysis represents a cost-effective way to identify candidate
miRNAs that correlate with specific biological pathways, and to detect disease-associated molecular
signatures. Generally, microarray-based miRNA data analysis comprises four major steps: (1) quality control and
normalization, (2) differential expression analysis, (3) target gene prediction, and (4) functional
annotation. For each step, a large number of software tools or packages have been developed. In this chapter, we
present a standard analysis pipeline for miRNA microarray data, assembled from packages mainly developed
in R and hosted in the Bioconductor project.

Key words MicroRNA (miRNA), Bioconductor, R package, Gene expression analysis, Microarray
data analysis

1 Introduction

MicroRNAs (miRNAs) are small, noncoding, conserved RNA
molecules that can inhibit protein expression through
post-transcriptional regulation or translational repression. More than
20,000 different miRNAs have been identified among hundreds
of species [1]. Although miRNAs play important roles in various
biological processes, their function has been well characterized for only a
small subset.
The expression profiles of miRNAs often show developmental
stage-specific or tissue-specific patterns, suggesting that they may
participate in specific regulatory processes [2, 3]. Microarrays are
attractive for profiling miRNA expression under different conditions
because they can detect thousands of miRNAs simultaneously
[4]. Compared with other high-throughput techniques, such as
RNA-Seq, the cost of microarray-based studies is much
lower, and hundreds or thousands of biological samples can be
studied cost-effectively in one experiment.

Yejun Wang and Ming-an Sun (eds.), Transcriptome Data Analysis: Methods and Protocols, Methods in Molecular Biology,
vol. 1751, https://doi.org/10.1007/978-1-4939-7710-9_9, © Springer Science+Business Media, LLC 2018

There are some differences between the analytic pipelines for
miRNA and for other microarray-based expression data. Besides the
routine preprocessing, expression comparison, and functional
annotation, miRNA data analysis also involves additional target prediction and
target gene annotation steps. For each step, a large number of
bioinformatic tools have been developed, and experimental researchers
may struggle to find, assemble, and test the tools for each
step. In this chapter, we present a pipeline specific for
microarray-based miRNA expression data analysis. The pipeline is
assembled from packages mostly hosted in the Bioconductor project, and
therefore all the analysis can be completed conveniently in the R
environment (R: http://www.r-project.org; Bioconductor:
http://www.bioconductor.org).

2 Materials

2.1 Software Tools

2.1.1 R/Bioconductor

The most recent version of R was downloaded and installed. For
this chapter, the Linux platform is used. For R installation and
administration, refer to the FAQs and documentation at
https://www.r-project.org/. Bioconductor can be installed by entering the
following commands after starting R:
> source("https://bioconductor.org/biocLite.R")
> biocLite()

2.1.2 Installation of R/Bioconductor Packages

Install the R/Bioconductor packages for miRNA microarray data
analysis with biocLite(). The packages are summarized in
Table 1 [5–16].
> biocLite(c("Biobase","GEOquery","limma","mclust","devtools",
+ "GOstats","gplots","networkD3","miRNAtap","miRNAtap.db",
+ "visNetwork","SpidermiR"))
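
The biocLite() installer shown above was current when the chapter was written. Later Bioconductor releases replaced it with the BiocManager package, so the following is a minimal sketch of an equivalent installation step, assuming a recent R version. The variable names (pkgs, is_installed, missing_pkgs) and the choice to install only in interactive sessions are illustrative assumptions, and the availability of individual packages may depend on the Bioconductor release.

```r
# Packages listed in Table 1 of the chapter.
pkgs <- c("Biobase", "GEOquery", "limma", "mclust", "devtools",
          "GOstats", "gplots", "networkD3", "miRNAtap", "miRNAtap.db",
          "visNetwork", "SpidermiR")

# Check which packages are already installed (without attaching them).
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

# Install only in an interactive session, so that sourcing this script
# never triggers downloads unexpectedly (a choice made for this sketch).
if (interactive() && length(missing_pkgs) > 0) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(missing_pkgs)
}
```

BiocManager::install() resolves both Bioconductor and CRAN packages, which is why a single call can cover the mixed list above.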

2.2 Datasets

A publicly available dataset, GSE54578, is used as an example for
demonstration (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE54578).
The study profiles genome-wide miRNA
expression in blood from 15 early-onset schizophrenia cases and
15 healthy controls, detecting a total of 1070 miRNAs by the
microarrays [17]. The GPL16016 platform (Exiqon miRCURY
LNA microRNA array) was used [17]. The dataset can be
downloaded directly through the link; alternatively, it can be accessed
with the “getGEO” function of the “GEOquery” package.

> library("GEOquery")
> gset <- getGEO("GSE54578",GSEMatrix=TRUE,AnnotGPL=FALSE)
> if(length(gset)>1) idx <- grep("GPL16016",attr(gset,"names")) else idx <- 1
> gset <- gset[[idx]]

Table 1
R packages used in the chapter for miRNA data analysis

Package name        Short description
Biobase [5]         Functions that are needed by many other packages or which replace R functions
devtools [6]        Collection of package development tools
GOstats [7]         Tools for manipulating GO and microarrays
GEOquery [8]        GEOquery is the bridge between GEO and BioConductor
gplots [9]          Various R programming tools for plotting data
limma [10]          Data analysis, linear models and differential expression for microarray data
mclust [11]         Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation
miRNAtap [12]       microRNA targets aggregated predictions
miRNAtap.db [13]    Holding the database for miRNAtap
networkD3 [14]      Creates ‘D3’ ‘JavaScript’ network, tree, dendrogram, and Sankey graphs from ‘R’
SpidermiR [15]      The package provides multiple methods for query, prepare and download network data, and the integration with validated and predicted miRNA data and the use of standard analysis and visualization methods
visNetwork [16]     Provides an R interface to the ‘vis.js’ JavaScript charting library

The GSE54578 dataset is now stored in gset, which will be
used for further processing and analysis.

3 Methods

3.1 Preprocessing and Normalization

3.1.1 Preprocessing

The original miRNA expression data may contain “NA”
values, and the columns are named with GSM accessions by default.
The data structure and content can be shown with the
“head(exprs(gset))” command (Fig. 1a). In the preprocessing step, we may
wish to remove all the records containing “NA” and rename the columns in a
user-readable format (Fig. 1b).

> head(exprs(gset))
> # rows (probes) with at least one NA value
> rmv <- which(apply(exprs(gset),1,function(x) any(is.na(x))))
> # guard: gset[-rmv,] would drop every row if rmv were empty
> if(length(rmv)>0) gset <- gset[-rmv,]
> sampleNames(gset) <- c("CTRL1", . . .,"CTRL15","SCHIZO1",. . .,"SCHIZO15")
> gsms <- "000000000000000111111111111111" # group codes: 0 = CTRL, 1 = SCHIZO
> sml <- c()
> for(i in 1:nchar(gsms)) {sml[i] <- substr(gsms,i,i)}
> head(exprs(gset))

Fig. 1 Preprocessing of miRNA microarray data. (a) Raw expression data containing “NA” values. (b) “NA”-filtered
expression data. (c) Variance among samples before normalization. (d) Variance among samples after
normalization

Note that “CTRL2”~“CTRL14” and “SCHIZO2”~“SCHIZO14” are omitted
from the command line shown above.
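
The preprocessing steps can be tried on a small toy matrix before touching the real dataset. The sketch below uses base R only; the object names (ex_demo, ex_clean, gsms_demo, sml_demo) and the toy values are hypothetical, and the 2 + 2 sample layout merely stands in for the 15 + 15 layout of GSE54578.

```r
# Toy expression matrix: 5 probes x 4 arrays, with NA in two probes.
ex_demo <- matrix(
  c(7.1, 7.3, 6.9, 7.0,
    NA,  5.2, 5.1, 5.3,
    8.4, 8.6, 8.5, 8.2,
    6.0, NA,  6.1, 5.9,
    9.9, 9.7, 9.8, 9.6),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("probe", 1:5), paste0("GSM", 1:4))
)

# Keep only probes measured on every array. Logical indexing stays
# correct even when no probe contains NA, unlike x[-which(...), ].
complete <- !apply(ex_demo, 1, function(x) any(is.na(x)))
ex_clean <- ex_demo[complete, , drop = FALSE]

# Readable sample names and group codes built programmatically;
# here 2 controls followed by 2 cases (the chapter has 15 + 15).
n_ctrl <- 2
n_case <- 2
colnames(ex_clean) <- c(paste0("CTRL", seq_len(n_ctrl)),
                        paste0("SCHIZO", seq_len(n_case)))
gsms_demo <- strrep(c("0", "1"), c(n_ctrl, n_case))
sml_demo <- strsplit(paste(gsms_demo, collapse = ""), "")[[1]]
```

Building the names with paste0() avoids typing all 30 labels by hand, provided the samples really are ordered with controls first, which should be checked against the GEO sample annotation.
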
Before normalization, the probe intensities should be checked
to identify apparent outliers caused by nonsystematic errors. These
outliers must be excluded from further analysis. Typically, a
boxplot is generated to show the uniformity of the signal
intensities across arrays.

> ex <- exprs(gset)
> boxplot(ex, ylab="Intensities", xlab="Array names")

After identifying and filtering out the arrays with apparent experimental
biases, the overall signal intensity distribution should follow the
pattern shown in Fig. 1c, with small variance among arrays.
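
Visual inspection of a boxplot can be complemented by a simple numeric check. The following is one possible heuristic, not the chapter's method: it flags arrays whose median intensity lies far from the other array medians on a robust scale. The toy data, the object names, and the cutoff of 3 are all assumptions made for illustration.

```r
# Toy data: 50 probes x 6 arrays on a log2-like scale. Array A6 is
# shifted upward by 3 units to mimic an array-wide technical artifact.
base <- seq(6, 10, length.out = 50)
offsets <- c(A1 = 0, A2 = 0.1, A3 = -0.1, A4 = 0.05, A5 = -0.05, A6 = 3)
ex_demo <- sapply(offsets, function(o) base + o)

# Robust z-score of each array median relative to all array medians.
array_median <- apply(ex_demo, 2, median)
robust_z <- (array_median - median(array_median)) / mad(array_median)

# Arrays whose median deviates by more than 3 robust SDs (arbitrary cutoff).
flagged <- names(which(abs(robust_z) > 3))
flagged
```

A flagged array is only a candidate for exclusion; whether it is removed should still be decided from the boxplot and the experimental records.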

3.1.2 Normalization

After preprocessing, the microarray data must be normalized to
remove variation from nonbiological sources. A large number of
methods have been proposed to normalize microarray-based
transcriptome data. The methods are suited to different platforms and are
integrated in packages for the corresponding data analysis, e.g., the
“NormiR” function in the “ExiMiR” package for two-color microarray
experiments using a common reference, similar methods in the
“affy” package for single-channel Affymetrix arrays, and the
“normalizeBetweenArrays” function in the “limma” package. In
the example, “normalizeBetweenArrays” is applied, with a
quantile normalization procedure.
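> library("limma")
> ex_norm <- normalizeBetweenArrays(ex,method="quantile")
> qu <- as.numeric(quantile(ex_norm,c(0.,0.25,0.5,0.75,0.99,1.0),na.rm=TRUE))
> # heuristic: TRUE when the values do not look log-transformed yet
> filt <- (qu[5]>100 || (qu[6]-qu[1]>50 && qu[2]>0) ||
+ (qu[2]>0 && qu[2]<1 && qu[4]>1 && qu[4]<2))
> if(filt){ex_norm[which(ex_norm<=0)] <- NaN; ex_norm <- log2(ex_norm)}
> exprs(gset) <- ex_norm # store the normalized (and, if needed, log2) values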

A log2 transformation is applied to the normalized expression
values so that the data more closely approximate a Gaussian distribution.
A boxplot generated with the normalized data shows a more
even distribution of the expression levels among different arrays
(Fig. 1d).
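
To make the idea behind quantile normalization concrete, the following base R sketch forces every array to share the same empirical distribution. It is a simplified illustration under stated assumptions (no missing values, ties broken by order of appearance) and is not the limma implementation, which treats ties and NA values more carefully. The function name quantile_normalize and the toy matrix are hypothetical.

```r
quantile_normalize <- function(x) {
  stopifnot(is.matrix(x), !anyNA(x))
  # Rank of each value within its own array (column).
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the k-th smallest values across arrays.
  sorted <- apply(x, 2, sort)
  reference <- rowMeans(sorted)
  # Replace each value by the reference value of the same rank.
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(x)
  out
}

# Toy matrix: 4 probes x 3 arrays with clearly different scales.
ex_demo <- cbind(A1 = c(5, 2, 3, 4),
                 A2 = c(4, 1, 4, 2),
                 A3 = c(3, 4, 6, 8))
ex_qn <- quantile_normalize(ex_demo)

# After normalization every array has the same set of values.
apply(ex_qn, 2, sort)
```

Because each column of the result is a permutation of the same reference values, boxplots of the normalized arrays are identical by construction, which is the behavior summarized by Fig. 1d.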

3.2 Expression Difference and Clustering Analysis

The normalized expression data can be compared directly between
groups. The t test is the most straightforward statistical method for
comparing two groups; it measures the significance
of a difference as the probability of observing a difference at least as large
when there is truly none (p value: the lower, the more significant).
For microarray data, tens of thousands of
genes are compared between groups simultaneously, which poses a
massive multiple testing problem. A further complication is that the
measured expression levels do not always follow normal
distributions, and their distributions are nonidentical and dependent between
genes. To address this problem and identify the differentially
expressed genes more precisely, Smyth proposed an empirical
Bayes moderated t test, which has been incorporated into the
“limma” package [10]. An example is shown below;
more details about the usage of “eBayes” can be found in the
documentation: http://web.mit.edu/~r/current/arch/i386_linux26/lib/R/library/limma/html/ebayes.html.

> sml <- paste("G",sml,sep="") # group labels "G0" and "G1"
> fl <- as.factor(sml)
> gset$description <- fl
> design <- model.matrix(~ description + 0, gset) # one column per group, no intercept
> colnames(design) <- levels(fl)
> fit <- lmFit(gset,design)
> cont.matrix <- makeContrasts(G1-G0,levels=design) # G1 minus G0
> fit2 <- contrasts.fit(fit,cont.matrix)
> fit2 <- eBayes(fit2,proportion=0.01) # empirical Bayes moderated t statistics
> tT <- topTable(fit2,adjust="fdr",sort.by="B",number=1000)

The comparison results are stored in objects fit2 and tT, which
will be used for further analysis.
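
The multiple testing problem mentioned above can be illustrated without any Bioconductor package. The sketch below simulates 200 miRNAs in two groups of 15 samples, applies an ordinary Welch t test to each miRNA, and compares the number of hits before and after Benjamini-Hochberg (FDR) adjustment. All data are simulated and all object names are hypothetical; limma would moderate the variances rather than test each miRNA in isolation.

```r
set.seed(1)
n_mirna <- 200
group <- factor(rep(c("G0", "G1"), each = 15))

# Simulated log2 expression; only the first 10 miRNAs truly differ.
ex_sim <- matrix(rnorm(n_mirna * 30), nrow = n_mirna,
                 dimnames = list(paste0("miR", seq_len(n_mirna)), NULL))
ex_sim[1:10, group == "G1"] <- ex_sim[1:10, group == "G1"] + 2

# Ordinary Welch t test per miRNA.
p_raw <- apply(ex_sim, 1, function(x) {
  t.test(x[group == "G1"], x[group == "G0"])$p.value
})

# Benjamini-Hochberg adjustment controls the false discovery rate.
p_fdr <- p.adjust(p_raw, method = "BH")

n_raw <- sum(p_raw < 0.05)
n_fdr <- sum(p_fdr < 0.05)
c(raw = n_raw, fdr = n_fdr)
```

With 190 null miRNAs, roughly 5% are expected to pass an unadjusted threshold of 0.05 by chance alone, which is why the adjusted count is the one to report.
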
Besides the significance measured by the statistical p values, the
magnitude of the fold change in miRNA expression is also
important to biologists. A volcano plot shows
statistical significance and magnitude of change simultaneously in a
two-dimensional plane, plotting the fold change and the p values
(both log-transformed) on the x- and y-axes, respectively (Fig. 2a).
The “volcanoplot” function in the “limma” package can be
applied conveniently. Note that the “highlight” argument specifies
the number of top probe sets to be highlighted. Other packages such as
“ggplot2” can also be used to draw volcano plots.
> volcanoplot(fit2,coef=1,highlight=10)

Alternatively, the base R plot function can also generate the
volcano plot.
> lod <- -log10(tT$adj.P.Val)
> plot(tT$logFC,lod,xlab="log-ratio",ylab=expression(-log[10]~p))
> abline(h=1.5,col="red")

Fig. 2 Volcano plot and heat map of miRNA expression data. (a) Volcano plot showing the differentially
expressed miRNAs between disease and control samples (x-axis: log-ratio; y-axis: -log10 p). (b) Clustering of the
samples and genes by the expression patterns of significantly differential miRNAs (rows: miRNA probes; columns: array names)
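
The two axes of a volcano plot correspond to two filters that are often applied together. As a minimal sketch, the toy table below mimics the logFC and adj.P.Val columns of a topTable result and labels each probe as up, down, or not significant. The probe IDs, values, and both cutoffs are hypothetical choices for illustration, not values taken from the chapter.

```r
# Toy table with the same column names as a limma topTable result.
tt_demo <- data.frame(
  ID        = paste0("probe", 1:6),
  logFC     = c(-1.8, -0.3, 0.1, 0.9, 1.4, -1.2),
  adj.P.Val = c(0.004, 0.60, 0.90, 0.03, 0.20, 0.01),
  stringsAsFactors = FALSE
)

fc_cut <- 1     # minimum absolute log2 fold change (assumed cutoff)
p_cut  <- 0.05  # maximum adjusted p value (assumed cutoff)

# A probe is called only if it passes both filters.
passes <- tt_demo$adj.P.Val < p_cut & abs(tt_demo$logFC) >= fc_cut
tt_demo$status <- ifelse(passes,
                         ifelse(tt_demo$logFC > 0, "up", "down"),
                         "ns")
table(tt_demo$status)
```

In this toy table probe4 is significant but changes too little, and probe5 changes strongly but is not significant, so both end up in the "ns" class.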

As in other transcriptome data analyses, clustering analysis can
also be performed for miRNA microarray data in addition to
differential expression analysis. For example, a simple heat map can
be generated for the subset of miRNAs with significant expression
differences between disease and control (Fig. 2b; FDR-adjusted
p value < 0.05).
> # adjust the p values first, then apply the 0.05 threshold
> selected <- which(p.adjust(fit2$p.value[,1],method="fdr") < 0.05)
> esetSel <- exprs(gset)[selected,]
> heatmap(esetSel)

For more in-depth clustering analysis, readers can refer to
Chapter 2 of the book, since the procedure and tools are general
rather than specific for miRNA datasets.
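
The heatmap function clusters rows and columns hierarchically by default. The sketch below shows the sample-clustering half of that procedure explicitly with hclust on simulated data, so that the resulting groups can be inspected as a table rather than read off a dendrogram. The data, sample names, and the strong simulated group effect are assumptions for illustration.

```r
set.seed(42)
# Simulated log2 expression for 20 selected miRNAs in 3 + 3 samples.
ex_sel <- matrix(rnorm(20 * 6), nrow = 20,
                 dimnames = list(paste0("miR", 1:20),
                                 c(paste0("CTRL", 1:3), paste0("SCHIZO", 1:3))))
ex_sel[, 4:6] <- ex_sel[, 4:6] + 4   # strong simulated group effect

# Cluster samples: Euclidean distance between columns, complete linkage.
sample_tree <- hclust(dist(t(ex_sel)), method = "complete")
clusters <- cutree(sample_tree, k = 2)

# Cross-tabulate cluster membership against the known group label.
table(cluster = clusters, group = sub("[0-9]+$", "", names(clusters)))
```

With real data the separation is rarely this clean, as Fig. 2b shows, and mixed clusters are a useful reminder that the selected miRNAs do not perfectly discriminate the groups.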

3.3 miRNA Target Analysis

3.3.1 Target Identification

The main difference between miRNA and general transcriptome data
analysis lies in the target gene analysis specific
to the former. The major activity of miRNAs is to regulate the
expression of target genes posttranscriptionally or translationally,
and therefore annotation of the target genes of the miRNAs of interest
is important.
There are multiple options to identify target genes of miRNAs.
For example, Brock et al. proposed a pipeline for miRNA target
analysis with the R packages “targetscan.Mm.eg.db”,
“microRNA” and “org.Mm.eg.db”. In the example shown below, an
integrated package, “SpidermiR”, is adopted, which provides
both validated and predicted target genes from multiple databases
or software tools including mirWalk [18], miR2Disease [19],
miRTar [20], miRTarBase [21], miRandola [22], Pharmaco-miR [23],
DIANA [24], Miranda [25], PicTar [26], and TargetScan [27]. It
can also retrieve and visualize gene networks. The following
commands give an example of target gene determination for some
miRNAs of interest, e.g., the five most significant miRNAs with
expression differences between groups (see Note 1). The potential
targets of these miRNAs are predicted with
SpidermiRdownload_miRNAprediction and stored in mirnaTar.

> library("SpidermiR")
> head(tT$Name,5) # tT is already sorted by the B statistic
> mirna <- c('hsa-miR-4429','hsa-miR-1827','hsa-miR-5002-5p',
+ 'hsa-miR-5187-3p','hsa-miR-4455')
> mirnaTar <- SpidermiRdownload_miRNAprediction(mirna_list=mirna)

The data frame mirnaTar can be checked with head(mirnaTar);
it has two columns, V1 showing the miRNA names
and V2 listing the target genes.
Note that SpidermiRdownload_miRNAprediction returns
the targets predicted by four tools: DIANA, Miranda, PicTar,
and TargetScan. The validated targets can be downloaded from
miRTar and miRWalk with the SpidermiRdownload_miRNAvalidate
function.
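
Once a two-column target table is available, a few base R summaries help decide how to proceed. The following sketch uses a hypothetical table with the same V1/V2 layout as mirnaTar; the miRNA and gene names are placeholders, not real predictions.

```r
# Hypothetical target table with the same two-column layout as mirnaTar.
tar_demo <- data.frame(
  V1 = c("miR-A", "miR-A", "miR-A", "miR-B", "miR-B", "miR-C"),
  V2 = c("GENE1", "GENE2", "GENE3", "GENE2", "GENE4", "GENE2"),
  stringsAsFactors = FALSE
)

# Number of predicted targets per miRNA.
n_targets <- table(tar_demo$V1)

# Number of distinct miRNAs predicted to regulate each gene.
mirnas_per_gene <- tapply(tar_demo$V1, tar_demo$V2,
                          function(m) length(unique(m)))

# Targets shared by at least two miRNAs.
shared <- names(mirnas_per_gene)[mirnas_per_gene >= 2]
shared
```

Shared targets are natural candidates for the network analysis in the next section, since they connect several miRNAs in one regulatory graph.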

3.3.2 Network and Gene Set Enrichment Analysis

Network analysis and visualization can show not only the shared
targets of multiple miRNAs, but also the interactions and pathways
among the target genes. Many tools have been developed for
network building and visualization, e.g., Cytoscape, which has a
user-friendly interface [28], and the R package SpidermiR [15]. Here, we use
Cytoscape to construct the regulatory network between the miRNAs
(the five most significant) and their predicted targets (50 for each
miRNA), since Cytoscape is quite straightforward and particularly
useful for network construction with user-customized interactions
(Fig. 3a) (see Note 2). GeneMANIA curates validated and
predicted networks between genes from a variety of species [29]. The
network types include coexpression, colocalization, genetic
interactions, pathway, physical interactions, shared protein domains, and
predicted interactions. GeneMANIA also provides a web server for
network construction. SpidermiR can download the
interaction data from GeneMANIA and visualize the networks
among user-specified genes, although these functions are still
being debugged and updated. Here, we directly use the
GeneMANIA prediction server (http://genemania.org/) to construct the
pathway network of miRNA target genes (Fig. 3b) (see Note 3).
Besides network analysis, statistics-based gene set enrichment
analysis (GSEA) should be performed for the miRNAs and miRNA
targets, to find biological meaning and to increase
statistical power by aggregating the signal across groups of
related genes. GOstats and a number of other R/Bioconductor
packages (e.g., GeneAnswers [30]) can perform the enrichment
analysis with hypergeometric tests (the hyperGTest function in
GOstats). As an example, we use GOstats to perform GO enrichment
analysis (Biological Process) on the predicted target genes of
the top five miRNAs (see Note 4).

> library("org.Hs.eg.db")
> library("GSEABase")
> library("GOstats")
> library("Category")
> mirTarget <- unique(as.character(mirnaTar$V2))
> goAnn <- get("org.Hs.egGO")
> universe <- Lkeys(goAnn) # all Entrez IDs with GO annotation
> entrezIDs <- mget(mirTarget, org.Hs.egSYMBOL2EG, ifnotfound=NA)
> # flatten the list and drop symbols without an Entrez ID
> entrezIDs <- unique(as.character(na.omit(unlist(entrezIDs))))
> params <- new("GOHyperGParams",
+ geneIds=entrezIDs,
+ universeGeneIds=universe,
+ annotation="org.Hs.eg.db",
+ ontology="BP",
+ pvalueCutoff=0.01,
+ conditional=FALSE,
+ testDirection="over")
> goET <- hyperGTest(params)
> genelist <- geneIdsByCategory(goET)
> # convert the Entrez IDs of each GO term back to gene symbols
> genelist <- sapply(genelist, function(.ids) {
+ .sym <- mget(.ids, envir=org.Hs.egSYMBOL, ifnotfound=NA)
+ .sym[is.na(.sym)] <- .ids[is.na(.sym)]
+ paste(.sym, collapse=";")
+ })
> GObp <- summary(goET)
> GObp$Symbols <- genelist[as.character(GObp$GOBPID)]
> head(GObp)

Fig. 3 Interaction networks among miRNAs and their targets. (a) Regulatory network between miRNAs and
target genes. (b) Pathway sub-network among the miRNA target genes

KEGG enrichment can also be performed:

> # note: the org.Hs.egPATH map may be unavailable in recent org.Hs.eg.db releases
> keggAnn <- get("org.Hs.egPATH")
> universe <- Lkeys(keggAnn) # all Entrez IDs with KEGG annotation
> params <- new("KEGGHyperGParams",
+ geneIds=entrezIDs,
+ universeGeneIds=universe,
+ annotation="org.Hs.eg.db",
+ categoryName="KEGG",
+ pvalueCutoff=0.01,
+ testDirection="over")
> keggET <- hyperGTest(params)
> kegg <- summary(keggET)
> library(Category)
> genelist <- geneIdsByCategory(keggET)
> # convert the Entrez IDs of each pathway back to gene symbols
> genelist <- sapply(genelist, function(.ids) {
+ .sym <- mget(.ids, envir=org.Hs.egSYMBOL, ifnotfound=NA)
+ .sym[is.na(.sym)] <- .ids[is.na(.sym)]
+ paste(.sym, collapse=";")
+ })
> kegg$Symbols <- genelist[as.character(kegg$KEGGID)]
> head(kegg)
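
The hypergeometric test behind these enrichment analyses can be reproduced in base R for a single category. The sketch below is a worked example with made-up counts (universe size, category size, list size, and overlap are all hypothetical); it computes the over-representation p value with phyper and checks it against a one-sided Fisher exact test.

```r
N <- 1000  # genes in the universe
m <- 50    # universe genes annotated to the category
n <- 100   # genes in the target list
k <- 12    # target-list genes annotated to the category

# Number of category genes expected in the list by chance.
expected <- n * m / N

# Over-representation: probability of observing k or more hits.
p_over <- phyper(k - 1, m, N - m, n, lower.tail = FALSE)

# Equivalent one-sided Fisher exact test on the 2 x 2 table
# (rows: in category / not; columns: in list / not in list).
tab <- matrix(c(k, n - k, m - k, N - m - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

c(expected = expected, p_over = p_over, p_fisher = p_fisher)
```

The k - 1 in the phyper call matters: with lower.tail = FALSE the function returns P(X > q), so q must be one less than the observed overlap to include it in the tail.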

4 Notes

1. For convenience of illustration, only the top five miRNAs are selected
for target analysis. In practice, all the miRNAs of interest
should be analyzed for targets. For target prediction, multiple
prediction tools should be combined, and if the number of predicted
targets is large, the intersection of the results can be
selected for further analysis.
2. Cytoscape can be downloaded from http://www.cytoscape.org.
A detailed manual demonstrates how to install and use
the tool. To visualize the interaction network of miRNAs and
their target genes, prepare a two-column table in which the
first column records the miRNAs and the second records the
corresponding targets. Import the interaction table directly into
Cytoscape, indicate the interaction sources and targets, and then
draw the directed network.
3. GeneMANIA curates several categories of gene interaction
databases, and the database(s) can be selected on the server for
network prediction. In the GeneMANIA prediction web server
(http://genemania.org), simply copy the gene symbols (one per
line) into the input area, select the desired database(s), and run
the prediction.
4. Besides GOstats, other R packages also perform gene
set enrichment analysis (GSEA). Chapter 3 of this book
gives a comprehensive introduction to the
methods and related packages. The website of the Gene Ontology
Consortium (http://geneontology.org) also provides an online
GO enrichment analysis tool, which is an easy alternative.
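
Notes 1 and 2 can be sketched together in base R: first intersect the predictions of several tools, then write the resulting miRNA-target pairs as a two-column table for import into Cytoscape. The tool names, the miRNA label, the gene names, and the output file name below are all hypothetical placeholders.

```r
# Hypothetical predictions for one miRNA from three tools.
pred <- list(
  toolA = c("GENE1", "GENE2", "GENE3", "GENE5"),
  toolB = c("GENE2", "GENE3", "GENE4"),
  toolC = c("GENE2", "GENE3", "GENE5")
)

# Genes predicted by every tool (Note 1).
consensus <- Reduce(intersect, pred)

# Two-column edge table: source (miRNA) and target (gene) (Note 2).
edges <- data.frame(source = "miR-X", target = consensus,
                    stringsAsFactors = FALSE)

# Tab-delimited text file that can be imported as a network table.
edge_file <- file.path(tempdir(), "mirna_target_edges.txt")
write.table(edges, edge_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Requiring agreement among all tools is the strictest option; when it leaves too few targets, keeping genes predicted by at least two tools is a common compromise.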

References
1. Kozomara A, Griffiths-Jones S (2014) miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res 42(Database issue):D68–D73. https://doi.org/10.1093/nar/gkt1181
2. McCall MN, Kim MS, Adil M, Patil AH, Lu Y, Mitchell CJ, Leal-Rojas P, Xu J, Kumar M, Dawson VL, Dawson TM, Baras AS, Rosenberg AZ, Arking DE, Burns KH, Pandey A, Halushka MK (2017) Toward the human cellular microRNAome. Genome Res. https://doi.org/10.1101/gr.222067.117
3. Otto T, Candido SV, Pilarz MS, Sicinska E, Bronson RT, Bowden M, Lachowicz IA, Mulry K, Fassl A, Han RC, Jecrois ES, Sicinski P (2017) Cell cycle-targeting microRNAs promote differentiation by enforcing cell-cycle exit. Proc Natl Acad Sci U S A 114(40):10660–10665. pii 201702914. https://doi.org/10.1073/pnas.1702914114
4. Gao L, Jiang F (2016) MicroRNA (miRNA) profiling. Methods Mol Biol 1381:151–161
5. Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, Carvalho BS, Bravo HC, Davis S, Gatto L, Girke T, Gottardo R, Hahne F, Hansen KD, Irizarry RA, Lawrence M, Love MI, MacDonald J, Obenchain V, Oleś AK, Pagès H, Reyes A, Shannon P, Smyth GK, Tenenbaum D, Waldron L, Morgan M (2015) Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods 12(2):115–121. https://doi.org/10.1038/nmeth.3252
6. Wickham H, Chang W (2017) devtools: tools to make developing R packages easier. R package version 1.13.3. https://CRAN.R-project.org/package=devtools
7. Falcon S, Gentleman R (2007) Using GOstats to test gene lists for GO term association. Bioinformatics 23(2):257–258
8. Davis S, Meltzer PS (2017) GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics 23(14):1846–1847
9. Warnes GR, Bolker B, Bonebakker L, et al. (2016) gplots: various R programming tools for plotting data. R package version 3.0.1. https://CRAN.R-project.org/package=gplots
10. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015) limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43(7):e47
11. Scrucca L, Fop M, Murphy TB, Raftery AE (2016) mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R J 8(1):289–317
12. Pajak M, Simpson TI (2016) miRNAtap: miRNAtap: microRNA targets – aggregated predictions. R package version 1.8.0
13. Pajak M, Simpson TI (2016) miRNAtap.db: data for miRNAtap. R package version 0.99.10
14. Allaire JJ, Gandrud C, Russell K, Yetman CJ (2017) networkD3: D3 JavaScript network graphs from R. R package version 0.4. https://CRAN.R-project.org/package=networkD3
15. Cava C, Colaprico A, Bertoli G, Graudenzi A, Silva TC, Olsen C, Noushmehr H, Bontempi G, Mauri G, Castiglioni I (2017) SpidermiR: an R/bioconductor package for integrative analysis with miRNA data. Int J Mol Sci 18(2): pii: E274. https://doi.org/10.3390/ijms18020274
16. Almende BV, Thieurmel B, Robert T (2017) visNetwork: network visualization using ‘vis.js’ library. R package version 2.0.1. https://CRAN.R-project.org/package=visNetwork
17. Zhang F, Xu Y, Shugart YY, Yue W et al (2015) Converging evidence implicates the abnormal microRNA system in schizophrenia. Schizophr Bull 41(3):728–735
18. Dweep H, Sticht C, Pandey P, Gretz N (2011) miRWalk--database: prediction of possible miRNA binding sites by “walking” the genes of three genomes. J Biomed Inform 44(5):839–847
19. Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, Li M, Wang G, Liu Y (2009) miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res 37(Database issue):D98–104
20. Hsu JB, Chiu CM, Hsu SD, Huang WY, Chien CH, Lee TY, Huang HD (2011) miRTar: an integrated system for identifying miRNA-target interactions in human. BMC Bioinformatics 12:300. https://doi.org/10.1186/1471-2105-12-300
21. Hsu SD, Lin FM, Wu WY, Liang C, Huang WC, Chan WL, Tsai WT, Chen GZ, Lee CJ, Chiu CM, Chien CH, Wu MC, Huang CY, Tsou AP, Huang HD (2011) miRTarBase: a database curates experimentally validated microRNA-target interactions. Nucleic Acids Res 39(Database issue):D163–D169. https://doi.org/10.1093/nar/gkq1107
22. Russo F, Di Bella S, Nigita G, Macca V, Laganà A, Giugno R, Pulvirenti A, Ferro A (2012) miRandola: extracellular circulating microRNAs database. PLoS One 7(10):e47786. https://doi.org/10.1371/journal.pone.0047786
23. Rukov JL, Wilentzik R, Jaffe I, Vinther J, Shomron N (2014) Pharmaco-miR: linking microRNAs and drug effects. Brief Bioinform 15(4):648–659. https://doi.org/10.1093/bib/bbs082
24. Maragkakis M, Reczko M, Simossis VA, Alexiou P, Papadopoulos GL, Dalamagas T, Giannopoulos G, Goumas G, Koukis E, Kourtis K, Vergoulis T, Koziris N, Sellis T, Tsanakas P, Hatzigeorgiou AG (2009) DIANA-microT web server: elucidating microRNA functions through target prediction. Nucleic Acids Res 37(Web Server issue):W273–W276
25. John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS (2004) Human MicroRNA targets. PLoS Biol 2(11):e363
26. Krek A, Grün D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, da Piedade I, Gunsalus KC, Stoffel M, Rajewsky N (2005) Combinatorial microRNA target predictions. Nat Genet 37(5):495–500
27. Agarwal V, Bell GW, Nam J, Bartel DP (2015) Predicting effective microRNA target sites in mammalian mRNAs. eLife 4:e05005
28. Saito R, Smoot ME, Ono K, Ruscheinski J, Wang PL, Lotia S, Pico AR, Bader GD, Ideker T (2012) A travel guide to cytoscape plugins. Nat Methods 9(11):1069–1076. https://doi.org/10.1038/nmeth.2212
29. Montojo J, Zuberi K, Rodriguez H, Bader GD, Morris Q (2014) GeneMANIA: fast gene network construction and function prediction for cytoscape. F1000Res 3(153). https://doi.org/10.12688/f1000research.4572.1. eCollection 2014
30. Feng G, Shaw P, Rosen ST, Lin SM, Kibbe WA (2012) Using the bioconductor GeneAnswers package to interpret gene lists. Methods Mol Biol 802:101–112. https://doi.org/10.1007/978-1-61779-400-1_7
