
Graphics and Data Visualization in R
Overview
Thomas Girke
December 13, 2013
Slide 1/121

Outline
- Overview
- Graphics Environments
  - Base Graphics
  - Grid Graphics
  - lattice
  - ggplot2
- Specialty Graphics
- Genome Graphics
  - ggbio
  - Additional Genome Graphics
- Clustering
  - Background
  - Hierarchical Clustering Example
  - Non-Hierarchical Clustering Examples


Graphics in R
- Powerful environment for visualizing scientific data
- Integrated graphics and statistics infrastructure
- Publication quality graphics
- Fully programmable
- Highly reproducible
- Full LaTeX and Sweave support
- Vast number of R packages with graphics utilities

Documentation on Graphics in R
General
- Graphics Task Page
- R Graph Gallery
- R Graphical Manual
- Paul Murrell's book R (Grid) Graphics
Interactive graphics
- rggobi (GGobi)
- iplots
- Open GL (rgl)

Graphics Environments
Viewing and saving graphics in R
- On-screen graphics
- postscript, pdf, svg
- jpeg/png/wmf/tiff/...
Four major graphic environments
- Low-level infrastructure
  - R Base Graphics (low- and high-level)
  - grid: Manual, Book
- High-level infrastructure
  - lattice: Manual, Intro, Book
  - ggplot2: Manual, Intro, Book


Base Graphics: Overview
Important high-level plotting functions
- plot: generic x-y plotting
- barplot: bar plots
- boxplot: box-and-whisker plot
- hist: histograms
- pie: pie charts
- dotchart: Cleveland dot plots
- image, heatmap, contour, persp: functions to generate image-like plots
- qqnorm, qqline, qqplot: distribution comparison plots
- pairs, coplot: display of multivariate data
Help on these functions
- ?myfct
- ?plot
- ?par

Base Graphics: Preferred Input Data Objects
- Matrices and data frames
- Vectors
- Named vectors
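As a minimal sketch of how base graphics treats these input types, the following passes a vector, a named vector, a matrix and a data frame to plotting functions. The object names and values are illustrative assumptions, not taken from the slides, and a null PDF device is used only so the code runs without a display.

```r
# Illustrative data only: names and values are not from the slides.
v  <- c(3, 5, 2)                                  # plain vector
nv <- c(a = 3, b = 5, c = 2)                      # named vector: names become bar labels
m  <- matrix(1:6, nrow = 2,
             dimnames = list(c("r1", "r2"), c("A", "B", "C")))
d  <- as.data.frame(m)                            # data frame with the same content

pdf(NULL)                                         # null device: nothing is written to disk
mid_v  <- barplot(v)                              # one unlabeled bar per element
mid_nv <- barplot(nv)                             # bars labeled a, b, c
mid_m  <- barplot(m, beside = TRUE)               # one group of bars per matrix column
plot(d$A, d$B)                                    # data frame columns used as x and y
invisible(dev.off())
```

barplot() returns the bar midpoints, which later slides use to position labels and error bars.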

Scatter Plot: very basic
Sample data set for subsequent plots
    set.seed(1410)
    y <- matrix(runif(30), ncol=3, dimnames=list(letters[1:10], LETTERS[1:3]))
    plot(y[,1], y[,2])
[Figure: scatter plot of y[, 1] (x-axis) against y[, 2] (y-axis)]

Scatter Plot: all pairs
    pairs(y)
[Figure: scatter plot matrix of columns A, B and C]

Scatter Plot: with labels
    plot(y[,1], y[,2], pch=20, col="red", main="Symbols and Labels")
    text(y[,1]+0.03, y[,2], rownames(y))
[Figure: "Symbols and Labels", points labeled a to j; x-axis y[, 1], y-axis y[, 2]]

Scatter Plots: more examples
Print the row names instead of symbols
    plot(y[,1], y[,2], type="n", main="Plot of Labels")
    text(y[,1], y[,2], rownames(y))
Usage of important plotting parameters
    grid(5, 5, lwd=2)
    op <- par(mar=c(8,8,8,8), bg="lightblue")
    plot(y[,1], y[,2], type="p", col="red", cex.lab=1.2, cex.axis=1.2,
         cex.main=1.2, cex.sub=1, lwd=4, pch=20, xlab="x label",
         ylab="y label", main="My Main", sub="My Sub")
    par(op)
Important arguments
- mar: specifies the margin sizes around the plotting area in order: c(bottom, left, top, right)
- col: color of symbols
- pch: type of symbols, samples: example(points)
- lwd: line width (for symbols, the width of their outline)
- cex.*: control font sizes
For details see ?par
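The op <- par(...) / par(op) pattern works because par() returns the previous values of the parameters it changes. A small sketch of that behavior, run on a null PDF device (an assumption made so it works without a display):

```r
pdf(NULL)                                          # null device: nothing is written to disk
old <- par(mar = c(8, 8, 8, 8), bg = "lightblue")  # par() returns the values it replaced
inside <- par("mar")                               # margins while the change is active
par(old)                                           # restore the saved settings
restored <- par("mar")                             # margins after restoring
invisible(dev.off())
```

Restoring the saved list keeps later plots on the same device from inheriting the temporary margins and background.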

Scatter Plots: more examples
Add a regression line to a plot
    plot(y[,1], y[,2])
    myline <- lm(y[,2]~y[,1]); abline(myline, lwd=2)
    summary(myline)
Same plot as above, but on log scale
    plot(y[,1], y[,2], log="xy")
Add a mathematical expression to a plot
    plot(y[,1], y[,2]); text(y[1,1], y[1,2], expression(sum(frac(1,sqrt(x^2*pi)))), cex=1.3)

Exercise 1: Scatter Plots
Task 1 Generate a scatter plot for the first two columns in the iris data frame and color the dots by its Species column.
Task 2 Use the xlim/ylim arguments to set limits on the x- and y-axes so that all data points are restricted to the left bottom quadrant of the plot.
Structure of iris data set:
    class(iris)
    [1] "data.frame"
    iris[1:4,]
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    table(iris$Species)
        setosa versicolor  virginica
            50         50         50

Line Plot: Single Data Set
    plot(y[,1], type="l", lwd=2, col="blue")
[Figure: line plot of y[, 1] against Index]

Line Plots: Many Data Sets
    split.screen(c(1,1))
    [1] 1
    plot(y[,1], ylim=c(0,1), xlab="Measurement", ylab="Intensity", type="l", lwd=2, col=1)
    for(i in 2:length(y[1,])) {
        screen(1, new=FALSE)
        plot(y[,i], ylim=c(0,1), type="l", lwd=2, col=i, xaxt="n", yaxt="n", ylab="",
             xlab="", main="", bty="n")
    }
    close.screen(all=TRUE)
[Figure: overlaid line plots of all columns; x-axis Measurement, y-axis Intensity]

Bar Plot Basics
    barplot(y[1:4,], ylim=c(0, max(y[1:4,])+0.3), beside=TRUE, legend=letters[1:4])
    # The vertical offset at the end of the next call is truncated in the source slide;
    # 0.04 is an assumed value.
    text(labels=round(as.vector(as.matrix(y[1:4,])),2),
         x=seq(1.5, 13, by=1)+sort(rep(c(0,1,2), 4)),
         y=as.vector(as.matrix(y[1:4,]))+0.04)
[Figure: grouped bar plot of rows a to d for columns A, B and C, with the values printed above the bars]

Bar Plots with Error Bars
    bar <- barplot(m <- rowMeans(y) * 10, ylim=c(0, 10))
    stdev <- apply(y, 1, sd) # per-row standard deviation; sd(t(y)) no longer works column-wise in current R
    arrows(bar, m, bar, m + stdev, length=0.15, angle=90)
[Figure: bar plot of rows a to j with error bars]
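A hedged variant of the error-bar idea, shown as a self-contained sketch: it keeps means and standard deviations on the same scale and draws bars in both directions with code=3. The two-sided bars and the ylim choice are assumptions for illustration, not the slide's exact figure.

```r
set.seed(1410)
y <- matrix(runif(30), ncol = 3, dimnames = list(letters[1:10], LETTERS[1:3]))

m     <- rowMeans(y)          # one mean per row (a to j)
stdev <- apply(y, 1, sd)      # one standard deviation per row

pdf(NULL)                     # null device so the sketch runs without a display
bar <- barplot(m, ylim = c(0, max(m + stdev) * 1.1))   # returns the bar midpoints
# code = 3 draws a flat head (angle = 90) at both ends of each segment
arrows(bar, m - stdev, bar, m + stdev, length = 0.05, angle = 90, code = 3)
invisible(dev.off())
```

The midpoints returned by barplot() are what align each error bar with its bar.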

Mirrored Bar Plots
    # The y argument is truncated in the source slide; the second runif() range
    # (-1 to 0) is assumed from the mirrored figure.
    df <- data.frame(group=rep(c("Above", "Below"), each=10), x=rep(1:10, 2),
                     y=c(runif(10, 0, 1), runif(10, -1, 0)))
    plot(c(0,12), range(df$y), type="n")
    barplot(height=df$y[df$group=="Above"], add=TRUE, axes=FALSE)
    barplot(height=df$y[df$group=="Below"], add=TRUE, axes=FALSE)
[Figure: mirrored bar plot; x-axis c(0, 12), y-axis range(df$y)]

Histograms
    hist(y, freq=TRUE, breaks=10)
[Figure: "Histogram of y"; x-axis y, y-axis Frequency]

Density Plots
    plot(density(y), col="red")
[Figure: density.default(x = y); N = 30, Bandwidth = 0.136; y-axis Density]

Pie Charts
    pie(y[,1], col=rainbow(length(y[,1]), start=0.1, end=0.8), clockwise=TRUE)
    legend("topright", legend=row.names(y), cex=1.3, bty="n", pch=15, pt.cex=1.8,
           col=rainbow(length(y[,1]), start=0.1, end=0.8), ncol=1)
[Figure: pie chart with slices a to j and a matching legend]

Color Selection Utilities
Default color palette and how to change it
    palette()
    [1] "black"   "red"     "green3"  "blue"    "cyan"    "magenta" "yellow"  "gray"
    palette(rainbow(5, start=0.1, end=0.2))
    palette()
    [1] "#FF9900" "#FFBF00" "#FFE600" "#F2FF00" "#CCFF00"
    palette("default")
The gray function selects shades of gray from values between 0 and 1
    gray(seq(0.1, 1, by=0.2))
    [1] "#1A1A1A" "#4D4D4D" "#808080" "#B3B3B3" "#E6E6E6"
Color gradients with the colorpanel function from the gplots library
    library(gplots)
    colorpanel(5, "darkblue", "yellow", "white")
For much more on colors in R, see Earl Glynn's color chart.
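Where gplots is not installed, a similar gradient can be sketched with colorRampPalette() from the base grDevices package. This is a swapped-in alternative, not the colorpanel function itself, and the interpolated colors may differ slightly from colorpanel's output.

```r
# colorRampPalette() returns a function that interpolates between the given colors
ramp <- colorRampPalette(c("darkblue", "yellow", "white"))
cols <- ramp(5)               # five hex color strings along the gradient

pdf(NULL)                     # null device so the sketch runs without a display
barplot(rep(1, 5), col = cols, axes = FALSE)   # quick visual check of the gradient
invisible(dev.off())
```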

Arranging Several Plots on Single Page
With par(mfrow=c(nrow,ncol)) one can define how several plots are arranged next to each other.
    par(mfrow=c(2,3)); for(i in 1:6) { plot(1:10) }
[Figure: six scatter plots of 1:10 against Index, arranged in 2 rows and 3 columns]

Arranging Plots with Variable Width
The layout function divides the plotting device into a variable number of rows and columns, with the column widths and row heights specified in the respective arguments.
    nf <- layout(matrix(c(1,2,3,3), 2, 2, byrow=TRUE), c(3,7), c(5,5), respect=TRUE)
    # layout.show(nf)
    for(i in 1:3) { barplot(1:10) }
[Figure: two bar plots of unequal width in the top row and one bar plot spanning the bottom row]
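To make the layout matrix easier to read, the sketch below spells out the argument names. Each number in the matrix names the figure that occupies that cell, so repeating 3 across the second row lets the third plot span both columns. The plot titles are illustrative additions.

```r
lay <- matrix(c(1, 2,
                3, 3), nrow = 2, ncol = 2, byrow = TRUE)

pdf(NULL)                                   # null device so the sketch runs without a display
nf <- layout(lay, widths = c(3, 7), heights = c(5, 5), respect = TRUE)
# layout() returns the number of figures defined by the matrix
for (i in seq_len(nf)) barplot(1:10, main = paste("Figure", i))
invisible(dev.off())
```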

Saving Graphics to Files
After the pdf() command all graphs are redirected to the file test.pdf until dev.off() is called. This works similarly for all common formats: jpeg, png, ps, tiff, ...
    pdf("test.pdf"); plot(1:10, 1:10); dev.off()
The svg() command generates Scalable Vector Graphics (SVG) files that can be edited in vector graphics programs, such as Inkscape.
    svg("test.svg"); plot(1:10, 1:10); dev.off()
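A short sketch of the same open, plot, close sequence that also checks the file was written. The temporary file path and the page size are assumptions chosen for illustration.

```r
out <- tempfile(fileext = ".pdf")   # hypothetical output path in the session's temp directory
pdf(out, width = 5, height = 4)     # page size in inches
plot(1:10, 1:10)
invisible(dev.off())                # closing the device finalizes the file

saved <- file.exists(out) && file.size(out) > 0
```

Forgetting dev.off() is a common reason for empty or unreadable output files.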

Exercise 2: Bar Plots
Task 1 Calculate the mean values for the Species components of the first four columns in the iris data set. Organize the results in a matrix where the row names are the unique values from the iris Species column and the column names are the same as in the first four iris columns.
Task 2 Generate two bar plots: one with stacked bars and one with horizontally arranged bars.
Structure of iris data set:
    class(iris)
    [1] "data.frame"
    iris[1:4,]
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    table(iris$Species)
        setosa versicolor  virginica
            50         50         50
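One possible approach to this exercise, given as a sketch rather than the official solution: compute per-species means with tapply() inside sapply(), then pass the resulting matrix to barplot(). The variable name iris_means is an assumption, and "horizontally arranged" is read here as bars placed side by side.

```r
# Rows: species; columns: the four measurement columns
iris_means <- sapply(iris[, 1:4], function(col) tapply(col, iris$Species, mean))

pdf(NULL)                                                   # null device so the sketch runs without a display
barplot(iris_means, legend.text = rownames(iris_means))     # stacked bars
barplot(iris_means, beside = TRUE,
        legend.text = rownames(iris_means))                 # bars side by side
invisible(dev.off())
```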


grid Graphics Environment
What is grid?
- Low-level graphics system
- Highly flexible and controllable system
- Does not provide high-level functions
- Intended as development environment for custom plotting functions
- Pre-installed on new R distributions
Documentation and Help
- Manual
- Book
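To give a feel for what "low-level" means here, the following sketch draws a few primitives inside a viewport using functions from the grid package. The shapes, sizes and label text are arbitrary choices for illustration.

```r
library(grid)

pdf(NULL)                                            # null device so the sketch runs without a display
grid.newpage()
pushViewport(viewport(width = 0.8, height = 0.8))    # drawing region covering 80% of the page
grid.rect(gp = gpar(col = "grey40"))                 # border of the viewport
grid.circle(x = 0.5, y = 0.5, r = 0.2, gp = gpar(fill = "lightblue"))
grid.text("drawn with grid", y = 0.9)
popViewport()
invisible(dev.off())

lbl <- textGrob("reusable label")                    # grobs can also be created without drawing them
```

Both lattice and ggplot2 build their plots from grid viewports and graphical objects of this kind.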


lattice Environment
What is lattice?
- High-level graphics system
- Developed by Deepayan Sarkar
- Implements Trellis graphics system from S-Plus
- Simplifies high-level plotting tasks: arranging complex graphical features
- Syntax similar to R's base graphics
Documentation and Help
- Manual
- Intro
- Book
- library(help=lattice) opens a list of all functions available in the lattice package
- Accessing and changing global parameters: ?lattice.options and ?trellis.device

Scatter Plot Sample
    library(lattice)
    p1 <- xyplot(1:8 ~ 1:8 | rep(LETTERS[1:4], each=2), as.table=TRUE)
    plot(p1)
[Figure: four panels A to D, each plotting 1:8 against 1:8]
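The formula y ~ x | g is the core of lattice: the part after | defines the conditioning variable, and each of its levels gets its own panel. A sketch with the iris data set, where the choice of columns and layout is illustrative:

```r
library(lattice)

# One panel per species; layout = c(columns, rows)
p <- xyplot(Sepal.Width ~ Sepal.Length | Species, data = iris,
            layout = c(3, 1), as.table = TRUE)

pdf(NULL)      # null device so the sketch runs without a display
print(p)       # lattice plots are objects and are drawn when printed
invisible(dev.off())
```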

Line Plot Sample
    library(lattice)
    p2 <- parallelplot(~iris[1:4] | Species, iris, horizontal.axis=FALSE, layout=c(1, 3, 1))
    plot(p2)
[Figure: parallel coordinate plot, one panel per species; axes Sepal.Length, Sepal.Width, Petal.Length, Petal.Width]


ggplot2 Environment
What is ggplot2?
- High-level graphics system
- Implements grammar of graphics from Leland Wilkinson
- Streamlines many graphics workflows for complex plots
- Syntax centered around main ggplot function
- Simpler qplot function provides many shortcuts
Documentation and Help
- Manual
- Intro
- Book
- Cookbook for R

ggplot2 Usage
- ggplot function accepts two arguments
  - Data set to be plotted
  - Aesthetic mappings provided by aes function
- Additional parameters such as geometric objects (e.g. points, lines, bars) are passed on by appending them with + as separator.
- List of available geom_* functions: see the ggplot2 documentation
- Settings of plotting theme can be accessed with the command theme_get() and its settings can be changed with theme().
- Preferred input data object
  - qplot: data.frame (support for vector, matrix, ...)
  - ggplot: data.frame
- Packages with convenience utilities to create expected inputs
  - plyr
  - reshape

qplot Function
qplot syntax is similar to R's basic plot function
Arguments:
- x: x-coordinates (e.g. col1)
- y: y-coordinates (e.g. col2)
- data: data frame with corresponding column names
- xlim, ylim: e.g. xlim=c(0,10)
- log: e.g. log="x" or log="xy"
- main: main title; see ?plotmath for mathematical formula
- xlab, ylab: labels for the x- and y-axes
- color, shape, size, ...: many arguments accepted by plot function

qplot: Scatter Plots
Create sample data
    library(ggplot2)
    x <- sample(1:10, 10); y <- sample(1:10, 10); cat <- rep(c("A", "B"), 5)
Simple scatter plot
    qplot(x, y, geom="point")
Prints dots with different sizes and colors
    qplot(x, y, geom="point", size=x, color=cat, main="Dot Size and Color Relative to Some Values")
Drops legend
    qplot(x, y, geom="point", size=x, color=cat) + theme(legend.position="none")
Plot different shapes
    qplot(x, y, geom="point", size=5, shape=cat)
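For comparison, the same kind of plot can be written with the full ggplot() syntax, which is also useful on recent ggplot2 releases where qplot() is deprecated. This is a sketch: the data frame d, its column names and the seed are assumptions made so the example is self-contained.

```r
library(ggplot2)

set.seed(1)    # arbitrary seed, only to make the sample reproducible
d <- data.frame(x = sample(1:10, 10), y = sample(1:10, 10),
                cat = rep(c("A", "B"), 5))

# ggplot() takes a data frame; aes() maps columns to position, size and color
p <- ggplot(d, aes(x, y, size = x, color = cat)) +
  geom_point() +
  ggtitle("Dot Size and Color Relative to Some Values") +
  theme(legend.position = "none")
```

Unlike qplot(), ggplot() expects the variables to be columns of a data frame rather than free-standing vectors.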

qplot: Scatter Plot with qplot
    p <- qplot(x, y, geom="point", size=x, color=cat, main="Dot Size and Color Relative to Some Values") +
         theme(legend.position="none")
    print(p)
[Figure: "Dot Size and Color Relative to Some Values"; x-axis x, y-axis y]

qplot: Scatter Plot with Regression Line
    set.seed(1410)
    dsmall <- diamonds[sample(nrow(diamonds), 1000), ]
    p <- qplot(carat, price, data=dsmall, geom=c("point", "smooth"), method="lm")
    print(p)
[Figure: scatter plot with linear regression line; x-axis carat, y-axis price]

qplot: Scatter Plot with Local Regression Curve (loess)
    p <- qplot(carat, price, data=dsmall, geom=c("point", "smooth"), span=0.4)
    print(p) # Setting 'se=FALSE' removes error shade
[Figure: scatter plot with loess curve and error shade; x-axis carat, y-axis price]

ggplot Function
More important than qplot to access full functionality of ggplot2
Main arguments
- data set, usually a data.frame
- aesthetic mappings provided by aes function
General ggplot syntax
    ggplot(data, aes(...)) + geom_*() + ... + stat_*() + ...
Layer specifications
    geom_*(mapping, data, ..., geom, position)
    stat_*(mapping, data, ..., stat, position)
Additional components
- scales
- coordinates
- facet
aes() mappings can be passed on to all components (ggplot, geom_*, etc.). Effects are global when passed on to ggplot() and local for other components.
- x, y
- color: grouping vector (factor)
- group: grouping vector (factor)
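The difference between global and local aesthetic mappings can be sketched with the iris data set. With color mapped in ggplot(), every layer is grouped by species; with color mapped only in geom_point(), the smoothing layer sees a single group. The object names and the explicit formula argument are choices made for this illustration.

```r
library(ggplot2)

# Global mapping: points and regression lines are both split by Species
p_global <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Local mapping: only the points are colored; one regression line for all data
p_local <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

Printing both plots side by side shows three fitted lines in the first and a single line in the second.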

Changing Plotting Themes with ggplot
- Theme settings can be accessed with theme_get()
- Their settings can be changed with theme()
Some examples
- Change background color to white (the theme() call is appended to a plot with +)
    theme(panel.background=element_rect(fill="white", colour="black"))

Storing ggplot Specifications
Plots and layers can be stored in variables
    p <- ggplot(dsmall, aes(carat, price)) + geom_point()
    p # or print(p)
Returns information about data and aesthetic mappings followed by each layer
    summary(p)
Stores a regression line layer in a variable and adds it to the plot
    # Any further arguments to geom_smooth() are truncated in the source slide
    bestfit <- geom_smooth(method="lm", se=FALSE, color=alpha("steelblue", 0.5))
    p + bestfit # Plot with custom regression line
Syntax to pass on other data sets
    p %+% diamonds[sample(nrow(diamonds), 100),]
Saves plot stored in variable p to file
    ggsave("myplot.pdf", plot=p)

ggplot: Scatter Plot
    p <- ggplot(dsmall, aes(carat, price, color=color)) + geom_point(size=4)
    print(p)
[Figure: scatter plot colored by diamond color (D to J); x-axis carat, y-axis price]

ggplot: Scatter Plot with Regression Line
    p <- ggplot(dsmall, aes(carat, price)) + geom_point() +
         geom_smooth(method="lm", se=FALSE) +
         theme(panel.background=element_rect(fill="white", colour="black"))
    print(p)
[Figure: scatter plot with regression line on white background; x-axis carat, y-axis price]

ggplot: Scatter Plot with Several Regression Lines
    p <- ggplot(dsmall, aes(carat, price, group=color)) +
         geom_point(aes(color=color), size=2) +
         geom_smooth(aes(color=color), method="lm", se=FALSE)
    print(p)
[Figure: scatter plot with one regression line per diamond color (D to J); x-axis carat, y-axis price]

ggplot: Scatter Plot with Local Regression Curve (loess)
    p <- ggplot(dsmall, aes(carat, price)) + geom_point() + geom_smooth()
    print(p) # Setting 'se=FALSE' removes error shade
[Figure: scatter plot with loess curve and error shade; x-axis carat, y-axis price]

ggplot: Line Plot
    p <- ggplot(iris, aes(Petal.Length, Petal.Width, group=Species, color=Species)) + geom_line()
    print(p)
[Figure: line plot colored by Species; x-axis Petal.Length, y-axis Petal.Width]

ggplot: Faceting
    p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
         geom_line(aes(color=Species), size=1) +
         facet_wrap(~Species, ncol=1)
    print(p)
[Figure: one line plot panel per species; x-axis Sepal.Length, y-axis Sepal.Width]

Exercise 3: Scatter Plots
Task 1 Generate a scatter plot for the first two columns in the iris data frame and color the dots by its Species column.
Task 2 Use the xlim and ylim functions to set limits on the x- and y-axes so that all data points are restricted to the left bottom quadrant of the plot.
Task 3 Generate the corresponding line plot with faceting to show the individual data sets in separate plots.
Structure of iris data set:
    class(iris)
    [1] "data.frame"
    iris[1:4,]
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    table(iris$Species)
        setosa versicolor  virginica
            50         50         50

ggplot: Bar Plots
Sample Set: the following transforms the iris data set into a ggplot2-friendly format.
Calculate mean values for aggregates given by Species column in iris data set
    iris_mean <- aggregate(iris[,1:4], by=list(Species=iris$Species), FUN=mean)
Calculate standard deviations for aggregates given by Species column in iris data set
    iris_sd <- aggregate(iris[,1:4], by=list(Species=iris$Species), FUN=sd)
Convert iris_mean with melt
    library(reshape2) # Defines melt function
    df_mean <- melt(iris_mean, id.vars=c("Species"), variable.name="Samples", value.name="Values")
Convert iris_sd with melt
    df_sd <- melt(iris_sd, id.vars=c("Species"), variable.name="Samples", value.name="Values")
Define standard deviation limits
    # The ends of the melt() and aes() calls are truncated in the source slide;
    # they are completed here from the column names used on the following slides.
    limits <- aes(ymax=df_mean[,"Values"] + df_sd[,"Values"], ymin=df_mean[,"Values"] - df_sd[,"Values"])
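To make the target format concrete, the sketch below builds the same long table (Species, Samples, Values) with base R only. This is a swapped-in alternative to melt() for illustration, not the method used on the slide, and it assumes the column order produced by aggregate().

```r
iris_mean <- aggregate(iris[, 1:4], by = list(Species = iris$Species), FUN = mean)
measures  <- names(iris_mean)[-1]                 # the four measurement columns

# Stack the measurement columns on top of each other: 3 species x 4 columns = 12 rows
df_mean <- data.frame(
  Species = rep(iris_mean$Species, times = length(measures)),
  Samples = factor(rep(measures, each = nrow(iris_mean)), levels = measures),
  Values  = unlist(iris_mean[, measures], use.names = FALSE)
)
```

Each row of the long table now describes one bar: which species, which measurement and the mean value.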

ggplot: Bar Plot
    p <- ggplot(df_mean, aes(Samples, Values, fill=Species)) +
         geom_bar(position="dodge", stat="identity")
    print(p)
[Figure: grouped bar plot of mean values per species; x-axis Samples, y-axis Values]

ggplot: Bar Plot Sideways
    p <- ggplot(df_mean, aes(Samples, Values, fill=Species)) +
         geom_bar(position="dodge", stat="identity") + coord_flip() +
         theme(axis.text.y=element_text(angle=0, hjust=1)) # element_text() replaces the retired theme_text()
    print(p)
[Figure: horizontal grouped bar plot; axes Samples and Values]

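ggplot: Bar Plot with Faceting
    # The stat argument is truncated in the source slide; "identity" is assumed
    # from the neighboring bar plot examples.
    p <- ggplot(df_mean, aes(Samples, Values)) +
         geom_bar(aes(fill=Species), stat="identity") +
         facet_wrap(~Species, ncol=1)
    print(p)
[Figure: one bar plot panel per species; x-axis Samples (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)]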

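ggplot: Bar Plot with Error Bars
    # The geom_errorbar() call is truncated in the source slide; position="dodge"
    # is assumed from the following slides.
    p <- ggplot(df_mean, aes(Samples, Values, fill=Species)) +
         geom_bar(position="dodge", stat="identity") +
         geom_errorbar(limits, position="dodge")
    print(p)
[Figure: grouped bar plot with error bars; x-axis Samples, y-axis Values]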

ggplot: Changing Color Settings
    library(RColorBrewer)
    # display.brewer.all()
    # The palette name in scale_color_brewer() is truncated in the source slide;
    # "Greys" is an assumed value.
    p <- ggplot(df_mean, aes(Samples, Values, fill=Species, color=Species)) +
         geom_bar(position="dodge", stat="identity") +
         geom_errorbar(limits, position="dodge") +
         scale_fill_brewer(palette="Blues") + scale_color_brewer(palette="Greys")
    print(p)
[Figure: grouped bar plot with error bars using Brewer palettes; x-axis Samples]

ggplot: Using Standard Colors
    p <- ggplot(df_mean, aes(Samples, Values, fill=Species, color=Species)) +
         geom_bar(position="dodge", stat="identity") +
         geom_errorbar(limits, position="dodge") +
         scale_fill_manual(values=c("red", "green3", "blue")) +
         scale_color_manual(values=c("red", "green3", "blue"))
    print(p)
[Figure: grouped bar plot with error bars in red, green3 and blue; x-axis Samples, y-axis Values]

ggplot: Mirrored Bar Plots
    # The y argument is truncated in the source slide; the second runif() range
    # (-1 to 0) is assumed from the mirrored figure.
    df <- data.frame(group=rep(c("Above", "Below"), each=10), x=rep(1:10, 2),
                     y=c(runif(10, 0, 1), runif(10, -1, 0)))
    p <- ggplot(df, aes(x=x, y=y, fill=group)) +
         geom_bar(stat="identity", position="identity")
    print(p)
[Figure: mirrored bar plot with groups Above and Below; x-axis x, y-axis y]

Exercise 4: Bar Plots
Task 1 Calculate the mean values for the Species components of the first four columns in the iris data set. Use the melt function from the reshape2 package to bring the results into the expected format for ggplot.
Task 2 Generate two bar plots: one with stacked bars and one with horizontally arranged bars.
Structure of iris data set:
    class(iris)
    [1] "data.frame"
    iris[1:4,]
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    table(iris$Species)
        setosa versicolor  virginica
            50         50         50

ggplot: Data Reformatting Example for Line Plot
    y <- matrix(rnorm(500), 100, 5, dimnames=list(paste("g", 1:100, sep=""), paste("Sample", 1:5, sep="")))
    y <- data.frame(Position=1:length(y[,1]), y)
    y[1:4, ] # First rows of input format expected by melt()
       Position    Sample1    Sample2     Sample3     Sample4    Sample5
    g1        1  1.0002088  0.6850199 -0.21324932  1.27195056  1.0479301
    g2        2 -1.2024596 -1.5004962 -0.01111579  0.07584497 -0.7100662
    g3        3  0.1023678 -0.5153367  0.28564390  1.41522878  1.1084695
    g4        4  1.3294248 -1.2084007 -0.19581898 -0.42361768  1.7139697
    df <- melt(y, id.vars=c("Position"), variable.name="Samples", value.name="Values")
    p <- ggplot(df, aes(Position, Values)) + geom_line(aes(color=Samples)) + facet_wrap(~Samples, ncol=1)
    print(p)
    ## Represent same data in box plot
    ## ggplot(df, aes(Samples, Values, fill=Samples)) + geom_boxplot()
[Figure: one line plot panel per sample (Sample1 to Sample5); y-axis Values]

ggplot: Jitter Plots
    p <- ggplot(dsmall, aes(color, price/carat)) +
         geom_jitter(alpha=I(1/2), aes(color=color))
    print(p)
[Figure: jitter plot of price/carat by diamond color; x-axis color]

ggplot: Box Plots
    p <- ggplot(dsmall, aes(color, price/carat, fill=color)) + geom_boxplot()
    print(p)
[Figure: box plots of price/carat by diamond color (D to J); x-axis color, y-axis price/carat]
