Objective

Learn how to draw box plots in R.

Background

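A box plot (also called a box-and-whisker plot) describes the distribution of a continuous variable by drawing its five-number summary: the minimum, the lower quartile (25th percentile), the median (50th percentile), the upper quartile (75th percentile), and the maximum. A box plot can also flag observations that may be outliers, that is, values more than 1.5*IQR below the lower quartile or more than 1.5*IQR above the upper quartile, where IQR is the interquartile range (the upper quartile minus the lower quartile).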
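
As a minimal sketch of the definitions above, the following computes the five-number summary and the 1.5*IQR fences for a small made-up vector. The data and variable names are illustrative assumptions, not part of the lab. On small samples the quartiles from quantile() and the hinges used by fivenum() and boxplot.stats() can differ slightly.

```r
# Illustrative data (not from the lab): nine ordinary values and one extreme value
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)

# Quartiles and interquartile range as computed by quantile()
q1 <- quantile(x, 0.25, names = FALSE)
q3 <- quantile(x, 0.75, names = FALSE)
iqr <- q3 - q1

# Observations outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are flagged as possible outliers
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr
outliers <- x[x < lower_fence | x > upper_fence]

# boxplot.stats() applies the same 1.5 rule, based on the hinges
outliers_bp <- boxplot.stats(x)$out

print(five)
print(outliers)
```

Both approaches flag the same single value here, but they need not agree in general because the hinges and the quantile() quartiles are computed differently.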

Procedure

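The example below plots airline passenger numbers for each year from 1949 to 1960. The data come from AirPassengers, a dataset that ships with R and records monthly totals of international airline passengers (in thousands). The code uses AirPassengers_modified, a data frame version of that series with the columns year and x. The x-axis shows the year and the y-axis shows the passenger count for each month of that year.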

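> library(ggplot2)
> # Assumed construction: the source does not show how AirPassengers_modified was built
> AirPassengers_modified <- data.frame(year = rep(1949:1960, each = 12),
+                                      x = as.numeric(AirPassengers))
> ggplot(data = AirPassengers_modified, aes(group = year, x = year, y = x)) +
+   geom_point(aes(color = factor(year)), alpha = 0.2, position = "jitter") +
+   geom_boxplot(outlier.size = 0, alpha = 0.1) + guides(colour = "none") +
+   labs(title = "Box plot of passengers per year", x = "Year")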

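The jittered scatter plot is drawn beneath the box plot. Each box contains the middle 50% of the points, and the line inside the box marks the median. The upper and lower edges of the box mark the third and first quartiles (Q3 and Q1), respectively. The vertical line through the box (the whiskers) extends up to the largest observation within 1.5*IQR (IQR = Q3 - Q1) of the upper edge and down to the smallest observation within 1.5*IQR of the lower edge. Points beyond the whiskers are generally treated as outliers.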
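
To make the box and whisker positions concrete, the sketch below computes them per year with base R's boxplot.stats(). The variable names are illustrative. It also assumes the hinge-based definition that boxplot.stats() uses, which can differ slightly from the quartiles geom_boxplot() draws.

```r
# Group the monthly counts by year (12 observations per year, 1949 to 1960)
counts <- as.numeric(AirPassengers)
year <- rep(1949:1960, each = 12)
by_year <- split(counts, year)

# boxplot.stats()$stats returns: lower whisker, lower hinge, median,
# upper hinge, upper whisker
box_stats <- sapply(by_year, function(v) boxplot.stats(v)$stats)
rownames(box_stats) <- c("whisker_low", "q1", "median", "q3", "whisker_high")

# Fraction of each year's points that fall inside the box (between the hinges)
inside_box <- sapply(by_year, function(v) {
  s <- boxplot.stats(v)$stats
  mean(v >= s[2] & v <= s[4])
})

print(round(box_stats))
print(inside_box)
```

Each column of box_stats describes one year's box, and inside_box shows that at least half of every year's observations lie between the hinges.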

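The next example draws a box plot with the boxplot() function and overlays the mean and standard deviation of each group: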

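> # Draw the box plot and keep its return value; rb$n is the number of observations per group
> rb <- boxplot(decrease ~ treatment, data = OrchardSprays, col = "bisque")
> # Mean and standard deviation of each treatment
> mn.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, mean)
> sd.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, sd)
> # Place the markers 0.3 to the right of each box
> xi <- 0.3 + seq(rb$n)
> points(xi, mn.t, col = "orange", pch = 18)
> arrows(xi, mn.t - sd.t, xi, mn.t + sd.t,
+        code = 3, col = "pink", angle = 75, length = .1)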
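
The overlay above depends on the list that boxplot() returns. As a hedged sketch, the components of that list can be inspected without drawing anything by passing plot = FALSE. The name x_positions is illustrative.

```r
# plot = FALSE returns the statistics without opening a graphics device
rb <- boxplot(decrease ~ treatment, data = OrchardSprays, plot = FALSE)

print(rb$names)  # group labels (treatment levels)
print(rb$n)      # number of observations in each group
print(rb$stats)  # one column per group: whisker, hinge, median, hinge, whisker
print(rb$out)    # values that would be drawn as individual outlier points
print(rb$group)  # index of the group each outlier belongs to

# seq(rb$n) yields 1, 2, ..., number of groups: the x positions of the boxes
x_positions <- seq(rb$n)
print(x_positions)
```

Adding 0.3 to these positions is what places the mean markers and the SD arrows just to the right of each box.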

Another example, with one box for each combination of dose and supplement:

> boxplot(len ~ dose:supp, data = ToothGrowth,
+         boxwex = 0.5, col = c("orange", "yellow"),
+         main = "Guinea Pigs' Tooth Growth",
+         xlab = "Vitamin C dose mg", ylab = "tooth length",
+         sep = ":", lex.order = TRUE, ylim = c(0, 35), yaxs = "i")
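
The formula len ~ dose:supp groups the data by the interaction of the two factors. The sketch below, with illustrative variable names, lists those groups in the order implied by lex.order = TRUE and shows how the two colours are recycled across the six boxes.

```r
# Levels of the dose:supp interaction, ordered with the first factor varying slowest
groups <- interaction(ToothGrowth$dose, ToothGrowth$supp,
                      sep = ":", lex.order = TRUE)
print(levels(groups))

# Number of observations in each group
print(table(groups))

# Colours are recycled across the six boxes, so OJ is orange and VC is yellow
box_colours <- rep(c("orange", "yellow"), length.out = nlevels(groups))
names(box_colours) <- levels(groups)
print(box_colours)
```

Because supp alternates fastest, a two-colour vector is enough to colour the boxes by supplement.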

Draw the box plots for the two supplements side by side, then add a title and a legend:

> boxplot(len ~ dose, data = ToothGrowth,
+         boxwex = 0.25, at = 1:3 - 0.2,
+         subset = supp == "VC", col = "yellow",
+         main = "Guinea Pigs' Tooth Growth",
+         xlab = "Vitamin C dose mg",
+         ylab = "tooth length",
+         xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = "i")
> boxplot(len ~ dose, data = ToothGrowth, add = TRUE,
+         boxwex = 0.25, at = 1:3 + 0.2,
+         subset = supp == "OJ", col = "orange")
> legend(2, 9, c("Ascorbic acid", "Orange juice"),
+        fill = c("yellow", "orange"))
