# Multiple Histograms in R

A histogram is one of the most important visualizations for univariate analysis. The data vector, or the data frame column, must be numeric to plot a histogram, in R or in any other tool.

Related: a tutorial on plotting a histogram in R, which focuses on using base R graphics functions to plot a good-looking histogram.

Related: a step-by-step tutorial on histograms in R using ggplot2, in which the ggplot2 package and its functions are used for plotting. It also illustrates how to plot a density curve and how to overlay a histogram with a density curve.

### Basics of Histogram in R

The basic syntax for plotting a histogram is simple:

*   `hist` : Function to plot a histogram
*   `x` : Input vector; must be numeric
*   `breaks` : Input for deciding the number of bars or breaks in the histogram
*   `xlab` : Label for the X axis
*   `ylab` : Label for the Y axis
*   `main` : Chart title
*   `xlim` : Range of X values
*   `ylim` : Range of Y values

The `hist` function accepts many other parameters as well, such as `col` and `border` used below.

```r
# Create a numeric vector of data points
df.hist <- rnorm(1000, mean = 70, sd = 20)

hist(x = df.hist,
     breaks = 10,
     xlab = "Input Values/Groups",
     ylab = "Count",
     main = "Histogram in R",
     col = "blue",
     border = "white")
```
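The `xlim` and `ylim` arguments listed above are not used in the example, and it may help to see what `breaks` actually does. The sketch below (the names `x` and `h` are illustrative, not from the original) computes the histogram without drawing it, inspects the bins R chose, and then draws it with explicit axis ranges. Note that R treats a single number passed to `breaks` as a suggestion, so the actual number of bars may differ.

```r
set.seed(42)                        # make the random sample reproducible
x <- rnorm(1000, mean = 70, sd = 20)

# plot = FALSE returns the histogram object without drawing anything
h <- hist(x, breaks = 10, plot = FALSE)
h$breaks                            # bin boundaries R actually chose
length(h$counts)                    # number of bars (may differ from 10)

# Draw the same histogram with explicit axis ranges
hist(x,
     breaks = 10,
     xlim = range(h$breaks),
     ylim = c(0, max(h$counts) * 1.1),
     xlab = "Input Values/Groups",
     ylab = "Count",
     main = "Histogram in R",
     col = "blue",
     border = "white")
```

Inspecting `h$breaks` first is a convenient way to pick sensible `xlim` values instead of guessing them.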

### Read csv file and create a data frame

Now let us take an example and create a histogram in R for every numeric variable in an input csv file.

In a real-life scenario, an input data frame or csv file may have hundreds of numeric variables, and we want to create a histogram for each variable and store it in a particular folder. The function we build will:

*   Take the inputs: a data frame, the folder path where the histograms are to be stored, and the maximum number of breaks
*   Find the list and count of numeric columns
*   Loop over each numeric variable
*   Start the plotting device
*   Plot the histogram
*   Close the plotting device

```r
invoiceDF <- read.csv(file = "C:\\DnI\\DnI Institute\\Blog\\R\\histogram\\invoiceData1.csv",
                      header = TRUE)

# View of initial 6 rows
head(invoiceDF)
##   custNo InvoiceAmount2  AnnualSale InvoiceAmountTotal InvoiceAmount1
## 1      1      -11783.55 14210860132         -1583608.8    1673175.330
## 2      2       42248.11  3446817261          -584130.6    -909903.952
## 3      3      -20737.96  4360346859           470542.5    -159253.953
## 4      4       62529.48  1609197951         -1527911.7     955919.797
## 5      5      -33672.69 -4673802032           963853.6       8520.159
## 6      6      -21573.67  -372227640           260563.4     319893.629

# Type of variables
sapply(invoiceDF, class)
##             custNo     InvoiceAmount2         AnnualSale
##          "integer"          "numeric"          "numeric"
## InvoiceAmountTotal     InvoiceAmount1
##          "numeric"          "numeric"
```

In this example, the input is a csv file with 5 columns (variables): custNo, AnnualSale, InvoiceAmountTotal, InvoiceAmount1 and InvoiceAmount2.

Four of the variables are numeric, and custNo is an integer. We want to plot a histogram for each numeric variable (excluding custNo).

First, we show the manual steps for producing the histograms in R.

```r
# Get the type of each column in the input data frame
col.type <- data.frame("varType" = sapply(invoiceDF, class))
# Convert row names (variable names) to a data frame column
col.type$varName <- rownames(col.type)
# Reset the row names of the data frame
rownames(col.type) <- NULL

noBreaks <- 20

nCol <- nrow(col.type)
for (i in seq_len(nCol)) {
  # Integer columns such as custNo have class "integer" and are skipped
  if (col.type[i, 1] == "numeric") {
    hist(invoiceDF[, col.type[i, 2]],
         breaks = noBreaks,
         main = paste("Histogram for", col.type[i, 2]),
         xlab = paste("Breaks :", col.type[i, 2]),
         ylab = "Count",
         col = "green",
         border = "red")
  }
}
```
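The column-type lookup above can also be written more compactly. The following is a minimal sketch, assuming a small made-up data frame `demoDF` (not the invoice data) so that it runs on its own. The test `is.numeric(x) && !is.integer(x)` is one way to skip integer columns such as `custNo`, which is the effect the `class(x) == "numeric"` check has in the loop.

```r
# Hypothetical stand-in for invoiceDF, for illustration only
set.seed(1)
demoDF <- data.frame(
  custNo         = 1:50,                          # integer id column
  InvoiceAmount1 = rnorm(50, mean = 0, sd = 1e6),
  AnnualSale     = rnorm(50, mean = 0, sd = 1e9),
  region         = sample(c("N", "S"), 50, replace = TRUE)
)

# TRUE for non-integer numeric columns, FALSE for everything else
isNum   <- vapply(demoDF, function(x) is.numeric(x) && !is.integer(x), logical(1))
numVars <- names(demoDF)[isNum]
numVars

# One histogram per selected column
for (v in numVars) {
  hist(demoDF[[v]],
       breaks = 20,
       main = paste("Histogram for", v),
       xlab = v,
       ylab = "Count",
       col = "green",
       border = "red")
}
```

Using `vapply` with a logical test avoids building the intermediate `col.type` data frame, and it keeps working when a column has more than one class.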

These histograms are displayed in the plots window in R. If the plots have to be saved to a folder, we need to save them programmatically. Graphs can be saved as PDF as well.

R graphics devices can store a plot in any of the BMP, JPEG, PNG and TIFF formats. The R functions for these are `bmp`, `jpeg`, `png` and `tiff`.

```r
# Path and name of the image to be created
png(filename = "C:\\DnI\\DnI Institute\\Blog\\R\\histogram\\hist.png")
# Histogram image
hist(x = df.hist,
     breaks = 10,
     xlab = "Input Values/Groups",
     ylab = "Count",
     main = "Histogram in R",
     col = "blue",
     border = "white")
# Close the graphics device
dev.off()
```

Using the `list.files` function, one can list the files in a folder/directory.

```r
list.files(path = "C:\\DnI\\DnI Institute\\Blog\\R\\histogram")
## [1] "AnnualSale.png"         "hist.png"
## [3] "InvoiceAmount1.png"     "InvoiceAmount2.png"
## [5] "InvoiceAmountTotal.png" "invoiceData.csv"
## [7] "invoiceData.xlsx"       "invoiceData1.csv"
## [9] "ram.png"
```

You can see that the hist.png image has been created in the folder.
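As noted earlier, graphs can be saved as PDF too, and the device works the same way. Here is a minimal sketch, assuming a temporary folder (`tempdir()`) in place of the Windows path used in this post and a made-up file name `hist.pdf`. Unlike `png`, a single `pdf` device can hold several pages, one per plot.

```r
set.seed(42)
x <- rnorm(1000, mean = 70, sd = 20)

# Hypothetical output location; replace with your own folder
pdfFile <- file.path(tempdir(), "hist.pdf")

pdf(file = pdfFile, width = 7, height = 5)   # open the PDF device
hist(x, breaks = 10, main = "10 breaks", col = "blue", border = "white")
hist(x, breaks = 30, main = "30 breaks", col = "blue", border = "white")  # second page
dev.off()                                    # close the device to finish the file

file.exists(pdfFile)
```

A multi-page PDF can be handy when there are many variables, since all the histograms end up in one file instead of one image per variable.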

The important components have now been illustrated. Below is a function that creates a histogram for each numeric variable in the input data frame.

```r
NumVarHist <- function(dataF, noBins, fpath) {

  # Get the type of each column in the input data frame
  col.type <- data.frame("varType" = sapply(dataF, class))
  # Convert row names (variable names) to a data frame column
  col.type$varName <- rownames(col.type)
  # Reset the row names of the data frame
  rownames(col.type) <- NULL

  nCol <- nrow(col.type)
  for (i in seq_len(nCol)) {
    if (col.type[i, 1] == "numeric") {
      # Path and name of the image
      fileN <- paste(col.type[i, 2], "png", sep = ".")
      png(filename = file.path(fpath, fileN))
      hist(dataF[, col.type[i, 2]],
           breaks = noBins,
           main = paste("Histogram for", col.type[i, 2]),
           xlab = paste("Breaks :", col.type[i, 2]),
           ylab = "Count",
           col = "green",
           border = "red")
      # Close the graphics device
      dev.off()
    }
  }
}

# Call function
NumVarHist(invoiceDF, 15, "C:\\DnI\\DnI Institute\\Blog\\R\\histogram")

# List of histogram files created
list.files(path = "C:\\DnI\\DnI Institute\\Blog\\R\\histogram",
           pattern = "\\.png$")
## [1] "AnnualSale.png"         "hist.png"
## [3] "InvoiceAmount1.png"     "InvoiceAmount2.png"
## [5] "InvoiceAmountTotal.png" "ram.png"
```

We could add one more argument to exclude a list of columns, even numeric ones. We could also add checks to handle incorrect inputs gracefully, e.g. when the data frame does not exist or the folder path is not available.
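One possible shape for such an extension is sketched below. The function name `NumVarHistSafe`, the `exclude` argument, and the demo data are all hypothetical, and a temporary folder stands in for the real output path. This is an illustration of the idea rather than a finished implementation.

```r
# Hypothetical variant of NumVarHist with an exclusion list and input checks
NumVarHistSafe <- function(dataF, noBins, fpath, exclude = character(0)) {
  # Fail early with a clear message on bad inputs
  if (!is.data.frame(dataF)) stop("dataF must be a data frame")
  if (!dir.exists(fpath)) stop("folder does not exist: ", fpath)

  # Non-integer numeric columns, minus any excluded names
  isNum <- vapply(dataF, function(x) is.numeric(x) && !is.integer(x), logical(1))
  vars  <- setdiff(names(dataF)[isNum], exclude)

  for (v in vars) {
    png(filename = file.path(fpath, paste0(v, ".png")))
    hist(dataF[[v]],
         breaks = noBins,
         main = paste("Histogram for", v),
         xlab = v,
         ylab = "Count",
         col = "green",
         border = "red")
    dev.off()
  }
  invisible(vars)   # names of the variables that were plotted
}

# Usage with made-up data and a temporary folder
set.seed(1)
demoDF <- data.frame(id = 1:30, a = rnorm(30), b = runif(30))
outDir <- file.path(tempdir(), "hist_demo")
dir.create(outDir, showWarnings = FALSE)
plotted <- NumVarHistSafe(demoDF, 15, outDir, exclude = "b")
list.files(outDir, pattern = "\\.png$")
```

Returning the plotted variable names invisibly makes it easy to confirm afterwards which columns were actually processed.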


### 2 thoughts on “Multiple Histograms in R”

1. Is there a way to add multiple series to the histogram plot?
For example, to get histograms for year 2015 and year 2016 in one plot?