We will discuss some of the commonly used Base R graphics functions:

- **plot**: line chart and scatter plot
- **boxplot**: box-and-whisker plot for a continuous variable, or for its distribution across groups
- **hist**: histogram

## Scatter Plot

We will create a few sample data points and use them for the scatter plot.

```r
# Our first plot
par(mfrow = c(2, 2))   # 2 x 2 grid of panels; this plot fills the first one

x <- c(1, 2, 3, 4, 5)
y <- c(1, 5, 3, 2, 0)

plot(x, y,
     pch = 20,
     col = "red",
     main = "Scatter Plot",
     xlab = "X Variable",
     ylab = "Y Variable")
```

The x and y coordinates are supplied through the vectors x and y. The remaining arguments are:

- **pch**: plotting character; 20 indicates a small filled dot
- **col**: color of the plotting character (red in the example above)
- **main**: chart title
- **xlab**: X axis label
- **ylab**: Y axis label
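To see what other values of **pch** look like, the following sketch draws the symbols numbered 0 to 25 on a small grid. The variable names and the grid layout are made up for illustration. The `bg` argument is included because symbols 21 to 25 take a separate fill color.

```r
# Illustrative sketch: show plotting characters 0 to 25 in a 2 x 13 grid.
pch_values <- 0:25
grid_x <- rep(1:13, times = 2)       # column position of each symbol
grid_y <- rep(c(2, 1), each = 13)    # row position (top row first)

plot(grid_x, grid_y,
     pch = pch_values,
     col = "red",                    # outline / symbol color
     bg = "yellow",                  # fill color, used only by pch 21 to 25
     cex = 2,
     xlim = c(0.5, 13.5), ylim = c(0.5, 2.5),
     axes = FALSE, xlab = "", ylab = "",
     main = "pch values 0 to 25")

# Label each symbol with its pch number just below it
text(grid_x, grid_y - 0.3, labels = pch_values, cex = 0.8)
```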

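Scenario: We want to explore the relationship between parent height and children height. A scatter plot is a useful way to visualise the relationship between two continuous variables. We have a data frame "**galton**" which contains mid-parent and child heights. This data frame is available in the **psych** package, so we have to install and load the package to use it. Note that recent releases of psych appear to have moved their data sets, including galton, to the companion **psychTools** package; if the data frame is not found, install psychTools and load the data from there.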

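```r
install.packages("psych")          # one-time installation
library(psych)
library(help = psych)              # list the contents of the package
data(galton, package = "psych")
# If galton is not found (newer psych releases), try instead:
# data(galton, package = "psychTools")
```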

Now we can use the **plot** function to look at the relationship. It may be appropriate to fit a linear relationship between parent and children heights. For a linear relationship we need to estimate an intercept and a slope, and the **lm** (linear model) function estimates both.

The output of the **lm** function is passed to the **abline** function, which draws the fitted straight line on top of the existing plot.

**lty** selects the line type and **lwd** sets the line width.

```r
par(mfrow = c(1, 2))   # two panels side by side

# Add elements to the graph
plot(galton$parent, galton$child,
     xlab = "Height of Parent",
     ylab = "Height of Children",
     main = "Relationship between Parent and Children Heights")

# Change the plotting character
plot(galton$parent, galton$child,
     xlab = "Height of Parent",
     ylab = "Height of Children",
     main = "Relationship between Parent and Children Heights",
     pch = "a",
     col = "blue")

# Fit a line between X and Y (height of parent and children);
# abline draws on the most recent plot, here the second panel
abline(lm(child ~ parent, data = galton),
       col = "green",
       lwd = 5,
       lty = 6)
```
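To make the intercept and slope visible as numbers, here is a minimal sketch that does not depend on the psych package. It uses simulated heights; the data, the coefficients 24 and 0.65, and the variable names are all made up for illustration and are not the galton values. It shows that `coef()` returns the two estimates and that `abline()` accepts either the fitted model or an explicit intercept and slope.

```r
# Simulated heights in inches (hypothetical data, not the galton data frame)
set.seed(42)
parent <- rnorm(200, mean = 68, sd = 1.8)
child  <- 24 + 0.65 * parent + rnorm(200, mean = 0, sd = 2.2)

# Estimate intercept and slope of child = a + b * parent
fit   <- lm(child ~ parent)
coefs <- coef(fit)     # named vector: "(Intercept)" and "parent"
print(coefs)

plot(parent, child, pch = 20, col = "blue",
     xlab = "Height of Parent", ylab = "Height of Children",
     main = "Simulated heights with fitted line")

# Two equivalent ways to draw the fitted line
abline(fit, col = "green", lwd = 2)
abline(a = coefs[["(Intercept)"]], b = coefs[["parent"]],
       col = "red", lty = 2)
```

The two lines overlap exactly, which is a quick way to confirm that `abline(fit)` simply reads the intercept and slope from the model.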

## Line Plot or Time Series Plot

For a time series or line plot we can again use the plot function. Its **type** argument selects how the points are connected, for example "l" for lines, "b" for both points and lines, and "s" for stair steps.

Scenario: We have temperature data for a place across several years and we want to see the pattern of temperature across months and years.

The **nottem** time series is available in the "**datasets**" package. This package is part of the standard R installation and is loaded by default, so nothing needs to be installed.

```r
# Time series plot or line chart
# Scenario: how the average monthly temperature changes across years
# nottem: average monthly temperatures at Nottingham, 1920-1939
library(help = "datasets")          # list the contents of the package
data(nottem, package = "datasets")

# Add elements
# type = "s" draws stair steps; use type = "l" for a plain line.
# pch is omitted because no point symbols are drawn with type = "s".
plot(nottem,
     xlab = "Years",
     ylab = "Avg Monthly Temp",
     main = "Temp across years",
     col = "blue",
     type = "s")
```
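The effect of the **type** argument is easiest to judge side by side. The sketch below draws the same nottem series four times in a 2 x 2 grid, once for each of four common type codes. The loop and the names `line_types` and `type_code` are illustrative only.

```r
# Compare four values of the type argument on the same series
line_types <- c(l = "lines",
                b = "points and lines",
                s = "stair steps",
                h = "vertical bars")

old_par <- par(mfrow = c(2, 2))     # 2 x 2 grid; keep the old settings
for (type_code in names(line_types)) {
  plot(nottem,
       type = type_code,
       col = "blue",
       xlab = "Years",
       ylab = "Avg Monthly Temp",
       main = paste0("type = \"", type_code, "\" (",
                     line_types[[type_code]], ")"))
}
par(old_par)                        # restore the previous layout
```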

## Histogram

When we want to see the distribution of a continuous variable, we create equal-width bins and count the number of observations in each bin. Plotting the count or proportion for each bin gives the histogram.

**Scenario**: We have a list of customers and their ages. We want to see the distribution of age instead of just looking at summary statistics. R has the **hist** function to draw a histogram of the age variable.

```r
# Generate Age data
# Generate a numeric vector for Age
Age <- as.integer(rnorm(10000, mean = 55, sd = 15))

# Plot histogram
hist(Age)
```

In the above R code, we created a series of values for Age using normally distributed random numbers with mean 55 and standard deviation 15, truncated to whole numbers with as.integer.

Once we had the vector Age, we used the **hist** function to draw the histogram.
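The binning described above can be reproduced by hand. The following sketch asks `hist` for its bins without drawing anything (`plot = FALSE`), then counts the observations per bin with `cut` and `table` and compares the two results. The seed and the names `h`, `bins` and `manual_counts` are assumptions made for this example.

```r
set.seed(123)                       # arbitrary seed, for reproducibility only
Age <- as.integer(rnorm(10000, mean = 55, sd = 15))

# Let hist choose the bins but skip the drawing
h <- hist(Age, plot = FALSE)

# Assign each observation to a bin using the same break points.
# hist uses right-closed intervals and includes the lowest break by default.
bins <- cut(Age, breaks = h$breaks, right = TRUE, include.lowest = TRUE)
manual_counts <- as.vector(table(bins))

# Bin labels next to their counts
print(data.frame(bin = levels(bins), count = manual_counts))

# Compare the hand-made counts with the ones hist computed
print(all(manual_counts == h$counts))
```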

We can use some additional arguments of the hist function to make the histogram look better and improve its readability.

```r
# Add elements or beautify the histogram
hist(Age,
     breaks = 30,
     col = "green",
     border = "red",
     xlab = "Age",
     ylab = "Counts",
     main = "Histogram: Age")
```

The arguments of the hist function are similar to those of the plot function. **breaks** suggests the number of bins, **col** sets the fill color of the bars, **border** sets the color of the bar borders, **xlab** and **ylab** label the X and Y axes, and **main** gives the chart title.

To add a density curve, we first turn the histogram into a probability histogram by setting the argument **freq** to FALSE, so that the bars and the curve share the same density scale.

```r
hist(Age,
     xlim = c(-10, 150),
     breaks = 30,
     col = "red",
     border = "orange",
     xlab = "Age",
     ylab = "Density",
     main = "Histogram: Age Density",
     freq = FALSE)

# density computes kernel density estimates
lines(density(Age, na.rm = TRUE), col = "orange", lwd = 2)
```

The **density** function estimates the density values and the **lines** function adds the resulting curve to the existing histogram.
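Why **freq** has to be FALSE can be checked numerically: on the density scale the bar areas add up to 1, which is the same scale the kernel density curve lives on. The sketch below is illustrative; the seed, the variable names and the trapezoid approximation of the area under the curve are choices made for this example.

```r
set.seed(123)                       # arbitrary seed, for reproducibility only
Age <- as.integer(rnorm(10000, mean = 55, sd = 15))

# Histogram bins on the density scale, without drawing
h <- hist(Age, breaks = 30, plot = FALSE)

# Total area of the bars: height (density) times width of each bin
bar_area <- sum(h$density * diff(h$breaks))

# Approximate area under the kernel density curve (trapezoid rule)
d <- density(Age, na.rm = TRUE)
curve_area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

print(bar_area)      # bar areas on the density scale
print(curve_area)    # close to, but not exactly, the same total
```

With `freq = TRUE` the bar heights would be counts in the thousands, and a curve with values well below 1 would be squashed flat against the X axis.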
