How to plot a boxplot in R?

This recipe helps you plot a boxplot in R.

Recipe Objective

Box plots are also known as box-and-whisker plots. They are a type of plot that summarises numerical data in a compact manner. A box plot is more informative than a strip chart because it also displays the median, the interquartile range and, through its whiskers, the spread of the remaining data. It is mainly used to find outliers in a dataset.
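As a rough illustration of how a box plot flags outliers, the sketch below applies the common 1.5 x IQR rule by hand in base R. By default, geom_boxplot() also draws points lying more than 1.5 times the interquartile range beyond the quartiles as outliers. The income values here are made up for illustration and are not taken from the recipe's dataset.

```r
# Hypothetical annual incomes (in 1000s), invented for this example
income <- c(15, 16, 17, 18, 19, 20, 21, 23, 25, 28, 30, 137)

# Quartiles and interquartile range
q <- quantile(income, probs = c(0.25, 0.5, 0.75))
iqr <- IQR(income)

# Fences: values outside them are treated as outliers
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

outliers <- income[income < lower_fence | income > upper_fence]
print(outliers)
```

Only the value far above the upper fence is reported; everything else would fall inside the box or the whiskers.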

In this recipe we use the ggplot2 package to plot the required box plot. ggplot2 is based on the book The Grammar of Graphics by Leland Wilkinson. The package offers flexible themes and plot specifications at a high level of abstraction. Its functions mainly take aesthetic mappings and geometric objects as arguments. Different types of geometric objects include:

  1. geom_point() - for plotting points
  2. geom_bar() - for plotting bar graph
  3. geom_line() - for plotting line chart
  4. geom_boxplot() - for plotting boxplot

The basic syntax of ggplot2 plots is:

ggplot(data, mapping = aes(x = , y = )) + geometric object

where:

  1. data : the data frame used to plot the chart
  2. mapping = aes() : the aesthetic mapping, which controls the axes (x and y indicate the variables to plot)
  3. geometric object : the function for the type of plot you need to visualise, such as geom_boxplot()
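To make the template concrete, here is a minimal sketch that keeps the data and aesthetic mapping fixed and swaps only the geometric object. It uses the built-in mtcars dataset rather than the recipe's data, and the object names (base_plot, p_points, p_box) are chosen only for illustration.

```r
library(ggplot2)

# mtcars ships with R; cyl is converted to a factor so it acts as a grouping variable
base_plot <- ggplot(mtcars, mapping = aes(x = factor(cyl), y = mpg))

# Same data and mapping, different geometric objects
p_points <- base_plot + geom_point()
p_box <- base_plot + geom_boxplot()

# Draw the box plot version
print(p_box)
```

Storing the data and mapping once and adding a geom afterwards makes it easy to compare plot types for the same variables.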

This recipe demonstrates how to make a box plot using ggplot2.

STEP 1: Loading required library and dataset

Dataset description: The dataset contains basic data about customers visiting a supermarket mall. The variable we are interested in is the annual income (column Annual.Income..k..), which is recorded in thousands.

# Data manipulation package
library(tidyverse)

# ggplot for data visualisation
library(ggplot2)

# reading a dataset
customer_seg = read.csv('Mall_Customers.csv')

glimpse(customer_seg)

Rows: 200
Columns: 5
$ CustomerID              1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1...
$ Gender                  Male, Male, Female, Female, Female, Female, ...
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, ...
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, ...
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99,...
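The dotted column names in the output above come from read.csv(), which by default rewrites headers into syntactically valid R names. The sketch below shows this on a three-row inline sample. The raw header text ("Annual Income (k$)" and so on) is an assumption about what the CSV file contains, and the shorter name Annual.Income is a hypothetical choice, not something the recipe uses.

```r
# Inline sample standing in for the CSV file; the raw headers are assumed
csv_text <- 'CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)
1,Male,19,15,39
2,Male,21,15,81
3,Female,20,16,6'

# check.names = TRUE (the default) replaces spaces and symbols with dots
customers <- read.csv(text = csv_text)
sanitized_names <- names(customers)
print(sanitized_names)

# Optional: rename the income column to something shorter (hypothetical name)
names(customers)[names(customers) == "Annual.Income..k.."] <- "Annual.Income"
print(names(customers))
```

The rest of this recipe keeps the original Annual.Income..k.. name, so renaming is purely a matter of preference.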

STEP 2: Plotting a Box plot using ggplot

We use geom_boxplot() as the geometric object to plot a boxplot of the annual income variable, grouped by gender.

Note:

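  1. The + sign in the syntax above adds each component to the plot. Placing it at the end of a line also tells R that the expression continues on the next line, which keeps longer code readable.
  2. The fill aesthetic, set through aes() inside geom_boxplot(), fills the boxes with a colour based on a factor (here, Gender).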
  3. We also use labs() function to give a title to the graph

ggplot(customer_seg, aes(x = Gender, y = Annual.Income..k..)) +
  geom_boxplot(aes(fill = Gender)) +
  labs(title = "Annual Income Box Plot")
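To check the numbers behind each box, one option is ggplot2's layer_data(), which returns the statistics a layer computed. The sketch below runs the same plotting call on a small made-up data frame that mimics the recipe's column names, so it works without Mall_Customers.csv. The values are hypothetical and the results will differ from the real dataset.

```r
library(ggplot2)

# Hypothetical stand-in for customer_seg, using the same column names
customers <- data.frame(
  Gender = rep(c("Female", "Male"), each = 6),
  Annual.Income..k.. = c(16, 17, 20, 28, 33, 40, 15, 19, 25, 30, 46, 137)
)

p <- ggplot(customers, aes(x = Gender, y = Annual.Income..k..)) +
  geom_boxplot(aes(fill = Gender)) +
  labs(title = "Annual Income Box Plot")

# One row per box: whisker ends (ymin, ymax), quartiles (lower, upper), median (middle)
box_stats <- layer_data(p)
print(box_stats[, c("ymin", "lower", "middle", "upper", "ymax")])

# Points drawn individually as outliers, one list element per box
print(box_stats$outliers)
```

Comparing these values with the drawn plot is a quick way to confirm which points are being treated as outliers.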
