Plotting data using pandas in python


In this article, we will learn how to plot data using the pandas library in Python.

Plotting data :

Plotting data is a graphical technique for representing a dataset, typically to show the relationship between dependent and independent variables. Plotting plays an important role in statistics and data analysis. There are many data plotting techniques, such as the scatter plot, line plot, bar plot, pie chart, and box plot.

Python provides many data visualization libraries, such as seaborn, Bokeh, and ggplot. Here we are going to plot the data using the pandas library.

Pandas :

With the pandas library, we can complete many data analysis tasks.


pandas is one of the most popular open-source data exploration libraries currently available. It gives its users the ability to explore, manipulate, query, aggregate, and visualize tabular data, meaning two-dimensional data with rows and columns. pandas is built for analyzing this tabular, rectangular, deceptively simple form of data. It can read many different data formats, and all of them are converted to tabular data.

So let’s move on to plotting a data frame using pandas. With pandas we can plot a data frame as any of the plot types listed below:

  • Scatter plot
  • Line plot
  • Bar plot
  • Histogram
  • Area plot
  • Horizontal bar plot
  • Box plot
  • Pie chart

We will use the same data set throughout the article.

The prerequisites are that the pandas and matplotlib libraries are installed on your computer. You can install both libraries using pip.
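As a quick check of these prerequisites, the sketch below reports which of the two libraries cannot be imported. The helper name `missing_packages` is hypothetical and only used for illustration; it relies on the standard-library function `importlib.util.find_spec`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["pandas", "matplotlib"]
missing = missing_packages(required)

if missing:
    # pip accepts several package names in one command
    print("Install first with: pip install " + " ".join(missing))
else:
    print("pandas and matplotlib are available")
```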

The data set we are using is about cars. It contains the types of cars and their prices.

Data:

Brand            Price
Honda Civic      22000
Toyota Corolla   25000
Ford Focus       27000
Audi A4          35000

After collecting the data, we have to import the libraries we are going to need.

Importing libraries :

import pandas as pd
import matplotlib.pyplot as plt

We plot our data frame using pandas, and to show the plot we use the matplotlib library.

Once the required libraries are imported, we have to create the data frame.

Creating the data frame :

cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
print(df)

output :

            Brand  Price
0     Honda Civic  22000
1  Toyota Corolla  25000
2      Ford Focus  27000
3         Audi A4  35000
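Before plotting, it can help to confirm the shape of the data frame and which column is numeric, since most plot kinds draw numeric columns. The following is a minimal sketch using the same data; the variable name `numeric_columns` is only illustrative.

```python
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# Number of rows and columns
print(df.shape)

# Data type of each column; 'Price' is the numeric one
print(df.dtypes)

# Names of the columns that hold numbers
numeric_columns = df.select_dtypes(include='number').columns.tolist()
print(numeric_columns)
```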

So far we have imported the libraries and created the data frame.

We are now ready to plot the data by using the proper syntax for each kind of plot.

  1. Scatter plot :

A scatter plot, or scatter graph, uses data points to represent the values of two variables on the vertical and horizontal axes. Through the scatter plot we can see whether the relationship between the independent and dependent variables is positive, negative, or absent.

To plot the scatter plot in pandas we are using the following syntax:

df.plot(x='Brand',y='Price',kind='scatter')
plt.show()

Here, kind is set to scatter to draw the scatter plot.

Putting everything together:

import pandas as pd
import matplotlib.pyplot as plt
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
df.plot(x='Brand',y='Price',kind='scatter')
plt.show()

Output :

[Figure: scatter plot of Price against Brand]
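Some pandas releases may reject a text column such as 'Brand' as the x value of a scatter plot. A hedged workaround is sketched below: it assumes a helper column named 'Position' (not part of the original data) that holds the row number, plots against that, and then labels the ticks with the brand names. The Agg backend is selected only so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# Hypothetical helper column: numeric stand-in for the brand
df['Position'] = range(len(df))

ax = df.plot(x='Position', y='Price', kind='scatter')

# Show the brand names instead of the numeric positions
ax.set_xticks(df['Position'].tolist())
ax.set_xticklabels(df['Brand'].tolist())
ax.set_xlabel('Brand')
```

In an interactive session, drop the `matplotlib.use("Agg")` line and call plt.show() to display the figure.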
  2. Line plot:

A line plot shows the information as a series of data points connected by straight line segments. It is used for observing trends in data.

To plot the line plot in pandas we are using the following syntax:

df.plot(x='Brand',y='Price',kind='line')

Here, kind is set to line for the line plot.

Putting everything together :

import pandas as pd
import matplotlib.pyplot as plt
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
df.plot(x='Brand',y='Price',kind='line')
plt.show()

Output :

[Figure: line plot of Price against Brand]
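With only four points, the individual values are easier to see when each one is marked. The sketch below passes `marker='o'` through to matplotlib and sets a title; the title text is just an example. The Agg backend is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# marker='o' draws a dot at every data point on the line
ax = df.plot(x='Brand', y='Price', kind='line', marker='o', title='Car prices')
ax.set_ylabel('Price')
```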
  3. Bar plot :

In a bar plot, the data is represented as bars. The dependent variable is on the y-axis.

To plot the bar plot in pandas we are using the following syntax:

df.plot(x='Brand',y='Price',kind='bar')

Here, kind is set to bar to represent the data as bars.

Putting everything together:

import pandas as pd
import matplotlib.pyplot as plt
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
df.plot(x='Brand',y='Price',kind='bar')
plt.show()

Output :

[Figure: bar plot of Price for each Brand]
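Bars are often easier to compare when they are sorted. As a small sketch, the data frame below is sorted by price before plotting, and the tick labels are rotated with the `rot` parameter so the brand names do not overlap. The choice of descending order and a 45 degree rotation is only an example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# Most expensive car first
sorted_df = df.sort_values('Price', ascending=False)

ax = sorted_df.plot(x='Brand', y='Price', kind='bar', rot=45, legend=False)
ax.set_ylabel('Price')

# Make room for the rotated tick labels
ax.get_figure().tight_layout()
```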
  4. Histogram :

A histogram represents the distribution of numerical data by grouping the values into bins and counting how many values fall into each bin.

To plot the histogram in pandas we are using the following syntax:

df.plot(x='Brand',y='Price',kind='hist')

Here, we set kind to hist to plot the histogram.

Putting everything together:

import pandas as pd
import matplotlib.pyplot as plt
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
df.plot(x='Brand',y='Price',kind='hist')
plt.show()

Output :

[Figure: histogram of Price]
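The number of bins controls how coarse the histogram is. The sketch below uses the `bins` parameter with a value of 4, which is an arbitrary choice for this tiny data set, and then reads the bar heights back to confirm that every price was counted once.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# Only the numeric column is needed for a histogram
ax = df.plot(y='Price', kind='hist', bins=4, legend=False)
ax.set_xlabel('Price')

# Each bar height is the number of cars in that price range
counts = [patch.get_height() for patch in ax.patches]
print(sum(counts))
```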
  5. Area plot :

The area plot is the same as the line plot, except that the area between the x-axis and the plotted line is filled. It helps us understand how one numerical value evolves along another variable.

To plot the area plot in pandas we are using the following syntax:

df.plot(x='Brand',y='Price',kind='area')

Here, we set kind to area for the area plot.

Putting everything together:

import pandas as pd
import matplotlib.pyplot as plt
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
df.plot(x='Brand',y='Price',kind='area')
plt.show()

Output :

[Figure: area plot of Price against Brand]
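The filled region can hide grid lines or other plot elements behind it. One option, sketched below, is to make the fill partly transparent with the `alpha` keyword; the value 0.4 is only an example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# alpha controls the transparency of the filled area (0 = invisible, 1 = solid)
ax = df.plot(x='Brand', y='Price', kind='area', alpha=0.4)
ax.set_ylabel('Price')
ax.grid(True)
```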
  6. Horizontal bar plot :

The horizontal bar plot is the same as the vertical bar plot, but the x-axis and the y-axis are switched.

To plot the horizontal bar plot in pandas we are using the following syntax:

df.plot( x='Brand', y='Price', kind='barh' )

Putting everything together:

import pandas as pd
import matplotlib.pyplot as plt
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
df.plot(x='Brand',y='Price',kind='barh')
plt.show()

Output :

[Figure: horizontal bar plot of Price for each Brand]
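In a horizontal bar plot the first row of the data frame is drawn at the bottom. If the table order should read from top to bottom instead, the y-axis can be inverted, as in this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

ax = df.plot(x='Brand', y='Price', kind='barh', legend=False)
ax.set_xlabel('Price')

# Put the first row ('Honda Civic') at the top instead of the bottom
ax.invert_yaxis()
```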
  7. Box plot :

A box plot is a graphical representation of numerical data based on its quartiles. It is commonly used in exploratory data analysis. In a box plot, each outlier is plotted as an individual point.

To plot the box plot in pandas we are using the following syntax:

df.plot(x='Brand',y='Price',kind='box')

Putting everything together:

import pandas as pd
import matplotlib.pyplot as plt
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
       'Price': [22000,25000,27000,35000]
        }
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
df.plot(x='Brand',y='Price',kind='box')
plt.show()

Output :

[Figure: box plot of Price]
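To see where the box and the outlier limits come from, the quartiles can be computed directly. The sketch below uses `Series.quantile` and the common rule of 1.5 times the interquartile range for the outlier fences; that rule is assumed here as the usual default rather than taken from the article.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# First quartile, median, and third quartile of the prices
q1, median, q3 = df['Price'].quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Assumed rule: points beyond 1.5 * IQR from the box count as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = df[(df['Price'] < lower_fence) | (df['Price'] > upper_fence)]

print(q1, median, q3)   # 24250.0 26000.0 29000.0
print(len(outliers))    # 0, so no individual points are expected on the plot

ax = df.plot(y='Price', kind='box')
```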
  8. Pie chart :

A pie chart, or circle chart, is a circular statistical graphic divided into slices whose sizes are proportional to the values in the data. It is used to show how much each value contributes to the whole.

To plot the pie chart in pandas we are using the following syntax:

df.plot.pie(y='Prices',figsize=(5, 5),autopct='%1.1f%%', startangle=90)
plt.show()

Putting everything together :

from pandas import DataFrame
import matplotlib.pyplot as plt
Data = {'Prices': [22000,25000,27000,35000]}
df = DataFrame(Data,columns=['Prices'],index = ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'])
df.plot.pie(y='Prices',figsize=(5, 5),autopct='%1.1f%%', startangle=90)
plt.show()

Output :

[Figure: pie chart of Prices with one slice per car]
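The percentages that `autopct` prints on the slices are each value's share of the total. The sketch below computes those shares directly and then draws the same pie chart; the variable name `shares` is only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

data = {'Prices': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(data, columns=['Prices'],
                  index=['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'])

# Share of each price in the total, as a percentage
shares = df['Prices'] / df['Prices'].sum() * 100
print(shares.round(1))

# The index values become the slice labels
ax = df.plot.pie(y='Prices', figsize=(5, 5), autopct='%1.1f%%',
                 startangle=90, legend=False)
ax.set_ylabel('')  # hide the column name that pandas puts on the y-axis
```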

So far, we have discussed all of these plotting techniques using the pandas Python library.
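Because most of these plots differ only in the value of kind, several of them can be drawn side by side for comparison. The following sketch loops over six of the kinds and passes each subplot to pandas through the `ax` parameter; the 2 by 3 grid and the figure size are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, runs without a display
import matplotlib.pyplot as plt
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

kinds = ['line', 'bar', 'barh', 'hist', 'area', 'box']

# One subplot per kind, arranged in 2 rows and 3 columns
fig, axes = plt.subplots(2, 3, figsize=(12, 7))
for ax, kind in zip(axes.flat, kinds):
    df.plot(x='Brand', y='Price', kind=kind, ax=ax, title=kind, legend=False)

fig.tight_layout()
```

Scatter and pie are left out of the loop because they take their arguments differently, as shown in their own sections.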
