Complete Guide to Exploratory Data Analysis with Python Plotly

Anar Abiyev
4 min read · Mar 13, 2022

Introduction

Exploratory data analysis (EDA) is the key method for extracting useful insights from data. These insights play an important role in model selection, the choice of data preprocessing steps, the interpretation of results, and so on.

In this blog, I will dive into the details of EDA, using a car prediction dataset as the example.

You can also have a look at the project example built on the same dataset.

I will be using tools from the pandas and plotly libraries. At each step, I will point out which question I am analyzing and what information I extracted with the EDA tools.

I hope you enjoy it.

1. Data Loading and Cleaning

As this blog is focused on EDA, I will not explain the data cleaning part in detail. You can find it in this blog.

2. Overview of columns

The pandas library provides the “info” method, which shows the column names, the data types, and the count of non-null values.
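As a minimal sketch of this step, the snippet below calls “info” on a tiny made-up DataFrame. The column names and values are assumptions for illustration only, not the real car dataset. By default “info” prints straight to the console; passing a buffer simply lets us keep the text in a variable.

```python
import io

import pandas as pd

# Hypothetical stand-in for the car dataset; columns and values are made up.
df = pd.DataFrame({
    "brand": ["audi", "bmw", "toyota"],
    "horsepower": [102, 121, 62],
    "price": [13950.0, 16430.0, 5348.0],
})

# info() writes to stdout by default; buf redirects the report into a string.
buffer = io.StringIO()
df.info(buf=buffer)
info_text = buffer.getvalue()

# The report lists each column with its non-null count and dtype.
print(info_text)
```

In a notebook, calling df.info() on its own is usually enough; the buffer is only useful if you want to log or save the report.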

3. Missing values

First of all, our dataset doesn’t contain any missing values.

If you come across null values in your own dataset, the missingno library is ideal for visualizing them.
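Before reaching for a visualization, a quick check with plain pandas shows how many values are missing per column. The sketch below uses a small made-up DataFrame with gaps added on purpose (the real dataset here has none), and the column names are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data with deliberately missing entries, for illustration only.
df = pd.DataFrame({
    "brand": ["audi", None, "toyota", "bmw"],
    "horsepower": [102.0, 121.0, np.nan, np.nan],
    "price": [13950.0, 16430.0, 5348.0, 20970.0],
})

# isna() marks missing cells as True; summing counts them per column.
missing_counts = df.isna().sum()

# Taking the mean instead gives the share of missing values per column.
missing_share = df.isna().mean()

print(missing_counts)
print(missing_share)
```

If every count is zero, as in our case, there is nothing to impute or drop and you can move on.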

4. Descriptive Summary of numerical columns

Scaling is one of the main parts of data preprocessing. To determine whether the data needs scaling, you have to look at the range of values in the numerical columns. With the help of the “describe” function, not only this information, but also all the main statistical values can be…
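To make the range check concrete, here is a small sketch that runs “describe” on made-up numerical columns and derives the range of each one from the “min” and “max” rows. The column names and numbers are assumptions, not values from the car dataset.

```python
import pandas as pd

# Hypothetical numerical columns; the values are illustrative only.
df = pd.DataFrame({
    "horsepower": [60, 100, 140, 180],
    "price": [5000.0, 10000.0, 15000.0, 30000.0],
})

# describe() returns count, mean, std, min, quartiles and max per column.
summary = df.describe()

# The range of each column is the difference between its max and min rows.
value_range = summary.loc["max"] - summary.loc["min"]

print(summary)
print(value_range)
```

When the ranges differ by orders of magnitude, as the two toy columns do here, that is a hint that scaling may be worth considering.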
