
Python - Data Cleaning, Plotting, and Correlation with Heatmap
Jul 21, 2024

This is a data cleaning and visualization Python project, executed in Jupyter Notebook on a Movie Industry dataset from kaggle.com. My approach uses the Pandas, NumPy, Seaborn, and Matplotlib Python libraries. You will see that I identify the percentage of missing data in every field, clean each field that needs it (with replace, fillna, and type conversion), and plot the data to identify top performers along with a fitted regression line. This is all capped off with a Heatmap Correlation Matrix that indicates the most relevant driver of movie revenue.
For the full code, please see the GitHub link - https://github.com/DataDoneByMark/Python-Data-Cleaning-Plotting-and-Correlation-with-HeatMap/blob/main/Portfolio%20Project%20Python%20v3.ipynb
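
To give a feel for the workflow without opening the notebook, here is a minimal sketch of the same steps on a tiny made-up table. The column names (`name`, `budget`, `gross`, `votes`), the values, and the choice to fill gaps with 0 are all assumptions for illustration, not the actual dataset or necessarily the exact choices made in the notebook. It uses only Pandas and NumPy, with `np.polyfit` standing in for the plotted regression line.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the movie data; columns and values are illustrative assumptions.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "budget": [10.0, 20.0, np.nan, 40.0, 50.0],
    "gross": [25.0, 45.0, 60.0, np.nan, 110.0],
    "votes": [100.0, 300.0, 200.0, 500.0, 400.0],
})
numeric_cols = ["budget", "gross", "votes"]

# Percentage of missing values in each column.
missing_pct = df.isna().mean() * 100

# Clean: fill missing numeric values with 0 (one possible policy),
# then convert the float columns to integers.
cleaned = (
    df.fillna({col: 0 for col in numeric_cols})
      .astype({col: "int64" for col in numeric_cols})
)

# Top performers by gross revenue.
top = cleaned.sort_values("gross", ascending=False).head(3)

# Straight-line fit of gross against budget (degree-1 polynomial).
slope, intercept = np.polyfit(cleaned["budget"], cleaned["gross"], 1)

# Pearson correlation matrix of the numeric columns.
corr = cleaned[numeric_cols].corr(method="pearson")

print(missing_pct)
print(top[["name", "gross"]])
print(corr)
```

To draw the heatmap itself, a matrix like `corr` can be passed to Seaborn, for example `sns.heatmap(corr, annot=True)`, followed by `plt.show()` from Matplotlib.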





