Pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool, built on top of the Python programming language.
Many data analysts and data scientists work in Python and rely heavily on Pandas during their data cleaning and preprocessing steps.
I wrote this article because I struggled to find solutions to these problems myself and wish I had known how to solve them sooner.
One of the most frustrating things I struggled with was how to find the percentage of each value within a group. …
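As a rough illustration of the "percentage of each value within a group" problem, here is a minimal sketch of two common approaches. The DataFrame and its column names (`store`, `product`, `units`) are made up for this example and do not come from the article's data.

```python
import pandas as pd

# Hypothetical toy data: column names are assumptions for illustration only.
df = pd.DataFrame({
    "store": ["A", "A", "A", "B", "B"],
    "product": ["tea", "coffee", "tea", "tea", "coffee"],
    "units": [20, 10, 10, 5, 15],
})

# Approach 1: share of rows per product within each store.
# normalize=True makes value_counts return fractions instead of counts.
row_pct = (
    df.groupby("store")["product"]
      .value_counts(normalize=True)
      .mul(100)
)

# Approach 2: each row's units as a percentage of its store's total units.
# transform("sum") broadcasts the group total back onto every row.
store_total = df.groupby("store")["units"].transform("sum")
df["pct_of_store"] = df["units"] / store_total * 100

print(row_pct)
print(df)
```

The first approach fits categorical columns where you want the share of occurrences; the second fits numeric columns where you want each row's contribution to its group total.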
This is the second part of the Tidy Tuesday datasets implementation in Python series. You can find the first part here.
The data for this project can be taken from the Tidy Tuesday GitHub repository. The repository is well maintained and includes the original dataset along with a much-needed data dictionary.
Tidy Tuesday is a weekly social data project in R where users explore a new dataset each week and share their findings on Twitter with #TidyTuesday.
Most Tidy Tuesday participants work in the R ecosystem. In this series, I will try to complete the data analysis and visualizations in Python, mainly with the Pandas and Plotnine libraries.
The motivation behind this project comes from the data screencasts by David Robinson. In these videos, David takes a dataset he has not seen before and analyzes it live.
Data analysts and data scientists spend most of their time cleaning and preprocessing their data. This step involves getting the right data, understanding the data, exploring the data for patterns, and cleaning or preprocessing the data before building any models.
In this article, I will explain how a data analyst goes about analyzing data using Pandas, a widely used data analysis library for Python.
I will go through a dataset made by Quentin Caudron. It’s a time-series dataset that records the total number of coffees an espresso machine had made by a given date.
Netflix has truly revolutionized the way we consume content. We find ourselves spending hours watching movies and binge-watching popular TV series.
When the pandemic hit the world in 2020, we were forced to stay indoors and work from home. Confined to our homes, we consumed more and more content and spent far more time on Netflix.
This assumption is backed by this article on Netflix’s increase in subscriptions.
In this article, I will explain how you can analyze your own Netflix Viewing History and understand how you spent time and turned to Netflix for comfort…