Important tips and tricks on some useful functions in Pandas

Photo by Christopher Gower on Unsplash

Pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool, built on top of the Python programming language.

Many data analysts and data scientists work in Python and rely heavily on Pandas during data cleaning and pre-processing.

I wrote this article because I struggled to find solutions to these problems myself and wish I had known how to solve them sooner.

1. Percentage within each group

One of the most frustrating things I struggled with was finding the percentage of each value within a group. …
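
As a rough illustration of the problem described here, one common way to get each row's percentage within its group is to divide by a group total computed with groupby and transform. This is only a minimal sketch and not necessarily the approach the article goes on to take; the DataFrame and its column names (group, item, count) are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B"],
    "item": ["x", "y", "x", "y", "z"],
    "count": [30, 70, 10, 20, 70],
})

# transform("sum") returns each group's total aligned to the original rows,
# so the division yields every row's share of its own group.
group_total = df.groupby("group")["count"].transform("sum")
df["pct_within_group"] = 100 * df["count"] / group_total

print(df)
```

Using transform keeps the result the same length as the original DataFrame, so the percentages can be assigned directly as a new column without a merge.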


Analyze horror movie profits using Python

Photo by Stefano Pollio on Unsplash

This is the second part of the Tidy Tuesday datasets implementation in Python series. You can find the first part here.

Getting the Data

The data for this project is available from the Tidy Tuesday GitHub repository. The repository is well maintained and includes the original dataset along with a much-needed data dictionary.


A Python analysis of the well-known Tidy Tuesday datasets using Pandas and Plotnine.

Photo by Vasily Koloda on Unsplash

Tidy Tuesday is a weekly social data project in R where users explore a new dataset each week and share their findings on Twitter with #TidyTuesday.

Most Tidy Tuesday participants work in the R ecosystem. In this series, I will try to complete the data analysis and visualizations in Python, mainly with the Pandas and Plotnine libraries.

The motivation behind this project comes from the data screencasts by David Robinson. In these videos, David looks at a dataset and starts analyzing it live without having seen it beforehand.

Getting the Data

The data for this project can be taken from the…


Learn how to analyze a time-series dataset using Pandas.

Photo by Niels Kehl on Unsplash

Data analysts and data scientists spend most of their time cleaning and preprocessing their data. This step involves getting the right data, understanding the data, exploring the data for patterns, and cleaning or preprocessing the data before building any models.

In this article, I will explain how a data analyst goes about analyzing data with Pandas, a widely used data analysis library for Python.

I will go through a dataset made by Quentin Caudron [1]. It’s a time-series dataset, describing the total number of coffees made by an espresso machine by a certain date.

You can find…
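
To give a feel for what working with a cumulative count like this can involve, here is a minimal sketch that turns running totals into coffees made per day. The readings below are made up, and the column names (timestamp, coffees) are assumptions for illustration; they are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical readings: a running total of coffees logged at irregular times.
readings = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-01-01 09:00", "2021-01-01 15:00",
        "2021-01-02 10:00", "2021-01-04 11:00",
    ]),
    "coffees": [10, 14, 20, 27],
})

# Index by time so pandas can resample on calendar days.
series = readings.set_index("timestamp")["coffees"]

# Keep the last cumulative reading of each day; days without a reading
# come back as NaN, so carry the previous total forward.
daily_total = series.resample("D").last().ffill()

# The difference between consecutive daily totals is the number made that day.
# The first day has no earlier baseline, so its value is NaN.
daily_made = daily_total.diff()

print(daily_made)
```

Forward-filling assumes no coffees were made on days without a reading, which is a simplification that may not hold for real data.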


Explore how much time you spent watching Netflix movies and TV Shows during COVID-19.

Man watching Netflix on TV.
Photo by Mollie Sivaram on Unsplash

Netflix has truly revolutionized the way we consume content. We find ourselves spending hours watching movies and binge-watching popular TV Series.

The pandemic hit the world in 2020, and we were forced to stay indoors and work from home. Confined to our homes, people consumed more and more content and spent much more time on Netflix.

This assumption is backed by this article on Netflix’s increase in subscriptions.

In this article, I will explain how you can analyze your own Netflix Viewing History and understand how you spent time and turned to Netflix for comfort…

Hemant Rattey

Data Analyst | Data Science Enthusiast | Writing about Data Analysis and Visualizations | linkedin.com/in/hemantrattey/ | github.com/hemantrattey
