Using Pandas and Matplotlib in Python to Track Your Spending Habits

Tyler Travis
4 min read · May 12, 2023

Tracking expenses is essential to achieving financial stability, but doing it by hand is cumbersome and prone to error. Why collect receipts when most major banks already store your transaction data? Data science offers a smarter and more efficient way to track expenses. In this article, we’ll discuss how to use Pandas and Matplotlib in Python to track your spending habits and work toward your financial goals. When do you spend your money, how frequently, and where? Let’s begin.

Multiple transactions are performed each day by the average person.

Regardless of the data science project, there is a common start to the process:

  • Import the Required Libraries: To track spending habits, we will use popular data science libraries such as Pandas and Matplotlib.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
  • Collect Your Data: The first step in tracking your expenses using data science is to collect your data. You can do this by exporting your transactions from your bank’s website and saving them in a CSV file. Alternatively, you can use a budgeting app that provides an API to download your data directly into a Pandas DataFrame.
df = pd.read_csv("bank_account_file_name.csv")
  • Clean Your Data: Before you can analyze your spending habits, you need to clean your data. Do you want to ignore e-transfers? Do you only care about retail purchases, and want to remove major outliers such as college tuition or mortgage payments? How you clean the data depends on what you want your spending habits to represent.
  • Analyze Your Data: This is where creativity is rewarded. How can this data be queried to give us useful information about our spending habits and guide smarter financial decisions?
  • Visualize Your Data: Data visualization is an essential aspect of data science because it helps you communicate your findings effectively. You can use Matplotlib to create charts and graphs that illustrate your spending habits, such as a pie chart that shows how much you spend on different categories. You can also use Seaborn, a Matplotlib-based library, to create more complex visualizations.
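The cleaning step above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: the sample rows, the column names ("Date", "Description", "Withdrawl"), the helper `clean_transactions`, the "E-TRANSFER" keyword, and the 1000.0 cutoff are all assumptions made for the example, and your bank's export will likely differ.

```python
import re

import pandas as pd

# Hypothetical sample standing in for a bank export; the column names
# are assumptions about how the CSV is laid out.
raw = pd.DataFrame({
    "Date": ["2023-01-03", "2023-01-05", "2023-01-09", "2023-01-15"],
    "Description": ["Walmart", "E-TRANSFER SENT", "University Fees", "Gas Station"],
    "Withdrawl": [54.20, 200.00, 4500.00, 41.75],
})


def clean_transactions(df, exclude_keywords=("E-TRANSFER",), max_amount=1000.0):
    """Drop rows whose description contains a keyword or whose amount exceeds max_amount."""
    cleaned = df.copy()
    cleaned["Date"] = pd.to_datetime(cleaned["Date"])

    # Build one case-insensitive pattern from the keywords (escaped so they match literally).
    pattern = "|".join(re.escape(keyword) for keyword in exclude_keywords)
    is_excluded = cleaned["Description"].str.contains(pattern, case=False, na=False)

    # Treat anything above the (arbitrary, assumed) threshold as an outlier.
    is_outlier = cleaned["Withdrawl"] > max_amount

    return cleaned[~is_excluded & ~is_outlier].reset_index(drop=True)


cleaned = clean_transactions(raw)
print(cleaned)
```

Keeping the keywords and the threshold as parameters makes it easy to rerun the analysis with different definitions of "everyday spending" and see how much the results shift.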

After loading my own transaction history, I had a few questions about my spending habits. First, I wanted to see how my chequing account balance changed over a 12-month period. This was straightforward: plot the date column on one axis and the corresponding account balance on the other.

df['Date'] = pd.to_datetime(df['Date'])
df = df.sort_values(by='Date')
plt.plot(df['Date'], df['Balance'])
plt.xlabel('Date')
plt.ylabel('Account Balance')
plt.title('Account Balance Over Time')
plt.show()
Example of Account Balance Over Time

I was curious where I spent my money most frequently. In my dataset, there is a column named “Description” which describes where each purchase was made (e.g., McDonalds, Gas Station). Knowing where I spend most often could help me find places to be more disciplined, so I wrote this query.

top_descriptions = df['Description'].value_counts().nlargest(10)
top_descriptions

This lists the 10 places where I made purchases most frequently, along with the number of transactions at each. The output has been adjusted to protect my privacy.

Walmart            44
Apple              34
Best Buy           29
Groceries          23
University Fees    20
Target             20
Groceries 2        16
Bank Fee           14
Sports Event       13
Food Place 3       13

In this example, it seems that I could cut costs by eating out less and by making fewer grocery trips. Maybe I need to set a food budget and lower the cost of my meals.
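Frequency alone does not show how much money each place accounts for, so it can help to look at totals next to the counts. The sketch below is one possible way to do that, assuming a "Withdrawl" amount column exists alongside "Description"; the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical transactions; the values are invented and the column names
# are assumptions about the dataset's layout.
sample = pd.DataFrame({
    "Description": ["Walmart", "Walmart", "Apple", "Food Place", "Food Place", "Food Place"],
    "Withdrawl": [60.0, 40.0, 1200.0, 12.5, 9.5, 8.0],
})

# For each description: number of purchases, total spent, and average purchase size.
summary = (
    sample.groupby("Description")["Withdrawl"]
    .agg(visits="count", total="sum", average="mean")
    .sort_values("total", ascending=False)
)
print(summary)
```

In this made-up sample, the most frequently visited place has the smallest total, which is why it is worth checking both columns before deciding where to cut back.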

Lastly, I wanted to see the distribution of my spending across different months. To do this, we need to sum the withdrawals for each month:

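df["Date"] = pd.to_datetime(df["Date"])
df["Month"] = df["Date"].dt.month_name()
# Select the withdrawal column before summing, so that date and text columns
# are not aggregated (summing datetimes raises an error in recent pandas versions).
# "Withdrawl" is kept as spelled in the original column name.
grouped = df.groupby("Month")["Withdrawl"].sum()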

plt.pie(grouped, labels=grouped.index, autopct="%1.1f%%")
plt.title("Withdrawals by Month")
plt.show()
Shows Distribution of Spending By Month
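One thing to watch for: grouping by month name sorts the months alphabetically (April, August, December, ...), not by calendar order. A possible workaround, sketched here on made-up data with the same assumed column names, is to group by month number and attach the names afterwards.

```python
import calendar

import pandas as pd

# Hypothetical withdrawals; the values and column names are assumptions.
sample = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2023-01-10", "2023-01-20", "2023-02-05", "2023-04-01", "2023-04-18"]
    ),
    "Withdrawl": [100.0, 50.0, 75.0, 200.0, 75.0],
})

# Group by month number (1-12) so the result is sorted chronologically.
monthly = sample.groupby(sample["Date"].dt.month)["Withdrawl"].sum()

# Swap the numeric index for readable month names (uses the default locale).
monthly.index = [calendar.month_name[number] for number in monthly.index]

# Each month's share of total spending, as a percentage.
share = (monthly / monthly.sum() * 100).round(1)
print(share)
```

The resulting series can be passed to `plt.pie` or `plt.bar` in the same way as `grouped` above. If the data spans more than one year, grouping by month alone merges the same month from different years, so that case would need the year included as well.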

After performing just three queries, it became evident that reducing spending in April, purchasing longer-lasting groceries, and assessing cash flow were necessary actions. This information serves as a foundation for devising strategies to curtail expenses. Keep in mind that the more transaction history you have, the more reliable the insights drawn from it are likely to be.
