download notebook
view notebook w/o solutions

Comparing inflation rates

files needed = ('inflation_tuition.csv', 'inflation_food.xlsx')

Let's put some of our new skills to work. Tuition and food are important expenditures for college students. How have their prices changed since you started at UW? Let's assume you started in 2019. You can change the start date to match your particular circumstance if you like.

We have data in two files. The data are collected as part of the Consumer Price Index.

  1. inflation_tuition.csv This file is directly from the Bureau of Labor Statistics (BLS) who collects the data.

  2. inflation_food.xlsx This data is also collected by BLS, but this file is from FRED, a site run by the St. Louis Federal Reserve. It is easier to get data from FRED than directly from BLS. (The tuition data was not available on FRED...)

Our goal is to compute the cumulative inflation rate from 2019 to 2022, the last year for which we have annual data.

We proceed in the usual data analysis workflow

  1. Pose a preliminary question
  2. Find the appropriate data
  3. Get data into usable form (missing values, numbers as strings, etc.)
  4. Preliminary analysis
  5. Revisit and revise preliminary questions
  6. Final analysis
  7. Report results

Are preliminary question is: How much has tuition increased in the last 3 years? How does the change compare to the change in food prices?

We will focus on steps #2 and #3 in this notebook.

import pandas as pd

1. Tuition

  1. Load the 'inflation_tuition.xlsx' file and create a DataFrame. Watch out for extra stuff at the beginning of the file. I would use the header argument to specify where the data start. Check your data types. Do they look correct?
  2. Set the index to the Year column.
tuit = pd.read_excel('inflation_tuition.xlsx', header=11, index_col=0)
print(tuit.dtypes)
display(tuit.head(3))
Jan    float64
Feb    float64
Mar    float64
Apr    float64
May    float64
Jun    float64
Jul    float64
Aug    float64
Sep    float64
Oct    float64
Nov    float64
Dec    float64
dtype: object
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Year
1979 7.9 7.9 7.9 7.9 7.9 8.0 8.0 8.0 8.0 7.9 7.9 7.9
1980 8.3 8.4 8.3 8.3 8.3 8.1 8.1 8.2 11.7 12.1 12.1 12.1
1981 12.2 12.2 12.3 12.3 12.3 12.3 12.7 12.2 12.0 12.7 12.7 12.7

The data are arranged with years in the index and months in the columns, so each entry is the inflation rate for that year-month. We want to compute annual interest rates.

  1. Compute the mean of each row. Use the .mean() method of DataFrame. Which axis should you specify? Create a new column in your DataFrame to hold the annual inflation rate.
tuit['annual'] = tuit.mean(axis=1)
tuit.head(2)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec annual
Year
1979 7.9 7.9 7.9 7.9 7.9 8.0 8.0 8.0 8.0 7.9 7.9 7.9 7.933333
1980 8.3 8.4 8.3 8.3 8.3 8.1 8.1 8.2 11.7 12.1 12.1 12.1 9.500000
  1. The inflation rates are in percent. We want them in growth rates. Divide the annual inflation rate by 100 and add one. For example, 4.13 percent should now be 1.0413.
tuit['annual'] = tuit['annual']/100+1
tuit.head(2)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec annual
Year
1979 7.9 7.9 7.9 7.9 7.9 8.0 8.0 8.0 8.0 7.9 7.9 7.9 1.079333
1980 8.3 8.4 8.3 8.3 8.3 8.1 8.1 8.2 11.7 12.1 12.1 12.1 1.095000

2. Food

  1. Load the 'inflation_food.xlsx' file and create a DataFrame. Check your data types. Do they look correct?
  2. Set the index to the DATE column.
food = pd.read_csv('inflation_food.csv', index_col=0)
food.rename(columns={'CPIUFDSL_PC1':'food'}, inplace=True)
print(food.dtypes)
display(food.head(2))
food    float64
dtype: object
food
DATE
1948 8.41996
1949 -3.96293
  1. The inflation rates are in percent. We want them in growth rates. Divide the annual inflation rate by 100 and add one. For example, 8.4 percent should now be 1.084.
food['food'] = food['food']/100+1
food.head(2)
food
DATE
1948 1.084200
1949 0.960371

3. How has tuition changed since 2019?

The data are set up. Now, to answer our question.

Our data are the changes in price from \(p_{t-1}\) to \(p_t\), a measure of the inflation rate,

$$(1+\pi_{t})=\frac{p_t}{p_{t-1}}.$$

We compute the cumulative change by multiplying the inflation rates for the consecutive periods. For example, if we want to know how the price of something has changed from 2019 to 2022, we compute

$$1+\pi_{2019,2022} = (1+\pi_{2020})(1+\pi_{2021})(1+\pi_{2022})$$

This suggests that we need to multiply the values in the inflation column of our DataFrame. Check out the .cumprod() method. (docs)

tuit.loc[2020:2022,'annual'].cumprod()
Year
2020    1.013750
2021    1.022451
2022    1.044604
Name: annual, dtype: float64

Tuition (on average, as measured by the CPI) has increased 4.5 percent since 2019.

4. How have food prices changed since 2019?

This is the same calculation we did for tuition, but use the food inflation data.

food.loc[2020:2022,'food'].cumprod()
DATE
2020    1.034442
2021    1.075252
2022    1.182240
Name: food, dtype: float64

Food (on average, as measured by the CPI) has increased 18.2 percent since 2019.

5. How does inflation in tuition and food compare?

Tuition prices have grown much slower than food prices. This is true for each year since 2019. The cumulative effect implies that food prices have increased more than 4 times faster than tuition since 2019.

If you finish early

Explore the older data. The tuition data goes back to 1979. How do these numbers look in the 1980s, a period of famously high inflation?

When I was in school, my biggest expenses were tuition, rent, and food. You can find the rent cpi data here. The download button is in the upper right corner. How has rent inflation changed?