Research Domain and Question¶

Region: State of North Carolina, USA Domain: Economic Statistics

Main Research Question: When adjusted for inflation using the BLS CPI, did the per capita income of residents of Buncome County, North Carolina increase or decrease between 1970 and 2020?

Secondary Research Question: How does the inlfation adjusted per capita income of Buncombe County residents compare to the average per capita income of residents in all counties of North Carolina, in the same period?

Income in North Carolina¶

The data I have elected to use for income statistics in North Carolina comes from the website of the North Carolina Office of State Management and Budget. Specifically, I have examined data from the following dataset:

  • Employment and Income (LINC)

Consumer Price Index (CPI) Data¶

I found a table with historical CPI data, used for calculating inflation effects, on the US Inflation Calculator website. To retrieve the table, I first saved the HTML, then imported that into Excel, cleaned up the sheet outside of the table, and then imported the Excel sheet into Pandas.

Following best practices in economics, I use the Average Annual CPI to compute inflation, calculating a multiplier which I used to convert historical income amounts into their equivalent in "2020 dollars".

In [363]:
import os
def prindex(alist):
    for i, x in enumerate(alist):
        print(f"{i}: {x}")

files = os.listdir()
prindex(files)
0: education.csv
1: educational-attainment-for-the-population-b15003.csv
2: USBLS_Annual_Inflation_Rates_1970-2020.xlsx
3: Consumer Price Index Data from 1913 to 2024.xlsx
4: Consumer Price Index Data from 1913 to 2024.html
5: Assignment_04_Income_and_Education_in_NC.ipynb
6: North_Carolina_Income_Change_1970_to_2020.png
7: employment-and-income-linc.csv
8: .ipynb_checkpoints
In [135]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
In [364]:
emp_inc_df = pd.read_csv(files[7], sep=";")
In [207]:
cpi_df = pd.read_excel(files[3])

Calculating Historical Inflation Multipliers based on CPI Data¶

In [318]:
years = [1970, 1980, 1990, 2000, 2010, 2020]
CPI_census_years = cpi_df[(cpi_df["Year"].isin(years))].sort_values(by="Year", ascending=True).reset_index(drop=True)
In [319]:
CPI_census_years
Out[319]:
Year Jan Feb Mar Apr May June July Aug Sep Oct Nov Dec Annual Avg Dec-Dec Avg-Avg
0 1970 37.800 38.000 38.200 38.500 38.600 38.800 39.000 39.000 39.200 39.400 39.600 39.800 38.800 5.6 5.7
1 1980 77.800 78.900 80.100 81.000 81.800 82.700 82.700 83.300 84.000 84.800 85.500 86.300 82.400 12.5 13.5
2 1990 127.400 128.000 128.700 128.900 129.200 129.900 130.400 131.600 132.700 133.500 133.800 133.800 130.700 6.1 5.4
3 2000 168.800 169.800 171.200 171.300 171.500 172.400 172.800 172.800 173.700 174.000 174.100 174.000 172.200 3.4 3.4
4 2010 216.687 216.741 217.631 218.009 218.178 217.965 218.011 218.312 218.439 218.711 218.803 219.179 218.056 1.5 1.6
5 2020 257.971 258.678 258.115 256.389 256.394 257.797 259.101 259.918 260.280 260.388 260.229 260.474 258.811 1.4 1.2
In [321]:
CPI = CPI_census_years["Annual Avg"]
CPI = list(CPI)
CPI_Multipliers = [CPI[5] / multiplier  for multiplier in CPI] 
print(CPI, CPI_Multipliers)
[38.8, 82.4, 130.7, 172.2, 218.056, 258.811] [6.670386597938144, 3.1409101941747566, 1.9801912777352717, 1.5029674796747967, 1.1869015298822319, 1.0]
In [322]:
print(f"""
CPI: {CPI}
CPI_Multipliers: {CPI_Multipliers}
""")
CPI: [38.8, 82.4, 130.7, 172.2, 218.056, 258.811]
CPI_Multipliers: [6.670386597938144, 3.1409101941747566, 1.9801912777352717, 1.5029674796747967, 1.1869015298822319, 1.0]

Employment and Income Data¶

The employment and income LINC database contains 1,177,506 rows, 904,374 of which provide data for North Carolina counties ("Area Type": "County").

It contains data spanning 1969 to 2023. It combines US census data, economic statistics and much more. The dataset provides data for an astonishing 439 different variable types, of which I am interested in only one:

"Per Capita Money Income (Census)"

First I isolate County data on Per Capita Income for the years 1970, 1980, 1990, 2000, 2010, and 2020 (census years) and then I further isolate Buncome County data. I create a separate dataframe for each of these datasets. I discard data for 2015 and 2022.

In [323]:
years = [1970, 1980, 1990, 2000, 2010, 2020]

Per Capita Income Data¶

In [334]:
county_per_capita_income = emp_inc_df[(emp_inc_df["Area Type"] == "County") 
                                    & (emp_inc_df["Variable"] == "Per Capita Money Income (Census)") 
                                    & (emp_inc_df["Year"].isin(years))]

The resulting dataframe contains 600 rows and five columns, giving per capita money income in the census years of 1970-2020 for 100 counties in North Carolina.

In [335]:
print(f"There are {len(set(county_per_capita_income["Area Name"]))} North Carolina counties represented in the DataFrame.")
There are 100 North Carolina counties represented in the DataFrame.

To make best use of the dataframe, pivot it

In [336]:
NC_per_capita_Income_1970_to_2020 = county_per_capita_income.pivot(index="Area Name", columns="Year", values="Value")
NC_per_capita_Income_1970_to_2020
Out[336]:
Year 1970 1980 1990 2000 2010 2020
Area Name
Alamance County 2936.0 6534.0 13290.0 19391.0 22819.0 27944.0
Alexander County 2391.0 5731.0 11624.0 18507.0 20716.0 27446.0
Alleghany County 2012.0 4934.0 10237.0 17691.0 18919.0 26026.0
Anson County 1744.0 4826.0 9402.0 14853.0 16856.0 22483.0
Ashe County 1717.0 4652.0 9545.0 16429.0 20350.0 25282.0
... ... ... ... ... ... ...
Wayne County 1998.0 5320.0 10843.0 17010.0 20446.0 26362.0
Wilkes County 2093.0 5767.0 10816.0 17516.0 19406.0 25634.0
Wilson County 2054.0 5754.0 11641.0 17102.0 20691.0 24656.0
Yadkin County 2303.0 5645.0 11843.0 18576.0 20379.0 26907.0
Yancey County 1624.0 4530.0 9462.0 16335.0 18576.0 25767.0

100 rows × 6 columns

In [337]:
# make a copy for experimentation
df = NC_per_capita_Income_1970_to_2020.copy()
In [338]:
df.index.name = "County"
df.index.name
Out[338]:
'County'
In [341]:
# testing out .loc on it 
df.loc["Buncombe County", 1970]
Out[341]:
np.float64(2671.0)
In [342]:
# what does the mean of each column produce?
columns = df.columns
for col in columns:
    print(df[col].mean())
2161.2
5504.49
11306.92
17812.76
21620.08
27696.34
In [343]:
bunc_per_cap_income = df.loc["Buncombe County"]
NC_avg_per_cap_income = df.mean()

# is the mean of each column = to the loop above (spoiler: yes)
print(bunc_per_cap_income, NC_avg_per_cap_income)
Year
1970     2671.0
1980     6360.0
1990    13211.0
2000    20384.0
2010    25665.0
2020    33835.0
Name: Buncombe County, dtype: float64 Year
1970     2161.20
1980     5504.49
1990    11306.92
2000    17812.76
2010    21620.08
2020    27696.34
dtype: float64

Let's Look at Mean Family Income as Well¶

The variable "Mean Fanily Income" is another measure which will allow a similar comparison. Let's add it to our dataframe. Here I'll do all the steps in one motion, as it were.

In [348]:
county_mean_family_income = emp_inc_df[(emp_inc_df["Area Type"] == "County") 
                                    & (emp_inc_df["Variable"] == "Mean Family Income") 
                                    & (emp_inc_df["Year"].isin(years))]
county_mean_family_income = county_mean_family_income.pivot(index="Area Name", columns="Year", values="Value")
buncombe_mean_family_income = county_mean_family_income.loc["Buncombe County"]
NC_mean_family_income = county_mean_family_income.mean()
In [349]:
# Put together a finished comparison DF

compare_Bunc_and_NC_per_cap_income = pd.DataFrame({"Buncombe County Per Cap Income": bunc_per_cap_income, 
                                                   "Buncombe County Mean Family Income": buncombe_mean_family_income,
                                                   "Avg NC County Per Cap Income": NC_avg_per_cap_income,
                                                   "Avg NC Mean Family Income": NC_mean_family_income,
                                                   "CPI": CPI,
                                                   "CPI 2020 Multiplier": CPI_Multipliers}, 
                                                   index=NC_avg_per_cap_income.index)
compare_Bunc_and_NC_per_cap_income["Bunc Co PCI (2020 Dollars)"] = (
    compare_Bunc_and_NC_per_cap_income["Buncombe County Per Cap Income"]
    * compare_Bunc_and_NC_per_cap_income["CPI 2020 Multiplier"])
compare_Bunc_and_NC_per_cap_income["Bunc Co Mean Family Income (2020 Dollars)"] = (
    compare_Bunc_and_NC_per_cap_income["Buncombe County Mean Family Income"]
    * compare_Bunc_and_NC_per_cap_income["CPI 2020 Multiplier"])
compare_Bunc_and_NC_per_cap_income["NC Avg PCI (2020 Dollars)"] = (
    compare_Bunc_and_NC_per_cap_income["Avg NC County Per Cap Income"]
    * compare_Bunc_and_NC_per_cap_income["CPI 2020 Multiplier"])
compare_Bunc_and_NC_per_cap_income["NC Mean Family Income (2020 Dollars)"] = (
    compare_Bunc_and_NC_per_cap_income["Avg NC Mean Family Income"]
    * compare_Bunc_and_NC_per_cap_income["CPI 2020 Multiplier"])

compare_Bunc_and_NC_per_cap_income
Out[349]:
Buncombe County Per Cap Income Buncombe County Mean Family Income Avg NC County Per Cap Income Avg NC Mean Family Income CPI CPI 2020 Multiplier Bunc Co PCI (2020 Dollars) Bunc Co Mean Family Income (2020 Dollars) NC Avg PCI (2020 Dollars) NC Mean Family Income (2020 Dollars)
Year
1970 2671.0 8781.0 2161.20 7769.05 38.800 6.670387 17816.602603 58572.664716 14416.039515 51822.566999
1980 6360.0 19279.0 5504.49 17708.04 82.400 3.140910 19976.188835 60553.607633 17289.108755 55619.363355
1990 13211.0 37183.0 11306.92 33517.70 130.700 1.980191 26160.306970 73629.452280 22389.864362 66371.457190
2000 20384.0 56354.0 17812.76 51308.68 172.200 1.502967 30636.489106 84698.229350 26771.999003 77115.277465
2010 25665.0 71669.0 21620.08 62891.14 218.056 1.186902 30461.827764 85064.045745 25660.906028 74645.590282
2020 33835.0 96530.0 27696.34 80073.13 258.811 1.000000 33835.000000 96530.000000 27696.340000 80073.130000

Generate the Visualization¶

In [362]:
buncombe_PCI = compare_Bunc_and_NC_per_cap_income["Bunc Co PCI (2020 Dollars)"]
buncombe_MFI = compare_Bunc_and_NC_per_cap_income["Bunc Co Mean Family Income (2020 Dollars)"]
NC_PCI = compare_Bunc_and_NC_per_cap_income["NC Avg PCI (2020 Dollars)"]
NC_MFI = compare_Bunc_and_NC_per_cap_income["NC Mean Family Income (2020 Dollars)"]
years = list(compare_Bunc_and_NC_per_cap_income.index)
fig, ax = plt.subplots(1, 1, figsize=(14, 6))
plt.plot(years, buncombe_MFI, color="blue", label="Bunc. Co. Mean Family Income (2020 Dollars)")
plt.plot(years, NC_MFI, color="red", label="NC Mean Family Income (2020 Dollars)")
plt.plot(years, buncombe_PCI, color="skyblue", label="Bunc. Co. Per Capita Income (2020 Dollars)")
plt.plot(years, NC_PCI, color="pink", label="N.C. Mean Per Capita Income (2020 Dollars)")
plt.title("Inflation Adjusted Per Capita and Mean Family Income in Buncombe County vs North Carolina Average, 1970-2020")
plt.xlabel("Census Year")
plt.ylabel("Inflation Adjusted 2020 Dollars (Based on CPI)")
plt.legend()
plt.savefig("North_Carolina_Income_Change_1970_to_2020.png")
plt.show(fig);
No description has been provided for this image
In [ ]: