Exam 1 Fall 2022 (100 pts total)
files needed = ('interest.csv', 'dogs.xlsx', 'messy1.xlsx', 'messy2.xlsx', 'messy3.csv'), which can be found in exam1_data.zip
You have 75 minutes to complete this exam.
Answer all the questions below in this notebook. You should insert as many cells into the notebook as you need. When you are finished, upload your finished notebook to Canvas.
- You may use your notes and the Internet, but you cannot work with others.
- Import any packages you need to complete this exam.
- Do not modify the data files directly. All data manipulation should happen in your code.
Remember, Jupyter notebooks and Python have lots of built-in help facilities.
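As a reminder of what those help facilities look like, here is a minimal sketch using only the standard library. The choice of `sorted` and `str` as the objects to inspect is arbitrary and only for illustration.

```python
import inspect

# help() prints the documentation for any object: function, class, or module.
help(sorted)

# In a Jupyter cell, typing `sorted?` shows similar information.
# That is IPython syntax, so it is left as a comment here.

# inspect.getdoc() returns the docstring as a plain string.
doc = inspect.getdoc(sorted)
print(doc)

# dir() lists the attributes and methods available on an object.
methods = [name for name in dir(str) if not name.startswith("_")]
print(methods[:5])
```

The same calls work on pandas and matplotlib objects once those packages are imported.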
Question 0 (5 points): Last, First
Replace 'Last, First' above with your actual name. Enter it as: last name, first name.
Question 1 (30 points): Plotting
The file 'interest.csv' contains annual data on the average 30-year mortgage interest rate (MORTGAGE30US) and the CPI inflation rate (CPIAUCSL_PC1).
- Create a line plot with the date on the x-axis. Plot the mortgage rate in black and the inflation rate in red.
- The figure size should be 12 inches wide and 8 inches tall.
- Both lines should have weight 2.
- The tick labels on both axes should have font size 18.
- The x-axis should range from 1971 to 2022.
Make any further adjustments you find necessary.
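For reference, the kinds of matplotlib calls that control figure size, line color and width, tick label size, and axis limits can be sketched as follows. The data, colors, and sizes here are made up for illustration and are not the values the question asks for; nothing is read from 'interest.csv'.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up values for illustration only.
years = [2000, 2001, 2002, 2003, 2004]
series_a = [3.0, 3.4, 2.9, 3.8, 4.1]
series_b = [1.2, 1.9, 1.5, 2.2, 2.6]

# figsize is (width, height) in inches.
fig, ax = plt.subplots(figsize=(10, 6))

# color and linewidth are set per line.
ax.plot(years, series_a, color="blue", linewidth=3, label="series A")
ax.plot(years, series_b, color="green", linewidth=3, label="series B")

# Axis range and tick label font size.
ax.set_xlim(2000, 2004)
ax.tick_params(axis="both", labelsize=14)

ax.set_ylabel("percent", fontsize=14)
ax.legend(frameon=False, fontsize=14)
```

In a notebook the figure displays automatically at the end of the cell, so the `matplotlib.use("Agg")` line is not needed there.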
Question 2 (5 points): Graphical excellence
Insert a markdown cell below this cell and type your answer to the following question. You do not need to write any code.
The figure you created above suggests that inflation and interest rates are related.
- I want my message to be "mortgage rates go up when inflation goes up." How would you change the figure you created in question 1 to better communicate that message? Explain why.
Question 3 (20 points): Working with DataFrames
The file 'dogs.xlsx' contains data about three dogs: Buster, Subee, and Jax. For each dog, the dataset records the number of snacks and walks it had on a given day.
Use the data to answer the following questions.
- What is the average number of snacks that Buster eats per day? Print the answer as
"Buster averages ?.? snacks per day."
Replace ?.? with the answer. Match the formatting of the number.
- Create a DataFrame that contains only the data for Buster and Jax.
How many columns and rows are in this DataFrame? Print the answer as
"The Buster and Jax DataFrame has ? columns and ?? rows."
Replace the ? and ?? with the answers.
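Matching a number format such as ?.? usually comes down to an f-string format specifier. The sketch below uses hypothetical values, not anything computed from 'dogs.xlsx', and assumes the shape tuple is ordered (rows, columns), which is the order `DataFrame.shape` uses.

```python
# Hypothetical values for illustration only.
average = 7 / 3      # 2.3333...
shape = (12, 4)      # (rows, columns)

# :.1f keeps exactly one digit after the decimal point.
line1 = f"The average is {average:.1f} treats per day."

# Unpack the tuple so each count can be placed where the sentence needs it.
n_rows, n_cols = shape
line2 = f"The table has {n_cols} columns and {n_rows} rows."

print(line1)
print(line2)
```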
Question 4 (15 points): Loading messy data
Let's load three datasets. Each one has something wrong with it.
- [Footer.] Load the file 'messy1.xlsx' into a DataFrame. Make sure that the variables (v1 through v8) are all of type float64. Print out the types of the variables.
- [Missing data.] Load the file 'messy2.xlsx' into a DataFrame. Make sure that the variables (v1 through v8) are all of type float64. Print out the types of the variables.
- [Separator not a comma.] Load the file 'messy3.csv' into a DataFrame. While the suffix of the file is .csv, the data are not separated by commas. Use pd.read_csv() to load the data. Print out the shape of the DataFrame.
Hint: Open the file in a text editor such as Notepad++ and check the delimiter type.
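If a text editor is not handy, the first few raw lines of a file can also be inspected from Python. This is a minimal sketch on a small stand-in file written by the code itself; the file name demo.csv and the '|' delimiter are assumptions for the demo and say nothing about what 'messy3.csv' contains.

```python
import os
import tempfile
from collections import Counter

# Write a small stand-in file. The '|' delimiter is an assumption for this demo.
sample = "v1|v2|v3\n1.0|2.0|3.0\n4.0|5.0|6.0\n"
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w") as f:
    f.write(sample)

# Read the first two lines as raw text; repr() makes tabs and spaces visible.
with open(path) as f:
    first_lines = [next(f) for _ in range(2)]
for line in first_lines:
    print(repr(line))

# Count common delimiter characters in the header line.
candidates = [",", ";", "\t", "|", " "]
counts = Counter(ch for ch in first_lines[0] if ch in candidates)
likely_sep = counts.most_common(1)[0][0]
print(repr(likely_sep))
```

Whatever character turns up can then be passed to pd.read_csv() through its `sep` argument.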
Question 5 (20 points): Slicing and types
For each value in the following list, extract the year (the 4 numeric digits) and store them in a list. Print out the list, sorted from smallest to largest.
strlist = ['yy2022yy', 'yy1980yy', 'yy1991yy', 'yy1793yy', 'yy1288yy']
The output should be [1288, 1793, 1980, 1991, 2022].
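As a refresher on the mechanics involved, the sketch below slices a fixed-position number out of some made-up strings, converts it to an integer, and sorts the result. The strings and slice positions are hypothetical and differ from those in the question.

```python
# Made-up strings: the number always occupies positions 3, 4, and 5.
codes = ["id-042-x", "id-007-x", "id-130-x"]

# code[3:6] takes characters 3 up to (but not including) 6.
# int() converts the text to a number, dropping leading zeros.
numbers = [int(code[3:6]) for code in codes]

# sort() orders the list in place, smallest to largest.
numbers.sort()
print(numbers)  # [7, 42, 130]
```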
Question 6 (5 points): Using error messages
The code below contains two errors. Correct them both so that the code runs without error.
df = pd.DataFrame({'names':['Bucky', 'Brutus', 'Sparty'], 'school':['UW', 'OSU', 'MSU']})
print( df.shape() )
df.drop('school', inplace=True)
print(df)
You are finished!
Upload your completed notebook to Canvas.