Project A1 Hidden Tests

Question 2

Q2a

Question 4

Q4a

Q4b

Hint: consider the x-axis, and note that most of the data lies to the left of $0.5 \times 10^6 = \$500{,}000$.

Q4c

After removing outliers, training_data should have 153,776 rows remaining.

Question 5

Q5a

A representative value of the "Description" column in training_data is shown below:

“This property, sold on 06/10/2016, is a one-story household located at 104 SAUK TRL. It has a total of 5 rooms, 2 of which are bedrooms, and 1.0 of which are bathrooms.”

Consider what features of houses are specified here.
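As a rough illustration of how such features might be pulled out of the text, the sketch below runs a few regular expressions over the sample sentence quoted above. The pattern names and the patterns themselves are assumptions written against this one example, not the project's required solution, and other rows may be worded differently.

```python
import re

# Sample "Description" value quoted above.
description = (
    "This property, sold on 06/10/2016, is a one-story household located at "
    "104 SAUK TRL. It has a total of 5 rooms, 2 of which are bedrooms, and "
    "1.0 of which are bathrooms."
)

# Hypothetical patterns, written against this single sample sentence.
patterns = {
    "sale_date": r"sold on (\d{2}/\d{2}/\d{4})",
    "total_rooms": r"total of (\d+) rooms",
    "bedrooms": r"(\d+) of which are bedrooms",
    "bathrooms": r"(\d+\.\d+) of which are bathrooms",
}

# Collect the first captured group for each pattern, or None if absent.
features = {}
for name, pattern in patterns.items():
    match = re.search(pattern, description)
    features[name] = match.group(1) if match else None

print(features)
```

Every extracted value is a string, so a numeric feature such as the bedroom count would still need converting (for example with int) before use.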

Q5b

Question 6

Q6a

There are many ways to approach this question. One of them is the pandas .unique() method.
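A minimal sketch of .unique() on a toy DataFrame follows. The column name "Neighborhood Code" and the values are made up for illustration and may not match the column the question asks about.

```python
import pandas as pd

# Toy data; the column name and values are hypothetical.
toy = pd.DataFrame({"Neighborhood Code": [44, 94, 44, 93, 94, 44]})

# .unique() returns the distinct values in order of first appearance.
unique_codes = toy["Neighborhood Code"].unique()
num_unique = len(unique_codes)

# .nunique() gives the same count directly.
same_count = toy["Neighborhood Code"].nunique()

print(unique_codes, num_unique, same_count)
```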

Q6b

Q6c

The three most expensive neighborhood codes (by median) are 44, 94, and 93.
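One way to rank groups by median is a groupby followed by a descending sort, sketched here on toy data. The column names "Neighborhood Code" and "Sale Price" and all values are assumptions for illustration; the actual question may rank on a transformed price column.

```python
import pandas as pd

# Toy data; column names and values are hypothetical.
toy = pd.DataFrame({
    "Neighborhood Code": [10, 10, 20, 20, 30, 30, 40, 40],
    "Sale Price": [100, 300, 500, 700, 50, 150, 900, 1100],
})

# Median price per neighborhood, then the three largest medians.
medians = toy.groupby("Neighborhood Code")["Sale Price"].median()
top_three = medians.sort_values(ascending=False).head(3).index.tolist()

print(top_three)
```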

Q6d

Using the three most expensive neighborhood codes from Q6c, there should be 1,290 rows belonging to those neighborhoods.
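A row count like this can be checked by filtering with .isin(), as in the sketch below. The DataFrame, its column names, and the list of codes are toy assumptions rather than the project's data.

```python
import pandas as pd

# Toy data; column names and values are hypothetical.
toy = pd.DataFrame({
    "Neighborhood Code": [10, 20, 30, 40, 20, 40, 30],
    "Sale Price": [200, 600, 100, 1000, 650, 950, 120],
})

# Hypothetical list of "expensive" codes for this toy example.
expensive_codes = [40, 20, 10]

# Keep only rows whose code appears in the list, then count them.
in_expensive = toy[toy["Neighborhood Code"].isin(expensive_codes)]
num_rows = len(in_expensive)

print(num_rows)
```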

Question 7

Q7a

training_data should have 4 unique wall materials, with value_counts() returning [70303, 59125, 35717, 3786].
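For reference, value_counts() lists counts in descending order by default, which is why the expected counts above run from largest to smallest. A small sketch on invented labels (not the dataset's actual wall material names):

```python
import pandas as pd

# Toy labels; these names are invented for illustration.
walls = pd.Series(["Wood", "Wood", "Masonry", "Wood", "Other", "Masonry"])

# Counts per distinct label, sorted from most to least frequent.
counts = walls.value_counts()

print(counts.tolist(), len(counts))
```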

Q7b

The sum of the one-hot encoded columns should equal the number of rows in training_data.
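This check works because each row gets exactly one 1 across the encoded columns. The sketch below uses pd.get_dummies on toy data to show the idea; the project may expect a different encoder, and the column name and labels here are assumptions.

```python
import pandas as pd

# Toy data; the column name and labels are hypothetical.
toy = pd.DataFrame(
    {"Wall Material": ["Wood", "Masonry", "Wood", "Other", "Wood/Masonry"]}
)

# One indicator column per distinct label, stored as 0/1 integers.
one_hot = pd.get_dummies(toy["Wall Material"], prefix="Wall Material", dtype=int)

# Each row contains a single 1, so the grand total equals the row count.
total = int(one_hot.to_numpy().sum())

print(total, len(toy))
```

If the total comes out smaller than the number of rows, a likely cause is missing values in the source column, which get_dummies skips by default.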