Project A1 Common Questions

Question 6

TypeError: could not convert string to float: 'SF'

Type errors like this one usually stem from applying a numeric aggregation function to a non-numeric column, as described in the pandas section of the debugging guide.

Aggregation functions like np.median and np.mean are only well-defined for columns with numeric types such as int and float. Your code is likely trying to aggregate across all columns in training_data, including those of type str. Instead of aggregating across the entire DataFrame, select only the relevant columns before aggregating.
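As a rough illustration of the idea, the sketch below aggregates only numeric columns. The toy DataFrame and its column names (Neighborhood, Sale_Price, Lot_Size) are made up for this example and are not the project's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for training_data; the column names are illustrative only.
toy = pd.DataFrame({
    "Neighborhood": ["SF", "SF", "OAK", "OAK"],  # str column, cannot be averaged
    "Sale_Price": [100.0, 300.0, 200.0, 400.0],
    "Lot_Size": [10, 30, 20, 40],
})

# Option 1: pick the one numeric column you care about, then aggregate.
median_price = np.median(toy["Sale_Price"])

# Option 2: keep every numeric column and drop the rest before aggregating.
numeric_medians = toy.select_dtypes(include="number").median()

# Option 3: when grouping, select the numeric column before calling the aggregation.
median_by_hood = toy.groupby("Neighborhood")["Sale_Price"].median()
```

In each option the str column never reaches the aggregation function, so there is nothing for pandas to convert to float.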

TypeError: unhashable type: 'Series'

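This error can occur if you try to use Python’s in operator to check whether the values in a Series are contained in a collection such as a list or set. The in operator treats the Series as a single object rather than checking it element by element. If you’re trying to perform boolean filtering in this manner, look into the .isin method (see its documentation), as introduced in HW 2.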
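A minimal sketch of element-wise membership filtering with .isin is shown below. The Series values and the list named wanted are invented for illustration.

```python
import pandas as pd

# Hypothetical values; not taken from the project data.
s = pd.Series(["SF", "OAK", "LA", "SF"])
wanted = ["SF", "LA"]

# .isin checks each element of the Series against the collection
# and returns a boolean Series of the same length.
mask = s.isin(wanted)

# The boolean Series can then be used to filter.
filtered = s[mask]
```

The same mask works for filtering a DataFrame, for example df[df["some_column"].isin(wanted)], where some_column is a placeholder name.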

Question 7

I’m not sure how to use sklearn to do One Hot Encoding

A good starting point is to revisit the One Hot Encoding question in Lab 7. We recommend looking through this portion of the walkthrough so that you have a good understanding of how to use the OneHotEncoder object. Pay attention to what each variable represents and the expected outputs of the functions used. Can you map the logic from the lab to this project? A nice way to start is to make a new cell and experiment with examples from the documentation.

My OHE columns contain a lot of NaN values

This may happen if you try to merge the OHE columns with the training_data table without making sure both DataFrames have the same index values. Look into the pd.merge documentation for ways to resolve this.
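The sketch below reproduces the symptom on made-up data and shows one possible fix, assuming the OHE rows are in the same order as the original rows. The names data, ohe_default, and ohe_fixed are hypothetical.

```python
import pandas as pd

# Hypothetical table whose index is not 0, 1, 2 (e.g. after filtering rows).
data = pd.DataFrame({"Roof": ["Flat", "Gable", "Flat"]}, index=[10, 11, 12])

# OHE columns built without an explicit index get the default 0, 1, 2.
ohe_default = pd.DataFrame({
    "Roof_Flat": [1.0, 0.0, 1.0],
    "Roof_Gable": [0.0, 1.0, 0.0],
})

# Joining on the index finds no matching labels, so the OHE columns are all NaN.
misaligned = data.join(ohe_default)

# One fix: give the OHE table the same index as the original rows
# (valid only if the row order is unchanged).
ohe_fixed = ohe_default.copy()
ohe_fixed.index = data.index

# Either of these now lines the rows up correctly.
aligned = data.join(ohe_fixed)
merged = pd.merge(data, ohe_fixed, left_index=True, right_index=True)
```

Comparing data.index with the OHE table's index before combining them is a quick way to spot this problem.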