ValueError: could not convert string to float: 'n'

by NimbleTortoise   Last Updated September 20, 2018 23:26 PM

Hello I am following a video on Udemy. We are trying to apply a random forest classifier. Before we do so, we convert one of the columns in a data frame into a string. The 'Cabin' column represents values such as "4C" but in order to reduce the number of unique values, we want to use simply the first number to map onto a new column 'Cabin_mapped'.

data['Cabin_mapped'] = data['Cabin'].astype(str).str[0]
# this transforms the letters into numbers
cabin_dict = {k:i for i, k in enumerate(
    data['Cabin_mapped'].unique(),0)}

data.loc[:,'Cabin_mapped'] =  data.loc[:,'Cabin_mapped'].map(cabin_dict)

data[['Cabin_mapped', 'Cabin']].head() 

This part is simply splitting the data into training and test set. The parameters don't really matter for figuring out the problem.

X_train_less_cat, X_test_less_cat, y_train, y_test = \
    train_test_split(data[use_cols].fillna(0), data.Survived, 
                     test_size = 0.3, random_state=0) 

I get an error here after the fit, saying I could not convert the float into a string. rf = RandomForestClassifier(n_estimators=200, random_state=39) rf.fit(X_train_less_cat, y_train)

It seems like I need to convert one of the inputs back into float to use the random forest algorithms. This is despite the error not showing up in the tutorial video. If anyone could help me out, that'd be great.



Related Questions


Assign custom categories to json data - pandas

Updated April 10, 2017 19:26 PM




ImportError: No module named pandasql

Updated August 21, 2017 02:26 AM