scikit-learn Cookbook (Second Edition)

DictVectorizer class

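Another option is to use the DictVectorizer class. It takes a list of dictionaries that map feature names to values and one-hot encodes any string values directly, so the species names can be converted to features without first mapping them to integers: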

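from sklearn import datasets
from sklearn.feature_extraction import DictVectorizer
iris = datasets.load_iris()  # skip these two lines if iris and y are already loaded
y = iris.target
dv = DictVectorizer()
# one dictionary per sample: feature name -> species name (a string)
my_dict = [{'species': iris.target_names[i]} for i in y]
dv.fit_transform(my_dict).toarray()[:5]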

array([[ 1.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 1.,  0.,  0.]])
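
To see which column corresponds to which category, it can help to inspect the fitted vectorizer. The following is a minimal sketch rather than part of the original recipe: the records list and its petal_length values are hypothetical, made up for illustration. It shows that string values are expanded into one column per category, while numeric values are passed through as a single column.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical records for illustration only (not taken from the iris data).
records = [
    {'species': 'setosa', 'petal_length': 1.4},
    {'species': 'virginica', 'petal_length': 5.1},
    {'species': 'setosa', 'petal_length': 1.3},
]

# sparse=False returns a dense NumPy array, so .toarray() is not needed.
dv = DictVectorizer(sparse=False)
encoded = dv.fit_transform(records)

# feature_names_ lists the column names in the same order as the array columns.
# String values become "name=value" columns; numeric values keep their name.
feature_names = dv.feature_names_
print(feature_names)
print(encoded)
```

With the default settings the columns are sorted by name, so the numeric petal_length column comes first, followed by one indicator column per species seen during fitting.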