
How to do it...
- Let's load the iris data and separate it into features and target first:
from sklearn import datasets
import numpy as np
iris = datasets.load_iris()
X = iris.data
y = iris.target
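- X and y are both numerical, so they can be placed side-by-side. Because y holds class labels rather than quantities, create an encoder with scikit-learn to handle the category of the y column. The encoder expects a two-dimensional input, so y is reshaped into a single column first: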
from sklearn import preprocessing
cat_encoder = preprocessing.OneHotEncoder()
cat_encoder.fit_transform(y.reshape(-1,1)).toarray()[:5]
array([[ 1.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 1.,  0.,  0.],
       [ 1.,  0.,  0.]])
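The step above mentions placing X and y side-by-side, but the listing stops at the encoded labels. A minimal sketch of one way to do the stacking follows, assuming np.hstack is an acceptable way to join the two arrays; the names y_encoded and combined are illustrative and not part of the recipe:

```python
import numpy as np
from sklearn import datasets, preprocessing

iris = datasets.load_iris()
X = iris.data
y = iris.target

# One-hot encode the labels; the encoder needs a 2-D column, hence the reshape.
cat_encoder = preprocessing.OneHotEncoder()
y_encoded = cat_encoder.fit_transform(y.reshape(-1, 1)).toarray()

# Join the four feature columns and the three indicator columns side-by-side.
combined = np.hstack([X, y_encoded])
print(combined.shape)
```

Each row of combined then carries the four measurements followed by three indicator columns, exactly one of which is 1 for that row's species.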