
Linear versus nonlinear
Linear algorithms involve lines or hyperplanes; a hyperplane is a flat surface in n-dimensional space. They tend to be easy to understand and explain, as they amount to a weighted sum of the features plus an offset. Some functions that increase or decrease monotonically can be mapped to a linear function by a transformation. For example, exponential growth can be mapped to a line with the log transformation.
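As a minimal sketch of that last point, on synthetic data with an assumed growth rate of 0.5, taking the log of exponentially growing values yields a straight line:
import numpy as np #Usual import of NumPy
x = np.linspace(0, 10, 50)
y = 2 * np.exp(0.5 * x) #Exponential growth with illustrative constants
slope, intercept = np.polyfit(x, np.log(y), 1) #Fit a line to log(y)
print(slope, np.exp(intercept)) #Recovers roughly 0.5 and 2.0, since log(y) = log(2) + 0.5*x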
Nonlinear algorithms tend to be tougher to explain to colleagues and investors, yet ensembles of decision trees, which are nonlinear, tend to perform very well. KNN, which we examined earlier, is also nonlinear. In some cases, a function that does not increase or decrease in a familiar manner is an acceptable price for better accuracy.
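As a minimal sketch of a nonlinear tree ensemble (assuming the same X_train and y_train used throughout this chapter), a random forest can be fit and cross-validated just like the classifiers we have seen so far:
from sklearn.ensemble import RandomForestClassifier #Usual import of the ensemble
from sklearn.model_selection import cross_val_score #Usual import of cross_val_score
rf_clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train) #Ensemble of 100 decision trees
rf_scores = cross_val_score(rf_clf, X_train, y_train, cv=4)
rf_scores.mean()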
Try a simple SVC with a polynomial kernel, as follows:
from sklearn.svm import SVC #Usual import of SVC
svc_poly_clf = SVC(kernel='poly', degree=3).fit(X_train, y_train) #Polynomial kernel of degree 3
In two dimensions, the decision boundary of a polynomial kernel of degree 3 looks like a cubic curve. It leads to a slightly better fit, but note that it can be harder to explain to others than a linear kernel, whose behavior is consistent throughout the whole Euclidean space:
svc_poly_scores = cross_val_score(svc_poly_clf, X_train, y_train, cv=4) #Score the polynomial-kernel classifier, not the linear svc_clf
svc_poly_scores.mean()
0.95906432748538006
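To quantify the comparison, a minimal sketch is to score a linear kernel on the same folds and compare the two means (the variable names here are illustrative):
svc_lin_clf = SVC(kernel='linear').fit(X_train, y_train) #Linear-kernel baseline
svc_lin_scores = cross_val_score(svc_lin_clf, X_train, y_train, cv=4)
svc_lin_scores.mean() #Compare with svc_poly_scores.mean() above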