How to use Keras Models With Scikit-Learn For General Machine Learning

The scikit-learn library is the most popular library for general machine learning in Python.

In this lesson, you will discover how to use deep learning models from Keras with the scikit-learn library in Python. After completing this lesson, you will know:

  • How to wrap a Keras model for use with the scikit-learn machine learning library.
  • How to easily evaluate Keras models using cross validation in scikit-learn.
  • How to tune Keras model hyperparameters using grid search in scikit-learn.

1.1 Overview

Keras is a popular library for deep learning in Python, but deep learning is all it focuses on. In fact, it strives for minimalism, providing only what you need to quickly and simply define and build deep learning models. The scikit-learn library in Python is built upon the SciPy stack for efficient numerical computation. It is a fully featured library for general purpose machine learning and provides many utilities that are useful in the development of deep learning models, not least:

  • Evaluation of models using resampling methods like k-fold cross validation.
  • Efficient search and evaluation of model hyperparameters.

The Keras library provides a convenient wrapper for deep learning models to be used as classification or regression estimators in scikit-learn.
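
Conceptually, such a wrapper stores a model-building function together with training arguments, builds a fresh model on every fit() call, and forwards the stored arguments to the model. The following is only a rough sketch of that idea in plain Python. TinyWrapper and MajorityModel are hypothetical names invented for illustration; they are not part of Keras or scikit-learn, and the real wrapper does considerably more.

```python
# Illustrative sketch only: TinyWrapper and MajorityModel are hypothetical
# names, not part of Keras or scikit-learn.
from collections import Counter


class MajorityModel:
    """Stand-in for a compiled network: predicts the most common label."""

    def __init__(self):
        self.label = None
        self.fit_kwargs = {}

    def fit(self, X, y, **kwargs):
        self.fit_kwargs = kwargs  # record what the wrapper passed through
        self.label = Counter(y).most_common(1)[0][0]

    def predict(self, X):
        return [self.label for _ in X]


class TinyWrapper:
    """Holds a model-building function plus arguments destined for fit()."""

    def __init__(self, build_fn, **fit_params):
        self.build_fn = build_fn
        self.fit_params = fit_params
        self.model_ = None

    def fit(self, X, y):
        self.model_ = self.build_fn()  # a fresh model on every fit
        self.model_.fit(X, y, **self.fit_params)
        return self

    def predict(self, X):
        return self.model_.predict(X)

    def score(self, X, y):
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


wrapper = TinyWrapper(build_fn=MajorityModel, epochs=150, batch_size=10)
wrapper.fit([[0], [1], [2], [3]], [1, 1, 1, 0])
print(wrapper.model_.fit_kwargs)  # the arguments that reached fit()
print(wrapper.score([[4], [5]], [1, 0]))
```

Because the object exposes fit(), predict() and score(), resampling and search utilities can treat it like any other estimator, which is the property the Keras wrapper relies on.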

1.2 Evaluate Models with Cross Validation

The KerasClassifier and KerasRegressor classes in Keras take an argument build_fn, which is the name of the function to call to create your model. You must define a function, called whatever you like, that defines your model, compiles it and returns it. In the example below we define a function create_model() that creates a simple multilayer neural network for the problem. We pass this function name to the KerasClassifier class via the build_fn argument. We also pass in the additional arguments epochs=150 and batch_size=10. These are automatically bundled up and passed on to the fit() function, which is called internally by the KerasClassifier class.

In this example we use the scikit-learn StratifiedKFold class to perform 10-fold stratified cross validation. This is a resampling technique that can provide a robust estimate of the performance of a machine learning model on unseen data. We use the scikit-learn function cross_val_score() to evaluate our model using the cross validation scheme and print the results.

# MLP for Pima Indians Dataset with 10-fold cross validation via sklearn
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
import numpy as np

# Function to create model, required for KerasClassifier
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))

    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

# load pima indians dataset
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

# create model
model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)
# evaluate using 10-fold cross validation
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(model, X, Y, cv=kfold)
print(results.mean())
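
To make the resampling step more concrete, here is a small sketch of how a stratified split can keep the class balance roughly equal across folds. It is written in plain Python with toy labels; stratified_folds is a hypothetical helper made up for this illustration and is not how scikit-learn implements StratifiedKFold.

```python
# Illustrative sketch only: stratified_folds is a hypothetical helper, not
# scikit-learn's StratifiedKFold implementation.
import random
from collections import defaultdict


def stratified_folds(labels, n_splits, seed=7):
    """Return n_splits lists of test indices with balanced class counts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a share.
        for position, index in enumerate(indices):
            folds[position % n_splits].append(index)
    return folds


# Toy labels (assumed data): 20 negatives followed by 10 positives.
labels = [0] * 20 + [1] * 10
folds = stratified_folds(labels, n_splits=5)
for test_indices in folds:
    positives = sum(labels[i] for i in test_indices)
    print(len(test_indices), positives)  # fold size, positives in the fold
```

Each fold serves once as the held-out test set while the remaining folds are used for training, and the reported score is the mean over all folds.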

1.3 Grid Search Deep Learning Model Parameters

In this example we use a grid search to evaluate different configurations for our neural network model and report on the combination that provides the best estimated performance. The create_model() function is defined to take two arguments, optimizer and init, both of which must have default values. This allows us to evaluate the effect of using different optimization algorithms and weight initialization schemes for our network. After creating our model, we define arrays of values for the parameters we wish to search, specifically:

  • Optimizers for searching different weight values.
  • Initializers for preparing the network weights using different schemes.
  • Number of epochs for training the model with a different number of exposures to the training dataset.
  • Batches for varying the number of samples before weight updates.

# MLP for Pima Indians Dataset with grid search via sklearn
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
import numpy as np

# Function to create model, required for KerasClassifier
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

# load pima indians dataset
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

# create model
model = KerasClassifier(build_fn=create_model, verbose=0)
# grid search epochs, batch size, optimizer and weight initializer
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = np.array([50, 100, 150])
batches = np.array([5, 10, 20])
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X, Y)
# summarize results
print("Best %f using %s" % (grid_result.best_score_, grid_result.best_params_))
# cv_results_ is a dict of arrays, one entry per parameter combination
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
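
A grid search trains one model per parameter combination per cross validation fold, so it helps to count the combinations before starting a run. The sketch below enumerates a grid like the one in this example using only the standard library. The dictionary keys mirror the example for illustration, and n_folds is an assumed value, since the default number of folds depends on the scikit-learn version.

```python
# Illustrative sketch: enumerate every combination in a parameter grid.
from itertools import product

param_grid = {
    "optimizer": ["rmsprop", "adam"],
    "init": ["glorot_uniform", "normal", "uniform"],
    "epochs": [50, 100, 150],
    "batch_size": [5, 10, 20],
}

names = list(param_grid)
combinations = [
    dict(zip(names, values)) for values in product(*param_grid.values())
]

n_folds = 3  # assumed fold count; check the default of your scikit-learn version
print(len(combinations))            # number of distinct configurations
print(len(combinations) * n_folds)  # number of models that would be trained
print(combinations[0])              # first configuration in the grid
```

With 2 optimizers, 3 initializers, 3 epoch settings and 3 batch sizes there are 54 configurations, so even a small grid can take a long time to evaluate with a neural network.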

1.4 Summary

In this lesson you discovered how you can wrap your Keras deep learning models and use them in the scikit-learn general machine learning library. You learned:

  • How to wrap Keras models so that they can be used with the scikit-learn machine learning library.
  • How to use a wrapped Keras model as part of evaluating model performance in scikit-learn.
  • How to perform hyperparameter tuning in scikit-learn using a wrapped Keras model.

You can see that using scikit-learn for standard machine learning operations such as model evaluation and model hyperparameter optimization can save a lot of time over implementing these schemes yourself.

Original article: http://outofmemory.cn/langs/923095.html (source: 内存溢出)
