## Archive for the ‘**Keras**’ Category

## R Interfaces to Python Keras Package

Keras is a popular Python package to do the prototyping for deep neural networks with multiple backends, including TensorFlow, CNTK, and Theano. Currently, there are two R interfaces that allow us to use Keras from R through the reticulate package. While the keras R package is able to provide a flexible and feature-rich API, the kerasR R package is more convenient and computationally efficient. For instance, in the below example mimicking the Python code shown in https://statcompute.wordpress.com/2017/01/02/dropout-regularization-in-deep-neural-networks, the kerasR package is at least 10% faster than the keras package in terms of the computing time.

df <- read.csv("credit_count.txt") Y <- matrix(df[df$CARDHLDR == 1, ]$DEFAULT) X <- scale(df[df$CARDHLDR == 1, ][3:14]) set.seed(2018) rows <- sample(1:nrow(Y), nrow(Y) - 2000) Y1 <- Y[rows, ] Y2 <- Y[-rows, ] X1 <- X[rows, ] X2 <- X[-rows, ] ### USE KERAS PACKAGE (https://keras.rstudio.com) ### library(keras) dnn1 % ### DEFINE THE INPUT LAYER ### layer_dense(units = 50, activation = 'relu', input_shape = ncol(X), kernel_constraint = constraint_maxnorm(4)) %>% layer_dropout(rate = 0.2, seed = 1) %>% ### DEFINE THE 1ST HIDDEN LAYER ### layer_dense(units = 20, activation = 'relu', kernel_constraint = constraint_maxnorm(4)) %>% layer_dropout(rate = 0.2, seed = 1) %>% ### DEFINE THE 2ND HIDDEN LAYER ### layer_dense(units = 20, activation = 'relu', kernel_constraint = constraint_maxnorm(4)) %>% layer_dropout(rate = 0.2, seed = 1) %>% layer_dense(units = 1, activation = 'sigmoid') %>% compile(loss = 'binary_crossentropy', optimizer = 'sgd', metrics = c('accuracy')) dnn1 %>% fit(X1, Y1, batch_size = 50, epochs = 20, verbose = 0, validation_split = 0.3) pROC::roc(as.numeric(Y2), as.numeric(predict_proba(dnn1, X2))) ### USE KERAS PACKAGE (https://github.com/statsmaths/kerasR) ### library(kerasR) dnn2 <- Sequential() ### DEFINE THE INPUT LAYER ### dnn2$add(Dense(units = 50, input_shape = ncol(X), activation = 'relu', kernel_constraint = max_norm(4))) dnn2$add(Dropout(rate = 0.2, seed = 1)) ### DEFINE THE 1ST HIDDEN LAYER ### dnn2$add(Dense(units = 20, activation = 'relu', kernel_constraint = max_norm(4))) dnn2$add(Dropout(rate = 0.2, seed = 1)) ### DEFINE THE 2ND HIDDEN LAYER ### dnn2$add(Dense(units = 20, activation = 'relu', kernel_constraint = max_norm(4))) dnn2$add(Dropout(rate = 0.2, seed = 1)) dnn2$add(Dense(units = 1, activation = 'sigmoid')) keras_compile(dnn2, loss = 'binary_crossentropy', optimizer = 'sgd', metrics = 'accuracy') keras_fit(dnn2, X1, Y1, batch_size = 50, epochs = 20, verbose = 0, validation_split = 0.3) pROC::roc(as.numeric(Y2), as.numeric(keras_predict_proba(dnn2, X2)))

## A Simple Convolutional Neural Network for The Binary Outcome

Since CNN(Convolutional Neural Networks) have achieved a tremendous success in various challenging applications, e.g. image or digit recognitions, one might wonder how to employ CNNs in classification problems with binary outcomes.

Below is an example showing how to use a simple 1D convolutional neural network to predict credit card defaults.

### LOAD PACKAGES from numpy.random import seed from pandas import read_csv, DataFrame from sklearn.preprocessing import minmax_scale from keras.layers.convolutional import Conv1D, MaxPooling1D from keras.optimizers import SGD from keras.models import Sequential from keras.layers import Dense, Flatten ### PREPARE THE DATA df = read_csv("credit_count.txt") Y = df[df.CARDHLDR == 1].DEFAULT X = minmax_scale(df[df.CARDHLDR == 1].ix[:, 2:12], axis = 0) y_train = Y.values x_train = X.reshape(X.shape[0], X.shape[1], 1) ### FIT A 1D CONVOLUTIONAL NEURAL NETWORK seed(2017) conv = Sequential() conv.add(Conv1D(20, 4, input_shape = x_train.shape[1:3], activation = 'relu')) conv.add(MaxPooling1D(2)) conv.add(Flatten()) conv.add(Dense(1, activation = 'sigmoid')) sgd = SGD(lr = 0.1, momentum = 0.9, decay = 0, nesterov = False) conv.compile(loss = 'binary_crossentropy', optimizer = sgd, metrics = ['accuracy']) conv.fit(x_train, y_train, batch_size = 500, epochs = 100, verbose = 0)

Considering that 1D is the special case of 2D, we can also solve the same problem with a 2D convolutional neural network by changing the input shape, as shown below.

from numpy.random import seed from pandas import read_csv, DataFrame from sklearn.preprocessing import minmax_scale from keras_diagram import ascii from keras.layers.convolutional import Conv2D, MaxPooling2D from keras.optimizers import SGD from keras.models import Sequential from keras.layers import Dense, Flatten df = read_csv("credit_count.txt") Y = df[df.CARDHLDR == 1].DEFAULT X = minmax_scale(df[df.CARDHLDR == 1].ix[:, 2:12], axis = 0) y_train = Y.values x_train = X.reshape(X.shape[0], 1, X.shape[1], 1) seed(2017) conv = Sequential() conv.add(Conv2D(20, (1, 4), input_shape = x_train.shape[1:4], activation = 'relu')) conv.add(MaxPooling2D((1, 2))) conv.add(Flatten()) conv.add(Dense(1, activation = 'sigmoid')) sgd = SGD(lr = 0.1, momentum = 0.9, decay = 0, nesterov = False) conv.compile(loss = 'binary_crossentropy', optimizer = sgd, metrics = ['accuracy']) conv.fit(x_train, y_train, batch_size = 500, epochs = 100, verbose = 0)

## Autoencoder for Dimensionality Reduction

We often use ICA or PCA to extract features from the high-dimensional data. The autoencoder is another interesting algorithm to achieve the same purpose in the context of Deep Learning.

with the purpose of learning a function to approximate the input data itself such that F(X) = X, an autoencoder consists of two parts, namely encoder and decoder. While the encoder aims to compress the original input data into a low-dimensional representation, the decoder tries to reconstruct the original input data based on the low-dimension representation generated by the encoder. As a result, the autoencoder has been widely used to remove the data noise as well to reduce the data dimension.

First of all, we will show the basic structure of an autoencoder with 1-layer encoder and 1-layer decoder, as below. In the example, we will compress the input data with 10 columns into a compressed on with 3 columns.

from pandas import read_csv, DataFrame from numpy.random import seed from sklearn.preprocessing import minmax_scale from sklearn.model_selection import train_test_split from keras.layers import Input, Dense from keras.models import Model df = read_csv("credit_count.txt") Y = df[df.CARDHLDR == 1].DEFAULTS X = df[df.CARDHLDR == 1].ix[:, 2:12] # SCALE EACH FEATURE INTO [0, 1] RANGE sX = minmax_scale(X, axis = 0) ncol = sX.shape[1] X_train, X_test, Y_train, Y_test = train_test_split(sX, Y, train_size = 0.5, random_state = seed(2017)) ### AN EXAMPLE OF SIMPLE AUTOENCODER ### # InputLayer (None, 10) # Dense (None, 5) # Dense (None, 10) input_dim = Input(shape = (ncol, )) # DEFINE THE DIMENSION OF ENCODER ASSUMED 3 encoding_dim = 3 # DEFINE THE ENCODER LAYER encoded = Dense(encoding_dim, activation = 'relu')(input_dim) # DEFINE THE DECODER LAYER decoded = Dense(ncol, activation = 'sigmoid')(encoded) # COMBINE ENCODER AND DECODER INTO AN AUTOENCODER MODEL autoencoder = Model(input = input_dim, output = decoded) # CONFIGURE AND TRAIN THE AUTOENCODER autoencoder.compile(optimizer = 'adadelta', loss = 'binary_crossentropy') autoencoder.fit(X_train, X_train, nb_epoch = 50, batch_size = 100, shuffle = True, validation_data = (X_test, X_test)) # THE ENCODER TO EXTRACT THE REDUCED DIMENSION FROM THE ABOVE AUTOENCODER encoder = Model(input = input_dim, output = encoded) encoded_input = Input(shape = (encoding_dim, )) encoded_out = encoder.predict(X_test) encoded_out[0:2] #array([[ 0. , 1.26510417, 1.62803197], # [ 2.32508397, 0.99735016, 2.06461048]], dtype=float32)

In the next example, we will relax the constraint of layers and employ a stack of layers to achievement the same purpose as above.

### AN EXAMPLE OF DEEP AUTOENCODER WITH MULTIPLE LAYERS # InputLayer (None, 10) # Dense (None, 20) # Dense (None, 10) # Dense (None, 5) # Dense (None, 3) # Dense (None, 5) # Dense (None, 10) # Dense (None, 20) # Dense (None, 10) input_dim = Input(shape = (ncol, )) # DEFINE THE DIMENSION OF ENCODER ASSUMED 3 encoding_dim = 3 # DEFINE THE ENCODER LAYERS encoded1 = Dense(20, activation = 'relu')(input_dim) encoded2 = Dense(10, activation = 'relu')(encoded1) encoded3 = Dense(5, activation = 'relu')(encoded2) encoded4 = Dense(encoding_dim, activation = 'relu')(encoded3) # DEFINE THE DECODER LAYERS decoded1 = Dense(5, activation = 'relu')(encoded4) decoded2 = Dense(10, activation = 'relu')(decoded1) decoded3 = Dense(20, activation = 'relu')(decoded2) decoded4 = Dense(ncol, activation = 'sigmoid')(decoded3) # COMBINE ENCODER AND DECODER INTO AN AUTOENCODER MODEL autoencoder = Model(input = input_dim, output = decoded4) # CONFIGURE AND TRAIN THE AUTOENCODER autoencoder.compile(optimizer = 'adadelta', loss = 'binary_crossentropy') autoencoder.fit(X_train, X_train, nb_epoch = 100, batch_size = 100, shuffle = True, validation_data = (X_test, X_test)) # THE ENCODER TO EXTRACT THE REDUCED DIMENSION FROM THE ABOVE AUTOENCODER encoder = Model(input = input_dim, output = encoded4) encoded_input = Input(shape = (encoding_dim, )) encoded_out = encoder.predict(X_test) encoded_out[0:2] #array([[ 3.74947715, 0. , 3.22947764], # [ 3.93903661, 0.17448257, 1.86618853]], dtype=float32)