Yet Another Blog in Statistical Computing

I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

An Example of Merge Layer in Keras

The power of a DNN does not only come from its depth but also come from its flexibility of accommodating complex network structures. For instance, the DNN shown below consists of two branches, the left with 4 inputs and the right with 6 inputs. In addition, the right branch shows a more complicated structure than the left.

                                                InputLayer (None, 6)
                                                     Dense (None, 6)
                                        BatchNormalization (None, 6)
                                                     Dense (None, 6)
         InputLayer (None, 4)           BatchNormalization (None, 6)
              Dense (None, 4)                        Dense (None, 6)
 BatchNormalization (None, 4)           BatchNormalization (None, 6)
                    \____________________________________/
                                      |
                                 Merge (None, 10)
                                 Dense (None, 1)

To create a DNN as the above, both left and right branches are defined separately with corresponding inputs and layers. In the line 29, both branches would be combined with a MERGE layer. There are multiple benefits of such merged DNNs. For instance, the DNN has the flexibility to handle various inputs differently. In addition, new features can be added conveniently without messing around with the existing network structure.

from pandas import read_csv, DataFrame
from numpy.random import seed
from sklearn.preprocessing import scale
from keras.models import Sequential
from keras.constraints import maxnorm
from keras.optimizers import SGD
from keras.layers import Dense, Merge
from keras.layers.normalization import BatchNormalization
from keras_diagram import ascii

df = read_csv("credit_count.txt")
Y = df[df.CARDHLDR == 1].DEFAULTS
X1 = scale(df[df.CARDHLDR == 1][["MAJORDRG", "MINORDRG", "OWNRENT", "SELFEMPL"]])
X2 = scale(df[df.CARDHLDR == 1][["AGE", "ACADMOS", "ADEPCNT", "INCPER", "EXP_INC", "INCOME"]])

branch1 = Sequential()
branch1.add(Dense(X1.shape[1], input_shape = (X1.shape[1],), init = 'normal', activation = 'relu'))
branch1.add(BatchNormalization())

branch2 = Sequential()
branch2.add(Dense(X2.shape[1], input_shape =  (X2.shape[1],), init = 'normal', activation = 'relu'))
branch2.add(BatchNormalization())
branch2.add(Dense(X2.shape[1], init = 'normal', activation = 'relu', W_constraint = maxnorm(5)))
branch2.add(BatchNormalization())
branch2.add(Dense(X2.shape[1], init = 'normal', activation = 'relu', W_constraint = maxnorm(5)))
branch2.add(BatchNormalization())

model = Sequential()
model.add(Merge([branch1, branch2], mode = 'concat'))
model.add(Dense(1, init = 'normal', activation = 'sigmoid'))
sgd = SGD(lr = 0.1, momentum = 0.9, decay = 0, nesterov = False)
model.compile(loss = 'binary_crossentropy', optimizer = sgd, metrics = ['accuracy'])
seed(2017)
model.fit([X1, X2], Y.values, batch_size = 2000, nb_epoch = 100, verbose = 1)
Advertisements

Written by statcompute

January 8, 2017 at 4:42 pm

%d bloggers like this: