## Archive for the ‘**Matlab/Octave**’ Category

## Oct2Py and GNU Octave

GNU Octave (www.gnu.org/software/octave) is a matrix-based computing language that is highly compatible with Matlab, and it has been gaining popularity in machine learning since Dr. Ng’s open course (www.coursera.org/course/ml). However, it is not a general-purpose programming language and is primarily designed for numerical computation. As a result, it is usually not convenient to do dirty work such as data munging within Octave. In this case, python becomes a good complement.

Oct2Py is a python interface to Octave that allows users to run Octave commands and m-files dynamically from within python and to exchange data between the two environments seamlessly.
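Below is a minimal sketch of that round trip, assuming oct2py is installed and that, in addition to the put() and run() methods used in the session later in this post, a get() method is available to retrieve a variable from the Octave workspace:

```python
import numpy as np
import oct2py

# invoke an embedded octave session
octave = oct2py.Oct2Py()

# push a numpy array into the octave workspace
a = np.array([[1.0, 2.0], [3.0, 4.0]])
octave.put('a', a)

# run an octave command against the array
octave.run('b = a * 2;')

# pull the result back into python as a numpy array
# (get() is assumed here as the counterpart of put())
b = octave.get('b')
print(b)
```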

In the example below, I will show how to do simple data wrangling with the oct2py package in python, convert a python numpy array to an Octave binary *.mat file, load the data in the *.mat file into Octave, and then estimate a logistic regression through the IRLS (iteratively reweighted least squares) method. As shown in the Octave session below, the vectorized implementation makes Octave very efficient for prototyping machine learning algorithms, i.e. logistic regression in this case.
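For reference, each IRLS iteration in the Octave session below is a weighted least squares update of the coefficient vector b:

$$p = \frac{1}{1 + e^{-Xb}}, \qquad W = \operatorname{diag}\big(p\,(1 - p)\big),$$

$$z = Xb + W^{-1}(y - p), \qquad b \leftarrow \big(X^{\top} W X\big)^{-1} X^{\top} W z,$$

with the log likelihood $\ell = y^{\top} \log(p) + (1 - y)^{\top} \log(1 - p)$ monitored from one iteration to the next as the convergence check.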

**PYTHON SESSION**

```
In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: import oct2py as op

In [4]: # READ DATA FROM CSV

In [5]: df = pd.read_csv('credit_count.csv')

In [6]: # SCRUB RAW DATA

In [7]: df2 = df.ix[df.CARDHLDR == 1, ['AGE', 'ADEPCNT', 'MAJORDRG', 'MINORDRG', 'INCOME', 'OWNRENT', 'BAD']]

In [8]: # CONVERT DATAFRAME TO NUMPY ARRAY

In [9]: arr = np.array(df2, dtype = np.double)

In [10]: # INVOKE AN OCTAVE SESSION

In [11]: octave = op.Oct2Py()

In [12]: # PUT NUMPY ARRAY INTO OCTAVE WORKSPACE

In [13]: octave.put('m', arr)

In [14]: print octave.run('whos')
Variables in the current scope:

  Attr Name        Size                     Bytes  Class
  ==== ====        ====                     =====  =====
       ans         1x70                       763  cell
       m       10499x7                     587944  double

Total is 73563 elements using 588707 bytes

In [15]: # SAVE MAT DATA TO BE USED IN OCTAVE

In [16]: octave.run('save data.mat m')
Out[16]: ''
```

**OCTAVE SESSION**

```
octave:1> % LOAD DATA
octave:1> load data.mat
octave:2> whos
Variables in the current scope:

  Attr Name        Size                     Bytes  Class
  ==== ====        ====                     =====  =====
       ans         1x70                       763  cell
       m       10499x7                     587944  double

Total is 73563 elements using 588707 bytes

octave:3> x = [ones(size(m, 1), 1) m(:, 1:6)];
octave:4> y = m(:, 7);
octave:5> % SET INITIAL VALUES OF PARAMETERS
octave:5> b = zeros(size(x, 2), 1);
octave:6> % MAX ITERATIONS
octave:6> maxiter = 100;
octave:7> % SOLVE FOR BETA THROUGH IRLS
octave:7> for i = 1:maxiter
>   xb = x * b;
>   p = 1 ./ (1 + exp(-xb));
>   w = diag(p .* (1 - p));
>   z = x * b + w \ (y - p);
>   b = (x' * w * x) \ (x' * w) * z;
>   new_ll = y' * log(p) + (1 - y)' * log(1 - p);
>   disp(sprintf('log likelihood for iteration %0.0f: %0.8f', i, new_ll));
>   if i == 1
>     old_ll = new_ll;
>   elseif new_ll > old_ll
>     old_ll = new_ll;
>   else
>     break
>   endif
> end
log likelihood for iteration 1: -7277.35224870
log likelihood for iteration 2: -3460.32494800
log likelihood for iteration 3: -3202.91405878
log likelihood for iteration 4: -3176.95671906
log likelihood for iteration 5: -3175.85136112
log likelihood for iteration 6: -3175.84836320
log likelihood for iteration 7: -3175.84836317
log likelihood for iteration 8: -3175.84836317
log likelihood for iteration 9: -3175.84836317
octave:8> % CALCULATE STANDARD ERRORS AND Z-VALUES
octave:8> stder = sqrt(diag(inv(x' * w * x)));
octave:9> z = b ./ stder;
octave:10> result = [b, stder, z];
octave:11> format short g;
octave:12> disp(result);
     -0.96483      0.13317      -7.2453
   -0.0094622    0.0038629      -2.4495
      0.13384      0.02875       4.6554
      0.21028     0.069724       3.0159
      0.20073     0.048048       4.1776
   -0.0004632   4.1893e-05      -11.057
     -0.22627     0.077392      -2.9237
```

## Generalized Regression Neural Networks and the Implementation with Matlab

Generalized Regression Neural Networks (GRNN) is a special case of Radial Basis Networks (RBN). Compared with its competitors, e.g. standard feedforward neural networks, GRNN has several advantages. First of all, the structure of a GRNN is relatively simple and static, with 2 layers, namely the pattern layer and the summation layer. Once the input goes through each unit in the pattern layer, the relationship between the input and the response is “memorized” and stored in the unit. As a result, the number of units in the pattern layer is equal to the number of observations in the training sample. In each pattern unit, a Gaussian PDF is applied to the network input such that

Theta = EXP[-0.5 * (X - u)'(X - u) / Sigma^2]

where Theta is the output from the pattern unit, X is the input, u is the training vector stored in the unit, and Sigma is a positive constant known as the “spread” or “smooth parameter”. Once Theta is computed, it is passed to the summation layer to calculate Y|X = SUM(Y * Theta) / SUM(Theta), where Y|X is the prediction conditional on X and Y is the response in the training sample. A short sketch of this prediction step is given after the list below. In addition to the above, other benefits of GRNN claimed by Specht (1991) include:

1) The network is able to learn from the training data by “1-pass” training in a fraction of the time it takes to train standard feedforward networks.

2) The spread, Sigma, is the only free parameter in the network, and it can often be identified by V-fold or split-sample cross validation.

3) Unlike standard feedforward networks, GRNN estimation always converges to a global solution and won’t be trapped in a local minimum.
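To make the two formulas above concrete, here is a minimal python / numpy sketch of the GRNN prediction step, written from scratch for illustration. It is not the Matlab newgrnn() call used in the demo below, and the usage data at the bottom is purely hypothetical:

```python
import numpy as np

def grnn_predict(train_X, train_y, test_X, sigma):
    """GRNN prediction: a Gaussian-kernel weighted average of training responses."""
    preds = []
    for x in test_X:
        # pattern layer: one unit per training observation,
        # Theta = EXP[-0.5 * (X - u)'(X - u) / Sigma^2]
        d2 = np.sum((train_X - x) ** 2, axis=1)
        theta = np.exp(-0.5 * d2 / sigma ** 2)
        # summation layer: Y|X = SUM(Y * Theta) / SUM(Theta)
        preds.append(np.sum(train_y * theta) / np.sum(theta))
    return np.array(preds)

# hypothetical usage with simulated data
rng = np.random.RandomState(0)
train_X, train_y = rng.randn(100, 3), rng.randn(100)
test_X = rng.randn(5, 3)
print(grnn_predict(train_X, train_y, test_X, sigma = 1.0))
```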

With respect to the implementation of GRNN, Matlab might be considered the best computing engine in terms of ease of use and speed, based on my limited experience. A demo is given below on how to use Matlab to develop a GRNN and to identify an optimal value of Sigma with split-sample cross validation.

```matlab
load credit
Y = transpose(data(:, 2));
[n, m] = size(Y);
train_index = 2:2:m;
% SPLIT THE RESPONSE VECTOR INTO TRAINING AND TESTING
train_Y = Y(train_index);
test_Y = Y;
test_Y(train_index) = [];
% SPLIT X MATRIX INTO TRAINING AND TESTING
X = transpose(data(:, 3:10));
train_X = X(:, train_index);
test_X = X;
test_X(:, train_index) = [];
% STANDARDIZE X MATRIX IN TRAINING SET
[train_X2, map] = mapstd(train_X);
% STANDARDIZE X MATRIX IN TESTING SET
test_X2 = mapstd('apply', test_X, map);
% CHECK IF VARIANCE == 1
var(transpose(train_X2))
var(transpose(test_X2))
% TESTING DIFFERENT SPREAD OF RADIAL BASIS FUNCTION
j = 0;
for i = 1:0.02:2
  % TRAIN A GRNN
  grnn = newgrnn(train_X2, train_Y, i);
  % CALCULATE THE PREDICTION FOR TESTING SET
  test_P = sim(grnn, test_X2);
  % COLLECT THE PERFORMANCE
  if j == 0
    spread = i;
    perf = sse(test_Y - test_P);
  else
    spread = [spread, i];
    perf = [perf, sse(test_Y - test_P)];
  end;
  j = j + 1;
end;
plot(spread, perf, '-ro');
```

The plot below is generated by the Matlab program. As shown, the SSE reaches its minimum when Sigma is between 1.3 and 1.4, indicating a reasonable range for the optimal spread value.