Seminar 7 PART I: Overview

Dendrite morphological neurons

Dendrite morphological neurons are a type of artificial neural network that works with min and max operators instead of algebraic products. These morphological operators build hyperboxes in N-dimensional space. There are no convergence problems, perfect classification can always be reached, and training is performed in only one epoch. Each dendrite generates a hyperbox, and the most active dendrite is the one whose hyperbox is closest to the input pattern. The output y of this neuron is a scalar.

The min operators together check whether x is inside the hyperbox delimited by w^min and w^max as its extreme points. If d_{n,c} > 0, x is inside the hyperbox; if d_{n,c} = 0, x is somewhere on the hyperbox boundary; otherwise, it is outside.

Dendrite morphological neurons trained by stochastic gradient descent. Erik Zamora, Humberto Sossa. 2017.

Gradient Descent and Backpropagation

There have been four proposals to use gradient descent and backpropagation to train morphological neural networks.

In Pessoa and Maragos (2000), morphological/rank/linear neural networks (MRL-NNs) combine classical perceptrons with morphological/rank neurons. The output of each layer is a convex combination of linear and rank operations applied to the inputs, and training is based on gradient descent. MRL-NNs have been shown to solve complex classification problems such as recognising digits in images (NIST dataset), producing similar or better results than classical MLPs in shorter training times. This approach uses no hyperboxes, which is a major difference with respect to our work; hyperboxes make it easy to initialise the learning parameters for gradient optimisation.

In de A. Araujo (2012), a neural model similar to the DMNN but with a linear activation function was applied to regression problems such as forecasting stock markets. This architecture is called the Increasing Morphological Perceptron (IMP) and is also trained by gradient descent. The drawback of these gradient-based training methods is that the number of hidden units (dendrites or rank operations) must be tuned as a hyperparameter, so the advantage of creating more hyperboxes during training is lost.

In Zamora and Sossa (2017), the focus is on classification problems instead of regression problems; the method reuses some heuristics to initialise the hyperboxes before training and extends the neural architecture with a softmax layer, so that after the hyperbox initialisation the training optimises the dendrite parameters by SGD. It further differs from MRL-NNs in that the neural architecture does not incorporate linear layers, only morphological layers.

The softmax layer normalises the dendrite outputs so that they are restricted between zero and one and can be interpreted as a measure of the likelihood Pr_c that a pattern x belongs to class c. Here d_1 to d_{N_c} are dendrite clusters, and each class corresponds to one dendrite cluster d_c.

Dendrite morphological neurons trained by stochastic gradient descent. Erik Zamora, Humberto Sossa. 2017.
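The hyperbox test and the softmax layer described above can be sketched together in a few lines of numpy. This is only an illustration of the description in the text, not the authors' implementation: the helper names are hypothetical, and the weight layout (one cluster of dendrites per class, each dendrite activation computed as min_i min(x_i - w^min_i, w^max_i - x_i), the cluster score taken as the maximum dendrite activation) is an assumption based on the description, while the softmax itself is the standard exp / sum-of-exp normalisation.

```python
import numpy as np

def class_scores(x, W_min, W_max):
    """One dendrite cluster per class (assumed layout, see lead-in above).
    W_min, W_max: (n_classes, n_dendrites, n_features) hyperbox corners.
    Each dendrite activation is positive inside its hyperbox, zero on the
    boundary and negative outside; the most active dendrite represents
    the cluster."""
    d = np.minimum(x - W_min, W_max - x).min(axis=2)  # per-dendrite activations
    return d.max(axis=1)                              # best dendrite per class

def softmax(z):
    """Standard softmax: outputs are positive and sum to one."""
    e = np.exp(z - z.max())                           # shift for numerical stability
    return e / e.sum()

# Toy example: 2 classes, 1 dendrite each, 2-D inputs.
W_min = np.array([[[0.0, 0.0]], [[2.0, 2.0]]])
W_max = np.array([[[1.0, 1.0]], [[3.0, 3.0]]])
x = np.array([0.4, 0.7])                              # inside the class-0 hyperbox
print(softmax(class_scores(x, W_min, W_max)))         # higher probability for class 0
```

With this layout, the class whose hyperbox contains (or lies closest to) the pattern receives the largest score and therefore the highest probability after the softmax.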
Cross entropy

The softmax layer makes it possible to estimate the probability Pr_c(x) that a pattern x belongs to class c. The cross entropy between these probabilities and the targets (the ideal probabilities) can be used as the objective function J, where Q is the total number of training samples, q is the index of a training sample, t_q ∈ N is the target class for that training sample, and 1{·} is the indicator function. The purpose is to minimise this cost function with the SGD method.

Dendrite morphological neurons trained by stochastic gradient descent. Erik Zamora, Humberto Sossa. 2017.

In Hernández, Zamora and Sossa (2018), the proposal is another neural network model, the Morphological-Linear Neural Network (MLNN), which consists of merging two different types of neural layers: a hidden layer of morphological neurons and an output layer of classical perceptrons. For this, a non-linear activation function is added at the output of the morphological neuron. The hybrid network has two hyperparameters to optimise: the number of morphological neurons in the middle layer and the learning rate. For a binary classification problem, the output layer is a single perceptron-type neuron with an activation function; for a classification problem with N classes, the number of perceptron neurons in the output layer equals the number of classes, followed by a probability-distribution layer, the softmax function.

Morphological-Linear Neural Network. Gerardo Hernández, Erik Zamora, Humberto Sossa. 2018.

Seminar 7 PART II: Elaboration

Dropout

Dropout is a regularization technique in which we modify the network itself: randomly delete half of the hidden neurons, while leaving the input and output neurons untouched, then forward-propagate and backpropagate and update the appropriate weights and biases. Repeat the process.

Michael A. Nielsen, "Neural Networks and Deep Learning", Determination Press, 2015.

Softmax

Softmax implies the use of the softmax function. The sum over all output activations is constant and equal to 1, and since all output values are positive, the softmax output can be seen as a probability distribution. Caution: the softmax function does not follow the principle of locality; each output activation depends on all of the weighted inputs.

Convolutional Networks

Local receptive field: instead of connecting one input to one neuron, we connect regions of the input to neurons.
Shared weights and biases: all the neurons in the first hidden layer detect exactly the same feature.
Pooling layers: a pooling layer takes each feature map output from the convolutional layer and prepares a condensed feature map.
The final layer of connections in the network is a fully-connected layer.

Michael A. Nielsen, "Neural Networks and Deep Learning", Determination Press, 2015.

MRL-NN and Backpropagation algorithm

Neural networks with hybrid morphological/rank/linear nodes: a unifying framework with applications to handwritten character recognition. Lúcio F. C. Pessoa, Petros Maragos.

Robust techniques are necessary to circumvent the non-differentiability of rank functions. Observe that the central problem in using this general training algorithm is the evaluation of three derivatives:
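On that non-differentiability point, a brief illustration may help. Max, min and general rank operations select one of their inputs, so they have kinks wherever the selected element changes. One common workaround, shown here purely as a sketch and not claimed to be the exact technique of Pessoa and Maragos, is to replace the hard selection with a softmax-weighted average controlled by a temperature parameter beta: as beta grows, the smooth version approaches the true max or min while remaining differentiable.

```python
import numpy as np

def smooth_max(x, beta=20.0):
    """Softmax-weighted average of x: approaches max(x) as beta grows,
    but is differentiable everywhere, so gradients can flow through it."""
    w = np.exp(beta * (x - x.max()))   # shift by max(x) for numerical stability
    return np.sum(w * x) / np.sum(w)

def smooth_min(x, beta=20.0):
    """min(x) = -max(-x), so the same trick smooths the min operator."""
    return -smooth_max(-x, beta)

x = np.array([0.2, 0.9, 0.5, 0.7])
print(x.max(), smooth_max(x))   # 0.9 vs. a value slightly below 0.9
print(x.min(), smooth_min(x))   # 0.2 vs. a value slightly above 0.2
```

Intermediate rank values (for example the median) can in principle be smoothed with the same idea, by weighting the inputs around the selected order statistic instead of around the maximum.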