
Getting started with deep feedforward neural networks

A deep feedforward neural network is designed to approximate a function, f(), that maps some set of input variables, x, to an output variable, y. These networks are called feedforward because information flows from the input through each successive layer to the output, with no feedback or recurrent loops (models that include both forward and backward connections are referred to as recurrent neural networks).

Deep feedforward neural networks are applicable to a wide range of problems, and are particularly useful for applications such as image classification. More generally, feedforward neural networks are useful for prediction and classification where there is a clearly defined outcome (what digit an image contains, whether someone is walking upstairs or walking on a flat surface, the presence/absence of disease, and so on).

Deep feedforward neural networks can be constructed by chaining layers or functions together. For example, a network with four hidden layers is shown in the following diagram:

Figure 4.1: A deep feedforward neural network

This diagram of the model is a directed acyclic graph. Represented as a function, the overall mapping from the input, X, to the output, Y, is a multilayered function. The first hidden layer is H₁ = f⁽¹⁾(X, w₁ a₁), the second hidden layer is H₂ = f⁽²⁾(H₁, w₂ a₂), and so on. These multiple layers can allow complex functions and transformations to be built up from relatively simple ones.
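The chaining of layers described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the layer sizes, the ReLU activation, the helper names (dense, forward, init_params), and the choice of a weight matrix plus a bias vector as each layer's parameters are all hypothetical and are not taken from the figure.

```python
import numpy as np

def relu(z):
    # Element-wise non-linearity (an assumed choice for this sketch).
    return np.maximum(z, 0.0)

def identity(z):
    # No transformation; used here for the output layer.
    return z

def dense(h, w, b, activation):
    # One layer: affine transform of the previous layer's output,
    # followed by an element-wise activation.
    return activation(h @ w + b)

def forward(x, params):
    # Feed x through every layer in order. Each layer sees only the
    # output of the layer before it, so the graph has no cycles.
    h = x
    for w, b, activation in params:
        h = dense(h, w, b, activation)
    return h

def init_params(sizes, seed=0):
    # Hypothetical helper: small random weights and zero biases.
    rng = np.random.default_rng(seed)
    n_layers = len(sizes) - 1
    params = []
    for i in range(n_layers):
        w = rng.normal(0.0, 0.1, size=(sizes[i], sizes[i + 1]))
        b = np.zeros(sizes[i + 1])
        activation = relu if i < n_layers - 1 else identity
        params.append((w, b, activation))
    return params

# 3 inputs, four hidden layers of 5 units each, 1 output (sizes are assumed).
params = init_params([3, 5, 5, 5, 5, 1])
X = np.ones((2, 3))      # two identical example rows
Y = forward(X, params)   # one output value per row
```

Each pass through the loop in forward corresponds to one step of the composition: the first iteration computes H₁ from X, the second computes H₂ from H₁, and the last produces Y.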

If a layer includes enough hidden neurons, the network can approximate many different types of functions to the desired degree of precision. Feedforward neural networks can approximate non-linear functions by applying non-linear transformations between layers. These non-linear transformations are known as activation functions, which we will cover in the next section.
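To see why the non-linear transformation between layers matters, the following small NumPy sketch compares two stacked layers with and without an activation in between. The numbers and the use of ReLU are assumptions chosen purely for illustration.

```python
import numpy as np

# Hand-picked inputs and weights (illustrative values only).
x = np.array([[1.0, 2.0],
              [-1.0, 0.5]])
w1 = np.array([[1.0, -1.0],
               [0.0, 1.0]])
w2 = np.array([[1.0],
               [1.0]])

# Two linear layers with nothing in between...
two_linear = (x @ w1) @ w2
# ...are equivalent to a single linear layer with weights w1 @ w2.
one_linear = x @ (w1 @ w2)
collapses = np.allclose(two_linear, one_linear)

# Inserting a ReLU between the layers breaks that equivalence, so the
# stacked network can represent something a single linear map cannot.
with_relu = np.maximum(x @ w1, 0.0) @ w2
differs = not np.allclose(with_relu, one_linear)
```

Without the activation, adding more layers adds no expressive power, because the product of the weight matrices is itself just one matrix. With the activation, the second input row is mapped to a different value than any single linear layer with weights w1 @ w2 would give.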

The weights for each layer will be learned as the model is trained through forward- and backward-propagation. Another key piece of the model that must be determined is the cost, or loss, function. The two most commonly used cost functions are cross-entropy, which is used for classification tasks, and mean squared error (MSE), which is used for regression tasks.
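As a rough illustration of the two cost functions, here is a minimal NumPy sketch. The function names, the one-hot label encoding, and the clipping constant eps are assumptions made for this example; deep learning libraries provide their own implementations.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: average squared gap between target and prediction.
    return float(np.mean((y_true - y_pred) ** 2))

def cross_entropy(y_true, probs, eps=1e-12):
    # Mean categorical cross-entropy. y_true is one-hot encoded and each
    # row of probs holds predicted class probabilities summing to 1.
    # Clipping avoids log(0); eps is an assumed small constant.
    probs = np.clip(probs, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(probs), axis=1)))

# Regression: only the third prediction is off, by 2, so MSE = 4 / 3.
mse_value = mse(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0]))

# Classification: two examples, two classes.
labels = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
confident = cross_entropy(labels, np.array([[0.9, 0.1],
                                            [0.2, 0.8]]))
unsure = cross_entropy(labels, np.array([[0.5, 0.5],
                                         [0.5, 0.5]]))
```

The cross-entropy is lower for the predictions that put more probability on the correct class (confident) than for the uniform guess (unsure), which is the behaviour training tries to exploit when it adjusts the weights to reduce the cost.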