Hands-On Convolutional Neural Networks with TensorFlow

Underfitting versus overfitting

When designing a neural network to solve a specific problem, there are many moving parts, and we have to take care of several things at the same time, such as:

  • Preparing your dataset
  • Choosing the number of layers/number of neurons
  • Choosing optimizer hyper-parameters

Focusing on the second point leads us to two problems that can occur when choosing or designing a neural network architecture.

The first of these problems occurs when your model is too big for the amount, or complexity, of your training data. Because the model has so many parameters, it can simply memorize the training set, right down to the noise present in the data. This is a problem because when the network is presented with data that is not exactly like the training set, it performs poorly: it has learned the training data too precisely and missed the bigger picture behind it. This issue is called overfitting, or having high variance.

On the other hand, you might choose a network that is not big enough to capture the complexity of the data. Now we have the opposite problem: the model cannot capture the underlying structure of your dataset well enough because it does not have the capacity (enough parameters) to learn it fully. Again, the network will not perform well on new data. This issue is called underfitting, or having high bias.
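To make these two failure modes concrete, here is a small illustrative sketch, using NumPy polynomial fitting as a stand-in for a neural network (the degrees and noise level are arbitrary choices for illustration, not from the book): a degree-1 polynomial underfits a cubic target and has high error everywhere, while a high-degree polynomial fits the training noise and does worse on held-out data than a well-matched model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of an underlying cubic function, x^3 - x.
x_train = np.linspace(-1, 1, 20)
y_train = x_train**3 - x_train + rng.normal(scale=0.1, size=x_train.shape)
x_test = np.linspace(-0.95, 0.95, 20)
y_test = x_test**3 - x_test + rng.normal(scale=0.1, size=x_test.shape)

def fit_error(degree):
    """Fit a polynomial of the given degree and return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

for degree in (1, 3, 12):
    tr, te = fit_error(degree)
    print(f"degree {degree:2d}: train MSE {tr:.4f}, test MSE {te:.4f}")
```

Degree 1 (underfitting) has high error on both sets; degree 12 (overfitting) drives the training error very low while the test error stays noticeably higher, which mirrors the high-variance behavior described above.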

As you may suspect, you will always be looking for the right balance when it comes to your model complexity to avoid these issues.

In later chapters, we will see how to detect, avoid, and remedy these problems, but as an introduction, here are some classic ways to address them:

  • Getting more data
  • Stopping training when you detect that the error on held-out validation data starts to grow (early stopping)
  • Starting the model design as simple as possible and only adding complexity when you detect underfitting
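The early-stopping idea from the list above can be sketched as a framework-agnostic loop. The function and the validation-error curve below are hypothetical stand-ins, not part of any library: a real training loop would compute the validation error after each epoch and checkpoint the model at each new best.

```python
def early_stopping_loop(val_errors, patience=3):
    """Stop once the validation error has not improved for `patience` epochs.

    `val_errors` stands in for the per-epoch validation error a real
    training loop would compute; returns (best epoch index, best error).
    """
    best_err = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err = err          # new best: checkpoint the model here
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break               # validation error stopped improving
    return best_epoch, best_err

# Simulated validation curve: improves, then rises as the model overfits.
curve = [0.9, 0.7, 0.55, 0.5, 0.52, 0.56, 0.6, 0.65]
print(early_stopping_loop(curve))  # → (3, 0.5)
```

Training halts three epochs after the best validation error at epoch 3, and the checkpoint from that epoch is the model you would keep.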