Hands-On Python Deep Learning for the Web

Data

The amount of data we have today is enormous. As Hal Varian, Chief Economist at Google, put it in 2016:

"Between the dawn of civilization and 2003, we only created five exabytes; now we're creating that amount every two days. By 2020, that figure is predicted to sit at 53 zettabytes (53 trillion gigabytes) an increase of 50 times."

That's a lot of data. As the number of digital devices grows, this volume will only continue to grow exponentially. Gone are the times when a running car displayed nothing more than its speed on the speedometer. We're in an age where every part of the car can be made to produce logs every split second, enabling us to reconstruct any moment of the car's life in its entirety.

The more a person learns from life, the wiser that person becomes, and the better they can predict the outcomes of future events. The same holds for machines: the greater the amount of (quality) data that a piece of software gets to train on, the better it gets at making predictions on future, unseen data.

In the last few years, the availability of data has grown manifold due to various factors:

  • Cheaper storage
  • Higher data transmission rates
  • Availability of cloud-based storage solutions
  • Advanced sensors
  • The Internet of Things
  • An increase in the various forms of digital electronic devices
  • Increased usage of websites and native apps

There are more digital devices now than ever. They are all equipped with systems that can generate logs at all times and transmit them over the internet to the companies that manufacture them or to any other vendor that buys that data. Many more logs are created by the websites and apps people use. All of this data is easily stored in cloud-based storage solutions or in high-capacity physical storage, both of which are now cheaper than before.

If you look around, you will probably see a laptop on which you regularly use several pieces of software and websites, all of which may be collecting data on every action you perform on them. Similarly, your phone acts as a data-generating device. If you have a television with several channels supplied by your television service provider, both the service provider and the channel provider are collecting data about you to serve you better and to improve their products. You can only imagine the massive amount of data a single person generates on a daily basis, and there are billions of us on this planet!