Exploratory Data Analysis (EDA)
After the data is collected, the first step in the data preparation stage is Exploratory Data Analysis, popularly known as EDA. EDA techniques help us understand the data in detail. This is a vital step in the overall ML pipeline: if we blindly fit an ML model to data we do not understand, it will most likely not produce good results. EDA gives us a direction in which to proceed and helps us decide on further steps in the pipeline. It involves tasks such as calculating useful summary statistics and determining whether the data contains outliers. It also includes effective data visualization, which helps us interpret the data graphically and communicate important facts about it in a meaningful way.
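As a minimal sketch of these three activities, the following Python code computes summary statistics, flags outliers, and draws a crude text histogram in place of a real plot. The sample `ages` list is made up for illustration, and the helper names `summarize`, `iqr_outliers`, and `text_histogram` are hypothetical, not part of any library. The outlier check uses the common 1.5 × IQR rule, which is only one of several possible choices.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": q2,
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "q1": q1,
        "q3": q3,
    }


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def text_histogram(values, bins=5, width=30):
    """Return the lines of a simple horizontal histogram."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1  # fall back to 1 if all values are equal
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left_edge = lo + i * step
        bar = "#" * round(width * c / peak)
        lines.append(f"{left_edge:8.1f} | {bar} {c}")
    return lines


# Assumed sample data: ages with one suspicious entry.
ages = [23, 25, 27, 29, 31, 32, 35, 38, 41, 120]

stats = summarize(ages)
outliers = iqr_outliers(ages)
hist = text_histogram(ages)

print(stats)
print("Outliers:", outliers)
print("\n".join(hist))
```

On this sample, the value 120 is the only one flagged as an outlier, and the histogram shows it sitting far from the rest of the data. In practice, libraries such as pandas and matplotlib are typically used for these tasks; the standard library is used here only to keep the sketch self-contained.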