Inference vs Prediction
Mar 31, 2021
Inference and prediction are two approaches to building a model that describes the relationship between an outcome and other variables in a given set of data. Inference uses the model to learn about the data-generating process, while prediction uses the model to predict outcomes for new data points. In short:
- Inference: understand the relationship between x and y
- Prediction: predict y from x
Goal
- Inference: Estimate the relationship between the outcome variable and the predictor variable(s).
- Prediction: Develop the “best” model from the predictor variables, where “best” means high accuracy and low error on new data.
Examples
- Inference: You want to understand how housing prices are influenced by the square footage of the home, zip code, crime rate, and condition of the house. Based on the model, you interpret the effect of each feature on housing prices.
- Prediction: You want to predict unknown housing prices from homes whose prices and features are known. You fit several models, choose the one with the lowest error, and use it to make predictions.
Workflow
Inference:
- Modeling: Choose the model that approximates the data generation process best.
- Model validation: Evaluate the validity of the model.
- Inference: Use the model to understand the relationship.
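The inference workflow above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive recipe: the data are synthetic (a made-up linear relationship between square footage and price), the helper names `fit_simple_ols` and `slope_confidence_interval` are hypothetical, and the confidence interval uses a normal approximation that is only reasonable for larger samples.

```python
import math
import random
from statistics import NormalDist, fmean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r ** 2 for r in residuals)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    sigma2 = rss / (n - 2)
    return {
        "slope": slope,
        "intercept": intercept,
        "slope_se": math.sqrt(sigma2 / sxx),
        "r_squared": 1 - rss / tss,
    }


def slope_confidence_interval(fit, level=0.95):
    """Approximate interval for the slope (normal approximation, an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return fit["slope"] - z * fit["slope_se"], fit["slope"] + z * fit["slope_se"]


# Synthetic data: price rises by about 150 per square foot, plus noise.
rng = random.Random(0)
sqft = [rng.uniform(500, 3500) for _ in range(200)]
price = [50_000 + 150 * s + rng.gauss(0, 20_000) for s in sqft]

# Modeling: one model chosen to approximate the data-generating process.
fit = fit_simple_ols(sqft, price)

# Model validation: a crude check of how well the model describes the data.
print(f"R^2: {fit['r_squared']:.3f}")

# Inference: interpret the estimated relationship and its uncertainty.
low, high = slope_confidence_interval(fit)
print(f"Price change per extra square foot: {fit['slope']:.1f} "
      f"(approx. 95% interval {low:.1f} to {high:.1f})")
```

The output of interest here is the slope and its interval, not a prediction: the question being answered is how price changes with square footage.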
Prediction:
- Modeling: Consider several different models and different parameter settings.
- Model selection: Compare the candidate models on a validation set and select the one with the greatest predictive performance; then use the held-out test set only to estimate how well the selected model performs on new data.
- Prediction: Apply the selected model.
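The prediction workflow can be sketched the same way. Again, this is an illustration under stated assumptions: the data are synthetic, the candidate models (a mean baseline, a straight-line fit, and k-nearest neighbors with two settings of k) and the helper names are hypothetical choices made for the example, and mean squared error stands in for "predictive performance".

```python
import random
from statistics import fmean


def fit_mean(x, y):
    """Baseline: always predict the average training outcome."""
    mean_y = fmean(y)
    return lambda x_new: mean_y


def fit_linear(x, y):
    """Straight-line fit by ordinary least squares."""
    x_bar, y_bar = fmean(x), fmean(y)
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    intercept = y_bar - slope * x_bar
    return lambda x_new: intercept + slope * x_new


def fit_knn(k):
    """Return a fitting function for k-nearest-neighbors regression."""
    def fit(x, y):
        pairs = list(zip(x, y))

        def predict(x_new):
            nearest = sorted(pairs, key=lambda p: abs(p[0] - x_new))[:k]
            return fmean(p[1] for p in nearest)
        return predict
    return fit


def mse(predict, x, y):
    """Mean squared error of a fitted model on the given data."""
    return fmean((predict(a) - b) ** 2 for a, b in zip(x, y))


# Synthetic data, split into training, validation, and test sets.
rng = random.Random(1)
sqft = [rng.uniform(500, 3500) for _ in range(300)]
price = [50_000 + 150 * s + rng.gauss(0, 20_000) for s in sqft]
x_train, y_train = sqft[:180], price[:180]
x_val, y_val = sqft[180:240], price[180:240]
x_test, y_test = sqft[240:], price[240:]

# Modeling: several different models and parameter settings.
candidates = {
    "mean baseline": fit_mean,
    "linear": fit_linear,
    "knn (k=1)": fit_knn(1),
    "knn (k=10)": fit_knn(10),
}
fitted = {name: fit(x_train, y_train) for name, fit in candidates.items()}

# Model selection: pick the model with the lowest validation error.
val_errors = {name: mse(model, x_val, y_val) for name, model in fitted.items()}
best_name = min(val_errors, key=val_errors.get)

# The test set is used once, to estimate the selected model's error.
test_error = mse(fitted[best_name], x_test, y_test)
print(f"Selected model: {best_name}, test MSE: {test_error:.0f}")

# Prediction: apply the selected model to a new data point.
print(f"Predicted price for 2000 sq ft: {fitted[best_name](2000):.0f}")
```

Unlike the inference sketch, nothing here interprets the model's parameters; the selected model is judged only by how well it predicts data it has not seen.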