To forecast the daily sales in euros for the entire year of 2012 using historical sales transaction data from December 1, 2010, to December 9, 2011, for an online retailer.
An online retail historical sales transactions were analyzed for future sales forecasting. To forecast these sales, we employed five machine learning models: Light Gradient Boosted Model (LightGBM), Auto-Regressive Integrated Moving Average (ARIMA), Seasonal Autoregressive Integrated Moving Average with Exogenous Variables (SARIMAX), Prophet (a Generative Additive Model), and Recurrent Neural Network (RNN).
Upon testing these models, I found that the Prophet model outperformed the others in delivering a reliable sales forecast over an extended period, even with a limited dataset. The other models faced challenges in generating accurate and noise-free predictions for such long time horizons.
In this project, I focused solely on univariate analysis. For future work, we recommend exploring multivariate and multiple time series analysis. Additionally, ensemble models could offer improved accuracy in sales forecasting.
Sales forecasting plays a pivotal role in business management, particularly in resource allocation, inventory control, and operational planning. Among the various methods available for forecasting sales, time series analysis is frequently employed. Techniques can range from traditional linear regression models to cutting-edge neural networks, depending on the specific domain, objectives, and available data.
Before initiating the forecasting process, it's crucial to outline the project's scope, conduct an exploratory data analysis, and architect the codebase. Subsequent steps involve implementing the chosen predictive models, conducting experiments, and analyzing the results. In the following sections, we will delve into each of these aspects in greater detail.
https://app.eraser.io/workspace/EPHT4pT3EDDDrYKqyMgo?origin=share
The primary objective is to forecast the daily sales for an online retailer for the year 2012, leveraging historical data from previous years. Given the open-ended nature of the task, multiple analytical approaches can be adopted.
Timeframe