time-series-forecasting-research

Walmart Sales Forecasting Research

A time-series forecasting framework that combines machine learning models with hierarchical forecasting techniques to predict weekly Walmart sales across 45 stores and 99 departments.

Key Highlights

Quick Start

# Clone the repository
git clone https://github.com/JoKerDii/time-series-forecasting-research.git
cd time-series-forecasting-research

# Install dependencies
pip install -r requirements.txt

# Run the complete pipeline
python main.py --model lstm --store-type A --holiday-weight 5.0

Problem Statement

Predict weekly sales for each Walmart store-department combination, with emphasis on holiday periods: holiday weeks carry five times the weight of regular weeks in the evaluation metric.
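The 5x holiday weighting can be sketched as a weighted RMSE, the metric name used in the results table. This is a minimal illustration, not the code in `src/evaluation.py`: the function name `weighted_rmse`, its parameters, and the exact formula (weighted mean of squared errors, then a square root) are assumptions.

```python
import numpy as np

def weighted_rmse(y_true, y_pred, is_holiday, holiday_weight=5.0):
    """Weighted RMSE in which holiday weeks count `holiday_weight` times as much.

    Hypothetical helper: the repository's own metric may be defined differently.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Weight is holiday_weight for holiday weeks and 1 for regular weeks.
    weights = np.where(np.asarray(is_holiday, dtype=bool), holiday_weight, 1.0)
    squared_errors = (y_true - y_pred) ** 2
    return float(np.sqrt(np.sum(weights * squared_errors) / np.sum(weights)))

# Made-up values for three weeks; only the last one is a holiday week.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 320.0]
holiday_flags = [False, False, True]

score = weighted_rmse(actual, predicted, holiday_flags)
unweighted = weighted_rmse(actual, predicted, holiday_flags, holiday_weight=1.0)
print(f"weighted: {score:.2f}, unweighted: {unweighted:.2f}")
```

Because the holiday week has the largest error here, the weighted score comes out higher than the unweighted one, which is the behavior the weighting is meant to produce.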

Key Challenges

Dataset Overview

| Dataset | Purpose | Key Fields | Records |
|---|---|---|---|
| train.csv | Historical sales | Store, Dept, Date, Weekly_Sales, IsHoliday | 421,570 |
| test.csv | Prediction targets | Store, Dept, Date, IsHoliday | 115,064 |
| stores.csv | Store metadata | Type, Size | 45 |
| features.csv | External factors | Temperature, Fuel_Price, CPI, Unemployment, MarkDown1-5 | 8,190 |
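One way to combine the four files into a single modeling table is a pair of left joins. The sketch below uses tiny in-memory stand-ins with made-up values rather than the real CSVs, and it assumes that `stores.csv` is keyed by `Store` and `features.csv` by `Store` and `Date`; the table above does not list those keys, so check them against the actual files.

```python
import pandas as pd

# Tiny stand-ins for train.csv, stores.csv and features.csv (values are made up).
train = pd.DataFrame({
    "Store": [1, 1, 2],
    "Dept": [1, 2, 1],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-05", "2010-02-12"]),
    "Weekly_Sales": [24924.5, 50605.3, 13740.1],
    "IsHoliday": [False, False, True],
})
stores = pd.DataFrame({"Store": [1, 2], "Type": ["A", "B"], "Size": [150000, 200000]})
features = pd.DataFrame({
    "Store": [1, 2],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12"]),
    "Temperature": [42.3, 38.5],
    "Fuel_Price": [2.57, 2.55],
})

# Left joins keep every training row; `validate` raises if a key is duplicated
# on the right-hand side, which would silently multiply sales rows.
merged = (
    train
    .merge(stores, on="Store", how="left", validate="many_to_one")
    .merge(features, on=["Store", "Date"], how="left", validate="many_to_one")
)
print(merged)
```

If a column such as `IsHoliday` appears in more than one file, pandas will suffix the duplicates (`_x`, `_y`), so dropping it from one side before merging keeps the result tidy.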

Time Period: February 2010 - November 2012

Holiday Periods: Super Bowl, Labor Day, Thanksgiving, Christmas

Model Architecture

Implemented Models

| Model | Architecture | Strengths | Training Time |
|---|---|---|---|
| LSTM | 3-layer (64→32→16) + dense | Temporal dependencies, non-linear patterns | 180.4s |
| Random Forest | 100 trees, max depth 15 | Feature importance, mixed data types | 45.2s |
| Transformer | Multi-head attention + FFN | Long-range dependencies, parallel processing | 156.7s |
| Prophet | Additive decomposition | Interpretable components, fast training | 12.8s |
| HTS | OLS reconciliation | Guaranteed forecast coherence | 89.3s |
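To illustrate what OLS reconciliation does in the HTS row, the sketch below projects a set of incoherent base forecasts onto the space where the total equals the sum of its parts. The two-store hierarchy, the forecast values, and the helper name `ols_reconcile` are all illustrative assumptions; the repository's `hierarchical_time_series_model.py` may be organized differently.

```python
import numpy as np

def ols_reconcile(summing_matrix, base_forecasts):
    """Return coherent forecasts S @ b, where b minimizes ||y_hat - S b||^2.

    Hypothetical helper showing OLS reconciliation on a small hierarchy.
    """
    S = np.asarray(summing_matrix, dtype=float)
    y_hat = np.asarray(base_forecasts, dtype=float)
    # Least-squares estimate of the bottom-level series.
    bottom, *_ = np.linalg.lstsq(S, y_hat, rcond=None)
    # Re-aggregate so every level is consistent with the bottom level.
    return S @ bottom

# Rows: total, store A, store B. Columns: bottom-level series (store A, store B).
S = np.array([
    [1, 1],
    [1, 0],
    [0, 1],
])

# Independent base forecasts that do not add up: 45 + 50 != 100.
base = np.array([100.0, 45.0, 50.0])

reconciled = ols_reconcile(S, base)
print(reconciled)
```

After reconciliation the total forecast equals the sum of the two store forecasts, which is the coherence property the table refers to. The adjustment is spread across all three series rather than forcing the stores to match the original total.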

Feature Engineering Pipeline

More than 50 engineered features across multiple categories.
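As a rough sketch of how a few such features might be built, the example below derives `Sales_lag_1` and `Total_MarkDown` (both named in the feature importance list) plus a calendar month with pandas. The input values are made up, only two markdown columns are included for brevity, and treating missing markdowns as zero is an assumption, not necessarily what `feature_engineering.py` does.

```python
import pandas as pd

nan = float("nan")

# Made-up weekly data for two store-department series.
weekly = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "Dept": [1, 1, 1, 1, 1, 1],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-19"] * 2),
    "Weekly_Sales": [100.0, 120.0, 90.0, 200.0, 210.0, 190.0],
    "MarkDown1": [10.0, nan, 5.0, 0.0, 20.0, nan],
    "MarkDown2": [nan, 3.0, 2.0, 1.0, nan, nan],
})

# Lags only make sense in chronological order within each series.
weekly = weekly.sort_values(["Store", "Dept", "Date"]).reset_index(drop=True)

# Previous week's sales within the same store-department series.
weekly["Sales_lag_1"] = weekly.groupby(["Store", "Dept"])["Weekly_Sales"].shift(1)

# Sum of all markdown columns; missing values are treated as 0 (assumption).
markdown_cols = [c for c in weekly.columns if c.startswith("MarkDown")]
weekly["Total_MarkDown"] = weekly[markdown_cols].fillna(0.0).sum(axis=1)

# A simple calendar feature (illustrative, not taken from the source).
weekly["Month"] = weekly["Date"].dt.month

print(weekly[["Store", "Date", "Sales_lag_1", "Total_MarkDown", "Month"]])
```

Grouping before shifting matters: without it, the first week of store 2 would inherit the last week of store 1 as its lag.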

Results

Model Performance Comparison

| Rank | Model | Weighted RMSE | Holiday RMSE | Regular RMSE | Training Time |
|---|---|---|---|---|---|
| 1 | LSTM | 2,847.32 | 4,892.15 | 2,634.21 | 180.4s |
| 2 | Random Forest | 3,124.67 | 5,234.89 | 2,891.43 | 45.2s |
| 3 | Transformer | 3,298.45 | 5,567.23 | 3,087.12 | 156.7s |
| 4 | Prophet | 3,456.78 | 5,789.34 | 3,201.56 | 12.8s |
| 5 | HTS Model | 3,567.89 | 6,012.45 | 3,334.67 | 89.3s |

Key Insights

Feature Importance (Top 5)

  1. Total_MarkDown (0.85) - Direct promotional impact
  2. IsHoliday (0.78) - Critical business periods
  3. Sales_lag_1 (0.72) - Strong autoregressive patterns
  4. Store_Type (0.69) - Fundamental store characteristics
  5. Temperature (0.61) - Seasonal shopping behavior
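The source does not say how these scores were computed. One common way to obtain a ranking like this is the impurity-based importance of a random forest, sketched below with the same hyperparameters as the Random Forest row in the model table (100 trees, max depth 15). The data is synthetic and the `noise` feature is invented for illustration; note that scikit-learn's importances sum to 1, so they are on a different scale from the scores listed above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_samples = 500

# Synthetic stand-ins: sales depend strongly on markdowns, weakly on
# temperature, and not at all on the noise feature.
total_markdown = rng.uniform(0.0, 1.0, n_samples)
temperature = rng.uniform(0.0, 1.0, n_samples)
noise = rng.uniform(0.0, 1.0, n_samples)

X = np.column_stack([total_markdown, temperature, noise])
y = 10.0 * total_markdown + 2.0 * temperature + rng.normal(0.0, 0.1, n_samples)

model = RandomForestRegressor(n_estimators=100, max_depth=15, random_state=0)
model.fit(X, y)

feature_names = ["Total_MarkDown", "Temperature", "noise"]
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances can overstate high-cardinality features, so permutation importance on a held-out set is a reasonable cross-check before drawing business conclusions.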

Installation

Prerequisites

Setup

# Clone repository
git clone https://github.com/JoKerDii/time-series-forecasting-research.git
cd time-series-forecasting-research

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Download dataset (if not included)
python scripts/download_data.py

Project Structure

time-series-forecasting-research/
├── data/                          # Dataset files
├── notebooks/                     # Jupyter notebooks
├── src/                           # Source code
│   ├── advanced_forecasting_models.py
│   ├── data_loader.py
│   ├── data_processing.py
│   ├── evaluation.py
│   ├── feature_engineering.py
│   ├── forecasting_models.py
│   ├── hierarchical_time_series_model.py
│   └── interpretability.py
├── results/                       # Model outputs and visualizations
├── tests/                         # Unit tests
├── requirements.txt               # Dependencies
├── main.py                        # Main execution script
└── README.md                      # This file

Research Methodology

Data Integration & Quality Assessment

Exploratory Data Analysis

Model Development

Evaluation & Interpretation

Business Impact

Key Recommendations

  1. Optimize markdown timing: Schedule promotions 2-3 weeks before holidays
  2. Holiday inventory planning: Increase stock 15-20% above baseline
  3. Store format strategy: Prioritize Type A format expansion
  4. Economic monitoring: Track unemployment and CPI for demand shifts