AI / Machine Learning

StaySure

Hotel Booking Cancellation Predictor

Live Demo View Source

F1 Score

0.85

on the held-out test set

ROC-AUC

0.958

strong class separation

Training Data

119k

real hotel bookings

Problem

Hotel booking cancellations cost the hospitality industry billions annually. If a hotel could predict — at booking time — which reservations are likely to cancel, they could adjust pricing, overbooking policies, and staffing accordingly.

Solution

An end-to-end ML project trained on 119,390 real hotel bookings. Given details like lead time, deposit type, booking channel, and previous cancellations, the model predicts whether a booking will be cancelled — and explains why using SHAP feature importance values. Deployed as an interactive Gradio demo on Hugging Face Spaces.

Tech Stack

EDA	Jupyter · pandas · seaborn
Modeling	scikit-learn Pipelines · LogisticRegression · RandomForest · XGBoost
Hyperparameter Tuning	GridSearchCV (3-fold)
Experiment Tracking	MLflow (local file backend)
Explainability	SHAP summary + per-prediction force plots
Deployment	Gradio on Hugging Face Spaces

ML Pipeline

Raw CSV (119,390 rows)
  │
  ▼
EDA notebook
  │  leakage check, target distribution, correlations
  ▼
sklearn Pipeline (ColumnTransformer)
  │  impute nulls → encode categoricals → scale numerics
  ▼
Model comparison (LR baseline → RF → XGBoost)
  │  GridSearchCV, logged to MLflow
  ▼
Best model (Random Forest) → test set evaluation
  │  confusion matrix, ROC-AUC, F1
  ▼
SHAP analysis → feature importance
  │
  ▼
Gradio app → Hugging Face Spaces (public demo)

Model Results

Model	F1 Score	ROC-AUC
Logistic Regression (baseline)	0.74	0.86
Random Forest (winner ✓)	0.85	0.958
XGBoost	0.83	0.95

Random Forest narrowly edged XGBoost on F1; chosen for faster inference and a simpler dependency surface for the Hugging Face deployment.

Selected Lessons

›EDA leakage hunt comes first. Several columns (e.g. reservation_status) effectively encode the target — using them inflates accuracy to 100%. Removing them is the difference between a model and a lookup table.
›Class imbalance ≠ broken metrics. ~37% cancellation rate is mild but not balanced. F1 over accuracy as the primary metric — a 63%-accurate model that always predicts "not cancelled" is useless.
›Pipelines > manual preprocessing. sklearn ColumnTransformer + Pipeline guarantees test-set transformations match the training-set transformations. A whole class of "works in notebook, fails in production" bugs disappears.
›SHAP explanations are the deploy unlock. Showing why a booking is flagged turns "the AI says no" into a tool a hotel manager would actually trust.

← All Projects