Ensemble Learning with Systematic Hyperparameter Optimization for Urban-Bike-Sharing Demand Prediction

Bike sharing is an established component of urban mobility infrastructure, offering a low-emission alternative to motorized transport for short trips in cities worldwide. Accurate demand forecasting is essential for efficient system operation: it enables better bike redistribution, reduces user wait times, and lowers the operational costs associated with rebalancing. This study evaluated multiple ensemble strategies for hourly bike-sharing demand prediction, comparing bagging methods (Random Forest, Extra Trees), boosting methods (AdaBoost, Gradient Boosting Regressor, Histogram-based Gradient Boosting Regressor), and a Voting ensemble, while systematically investigating the impact of hyperparameter optimization. A repeated hold-out protocol was used, in which the dataset was randomly divided into 80% training and 20% test subsets across 10 random splits; 5-fold cross-validation was applied within each training subset exclusively for hyperparameter tuning, ensuring the test set remained unseen during model selection. Random Search and Bayesian Optimization were compared under identical budgets of 60 configurations per model. Results show that optimization substantially improves all models, with the most pronounced gains for AdaBoost (58% RMSE reduction) and Gradient Boosting Regressor (45% RMSE reduction). A Voting ensemble combining a Random Search-tuned Gradient Boosting Regressor and a Bayesian-optimized Histogram-based Gradient Boosting Regressor achieves the best overall performance (RMSE of 38.48, R² of 0.955) with the lowest variance across all repeated splits. Feature importance analysis confirms that hour of day and temperature are the dominant demand drivers, consistent with the operational patterns of urban bike-sharing systems. The performance difference between Random Search and Bayesian Optimization is negligible for most models, suggesting that well-designed search spaces allow simpler strategies to achieve competitive results.
A controlled comparison conducted under identical experimental conditions shows that the Voting ensemble is statistically equivalent to XGBoost and nominally better than LightGBM, while CatBoost achieves a statistically significant advantage, highlighting it as a strong individual alternative.
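One way such a per-split comparison between two models could be carried out is a paired test on the RMSE values obtained on the same random splits. The sketch below uses the Wilcoxon signed-rank test from SciPy; the choice of test, the significance level, and the function name `compare_paired` are assumptions, since the text above does not state which procedure the study used.

```python
import numpy as np
from scipy.stats import wilcoxon


def compare_paired(rmse_a, rmse_b, alpha=0.05):
    """Paired comparison of two models' RMSE values over the same splits.

    Assumes rmse_a[i] and rmse_b[i] were measured on the same split i.
    """
    rmse_a = np.asarray(rmse_a, dtype=float)
    rmse_b = np.asarray(rmse_b, dtype=float)
    # Two-sided Wilcoxon signed-rank test on the per-split differences.
    result = wilcoxon(rmse_a, rmse_b)
    return {
        "mean_diff": float(np.mean(rmse_a - rmse_b)),  # > 0: model B is better
        "p_value": float(result.pvalue),
        "significant": bool(result.pvalue < alpha),
    }
```

A non-significant result with a small mean difference would correspond to the "statistically equivalent" reading, while a significant result with a positive `mean_diff` would indicate an advantage for the second model.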

Related Results

Bloor Bike Lanes: Assessing The Economic Impact Of Bike Lanes In The Planning Of A 21st Century Street
Cycling and cycling-specific infrastructure are timely topics that address the mounting need for an improved and sustainable transportation network in Canadian cities (L...
Assessment of the dynamics of bike-sharing for students’ mobility in Kigali City
Abstract: Compared to other modes of transportation available today, bike sharing is favored in more than 800 cities for its low environmental impact. Members of the bike-sh...
Investigation on the impact of new bike stations on a bike-share system based on a complex bike-sharing network
Abstract: The effect of newly introduced bike stations on bike-share systems at the system, community, and station levels is investigated in this study. Changes in the topol...
Selection of Injectable Drug Product Composition using Machine Learning Models (Preprint)
Background: As of July 2020, a Web of Science search of “machine learning (ML)” nested within the search of “pharmacokinetics or pharmacodynamics” yielded over 100...
Research on Spatio-Temporal Prediction and Rebalancing Optimization for Bike-Sharing Supply-Demand Imbalance
Bike-sharing systems have emerged as a pivotal component of urban sustainable transportation, yet they frequently face challenges related to supply-demand imbalances across spatial...
Moura et al_TransportRxiv_GIRA demand_2022.pdf
Bike-sharing systems allow occasional and regular users to move by replacing other transport modes for the same trip or generating a new journey. Our research assesses the...
