FAQ
Installation and setup
Prophet is hanging on first import. What is happening?
Prophet compiles a Stan model the first time it is imported. This compilation step
takes approximately 5–10 minutes and only happens once per environment. Subsequent
imports are fast. If the hang persists beyond 15 minutes, check that CmdStan installed
correctly by running this in Python: import cmdstanpy; cmdstanpy.cmdstan_path().
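The check above can be wrapped so that it reports a readable status instead of raising. This is a minimal sketch: cmdstan_status is a hypothetical helper name and is not part of the package. It relies only on cmdstanpy.cmdstan_path(), which raises when no CmdStan installation can be located.

```python
import importlib


def cmdstan_status():
    """Return a one-line description of the CmdStan installation state.

    Hypothetical helper for illustration; not part of the package.
    """
    try:
        cmdstanpy = importlib.import_module("cmdstanpy")
    except ImportError:
        return "cmdstanpy is not installed in this environment"
    try:
        # cmdstan_path() raises if no CmdStan installation can be found.
        return f"CmdStan found at {cmdstanpy.cmdstan_path()}"
    except Exception as exc:  # broad on purpose: we only report the problem
        return f"cmdstanpy is installed but CmdStan was not located: {exc}"


if __name__ == "__main__":
    print(cmdstan_status())
```

If the status reports that CmdStan was not located, the slow first import is probably not the compilation step, and the installation itself is worth revisiting.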
I get an XGBoost error about libomp.dylib not loaded on macOS with conda.
This affects Intel Mac users who install the package via the conda path in the
Setup guide. The conda environment uses pip install -e . to
install dependencies, and the pip-distributed XGBoost binary on Intel Macs links
against OpenMP at a Homebrew path (/usr/local/opt/libomp/) that conda environments
cannot see. Running brew install libomp does not fix it — Homebrew and conda manage
libraries independently, and the library ends up in a location the conda environment
never searches.
After hitting the error, replace the pip-installed XGBoost with the conda-forge build, which bundles OpenMP internally:
If that does not resolve it, install OpenMP directly into the conda environment:
Note: this issue does not affect Apple Silicon (M-series) Macs or users following the
venv-based installation path.
I get an ImportError for NeuralProphet or PyTorch.
NeuralProphet is an optional dependency (~1 GB including PyTorch) and is not installed by default. Install it with:
I get an error about pkg_resources not found.
The package pins setuptools>=68,<71 because NeuralProphet uses pkg_resources,
which was removed in setuptools 71+. Check your setuptools version:
If it is 71 or higher, downgrade: pip install "setuptools>=68,<71".
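One way to check the installed version from Python is sketched below. It uses only the standard library (importlib.metadata); the helper names in_supported_range and installed_setuptools_version are hypothetical, and the range check simply mirrors the >=68,<71 pin described above by comparing the major version.

```python
from importlib.metadata import PackageNotFoundError, version


def in_supported_range(version_string):
    """Return True if the major version satisfies the pin >=68,<71."""
    major = int(version_string.split(".")[0])
    return 68 <= major < 71


def installed_setuptools_version():
    """Return the installed setuptools version string, or None if absent."""
    try:
        return version("setuptools")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_setuptools_version()
    if found is None:
        print("setuptools is not installed in this environment")
    elif in_supported_range(found):
        print(f"setuptools {found} is within >=68,<71")
    else:
        print(f"setuptools {found} is outside >=68,<71")
```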
Running the pipeline
How do I know if my counterfactual is credible?
Two checks in order of importance:
- Visual: open {model}_counterfactual.png and inspect the pre-event period. The observed and expected lines should track closely before the event. Divergence before the event means the model did not capture the baseline trend.
- Numeric: check {model}_metrics.csv. A low test R² (e.g., below 0.5) or a test RMSE that is large relative to the outcome's range signals poor generalization. Test-window performance, not training performance, is the relevant signal.
If either check fails, try a different model via compare_models() or add covariates
that help explain the baseline trend.
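The numeric check can also be reproduced by hand on the test window. The sketch below is generic and does not read {model}_metrics.csv; the function names are hypothetical. The 0.5 R² cutoff is the example threshold mentioned above, while the 0.2 cutoff for RMSE relative to the outcome's range is purely an illustrative assumption.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination; assumes `observed` is not constant."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error over the window."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def relative_rmse(observed, predicted):
    """RMSE expressed as a fraction of the observed range."""
    return rmse(observed, predicted) / (max(observed) - min(observed))


def looks_credible(observed, predicted, min_r2=0.5, max_relative_rmse=0.2):
    """Apply both thresholds; max_relative_rmse=0.2 is an assumed cutoff."""
    return (
        r_squared(observed, predicted) >= min_r2
        and relative_rmse(observed, predicted) <= max_relative_rmse
    )
```

Pass only the test-window observations and predictions; running it on the training window would answer a different, less relevant question.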
What block length should I use for the Moving Block Bootstrap?
The default block_length=14 is appropriate for daily data with moderate
autocorrelation (approximately two weeks of temporal dependence). It is not
automatically adapted to other data configurations.
For non-daily data or series with substantially different autocorrelation structure,
the default may produce confidence intervals that are too narrow or too wide. Adjust via
config_overrides={"bootstrap": {"block_length": <value>}}. Automated block length
selection is not currently implemented.
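To see what block_length controls, here is a minimal textbook-style sketch of a moving block bootstrap. It is not the package's implementation, and moving_block_bootstrap is a hypothetical name. Overlapping blocks of block_length consecutive values are drawn with replacement and concatenated, so dependence within a block is preserved while dependence across block boundaries is broken.

```python
import random


def moving_block_bootstrap(values, block_length, rng=None):
    """Resample `values` by concatenating randomly chosen overlapping blocks."""
    rng = rng or random.Random()
    n = len(values)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(values)")
    n_starts = n - block_length + 1  # number of distinct overlapping blocks
    resampled = []
    while len(resampled) < n:
        start = rng.randrange(n_starts)
        resampled.extend(values[start:start + block_length])
    return resampled[:n]  # trim the final block so the length matches


if __name__ == "__main__":
    series = list(range(60))
    print(moving_block_bootstrap(series, block_length=14, rng=random.Random(0)))
```

A block that is much shorter than the span of autocorrelation destroys the dependence the bootstrap is meant to preserve; a block that is very long relative to the series leaves few distinct blocks to draw from.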
My bootstrap is producing many "simulation failure" warnings.
Warnings about failed bootstrap simulations typically mean the model failed to converge on some resampled series. Common causes:
- Training window is too short for reliable refitting on bootstrap resamples.
- The series contains extreme outliers that cause numerical instability.
- NeuralProphet training stochasticity on short windows.
If more than ~10% of simulations fail (the package warns at 50%), the CI coverage may be unreliable. Check whether the full series has enough data and whether outliers should be handled before running.
Can I use this package on weekly or monthly data?
Yes, but configuration adjustments are required:
- Set m for ARIMA: config_overrides={"models": {"arima": {"m": 52}}} for weekly, {"m": 12} for monthly.
- Set freq for NeuralProphet: config_overrides={"models": {"neuralprophet": {"freq": "W"}}}.
- Consider adjusting block_length for the MBB (default 14 is calibrated for daily data).
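The three adjustments above can be combined into a single config_overrides dict. The sketch below covers the weekly case only; weekly_overrides is a hypothetical helper, and block_length is left as a required argument because no recommended non-daily value is given here.

```python
def weekly_overrides(block_length):
    """Build a config_overrides dict for weekly data.

    Hypothetical helper; `block_length` is counted in observations (weeks)
    and must be chosen by the caller.
    """
    return {
        "models": {
            "arima": {"m": 52},              # seasonal period for weekly data
            "neuralprophet": {"freq": "W"},  # weekly frequency
        },
        "bootstrap": {"block_length": block_length},
    }


if __name__ == "__main__":
    # 4 is only an illustrative value, not a recommendation.
    print(weekly_overrides(block_length=4))
```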
What is the minimum series length?
There is no hard minimum, but the series must be long enough to:
- Support the cross-validation framework (min_train_days + at least one fold of test_days + skip_days).
- Provide a reliable test window (test_days, default 365 days) for model selection.
Series shorter than about two years of daily observations will typically produce unstable tuning and cross-validation results. Shorter series are better served by simpler models (ARIMA) with reduced CV requirements.
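As a rough back-of-the-envelope check, the cross-validation requirement can be turned into arithmetic. The formula below is one reading of the first bullet (each fold consumes test_days + skip_days on top of min_train_days) and may not match the package's exact fold layout. The function name and the min_train_days and skip_days values in the usage lines are assumptions; only test_days=365 comes from the defaults mentioned above.

```python
def minimum_cv_days(min_train_days, test_days, skip_days, n_folds=1):
    """Rough lower bound on observations needed for `n_folds` CV folds.

    Assumes each fold needs test_days + skip_days beyond the initial
    min_train_days; the package's actual fold layout may differ.
    """
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    return min_train_days + n_folds * (test_days + skip_days)


if __name__ == "__main__":
    # min_train_days=365 and skip_days=0 are illustrative values only.
    needed = minimum_cv_days(min_train_days=365, test_days=365, skip_days=0)
    print(f"at least {needed} daily observations for one fold")
```

With these illustrative values a single fold already needs about two years of daily data, which is in line with the guidance above.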
Data
My covariate is missing in part of the post-event window.
Covariates must be present for every row, including the entire post-event projection window. Options:
- Shorten holdout_days to the period for which the covariate is available.
- Drop the covariate if it cannot be extended.
- Impute the post-event covariate externally (only appropriate if the imputed values would not be affected by the event itself).
I have missing values in my outcome column.
By default, the pipeline raises a ValueError on missing outcome values. Two
automated strategies are available:
# drop rows with missing outcome
run_single_its(df, ..., config_overrides={"data": {"missing_data": "drop"}})
# linear interpolation
run_single_its(df, ..., config_overrides={"data": {"missing_data": "interpolate"}})
For models sensitive to regular spacing (ARIMA, NeuralProphet), external imputation with explicit handling is preferred.
How do I switch to date-based splitting?
The default split_method="percent" sizes test/holdout windows as fractions of the
available data. To pin the windows to fixed calendar durations instead:
run_single_its(
    df, intervention_date="2022-06-01",
    config_overrides={
        "periods": {
            "split_method": "days",
            "test_days": 365,
            "holdout_days": 365,
        },
    },
)
In "days" mode, test_days must be strictly less than the number of pre-intervention
observations or the pipeline raises ValueError (a previously silent failure mode where
the training split came back empty).
Configuration
What is the config_overrides format?
config_overrides is a nested dict that mirrors the structure of params.yaml. It is
merged on top of the package defaults at the highest priority:
config_overrides = {
    "periods": {"test_days": 180, "holdout_days": 90},
    "bootstrap": {"n_sim": 500, "block_length": 14},
    "models": {"arima": {"m": 52}},
}
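To illustrate what merging a nested dict on top of defaults usually means, here is a minimal recursive sketch. It is an assumption about the general behaviour, not the package's actual merge code: deep_merge is a hypothetical name, and the default values in the usage lines are made up for the example.

```python
import copy


def deep_merge(defaults, overrides):
    """Return a new dict where `overrides` wins, merging nested dicts key by key."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # recurse into sections
        else:
            merged[key] = copy.deepcopy(value)  # scalar or new key: replace
    return merged


if __name__ == "__main__":
    # Illustrative defaults only; the real params.yaml may differ.
    defaults = {"periods": {"test_days": 365, "holdout_days": 365}}
    print(deep_merge(defaults, {"periods": {"test_days": 180}}))
```

Under this kind of merge, overriding one key in a section leaves the sibling keys in that section at their default values.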
Inside tune_model, the search space also accepts double-underscore flattened keys
for nested model parameters (e.g., "xgb__max_depth" for XGBoost's max_depth).
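The double-underscore convention maps a flat key onto a nested path. The sketch below shows that mapping in isolation; unflatten is a hypothetical helper, and how tune_model actually interprets such keys internally is not shown here. The "n_estimators" key in the usage lines is illustrative.

```python
def unflatten(flat_params, sep="__"):
    """Expand keys like 'xgb__max_depth' into nested dicts."""
    nested = {}
    for key, value in flat_params.items():
        parts = key.split(sep)
        cursor = nested
        for part in parts[:-1]:
            cursor = cursor.setdefault(part, {})  # descend, creating levels
        cursor[parts[-1]] = value
    return nested


if __name__ == "__main__":
    print(unflatten({"xgb__max_depth": 6, "n_estimators": 200}))
```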