Summary and Prediction Functions
Functions for extracting coefficient statistics and for making per-row predictions with optional confidence or prediction intervals.
Summary Functions
Return coefficient statistics in tidy format (like R's broom::tidy).
Available Functions
ps.ols_summary(y, *x, with_intercept=True) -> pl.Expr
ps.ridge_summary(y, *x, lambda_=1.0, with_intercept=True) -> pl.Expr
ps.elastic_net_summary(y, *x, lambda_=1.0, alpha=0.5, with_intercept=True) -> pl.Expr
ps.wls_summary(y, weights, *x, with_intercept=True) -> pl.Expr
ps.rls_summary(y, *x, forgetting_factor=0.99, with_intercept=True) -> pl.Expr
ps.bls_summary(y, *x, lower_bound=None, upper_bound=None, with_intercept=True) -> pl.Expr
ps.logistic_summary(y, *x, with_intercept=True) -> pl.Expr
ps.poisson_summary(y, *x, with_intercept=True) -> pl.Expr
ps.negative_binomial_summary(y, *x, theta=None, with_intercept=True) -> pl.Expr
ps.tweedie_summary(y, *x, var_power=1.5, with_intercept=True) -> pl.Expr
ps.probit_summary(y, *x, with_intercept=True) -> pl.Expr
ps.cloglog_summary(y, *x, with_intercept=True) -> pl.Expr
ps.alm_summary(y, *x, distribution="normal", with_intercept=True) -> pl.Expr
Formula variants also available: ps.ols_formula_summary(formula, ...), etc.
Returns: See Summary Output
Example
# Get coefficient table per group
df.group_by("group").agg(
    ps.ols_summary("y", "x1", "x2").alias("coef")
).explode("coef").unnest("coef")
Output:

| group | term | estimate | std_error | statistic | p_value |
|-------|------|----------|-----------|-----------|---------|
| A | intercept | 1.234 | 0.123 | 10.03 | 0.000 |
| A | x1 | 0.567 | 0.045 | 12.60 | 0.000 |
| A | x2 | -0.234 | 0.067 | -3.49 | 0.001 |
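To make the columns easier to read, the short check below recomputes the `statistic` column from the example output above. It assumes the usual convention that the test statistic is the estimate divided by its standard error; that relationship is an assumption here, not something the library documentation states, although the example rows are consistent with it. The values are copied from the table and the variable names are illustrative only.

```python
# Rows copied from the example output above: (term, estimate, std_error, statistic).
rows = [
    ("intercept", 1.234, 0.123, 10.03),
    ("x1", 0.567, 0.045, 12.60),
    ("x2", -0.234, 0.067, -3.49),
]

# Assumption: statistic = estimate / std_error (the usual t-statistic convention).
recomputed = {term: round(est / se, 2) for term, est, se, _ in rows}

# True when every recomputed value agrees with the tabulated statistic to 2 decimals.
matches = all(recomputed[term] == stat for term, _, _, stat in rows)
```

A large absolute statistic relative to its reference distribution is what drives the small values in the `p_value` column.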
Prediction Functions
Return per-row predictions with optional confidence/prediction intervals.
Signature
ps.ols_predict(
    y: Union[pl.Expr, str],
    *x: Union[pl.Expr, str],
    add_intercept: bool = True,
    interval: str | None = None,  # None, "confidence", "prediction"
    level: float = 0.95,
    null_policy: str = "drop",  # "drop", "drop_y_zero_x"
) -> pl.Expr
Available Functions
All models have prediction functions:
- ols_predict, ridge_predict, elastic_net_predict
- wls_predict, rls_predict, bls_predict, nnls_predict
- logistic_predict, poisson_predict, negative_binomial_predict
- tweedie_predict, probit_predict, cloglog_predict
- alm_predict
Formula variants also available: ps.ols_formula_predict(formula, ...), etc.
Returns: See Prediction Output
Parameters
| Parameter | Description |
|---|---|
| interval | None (point only), "confidence" (mean interval), "prediction" (individual interval) |
| level | Confidence level (default 0.95 for 95% intervals) |
| null_policy | How to handle missing values: "drop" or "drop_y_zero_x" |
Example
# Per-group predictions with 95% prediction intervals
df.with_columns(
    ps.ols_predict("y", "x1", "x2", interval="prediction", level=0.95)
    .over("group").alias("pred")
).unnest("pred")
Output:

| group | y | x1 | x2 | prediction | lower | upper |
|-------|---|----|----|------------|-------|-------|
| A | 5.2 | 1.0 | 2.0 | 5.15 | 3.21 | 7.09 |
| A | 3.8 | 0.5 | 1.5 | 3.92 | 1.98 | 5.86 |
Interval Types
| Type | Description | Use Case |
|---|---|---|
| None | Point prediction only | Fast prediction |
| "confidence" | Interval for mean response | Uncertainty about regression line |
| "prediction" | Interval for individual observation | Forecasting new observations |
Note: At the same level and for the same predictor values, a prediction interval is always wider than the corresponding confidence interval, because it accounts for both the uncertainty in the regression line and the variance of an individual observation.
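For intuition about the two interval types, here is a minimal pure-Python sketch of a one-predictor OLS fit. It is not the library's implementation: the toy data and the helper `standard_errors` are hypothetical, and the formulas are the textbook OLS ones, which may differ from how intervals are computed for the regularized or GLM models listed above. Both interval types multiply a standard error by the same critical value, so comparing the standard errors is enough to compare the widths.

```python
import math

# Toy data for a one-predictor OLS fit (values are illustrative only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Closed-form least-squares estimates for slope and intercept.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar

# Residual variance with n - 2 degrees of freedom (two fitted coefficients).
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sigma2 = sum(r ** 2 for r in residuals) / (n - 2)


def standard_errors(x0):
    """Return (se_mean, se_pred) at x0 using the textbook OLS formulas."""
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = math.sqrt(sigma2 * leverage)          # uncertainty in the fitted line
    se_pred = math.sqrt(sigma2 * (1.0 + leverage))  # adds individual observation noise
    return se_mean, se_pred


se_mean, se_pred = standard_errors(3.0)
```

In this sketch the squared prediction standard error exceeds the squared mean standard error by exactly the residual variance `sigma2`, which is the extra term that widens the "prediction" interval.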
See Also
- Linear Models - Model fitting
- GLM Models - GLM fitting
- Formula Syntax - Formula interface