Skip to content

API Conventions

Common patterns and conventions used throughout the polars-statistics API.

Expression API

All functions work as Polars expressions and integrate with group_by, over, and lazy evaluation:

import polars as pl
import polars_statistics as ps

# With group_by (aggregation)
df.group_by("group").agg(ps.ols("y", "x1", "x2").alias("model"))

# With over (window function)
df.with_columns(ps.ols("y", "x1", "x2").over("group").alias("model"))

# Lazy evaluation
df.lazy().group_by("group").agg(ps.ttest_ind("x", "y")).collect()

Column References

All functions accept column names as strings or pl.Expr:

ps.ols("y", "x1", "x2")              # String column names
ps.ols(pl.col("y"), pl.col("x1"))   # Polars expressions
ps.ols("y", pl.col("x1") * 2)       # Mixed / transformed

Return Types

Statistical Tests

Return a struct with statistic and p_value fields:

result = df.select(ps.ttest_ind("x", "y").alias("test"))
result.with_columns(
    pl.col("test").struct.field("statistic"),
    pl.col("test").struct.field("p_value"),
)

Regression Models

Return a struct with model-specific fields. See Output Structures for details.

result = df.group_by("group").agg(ps.ols("y", "x1").alias("model"))
result.with_columns(
    pl.col("model").struct.field("r_squared"),
    pl.col("model").struct.field("coefficients"),
)

Summary Functions

Return List[Struct] with coefficient statistics (like R's broom::tidy):

df.group_by("group").agg(
    ps.ols_summary("y", "x1", "x2").alias("coef")
).explode("coef").unnest("coef")
# Columns: term, estimate, std_error, statistic, p_value

Prediction Functions

Return Struct{prediction, lower, upper} per row:

df.with_columns(
    ps.ols_predict("y", "x1", "x2", interval="prediction", level=0.95)
        .over("group").alias("pred")
).unnest("pred")
# Columns: prediction, lower, upper

Common Parameters

Parameter Description Default
with_intercept Include intercept term True
alternative Test alternative: "two-sided", "less", "greater" "two-sided"
alpha Significance level 0.05
conf_level Confidence level for intervals 0.95
lambda_ L2 (Ridge) regularization strength 0.0

See Also