
polars-statistics

High-performance statistical testing and regression for Polars DataFrames, powered by Rust.


Features

  • Native Polars Expressions — Full support for group_by, over, and lazy evaluation
  • Statistical Tests — Parametric, non-parametric, distributional, correlation, categorical, and TOST equivalence tests
  • Regression Models — OLS, Ridge, Elastic Net, WLS, Quantile, Isotonic, GLMs, ALM (24+ distributions)
  • Formula Syntax — R-style formulas with polynomial and interaction effects
  • Diagnostics — Condition number, quasi-separation detection, count sparsity checks
  • High Performance — Rust-powered with zero-copy data transfer and automatic parallelization

Installation

pip install polars-statistics

Quick Example

import polars as pl
import polars_statistics as ps

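df = pl.DataFrame({
    "group": ["A"] * 50 + ["B"] * 50,
    "y": [...],   # placeholder: supply 100 numeric values
    "x1": [...],  # placeholder: supply 100 numeric values
    "x2": [...],  # placeholder: supply 100 numeric values
})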

# OLS regression per group
result = df.group_by("group").agg(
    ps.ols("y", "x1", "x2").alias("model")
)

# Extract results
result.with_columns(
    pl.col("model").struct.field("r_squared"),
    pl.col("model").struct.field("coefficients"),
)

Examples

Example             | Description
Hypothesis Testing  | Check assumptions, choose tests, interpret results
Regression Workflow | Fit, summarize, predict, diagnose
Group-wise Analysis | group_by and over patterns
A/B Testing         | Proportions, equivalence, per-segment analysis

What's in the Docs

Section           | Description
Getting Started   | Installation and first examples
API Conventions   | Common patterns across all functions
Statistical Tests | 30+ hypothesis tests
Regression        | Linear, GLM, ALM, dynamic models
Model Classes     | Direct Python class access
R Validation      | R-vs-Rust numerical agreement with reference values
Output Structures | Return type definitions