DataframeQuantile / Aggregation Layer
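Calculate quantiles across specified columns, similar to NumPy's percentile() (which takes percentages from 0 to 100 rather than fractions) or pandas' quantile(). A quantile q is the value below which a fraction q of the sorted data falls. Evenly spaced quantiles divide the sorted data into equal-sized portions, which summarizes how the data is distributed.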
Common quantile values:
- 0.25: First quartile (Q1)
- 0.50: Median (Q2)
- 0.75: Third quartile (Q3)
- 0.10, 0.90: First and ninth deciles
- 0.01, 0.99: 1st and 99th percentiles
Applications:
- Statistical summaries
- Box plot generation
- Income distribution analysis
- Performance benchmarking
- Risk assessment (Value at Risk)
- Quality control limits
Configuration covers the columns to process, the quantile values to compute, and the interpolation method used between data points.
Select
[column, ...] Numeric columns to compute quantiles for. If empty, all numeric columns are processed. Non-numeric columns are ignored. Selected columns should contain enough non-null values for the quantiles to be meaningful.
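The selection rule can be sketched in plain Python. This is only an illustration: the table is modeled as a dict of column name to list of values, the function name `select_numeric_columns` is hypothetical, and a column is treated as numeric when all of its non-null values are numbers. A real implementation would more likely consult the column's declared type.

```python
from numbers import Real

def select_numeric_columns(table, select=()):
    """Return {column: non-null numeric values} for the columns to process.

    `table` is a dict of column name -> list of values (an assumption made
    for this sketch). An empty `select` means "all numeric columns".
    """
    names = list(select) if select else list(table)
    selected = {}
    for name in names:
        # Nulls are dropped before quantiles are computed.
        values = [v for v in table[name] if v is not None]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_numeric = bool(values) and all(
            isinstance(v, Real) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            selected[name] = values  # non-numeric columns are skipped
    return selected

table = {
    "price": [1.5, None, 3.0],
    "city": ["a", "b", "c"],
    "qty": [1, 2, 3],
}
all_numeric = select_numeric_columns(table)
only_qty = select_numeric_columns(table, ["city", "qty"])
```

Here `all_numeric` keeps `price` (without its null) and `qty`, while `only_qty` drops `city` because it is not numeric.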
Quantiles
[, ...] Configuration for quantile calculation, specifying the target quantiles and the interpolation method.
Quantile
[f64, ...] Quantile values to compute, each in the range 0 to 1. Defaults to the quartiles [0.25, 0.50, 0.75]. Examples:
- [0.5]: Median only
- [0.1, 0.9]: 10th and 90th percentiles
- [0.01, 0.25, 0.5, 0.75, 0.99]: Detailed distribution
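As a rough illustration of how a list of quantile values maps to results, the sketch below validates the requested values and computes each one for a single column. The function name `compute_quantiles` is hypothetical, and the position rule `q * (n - 1)` with linear interpolation is an assumption borrowed from the default behavior of NumPy and pandas, not a confirmed detail of this layer.

```python
import math

DEFAULT_QUANTILES = (0.25, 0.50, 0.75)  # the default quartiles

def compute_quantiles(values, quantiles=DEFAULT_QUANTILES):
    """Return {q: value} for each requested q, using linear interpolation."""
    for q in quantiles:
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"quantile {q} is outside the range 0 to 1")
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    result = {}
    for q in quantiles:
        pos = q * (len(data) - 1)  # fractional index into the sorted data
        lo = math.floor(pos)
        hi = math.ceil(pos)
        # Interpolate between the two neighboring observations.
        result[q] = data[lo] + (pos - lo) * (data[hi] - data[lo])
    return result

quartiles = compute_quantiles([5, 1, 4, 2, 3])
median_only = compute_quantiles([5, 1, 4, 2, 3], [0.5])
```

For the five values above, `quartiles` is `{0.25: 2.0, 0.5: 3.0, 0.75: 4.0}` and `median_only` is `{0.5: 3.0}`. A value such as 1.5 is rejected with a `ValueError`.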
Interpolation
enum Method for estimating quantile values between discrete data points. The choice affects the result only when the requested quantile falls between two observations.
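Linear interpolation between the two neighboring points. The most common method; the result varies smoothly with the quantile. Example: value = v1 + fraction * (v2 - v1), where v1 and v2 are the lower and higher neighbors and fraction is the fractional part of the position.
Use the lower neighbor. Conservative estimate; the result is always a value that exists in the dataset. Example: floor of the fractional position.
Use the higher neighbor. Liberal estimate; the result is always a value that exists in the dataset. Example: ceiling of the fractional position.
Use the nearest neighbor, whichever observation lies closest to the fractional position. The result is always a value that exists in the dataset. Example: round the fractional position.
Average of the lower and higher neighbors. Balanced approach between the two extremes. Example: (lower + higher) / 2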
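The five rules above can be compared side by side with a small sketch. The method names used here ("linear", "lower", "higher", "nearest", "midpoint") are borrowed from the interpolation option of pandas' quantile() and are assumptions, since the enum's own identifiers are not shown. The position rule `q * (n - 1)` and the choice to break "nearest" ties toward the lower neighbor are also assumptions of this sketch.

```python
import math

def quantile(values, q, interpolation="linear"):
    """Quantile of `values` at q (0 to 1) using the named interpolation rule."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    pos = q * (len(data) - 1)        # fractional index into the sorted data
    lo_i = math.floor(pos)
    hi_i = math.ceil(pos)
    lower, higher = data[lo_i], data[hi_i]
    fraction = pos - lo_i            # 0 when pos lands exactly on an observation
    if interpolation == "linear":
        return lower + fraction * (higher - lower)
    if interpolation == "lower":
        return lower
    if interpolation == "higher":
        return higher
    if interpolation == "nearest":
        # Ties (fraction == 0.5) go to the lower neighbor in this sketch.
        return lower if fraction <= 0.5 else higher
    if interpolation == "midpoint":
        return (lower + higher) / 2
    raise ValueError(f"unknown interpolation: {interpolation}")

data = [10, 20, 30, 40]
results = {
    name: quantile(data, 0.25, name)
    for name in ("linear", "lower", "higher", "nearest", "midpoint")
}
```

For `data` at q = 0.25 the fractional position is 0.75, between 10 and 20, so `results` holds 17.5 for linear, 10 for lower, 20 for higher, 20 for nearest, and 15.0 for midpoint. When the position lands exactly on an observation, all five rules return that observation.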