Cut / Manipulation Layer
Convert continuous numeric data into categorical intervals (bins). Similar to pandas' cut() or R's cut().
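Mathematical representation for n bins: $\mathrm{bin}(x) = i \ \text{if} \ x \in I_i, \quad i = 1, \dots, n$, where $b_0 < b_1 < \dots < b_n$ represents break points and $I_i$ is the interval between $b_{i-1}$ and $b_i$. Which endpoint each interval includes is set by LeftClosed.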
Common applications:
- Score grouping (0-60, 61-70, 71-80, 81-90, 91-100)
- Duration categorization (0-5min, 5-15min, 15+min)
- Value segmentation (Economy, Standard, Premium)
- Range analysis (Low, Medium, High)
- Performance grading (A, B, C, D, F)
- Risk categorization (Low-risk, Medium-risk, High-risk)
- Experience levels (Beginner, Intermediate, Advanced)
- Distance grouping (Local, Regional, National, International)
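The score-grouping case can be sketched in a few lines of plain Python. This is only an illustration of the binning idea, not the node's implementation: the function name `cut`, the use of `bisect`, the choice of right-closed intervals, and returning `None` for values outside every bin are all assumptions made for the example.

```python
from bisect import bisect_left

def cut(values, breaks, labels):
    """Label each value by the interval (breaks[i], breaks[i+1]] it falls in.

    Assumptions for this sketch: breaks are ascending, intervals are
    right-closed, and values outside every interval map to None.
    """
    out = []
    for v in values:
        # bisect_left finds the first break >= v; the bin sits just before it.
        i = bisect_left(breaks, v) - 1
        out.append(labels[i] if 0 <= i < len(labels) else None)
    return out

scores = [45, 65, 75, 85, 95]
grades = cut(scores, [0, 60, 70, 80, 90, 100], ["F", "D", "C", "B", "A"])
print(grades)  # ['F', 'D', 'C', 'B', 'A']
```

The sample scores all lie strictly inside a bin, so the result does not depend on how boundary values are treated.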
Select
Type: column. Numeric column to bin. Common input types:
- Continuous measurements
- Raw scores or ratings
- Numeric indicators
- Sequential values
- Quantitative metrics
Breaks
Type: [f64, ...]. Break points defining bin boundaries. Examples:
- Score ranges: [0, 60, 70, 80, 90, 100]
- Time periods: [0, 5, 15, 30, 60]
- Value tiers: [0, 100, 500, 1000, 5000]
- Percentiles: [0, 25, 50, 75, 100]
Break points must be in ascending order.
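A check for the ordering requirement might look like the sketch below. The helper name `validate_breaks` is hypothetical, and treating "ascending" as strictly ascending (no repeated break points) is an assumption; the node's actual validation and error messages are not documented here.

```python
def validate_breaks(breaks):
    """Return breaks unchanged if they are usable, otherwise raise ValueError.

    Assumption for this sketch: "ascending" means strictly increasing, so
    duplicate break points (which would create empty bins) are rejected.
    """
    if len(breaks) < 2:
        raise ValueError("need at least two break points to form one bin")
    for lo, hi in zip(breaks, breaks[1:]):
        # "not lo < hi" also rejects NaN, which compares false to everything.
        if not lo < hi:
            raise ValueError(f"breaks must be ascending: {lo} is followed by {hi}")
    return breaks

validate_breaks([0, 5, 15, 30, 60])   # passes
# validate_breaks([0, 15, 5])         # would raise ValueError
```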
Labels
Type: [string, ...]. Names for the resulting bins. Examples:
- Grades: ['F', 'D', 'C', 'B', 'A']
- Levels: ['Beginner', 'Intermediate', 'Advanced']
- Categories: ['Low', 'Medium', 'High']
- Tiers: ['Bronze', 'Silver', 'Gold', 'Platinum']
The number of labels must equal the number of breaks minus 1.
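The length rule follows from the fact that each label names the interval between two consecutive break points. The sketch below pairs them up; the helper name `pair_labels` is hypothetical, and combining the value-tier breaks with the tier labels from the examples above is an assumption made purely for illustration.

```python
def pair_labels(breaks, labels):
    """Return (lower, upper, label) triples, one per bin.

    n + 1 break points delimit n intervals, so exactly n labels are needed.
    """
    if len(labels) != len(breaks) - 1:
        raise ValueError(
            f"expected {len(breaks) - 1} labels for {len(breaks)} breaks, "
            f"got {len(labels)}"
        )
    return list(zip(breaks, breaks[1:], labels))

bins = pair_labels([0, 100, 500, 1000, 5000],
                   ["Bronze", "Silver", "Gold", "Platinum"])
for lower, upper, label in bins:
    print(f"{lower}-{upper}: {label}")
```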
LeftClosed
Type: bool. Defines interval closure:
- false (default): Right-closed (a,b]
- true: Left-closed [a,b)
Example: with breaks [0,10] and value 10:
- Right-closed: 10 belongs to the current bin
- Left-closed: 10 belongs to the next bin
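The effect of closure on a value that sits exactly on a break point can be checked with a small sketch. It uses the standard mathematical convention that right-closed means (a, b] and left-closed means [a, b). The helper `bin_index`, the extra break point 20 (added so that a "next bin" exists), and returning `None` for values outside all bins are assumptions for the example, not documented behaviour.

```python
from bisect import bisect_left, bisect_right

def bin_index(value, breaks, left_closed=False):
    """Return the 0-based bin index for value, or None if it is in no bin."""
    if left_closed:
        # [a, b): a value equal to a break point opens the following bin.
        i = bisect_right(breaks, value) - 1
    else:
        # (a, b]: a value equal to a break point closes the preceding bin.
        i = bisect_left(breaks, value) - 1
    return i if 0 <= i < len(breaks) - 1 else None

breaks = [0, 10, 20]
print(bin_index(10, breaks, left_closed=False))  # 0, i.e. (0, 10]
print(bin_index(10, breaks, left_closed=True))   # 1, i.e. [10, 20)
```

Under these assumptions the outermost break point on the open side (0 when right-closed, 20 when left-closed) falls in no bin.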
AsColumn
Type: name. Name for the new column. If not provided, the system generates a unique name. If AsColumn matches an existing column, the existing column is replaced. The name should follow valid column naming conventions.
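The naming rules can be sketched with a table modelled as a plain dict of column name to values. Everything here is hypothetical: the helper names, the dict representation, and in particular the fallback scheme ("cut", then "cut_1", "cut_2", ...), since the actual format of the generated name is not stated.

```python
def resolve_output_name(existing, as_column=None, base="cut"):
    """Pick the output column name.

    An explicit as_column is used as-is, even if it collides with an
    existing column. Otherwise an unused name is generated; the
    base/base_N pattern is an assumption for this sketch.
    """
    if as_column:
        return as_column
    name, n = base, 1
    while name in existing:
        name = f"{base}_{n}"
        n += 1
    return name

def attach_column(table, values, as_column=None):
    """Return a copy of table with the binned values attached."""
    result = dict(table)  # shallow copy so the input table is not mutated
    result[resolve_output_name(result, as_column)] = values
    return result

table = {"score": [45, 95]}
print(attach_column(table, ["F", "A"]))            # adds a generated column
print(attach_column(table, ["F", "A"], "grade"))   # adds "grade"
print(attach_column(table, ["F", "A"], "score"))   # replaces "score"
```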