CaseWhenColumns / Manipulation Layer

Create a new column by comparing and mapping values between columns. Similar to SQL's CASE WHEN with column references or pandas' np.where() with column conditions. All comparisons and replacements use column values rather than constants.

Common applications:

  • Data reconciliation (comparing reported vs. actual values)
  • Cross-system validation (matching values across sources)
  • Dynamic value mapping (lookup-style operations)
  • Quality control (comparing duplicate measurements)
  • Historical comparisons (current vs. previous values)

Example: Sales Performance Analysis

Current SalesTargetPrevious YearStatus ColumnResult Column
120,000100,00090,000'Growing''Excellent'
80,000100,00085,000'Declining''Warning'
95,000100,00092,000'Stable''Good'
70,000100,00075,000'Declining''Alert'

Implementation logic:

CASE 
  WHEN current_sales > target THEN status_column
  WHEN current_sales > previous_year THEN performance_rating
  WHEN current_sales < target * 0.8 THEN alert_level
  ELSE default_status
END

Key features:

  • Both comparison values and results come from columns
  • Supports dynamic value mapping based on data
  • Enables complex data validation scenarios
  • Allows flexible type conversion through AsDatatype
  • Handles null values with configurable strategies

Note: Column data types should be compatible for comparisons. Use AsDatatype to ensure proper type conversion in the output column.

Table
0
0
Table

CaseList

[, ...]

Defines a conditional case where both comparison values and results come from columns. Enables dynamic, data-driven conditional logic based on column contents.

The column to evaluate in the condition. Forms the left side of the comparison. Example: current_balance, temperature_reading, stock_level

EqualTo

The operator to use for comparison

EqualTo ~

Equal.

NotEqualTo ~

Not equal.

LessThan ~

Less than.

LessThanOrEqualTo ~

Less than or equal.

GreaterThan ~

Greater than.

GreaterThanOrEqualTo ~

Greater than or equal.

StrContains ~

Contains.

StrDoesNotContain ~

Does not contain.

StrStartsWith ~

Begins with.

StrDoesNotStartWith ~

Does not begin with.

StrEndsWith ~

Ends with.

StrDoesNotEndWith ~

Does not end with.

The column containing values to compare against. Forms the right side of the comparison. Must contain values compatible with SelectWhen column for comparison.

The column containing values to use when the condition is true. These values will be converted to the specified AsDatatype in the output column.

FillNull

Specifies that when no conditions match, the resulting value should be null. This is a conservative approach to handling unmatched cases, making it explicit that no conditions were satisfied.

Default value to use when no conditions match. Similar to ELSE in SQL CASE statement. Provides a fallback value for unmatched cases. Example: 'Other', 'Unknown', -1

Column to draw default values from when no conditions match. Allows dynamic fallback values based on existing data. Useful for preserving original values in specific cases.

Name for the new column. If not provided, the system generates a unique name. If AsColumn matches an existing column, the existing column is replaced. The name should follow valid column naming conventions.

String

Specifies the output data type for the new column. Enables type control for the conditional results.

Int8 ~

8-bit signed integer (-128 to 127). For small-range discrete values.

Int16 ~

16-bit signed integer (-32,768 to 32,767). For moderate-range whole numbers.

Int32 ~

32-bit signed integer (-2^31 to 2^31-1). Common for general integer data.

Int64 ~

64-bit signed integer (-2^63 to 2^63-1). For large whole numbers.

Uint8 ~

8-bit unsigned integer (0 to 255). For small positive numbers.

Uint16 ~

16-bit unsigned integer (0 to 65,535). For moderate positive ranges.

Uint32 ~

32-bit unsigned integer (0 to 4,294,967,295). For larger positive numbers.

Uint64 ~

64-bit unsigned integer (0 to 2^64-1). For very large positive numbers.

Float32 ~

32-bit floating point. For decimal numbers with moderate precision.

Float64 ~

64-bit floating point. For high-precision decimal numbers.

String ~

Text data type. Default type for flexible value representation.

Categorical ~

Categorical string type. For efficient storage of repeated string values.

Bool ~

Boolean type. For true/false results.