DropNulls / Manipulation Layer

Removes rows that contain null values (missing data), similar to pandas' dropna() or R's na.omit().

Key features:

  • Works with all data types (numeric, string, boolean, etc.)
  • Configurable column selection
  • Flexible removal criteria (any/all conditions)

Common applications:

  • Data cleaning for analysis
  • Ensuring dataset completeness
  • Preparing data for algorithms that can't handle nulls
  • Quality control in datasets
  • Database export preparation

Note: This operation removes only rows with null values (missing data). NaN values in floating-point columns are distinct from nulls and are left in place. To drop rows containing NaN values, use ManipulationDropNans instead.
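
The null/NaN distinction can be sketched in plain Python, assuming None stands in for a null and float("nan") for a NaN. The helper name drop_null_rows and the sample rows are hypothetical and are not part of this step's API.

```python
import math

# Hypothetical sample table: None models a null, float("nan") models a NaN.
rows = [
    {"id": 1, "reading": 2.5},
    {"id": 2, "reading": None},          # null: the value is missing
    {"id": 3, "reading": float("nan")},  # NaN: a float value, not a null
]

def drop_null_rows(rows):
    """Keep only rows in which no value is null (None); NaN is not treated as null."""
    return [row for row in rows if all(value is not None for value in row.values())]

cleaned = drop_null_rows(rows)
kept_ids = [row["id"] for row in cleaned]          # ids of the surviving rows
nan_survives = math.isnan(cleaned[-1]["reading"])  # the NaN row is still present
```

Only the row with the missing reading is removed here; the NaN row would need a separate NaN-specific pass.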

Input: Table
Output: Table

Select

[column, ...]

Columns to check for null values. Examples:

  • Required form fields
  • Key identifiers
  • Critical measurements

If empty, checks all columns.

Note: This operation handles null values (missing data) while preserving NaN values in floating-point columns.
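
As a rough illustration of column selection, the sketch below restricts the null check to a chosen set of columns and falls back to every column when the selection is empty. The function drop_nulls, its select argument, and the sample data are assumptions made for this example, not the step's actual implementation.

```python
def drop_nulls(rows, select=None):
    """Drop rows that have a null (None) in any of the `select` columns.

    An empty or omitted `select` means every column is checked.
    """
    def has_null(row):
        columns = select if select else row.keys()
        return any(row[column] is None for column in columns)

    return [row for row in rows if not has_null(row)]

# Hypothetical sample table with nulls scattered across columns.
rows = [
    {"user_id": 1, "email": "a@example.com", "note": None},
    {"user_id": 2, "email": None, "note": "call back"},
    {"user_id": None, "email": "c@example.com", "note": "new"},
]

key_only = drop_nulls(rows, select=["user_id"])  # only the key identifier is checked
all_columns = drop_nulls(rows)                   # empty selection: every column is checked
```

Checking only user_id keeps the first two rows, while checking every column removes all three, since each row has a null somewhere.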

How

enum (default: Any)

Strategy for row removal based on null presence across specified columns. Controls the balance between data completeness and data preservation.

Any

Remove rows if any specified column contains null. Stricter cleaning that ensures complete data across specified columns. Use cases:

  • Required field validation
  • Mandatory data completeness
  • Analysis requiring full records

All

Remove rows only if all specified columns contain null. Preserves rows with partial data. Appropriate for:

  • Sparse data handling
  • Maximizing data retention
  • Partial record analysis
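
The two strategies can be compared side by side in a minimal sketch. It assumes None models a null and uses a hypothetical drop_nulls helper whose how argument takes the lowercase strings "any" and "all"; these names are illustrative and are not confirmed by this documentation.

```python
def drop_nulls(rows, select=None, how="any"):
    """Drop rows based on nulls (None) found in the `select` columns.

    how="any": drop a row if at least one selected column is null.
    how="all": drop a row only if every selected column is null.
    """
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")
    combine = any if how == "any" else all

    kept = []
    for row in rows:
        columns = select if select else list(row)
        null_flags = [row[column] is None for column in columns]
        if not combine(null_flags):
            kept.append(row)
    return kept

# Hypothetical sample table: one complete row, one partial row, one empty row.
rows = [
    {"id": 1, "height": 1.8, "weight": 75.0},
    {"id": 2, "height": None, "weight": 82.5},
    {"id": 3, "height": None, "weight": None},
]

measurements = ["height", "weight"]
strict = drop_nulls(rows, select=measurements, how="any")   # complete records only
lenient = drop_nulls(rows, select=measurements, how="all")  # keeps partial records
```

With "any", only the fully populated row remains; with "all", the partially filled row is kept as well and only the row with no measurements is removed.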