DropNulls / List Layer
Remove all null values from lists while preserving the order of remaining elements. Similar to Python's list comprehension [x for x in lst if x is not None] or R's na.omit(). Creates a new list column with nulls removed from each list.
Example transformation:
| lists | result |
|---|---|
| [1, null, 3, null, 5] | [1, 3, 5] |
| [null, null, 10] | [10] |
| [1, 2, 3] | [1, 2, 3] |
| [null, null] | [] |
| [] | [] |
Common applications:
- Cleaning sensor data with missing readings
- Processing survey responses with skipped questions
- Filtering incomplete measurements from experiments
- Preparing data for analysis that can't handle nulls
- Consolidating sparse event logs
- Cleaning time series with gaps
Note: The original order of non-null elements is preserved. Empty lists remain empty. Lists containing only nulls become empty lists. Useful for cleaning data while maintaining the temporal or logical sequence of valid elements.
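The behavior described above can be sketched in plain Python. This is only an illustration of the semantics, not the layer's actual implementation; the function name drop_nulls and the sample rows are hypothetical, and Python's None stands in for null.

```python
def drop_nulls(values):
    """Return a new list with None entries removed, keeping the original order."""
    return [x for x in values if x is not None]


# Sample rows mirroring the example transformation table (None stands in for null).
rows = [
    [1, None, 3, None, 5],
    [None, None, 10],
    [1, 2, 3],
    [None, None],
    [],
]

# Each list is cleaned independently, so list lengths may differ.
result = [drop_nulls(row) for row in rows]

# Only None is dropped in this sketch; falsy values such as 0 or "" are kept.
falsy_kept = drop_nulls([0, None, "", False])
```

In this sketch the input lists are left unmodified and a new list is returned for each row, which matches the layer's description of creating a new list column.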
Select
column: The list column to clean. Supports various types:
- Numeric with gaps: [1, null, 3, null]
- Strings with missing: [apple, null, orange]
- Mixed nulls: [null, 42, null, 100]
- Dates with gaps: [2024-01-01, null, 2024-01-03]
Lists can have different lengths. Original order is preserved.
AsColumn
name: Name for the new column. If not provided, the system generates a unique name. If AsColumn matches an existing column, the existing column is replaced. The name should follow valid column naming conventions.
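How the Select and AsColumn parameters interact can be illustrated with a small Python sketch that models a table as a dict of columns. The helper apply_drop_nulls is hypothetical, and the generated-name scheme (the source column name plus a _drop_nulls suffix) is an assumption made for the example; the actual naming the system uses is not specified here.

```python
def apply_drop_nulls(table, column, as_column=None):
    """Hypothetical helper mimicking the Select (column) and AsColumn (name) parameters."""
    if as_column is None:
        # Assumed naming scheme: the real generated name is not documented here.
        base = f"{column}_drop_nulls"
        as_column, n = base, 1
        while as_column in table:
            as_column = f"{base}_{n}"
            n += 1
    out = dict(table)  # shallow copy so the input table is left unchanged
    # Writing to an existing key replaces that column; a new key adds a column.
    out[as_column] = [[x for x in row if x is not None] for row in table[column]]
    return out


table = {"readings": [[1, None, 3], [None, None]]}

added = apply_drop_nulls(table, "readings", as_column="clean")        # new column
replaced = apply_drop_nulls(table, "readings", as_column="readings")  # overwrites
auto = apply_drop_nulls(table, "readings")                            # generated name
```

Here added keeps the original readings column alongside clean, replaced contains only the cleaned readings column, and auto receives a generated name that does not collide with any existing column.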