DropNans / Manipulation Layer
Removes rows that contain NaN (Not a Number) values in floating-point columns. Similar to pandas' dropna() or R's na.omit(), but targets IEEE 754 NaN values specifically.
Key features:
- Selective column filtering
- Flexible row removal criteria (any/all conditions)
- Floating-point specific cleaning
Common applications:
- Data cleaning for statistical analysis
- Preparing complete case analysis
- Removing computation artifacts
- Machine learning dataset preparation
- Quality control in numerical data
Note: Only floating-point columns are considered, because NaN is a special floating-point value. If no columns are specified, all floating-point columns in the dataset are examined.
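The note above can be illustrated with a minimal sketch in plain Python. The helper name `is_nan` is hypothetical and is not part of this layer; it only shows why NaN detection is tied to floating-point values and why an equality test cannot find it.

```python
import math

nan = float("nan")

# Under IEEE 754, NaN compares unequal to every value, including itself,
# so an equality test cannot detect it. math.isnan is the reliable check.
print(nan == nan)        # False
print(math.isnan(nan))   # True


def is_nan(value):
    """Hypothetical helper: True only for a floating-point NaN."""
    return isinstance(value, float) and math.isnan(value)


print(is_nan(None))      # False: a different kind of missing value
print(is_nan("NaN"))     # False: a string, not a float
print(is_nan(0.0))       # False: an ordinary float
```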
Select
[column, ...] Floating-point columns to check for NaN values. Use cases:
- Specific measurement columns
- Critical numerical features
- Calculated fields
If empty, all floating-point columns are considered. Non-floating-point columns are ignored as they cannot contain NaN values.
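The selection rule can be sketched as follows. This is an illustration in plain Python, not the layer's implementation: the `schema` dictionary, its type labels, and the function `columns_to_check` are all assumptions made for the example.

```python
# Hypothetical schema: column name -> type label (assumed for illustration).
schema = {"id": "int", "temp": "float", "label": "str", "ratio": "float"}


def columns_to_check(schema, select=()):
    """Return the columns that would be inspected for NaN.

    An empty selection falls back to every floating-point column;
    selected columns that are not floating-point are ignored.
    """
    float_cols = [name for name, dtype in schema.items() if dtype == "float"]
    if not select:
        return float_cols
    return [name for name in select if name in float_cols]


print(columns_to_check(schema))                      # ['temp', 'ratio']
print(columns_to_check(schema, ["ratio", "label"]))  # ['ratio']
```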
How
enum Strategy for row removal based on NaN presence across specified columns. Affects how aggressively rows are filtered from the dataset.
Remove rows if any specified column contains NaN. More aggressive filtering, ensuring complete data in all specified columns. Use cases:
- Complete case analysis
- Critical data completeness
- Statistical procedures requiring full data
Remove rows only if all specified columns contain NaN. More conservative approach, preserving rows with partial data. Suitable for:
- Preserving maximum data
- Handling sparse datasets
- Partial information analysis
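The difference between the two strategies can be sketched in plain Python. The function `drop_nans` and the option names "any" and "all" are assumptions for this example, modeled on the descriptions above rather than taken from the layer's actual interface.

```python
import math

nan = float("nan")

rows = [
    {"id": 1, "a": 1.0, "b": 2.0},  # no NaN
    {"id": 2, "a": nan, "b": 3.0},  # NaN in one checked column
    {"id": 3, "a": nan, "b": nan},  # NaN in every checked column
]


def drop_nans(rows, columns, how="any"):
    """Hypothetical sketch: drop rows by NaN presence in `columns`."""
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")
    if not columns:
        return list(rows)  # nothing to check, so nothing is removed
    matches = any if how == "any" else all
    return [
        row for row in rows
        if not matches(math.isnan(row[col]) for col in columns)
    ]


print([r["id"] for r in drop_nans(rows, ["a", "b"], how="any")])  # [1]
print([r["id"] for r in drop_nans(rows, ["a", "b"], how="all")])  # [1, 2]
```

With the "any" strategy only the fully populated row survives, while "all" also keeps the row that still carries a usable value in one column.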