CastDatatype / Manipulation Layer
Convert columns to different data types. Similar to pandas' astype() or R's as.*() conversion functions, such as as.integer() and as.character().
Key features:
- Multiple data type options
- Batch conversion support
- Safe type conversion handling
Common applications:
- Memory optimization
- Data validation
- Format standardization
- Database compatibility
- Algorithm requirements
Note: Conversions that could lose data precision or cause overflow will raise errors unless explicitly handled.
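The note above can be illustrated with a minimal sketch in plain Python. The helper name `checked_int_cast` and its parameters are hypothetical and not part of CastDatatype; the sketch only shows the kind of range and exactness check that makes a numeric cast fail loudly instead of wrapping around or truncating.

```python
def checked_int_cast(value, bits, signed=True):
    """Cast a numeric value to a fixed-width integer, or raise instead of losing data.

    Hypothetical helper for illustration only; not the CastDatatype implementation.
    """
    if signed:
        lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    else:
        lo, hi = 0, 2 ** bits - 1

    as_int = int(value)
    if as_int != value:
        # A fractional part would be silently dropped, e.g. 3.7 -> 3.
        raise ValueError(f"{value!r} cannot be represented exactly as an integer")
    if not lo <= as_int <= hi:
        # The value would overflow the target width.
        raise OverflowError(f"{value!r} is outside [{lo}, {hi}] for a {bits}-bit type")
    return as_int
```

Under these assumptions, `checked_int_cast(127, 8)` succeeds, while `checked_int_cast(128, 8)` raises `OverflowError` and `checked_int_cast(3.7, 32)` raises `ValueError`.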
Transforms
[, ...]: List of column type conversions to perform. Multiple transforms allow batch processing of type conversions across different columns.
Select
column: Source column for type conversion. The current type and data content should be compatible with the target data type to avoid conversion errors.
Datatype
enum: Available target data types for conversion. Choose based on requirements for precision, range, and memory usage.
8-bit signed integer (-128 to 127). Use for:
- Small categorical codes
- Byte-sized data
- Memory-efficient counters
16-bit signed integer (-32,768 to 32,767). Ideal for:
- Year values
- Medium-range counts
- Audio sample data
32-bit signed integer (-2^31 to 2^31-1). Common for:
- General purpose integers
- Population counts
- Time differences
64-bit signed integer (-2^63 to 2^63-1). Suitable for:
- Large counts
- Timestamps
- Big integer calculations
8-bit unsigned integer (0 to 255). Perfect for:
- Color values
- Small positive counts
- Binary flags
16-bit unsigned integer (0 to 65,535). Use for:
- Port numbers
- Image pixel values
- Medium positive counts
32-bit unsigned integer (0 to 2^32-1). Good for:
- Large positive counts
- File sizes
- Network addresses
64-bit unsigned integer (0 to 2^64-1). Ideal for:
- Very large counts
- Unique identifiers
- Microsecond timestamps
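For the memory-optimization use case, the integer ranges listed above can be turned into a small lookup that picks the narrowest type able to hold a column's values. This is a sketch in plain Python: the labels in `INT_TYPES` are descriptive placeholders rather than the enum's actual option names, and `smallest_int_type` is a hypothetical helper.

```python
# (label, bits, signed). Labels are placeholders, not the real enum option names.
INT_TYPES = [
    ("8-bit signed", 8, True),
    ("16-bit signed", 16, True),
    ("32-bit signed", 32, True),
    ("64-bit signed", 64, True),
    ("8-bit unsigned", 8, False),
    ("16-bit unsigned", 16, False),
    ("32-bit unsigned", 32, False),
    ("64-bit unsigned", 64, False),
]


def int_range(bits, signed):
    """Return the inclusive (min, max) range of an integer type."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def smallest_int_type(values):
    """Return the label of the narrowest integer type that holds every value."""
    lo, hi = min(values), max(values)
    # Narrowest width first; at equal width, try unsigned before signed.
    candidates = sorted(INT_TYPES, key=lambda t: (t[1], t[2]))
    for label, bits, signed in candidates:
        t_lo, t_hi = int_range(bits, signed)
        if t_lo <= lo and hi <= t_hi:
            return label
    raise OverflowError("values do not fit in any 64-bit integer type")
```

For example, a column of years such as `[1990, 2024]` fits in 16 bits, while a column containing `-5` needs a signed type.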
32-bit floating point (single precision). Used for:
- Basic scientific calculations
- Memory-efficient decimals
- Graphics coordinates
64-bit floating point (double precision). Standard for:
- Financial calculations
- Scientific computing
- High-precision analytics
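The practical difference between the two floating-point widths is how many significant digits survive a conversion. The sketch below uses Python's standard `struct` module to round a 64-bit float to 32 bits and back; `to_float32` is a hypothetical helper name used only for this illustration.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and return it."""
    return struct.unpack("f", struct.pack("f", x))[0]


print(to_float32(0.5) == 0.5)    # True: 0.5 is exactly representable at both widths
print(to_float32(0.1) == 0.1)    # False: 0.1 is rounded differently at 32 bits
print(to_float32(16777217.0))    # 16777216.0: integers above 2**24 lose exactness
```

A 32-bit float keeps roughly 7 significant decimal digits and a 64-bit float roughly 15 to 16, so identifiers or large counts stored as single precision can change value when cast.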
Text data type. Essential for:
- Human-readable data
- Textual features
- Identifier preservation
Optimized storage for repeated strings. Perfect for:
- Factor variables
- Enumerated types
- Grouped text data
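One common way to get optimized storage for repeated strings is dictionary encoding: each distinct string is stored once, and the column itself holds small integer codes. The sketch below shows the idea in plain Python. It is an assumption about the general technique, not a description of how this data type is implemented internally, and both function names are hypothetical.

```python
def encode_categorical(values):
    """Split a list of strings into (categories, codes)."""
    categories = []   # each distinct string, in first-seen order
    index = {}        # string -> integer code
    codes = []        # one integer per original value
    for v in values:
        if v not in index:
            index[v] = len(categories)
            categories.append(v)
        codes.append(index[v])
    return categories, codes


def decode_categorical(categories, codes):
    """Rebuild the original list of strings from categories and codes."""
    return [categories[c] for c in codes]


categories, codes = encode_categorical(["red", "blue", "red", "red"])
print(categories)  # ['red', 'blue']
print(codes)       # [0, 1, 0, 0]
```

The saving grows with the number of repeats: a column with millions of rows but only a handful of distinct strings stores those strings once each.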
Boolean true/false values. Used for:
- Binary flags
- Condition indicators
- Yes/No data
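Converting Yes/No text to booleans needs an explicit rule for which tokens count as true or false. The sketch below is one possible rule in plain Python; the accepted tokens and the helper name `to_bool` are assumptions, since the source does not state which text values CastDatatype accepts.

```python
# Assumed token sets for illustration; the real accepted values may differ.
TRUE_TOKENS = {"true", "yes", "y", "1"}
FALSE_TOKENS = {"false", "no", "n", "0"}


def to_bool(value):
    """Convert a value to a boolean, raising on anything unrecognized."""
    if isinstance(value, bool):
        return value
    token = str(value).strip().lower()
    if token in TRUE_TOKENS:
        return True
    if token in FALSE_TOKENS:
        return False
    # Refuse to guess: an unknown token is an error, not False.
    raise ValueError(f"cannot interpret {value!r} as a boolean")
```

Raising on unknown tokens, rather than defaulting to false, keeps the conversion in line with the safe-conversion behavior described earlier.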
AsColumn
name: Name for the new column. If not provided, the system generates a unique name. If AsColumn matches an existing column, the existing column is replaced. The name should follow valid column naming conventions.
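Putting the fields together, a batch of transforms and the AsColumn naming rules can be sketched in plain Python. Everything here is hypothetical: the dictionary keys `select`, `datatype`, and `as_column` only mirror the field names above, Python callables such as `int` and `float` stand in for the Datatype enum, and the `_cast` suffix is an assumed naming scheme (the source says only that a unique name is generated).

```python
def unique_name(base, existing):
    """Return base, or base_1, base_2, ... if base is already taken."""
    if base not in existing:
        return base
    n = 1
    while f"{base}_{n}" in existing:
        n += 1
    return f"{base}_{n}"


def apply_transforms(table, transforms):
    """Apply each transform in order. table maps column name -> list of values."""
    result = dict(table)  # shallow copy so the input mapping is not modified
    for t in transforms:
        source = result[t["select"]]
        converted = [t["datatype"](v) for v in source]
        name = t.get("as_column")
        if name is None:
            # No AsColumn given: generate a name that does not collide.
            name = unique_name(t["select"] + "_cast", result)
        # If the name already exists, this assignment replaces that column.
        result[name] = converted
    return result


table = {"age": ["31", "45"], "score": [1, 2]}
out = apply_transforms(table, [
    {"select": "age", "datatype": int, "as_column": "age"},  # replaces "age"
    {"select": "score", "datatype": float},                  # auto-named column
])
print(out)
```

In this sketch the first transform overwrites `age` in place, while the second leaves `score` untouched and adds a new, automatically named column beside it.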