SplitWords / String Layer
Splits strings into lists of words, using whitespace as the delimiter. Similar to Python's str.split(), R's strsplit(x, '\\s+'), or Rust's split_whitespace(). Creates a list column containing the separated words.
Handles multiple types of whitespace:
- Spaces
- Tabs
- Line breaks
- Other Unicode whitespace characters
Common applications:
- Text tokenization for NLP
- Processing space-separated data
- Extracting words from sentences
- Parsing log file entries
- Breaking down full names
- Analyzing word frequencies
- Processing structured text data
Note: Consecutive whitespace characters are treated as a single delimiter, and leading/trailing whitespace is ignored.
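The splitting rules above can be sketched with Python's str.split(), which the description names as a close analogue. This is only an illustration of the documented behavior, not the layer's implementation; the helper name split_words is hypothetical, and the exact set of Unicode characters the layer treats as whitespace is assumed to match Python's.

```python
from typing import List


def split_words(text: str) -> List[str]:
    """Split on runs of whitespace, ignoring leading and trailing whitespace.

    Hypothetical helper: calling str.split() with no arguments treats any run
    of whitespace (spaces, tabs, line breaks, Unicode spaces) as one delimiter.
    """
    return text.split()


# Consecutive whitespace of mixed kinds acts as a single delimiter.
mixed = split_words("alpha  beta\t\tgamma\n\ndelta")

# Leading and trailing whitespace produces no empty entries.
padded = split_words("   padded text   ")

# Unicode whitespace: no-break space (U+00A0) and em space (U+2003).
unicode_ws = split_words("non\u00a0breaking\u2003space")

print(mixed)       # ['alpha', 'beta', 'gamma', 'delta']
print(padded)      # ['padded', 'text']
print(unicode_ws)  # ['non', 'breaking', 'space']
```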
Select
column: The string column to split. Examples of input and results:
- 'hello world' → ['hello', 'world']
- 'John Doe Smith' → ['John', 'Doe', 'Smith']
- ' spaces trim ' → ['spaces', 'trim']
- 'line1\nline2\tword' → ['line1', 'line2', 'word']
AsColumn
name: Name for the new column. If not provided, the system generates a unique name. If the AsColumn name matches an existing column, the existing column is replaced. The name should follow valid column naming conventions.
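How the Select and AsColumn parameters interact can be sketched on a plain in-memory table. Everything here is an assumption made for illustration: the table is modeled as a dict of column lists, the function name split_words_column is hypothetical, and the fallback naming scheme ("<column>_words") is invented, since the actual generated name is not documented.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory table: column name -> list of values.
Table = Dict[str, List]


def split_words_column(table: Table, column: str,
                       as_column: Optional[str] = None) -> Table:
    """Return a copy of `table` with a list column of whitespace-separated words."""
    if as_column is None:
        # Assumed naming scheme; the real system only promises a unique name.
        as_column = f"{column}_words"
        suffix = 1
        while as_column in table:
            as_column = f"{column}_words_{suffix}"
            suffix += 1

    result = dict(table)  # shallow copy, so the input table is not modified
    # Assigning to an existing key replaces that column, mirroring the AsColumn rule.
    result[as_column] = [value.split() for value in table[column]]
    return result


table = {"text": ["hello world", "John Doe Smith", "line1\nline2\tword"]}

added = split_words_column(table, "text", as_column="words")    # new column
replaced = split_words_column(table, "text", as_column="text")  # overwrites "text"
auto_named = split_words_column(table, "text")                  # generated name

print(added["words"])
print(list(replaced))
print(list(auto_named))
```

In this sketch, passing as_column="text" leaves the table with a single column whose values are now lists, which is the replacement behavior described for AsColumn.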