Join / Manipulation Layer

Combine two DataFrames based on matching values in specified columns. Similar to SQL JOIN operations, pandas merge(), or R's join functions.

Example joining customer and order data:

Left Table (Customers):

ID	Name
1	Alice
2	Bob
3	Carol

Right Table (Orders):

ID	Amount
1	100
2	200
4	300

Inner Join Result:

ID	Name	Amount
1	Alice	100
2	Bob	200

Common applications:

Combining related datasets
Enriching data with attributes
Relationship analysis
Data consolidation
Cross-reference validation
Master data integration
Historical data linkage
Reference data lookup

LeftTable

RightTable

Table

SelectLeft

[column, ...]

Join key columns from left table. Examples:

Primary keys
Foreign keys
Natural keys
Composite keys (multiple columns)

Strategy

enum

Inner

Join strategy. Default is Left.

Left ~

Keep all left table rows with matching data from right table where available.

Set notation: ${(l, r) : l \in L, r \in R \cup {null}, l . key = r . key}$

Example:

Left Table:

id	name
1	Alice
2	Bob
3	Carol

Right Table:

id	amount
1	100
3	300

Result (LEFT JOIN):

id	name	amount
1	Alice	100
2	Bob	null
3	Carol	300

Key characteristics:

Preserves all left table rows
Includes matching right table data
Fills non-matches with null
Maintains left table order

Common applications:

Enriching base dataset
Optional relationship handling
Master data lookup
Preserving core records

Inner ~

Keep only rows that have matching keys in both tables.

Set notation: ${(l, r) : l \in L, r \in R, l . key = r . key}$

Example:

Left Table:

id	name
1	Alice
2	Bob
3	Carol

Right Table:

id	amount
1	100
3	300
4	400

Result (INNER JOIN):

id	name	amount
1	Alice	100
3	Carol	300

Key characteristics:

Only includes matching rows
No null values in join columns
Maintains left table order
Potentially fewer rows than either input

Common applications:

Strict relationship enforcement
Guaranteed complete records
Data validation
Matched pair analysis

Outer ~

Keep all rows from both tables, matching where possible.

Set notation: ${(l, r) : l \in L \cup {null}, r \in R \cup {null}, l . key = r . key or l = null or r = null}$

Example:

Left Table:

id	name
1	Alice
2	Bob
3	Carol

Right Table:

id	amount
1	100
3	300
4	400

Result (FULL OUTER JOIN):

id	name	amount
1	Alice	100
2	Bob	null
3	Carol	300
4	null	400

Key characteristics:

Preserves all rows from both tables
Matches where possible
Fills non-matches with null
Shows complete relationship view

Common applications:

Data completeness analysis
Relationship gap detection
Full data integration
Migration validation

Cross ~

Cartesian product - combine every row with every row.

Set notation: ${(l, r) : l \in L, r \in R}$

Example:

Left Table:

id	name
1	Alice
2	Bob

Right Table:

type	value
A	100
B	200

Result (CROSS JOIN):

id	name	type	value
1	Alice	A	100
1	Alice	B	200
2	Bob	A	100
2	Bob	B	200

Key characteristics:

No key matching required
Result size = left rows × right rows
No null values
All possible combinations

Common applications:

Generating all possibilities
Feature combination creation
Product variations
Multi-factor analysis

Semi ~

Keep left table rows that have matches in right table.

Set notation: ${l : l \in L, \exists r \in R, l . key = r . key}$

Example:

Left Table:

id	name
1	Alice
2	Bob
3	Carol

Right Table:

id	amount
1	100
3	300
4	400

Result (SEMI JOIN):

id	name
1	Alice
3	Carol

Key characteristics:

Only keeps matching left rows
Maintains left table structure
No duplicates
No right table columns

Common applications:

Filtering by existence
Finding matched records
Data subset selection
Relationship verification

Anti ~

Keep left table rows that have no matches in right table.

Set notation: ${l : l \in L, ∄ r \in R, l . key = r . key}$

Example:

Left Table:

id	name
1	Alice
2	Bob
3	Carol

Right Table:

id	amount
1	100
3	300
4	400

Result (ANTI JOIN):

id	name
2	Bob

Key characteristics:

Only keeps non-matching left rows
Maintains left table structure
No right table columns
Inverse of semi join

Common applications:

Finding missing relationships
Data gap analysis
Exception identification
Quality control checks

Validation

enum

ManyToMany

Cardinality validation ensuring relationship integrity between tables

ManyToMany ~

No uniqueness requirements:

Multiple matches allowed
No validation performed
Most flexible option

OneToOne ~

Unique keys in both tables:

One-to-one relationships
No duplicates allowed
Strict 1:1 mapping

ManyToOne ~

Unique keys in right table only:

Many-to-one relationships
Right side as lookup table
Common for reference data

OneToMany ~

Unique keys in left table only:

One-to-many relationships
Left side as master record
Common for parent-child data

SelectRight

[column, ...]

Join key columns from right table. Must correspond to left keys:

Matching data types
Same number of columns
Corresponding order
Related content