Datasets and Tables
Overview
The Datasets section enables users to create, manage, and organize data collections. This documentation covers dataset creation, management, and version control.
Creating Datasets
Watch this video for a visual guide on dataset creation.
Steps to Create a Dataset
- Navigate to "Datasets - Versions, Tables and Import Jobs"
- Enter a name for your new dataset
- Click "Create New Dataset"
Initial Setup
- Add Title
- Add Description
- Assign Category
- Add one or more tables to the dataset
- Commit and release the version
Next Steps
- Check the ‘Datasets’ table to see all the datasets you have created.
- Click any dataset in the table to view its details, including its description, versions, and tables.
- Click the ‘Trash’ button and type ‘delete’ to permanently delete the dataset from the ZinkML platform.
Working with Datasets
Watch this video for detailed instructions on working with Datasets.
Watch this video for detailed instructions on committing and releasing a Dataset version.
Dataset Table Overview
The main dataset table displays:
| Column | Description |
|---|---|
| Name | Dataset identifier |
| Number of Tables | Count of tables in the dataset |
| Total Size | Storage space used |
| Number of Rows | Total data rows |
| Latest Version | Current version number |
| Status | Staged or Committed |
| Last Updated | Last modification date |
| Created On | Creation date |
| Access Type | Private/Shared/Public |
| Actions | Collaboration options |
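As a rough illustration of the columns above, one row of the dataset table could be modeled as a typed record. This is only a sketch: the class name `DatasetRow`, the field names, the byte unit for size, and the sample values are all assumptions made for the example, not part of the ZinkML platform or any API it exposes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetRow:
    """Hypothetical record mirroring the columns of the dataset table."""
    name: str
    number_of_tables: int
    total_size_bytes: int  # assumed unit; the UI may display a formatted size
    number_of_rows: int
    latest_version: int
    status: str            # "Staged" or "Committed"
    last_updated: date
    created_on: date
    access_type: str       # "Private", "Shared", or "Public"


# Made-up sample rows, for illustration only.
rows = [
    DatasetRow("sales", 3, 1_200_000, 50_000, 2, "Committed",
               date(2024, 5, 1), date(2024, 1, 10), "Private"),
    DatasetRow("sensors", 1, 300_000, 8_000, 1, "Staged",
               date(2024, 5, 3), date(2024, 5, 2), "Shared"),
]

# Datasets whose latest version still has uncommitted changes.
staged = [row.name for row in rows if row.status == "Staged"]
print(staged)  # ['sensors']
```

Reading the Status column this way makes it easy to spot datasets that still need a commit before their latest version can be used.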
Version Control
- Create New Version
  - Tables from previous version automatically included
  - Option to remove existing tables
  - Add new tables [See Ingestion Documentation]
- Commit and Release Version
  - Finalizes changes
  - Assign appropriate License to the Dataset
  - Makes version available for use
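The version lifecycle described above can be sketched as a small in-memory model. It is a conceptual illustration, not ZinkML's implementation: the names `DatasetVersion`, `Status`, and `new_version` are hypothetical, and the rule that a committed version can no longer be modified is an assumption inferred from "Finalizes changes".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    STAGED = "Staged"
    COMMITTED = "Committed"


@dataclass
class DatasetVersion:
    """Hypothetical model of one dataset version."""
    number: int
    tables: set[str] = field(default_factory=set)
    status: Status = Status.STAGED
    license: Optional[str] = None

    def _require_staged(self) -> None:
        # Assumption: committing finalizes a version, so later edits are rejected.
        if self.status is not Status.STAGED:
            raise ValueError("committed versions cannot be modified")

    def add_table(self, name: str) -> None:
        self._require_staged()
        self.tables.add(name)

    def remove_table(self, name: str) -> None:
        self._require_staged()
        self.tables.discard(name)

    def commit_and_release(self, license: str) -> None:
        # A license is assigned as part of committing and releasing.
        self._require_staged()
        if not license:
            raise ValueError("a license must be assigned before release")
        self.license = license
        self.status = Status.COMMITTED


def new_version(previous: DatasetVersion) -> DatasetVersion:
    # Tables from the previous version are included automatically.
    return DatasetVersion(number=previous.number + 1, tables=set(previous.tables))


v1 = DatasetVersion(number=1)
v1.add_table("customers")
v1.add_table("orders")
v1.commit_and_release(license="CC-BY-4.0")

v2 = new_version(v1)          # starts with the tables of v1
v2.remove_table("orders")     # optionally remove an inherited table
v2.add_table("invoices")      # add a newly ingested table
print(sorted(v2.tables), v2.status.value)  # ['customers', 'invoices'] Staged
```

Copying the table set in `new_version` keeps the released version unchanged while the new one is edited, which matches the Staged and Committed states shown in the dataset table.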
Dataset Management Features
Basic Operations
- View dataset details
- Edit title and description
- Modify category
- Add/remove tables
- Check table schemas
Access Control
- Set dataset as:
  - Private (default)
  - Shared
  - Public
Deletion
- Click "Trash" button
- Type 'delete' to confirm
- The dataset is permanently removed from the platform
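The typed confirmation acts as a guard against accidental deletion. A minimal sketch of that pattern follows; the function names and the in-memory `datasets` dictionary are hypothetical, and the exact matching rules (case-sensitive, surrounding whitespace ignored) are assumptions rather than documented platform behavior.

```python
def is_delete_confirmed(typed_text: str) -> bool:
    """Return True only when the user typed the confirmation word."""
    # Assumption: the match is case-sensitive and ignores surrounding whitespace.
    return typed_text.strip() == "delete"


def delete_dataset(datasets: dict[str, object], name: str, typed_text: str) -> bool:
    """Remove the dataset only if the confirmation text matches."""
    if not is_delete_confirmed(typed_text):
        return False
    datasets.pop(name, None)  # permanent: nothing is kept for recovery
    return True


datasets = {"sales": object(), "sensors": object()}
print(delete_dataset(datasets, "sales", "yes"))     # False
print(delete_dataset(datasets, "sales", "delete"))  # True
print(sorted(datasets))                             # ['sensors']
```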
Best Practices
- Use descriptive names
- Maintain detailed descriptions
- Regularly commit versions
- Review table schemas
- Organize with categories