Rockfish's Web-Based Interface (Beta)
Access
To access Rockfish's web-based interface, your organization must first be registered with our platform. Please reach out to support@rockfish.ai to have your organization's domain added. Once the organization is added, its users can access the interface with their organization email address.
Process
With the web-based interface, you can generate synthetic data in four steps:
- Upload dataset
- Configure dataset
- Train model
- Generate synthetic data
Steps
Once you log in to the application, you will be prompted to create a project. The source dataset, the generated synthetic data, and the trained model are all associated with that project.
Upload Dataset
- Format: Currently, the web-based interface supports importing datasets in CSV or Parquet format.
- Size: You can upload files of up to 100 MB. If you want to upload a larger dataset, please contact us at support@rockfish.ai.
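The two upload constraints above can be checked locally before you open the interface. The following is a minimal sketch, not part of the Rockfish product: the function name `check_upload` is hypothetical, the format is inferred from the file extension only, and the 100 MB limit is assumed to mean 100 × 1024 × 1024 bytes.

```python
import os

# Assumptions: the limit is read as 100 MiB, and the format is inferred
# from the file extension rather than from the file contents.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024
ALLOWED_EXTENSIONS = {".csv", ".parquet"}


def check_upload(path):
    """Return a list of problems likely to block an upload (empty if none)."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes, above the {MAX_UPLOAD_BYTES}-byte limit")
    return problems
```

An empty list means the file passes both checks; otherwise each entry describes one issue to resolve before uploading.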
Configure Dataset
Once the dataset is uploaded, you can modify the model configuration before training.
Data type
By default, our system identifies the data type of the uploaded data. However, you can use the data type control to change it to one of:
- Tabular
- Session/Time series
Encoder details
Each field in the dataset must be defined as either categorical or numerical.
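To prepare for this step, it can help to draft a categorical/numerical label for each field ahead of time. The sketch below uses a simple heuristic of our own (a field is numerical only if every non-empty value parses as a number). It is an illustration, not how Rockfish detects field types, and the function name `guess_encoder_types` and the sample rows are hypothetical.

```python
def guess_encoder_types(rows):
    """Label each column 'numerical' if every non-empty value parses as a
    float, otherwise 'categorical'. `rows` is a non-empty list of dicts
    with string values, e.g. as produced by csv.DictReader."""

    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    types = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows if row[column] != ""]
        # A column with no values at all falls back to 'categorical'.
        all_numeric = bool(values) and all(is_number(v) for v in values)
        types[column] = "numerical" if all_numeric else "categorical"
    return types


# Hypothetical sample rows for illustration only.
rows = [
    {"protocol": "tcp", "bytes": "1500", "port": "443"},
    {"protocol": "udp", "bytes": "320", "port": "53"},
]
print(guess_encoder_types(rows))
```

Here `port` is labeled numerical because its values parse as numbers, even though a port is really an identifier. Cases like this are why the labels are worth reviewing by hand in the encoder details.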
Privacy Settings
If your dataset contains sensitive information that you don't want to include in training, you can use the remap settings. Currently, we support remapping the following personally identifiable information (PII) fields with Rockfish's default mapping methods to enhance privacy protection:
- Remap IP: Replaces values with randomly generated IP addresses
- Remap Email: Replaces emails with randomly generated email addresses
- Remap SSN: Masks the last 8 characters
- Remap Phone no: Replaces phone numbers with random phone numbers
- Remap Credit Card: Masks all characters
- Remap Name: Replaces names with random names
- Remap Date: Keeps two of the three values of mm, dd, yy
- Remap Date with Timestamp: Drops the timestamp
To learn more, please refer to the Remap functionality documentation.
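To make the masking-style rules concrete, here is a minimal sketch of three of them: SSN, credit card, and date with timestamp. It is not Rockfish's implementation. The mask character `*`, the choice to count separators such as dashes as characters, and the accepted timestamp layouts are all assumptions, and the function names are hypothetical.

```python
MASK_CHAR = "*"  # assumption: the mask character Rockfish uses is not stated


def mask_last(value, count):
    """Mask the last `count` characters of a string (all of it if shorter)."""
    keep = max(len(value) - count, 0)
    return value[:keep] + MASK_CHAR * (len(value) - keep)


def mask_all(value):
    """Mask every character of a string."""
    return MASK_CHAR * len(value)


def drop_timestamp(value):
    """Keep only the date part of a 'YYYY-MM-DD HH:MM:SS' or
    'YYYY-MM-DDTHH:MM:SS' string."""
    return value.replace("T", " ").split(" ")[0]


print(mask_last("123-45-6789", 8))            # SSN-style rule
print(mask_all("4111111111111111"))           # credit-card-style rule
print(drop_timestamp("2024-01-15 10:30:00"))  # date-with-timestamp rule
```

The random-replacement rules (IP, email, phone number, name) are not sketched here because the source does not describe how the replacement values are generated.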
Train Model
Rockfish's recommendation engine automatically recommends a model and epoch values based on the uploaded dataset. You can also manually select a different model that is applicable to the dataset type.
Model Selection: Rockfish currently supports 4 models.
If the uploaded dataset is tabular data, which is a common 2-dimensional dataset, the recommended models are:
- RF-Tab-GAN (Rockfish CTGAN), or
- RF-Tab-Transformer (Rockfish REaLTabFormer tabular)
If the uploaded dataset is time series data, which includes metadata fields, a timestamp field, and measurement fields, the recommended models are:
- RF-Time-GAN (Rockfish DoppelGANger), or
- RF-Time-Transformer (Rockfish REaLTabFormer time-series)
Epoch Values: The UI provides default low, medium, and high epoch values.
To learn more, please refer to the Model Training documentation.
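The pairing of dataset type and applicable models described above can be summarized as a small lookup. This is only an illustrative sketch: the model names come from this page, but the dictionary keys `"tabular"` and `"timeseries"` and the function `candidate_models` are hypothetical and are not part of any Rockfish API.

```python
# Model names are taken from this page; the keys are hypothetical labels.
RECOMMENDED_MODELS = {
    "tabular": ["RF-Tab-GAN", "RF-Tab-Transformer"],
    "timeseries": ["RF-Time-GAN", "RF-Time-Transformer"],
}


def candidate_models(dataset_type):
    """Return the models this page lists for the given dataset type."""
    try:
        return RECOMMENDED_MODELS[dataset_type]
    except KeyError:
        raise ValueError(f"unknown dataset type: {dataset_type!r}") from None


print(candidate_models("tabular"))
```

Which of the two candidates the recommendation engine picks for a given dataset is decided by Rockfish and is not modeled here.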
Generate Synthetic Data
You can specify the number of sessions (for a time series dataset) or the number of records (for a tabular dataset) you want in your synthetic dataset.
To learn more, please refer to the Synthetic Data Generation documentation.
Troubleshoot
If the generated synthetic data does not meet your requirements, use the troubleshoot mode to modify:
- Configuration
- Data Type
- Encoder Details
- Privacy settings
- Model type and Epoch values
- Sample (Session/Record) Size
Model Store
Once a model is trained on your dataset, our system automatically saves it in the Model Store. You can reuse the trained model to generate additional synthetic data.
If you have any questions, please reach out to support@rockfish.ai.