Requirements

Rockfish Data is a multi-service application that can run on single-node or multi-node Kubernetes clusters. For a successful deployment and operation, the Kubernetes cluster on which you deploy Rockfish Data must meet the compute and storage requirements.

If you need to run multiple synthetic datasets in parallel, you can deploy Rockfish Data in a multi-node environment with multiple worker nodes.

Single-node deployments

You can deploy Rockfish Data in a single-server environment that runs as a single-node Kubernetes cluster. The single node runs all components of the Rockfish Data application architecture, covering both the application role and the worker role.

The minimum requirements for a successful single-node deployment of Rockfish Data are as follows:

Resources   Size
CPU         16 cores
RAM         32 GB
Storage     256 GB

The majority of scenarios can be covered with the minimum setup.

Depending on the datasets you intend to synthesize, you might need a node with higher compute resources. Our highest recommended single-node configuration is listed below (more than that is typically not required):

Resources   Size
CPU         64 cores
RAM         128 GB
Storage     1 TB
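As an illustration, the single-node figures above can be turned into a small pre-deployment check. This is a minimal sketch, not part of Rockfish Data: the `NodeSpec` type and the `shortfalls` and `meets_minimum` helpers are hypothetical names, and 1 TB is assumed to mean 1024 GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Compute and storage capacity of one node (hypothetical helper type)."""
    cpu_cores: int
    ram_gb: int
    storage_gb: int


# Values copied from the single-node requirements listed above.
SINGLE_NODE_MINIMUM = NodeSpec(cpu_cores=16, ram_gb=32, storage_gb=256)
# Assumption: 1 TB is treated as 1024 GB.
SINGLE_NODE_HIGHEST = NodeSpec(cpu_cores=64, ram_gb=128, storage_gb=1024)


def shortfalls(node: NodeSpec, required: NodeSpec) -> dict[str, int]:
    """Return how far each resource falls short of `required` (empty if none)."""
    gaps = {}
    for field in ("cpu_cores", "ram_gb", "storage_gb"):
        missing = getattr(required, field) - getattr(node, field)
        if missing > 0:
            gaps[field] = missing
    return gaps


def meets_minimum(node: NodeSpec) -> bool:
    """True if the node satisfies every single-node minimum."""
    return not shortfalls(node, SINGLE_NODE_MINIMUM)
```

For example, `shortfalls(NodeSpec(8, 32, 100), SINGLE_NODE_MINIMUM)` reports the missing CPU cores and storage, while a node at or above the minimum yields an empty result.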

Note: To learn how to size your infrastructure based on the size of the datasets you want to synthesize, see the best practices for virtual machine sizes.

Multi-node deployments

For a multi-node deployment, you need a Kubernetes cluster with at least two nodes. One of the nodes functions as the application node, and the remaining nodes function as worker nodes.

Rockfish Data requires nodes with available resources that meet the computation and memory requirements. To ensure smooth operation in shared cluster environments, it is best to dedicate the nodes solely to Rockfish Data tasks. This prevents resource conflicts and ensures that all tasks required to complete AI training and to generate synthetic datasets run unobstructed. To achieve this, you can isolate the dedicated nodes by applying taints to the nodes and matching tolerations to the Rockfish Data pods. For more information, see Taints and tolerations in the Kubernetes documentation.

Application node requirements

The application node runs the web-based user interface and distributes the synthetic data generation jobs across the worker nodes.

Resources   Size
CPU         4 cores
RAM         8 GB
Storage     20 GB

Worker node requirements

The worker nodes are responsible for running and processing each synthetic dataset. The minimum requirements for a single worker node are as follows:

Resources   Size
CPU         8 cores
RAM         16 GB
Storage     128 GB

The majority of scenarios can be covered with the minimum setup.
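A planned multi-node layout can be checked against the application node and worker node minimums above with a sketch like the following. It assumes exactly one application node, as described for multi-node deployments. The `Node` type, the role labels `"application"` and `"worker"`, and `check_cluster` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    role: str  # "application" or "worker" (hypothetical labels)
    cpu_cores: int
    ram_gb: int
    storage_gb: int


# (cpu_cores, ram_gb, storage_gb) minimums taken from the requirements above.
MINIMUMS = {
    "application": (4, 8, 20),
    "worker": (8, 16, 128),
}


def check_cluster(nodes: list[Node]) -> list[str]:
    """Return a list of problems; an empty list means the layout passes."""
    problems = []
    if len(nodes) < 2:
        problems.append("a multi-node deployment needs at least two nodes")
    app_count = sum(1 for n in nodes if n.role == "application")
    if app_count != 1:
        problems.append(f"expected exactly one application node, found {app_count}")
    if not any(n.role == "worker" for n in nodes):
        problems.append("at least one worker node is required")
    for n in nodes:
        minimum = MINIMUMS.get(n.role)
        if minimum is None:
            problems.append(f"{n.name}: unknown role {n.role!r}")
            continue
        actual = (n.cpu_cores, n.ram_gb, n.storage_gb)
        labels = ("CPU cores", "RAM GB", "storage GB")
        for label, have, need in zip(labels, actual, minimum):
            if have < need:
                problems.append(f"{n.name}: {label} {have} < {need}")
    return problems
```

Returning a list of problems instead of a single boolean makes it easier to see which node and which resource needs attention before deployment.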

Depending on the datasets you intend to synthesize, you might need worker nodes with higher compute resources. Our highest recommended worker node configuration is listed below (more than that is typically not required):

Resources   Size
CPU         64 cores
RAM         128 GB
Storage     256 GB
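When worker nodes are dedicated to Rockfish Data through taints and tolerations, as recommended for multi-node deployments, it helps to understand how Kubernetes decides whether a pod tolerates a taint. The sketch below models that matching rule in plain Python so the behavior can be reasoned about offline. It does not call the Kubernetes API, and the taint key `dedicated` with value `rockfish` is a hypothetical example, not a name that Rockfish Data defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # "NoSchedule", "PreferNoSchedule", or "NoExecute"


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # "Equal" (the default) or "Exists"
    value: str = ""
    effect: str = ""  # an empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Model of the Kubernetes rule for matching one toleration to one taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    # An empty key is only valid with Exists, where it matches every key.
    if toleration.key == "":
        return toleration.operator == "Exists"
    if toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.operator == "Equal" and toleration.value == taint.value


def schedulable(pod_tolerations, node_taints) -> bool:
    """True if every hard taint (NoSchedule or NoExecute) is tolerated."""
    hard = [t for t in node_taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in pod_tolerations) for t in hard)


# Hypothetical taint, as if applied with:
#   kubectl taint nodes <node-name> dedicated=rockfish:NoSchedule
DEDICATED = Taint(key="dedicated", value="rockfish", effect="NoSchedule")
ROCKFISH_TOLERATIONS = [
    Toleration(key="dedicated", operator="Equal", value="rockfish", effect="NoSchedule")
]
```

With these definitions, `schedulable(ROCKFISH_TOLERATIONS, [DEDICATED])` is true, while a pod with no tolerations is kept off the node. Note that a toleration only permits scheduling onto a tainted node. It does not steer pods there, so dedicated nodes are usually also paired with a node selector or node affinity.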