Managing Clusters

Once a cluster is created, Krayne provides CLI commands and SDK functions to list, inspect, scale, and delete it.


Listing clusters

List all Ray clusters in a namespace:

$ krayne get

List clusters in a specific namespace:

krayne get -n ml-team

Using the Python SDK:

from krayne.api import list_clusters

clusters = list_clusters(namespace="default")
for cluster in clusters:
    print(f"{cluster.name}: {cluster.status} ({cluster.num_workers} workers)")

Describing a cluster

Get detailed information about a specific cluster, including resource breakdowns:

$ krayne describe my-cluster

Using the Python SDK:

from krayne.api import describe_cluster

details = describe_cluster("my-cluster")
print(f"Head: {details.head.cpus} CPUs, {details.head.memory}")
for wg in details.worker_groups:
    print(f"  {wg.name}: {wg.replicas}x ({wg.cpus} CPUs, {wg.gpus} GPUs)")

Scaling workers

Scale a worker group's desired, minimum, or maximum replica count:

# Set desired replicas (autoscaler adjusts within min/max bounds)
krayne scale my-cluster --replicas 4

# Adjust autoscaling bounds
krayne scale my-cluster --min-replicas 1 --max-replicas 10

# Set all at once
krayne scale my-cluster --replicas 4 --min-replicas 2 --max-replicas 8

Scale a named worker group:

krayne scale my-cluster --worker-group gpu-workers --replicas 8 -n ml-team

Using the Python SDK:

from krayne.api import scale_cluster

# Set desired replicas
info = scale_cluster("my-cluster", "default", "worker", replicas=4)

# Adjust autoscaling range
info = scale_cluster("my-cluster", "default", "worker", min_replicas=1, max_replicas=10)

Autoscaling-aware scaling

When autoscaling is enabled, only the explicitly provided fields are patched; the autoscaler continues to manage the others. When autoscaling is disabled, all three fields (replicas, minReplicas, maxReplicas) are pinned to the target replica count.
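
The rule above can be sketched as a small pure function. This is an illustration only, not Krayne's implementation: `build_scale_patch` and its parameters are hypothetical names, and raising an error when autoscaling is disabled and no replica count is given is an assumption.

```python
from typing import Dict, Optional


def build_scale_patch(
    autoscaling_enabled: bool,
    replicas: Optional[int] = None,
    min_replicas: Optional[int] = None,
    max_replicas: Optional[int] = None,
) -> Dict[str, int]:
    """Return the worker-group fields a scale request would change (hypothetical helper)."""
    if autoscaling_enabled:
        # Patch only the fields the caller provided; the autoscaler keeps
        # managing everything that is left out.
        requested = {
            "replicas": replicas,
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
        }
        return {field: value for field, value in requested.items() if value is not None}

    # Autoscaling disabled: pin all three fields to the target replica count.
    if replicas is None:
        # Assumption: without a target count there is nothing to pin to.
        raise ValueError("replicas is required when autoscaling is disabled")
    return {"replicas": replicas, "minReplicas": replicas, "maxReplicas": replicas}


print(build_scale_patch(True, min_replicas=1, max_replicas=10))
print(build_scale_patch(False, replicas=4))
```

With autoscaling enabled, the first call leaves `replicas` out of the patch entirely; with it disabled, the second call sets all three fields to 4.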


Deleting a cluster

$ krayne delete my-cluster

Skip the confirmation prompt with --force:

krayne delete my-cluster --force

Using the Python SDK:

from krayne.api import delete_cluster

delete_cluster("my-cluster", namespace="default")

Warning

Deletion is permanent. All pods, services, and data associated with the cluster are removed.


JSON output

All CLI commands support --output json for scripting and piping:

# List as JSON
krayne get --output json

# Parse with jq
krayne get --output json | jq '.[].name'

# Describe as JSON
krayne describe my-cluster -o json

Example JSON output:

{
  "name": "my-cluster",
  "namespace": "default",
  "status": "ready",
  "dashboard_url": "http://10.0.0.1:8265",
  "client_url": "ray://10.0.0.1:10001",
  "num_workers": 1,
  "created_at": "2026-04-01T10:30:00Z"
}
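
For scripting in Python rather than jq, the JSON can be read with the standard `json` module. The sketch below parses a payload hard-coded from the example above instead of invoking the CLI. It assumes that `krayne get --output json` returns a list of such objects, as the jq expression `.[].name` suggests; `ready_cluster_names` is a hypothetical helper.

```python
import json
from typing import List

# Stand-in for text captured from `krayne get --output json`.
# Assumption: the command emits a JSON array of objects shaped like the example.
SAMPLE_OUTPUT = """
[
  {
    "name": "my-cluster",
    "namespace": "default",
    "status": "ready",
    "dashboard_url": "http://10.0.0.1:8265",
    "client_url": "ray://10.0.0.1:10001",
    "num_workers": 1,
    "created_at": "2026-04-01T10:30:00Z"
  }
]
"""


def ready_cluster_names(raw: str) -> List[str]:
    """Return the names of clusters whose status is "ready" (hypothetical helper)."""
    clusters = json.loads(raw)
    return [cluster["name"] for cluster in clusters if cluster["status"] == "ready"]


print(ready_cluster_names(SAMPLE_OUTPUT))
```

In a real script the string would come from the command's standard output, for example via `subprocess.run(..., capture_output=True, text=True)`.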

Waiting for a cluster

You can wait for a cluster to be ready using the SDK:

from krayne.api import wait_until_ready

info = wait_until_ready("my-cluster", timeout=300)
print(f"Cluster ready: {info.status}")

This polls every 2 seconds until the cluster reaches the ready state or the timeout expires.
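
The polling behaviour described above can be approximated with a generic loop. This is a sketch, not the code behind `wait_until_ready`: `poll_until_ready`, the `fetch_status` callback, and raising `TimeoutError` on expiry are all assumptions made for illustration.

```python
import time
from typing import Callable


def poll_until_ready(
    fetch_status: Callable[[], str],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status every `interval` seconds until it returns "ready"."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status == "ready":
            return status
        if clock() >= deadline:
            # Assumption: the real SDK may signal a timeout differently.
            raise TimeoutError(f"not ready after {timeout} seconds (last status: {status})")
        sleep(interval)


# Usage with a fake status source that becomes ready on the third poll.
# "creating" is a made-up status value used only for this demonstration.
statuses = iter(["creating", "creating", "ready"])
print(poll_until_ready(lambda: next(statuses), timeout=10, interval=0.01))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays; a monotonic clock avoids miscounting the timeout if the system time changes while waiting.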


What's next