Managing Clusters

Once a cluster is created, Krayne provides CLI commands and Python SDK functions to list, describe, scale, and delete it.

Listing clusters

List all Ray clusters in a namespace:

CLI

$ krayne get

List clusters in a specific namespace:

krayne get -n ml-team

Python SDK

from krayne.api import list_clusters

clusters = list_clusters(namespace="default")
for cluster in clusters:
    print(f"{cluster.name} — {cluster.status} ({cluster.num_workers} workers)")
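The loop above can be extended into a quick per-namespace summary. The following is a minimal sketch, not part of Krayne: `ClusterStub` is a hypothetical stand-in that exposes only the three attributes used above (`name`, `status`, `num_workers`), and the cluster names and the `"pending"` status are made up for illustration. In real use you would pass the result of `list_clusters(...)` instead.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ClusterStub:
    """Hypothetical stand-in with the attributes used in the example above."""
    name: str
    status: str
    num_workers: int


def summarize(clusters):
    """Return (clusters per status, total worker count)."""
    by_status = Counter(c.status for c in clusters)
    total_workers = sum(c.num_workers for c in clusters)
    return dict(by_status), total_workers


# Illustrative data only; replace with list_clusters(namespace="default").
clusters = [
    ClusterStub("my-cluster", "ready", 1),
    ClusterStub("train", "ready", 4),
    ClusterStub("dev", "pending", 0),
]
by_status, total_workers = summarize(clusters)
print(by_status, total_workers)
```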

Describing a cluster

Get detailed information about a specific cluster, including resource breakdowns:

CLI

$ krayne describe my-cluster

Python SDK

from krayne.api import describe_cluster

details = describe_cluster("my-cluster")
print(f"Head: {details.head.cpus} CPUs, {details.head.memory}")
for wg in details.worker_groups:
    print(f"  {wg.name}: {wg.replicas}x ({wg.cpus} CPUs, {wg.gpus} GPUs)")
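The per-group breakdown can also be rolled up into cluster-wide worker totals. This is a rough sketch under a stated assumption: that `replicas`, `cpus`, and `gpus` are plain numbers. The source does not say so, and if Krayne reports Kubernetes-style quantity strings they would need converting first. `WorkerGroupStub` is a hypothetical stand-in for the objects in `details.worker_groups`.

```python
from dataclasses import dataclass


@dataclass
class WorkerGroupStub:
    """Hypothetical stand-in; assumes numeric cpus/gpus (not confirmed by Krayne)."""
    name: str
    replicas: int
    cpus: float
    gpus: int


def total_worker_resources(worker_groups):
    """Sum CPUs and GPUs across all replicas of all worker groups."""
    total_cpus = sum(wg.replicas * wg.cpus for wg in worker_groups)
    total_gpus = sum(wg.replicas * wg.gpus for wg in worker_groups)
    return total_cpus, total_gpus


# Illustrative data only; replace with describe_cluster("my-cluster").worker_groups.
groups = [
    WorkerGroupStub("worker", replicas=2, cpus=4, gpus=0),
    WorkerGroupStub("gpu-workers", replicas=8, cpus=8, gpus=1),
]
print(total_worker_resources(groups))  # (72, 8)
```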

Scaling workers

Scale a worker group's desired, minimum, or maximum replica count:

CLI

# Set desired replicas (autoscaler adjusts within min/max bounds)
krayne scale my-cluster --replicas 4

# Adjust autoscaling bounds
krayne scale my-cluster --min-replicas 1 --max-replicas 10

# Set all at once
krayne scale my-cluster --replicas 4 --min-replicas 2 --max-replicas 8

Scale a named worker group:

krayne scale my-cluster --worker-group gpu-workers --replicas 8 -n ml-team

Python SDK

from krayne.api import scale_cluster

# Set desired replicas
info = scale_cluster("my-cluster", "default", "worker", replicas=4)

# Adjust autoscaling range
info = scale_cluster("my-cluster", "default", "worker", min_replicas=1, max_replicas=10)

Autoscaling-aware scaling

When autoscaling is enabled, only the fields you pass explicitly are patched; the autoscaler continues to manage the rest. When autoscaling is disabled, all three fields (replicas, minReplicas, maxReplicas) are pinned to the target replica count.
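The rule described here can be sketched as a small function that decides which fields a scale request would patch. This illustrates the stated behavior and is not Krayne's actual implementation. `build_scale_patch` is a hypothetical name, and raising an error when no target count is given with autoscaling disabled is an assumption, since the source does not cover that case.

```python
from typing import Optional


def build_scale_patch(
    autoscaling_enabled: bool,
    replicas: Optional[int] = None,
    min_replicas: Optional[int] = None,
    max_replicas: Optional[int] = None,
) -> dict:
    """Return the worker-group fields a scale request would patch (illustrative)."""
    if autoscaling_enabled:
        # Patch only what the caller passed; the autoscaler keeps managing the rest.
        requested = {
            "replicas": replicas,
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
        }
        return {field: value for field, value in requested.items() if value is not None}

    # Autoscaling disabled: pin all three fields to the target replica count.
    if replicas is None:
        # Assumption: a target count is required here; Krayne may behave differently.
        raise ValueError("replicas is required when autoscaling is disabled")
    return {"replicas": replicas, "minReplicas": replicas, "maxReplicas": replicas}


print(build_scale_patch(True, min_replicas=1, max_replicas=10))
# {'minReplicas': 1, 'maxReplicas': 10}
print(build_scale_patch(False, replicas=4))
# {'replicas': 4, 'minReplicas': 4, 'maxReplicas': 4}
```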

Deleting a cluster

CLI

$ krayne delete my-cluster

Skip the confirmation prompt with --force:

krayne delete my-cluster --force

Python SDK

from krayne.api import delete_cluster

delete_cluster("my-cluster", namespace="default")

Warning

Deletion is permanent. All pods, services, and data associated with the cluster are removed.

JSON output

All CLI commands support --output json (short form: -o json) for scripting and piping:

# List as JSON
krayne get --output json

# Parse with jq
krayne get --output json | jq '.[].name'

# Describe as JSON
krayne describe my-cluster -o json

Example JSON output

{
  "name": "my-cluster",
  "namespace": "default",
  "status": "ready",
  "dashboard_url": "http://10.0.0.1:8265",
  "client_url": "ray://10.0.0.1:10001",
  "num_workers": 1,
  "created_at": "2026-04-01T10:30:00Z"
}
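The same JSON can be consumed from a Python script instead of jq. The sketch below assumes that `krayne get --output json` prints an array of objects shaped like the example above, which the jq filter `.[].name` suggests but this page does not state outright. `krayne_json` and `ready_names` are hypothetical helper names, and the sample payload (including the `"pending"` status) is made up so the parsing step can run without a cluster.

```python
import json
import subprocess


def krayne_json(*args: str):
    """Run a krayne command with --output json and decode its stdout."""
    result = subprocess.run(
        ["krayne", *args, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


def ready_names(clusters: list) -> list:
    """Return the names of clusters whose status is "ready"."""
    return [c["name"] for c in clusters if c.get("status") == "ready"]


# Illustrative payload; in real use: clusters = krayne_json("get")
sample = json.loads(
    '[{"name": "my-cluster", "status": "ready"},'
    ' {"name": "dev", "status": "pending"}]'
)
print(ready_names(sample))  # ['my-cluster']
```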

Waiting for a cluster

You can wait for a cluster to be ready using the SDK:

from krayne.api import wait_until_ready

info = wait_until_ready("my-cluster", timeout=300)
print(f"Cluster ready: {info.status}")

This polls every 2 seconds until the cluster reaches the ready state or the timeout expires.
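The polling behavior can be pictured as a loop like the one below. This is a generic sketch of poll-until-ready with a deadline, not the code behind `wait_until_ready`. `poll_until_ready` and its `get_status` callback are hypothetical, and raising `TimeoutError` on expiry is an assumption, since this page does not say how a timeout is reported.

```python
import time
from typing import Callable


def poll_until_ready(
    get_status: Callable[[], str],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status every `interval` seconds until it returns "ready"."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "ready":
            return status
        if time.monotonic() >= deadline:
            # Assumption: timeouts surface as an exception.
            raise TimeoutError(f"not ready after {timeout} seconds (last status: {status})")
        sleep(interval)


# Illustrative run with canned statuses and no real sleeping.
statuses = iter(["pending", "pending", "ready"])
print(poll_until_ready(lambda: next(statuses), timeout=10, sleep=lambda _: None))
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted while waiting.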

What's next

Configuration — config sources, precedence, and defaults
Error Handling — debugging and common error solutions
CLI Reference — full command documentation