Configuration¶

Krayne uses Pydantic v2 models for all cluster configuration. This provides type validation, sensible defaults, and clear error messages for invalid input.

Configuration sources¶

Configuration is resolved from three sources. When the same field is set in more than one source, the source with the highest precedence wins:

flowchart LR
  CLI["<b>CLI Flags</b><br/>(highest priority)"]
  YAML["<b>YAML File</b><br/>(--file cluster.yaml)"]
  Defaults["<b>Built-in Defaults</b><br/>(lowest priority)"]

  CLI --> Merge["Merge &<br/>Validate"]
  YAML --> Merge
  Defaults --> Merge
  Merge --> Config["ClusterConfig"]

CLI flags — individual options like --gpus-per-worker 1
YAML file — full cluster spec via --file cluster.yaml
Built-in defaults — sensible values so zero-config works
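
The precedence rule above can be pictured as a layered deep merge. The following is a minimal sketch using plain dictionaries, not Krayne's actual implementation; the `deep_merge` and `resolve` helpers and the sample values are hypothetical.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`.

    Nested dicts are merged key by key; any other value is replaced whole.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve(defaults: dict, yaml_values: dict, cli_values: dict) -> dict:
    """Apply the sources from lowest to highest precedence."""
    return deep_merge(deep_merge(defaults, yaml_values), cli_values)


# Hypothetical sample values, for illustration only.
defaults = {"namespace": "default", "head": {"cpus": 1, "memory": "4Gi"}}
yaml_values = {"namespace": "ml-team", "head": {"cpus": 8}}
cli_values = {"head": {"cpus": 16}}

resolved = resolve(defaults, yaml_values, cli_values)
print(resolved)
```

In this sketch the CLI value for `head.cpus` replaces the YAML value, the YAML `namespace` replaces the default, and `head.memory` falls through from the defaults because no higher source sets it.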

Minimal configuration¶

The only required field is name. Everything else has a default:

CLI

krayne create my-cluster

Python

from krayne.config import ClusterConfig
config = ClusterConfig(name="my-cluster")

YAML

name: my-cluster

This creates a cluster with:

Head: 1 CPU, 4Gi memory (control plane only: runs_tasks=False, so Ray's num-cpus=0)
1 worker group named worker: autoscaling from 0 to 1 worker (0 initial), 1 CPU and 2Gi memory per worker
Autoscaling enabled by default (Ray in-tree autoscaler, Default upscaling mode, 60 s idle timeout)
Jupyter notebook, code-server, and SSH all enabled on the head node

Autoscaling¶

Krayne enables Ray v2 in-tree autoscaling by default. The Ray autoscaler sidecar runs on the head pod and dynamically scales worker groups between configurable min/max bounds.

Default behavior¶

By default, worker groups start with min_replicas=0, replicas=0, max_replicas=1. The autoscaler will scale workers up when Ray tasks or actors demand resources, and scale them back down after an idle timeout.

CLI flags¶

# Create with custom autoscaling bounds
krayne create my-cluster --min-workers 0 --max-workers 10 --workers 2

# Disable autoscaling (pin replicas)
krayne create my-cluster --no-autoscaling --workers 4

YAML configuration¶

cluster-autoscaling.yaml

name: my-experiment
autoscaler:
  enabled: true
  idle_timeout_seconds: 120
  upscaling_mode: Aggressive   # Default, Aggressive, or Conservative
worker_groups:
  - name: gpu-workers
    replicas: 2
    min_replicas: 0
    max_replicas: 10
    gpus: 1

Python SDK¶

from krayne.config import ClusterConfig, WorkerGroupConfig, AutoscalerConfig

config = ClusterConfig(
    name="auto-cluster",
    autoscaler=AutoscalerConfig(idle_timeout_seconds=120),
    worker_groups=[
        WorkerGroupConfig(replicas=2, min_replicas=0, max_replicas=10),
    ],
)

Disabling autoscaling¶

Set autoscaler.enabled to false to pin minReplicas, maxReplicas, and replicas to the same value in the manifest. No autoscaler sidecar is added:

autoscaler:
  enabled: false
worker_groups:
  - replicas: 4
    min_replicas: 4
    max_replicas: 4
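
One way to picture the pinning rule is a small function that derives the three replica fields for a worker group. This is an illustrative sketch only: `manifest_replica_bounds` is a hypothetical helper, and the range check in the autoscaling branch is an assumption rather than documented Krayne behavior.

```python
def manifest_replica_bounds(
    replicas: int,
    min_replicas: int,
    max_replicas: int,
    autoscaling_enabled: bool,
) -> dict:
    """Compute the replica fields for one worker group (hypothetical helper)."""
    if not autoscaling_enabled:
        # Autoscaling off: pin all three fields to the requested count.
        return {
            "minReplicas": replicas,
            "maxReplicas": replicas,
            "replicas": replicas,
        }
    # Assumption: the initial count must lie within the autoscaling bounds.
    if not (min_replicas <= replicas <= max_replicas):
        raise ValueError(
            f"replicas={replicas} is outside [{min_replicas}, {max_replicas}]"
        )
    return {
        "minReplicas": min_replicas,
        "maxReplicas": max_replicas,
        "replicas": replicas,
    }


pinned = manifest_replica_bounds(4, 0, 10, autoscaling_enabled=False)
scaled = manifest_replica_bounds(2, 0, 10, autoscaling_enabled=True)
print(pinned)
print(scaled)
```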

YAML configuration¶

For complex setups, define your cluster in a YAML file:

cluster.yaml

name: my-experiment
namespace: ml-team
head:
  cpus: 8
  memory: 32Gi
worker_groups:
  - name: cpu-workers
    replicas: 4
    cpus: 15
    memory: 48Gi
  - name: gpu-workers
    replicas: 2
    gpus: 1
    image: rayproject/ray:2.41.0-gpu
services:
  notebook: true
  code_server: true

krayne create my-experiment --file cluster.yaml

Overriding YAML values with CLI flags¶

CLI flags take precedence over YAML values:

# Even if cluster.yaml sets a different worker count, this creates 4 workers
krayne create my-experiment --file cluster.yaml --workers 4

Loading YAML from Python¶

from krayne.config import load_config_from_yaml

# Basic load
config = load_config_from_yaml("cluster.yaml")

# With overrides (supports dot-notation for nested fields)
config = load_config_from_yaml(
    "cluster.yaml",
    overrides={"namespace": "staging", "head.cpus": 32},
)
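
Dot-notation overrides like `head.cpus` can be understood as a walk through nested mappings. The sketch below shows the idea on plain dictionaries; `apply_overrides` is a hypothetical helper and is not how `load_config_from_yaml` is necessarily implemented.

```python
from copy import deepcopy


def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a copy of `config` with dotted-key overrides applied."""
    result = deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            # Create intermediate mappings when the loaded YAML omitted them.
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Values mirror the cluster.yaml example; the dict stands in for parsed YAML.
loaded = {
    "name": "my-experiment",
    "namespace": "ml-team",
    "head": {"cpus": 8, "memory": "32Gi"},
}
updated = apply_overrides(loaded, {"namespace": "staging", "head.cpus": 32})
print(updated)
```

Only the addressed leaves change: `head.memory` keeps its YAML value, and the input dictionary is left untouched.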

Default values rationale¶

| Default | Value | Why |
| --- | --- | --- |
| Head CPUs | `1` | Minimum that boots the head pod (GCS, dashboard, autoscaler, postStart services). The manifest builder clamps the head up to at least `1` CPU / `4Gi` memory. |
| Head Memory | `4Gi` | Same minimum-boot rationale; clamped up if you set a smaller value. |
| Head `runs_tasks` | `false` | Ray `num-cpus=0` on the head, so user tasks are routed to workers. (No GPU support on the head; GPUs belong on workers.) |
| Worker Replicas | `0` | Autoscaler manages the worker count |
| Worker Min Replicas | `0` | Scale to zero when idle |
| Worker Max Replicas | `1` | Conservative default; increase for production |
| Worker CPUs | `1` | Small CPU request so the cluster fits on modest sandboxes |
| Worker Memory | `2Gi` | Small memory request so the cluster fits on modest sandboxes |
| Worker GPUs | `0` | CPU-only by default; opt in via `--gpus-per-worker` |
| Autoscaling | enabled | Ray in-tree autoscaler manages the worker lifecycle |
| Idle Timeout | `60s` | Scale down unused workers after 60 seconds |
| Upscaling Mode | `Default` | One of `Conservative`, `Default`, `Aggressive` |
| Notebook | enabled | Jupyter notebook on port `8888` |
| Code Server | enabled | Browser IDE on port `8443` |
| SSH | enabled | SSH to the head pod on port `22` (loopback-only inside the pod) |
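
The head-node clamping mentioned in the table can be sketched as follows. This is a rough illustration under stated assumptions: `memory_to_bytes` and `clamp_head` are hypothetical helpers, only binary memory suffixes are handled, and the real manifest builder may behave differently.

```python
# Binary suffixes used by Kubernetes memory quantities (subset only).
_BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

# Minimums taken from the defaults table.
MIN_HEAD_CPUS = 1
MIN_HEAD_MEMORY = "4Gi"


def memory_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '4Gi' or '512Mi' to a byte count."""
    for suffix, factor in _BINARY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat as a plain byte count


def clamp_head(cpus: int, memory: str) -> tuple[int, str]:
    """Raise head resources to the minimums; larger values pass through."""
    clamped_cpus = max(cpus, MIN_HEAD_CPUS)
    if memory_to_bytes(memory) < memory_to_bytes(MIN_HEAD_MEMORY):
        memory = MIN_HEAD_MEMORY
    return clamped_cpus, memory


print(clamp_head(0, "2Gi"))   # (1, '4Gi')
print(clamp_head(8, "32Gi"))  # (8, '32Gi')
```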

Config validation¶

ClusterConfig uses Pydantic's extra = "forbid" mode — unknown fields in YAML or keyword arguments raise a ConfigValidationError:

from krayne.config import ClusterConfig

# This raises ConfigValidationError — "unknown_field" is not a valid field
config = ClusterConfig(name="test", unknown_field="value")

Terminal output

╭──── Error ─────────────────────────────────╮
│ Configuration validation error:            │
│ Extra inputs are not permitted             │
╰────────────────────────────────────────────╯
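
To see where the "Extra inputs are not permitted" message comes from, here is a minimal Pydantic v2 sketch. `DemoConfig` is a hypothetical stand-in, not Krayne's `ClusterConfig`, and the example catches Pydantic's own `ValidationError`; how Krayne wraps that into `ConfigValidationError` is not shown here.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class DemoConfig(BaseModel):
    """Hypothetical stand-in for a config model with unknown fields forbidden."""

    model_config = ConfigDict(extra="forbid")

    name: str
    namespace: str = "default"  # assumed default, for illustration only


try:
    DemoConfig(name="test", unknown_field="value")
except ValidationError as exc:
    errors = exc.errors()
    # Each entry records which field failed, the error type, and a message.
    for err in errors:
        print(err["loc"], err["type"], err["msg"])
```

Inspecting `exc.errors()` gives the field location alongside the message, which helps when a YAML file contains a misspelled key.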

See Configuration Models Reference for full field definitions and types.

What's next¶

Error Handling — how to debug and handle errors
Configuration Models Reference — complete Pydantic model field tables