Configuration Models

All cluster configuration is defined with Pydantic v2 models, which are importable from `krayne.config`:

```python
from krayne.config import (
    ClusterConfig,
    HeadNodeConfig,
    WorkerGroupConfig,
    ServicesConfig,
    load_config_from_yaml,
)
```

`ClusterConfig`

Top-level configuration for a Ray cluster. The model sets `extra = "forbid"`, so unknown fields raise a validation error.

```python
config = ClusterConfig(
    name="my-cluster",                    # required
    namespace="default",                  # optional
    head=HeadNodeConfig(),                # optional; omitted fields use defaults
    worker_groups=[WorkerGroupConfig()],  # optional; defaults to a single default worker group
    services=ServicesConfig(),            # optional; omitted fields use defaults
)
```

Field	Type	Default	Description
`name`	`str`	—	Cluster name (required)
`namespace`	`str`	`"default"`	Kubernetes namespace
`head`	`HeadNodeConfig`	See below	Head node configuration
`worker_groups`	`list[WorkerGroupConfig]`	`[WorkerGroupConfig()]`	Worker group configurations
`services`	`ServicesConfig`	See below	Enabled services
`autoscaler`	`AutoscalerConfig`	See below	Autoscaler configuration

`HeadNodeConfig`

Resource configuration for the Ray head node. By default the head acts as a control plane only: `runs_tasks=False` causes the manifest builder to set Ray's `num-cpus=0`, so user tasks are routed to workers. GPU support is intentionally omitted, since GPUs belong on workers. The manifest builder also clamps `cpus` and `memory` up to a minimum of 1 CPU / 4Gi so the head pod can boot GCS, the dashboard, the autoscaler, and the postStart services.

Field	Type	Default	Description
`cpus`	`str`	`"1"`	CPU count (clamped up to `1` minimum)
`memory`	`str`	`"4Gi"`	Memory allocation (clamped up to `4Gi` minimum)
`image`	`str | None`	`None`	Custom container image. When `None`, defaults to `rayproject/ray:<ray-version>-py<py>` derived from the installed Ray and Python versions
`runs_tasks`	`bool`	`False`	When `True`, advertise the head's CPUs to Ray (`num-cpus=<cpus>`) so user tasks can run on it
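
The clamping and `num-cpus` behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not krayne's manifest builder: the helper names (`parse_cpu`, `parse_memory`, `clamp_head_resources`, `head_num_cpus`) are hypothetical, only a subset of Kubernetes quantity suffixes is handled, and whether `num-cpus` is derived from the clamped or the configured `cpus` value is not stated in this reference.

```python
from decimal import Decimal

# Documented floor for the head pod (see the table above).
MIN_HEAD_CPUS = "1"
MIN_HEAD_MEMORY = "4Gi"

# Subset of Kubernetes quantity suffixes: binary first, then decimal.
_MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu(quantity: str) -> Decimal:
    """Convert a CPU quantity such as "500m" or "2" to a number of cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "512Mi" or "4Gi" to bytes."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))


def clamp_head_resources(cpus: str, memory: str) -> tuple[str, str]:
    """Raise cpus/memory to the floor; values above it pass through unchanged."""
    if parse_cpu(cpus) < parse_cpu(MIN_HEAD_CPUS):
        cpus = MIN_HEAD_CPUS
    if parse_memory(memory) < parse_memory(MIN_HEAD_MEMORY):
        memory = MIN_HEAD_MEMORY
    return cpus, memory


def head_num_cpus(cpus: str, runs_tasks: bool) -> str:
    """Value for Ray's num-cpus: "0" keeps user tasks off the head."""
    return cpus if runs_tasks else "0"
```

Under these assumptions, `clamp_head_resources("500m", "2Gi")` yields the floor values, while a head sized at `"4"` CPUs and `"16Gi"` is left as configured.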

`WorkerGroupConfig`

Configuration for a worker group.

Field	Type	Default	Description
`name`	`str`	`"worker"`	Worker group name
`replicas`	`int`	`0`	Desired number of worker replicas
`min_replicas`	`int`	`0`	Minimum replicas (autoscaling lower bound)
`max_replicas`	`int`	`1`	Maximum replicas (autoscaling upper bound)
`cpus`	`str`	`"1"`	CPUs per worker
`memory`	`str`	`"2Gi"`	Memory per worker
`gpus`	`int`	`0`	GPUs per worker
`image`	`str | None`	`None`	Custom container image

Replica validation

The constraint `min_replicas <= replicas <= max_replicas` is enforced, with one exception kept for backward compatibility: if `replicas` exceeds `max_replicas`, `max_replicas` is automatically adjusted upward instead of the configuration being rejected.
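
A minimal sketch of this rule, using a standard-library dataclass rather than the real Pydantic model. `WorkerGroupSketch` is a hypothetical stand-in, and two details are assumptions: that `max_replicas` is raised to exactly `replicas`, and that a `replicas` value below `min_replicas` is rejected.

```python
from dataclasses import dataclass


@dataclass
class WorkerGroupSketch:
    """Hypothetical stand-in for the replica fields of WorkerGroupConfig."""

    replicas: int = 0
    min_replicas: int = 0
    max_replicas: int = 1

    def __post_init__(self) -> None:
        # Backward compatibility: widen the upper bound instead of failing.
        # Raising it to exactly `replicas` is an assumption.
        if self.replicas > self.max_replicas:
            self.max_replicas = self.replicas
        # Every other violation of min <= replicas <= max is an error.
        if not self.min_replicas <= self.replicas <= self.max_replicas:
            raise ValueError(
                f"expected min_replicas <= replicas <= max_replicas, got "
                f"{self.min_replicas}, {self.replicas}, {self.max_replicas}"
            )
```

With this sketch, `WorkerGroupSketch(replicas=3)` ends up with `max_replicas == 3`, while `WorkerGroupSketch(replicas=1, min_replicas=2)` raises `ValueError`.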

`AutoscalerConfig`

Configuration for the Ray v2 in-tree autoscaler. When enabled, the KubeRay operator injects an autoscaler sidecar into the head pod that dynamically adjusts worker group replica counts.

Field	Type	Default	Description
`enabled`	`bool`	`True`	Enable Ray v2 in-tree autoscaling
`idle_timeout_seconds`	`int`	`60`	Seconds before idle workers are scaled down
`upscaling_mode`	`str`	`"Default"`	Upscaling strategy: `Default`, `Aggressive`, or `Conservative`
`cpu`	`str`	`"500m"`	CPU request/limit for the autoscaler sidecar container
`memory`	`str`	`"512Mi"`	Memory request/limit for the autoscaler sidecar container

`ServicesConfig`

Services to enable on the cluster head node. Each enabled service adds its port to the head pod spec and populates the corresponding URL in `ClusterInfo`.

Field	Type	Default	Port	Description
`notebook`	`bool`	`True`	8888	Jupyter notebook server (runs in the `ray-head` container)
`code_server`	`bool`	`True`	8443	Code Server, installed from a standalone pre-built binary at container startup
`ssh`	`bool`	`True`	22	SSH access to the head node as the `ray` user. `openssh-server` is installed and configured at container startup if missing.
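
To illustrate how service flags could translate into head pod ports, here is a small sketch. The port numbers come from the table above; the function name `enabled_service_ports` and the port names in the output are hypothetical, and the real manifest builder may structure its output differently.

```python
# Port numbers from the ServicesConfig table.
SERVICE_PORTS = {"notebook": 8888, "code_server": 8443, "ssh": 22}


def enabled_service_ports(
    notebook: bool = True,
    code_server: bool = True,
    ssh: bool = True,
) -> list[dict]:
    """Return containerPort entries for the enabled services only."""
    flags = {"notebook": notebook, "code_server": code_server, "ssh": ssh}
    return [
        # Kubernetes port names may not contain underscores.
        {"name": name.replace("_", "-"), "containerPort": port}
        for name, port in SERVICE_PORTS.items()
        if flags[name]
    ]
```

Disabling a service, for example `enabled_service_ports(ssh=False)`, simply omits its entry from the list.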

How SSH access works

The bootstrap binds `sshd` to `127.0.0.1` only and empties the `ray` user's password. Root login is explicitly disabled (`PermitRootLogin no`); the only allowed login user is `ray` (uid 1000), the same user that runs the Ray processes, so files you create over SSH are owned by the same user Ray runs as.

The only path in is `kubectl port-forward`, which targets the pod's loopback interface, so `sshd` is not reachable from other pods, the Service IP, or outside the cluster. Because Kubernetes RBAC is what authorizes the port-forward, users do not need to manage SSH keys. To connect:

```shell
krayne tun-open my-cluster
ssh -p <ssh-local-port> -o StrictHostKeyChecking=no ray@localhost
```

The local port is shown by `krayne describe my-cluster` and in the TUI's Services tab. Initial bootstrap can take about 30 seconds after the cluster reaches the ready state, while `apt-get install openssh-server` runs on the head pod.
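
Because of that bootstrap delay, a script that connects right after the tunnel opens may want to wait until `sshd` actually answers. The sketch below is not part of krayne; `ssh_banner_ready` and `wait_for_ssh` are hypothetical helpers. It checks for the `SSH-` identification string that an SSH server sends on connect, since a port-forward can accept local TCP connections before anything is listening in the pod. It assumes the server sends no other lines before that string.

```python
import socket
import time


def ssh_banner_ready(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if the endpoint greets us with an SSH identification string."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            return conn.recv(64).startswith(b"SSH-")
    except OSError:
        # Refused, reset, or timed out: sshd is not answering yet.
        return False


def wait_for_ssh(
    host: str, port: int, timeout: float = 60.0, interval: float = 2.0
) -> bool:
    """Poll until sshd answers or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if ssh_banner_ready(host, port):
            return True
        time.sleep(interval)
    return False
```

A call such as `wait_for_ssh("localhost", local_port)` would use the local port reported for the tunnel; `local_port` is a placeholder for that value.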

`load_config_from_yaml`

Load a `ClusterConfig` from a YAML file with optional field overrides.

```python
def load_config_from_yaml(
    path: str | Path,
    overrides: dict[str, Any] | None = None,
) -> ClusterConfig
```

Parameters:

Parameter	Type	Default	Description
`path`	`str | Path`	—	Path to YAML configuration file (required)
`overrides`	`dict[str, Any] | None`	`None`	Field values taking precedence over YAML. Supports dot-notation keys.

Returns: `ClusterConfig`

Raises: `ConfigValidationError` on validation failure

Example:

```python
from krayne.config import load_config_from_yaml

# Basic load
config = load_config_from_yaml("cluster.yaml")

# With overrides (supports dot-notation for nested fields).
# `cpus` is documented as a `str`, so the value is passed as a string.
config = load_config_from_yaml(
    "cluster.yaml",
    overrides={"namespace": "staging", "head.cpus": "32"},
)
```
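
One way dot-notation overrides can take precedence over YAML values is to expand each dotted key into a nested dictionary and deep-merge it over the parsed file. The sketch below shows that idea only; `expand_dotted` and `deep_merge` are hypothetical names, and the actual loader may resolve overrides differently.

```python
from typing import Any


def expand_dotted(overrides: dict[str, Any]) -> dict[str, Any]:
    """Turn {"head.cpus": "32"} into {"head": {"cpus": "32"}}."""
    nested: dict[str, Any] = {}
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def deep_merge(base: dict[str, Any], extra: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `base` where values from `extra` win, merging nested dicts."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Stand-in for the parsed contents of cluster.yaml.
yaml_data = {
    "name": "my-cluster",
    "namespace": "default",
    "head": {"cpus": "4", "memory": "8Gi"},
}
merged = deep_merge(
    yaml_data,
    expand_dotted({"namespace": "staging", "head.cpus": "32"}),
)
```

In this sketch `merged["head"]` keeps the `memory` value from the file while `cpus` comes from the override, and `yaml_data` itself is left unmodified.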

Settings

User-level settings are persisted in `~/.krayne/config.yaml` and managed via these functions:

```python
from krayne.config import (
    KrayneSettings,
    load_krayne_settings,
    save_krayne_settings,
    clear_krayne_settings,
)
```

`KrayneSettings`

```python
from dataclasses import dataclass


@dataclass
class KrayneSettings:
    kubeconfig: str | None = None
    kube_context: str | None = None
```

`load_krayne_settings() -> KrayneSettings`

Load settings from `~/.krayne/config.yaml`, returning defaults if the file is absent.

`save_krayne_settings(settings: KrayneSettings) -> None`

Write settings to `~/.krayne/config.yaml`, creating the directory if needed.

`clear_krayne_settings() -> None`

Remove the settings file if it exists.
