Configuration Models¶
All cluster configuration is defined using Pydantic v2 models. Models are importable from krayne.config.
```python
from krayne.config import (
    ClusterConfig,
    HeadNodeConfig,
    WorkerGroupConfig,
    ServicesConfig,
    load_config_from_yaml,
)
```
ClusterConfig¶
Top-level configuration for a Ray cluster. Uses extra = "forbid" — unknown fields raise a validation error.
```python
config = ClusterConfig(
    name="my-cluster",                    # required
    namespace="default",                  # optional
    head=HeadNodeConfig(),                # optional; defaults apply when omitted
    worker_groups=[WorkerGroupConfig()],  # optional; defaults to a single default worker group
    services=ServicesConfig(),            # optional; defaults apply when omitted
)
```
| Field | Type | Default | Description |
|---|---|---|---|
| name | str | (required) | Cluster name |
| namespace | str | "default" | Kubernetes namespace |
| head | HeadNodeConfig | See below | Head node configuration |
| worker_groups | list[WorkerGroupConfig] | [WorkerGroupConfig()] | Worker group configurations |
| services | ServicesConfig | See below | Enabled services |
| autoscaler | AutoscalerConfig | See below | Autoscaler configuration |
HeadNodeConfig¶
Resource configuration for the Ray head node. By default the head acts as a control plane only: with runs_tasks=False, the manifest builder sets Ray's num-cpus=0, so user tasks are scheduled on workers. GPU support is intentionally omitted; GPUs belong on workers. The manifest builder also raises cpus and memory to at least 1 CPU and 4Gi so that the head pod can boot GCS, the dashboard, the autoscaler, and the postStart services.
| Field | Type | Default | Description |
|---|---|---|---|
| cpus | str | "1" | CPU count (clamped up to 1 minimum) |
| memory | str | "4Gi" | Memory allocation (clamped up to 4Gi minimum) |
| image | str \| None | None | Custom container image. When None, defaults to `rayproject/ray:<ray-version>-py<py>`, derived from the installed Ray and Python versions |
| runs_tasks | bool | False | When True, advertise the head's CPUs to Ray (`num-cpus=<cpus>`) so user tasks can run on it |
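The clamping and runs_tasks behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions, not krayne's manifest builder: the names parse_cpu, parse_memory_bytes, HeadResources, and effective_head_resources are hypothetical, only the common Kubernetes quantity suffixes are handled, and cpus is passed through to num-cpus unchanged.

```python
from dataclasses import dataclass

# Binary and decimal suffixes commonly used in Kubernetes memory quantities.
_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}

MIN_HEAD_CPUS = "1"      # minimum stated in the HeadNodeConfig table
MIN_HEAD_MEMORY = "4Gi"  # minimum stated in the HeadNodeConfig table


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "4Gi" or "512Mi" to bytes."""
    # Try two-letter suffixes first so that "Gi" is not mistaken for "G".
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _MEMORY_UNITS[suffix])
    return int(quantity)


@dataclass
class HeadResources:
    cpus: str
    memory: str
    num_cpus: str  # value advertised to Ray as num-cpus


def effective_head_resources(cpus: str, memory: str, runs_tasks: bool) -> HeadResources:
    """Raise cpus and memory to the minimums, then derive Ray's num-cpus."""
    if parse_cpu(cpus) < parse_cpu(MIN_HEAD_CPUS):
        cpus = MIN_HEAD_CPUS
    if parse_memory_bytes(memory) < parse_memory_bytes(MIN_HEAD_MEMORY):
        memory = MIN_HEAD_MEMORY
    # A control-plane-only head advertises zero CPUs so tasks land on workers.
    num_cpus = cpus if runs_tasks else "0"
    return HeadResources(cpus=cpus, memory=memory, num_cpus=num_cpus)
```

With these assumptions, a request of "500m" CPU and "2Gi" memory is raised to "1" and "4Gi", while larger requests pass through untouched.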
WorkerGroupConfig¶
Configuration for a worker group.
| Field | Type | Default | Description |
|---|---|---|---|
| name | str | "worker" | Worker group name |
| replicas | int | 0 | Desired number of worker replicas |
| min_replicas | int | 0 | Minimum replicas (autoscaling lower bound) |
| max_replicas | int | 1 | Maximum replicas (autoscaling upper bound) |
| cpus | str | "1" | CPUs per worker |
| memory | str | "2Gi" | Memory per worker |
| gpus | int | 0 | GPUs per worker |
| image | str \| None | None | Custom container image |
Replica validation
The constraint min_replicas <= replicas <= max_replicas is enforced, with one exception kept for backward compatibility: if replicas exceeds max_replicas, max_replicas is automatically raised instead of the configuration being rejected.
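The rule can be illustrated with a small stand-in class. This is a sketch, not the actual Pydantic validator: WorkerGroupSketch is a hypothetical name, it raises a plain ValueError rather than krayne's own error type, and it assumes max_replicas is raised to exactly replicas, which the text above does not spell out.

```python
from dataclasses import dataclass


@dataclass
class WorkerGroupSketch:
    """Stand-in for WorkerGroupConfig holding only the replica fields."""

    replicas: int = 0
    min_replicas: int = 0
    max_replicas: int = 1

    def __post_init__(self) -> None:
        # Backward compatibility: widen the upper bound instead of rejecting.
        if self.replicas > self.max_replicas:
            self.max_replicas = self.replicas
        # After the adjustment, only the lower bound can still be violated.
        if not (self.min_replicas <= self.replicas <= self.max_replicas):
            raise ValueError(
                f"expected min_replicas <= replicas <= max_replicas, got "
                f"{self.min_replicas} <= {self.replicas} <= {self.max_replicas}"
            )
```

Under this sketch, WorkerGroupSketch(replicas=5, max_replicas=2) ends up with max_replicas equal to 5, whereas replicas=1 with min_replicas=3 is rejected.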
AutoscalerConfig¶
Configuration for the Ray v2 in-tree autoscaler. When enabled, the KubeRay operator injects an autoscaler sidecar into the head pod that dynamically adjusts worker group replica counts.
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | True | Enable Ray v2 in-tree autoscaling |
| idle_timeout_seconds | int | 60 | Seconds before idle workers are scaled down |
| upscaling_mode | str | "Default" | Upscaling strategy: Default, Aggressive, or Conservative |
| cpu | str | "500m" | CPU request/limit for the autoscaler sidecar container |
| memory | str | "512Mi" | Memory request/limit for the autoscaler sidecar container |
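For orientation, these fields line up with the autoscaling options on a KubeRay RayCluster spec (enableInTreeAutoscaling and autoscalerOptions). The sketch below shows one plausible mapping. How krayne's manifest builder actually produces the manifest is not documented on this page, so AutoscalerSketch and autoscaler_spec_fragment are hypothetical names, and selecting the v2 autoscaler itself is not covered.

```python
from dataclasses import dataclass
from typing import Any

VALID_UPSCALING_MODES = ("Default", "Aggressive", "Conservative")


@dataclass
class AutoscalerSketch:
    """Stand-in for AutoscalerConfig with the documented defaults."""

    enabled: bool = True
    idle_timeout_seconds: int = 60
    upscaling_mode: str = "Default"
    cpu: str = "500m"
    memory: str = "512Mi"


def autoscaler_spec_fragment(cfg: AutoscalerSketch) -> dict[str, Any]:
    """Build the autoscaling part of a RayCluster spec (assumed mapping)."""
    if cfg.upscaling_mode not in VALID_UPSCALING_MODES:
        raise ValueError(f"unknown upscaling_mode: {cfg.upscaling_mode!r}")
    if not cfg.enabled:
        return {"enableInTreeAutoscaling": False}
    # The same values serve as both request and limit for the sidecar.
    resources = {"cpu": cfg.cpu, "memory": cfg.memory}
    return {
        "enableInTreeAutoscaling": True,
        "autoscalerOptions": {
            "upscalingMode": cfg.upscaling_mode,
            "idleTimeoutSeconds": cfg.idle_timeout_seconds,
            "resources": {"requests": dict(resources), "limits": dict(resources)},
        },
    }
```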
ServicesConfig¶
Services to enable on the cluster head node. Each enabled service adds its port to the head pod spec and populates the corresponding URL in ClusterInfo.
| Field | Type | Default | Port | Description |
|---|---|---|---|---|
| notebook | bool | True | 8888 | Jupyter notebook server (runs on ray-head container) |
| code_server | bool | True | 8443 | Code Server, installed from a standalone pre-built binary at container startup |
| ssh | bool | True | 22 | SSH access to the head node as the ray user. openssh-server is installed and configured at container startup if missing. |
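As a rough illustration of how enabled services translate into head pod ports, the sketch below filters the port table by the boolean flags. ServicesSketch, SERVICE_PORTS, and head_container_ports are hypothetical names, and the port-entry layout (name plus containerPort) is an assumption modeled on a Kubernetes container spec rather than krayne's actual output.

```python
from dataclasses import dataclass, fields

# Ports taken from the ServicesConfig table.
SERVICE_PORTS = {"notebook": 8888, "code_server": 8443, "ssh": 22}


@dataclass
class ServicesSketch:
    """Stand-in for ServicesConfig with the documented defaults."""

    notebook: bool = True
    code_server: bool = True
    ssh: bool = True


def head_container_ports(services: ServicesSketch) -> list[dict]:
    """Return one port entry per enabled service, in field order."""
    return [
        # Kubernetes port names may not contain underscores.
        {"name": f.name.replace("_", "-"), "containerPort": SERVICE_PORTS[f.name]}
        for f in fields(services)
        if getattr(services, f.name)
    ]
```

Disabling a service, for example ServicesSketch(code_server=False), simply drops its entry from the returned list.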
How SSH access works
The bootstrap binds sshd to 127.0.0.1 only and empties the ray user's password. Root login is explicitly disabled (PermitRootLogin no); the only allowed login user is ray (uid 1000), the same user that runs the Ray processes — so files you create over SSH match what Ray itself sees.
The only path in is kubectl port-forward, which targets the pod's loopback interface, so sshd is not reachable from other pods, from the Service IP, or from outside the cluster. Because Kubernetes RBAC authorizes the port-forward, users do not need to manage SSH keys.
To connect, open an SSH session as the ray user against the forwarded local port, which is shown by krayne describe my-cluster and in the TUI's Services tab. SSH may not be available for about 30 seconds after the cluster reaches ready, while apt-get install openssh-server runs on the head pod.
load_config_from_yaml¶
Load a ClusterConfig from a YAML file with optional field overrides.
```python
def load_config_from_yaml(
    path: str | Path,
    overrides: dict[str, Any] | None = None,
) -> ClusterConfig
```
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | str \| Path | (required) | Path to YAML configuration file |
| overrides | dict[str, Any] \| None | None | Field values taking precedence over YAML. Supports dot-notation keys. |
Returns: ClusterConfig
Raises: ConfigValidationError on validation failure
Example:
```python
from krayne.config import load_config_from_yaml

# Basic load
config = load_config_from_yaml("cluster.yaml")

# With overrides (supports dot-notation for nested fields).
# head.cpus is declared as str, so the value is passed as a string.
config = load_config_from_yaml(
    "cluster.yaml",
    overrides={"namespace": "staging", "head.cpus": "32"},
)
```
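One way dot-notation overrides can be layered over parsed YAML is to expand each dotted key into nested dictionaries before validation. The sketch below shows that idea only; apply_overrides is a hypothetical helper, and krayne's actual merge logic may differ, for example in how it treats lists or keys that do not exist.

```python
from copy import deepcopy
from typing import Any


def apply_overrides(data: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of data with dot-notation overrides applied on top."""
    merged = deepcopy(data)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for part in parents:
            # Create intermediate mappings that the YAML file did not define.
            child = node.get(part)
            if not isinstance(child, dict):
                child = {}
                node[part] = child
            node = child
        node[leaf] = value
    return merged
```

For instance, applying {"head.cpus": "32"} to a document whose head section also sets memory replaces only cpus and leaves memory as loaded; the input dictionary itself is not modified.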
Settings¶
User-level settings are persisted in ~/.krayne/config.yaml and managed via these functions:
```python
from krayne.config import (
    KrayneSettings,
    load_krayne_settings,
    save_krayne_settings,
    clear_krayne_settings,
)
```
KrayneSettings¶
load_krayne_settings() -> KrayneSettings¶
Load settings from ~/.krayne/config.yaml, returning defaults if absent.
save_krayne_settings(settings: KrayneSettings) -> None¶
Write settings to ~/.krayne/config.yaml, creating the directory if needed.
clear_krayne_settings() -> None¶
Remove the settings file if it exists.