Ray cluster with examples running on Kubernetes (k3d).
- python 3
- k3d to create a k3s kubes cluster (optional)
- helm to install the kuberay operator (optional)
Create virtualenv:
make install
If needed, create a k3s kubes cluster using k3d (optional):
make cluster
Now set your kube context before running further commands.
Install the ray cluster into kubes:
- kuberay operator (recommended):
make kuberay raycluster
- stock python operator:
make ray-kube-install
For k3d, run make k3d-ingress; otherwise run make forward.
- The Ray client server will be exposed on localhost port 10001.
- The Ray dashboard can be accessed on http://localhost:8265/
- The Ray GCS server will be exposed on localhost port 6379.
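To confirm that the ingress or port forward is working, a quick TCP check against the three ports listed above can help. The following is a minimal sketch using only the Python standard library; the helper names (port_is_open, check_ray_ports) are hypothetical and not part of this repo, and it only tests that the ports accept connections, not that Ray itself is healthy.

```python
import socket

# Ports taken from the list above. The host is assumed to be localhost,
# i.e. `make k3d-ingress` or `make forward` has already been run.
RAY_PORTS = {
    "client server": 10001,
    "dashboard": 8265,
    "GCS server": 6379,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_ray_ports(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in RAY_PORTS.items()}


if __name__ == "__main__":
    for name, is_open in check_ray_ports().items():
        print(f"{name}: {'open' if is_open else 'closed'}")
```

If a port shows as closed, the head pod may not be ready yet, or the forward may not be running.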
Ping head node (once pod is ready):
make ping
Run example application:
python raydemo/cluster_info.py
Run shell on head pod:
make shell
Kuberay consists of:
- helm-chart/ - helm charts for the apiserver, operator and a ray-cluster (recommended)
- ray-operator/config/ - kustomize templates, which seem more up to date than the helm charts. Includes:
  - crd: the rayclusters, rayjobs, and rayservices CRDs
  - default: crd, rbac, manager, and ray-system namespace
  - manager: kuberay operator deployment and service
  - prometheus
  - rbac: roles, service accounts etc.
- ray-operator/config/samples: raycluster examples
- manifests/ - kustomize quickstart manifests for installing the default template + apiserver
make kuberay installs the kuberay-operator helm chart, which creates these CRDs:
- customresourcedefinition.apiextensions.k8s.io/rayclusters.ray.io
- customresourcedefinition.apiextensions.k8s.io/rayjobs.ray.io
- customresourcedefinition.apiextensions.k8s.io/rayservices.ray.io
And the following resources in the default namespace:
- ServiceAccount kuberay-operator
- ClusterRole rayjob-editor-role
- ClusterRole rayjob-viewer-role
- ClusterRole rayservice-editor-role
- ClusterRole rayservice-viewer-role
- ClusterRole kuberay-operator
- ClusterRoleBinding kuberay-operator
- Role kuberay-operator
- RoleBinding kuberay-operator
- Service kuberay-operator
- Deployment kuberay-operator
make raycluster creates the following in the default namespace:
- raycluster-kuberay-head-svc service
- ray head pod with limits of 1 CPU and 2Gi memory
- ray worker pod with limits of 1 CPU and 2Gi memory
make delete removes the ray cluster.
For more info see the ray-operator readme.
See examples in raydemo.
Most examples will start a local ray instance. To use the cluster instead:
export RAY_ADDRESS=ray://
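As a rough illustration of how an example could pick between a local instance and the cluster, the sketch below reads RAY_ADDRESS and passes it to ray.init(). The helper names (resolve_ray_address, client_address, connect) are hypothetical and not taken from raydemo. The host and port in the usage note are assumptions based on the client server port listed earlier in this README; substitute whatever address your setup exposes.

```python
import os


def resolve_ray_address(default=None):
    """Return the RAY_ADDRESS value if set and non-empty, else the default."""
    address = os.environ.get("RAY_ADDRESS", "").strip()
    return address or default


def client_address(host: str, port: int = 10001) -> str:
    """Build a Ray client URL. 10001 is the client server port noted earlier."""
    return f"ray://{host}:{port}"


def connect():
    """Connect to the cluster named by RAY_ADDRESS, or fall back to Ray's default."""
    # Imported lazily so this module loads even where ray is not installed.
    import ray

    address = resolve_ray_address()
    # With address=None, Ray uses its default behaviour (typically a local
    # instance). A ray:// URL connects through the Ray client server.
    return ray.init(address=address)
```

For example, client_address("localhost") yields a value that could be exported as RAY_ADDRESS before running an example, assuming the client server is reachable on localhost.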
See autoscaler.md
See [Feature][Docs][Discussion] Provider consistent guidance on resource Request and Limits #744
- [Data] Dataset write_csv AttributeError: 'Worker' object has no attribute 'core_worker'
- Pods aren't restarted when the RayCluster CRD image is updated
- core - ray logs CLI doesn't work for kubernetes raycluster
- [Data] error: Argument of type "(df: DataFrame) -> DataFrame" cannot be assigned to parameter