OKE
Prerequisites
- The Oracle Cloud Shell bash CLI, with the environment variable ENVSUBST pointing to an installation of envsubst (Cloud Shell does not allow root/sudo package installation).
- kubectl with the correct context selected, pointing to the cluster on which you want to deploy Luna.
- helm: the package manager for Kubernetes
- cmctl: the cert-manager command line utility
- An existing OKE cluster with at least 2 nodes (for Luna webhook replica availability) without autoscaling enabled.
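The prerequisites above can be checked before deploying. The following is a minimal sketch, not part of the Luna distribution: the helper names missing_tools and envsubst_ok are hypothetical, and the check only confirms that the tools are on the PATH and that ENVSUBST points to an executable file.

```shell
# Hypothetical helpers (not shipped with Luna) for checking the prerequisites.

# Print the name of every tool in the argument list that is not on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed only if ENVSUBST is set and points to an executable file.
envsubst_ok() {
  [ -n "${ENVSUBST:-}" ] && [ -x "$ENVSUBST" ]
}
```

For example, running missing_tools kubectl helm cmctl prints nothing when all three tools are installed, and envsubst_ok || echo "ENVSUBST is not set correctly" flags a missing envsubst setting.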
Step 1 (optional): Install the NVIDIA device plugin for GPU workloads
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
Step 2: Deploy Luna
Luna requires cert-manager to be running in the cluster. The deploy script tries to detect cert-manager and, if it is not found, installs it into the cert-manager namespace.
Luna supports two authentication methods for making OCI API calls: a mounted OCI config file and an instance principal.
The default authentication method uses an OCI config file mounted into the Luna node manager. To use this method, set up API keys for oracleidentitycloudservice/<account> and validate that access using ~/.oci/config with key ~/.oci/oci.pem works. Note that the key_file value specified inside the ~/.oci/config file should be expressed using the tilde path for the home directory, e.g., "key_file=~/.oci/oci.pem", so that it will work correctly when mounted in the pod. For this method, deploy using the following:
./deploy.sh --cluster-ocid <cluster-ocid> --config-file-full-path <config-file-full-path> --pem-file-full-path <pem-file-full-path> [--helm-release-name <release-name>] [--namespace <namespace>] [--additional-helm-values "<additional-helm-values>"]
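Before running the command above, it may help to confirm that the config file follows the tilde-path convention for key_file. The following is a minimal sketch under the assumption that the config uses plain key=value lines; the function name key_file_uses_tilde is hypothetical and is not part of deploy.sh.

```shell
# Hypothetical check (not part of deploy.sh): succeed only if the OCI config
# file has at least one key_file entry and every entry starts with "~/".
key_file_uses_tilde() {
  config="$1"
  # Fail if the file has no key_file entry at all.
  grep -Eq '^[[:space:]]*key_file[[:space:]]*=' "$config" || return 1
  # Fail if any key_file entry does not begin with the tilde path.
  if grep -E '^[[:space:]]*key_file[[:space:]]*=' "$config" |
     grep -Evq '^[[:space:]]*key_file[[:space:]]*=[[:space:]]*~/'; then
    return 1
  fi
  return 0
}
```

A call such as key_file_uses_tilde ~/.oci/config || echo "key_file should use the ~/ form" reports a config whose key path is written as an absolute path.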
To instead use the instance principal authentication method, set up an instance principal with sufficient service permissions. Information on configuring the instance principal is given in the section "Configuring an Instance Principal for Luna Use" below. For this method, deploy using the following:
./deploy.sh --cluster-ocid <cluster-ocid> --use-instance-principal [--helm-release-name <release-name>] [--namespace <namespace>] [--additional-helm-values "<additional-helm-values>"]
Note: The deploy.sh command generates a <cluster-ocid>_values.yaml file and a <cluster-ocid>_values_full.yaml file; please retain these files for use in future upgrades.
Step 3: Verify Luna
kubectl get all -n elotl
Sample Output
NAME                                      READY   STATUS    RESTARTS   AGE
pod/elotl-luna-manager-6bd7f4674d-cxwz6   1/1     Running   0          2m39s
pod/elotl-luna-webhook-7fcf5998b6-ltrd6   1/1     Running   0          2m39s
pod/elotl-luna-webhook-7fcf5998b6-svr6b   1/1     Running   0          2m39s

NAME                         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/elotl-luna-manager   ClusterIP   x.x.x.x      <none>        9090/TCP   2m39s
service/elotl-luna-webhook   ClusterIP   x.x.x.x      <none>        8443/TCP   2m39s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/elotl-luna-manager   1/1     1            1           2m39s
deployment.apps/elotl-luna-webhook   2/2     2            2           2m39s

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/elotl-luna-manager-6bd7f4674d   1         1         1       2m39s
replicaset.apps/elotl-luna-webhook-7fcf5998b6   2         2         2       2m39s
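Rather than inspecting the output by eye, the rollout can also be checked from a script. The following is a minimal sketch that assumes the default release name, so that the deployments are called elotl-luna-manager and elotl-luna-webhook as in the sample output; the function name verify_luna and the 120-second timeout are illustrative choices, not Luna defaults.

```shell
# Hypothetical helper: wait for both Luna deployments to finish rolling out.
# The namespace defaults to "elotl"; pass another namespace as the first argument.
verify_luna() {
  namespace="${1:-elotl}"
  for deployment in elotl-luna-manager elotl-luna-webhook; do
    # Stop at the first deployment that does not become ready in time.
    kubectl rollout status "deployment/$deployment" \
      --namespace "$namespace" --timeout=120s || return 1
  done
}
```

Calling verify_luna returns a non-zero status if either deployment fails to become ready within the timeout.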
Step 4: Run some workloads!
Follow our tutorial to understand the value provided by Luna.
Step 5: Verify test pod launch and dynamic worker node addition/removal (while testing)
kubectl get pods --selector=elotl-luna=true -o wide -w
kubectl get nodes -w
Upgrade
When running the upgrade command described below, set <retained-values-file> to <retained-path>/<cluster-name>_values_full.yaml if your installed Luna version is later than 0.5.4, and to <retained-path>/<cluster-name>_values.yaml otherwise.
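The choice of values file can be sketched as a small helper. This is an illustration only: the function names are hypothetical, "later than 0.5.4" is read as strictly newer than 0.5.4, and the comparison relies on sort -V, which is available in GNU coreutils and recent BSD sort but is not guaranteed everywhere.

```shell
# Hypothetical helpers for picking the retained values file.

# Succeed when version $1 is strictly newer than version $2.
version_gt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Print the values file path for the given retained path, cluster name,
# and previously installed Luna version (a leading "v" is ignored).
retained_values_file() {
  retained_path="$1"
  cluster_name="$2"
  installed_version="${3#v}"
  if version_gt "$installed_version" "0.5.4"; then
    printf '%s/%s_values_full.yaml\n' "$retained_path" "$cluster_name"
  else
    printf '%s/%s_values.yaml\n' "$retained_path" "$cluster_name"
  fi
}
```

For instance, retained_values_file ../../luna-v0.4.6/oke my-cluster v0.4.6 selects the _values.yaml file, while a version such as v0.6.0 selects the _values_full.yaml file.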
To upgrade an existing Luna deployment, run:
helm upgrade elotl-luna <chart-path> --wait --namespace=<cluster-namespace> --values=<retained-values-file> <additional-helm-values(optional)>
For example, to upgrade my-cluster from luna-v0.4.6 to luna-v0.5.0 and set an additional helm value binPackingNodeCpu=2, run:
helm upgrade elotl-luna ./elotl-luna-v0.5.0.tgz --wait --namespace=elotl --values=../../luna-v0.4.6/oke/ocid1.cluster.oc1.iad.aaaaaaaasxamzqh2ch6fmxvk5uierz6sxyicmutae2b2e25tvcybh2azdukq_values.yaml --set binPackingNodeCpu=2
Then validate the upgrade as follows:
helm ls -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
elotl-luna elotl 4 2023-05-19 14:15:30.686251 -0700 PDT deployed elotl-luna-v0.5.0 v0.5.0
Cleanup
helm uninstall elotl-luna --namespace=elotl
kubectl delete namespace elotl
Configuring an Instance Principal for Luna Use
There are various approaches to configuring an instance principal that will allow Luna to manage compute resources in your cluster. General background information is given here.
One approach is as follows:
Create a compartment-level dynamic group in the default domain containing the statically-allocated nodes that may host Luna in the cluster, e.g., create luna-dyn-group with the rule: Any {instance.id = 'ocid1.instance.oc1...', instance.id = 'ocid1.instance.oc1...'}
Create a tenancy-level policy to allow the nodes in that rule to manage OCI resources, e.g.:
Allow dynamic-group luna-dyn-group to manage all-resources in tenancy
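When the cluster has more than a couple of statically-allocated nodes, the matching rule for the dynamic group can be assembled from a list of instance OCIDs. The following is a minimal sketch; the function name dynamic_group_rule is hypothetical, and the OCIDs passed to it must be the real instance OCIDs of your nodes.

```shell
# Hypothetical helper: build an "Any {...}" matching rule from instance OCIDs.
dynamic_group_rule() {
  rule=""
  for ocid in "$@"; do
    # Separate successive conditions with a comma and a space.
    if [ -n "$rule" ]; then
      rule="$rule, "
    fi
    rule="${rule}instance.id = '${ocid}'"
  done
  printf 'Any {%s}\n' "$rule"
}
```

The output can be pasted into the rule field when creating the dynamic group; one argument is expected per node that may host Luna.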