Setup SparkOperator and Minio in K8s
This post assumes that you already have a running k8s cluster, set up with minikube or Docker Desktop, and that you are familiar with kubectl commands.
Prerequisites
- minikube or any other k8s environment
- kubectl
- helm
Install spark operator
Create Namespaces
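The helm install command in this section uses two namespaces: spark-operator for the operator itself and spark-jobs for the Spark jobs. A minimal sketch of creating both, assuming neither exists yet (the helper name create_spark_namespaces is only illustrative):

```shell
# Hypothetical helper: creates the two namespaces referenced by the helm install command.
create_spark_namespaces() {
  # Namespace that hosts the operator itself
  kubectl create namespace spark-operator
  # Namespace the operator watches for Spark jobs (sparkJobNamespace)
  kubectl create namespace spark-jobs
}
```

Call create_spark_namespaces once kubectl points at the target cluster.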
Add Helm repo
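The chart reference spark-operator/spark-operator in the install command implies a Helm repo registered under the alias spark-operator. The repository URL is not given in this post, so the sketch below takes it as an argument; the helper name is hypothetical:

```shell
# Assumption: the repository URL is not stated in the post; pass the one from
# the operator's installation docs as the first argument.
add_spark_operator_repo() {
  # "spark-operator" is the alias used by the chart reference
  # spark-operator/spark-operator in the install command
  helm repo add spark-operator "$1"
  # Refresh the local chart index
  helm repo update
}
```

For example, add_spark_operator_repo https://kubeflow.github.io/spark-operator, which is one location where the project has published its chart. Treat the URL as an assumption, and check that the chart version you get still accepts the values used in this post, such as sparkJobNamespace.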
Install Helm chart for Spark
The webhook must be enabled for volume mounts on the driver and executor pods, and for environment variables defined on those pods, to take effect.
helm install my-spark spark-operator/spark-operator --namespace spark-operator --set sparkJobNamespace=spark-jobs --set webhook.enable=true --set webhook.port=443
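After the install, it can help to confirm that the release exists and the operator pod is running. A minimal sketch, reusing the release name my-spark and the spark-operator namespace from the command above (the helper name is hypothetical):

```shell
# Hypothetical helper: checks the release and pods created by the install command.
check_spark_operator() {
  # Release name and namespace come from the helm install command
  helm status my-spark --namespace spark-operator
  # The operator pod should reach the Running state
  kubectl get pods --namespace spark-operator
}
```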
Install Minio
Create Namespaces
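The credential commands later in this post look for Minio resources in the minio namespace. A minimal sketch of creating it, assuming it does not exist yet (the helper name is hypothetical):

```shell
# Hypothetical helper: creates the namespace used by the Minio commands in this post.
create_minio_namespace() {
  kubectl create namespace minio
}
```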
Add Helm repo
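This post does not name the Minio chart repository. The app=minio label and the accesskey and secretkey secret fields used later appear to match the legacy Minio chart, which was published at https://helm.min.io/. The sketch below uses that URL as a default, which is an assumption, and lets you override it (the helper name and the repo alias minio are illustrative):

```shell
# Assumption: the repository is not named in the post; the default URL below is
# the legacy Minio chart repository and may not be the one the author used.
add_minio_repo() {
  # First argument overrides the default repository URL
  helm repo add minio "${1:-https://helm.min.io/}"
  # Refresh the local chart index
  helm repo update
}
```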
Install Helm chart for Minio
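A minimal sketch of the install step follows. Only the minio namespace is fixed by the rest of this post; the release name minio and the chart reference minio/minio are assumptions that depend on the repo alias you registered:

```shell
# Assumptions: release name "minio" and chart "minio/minio" are illustrative;
# the post only fixes the namespace (minio).
install_minio() {
  helm install minio minio/minio --namespace minio
}
```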
Setup Minio
Minio is installed with an access key and a secret key, which are needed to connect to the service. A separate image, minio/mc, provides the Minio CLI and can be used to upload your data to Minio. Alternatively, Minio can be accessed through its UI, for which you need to port-forward the Minio service.
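The port forward for UI access could look like the sketch below. It looks the service up by the app=minio label, as the credential commands in this post do, and assumes the service listens on port 9000 (Minio's default); the helper name is hypothetical:

```shell
# Hypothetical helper: forwards local port 9000 to the Minio service.
# Assumption: the service listens on port 9000, Minio's default port.
forward_minio_ui() {
  # Find the service name by label instead of hard-coding the release name
  svc="$(kubectl get svc -n minio -l app=minio -o jsonpath='{.items[0].metadata.name}')"
  # Blocks until interrupted; the UI is then reachable on localhost:9000
  kubectl port-forward -n minio "svc/$svc" 9000:9000
}
```

While the command is running, open http://localhost:9000 in a browser and log in with the access key and secret key.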
Move data to minio
## Setup minio credentials
## kubectl returns a list when selecting by label, so read the first item
export MINIO_ACCESSKEY=$(kubectl get secret -n minio -l app=minio -o json | jq -r '.items[0].data.accesskey' | base64 -d)
export MINIO_SECRETKEY=$(kubectl get secret -n minio -l app=minio -o json | jq -r '.items[0].data.secretkey' | base64 -d)
## Use the service port, since the CLI pod connects through the service
export MINIO_HOST=$(kubectl get svc -n minio -l app=minio -ojsonpath='http://{.items[0].metadata.name}:{.items[0].spec.ports[0].port}')
## Start a temporary pod with the minio CLI and open a shell in it
kubectl run -n minio -it --rm minio-cli --env="MINIO_HOST=$MINIO_HOST" --env="MINIO_ACCESSKEY=$MINIO_ACCESSKEY" --env="MINIO_SECRETKEY=$MINIO_SECRETKEY" \
  --image=minio/mc --command -- sh
Inside the container, use wget commands to fetch data into the container, then use the mc cp command to copy the data from the container to Minio.
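The fetch-and-copy steps can be sketched as a small helper to run inside the minio-cli container. It assumes MINIO_HOST, MINIO_ACCESSKEY and MINIO_SECRETKEY are set as shown earlier and that wget is available in the container; the helper name, the alias myminio and the bucket name are all illustrative:

```shell
# Hypothetical helper, meant to run inside the minio-cli container.
# Assumes MINIO_HOST, MINIO_ACCESSKEY and MINIO_SECRETKEY are set in the
# environment and that wget is available in the container.
upload_to_minio() {
  url="$1"                    # where to download the data from
  bucket="$2"                 # target bucket name (illustrative)
  file="$(basename "$url")"   # local file name derived from the URL

  # Register the Minio endpoint under the alias "myminio"
  mc alias set myminio "$MINIO_HOST" "$MINIO_ACCESSKEY" "$MINIO_SECRETKEY"
  # Fetch the data into the container
  wget -O "$file" "$url"
  # Create the bucket if it does not exist yet, then copy the file into it
  mc mb --ignore-existing "myminio/$bucket"
  mc cp "$file" "myminio/$bucket/"
}
```

A call such as upload_to_minio <url-of-your-data> data downloads the file and stores it in a bucket named data; mc ls myminio/data then lists the uploaded object.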
Last update:
November 7, 2023