Setup SparkOperator and Minio in K8s

This post assumes that you already have a running Kubernetes (k8s) cluster, for example minikube or the one bundled with Docker Desktop, and that you are familiar with kubectl commands.

Prerequisites

  • minikube or any other k8s environment
  • kubectl
  • helm

Install spark operator

Create Namespaces

kubectl create ns spark-operator
kubectl create ns spark-jobs

Add Helm repo

helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator

Install Helm chart for Spark

The webhook must be enabled so that volumes can be mounted on the driver and executor pods and so that environment variables defined on those pods take effect.

helm install my-spark spark-operator/spark-operator --namespace spark-operator --set sparkJobNamespace=spark-jobs --set webhook.enable=true --set webhook.port=443
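To see what the webhook enables, here is a minimal sketch of a SparkApplication that sets an environment variable and mounts a volume on both the driver and the executor. The image, the jar path, the `APP_ENV` variable, the `scratch` volume and the service account name `my-spark-spark` are assumptions for illustration, not values taken from this post; check what your install created with `kubectl get sa -n spark-jobs`.

```shell
# Write a sample manifest to disk; nothing is sent to the cluster at this point.
cat > spark-pi.yaml <<'EOF'
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark-jobs              # the sparkJobNamespace set at install time
spec:
  type: Scala
  mode: cluster
  image: gcr.io/spark-operator/spark:v3.1.1    # assumed image, replace with yours
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar
  sparkVersion: "3.1.1"
  restartPolicy:
    type: Never
  volumes:
    - name: scratch                  # hypothetical volume, mounted below
      emptyDir: {}
  driver:
    cores: 1
    memory: 512m
    serviceAccount: my-spark-spark   # assumed name, verify in your cluster
    env:                             # env and volumeMounts rely on the webhook
      - name: APP_ENV
        value: demo
    volumeMounts:
      - name: scratch
        mountPath: /tmp/scratch
  executor:
    cores: 1
    instances: 1
    memory: 512m
    env:
      - name: APP_ENV
        value: demo
    volumeMounts:
      - name: scratch
        mountPath: /tmp/scratch
EOF
```

Once the operator pod is running, the manifest can be submitted with `kubectl apply -f spark-pi.yaml` and followed with `kubectl get sparkapplications -n spark-jobs`.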

Install Minio

Create Namespace

kubectl create ns minio

Add Helm repo

helm repo add minio https://helm.min.io/

Install Helm chart for Minio

helm install --namespace minio --generate-name minio/minio

Setup Minio

Minio is installed with its own access key and secret key, stored in a Kubernetes secret, which you need when connecting to the service. To upload data, use the Minio CLI, which ships as a separate image, minio/mc. Alternatively, use the web UI, which requires a port forward of the Minio service.
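For the UI route, one way to set up the port forward is sketched below. Because the chart was installed with a generated release name, the service name is looked up by label first. The helper names are hypothetical, and 9000 is assumed to be the service port of your install; adjust it if `kubectl get svc -n minio` shows something else.

```shell
# Print the name of the first Service labelled app=minio in the minio namespace.
minio_service_name() {
  kubectl get svc -n minio -l app=minio -o jsonpath='{.items[0].metadata.name}'
}

# Forward a local port (default 9000) to port 9000 of the Minio Service.
# Port 9000 on the Service side is an assumption about the chart defaults.
minio_port_forward() {
  kubectl port-forward -n minio "svc/$(minio_service_name)" "${1:-9000}:9000"
}
```

Calling `minio_port_forward` keeps running in the foreground; while it does, the UI should be reachable at http://localhost:9000 with the access key and secret key from the secret.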

Move data to minio

## Setup minio credentials
export MINIO_ACCESSKEY=$(kubectl get secret -n minio -l app=minio -o json | jq -r '.items[0].data.accesskey' | base64 -d)
export MINIO_SECRETKEY=$(kubectl get secret -n minio -l app=minio -o json | jq -r '.items[0].data.secretkey' | base64 -d)
export MINIO_HOST=$(kubectl get svc -n minio -l app=minio -ojsonpath='http://{.items[0].metadata.name}:{.items[0].spec.ports[0].targetPort}')
## Start a temporary minio/mc container with the credentials in its environment
kubectl run -n minio -it --rm minio-cli --env "MINIO_HOST=$MINIO_HOST" --env "MINIO_ACCESSKEY=$MINIO_ACCESSKEY" --env "MINIO_SECRETKEY=$MINIO_SECRETKEY" \
   --image=minio/mc --command -- sh
## Run the following commands inside the container started above
## (older mc releases use "mc config host add" instead of "mc alias set")
mc alias set myminio "${MINIO_HOST}" "${MINIO_ACCESSKEY}" "${MINIO_SECRETKEY}"
mc ls myminio

Use wget to fetch data into the container, then use the mc cp command to copy it from the container to Minio.
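The fetch-and-copy step could look like the following sketch, meant to be run inside the minio/mc container once the `myminio` alias is configured. The function name, the bucket name and the download URL are hypothetical; supply your own.

```shell
# Hypothetical helper: download one file and copy it into a bucket on myminio.
fetch_and_upload() {
  url=$1            # source URL of the data file (supply your own)
  bucket=$2         # target bucket name (hypothetical)
  file=${url##*/}   # keep only the last path segment as the file name

  wget -O "/tmp/${file}" "${url}" &&
    mc mb --ignore-existing "myminio/${bucket}" &&   # create the bucket if missing
    mc cp "/tmp/${file}" "myminio/${bucket}/${file}"
}
```

For example, `fetch_and_upload https://example.com/data/sample.csv datasets` would leave the file at `myminio/datasets/sample.csv`, which `mc ls myminio/datasets` can confirm.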


Last update: November 7, 2023