
Spark Operator in a Workspace

How to spin up your Spark Operator

The Kubernetes Operator for Apache Spark aims to make specifying and running Spark applications as easy and idiomatic as running other workloads on Kubernetes. It uses Kubernetes custom resources to specify, run, and surface the status of Spark applications. For a complete reference of the custom resource definitions, refer to the API Definition. For details on its design, refer to the design documentation. The operator requires Spark 2.3 or later, which supports Kubernetes as a native scheduler backend.

The default installation is basic. To enable additional Spark Operator features, provide an override ConfigMap.

Install Spark Operator

You can find generic installation instructions for workspace catalog applications in the Application Deployment topic.

Only install the Spark operator once per workspace.
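Before installing, it can help to check whether the workspace already has a Spark Operator deployment. The following is a minimal sketch, not an official procedure: the function name is hypothetical, and it assumes that an existing AppDeployment has "spark-operator" somewhere in its name, which may not hold if a different name was chosen at install time.

```shell
# Hypothetical helper: list AppDeployments in a workspace namespace whose
# name mentions "spark-operator". The naming convention is an assumption.
find_spark_operator_deployments() {
  ns="$1"
  # -o name prints one resource per line as <resource type>/<name>.
  # "|| true" keeps the exit status at zero when nothing matches.
  kubectl -n "$ns" get appdeployments -o name | grep -i 'spark-operator' || true
}
```

Call it as `find_spark_operator_deployments "${WORKSPACE_NAMESPACE}"`. Empty output suggests that no matching AppDeployment exists in that namespace.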

For details on custom configuration for the operator, refer to the Spark Operator Helm Chart documentation.

After you finish the installation, see the Spark Operator in a Project custom resource documentation for details on how to submit your Spark jobs.

Sample Override Configuration File

Ensure you configure the AppDeployment with the appropriate override configmap.

  • Using UI

      owner: john
      team: operations
  • Using CLI

    See Application Deployment for details.

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      namespace: ${WORKSPACE_NAMESPACE}
      name: spark-operator-overrides
    data:
      values.yaml: |
            owner: john
            team: operations
    EOF
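After applying the ConfigMap, you may want to confirm what the cluster actually stored. The sketch below assumes the ConfigMap name `spark-operator-overrides` and the `values.yaml` key used in the CLI example; the function name is hypothetical.

```shell
# Hypothetical helper: print the values.yaml payload of the override ConfigMap.
show_spark_operator_overrides() {
  ns="$1"
  # The backslash escapes the dot in the key name, so jsonpath reads
  # "values.yaml" as a single key under .data.
  kubectl -n "$ns" get configmap spark-operator-overrides \
    -o jsonpath='{.data.values\.yaml}'
}
```

Running `show_spark_operator_overrides "${WORKSPACE_NAMESPACE}"` should print the override values exactly as they were applied.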

Uninstall via the CLI

Uninstalling the Spark Operator does not affect existing SparkApplication and ScheduledSparkApplication custom resources. You need to manually remove any leftover custom resources and CRDs from the operator. Please refer to deleting Spark Operator custom resources.

Follow these steps:

  1. Uninstall the Spark Operator AppDeployment:

    kubectl -n <your workspace namespace> delete AppDeployment <name of AppDeployment>
  2. Remove the Spark Operator Service Account:

    # <name of service account> is spark-operator-service-account if you didn't override the RBAC resources.
    kubectl -n <your workspace namespace> delete serviceaccounts <name of service account>
  3. Remove the Spark Operator CRDs:

    NOTE: The CRDs are not finalized for deletion until you delete the associated custom resources.

    kubectl delete crds scheduledsparkapplications.sparkoperator.k8s.io sparkapplications.sparkoperator.k8s.io
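Because the CRDs are not finalized until the associated custom resources are gone, it is usually easier to find and delete leftover SparkApplication and ScheduledSparkApplication resources first. The following is a sketch under the assumption that the CRDs are still installed and use their standard plural names; the function names are hypothetical, and the delete helper removes every Spark custom resource in the given namespace, so review the listing before running it.

```shell
# Hypothetical helper: list leftover Spark custom resources in all namespaces.
list_spark_resources() {
  kubectl get sparkapplications,scheduledsparkapplications --all-namespaces
}

# Hypothetical helper: delete ALL Spark custom resources in one namespace.
delete_spark_resources() {
  ns="$1"
  kubectl -n "$ns" delete sparkapplications,scheduledsparkapplications --all
}
```

A typical sequence is to run `list_spark_resources`, then `delete_spark_resources <namespace>` for each namespace that still holds resources, and only then delete the CRDs.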


Here are some resources to learn more about Spark Operator:
