2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ All notable changes to this project will be documented in this file.
Previously, Jobs were retried at most 6 times by default ([#647]).
- Support for Spark `3.5.8` ([#650]).
- First class support for S3 on Spark connect clusters ([#652]).
- Spark applications can now have templates that are merged into the application manifest before reconciliation. This allows users with many applications to source out common configuration in a central place and reduce duplication ([#660]).

### Fixed

Expand Down Expand Up @@ -45,6 +46,7 @@ All notable changes to this project will be documented in this file.
[#652]: https://github.com/stackabletech/spark-k8s-operator/pull/652
[#655]: https://github.com/stackabletech/spark-k8s-operator/pull/655
[#656]: https://github.com/stackabletech/spark-k8s-operator/pull/656
[#660]: https://github.com/stackabletech/spark-k8s-operator/pull/660

## [25.11.0] - 2025-11-07

Expand Down
1 change: 1 addition & 0 deletions Cargo.lock


4 changes: 4 additions & 0 deletions Cargo.nix


1 change: 1 addition & 0 deletions Cargo.toml
Expand Up @@ -29,6 +29,7 @@ tokio = { version = "1.40", features = ["full"] }
tracing = "0.1"
tracing-futures = { version = "0.2", features = ["futures-03"] }
indoc = "2"
regex = "1"

[patch."https://github.com/stackabletech/operator-rs.git"]
# stackable-operator = { git = "https://github.com/stackabletech//operator-rs.git", branch = "main" }
Expand Down
1 change: 1 addition & 0 deletions deploy/helm/spark-k8s-operator/templates/roles.yaml
Expand Up @@ -124,6 +124,7 @@ rules:
- sparkapplications
- sparkhistoryservers
- sparkconnectservers
- sparkapptemplates
verbs:
- get
- list
Expand Down
105 changes: 105 additions & 0 deletions docs/modules/spark-k8s/pages/usage-guide/app_templates.adoc
@@ -0,0 +1,105 @@
= Spark Application Templates
:description: Learn how to configure application templates for Spark applications on the Stackable Data Platform.

Spark application templates are used to define reusable configurations for Spark applications.
When you have many applications with similar configurations, templates can help you avoid duplication by grouping common settings together.
Application templates are available for the `v1alpha1` version of the SparkApplication custom resource and share exactly the same structure, but the operator handles them differently in several ways:

1. Application templates are cluster-wide resources, while Spark application resources are namespace-scoped.
This means that a single application template can be used across multiple namespaces, while Spark application resources are limited to the namespace they are created in.
2. Application templates are not reconciled by the operator; they must be referenced from a SparkApplication resource to take effect. Changes to an application template therefore do not automatically trigger updates to SparkApplication resources that reference it.
3. An application can reference multiple application templates, whose settings are merged together. The merge order is given by each template's index in the reference list: later templates override earlier ones. The application's own fields have the highest precedence and override any conflicting settings from the templates. This allows you to keep common settings in a base template and override specific settings in the application resource as needed.
4. Application template references are immutable: once applied to an application, they cannot be changed. Templates are currently applied when the application is created, and any later changes to the template references are ignored.
5. Application and template resources must use exactly the same CRD version. Currently only `v1alpha1` is supported.
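
The precedence rules in the list above can be sketched as a recursive dictionary merge. This is a minimal illustration only: the operator's actual merge logic is implemented in Rust and may differ in detail (for example, in how lists are merged). The `deep_merge` and `effective_spec` names are invented for this sketch:

[source,python]
----
# Hypothetical sketch: template specs merge in index order, and the
# application's own spec is applied last, so it wins on conflicts.
def deep_merge(base: dict, overlay: dict) -> dict:
    """Return `base` updated with `overlay`; nested dicts merge recursively."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def effective_spec(app_spec: dict, template_specs: list[dict]) -> dict:
    result: dict = {}
    for spec in template_specs:  # template.0 first, so later templates win
        result = deep_merge(result, spec)
    return deep_merge(result, app_spec)  # the application wins on conflicts
----

For example, a `sparkConf` entry set both in `template.0` and on the application itself ends up with the application's value in the effective spec.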

== Examples

Applications use `metadata.annotations` to reference application templates as shown below:

[source,yaml]
----
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: app
  annotations:
    spark-application.template.merge: "true" # <1>
    spark-application.template.0.name: "app-template" # <2>
    spark-application.template.upgradeStrategy: "onCreate" # <3>
    spark-application.template.applyStrategy: "enforce" # <4>
spec: # <5>
  sparkImage:
    productVersion: "4.1.1"
  mode: cluster
  mainClass: com.example.Main
  mainApplicationFile: "/examples.jar"
----
<1> Enable application template merging for this application.
<2> Name of the application template to reference.
<3> Optional. The upgrade strategy for the application template. Currently only `onCreate` is supported: the template is applied only when the application is created, and any later changes to the template are ignored.
<4> Optional. The apply strategy for the application template. Currently only `enforce` is supported: any errors that occur while applying the template are treated as errors on the application resource, and the application is not created or updated until they are resolved.
<5> Application specification. The fields `sparkImage`, `mode`, `mainClass`, and `mainApplicationFile` are required for the application to be valid, but the rest of the fields are optional and can be defined in the application template.

The application template referenced in the example above is defined as follows:

[source,yaml]
----
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplicationTemplate # <1>
metadata:
  name: app-template # <2>
spec:
  sparkImage:
    productVersion: "4.1.1"
    pullPolicy: IfNotPresent
  mode: cluster
  mainClass: com.example.Main
  mainApplicationFile: "placeholder" # <3>
  sparkConf:
    spark.kubernetes.file.upload.path: "s3a://my-bucket"
  s3connection:
    reference: spark-history-s3-connection
  logFileDirectory:
    s3:
      prefix: eventlogs/
      bucket:
        reference: spark-history-s3-bucket
  driver:
    config:
      logging:
        enableVectorAgent: false
  executor:
    replicas: 1
    config:
      logging:
        enableVectorAgent: false
----
<1> The kind of the resource is `SparkApplicationTemplate` to indicate that this is an application template.
<2> Name of the application template.
<3> The value of `mainApplicationFile` is set to a placeholder, which is overridden by the application resource. As with the application, the fields `sparkImage`, `mode`, `mainClass`, and `mainApplicationFile` are required for the template to be valid.

An application can reference multiple application templates as shown below:

[source,yaml]
----
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: app
  annotations:
    spark-application.template.merge: "true" # <1>
    spark-application.template.0.name: "app-template-0" # <2>
    spark-application.template.1.name: "app-template-1"
    spark-application.template.2.name: "app-template-2"
spec: # <3>
  sparkImage:
    productVersion: "4.1.1"
  mode: cluster
  mainClass: com.example.Main
  mainApplicationFile: "/examples.jar"
----
<1> Enable application template merging for this application.
<2> The names of the application templates to reference. Their settings are merged in the order they are referenced, with `app-template-0` having the lowest precedence and `app-template-2` the highest. The application's own fields have the highest overall precedence and override any conflicting settings from the templates.
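
As a concrete, hypothetical illustration of this precedence (the `sparkConf` values below are invented for the example), suppose `app-template-0` and `app-template-2` both set the same Spark property:

[source,yaml]
----
# app-template-0 (index 0, lowest precedence)
sparkConf:
  spark.executor.memory: "1g"
  spark.eventLog.enabled: "true"
---
# app-template-2 (index 2, highest template precedence)
sparkConf:
  spark.executor.memory: "2g"
----

Assuming the application itself sets neither property, the effective spec contains `spark.executor.memory: "2g"` (from the later template) and `spark.eventLog.enabled: "true"` (unopposed, from the earlier template). If the application also set `spark.executor.memory`, its value would win over both templates.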
1 change: 1 addition & 0 deletions docs/modules/spark-k8s/partials/nav.adoc
Expand Up @@ -6,6 +6,7 @@
** xref:spark-k8s:usage-guide/job-dependencies.adoc[]
** xref:spark-k8s:usage-guide/resources.adoc[]
** xref:spark-k8s:usage-guide/s3.adoc[]
** xref:spark-k8s:usage-guide/app_templates.adoc[]
** xref:spark-k8s:usage-guide/security.adoc[]
** xref:spark-k8s:usage-guide/logging.adoc[]
** xref:spark-k8s:usage-guide/history-server.adoc[]
Expand Down