[SPARK-49464] Add documentations #113
Conversation
Could you make CI happy?
spark-operator-docs/build.gradle
Outdated
```
dependencies {
    implementation project(":spark-operator")
    implementation("org.projectlombok:lombok:$lombokVersion")
```
To @jiangzho, it seems your branch is a little outdated.
Apache Spark Kubernetes Operator follows the Gradle Version Catalog. Please rebase your branch and refer to the following commit.
Gentle ping, @jiangzho.
docs/operations.md
Outdated
```
- JDK17
- Operator used fabric8 which assumes to be compatible with available k8s versions. However for using status subresource, please use k8s version 1.14 or above.
- Spark versions 3.4 or above
```
Apache Spark 3.4.x reaches end-of-life very soon (2024-10-13).
docs/operations.md
Outdated
```
### Compatibility

- JDK17
```
Java 17 and 21.
docs/operations.md
Outdated
```
### Compatibility

- JDK17
- Operator used fabric8 which assumes to be compatible with available k8s versions. However for using status subresource, please use k8s version 1.14 or above.
```
As I pinged you already, the K8s ecosystem is moving fast in the public environment.
- [SPARK-49512][K8S][DOCS] Drop K8s v1.27 Support spark#47990
- [SPARK-49516] Upgrade the minimum K8s version to v1.28 #117

Just FYI: in the community, please don't claim support for something you didn't test explicitly.
docs/operations.md
Outdated
```
- Operator used fabric8 which assumes to be compatible with available k8s versions. However for using status subresource, please use k8s version 1.14 or above.
- Spark versions 3.4 or above

## Manage Your Spark Operator
```
Remove this section because it's duplicated.
docs/operations.md
Outdated
```
| operatorRbac.role.create | Whether to create Role for operator to use. At least one of `clusterRole.create` or `role.create` should be enabled | true |
| operatorRbac.roleBinding.create | Whether to create RoleBinding for operator to use. At least one of `clusterRoleBinding.create` or `roleBinding.create` should be enabled | true |
| operatorRbac.clusterRole.configManagement.roleName | Role name for operator configuration management (hot property loading and leader election) | `spark-operator-config-role` |
| appResources.namespaces.create | Whether to create dedicated namespaces for Spark apps. | `spark-operator-config-role-binding` |
```
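For orientation, the keys in the table above would be set in a helm values override along these lines (a sketch only: the key paths come from the table, while the nesting and the illustrative values are assumptions):

```yaml
# Illustrative values override using the RBAC keys from the table above.
# Key paths come from the table; nesting and values are assumptions.
operatorRbac:
  role:
    create: true    # at least one of clusterRole.create / role.create must be enabled
  roleBinding:
    create: true    # at least one of clusterRoleBinding.create / roleBinding.create must be enabled
  clusterRole:
    configManagement:
      roleName: spark-operator-config-role
appResources:
  namespaces:
    create: true    # create dedicated namespaces for Spark apps
```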
Shall we add clusterResources first before adding this document? It looks a little weird because the document is missing one of the parts while Apache Spark Operator supports both the SparkApp CRD and the SparkCluster CRD.
Yep! Added a short field in the Spark Custom Resources page to start with. Also created SPARK-49528 to better document the template support for clusters.
Actually +1 for the point - appResources can be a bit misleading, since it may serve both SparkApp and SparkCluster. It was introduced to indicate that this is for running Spark workloads (as compared to other resources created for the operator deployment itself).
I shall fix this in SPARK-49623.
settings.gradle
Outdated
```
include 'spark-operator-api'
include 'spark-submission-worker'
include 'spark-operator'
include 'spark-operator-docs'
```
Shall we move this into the build-tools directory? In addition, spark-operator-docs sounds like a bit of an overclaim because it has only ConfOptionDocGenerator while the documentation has more content.
Refactored in SPARK-49527.
Please spin off the code part.
https://github.com/apache/spark-kubernetes-operator/pull/113/files#r1744847803

Thank you for updating this.
```
commandLine "java", "-classpath", sourceSets.main.runtimeClasspath.getAsPath(), javaMainClass, docsPath
}

build.finalizedBy(generateConfPropsDoc)
```
This ensures the generated doc is updated on each Gradle build when a new conf is introduced, if any.
docs/architecture.md
Outdated
```
# Design & Architecture

**Spark-Kubernetes-Operator** (Operator) acts as a control plane to manage the complete
deployment lifecycle of Spark applications. The Operator can be installed on a Kubernetes
```
This was correct before, but not anymore because we added the SparkCluster CRD.
I guess we need to revise README.md, too.
Thanks a lot for the review, and sorry for the late response!
I updated this with a best effort to cover SparkCluster as well.
docs/architecture.md
Outdated
```
namespace and controls Spark deployments in one or more managed namespaces. The custom resource
definition (CRD) that describes the schema of a SparkApplication is a cluster wide resource.
For a CRD, the declaration must be registered before any resources of that CRDs kind(s) can be
used, and the registration process sometimes takes a few seconds.
```
Let's remove the CRD details. They distract from the Apache Spark K8s Operator explanation. We had better link to the K8s CRD document.

> For a CRD, the declaration must be registered before any resources of that CRDs kind(s) can be
> used, and the registration process sometimes takes a few seconds.
docs/architecture.md
Outdated
```
For a CRD, the declaration must be registered before any resources of that CRDs kind(s) can be
used, and the registration process sometimes takes a few seconds.

Users can interact with the operator using the kubectl or k8s API. The Operator continuously
```
Let's not assume that users are unaware of kubectl and helm. They're too common these days, even in ASF projects.

> Users can interact with the operator using the kubectl or k8s API.
docs/architecture.md
Outdated
```
tracks cluster events relating to the SparkApplication custom resources. When the operator
receives a new resource update, it will take action to adjust the Kubernetes cluster to the
desired state as part of its reconciliation loop. The initial loop consists of the following
high-level steps:
```
Please try to rewrite the above paragraph into a single sentence.
```
desired state as part of its reconciliation loop. The initial loop consists of the following
high-level steps:

* User submits a SparkApplication custom resource(CR) using kubectl / API
```
Let's be clear that this is one of two cases: SparkApplication and SparkCluster.
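To illustrate the submission step being discussed, a minimal SparkApplication CR applied with kubectl might look like this (a sketch only: the `apiVersion` and the spec fields other than `runtimeVersions` are assumptions based on the docs under review, not the released schema):

```yaml
# Hypothetical minimal SparkApplication CR. Only the runtimeVersions
# block mirrors the docs quoted in this review; the apiVersion and
# remaining fields are assumptions for illustration.
apiVersion: spark.apache.org/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark-apps
spec:
  mainClass: org.apache.spark.examples.SparkPi
  runtimeVersions:
    scalaVersion: "2.13"
    sparkVersion: "4.0.0-preview1"
```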
docs/architecture.md
Outdated
```
desired state until the
current state becomes the desired state. All lifecycle management operations are realized
using this very simple
principle in the Operator.
```
Please remove "All lifecycle management operations are realized using this very simple principle in the Operator."
docs/architecture.md
Outdated
```
## State Transition

[<img src="resources/state.png">](resources/state.png)
```
A PNG is not editable. When we need to update this in the future, how can we do it?
Diagrams are created with draw.io, which allows importing and updating PNGs for simple diagrams. Would you suggest we mention this in the docs as well? It's not user-facing but may help future work.
docs/architecture.md
Outdated
```
[<img src="resources/state.png">](resources/state.png)

* Spark application are expected to run from submitted to succeeded before releasing resources
* User may configure the app CR to time-out after given threshold of time
```
Is this in the diagram?
The application diagram tries to cover the timeout blocks as well.
docs/architecture.md
Outdated
```
* Spark application are expected to run from submitted to succeeded before releasing resources
* User may configure the app CR to time-out after given threshold of time
* In addition, user may configure the app CR to skip releasing resources after terminated. This is
typically used at dev phase: pods / configmaps. etc would be kept for debugging. They have
```
"pods / configmaps. etc"? Although the meaning is clear, could you revise this grammatically?
docs/configuration.md
Outdated
````
To enable hot properties loading, update the **helm chart values file** with

```
````
Redundant empty line.
```
# ... all other config overides...
dynamicConfig:
  create: true
```
Redundant empty line.
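For reference, the hot-properties override quoted in this thread would sit in the helm values file roughly as follows (a sketch: only `dynamicConfig.create` comes from the quoted diff; the enclosing key is an assumption):

```yaml
# Sketch of a helm values override enabling hot property loading.
# Only dynamicConfig.create is taken from the quoted diff; the
# enclosing operatorConfiguration key is an assumption.
operatorConfiguration:
  # ... all other config overrides ...
  dynamicConfig:
    create: true
```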
docs/configuration.md
Outdated
```
## Config Metrics Publishing Behavior

Spark Operator uses the same source & sink interface as Apache Spark. You may
```
`source & sink` -> `source and sink`
docs/metrics_logging.md
Outdated
```
under the License.
-->

# Metrics
```
Why do we have a new file for this? I'd recommend to include this content into configuration.md.
docs/operator_probes.md
Outdated
```
under the License.
-->

# Operator Probes
```
Please remove this section.
docs/operator_probes.md
Outdated
```
* operator runtimeInfo health state
* Sentinel resources health state

### Operator Sentinel Resource
```
Please move this section into operations.md and remove operator_probes.md completely.
docs/spark_custom_resources.md
Outdated
```
runtimeVersions:
  scalaVersion: "2.13"
  sparkVersion: "4.0.0-preview1"
```
Redundant empty line.
docs/configuration.md
Outdated
```
## Config Metrics Publishing Behavior

Spark Operator uses the same source & sink interface as Apache Spark. You may
```
If Spark has a corresponding doc, it is better to add a hyperlink here.
docs/metrics_logging.md
Outdated
```
the [Dropwizard Metrics Library](https://metrics.dropwizard.io/4.2.25/). Note that Spark Operator
does not have Spark UI, MetricsServlet
and PrometheusServlet from org.apache.spark.metrics.sink package are not supported. If you are
interested in Prometheus metrics exporting, please take a look at below section `Forward Metrics to Prometheus`
```
Add a hyperlink?
docs/metrics_logging.md
Outdated
```
## Forward Metrics to Prometheus

In this section, we will show you how to forward spark operator metrics
```
Suggested change:

```diff
-In this section, we will show you how to forward spark operator metrics
+In this section, we will show you how to forward Spark Operator metrics
```
docs/metrics_logging.md
Outdated
```
* Modify the
build-tools/helm/spark-kubernetes-operator/values.yaml file' s metrics properties section:
```
Suggested change:

```diff
-* Modify the
-build-tools/helm/spark-kubernetes-operator/values.yaml file' s metrics properties section:
+* Modify the metrics properties section in the file
+`build-tools/helm/spark-kubernetes-operator/values.yaml`:
```
docs/metrics_logging.md
Outdated
````
sink.PrometheusPullModelSink
```

* Install the Spark Operator
````
Suggested change:

```diff
-* Install the Spark Operator
+* Install Spark Operator
```
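To make the metrics step concrete, a sink override in the helm values file might look roughly like this (a sketch only: the key nesting and the sink's full package path are assumptions; only the `PrometheusPullModelSink` class name appears in the thread above):

```yaml
# Sketch of the metrics properties section in
# build-tools/helm/spark-kubernetes-operator/values.yaml.
# Key nesting and the sink class package are assumptions for illustration;
# only the PrometheusPullModelSink name comes from the quoted diff.
operatorConfiguration:
  metricsProperties:
    create: true
    data:
      metrics.properties: |
        *.sink.prometheus.class=org.apache.spark.k8s.operator.metrics.sink.PrometheusPullModelSink
```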
```diff
@@ -0,0 +1,203 @@
+## Spark Operator API
```
License header?
Thanks for the catch!
We have disabled the license header check for markdown files, but I added the header back for files under /docs for consistency.
+1, LGTM. Let's merge this as the initial draft.
bump internal version to 0.4.0.1
What changes were proposed in this pull request?

This PR includes Operator docs under `docs/` for configuration, architecture, operations, and metrics.

Why are the changes needed?

Operator docs are necessary for users to understand the design and to get started with the operator installation.
Does this PR introduce any user-facing change?
No - new release
How was this patch tested?
CIs
Was this patch authored or co-authored using generative AI tooling?
No