@richscott richscott commented Nov 21, 2025

These changes add support for running a Spark client against a remote Armada server (as opposed to both being on localhost). This has been developed and tested with a macOS system as the Spark client and an Ubuntu Linux server running Armada (on top of a kind k8s cluster).

See updates to the README.md for the steps to use this.

Change Details:

  • support for TLS certificates has been added; README.md now has instructions on how to get the cert files, define ARMADA_MASTER in scripts/config.sh, etc.
  • dev-e2e.sh now checks for the existence and health of the specified Armada server; if ARMADA_MASTER is not localhost, it will not attempt to start the server via armada-operator.
  • dev-e2e.sh no longer uses Docker to run the tests; it directly uses the local clone of Spark (typically in .spark-3.5.5).
  • the code that acquires the fabric8 k8s client no longer uses the default client (which assumes it's running inside the destination cluster); it now builds the client explicitly and uses the TLS cert files if they are specified in the init.sh/config.sh configuration settings.

Changes to make submitArmadaSpark.sh work with a possibly-remote Armada server, as dev-e2e.sh now does, are not yet included in this PR but will be coming soon.

These changes are to allow running an armada-spark client against a
remote (i.e. non-localhost) Armada cluster server.

Signed-off-by: Rich Scott <[email protected]>

Also, fixes in TestOrchestrator for running against a remote
Armada instance, and run the tests directly, instead of using
a Docker container on the client.

Signed-off-by: Rich Scott <[email protected]>

It will soon be used by ArmadaClientApplication.

Signed-off-by: Rich Scott <[email protected]>
@richscott richscott self-assigned this Nov 21, 2025
Comment on lines +102 to +103
of `kubectl config view`. These files can be left in this directory.

Collaborator

Should e2e/*crt be in .gitignore?


TMPDIR="$scripts/.tmp" "$AOHOME/bin/tooling/kind" load docker-image "$IMAGE_NAME" --name armada 2>&1 \
| log_group "Loading Docker image $IMAGE_NAME into Armada cluster";
if [ "$ARMADA_MASTER" = "localhost" ] ; then
Collaborator

ARMADA_MASTER is never just "localhost" is it?

usually something like: "armada://localhost:30002"
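
One way to make such a check tolerate the full URL form is to strip the scheme and port before comparing. The following is a minimal sketch, assuming ARMADA_MASTER looks like `armada://<host>:<port>`; the helper name `is_local_armada_master` is hypothetical and not part of this PR, and IPv6 literals are not handled:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from this PR): succeed if the Armada master URL
# points at the local machine, whether or not it carries a scheme or port.
is_local_armada_master() {
    local url="$1"
    local host="${url#armada://}"   # drop the pseudo-protocol prefix, if present
    host="${host%%[/:]*}"           # drop ":port" and anything after it
    case "$host" in
        localhost|127.0.0.1) return 0 ;;
        *) return 1 ;;
    esac
}

# Default value used only for this illustration.
ARMADA_MASTER="${ARMADA_MASTER:-armada://localhost:30002}"
if is_local_armada_master "$ARMADA_MASTER"; then
    echo "local"
else
    echo "remote"
fi
```

With this shape, both `localhost` and `armada://localhost:30002` are treated as local, so the comparison no longer depends on how the variable was written in config.sh.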

Seq(armadactlCmd) ++ subCommand.split(" ") ++ Seq("--armadaUrl", armadaUrl)
// armadactl command expects the server address to be of the form
// <hostname-or-IP>:<port> with no pseudo-protocol prefix
var armadactlUrl = armadaUrl.replaceFirst("^armada://", "")
Collaborator

val

package org.apache.spark.deploy.armada.e2e

import io.fabric8.kubernetes.api.model.Pod
import org.apache.spark.deploy.armada.K8sClient
Collaborator

is this used?

val dockerCommand = buildDockerCommand(
val runTestCommand = buildRunTestCommand(
config.imageName,
volumeMounts,
Collaborator

why was this removed?

lookoutUrl: String,
pythonScript: Option[String]
): Seq[String] = {
val sparkRepoCopy = ".spark-3.5.5"
Collaborator

hardcoded?


val classPathEntries: Seq[String] = Seq(
".",
s"${sparkRepoCopy}/assembly/target/scala-2.13/jars/*",
Collaborator

hard coded versions?

@GeorgeJahad
Collaborator

Running with:

ARMADA_MASTER=armada://localhost:30002

no longer seems to work:

scripts/submitArmadaSpark.sh 100
Submitting spark job to Armada.
/home/gbj/incoming/armada-spark/scripts/init.sh: line 67: CLIENT_CERT_FILE: unbound variable
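
The "unbound variable" message is what bash prints under `set -u` when a script reads a variable that was never assigned. A minimal sketch of one common guard follows, assuming init.sh runs with `set -u` and that the TLS settings are meant to be optional; the function name `tls_status` is hypothetical:

```shell
#!/usr/bin/env bash
set -u

# Simulate a localhost setup in which no TLS settings were configured.
unset CLIENT_CERT_FILE

# "${VAR:-}" expands to an empty string instead of aborting under `set -u`.
CLIENT_CERT_FILE="${CLIENT_CERT_FILE:-}"

# Hypothetical helper: report whether a client certificate was configured.
tls_status() {
    if [ -n "$CLIENT_CERT_FILE" ]; then
        echo "TLS enabled, client cert: $CLIENT_CERT_FILE"
    else
        echo "TLS not configured"
    fi
}

tls_status
```

Defaulting the variable once, near the top of the script, keeps later references safe without dropping `set -u` for the rest of the file.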

Get the first external interface IP address and use in Kind
configuration for allowing external K8S/Armada API access. Add
protective quotes around TLS vars in dev-e2e.sh, per shellcheck. Use
`realpath` for getting reliable full pathnames.

Signed-off-by: Rich Scott <[email protected]>

Use better pattern checks for verifying if Armada master is localhost;
remove quote wrapper around tls_args, so Maven doesn't error on a ""
target.

Signed-off-by: Rich Scott <[email protected]>
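
The `tls_args` quoting issue mentioned in the commit above comes from a general shell behavior: a quoted empty string is still passed as one (empty) argument. The sketch below illustrates the problem and one alternative that keeps quoting intact. It is not the code from this PR; `count_args`, `run_with_optional_tls`, the property name, and the certificate path are all placeholders:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical stand-in for the real build command: prints how many
# arguments it received, which makes the effect of quoting visible.
count_args() { echo "$#"; }

# Problem case: an empty string in quotes is still one (empty) argument.
tls_args=""
count_args test "$tls_args"      # prints 2

# Alternative: collect optional flags in the positional parameters of a
# function. "$@" expands to zero words when nothing was added.
run_with_optional_tls() {
    cert_file="$1"               # may be empty, meaning "no TLS"
    set --                       # start with an empty argument list
    if [ -n "$cert_file" ]; then
        # Placeholder property name, not taken from this PR.
        set -- "$@" "-Dexample.client.cert=$cert_file"
    fi
    count_args test "$@"
}

run_with_optional_tls ""                 # prints 1
run_with_optional_tls "e2e/client.crt"   # prints 2
```

Compared with leaving the variable unquoted, this approach also survives certificate paths that contain spaces, since each flag stays a single word.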