# Update EBS Volume Provisioning Documentation for Spark on EKS (#58)

Open · wants to merge 3 commits into `main`
## `content/storage/docs/spark/ebs.md` (117 additions, 9 deletions)
@@ -92,7 +92,8 @@ kubectl apply -f ebs-static-pvc.yaml -n <namespace>

The PVC `ebs-static-pvc` can be mounted into the Spark pods by the Spark developer.

!!! warning "Warning"
    Pods running on EKS worker nodes can only attach to EBS volumes provisioned in the same AZ as the worker node. Use [node selectors](../../../node-placement/docs/eks-node-placement.md) to schedule pods on EKS worker nodes in the specified AZ.
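
A quick way to sanity-check placement before submitting a job is to compare the volume's AZ with your worker nodes' AZ labels. This is only a sketch; `<pv-name>` is whatever name you gave the statically provisioned PersistentVolume:

```bash
# Show the AZ recorded on the PV (present when the PV was created with node affinity or AZ labels)
kubectl get pv <pv-name> -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms}'

# List worker nodes with their AZ labels, so pods can be pinned to nodes in the matching AZ
kubectl get nodes -L topology.kubernetes.io/zone
```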

#### Spark Developer Tasks

@@ -146,36 +147,141 @@ kubectl get pod <driver pod name> -n <namespace> -o yaml --export

### Dynamic Provisioning

EMR releases 6.3.0 and later support dynamic provisioning of PVCs/volumes for both the Spark driver and executors.

The EBS CSI Driver offers two ways to bind volumes in Kubernetes (a quick way to check which mode a StorageClass uses is shown after this list):

1. `WaitForFirstConsumer` Mode:
    - The volume is created only when a Pod that uses it is scheduled, in that Pod's Availability Zone.
    - This is usually the best choice for most situations.

2. `Immediate` Mode:
    - The volume is created right away, which may place it in a different Availability Zone than the Pod.
    - This can cause scheduling problems and may leave Pods unschedulable.
    - **Important**: Availability Zone constraints (`allowedTopologies`) are mandatory to avoid unschedulable Pods.
    - Only use this mode if you need the fastest possible start-up time.
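
To see which binding mode an existing StorageClass uses, a check along these lines works (the column names are just illustrative):

```bash
# List StorageClasses with their provisioner and volume binding mode
kubectl get storageclass -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner,BINDINGMODE:.volumeBindingMode
```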


#### EKS Admin Tasks

!!! warning "Warning"
    The default Kubernetes role for `emr-containers` does not have the required PVC permissions, so the job fails when you submit it. Follow the [AWS Troubleshooting Guide](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/permissions-for-pvc.html) to add the required PVC permissions.
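
The guide walks through the exact role changes. The sketch below only illustrates the kind of RBAC rule involved; the role name and verb list here are assumptions, so follow the linked guide for the authoritative steps:

```bash
# Sketch only: grant PVC permissions in the namespace registered with the virtual cluster.
# The role name and verbs are illustrative; use the ones from the troubleshooting guide.
cat >emr-containers-pvc-role.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: emr-containers-pvc   # hypothetical name
  namespace: <namespace>
rules:
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "create", "delete", "patch"]
EOF
kubectl apply -f emr-containers-pvc-role.yaml
```

Binding the rule to the right service accounts is also covered in the guide.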


To set up storage, you can create a new "gp3" EBS Storage Class or use an existing one.

When using `WaitForFirstConsumer` mode, you don't need to specify the Availability Zone. Here's an example:

```shell
cat >demo-gp3-sc.yaml << EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: demo-gp3-sc-wait-for-first-consumer
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - debug
volumeBindingMode: WaitForFirstConsumer
EOF
```

When using `Immediate` mode, you must specify the Availability Zone in which the volume should be created; otherwise Pods may end up unschedulable. Here's an example:


```bash
cat >demo-gp3-sc.yaml << EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: demo-gp3-sc-immediate-us-east-1a
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - debug
volumeBindingMode: Immediate
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-east-1a
EOF
```

!!! info "Info"
    When using `Immediate` mode, you need to create a separate storage class for each Availability Zone (AZ) in which you want to run your application. This ensures that volumes are provisioned in the correct AZ to match your application's requirements.
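
If your jobs span several AZs, one way to create the per-AZ classes is a small loop like the one below (a sketch; the zone list and the `demo-gp3-sc-immediate-<az>` naming are assumptions, adjust them to your cluster):

```bash
# Sketch: generate one Immediate-mode StorageClass per AZ used by your worker nodes
for az in us-east-1a us-east-1b us-east-1c; do
cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: demo-gp3-sc-immediate-${az}
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - ${az}
EOF
done
```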

To apply either of the configurations above (the `cat` commands save them as `demo-gp3-sc.yaml`), run:

```bash
kubectl apply -f demo-gp3-sc.yaml
```
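
One way to confirm the class behaves as expected is to create a throwaway claim against it; under `WaitForFirstConsumer` the claim should stay `Pending` until a Pod that uses it is scheduled, while an `Immediate` class should bind right away. This is a sketch and the claim name is hypothetical:

```bash
# Create a test PVC against the new class, check its status, then clean it up
cat << EOF | kubectl apply -n <namespace> -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: binding-mode-test
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: demo-gp3-sc-wait-for-first-consumer
  resources:
    requests:
      storage: 1Gi
EOF
kubectl get pvc binding-mode-test -n <namespace>
kubectl delete pvc binding-mode-test -n <namespace>
```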

#### Spark Developer Tasks

**Request for `WaitForFirstConsumer` Mode**

```bash
cat >spark-python-in-s3-ebs-dynamic-localdir.json << EOF
{
  "name": "spark-python-in-s3-ebs-dynamic-localdir",
  "virtualClusterId": "<virtual-cluster-id>",
  "executionRoleArn": "<execution-role-arn>",
  "releaseLabel": "emr-6.15.0-latest",
  "jobDriver": {
    "sparkSubmitJobDriver": {
      "entryPoint": "s3://<s3 prefix>/trip-count-fsx.py",
      "sparkSubmitParameters": "--conf spark.driver.cores=5 --conf spark.executor.instances=10 --conf spark.executor.memory=20G --conf spark.driver.memory=15G --conf spark.executor.cores=6"
    }
  },
  "configurationOverrides": {
    "applicationConfiguration": [
      {
        "classification": "spark-defaults",
        "properties": {
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "OnDemand",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass": "demo-gp3-sc-wait-for-first-consumer",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path": "/data",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly": "false",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit": "10Gi",

          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "OnDemand",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass": "demo-gp3-sc-wait-for-first-consumer",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path": "/data",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly": "false",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit": "50Gi"
        }
      }
    ],
    "monitoringConfiguration": {
      "cloudWatchMonitoringConfiguration": {
        "logGroupName": "/emr-containers/jobs",
        "logStreamNamePrefix": "demo"
      },
      "s3MonitoringConfiguration": {
        "logUri": "s3://joblogs"
      }
    }
  }
}
EOF
aws emr-containers start-job-run --cli-input-json file:///spark-python-in-s3-ebs-dynamic-localdir.json
```
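
While the job is running, Spark creates the driver and executor claims on demand (because `claimName` is `OnDemand`); you can watch them appear and cross-check the volumes they bind to. A quick sketch, with names being whatever Spark generates:

```bash
# Watch the PVCs created on demand for the driver and executors
kubectl get pvc -n <namespace> -w

# Cross-check the bound volumes and their storage class
kubectl get pv -o custom-columns=NAME:.metadata.name,CLAIM:.spec.claimRef.name,STORAGECLASS:.spec.storageClassName
```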

**Request for `Immediate` Mode**

!!! warning "Warning"
    Pods running on EKS worker nodes can only attach to EBS volumes provisioned in the same AZ as the worker node. Use [node selectors](../../../node-placement/docs/eks-node-placement.md) to schedule pods on EKS worker nodes in the specified AZ.

```bash
cat >spark-python-in-s3-ebs-dynamic-localdir.json << EOF
{
  "name": "spark-python-in-s3-ebs-dynamic-localdir",
@@ -193,14 +299,16 @@ cat >spark-python-in-s3-ebs-dynamic-localdir.json << EOF
      {
        "classification": "spark-defaults",
        "properties": {
          "spark.kubernetes.node.selector.topology.kubernetes.io/zone": "us-east-1a",

          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "OnDemand",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass": "demo-gp3-sc-immediate-us-east-1a",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path": "/data",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly": "false",
          "spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit": "10Gi",

          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "OnDemand",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass": "demo-gp3-sc-immediate-us-east-1a",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path": "/data",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly": "false",
          "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit": "50Gi",
## `content/storage/docs/spark/instance-store.md` (2 additions, 2 deletions)
@@ -43,8 +43,8 @@ managedNodeGroups:
- YOUR_NG_SUBNET
preBootstrapCommands: # commands executed as root
- yum install -y mdadm nvme-cli
- nvme_disks=($(nvme list | grep "Amazon EC2 NVMe Instance Storage" | awk -F'[[:space:]][[:space:]]+' '{print $1}')) && [[ ${#nvme_disks[@]} -eq 1 ]] && mkfs.ext4 -F ${nvme_disks[*]} && mkdir -p /var/lib/kubelet/pods && mount ${nvme_disks[*]} /var/lib/kubelet/pods && chmod 750 /var/lib/kubelet
- nvme_disks=($(nvme list | grep "Amazon EC2 NVMe Instance Storage" | awk -F'[[:space:]][[:space:]]+' '{print $1}')) && [[ ${#nvme_disks[@]} -ge 2 ]] && mdadm --create --verbose /dev/md0 --level=0 --raid-devices=${#nvme_disks[@]} ${nvme_disks[*]} && mkfs.ext4 -F /dev/md0 && mkdir -p /var/lib/kubelet/pods && mount /dev/md0 /var/lib/kubelet/pods && chmod 750 /var/lib/kubelet
```
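
After the node group comes up, you can verify on a worker node that the instance-store disks were formatted and mounted where kubelet keeps pod ephemeral storage (a quick check, run on the node via SSM or SSH):

```bash
nvme list                       # instance-store NVMe devices detected by the bootstrap commands
lsblk                           # /dev/md0 appears when two or more disks were striped with mdadm
df -h /var/lib/kubelet/pods     # the ext4 filesystem created by the preBootstrapCommands
```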

