This Helm chart deploys the Platforma application to a Kubernetes cluster.
- Helm: v3.8.0+ (for OCI support)
- Kubernetes: v1.25.0+
- Persistent Volume Provisioner: A dynamic provisioner is required if you are using persistence (enabled by default).
- Ingress Controller: An Ingress controller (e.g., NGINX Ingress Controller) must be installed in the cluster to use the Ingress resource.
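To verify your client tooling meets these requirements:

```bash
helm version --short   # should report v3.8.0 or newer
kubectl version        # reports client and server (cluster) versions
```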
There are two recommended methods for installing the Platforma Helm chart.
This is the preferred method for modern Helm versions. It pulls the chart directly from the GitHub Container Registry.
```bash
# Replace <version> with the specific chart version you want to install
# Replace <namespace> with the target namespace
# Provide your custom values file with -f
helm install my-platforma oci://ghcr.io/milaboratory/platforma-helm-charts/platforma \
  --version <version> \
  --namespace <namespace> \
  -f my-values.yaml
```
This method uses the traditional Helm repository hosted on GitHub Pages.
1. Add the Helm Repository:
```bash
helm repo add platforma https://milaboratory.github.io/platforma-helm-charts
helm repo update
```
2. Install the Chart:
```bash
# You can search for available versions
helm search repo platforma/platforma --versions

# Install the chart (replace <version> with the desired chart version)
# Replace <namespace> with the target namespace
# Provide your custom values file with -f
helm install my-platforma platforma/platforma \
  --version <version> \
  --namespace <namespace> \
  -f my-values.yaml
```
The chart exposes the following key configuration areas:

- `image`: `repository`, `tag` (defaults to `appVersion`), `pullPolicy`, `imagePullSecrets`.
- `service`: gRPC Service on `listenOptions.port` (default `6345`). An optional HTTP Service exists only when `primaryStorage.fs.enabled` is true.
- `ingress`: Single host; the gRPC path is always added when enabled; the HTTP path only if `primaryStorage.fs.enabled`.
- `probes`: `httpGet`, `tcpSocket`, or `grpc`, separately configurable for liveness/readiness.
- `deployment`: strategy, pod labels/annotations, `securityContext` and `podSecurityContext`.
- `persistence`: either a single `mainRoot` PVC or split `dbDir`/`workDir`/`packagesDir` PVCs; optional logging PVC; optional FS data libraries; optional FS primary storage PVC.
- `primaryStorage` (exclusive): exactly one of S3, FS, or GCS must be enabled; the chart validates this and fails otherwise.
- `dataLibrary`: additional S3/GCS/FS libraries.
- `authOptions`: htpasswd or LDAP (plus TLS via paths or `secretRef`).
- `googleBatch`: CLI args and an optional shared NFS PVC for offloaded jobs.
- `monitoring`/`debug`: optional Services and ports.
- `gcp.serviceAccount`: optional centralized GCP service account email used as a fallback for GCS and Google Batch CLI options.
- `gcp.projectId`: optional centralized GCP project ID used as a fallback for GCS and Google Batch CLI options.
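For orientation, a minimal `values.yaml` sketch touching a few of these keys (the repository path and secret name below are placeholders, not chart defaults):

```yaml
image:
  repository: ghcr.io/milaboratory/platforma  # placeholder; use the actual image path
  tag: ""                                     # empty falls back to the chart appVersion
  pullPolicy: IfNotPresent
imagePullSecrets:
  - name: regcred                             # placeholder; only for private registries
```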
Version 2.0.0 of this Helm chart introduces significant structural changes and is not backward-compatible with 1.x versions. A manual migration is required to upgrade existing releases while preserving data.
The key change is the refactoring of the values.yaml file for better organization and clarity.
1. Backup Your Data: Before starting the migration, ensure you have a backup of your persistent volumes.

2. Prepare a Migration `values.yaml`: Create a new values file (`migration-values.yaml`) that maps your old configuration to the new structure. The primary goal is to reuse your existing PersistentVolumeClaims (PVCs) to avoid data loss. Your existing PVCs typically follow this naming pattern:

   - `<release-name>-platforma-database`
   - `<release-name>-platforma-work`
   - `<release-name>-platforma-softwareloader`

3. Map Old Values to New Structure: Here is an example of how to configure the `persistence` section in your `migration-values.yaml` to reuse your existing volumes:

   ```yaml
   # migration-values.yaml
   persistence:
     dbDir:
       enabled: true
       existingClaim: "<release-name>-platforma-database"
       mountPath: "/db"
     workDir:
       enabled: true
       existingClaim: "<release-name>-platforma-work"
       mountPath: "/data/work"
     packagesDir:
       enabled: true
       existingClaim: "<release-name>-platforma-softwareloader"
       mountPath: "/storage/controllers/software-loader"
   ```

   You must also port other custom configurations from your old `values.yaml` (e.g., `image.tag`, `ingress`, `resources`, `primaryStorage`, `authOptions`) to their new locations in the `platforma` structure.

4. Perform the Upgrade: Run `helm upgrade` with your release name, the new chart version, and your migration values file:

   ```bash
   helm upgrade <release-name> platforma/platforma --version 2.0.0 -f migration-values.yaml
   ```
You can pass licenses for Platforma (PL_LICENSE) and other integrated tools (MI_LICENSE) securely using Kubernetes Secrets and environment variables.
1. Create the Secret Resources
Create Kubernetes secrets to hold your license keys.
Using kubectl:
```bash
kubectl create secret generic pl-license-secret --from-literal=pl-license-key='your_pl_license_key_here'
kubectl create secret generic mi-license-secret --from-literal=mi-license-key='your_mi_license_key_here'
```
2. Reference the Secrets in values.yaml
Modify your values.yaml to reference these secrets. The chart will inject them as environment variables into the application container.
```yaml
env:
  secretVariables:
    - name: PL_LICENSE
      secretKeyRef:
        name: pl-license-secret
        key: pl-license-key
    - name: MI_LICENSE
      secretKeyRef:
        name: mi-license-secret
        key: mi-license-key
```
Persistence is enabled by default and controlled under `persistence`:

- The former `globalEnabled` flag has been removed; behavior now depends on `mainRoot.enabled` vs. the split volumes.
- mainRoot (default): a single PVC mounted at `persistence.mainRoot.mountPath` (default `/data/platforma-data`). When `mainRoot.enabled: true`, the split volumes below are ignored.
- Split volumes: used only when `mainRoot.enabled: false`:
  - `dbDir`: RocksDB state
  - `workDir`: working directory
  - `packagesDir`: software packages

  For each, set `existingClaim` to use an existing PersistentVolumeClaim instead of having the chart create one automatically; you can also adjust `size` and `storageClass`.
- Logging persistence: when `logging.destination` is `dir://` or `file://`, you can persist logs with `logging.persistence.enabled`. The configuration rules are the same as for the other persistent volumes.
- FS data libraries: each entry in `dataLibrary.fs` can create or reuse a PVC and is mounted at its `path`.

Tip: set `existingClaim` to reuse an existing volume; otherwise set `createPvc: true` and specify `size` (and `storageClass` if needed).
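For example, a minimal sketch of the default single-volume layout (size and storage class are placeholders; whether `mainRoot` accepts the same `size`/`storageClass` knobs as the split volumes should be verified against the chart's `values.yaml`):

```yaml
persistence:
  mainRoot:
    enabled: true                       # single PVC; split volumes are ignored
    mountPath: "/data/platforma-data"   # chart default per the notes above
    size: 100Gi                         # placeholder
    storageClass: "standard"            # placeholder
    # existingClaim: "my-existing-pvc"  # set to reuse a volume instead of creating one
```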
Platforma Backend can use Docker images to run software for blocks.
To enable this mode, set `docker.enabled: true` in your values configuration.
NOTE: for now, 'docker' mode is all-or-nothing: the backend either requires all software to be binary (`enabled: false`) or all software to be dockerized (`enabled: true`).
By default, the Docker pod is created with the same resource requests/limits as the main service pod. You can specify alternative resources for the Docker pod; only options set to non-empty values override the common resource settings. For example:

```yaml
docker:
  enabled: true
  resources:
    requests:
      memory: 256Gi
```
For sensitive files like TLS certificates, S3 credentials, or the Platforma license file, this chart uses a secure mounting mechanism.
You can create secrets from files or literal values.
- LDAP Certificates:

  ```bash
  kubectl create secret generic ldap-cert-secret \
    --from-file=tls.crt=./tls.crt \
    --from-file=tls.key=./tls.key \
    --from-file=ca.crt=./ca.crt
  ```

- Platforma License File:

  ```bash
  kubectl create secret generic platforma-license \
    --from-file=license=./license.txt
  ```

- S3 Credentials:

  ```bash
  kubectl create secret generic my-s3-secret \
    --from-literal=access-key=AKIA... \
    --from-literal=secret-key=abcd1234...
  ```
Reference the secrets in `values.yaml` under the appropriate section (e.g., `authOptions.ldap.secretRef`, `mainOptions.licenseFile.secretRef`, `primaryStorage.s3.secretRef`).
The chart mounts the referenced secret as files into the pod (e.g., at `/etc/platforma/secrets/ldap/`), and the application is automatically configured to use these file paths.
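As a hedged sketch only: the exact `secretRef` subfields for the license file are an assumption here, modeled on the `primaryStorage.s3.secretRef` example later in this document; verify against the chart's `values.yaml`.

```yaml
mainOptions:
  licenseFile:
    secretRef:
      enabled: true             # assumed flag
      name: platforma-license   # the Secret created above
      key: license              # assumed field pointing at the file entry in the Secret
```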
This Helm chart provides flexible options for both primary and data library storage, allowing you to use S3, GCS, or a local filesystem (via PersistentVolumeClaims).
Primary storage is used for long-term storage of analysis results. Only one primary storage provider can be enabled at a time.
- S3: To use an S3-compatible object store, configure the `primaryStorage.s3` section. You can provide credentials directly or reference a Kubernetes secret.
- GCS: To use Google Cloud Storage, configure `primaryStorage.gcs`, specifying the bucket URL, project ID, and service account.
- FS (Filesystem): To use a local filesystem path backed by a PVC, enable `primaryStorage.fs`.
  - If `primaryStorage.fs.persistence.enabled` is true:
    - Use `existingClaim` to reuse a PVC, OR
    - Provide `storageClass` and `size` to let the chart create a PVC.
  - The chart attaches the `primary-storage` volume automatically when `primaryStorage.fs.persistence.enabled` is true.
Example GCS configuration using the centralized `gcp` settings:

```yaml
gcp:
  projectId: "my-gcp-project-id" # optional centralized project
primaryStorage:
  gcs:
    enabled: true
    url: "gs://my-gcs-bucket/primary-storage/"
    # projectId can be omitted; will use gcp.projectId when set
    # Optional if you set top-level gcp.serviceAccount (see below)
    # serviceAccount: "[email protected]"
```
For S3-compatible endpoints (e.g., Hetzner), set AWS-style environment variables and store the access/secret keys in a Secret:

```bash
kubectl create secret generic hetzner-s3-credentials \
  --from-literal=access-key=ACCESS_KEY \
  --from-literal=secret-key=SECRET_KEY
```
In values:
```yaml
env:
  variables:
    AWS_REGION: eu-central-1
  secretVariables:
    - name: AWS_ACCESS_KEY_ID
      secretKeyRef:
        name: hetzner-s3-credentials
        key: access-key
    - name: AWS_SECRET_ACCESS_KEY
      secretKeyRef:
        name: hetzner-s3-credentials
        key: secret-key
```
Exactly one of `primaryStorage.s3.enabled`, `primaryStorage.fs.enabled`, or `primaryStorage.gcs.enabled` must be true. The chart validates this at render time and fails if none or multiple are enabled.
Data libraries allow you to mount additional datasets into the application. You can configure multiple libraries of different types.
- S3 Libraries: Configure S3-backed libraries under `dataLibrary.s3`.
- GCS Libraries: Configure GCS-backed libraries under `dataLibrary.gcs`.
- FS Libraries: Configure filesystem-backed libraries under `dataLibrary.fs`; these are provisioned using PVCs (see the sketch after this list).
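As a hedged sketch of an FS-backed library: the `id` and `enabled` fields are assumed to mirror the S3 library example later in this document, and `createPvc`/`size`/`storageClass` come from the persistence notes above; verify against the chart's `values.yaml`.

```yaml
dataLibrary:
  fs:
    - id: "my-fs-library"                    # assumed, mirroring the S3 example
      enabled: true                          # assumed, mirroring the S3 example
      path: "/data/libraries/my-fs-library"  # mount path inside the pod
      createPvc: true
      size: 20Gi                             # placeholder
      storageClass: "standard"               # placeholder
```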
This chart supports integration with Google Batch for offloading job execution. This is useful for large-scale data processing tasks. To enable this, you need a shared filesystem (like NFS) that is accessible by both the Platforma pod and the Google Batch jobs. Google Cloud Filestore is a common choice for this.
Configuration:
The googleBatch section in values.yaml controls this integration.
- `enabled`: Set to `true` to enable Google Batch integration.
- `storage`: Specifies the mapping between a local path in the container and the shared NFS volume. The format is `<local-path>=<nfs-uri>`.
- `project`: Your Google Cloud Project ID.
- `region`: The GCP region where Batch jobs will run.
- `serviceAccount`: The email of the GCP service account that Google Batch jobs will use. This service account needs appropriate permissions for Batch and storage access.
- `network`/`subnetwork`: The VPC network and subnetwork for the Batch jobs.
- `volumes`: Configures the shared NFS volume. Provide EITHER `existingClaim` (reuse an existing PVC) OR `storageClass` + `size` (let the chart create a PVC). Set `accessMode` as needed (default `ReadWriteMany`).
Example Configuration:
```yaml
googleBatch:
  enabled: true
  storage: "/data/platforma-data=nfs://10.0.0.2/fileshare"
  project: "my-gcp-project-id"
  region: "us-central1"
  serviceAccount: "[email protected]"
  network: "projects/my-gcp-project-id/global/networks/default"
  subnetwork: "projects/my-gcp-project-id/regions/us-central1/subnetworks/default"
  volumes:
    enabled: true
    existingClaim: "my-filestore-pvc" # or omit and set storageClass + size for dynamic provisioning
    accessMode: "ReadWriteMany"
    # storageClass: "filestore-rwx"
    # size: "1Ti"
```
This configuration assumes you have already created a Google Cloud Filestore instance and a corresponding PersistentVolumeClaim (my-filestore-pvc) in your Kubernetes cluster.
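For reference, a Filestore instance could be created along these lines (instance name, zone, tier, and capacity are placeholders; wiring it up as a PersistentVolume/PVC in the cluster is a separate step):

```bash
gcloud filestore instances create platforma-share \
  --zone=us-central1-a \
  --tier=BASIC_HDD \
  --file-share=name=fileshare,capacity=1TiB \
  --network=name=default
```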
Example S3 data library configuration:

```yaml
dataLibrary:
  s3:
    - id: "my-s3-library"
      enabled: true
      url: "s3://my-s3-bucket/path/to/library/"
      region: "us-east-1"
```
The chart offers flexible logging options configured via the logging.destination parameter in values.yaml.
- Stream-Based Logging (Default):
  - `stream://stdout`: Logs are sent to standard output (recommended for Kubernetes).
  - `stream://stderr`: Logs are sent to standard error.
- Directory-Based Logging:
  - `dir:///path/to/logs`: Logs are written to files in the specified directory. To persist logs, enable `logging.persistence` in `values.yaml`, which creates a PersistentVolumeClaim (PVC) to store the log files.
```yaml
logging:
  destination: "dir:///var/log/platforma"
  persistence:
    enabled: true
    size: 10Gi
    storageClass: "standard"
```
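For the default stream-based logging, the configuration is simply:

```yaml
logging:
  destination: "stream://stdout"
```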
When deploying to a production environment, consider the following:

- Resource Management: Set realistic CPU and memory `requests` and `limits` in the `resources` section to ensure stable performance. For example:

  ```yaml
  resources:
    # Default (sane for small clusters/testing)
    limits:
      cpu: 2000m
      memory: 4Gi
    requests:
      cpu: 1000m
      memory: 2Gi
  ```

  For production, consider increasing resources as needed, e.g.:

  ```yaml
  resources:
    limits:
      cpu: 8000m
      memory: 16Gi
    requests:
      cpu: 4000m
      memory: 8Gi
  ```

- Security:
  - Use a dedicated `serviceAccount` and link it to a cloud IAM role for secure access to cloud resources.
  - Configure `deployment.securityContext` and `podSecurityContext` to run the application with the least required privileges.
- Networking:
  - For secure external access, configure the `ingress` with a real TLS certificate.
  - Use `networkPolicy` to restrict traffic between pods for a more secure network posture.
- Ingress specifics:
  - The HTTP port and `-http` Service exist only when `primaryStorage.fs.enabled` is true; the Ingress HTTP path is added only in that case. gRPC access is always via the main Service.
- Traefik + gRPC (h2c):
  - If you use Traefik, you may need to enable h2c on the Service:

    ```yaml
    service:
      annotations:
        traefik.ingress.kubernetes.io/service.serversscheme: "h2c"
    ```

- Image pull secrets:
  - For private registries, set:

    ```yaml
    imagePullSecrets:
      - name: regcred
    ```

- NetworkPolicy:
  - Enable and define ingress/egress rules under `networkPolicy` if your cluster enforces them.
- Security defaults:
  - The chart defaults to running the container as root (`runAsUser: 0`). Consider hardening via `deployment.securityContext` and `deployment.podSecurityContext` to comply with cluster policies; see the sketch after this list.
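As a hedged hardening sketch (standard Kubernetes securityContext fields under the chart keys named above; whether the application runs correctly as non-root depends on your image version and volume permissions):

```yaml
deployment:
  podSecurityContext:
    runAsNonRoot: true
    runAsUser: 1000    # placeholder non-root UID
    fsGroup: 1000      # placeholder; aligns volume ownership with the UID
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
```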
Ready-to-use example values are provided under the `examples/` directory:

- `examples/hetzner-s3.yaml`
- `examples/aws-s3.yaml`
- `examples/gke-gcs.yaml`
- `examples/fs-primary.yaml`
Important: Always review and adapt example files before deployment. Replace placeholders (bucket names, domains, storageClass, regions, service account emails, credentials) with values that match your environment and security policies.
For example, to configure S3 primary storage with credentials stored in a Kubernetes Secret:

```bash
kubectl create secret generic my-s3-secret \
  --from-literal=access-key=AKIA... \
  --from-literal=secret-key=abcd1234...
```

In `values.yaml`:

```yaml
primaryStorage:
  s3:
    enabled: true
    url: "s3://my-bucket/primary/"
    region: "eu-central-1"
    secretRef:
      enabled: true
      name: my-s3-secret
      keyKey: access-key
      secretKey: secret-key
```
- IAM Integration for AWS EKS and GCP GKE: When running on managed Kubernetes services like AWS EKS or GCP GKE, it is common practice to associate Kubernetes service accounts with cloud IAM roles for fine-grained access control. You can add the necessary annotations to the ServiceAccount created by this chart using the `serviceAccount.annotations` value.

  AWS EKS Example (IAM Roles for Service Accounts - IRSA):

  ```yaml
  serviceAccount:
    create: true
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/MyPlatformaIAMRole"
  ```

  GCP GKE Example (Workload Identity):

  ```yaml
  serviceAccount:
    create: true
    annotations:
      iam.gke.io/gcp-service-account: "[email protected]"
  ```
When running on GKE with GCS/Batch or on EKS with S3, grant at least the following permissions to the cloud identity used by the chart.
Assign these roles to the GCP service account mapped via Workload Identity:
- roles/storage.objectAdmin
- roles/batch.jobsEditor
- roles/batch.agentReporter
- roles/iam.serviceAccountTokenCreator
- roles/artifactregistry.reader
- roles/logging.logWriter
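For example, these roles could be granted with `gcloud` (the project ID and service-account email are placeholders):

```bash
for role in roles/storage.objectAdmin roles/batch.jobsEditor roles/batch.agentReporter \
            roles/iam.serviceAccountTokenCreator roles/artifactregistry.reader roles/logging.logWriter; do
  gcloud projects add-iam-policy-binding my-gcp-project-id \
    --member="serviceAccount:[email protected]" \
    --role="${role}"
done
```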
Attach an IAM policy similar to the following to the role mapped via IRSA. Substitute placeholders with your own values:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListEntireBucketAndMultipartActions",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::example-bucket-name"
    },
    {
      "Sid": "FullAccessUserSpecific",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:GetObjectAttributes",
        "s3:AbortMultipartUpload"
      ],
      "Resource": [
        "arn:aws:s3:::example-bucket-name/user-demo",
        "arn:aws:s3:::example-bucket-name/user-demo/*"
      ]
    },
    {
      "Sid": "GetObjectCommonPrefixes",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectAttributes"
      ],
      "Resource": [
        "arn:aws:s3:::example-bucket-name/corp-library/*",
        "arn:aws:s3:::example-bucket-name/test-assets/*"
      ]
    }
  ]
}
```