From 31b8cac2f36464ce86a5485207aa3b6757cf7e45 Mon Sep 17 00:00:00 2001 From: Benjamin Nelson Date: Thu, 16 May 2024 15:54:24 -0500 Subject: [PATCH 1/3] Add docs for generic runtime --- docs/reference/crd-config/function-crd.md | 34 +++++++++++++++++++---- 1 file changed, 29 insertions(+), 5 deletions(-) diff --git a/docs/reference/crd-config/function-crd.md b/docs/reference/crd-config/function-crd.md index 3dd89bf..a1c676b 100644 --- a/docs/reference/crd-config/function-crd.md +++ b/docs/reference/crd-config/function-crd.md @@ -71,6 +71,9 @@ This table lists available Function runtime runner images. | Java runner | The Java runner is based on the base runner and contains the Java function instance to run Java functions or connectors. The `streamnative/pulsar-functions-pulsarctl-java-runner`(`streamnative/pulsar-functions-java-runner` will be deprecated) Java runner is stored at the [Docker Hub](https://hub.docker.com/r/streamnative/pulsar-functions-pulsarctl-java-runner) and is automatically updated to align with Apache Pulsar release. | Python runner | The Python runner is based on the base runner and contains the Python function instance to run Python functions. You can build your own Python runner to customize Python dependencies. The `streamnative/pulsar-functions-pulsarctl-python-runner`(`streamnative/pulsar-functions-python-runner` will be deprecated) Python runner is located at the [Docker Hub](https://hub.docker.com/r/streamnative/pulsar-functions-pulsarctl-python-runner) and is automatically updated to align with Apache Pulsar release. | Golang runner | The Golang runner provides all the tool-chains and dependencies required to run Golang functions. The `streamnative/pulsar-functions-pulsarctl-go-runner`(`streamnative/pulsar-functions-go-runner` will be deprecated) Golang runner is located at the [Docker Hub](https://hub.docker.com/r/streamnative/pulsar-functions-pulsarctl-go-runner) and is automatically updated to align with Apache Pulsar release. 
+| Generic Python runner | A Python function runner built on top of the generic base runner image. It is hosted [here](https://hub.docker.com/r/streamnative/pulsar-functions-generic-python-runner). +| Generic Node runner | A Node.js function runner built on top of the generic base runner image. It is hosted [here](https://hub.docker.com/r/streamnative/pulsar-functions-generic-node-runner). +| Generic base runner | If you do not want to build your function on a specific version of Pulsar, this base image is available for use. It is hosted [here](https://hub.docker.com/r/streamnative/pulsar-functions-generic-base-runner). ## Image pull policies @@ -87,7 +90,7 @@ When the Function Mesh Operator creates a container, it uses the `imagePullPolic Function Mesh provides Pulsar cluster configurations in the Function, Source, and Sink CRDs. You can configure TLS encryption, TLS authentication, and OAuth2 authentication using the following configurations. > **Note** -> +> > The `tlsConfig` and `tlsSecret` are exclusive. If you configure TLS configurations, the TLS Secret will not take effect. @@ -216,7 +219,7 @@ The output topics of a Pulsar Function. This table lists options available for t |Name | Description | | --- | --- | -| `topics` | The output topic of a Pulsar Function (If none is specified, no output is written). | +| `topics` | The output topic of a Pulsar Function (If none is specified, no output is written). | | `sinkSerdeClassName` | The map of output topics to SerDe class names (as a JSON string). | | `sinkSchemaType` | The built-in schema type or custom schema class name to be used for messages sent by the function.| | `producerConf` | The producer specifications. Available options:
- `batchBuilder`: The type of batch construction method. Support the key-based batcher.
- `compressionType`: the message data compression type used by a producer. Available options are `LZ4`, `NONE`, `ZLIB`, `ZSTD`, and `SNAPPY`. By default, it is set to `LZ4`. This option is only available for the runner image v3.0.0 or above.
- `cryptoConfig`: the cryptography configurations of the producer.
- `maxPendingMessages`: the maximum number of pending messages.
- `maxPendingMessagesAcrossPartitions`: the maximum number of pending messages across all partitions.
 - `useThreadLocalProducers`: configure whether the producer uses a thread. | @@ -268,7 +271,25 @@ Then, in the Pulsar Functions and Connectors, you can call `context.getSecret("u ## Packages -Function Mesh supports running Pulsar Functions in Java, Python and Go. This table lists fields available for running Pulsar Functions in different languages. +Function Mesh supports running Pulsar Functions in Java, Python, Go and a generic runtime. This table lists fields available for running Pulsar Functions in different languages. + +The fields for each runtime are nested under a key named after that runtime in the CRD. + +For example, the fields for a Java runtime are nested under the `java` key. + +``` +java: + extraDependenciesDir: /pulsar/lib + jar: /tmp/api-examples.jar + jarLocation: function://public/default/test + log: + javaLog4JConfigFileType: yaml + logConfig: + key: log4j2-function.yaml + name: new-metrics-test-broker-log4j-config +``` + +The key for the generic runtime is `genericRuntime`. | Field | Description | | --- | --- | @@ -277,6 +298,9 @@ Function Mesh supports running Pulsar Functions in Java, Python and Go. This tab | `goLocation` | The path to the JAR file for the function. It is only available for Pulsar functions written in Go.| | `pyLocation` | The path to the JAR file for the function. It is only available for Pulsar functions written in Python.| | `extraDependenciesDir` | It specifies the dependent directory for the JAR package. | +| `functionFile` | The location of the executable to run. | +| `functionFileLocation` | The location of the function on the filesystem. | +| `language` | The programming language used for the function. Currently supports `nodejs`, `python`, `executable`, and `wasm`. | ## Runtime logs @@ -364,7 +388,7 @@ spec: periodSeconds: 10 # --- [3] successThreshold: 1 # --- [4] -... +... 
# Other configs ``` @@ -394,7 +418,7 @@ Apart from the `PodSecurityContext`, Function Mesh also applies the following `S ```yaml SecurityContext: capabilities: - drop: + drop: - ALL allowPrivilegeEscalation: false ``` From 4f76092fb873bae284abea41b9c654790d87bc7c Mon Sep 17 00:00:00 2001 From: Benjamin Nelson Date: Thu, 16 May 2024 15:57:49 -0500 Subject: [PATCH 2/3] Add documentation around java memory limits --- docs/reference/crd-config/function-crd.md | 3 +++ docs/reference/crd-config/sink-crd-config.md | 9 ++++++--- docs/reference/crd-config/source-crd-config.md | 11 +++++++---- 3 files changed, 16 insertions(+), 7 deletions(-) diff --git a/docs/reference/crd-config/function-crd.md b/docs/reference/crd-config/function-crd.md index a1c676b..4707dde 100644 --- a/docs/reference/crd-config/function-crd.md +++ b/docs/reference/crd-config/function-crd.md @@ -231,6 +231,9 @@ When you specify a function or connector, you can optionally specify how much of If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a pod to use more resources than its `request` for that resource specifies. However, a pod is not allowed to use more than its resource `limit`. +If the CPU and memory are the same and the function is a Java function then we only allow 90% of the memory available from the request for the JVM size to prevent +out of memory errors. + ## Secrets Function Mesh provides the `secretsMap` field for Function, Source, and Sink in the CRD definition. You can refer to the created secrets under the same namespace and the controller can include those referred secrets. The secrets are provide by `EnvironmentBasedSecretsProvider`, which can be used by `context.getSecret()` in Pulsar functions and connectors. 
diff --git a/docs/reference/crd-config/sink-crd-config.md b/docs/reference/crd-config/sink-crd-config.md index 1a71adf..1033e6c 100644 --- a/docs/reference/crd-config/sink-crd-config.md +++ b/docs/reference/crd-config/sink-crd-config.md @@ -82,7 +82,7 @@ When the Function Mesh Operator creates a container, it uses the `imagePullPolic Function Mesh provides Pulsar cluster configurations in the Function, Source, and Sink CRDs. You can configure TLS encryption, TLS authentication, and OAuth2 authentication using the following configurations. > **Note** -> +> > The `tlsConfig` and `tlsSecret` are exclusive. If you configure TLS configurations, the TLS Secret will not take effect.
@@ -195,6 +195,9 @@ When you specify a function or connector, you can optionally specify how much of If the node where a Pod is running has enough of a resource available, it is possible (and allowed) for a Pod to use more resources than its `request` for that resource. However, a Pod is not allowed to use more than its resource `limit`. +If the CPU and memory are the same and the function is a Java function then we only allow 90% of the memory available from the request for the JVM size to prevent +out of memory errors. + ## Secrets Function Mesh provides the `secretsMap` field for Function, Source, and Sink in the CRD definition. You can refer to the created secrets under the same namespace and the controller can include those referred secrets. The secrets are provide by `EnvironmentBasedSecretsProvider`, which can be used by `context.getSecret()` in Pulsar functions and connectors. @@ -280,7 +283,7 @@ spec: initialDelaySeconds: 10 # --- [2] periodSeconds: 10 # --- [3] successThreshold: 1 # --- [4] -... +... # Other configs ``` @@ -310,7 +313,7 @@ Apart from the `PodSecurityContext`, Function Mesh also applies the following `S ```yaml SecurityContext: capabilities: - drop: + drop: - ALL allowPrivilegeEscalation: false ``` diff --git a/docs/reference/crd-config/source-crd-config.md b/docs/reference/crd-config/source-crd-config.md index 8203cab..43958d0 100644 --- a/docs/reference/crd-config/source-crd-config.md +++ b/docs/reference/crd-config/source-crd-config.md @@ -75,7 +75,7 @@ When the Function Mesh Operator creates a container, it uses the `imagePullPolic Function Mesh provides Pulsar cluster configurations in the Function, Source, and Sink CRDs. You can configure TLS encryption, TLS authentication, and OAuth2 authentication using the following configurations. > **Note** -> +> > The `tlsConfig` and `tlsSecret` are exclusive. If you configure TLS configurations, the TLS Secret will not take effect.
@@ -177,7 +177,7 @@ The output topics of a Pulsar Function. This table lists options available for t |Name | Description | | --- | --- | -| `topics` | The output topic of a Pulsar Function (If none is specified, no output is written). | +| `topics` | The output topic of a Pulsar Function (If none is specified, no output is written). | | `sinkSerdeClassName` | The map of output topics to SerDe class names (as a JSON string). | | `sinkSchemaType` | The built-in schema type or custom schema class name to be used for messages sent by the function.| | `producerConf` | The producer specifications. Available options:
- `batchBuilder`: The type of batch construction method. Support the key-based batcher.
- `compressionType`: the message data compression type used by a producer. Available options are `LZ4`, `NONE`, `ZLIB`, `ZSTD`, and `SNAPPY`. By default, it is set to `LZ4`. This option is only available for the runner image v3.0.0 or above.
- `cryptoConfig`: the cryptography configurations of the producer.
- `maxPendingMessages`: the maximum number of pending messages.
- `maxPendingMessagesAcrossPartitions`: the maximum number of pending messages across all partitions.
- `useThreadLocalProducers`: configure whether the producer uses a thread. | @@ -189,6 +189,9 @@ When you specify a function or connector, you can optionally specify how much of If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a Pod to use more resources than its `request` for that resource. However, a Pod is not allowed to use more than its resource `limit`. +If the CPU and memory are the same and the function is a Java function then we only allow 90% of the memory available from the request for the JVM size to prevent +out of memory errors. + ## Secrets Function Mesh provides the `secretsMap` field for Function, Source, and Sink in the CRD definition. You can refer to the created secrets under the same namespace and the controller can include those referred secrets. The secrets are provide by `EnvironmentBasedSecretsProvider`, which can be used by `context.getSecret()` in Pulsar functions and connectors. @@ -274,7 +277,7 @@ spec: initialDelaySeconds: 10 # --- [2] periodSeconds: 10 # --- [3] successThreshold: 1 # --- [4] -... +... 
# Other configs ``` @@ -304,7 +307,7 @@ Apart from the `PodSecurityContext`, Function Mesh also applies the following `S ```yaml SecurityContext: capabilities: - drop: + drop: - ALL allowPrivilegeEscalation: false ``` From ff46964ef29a2fba3cfac872756d6df493c4e75f Mon Sep 17 00:00:00 2001 From: Benjamin Nelson Date: Thu, 16 May 2024 16:06:11 -0500 Subject: [PATCH 3/3] [FM v0.19.0]--Release docs for FM v0.19.0 --- docs/reference/crd-config/function-crd.md | 3 +-- docs/reference/crd-config/sink-crd-config.md | 3 +-- docs/reference/crd-config/source-crd-config.md | 3 +-- 3 files changed, 3 insertions(+), 6 deletions(-) diff --git a/docs/reference/crd-config/function-crd.md b/docs/reference/crd-config/function-crd.md index 4707dde..56ce760 100644 --- a/docs/reference/crd-config/function-crd.md +++ b/docs/reference/crd-config/function-crd.md @@ -231,8 +231,7 @@ When you specify a function or connector, you can optionally specify how much of If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a pod to use more resources than its `request` for that resource specifies. However, a pod is not allowed to use more than its resource `limit`. -If the CPU and memory are the same and the function is a Java function then we only allow 90% of the memory available from the request for the JVM size to prevent -out of memory errors. +For Java functions, the JVM's maximum memory usage is capped at 40% for the heap, 40% for direct memory, and 20% for miscellaneous use. 
## Secrets diff --git a/docs/reference/crd-config/sink-crd-config.md b/docs/reference/crd-config/sink-crd-config.md index 1033e6c..d7f2e9e 100644 --- a/docs/reference/crd-config/sink-crd-config.md +++ b/docs/reference/crd-config/sink-crd-config.md @@ -195,8 +195,7 @@ When you specify a function or connector, you can optionally specify how much of If the node where a Pod is running has enough of a resource available, it is possible (and allowed) for a Pod to use more resources than its `request` for that resource. However, a Pod is not allowed to use more than its resource `limit`. -If the CPU and memory are the same and the function is a Java function then we only allow 90% of the memory available from the request for the JVM size to prevent -out of memory errors. +For Java functions, the JVM's maximum memory usage is capped at 40% for the heap, 40% for direct memory, and 20% for miscellaneous use. ## Secrets diff --git a/docs/reference/crd-config/source-crd-config.md b/docs/reference/crd-config/source-crd-config.md index 43958d0..4e8f176 100644 --- a/docs/reference/crd-config/source-crd-config.md +++ b/docs/reference/crd-config/source-crd-config.md @@ -189,8 +189,7 @@ When you specify a function or connector, you can optionally specify how much of If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a Pod to use more resources than its `request` for that resource. However, a Pod is not allowed to use more than its resource `limit`. -If the CPU and memory are the same and the function is a Java function then we only allow 90% of the memory available from the request for the JVM size to prevent -out of memory errors. +For Java functions, the JVM's maximum memory usage is capped at 40% for the heap, 40% for direct memory, and 20% for miscellaneous use. ## Secrets