Address review comments
jaredoconnell committed Feb 5, 2024
1 parent 2a22527 commit b2e7427
Showing 2 changed files with 96 additions and 63 deletions.
12 changes: 6 additions & 6 deletions docs/arcaflow/contributing/expressions.md
@@ -14,26 +14,26 @@ Let's say you have an expression `foo.bar`. The dot notation node is the dot in

### Map accessor

Map accessors are expressions in the form of `foo[bar]`. The left subtree will be the expression to the left, while the right subtree will be the tree representing subexpression within the brackets.
Map accessors are expressions in the form of `foo[bar]`. The left subtree will represent the expression to the left of the brackets (`foo` in the example), while the right subtree will represent the subexpression within the brackets (`bar` in the example).

### Binary Operations

Binary operations include all of the operations that have a left and right component.
Binary operations include all operations that have a left and right subtree and do not have a special node representing them (dot notation is an example of such a special case).
They are represented as a node that has a left and right subtree, and an operator that describes which binary operation type is being applied.


### Unary Operations

Unary operations include logical complement `!` and negation `-`.
They are represented as a node that has one child (the tree it's applied to), and the operator being applied to the child node.
Unary operations include boolean complement `!` and numeric negation `-`.
Unary operations are represented as a node that has one child node (the tree it's applied to) and one operator that describes the operation being applied to the child node.


### Identifiers

Identifiers come in two forms:

1. `$` references the root of the data structure.
2. Any other value accesses object fields.
2. A plain string identifier from a token matching the regular expression `^\w+$`.
a. This may be used for accessing object fields or as function identifiers.
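
The node shapes described in this section can be pictured with a minimal Python sketch. The class and field names below are hypothetical and chosen only for illustration; they are not taken from the Arcaflow parser, and the left/right layout of the dot notation node is an assumption based on the map accessor description.

```python
import re
from dataclasses import dataclass
from typing import Union

# Equivalent to the documented pattern ^\w+$ when used with fullmatch.
IDENTIFIER_PATTERN = re.compile(r"\w+")


@dataclass
class Identifier:
    name: str  # "$" for the root, otherwise a plain identifier token

    def __post_init__(self) -> None:
        if self.name != "$" and not IDENTIFIER_PATTERN.fullmatch(self.name):
            raise ValueError(f"invalid identifier: {self.name!r}")


@dataclass
class DotNotation:
    left: "Node"   # assumed: expression before the dot
    right: "Node"  # assumed: identifier after the dot


@dataclass
class MapAccessor:
    left: "Node"   # expression to the left of the brackets
    right: "Node"  # subexpression within the brackets


@dataclass
class BinaryOperation:
    left: "Node"
    right: "Node"
    operator: str  # which binary operation is applied, e.g. "+"


@dataclass
class UnaryOperation:
    child: "Node"  # the tree the operator is applied to
    operator: str  # "!" or "-"


Node = Union[Identifier, DotNotation, MapAccessor, BinaryOperation, UnaryOperation]

# One possible tree for the expression $.foo[bar] under these assumptions.
tree = MapAccessor(
    left=DotNotation(left=Identifier("$"), right=Identifier("foo")),
    right=Identifier("bar"),
)
```

In this sketch the identifier check runs at construction time, so a token such as `a-b` is rejected before it can appear in a tree.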

## The API layer

147 changes: 90 additions & 57 deletions docs/arcaflow/workflows/expressions.md
@@ -1,7 +1,6 @@
# Arcaflow expressions


Arcaflow expressions were inspired by JSONPath, but have diverged from the syntax. You can use expressions in a workflow YAML like this:
Arcaflow expressions were inspired by JSONPath but have diverged from the syntax. You can use expressions in a workflow YAML like this:

```yaml
some_value: !expr $.your.expression.here
@@ -14,44 +13,60 @@ This page explains the language elements of expressions.

## Literals

Literals are a very important part of the language. It's required to access any value that's known.
Literals represent values described in an expression, as opposed to values referenced from other sources.

### string
### String

String literals start and end with a matched pair of either single quotes `'` or double quotes `"`, and have zero or more characters between the quotes.

Strings may have escaped values. The most important ones are for backslashes (`\\` for `\`), or for new lines `\n`.

For example, to have this in a string:
Strings may have escaped values. The most important ones are for backslashes (`\\` for `\`) or for embedded newlines `\n`.

Here is the list of supported escape sequences:

| Escape | Result |
| ------ | ------ |
| `\\` | `\` backslash character |
| `\t` | tab character |
| `\n` | newline character |
| `\r` | carriage return character |
| `\b` | backspace character |
| `\"` | `"` double quote character |
| `\'` | `'` single quote character |
| `\0` | null character |

For example, to have the following text represented in a single string:
> test
> test2/\

You would need the expression `"test\ntest2/\\"`.
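
To make the escape handling concrete, here is a minimal Python sketch that decodes the body of a string literal using the escape table above. The function name `unescape` and the decoding loop are assumptions for illustration; the engine's actual implementation may differ, including how it reports unsupported escapes.

```python
# Escape table copied from the documented list; everything else is assumed.
ESCAPES = {
    "\\": "\\",
    "t": "\t",
    "n": "\n",
    "r": "\r",
    "b": "\b",
    '"': '"',
    "'": "'",
    "0": "\0",
}


def unescape(body: str) -> str:
    """Decode the text found between the quotes of a string literal."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, None)  # character following the backslash
        if nxt not in ESCAPES:
            raise ValueError(f"unsupported escape: \\{nxt}")
        out.append(ESCAPES[nxt])
    return "".join(out)


# The raw string holds the literal body exactly as written in the expression.
decoded = unescape(r"test\ntest2/\\")
print(decoded)
```

The decoded value contains a real newline after `test` and ends with a single backslash, matching the two-line text shown above.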

### integer
### Integer

Intergers are non-decimal numbers. They may not start with `0`, unless the value is `0`. For example, `001` is not a valid integer literal.
Integers are whole non-negative base-10 numbers. They may not start with `0`, unless the value is `0`. For example, `001` is not a valid integer literal.

integer expressions may be made negative with a `-` before the number, as mentioned in the unary numbers section.
Examples:
- `0`
- `1`
- `503`

### float
Integer literals can be a part of an expression that can be made negative by prefixing them with the [negation operator `-`](#negation).
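
The leading-zero rule can be expressed as a small pattern. This Python sketch is an assumption derived from the rule stated above, not the tokenizer's actual grammar; `is_integer_literal` is a hypothetical helper.

```python
import re

# Either "0" on its own, or a non-zero digit followed by any digits.
INTEGER_LITERAL = re.compile(r"0|[1-9][0-9]*")


def is_integer_literal(token: str) -> bool:
    """Return True if the token is a valid integer literal under the stated rule."""
    return INTEGER_LITERAL.fullmatch(token) is not None
```

A token such as `-5` is not a literal under this pattern, because the `-` is a separate negation operator applied to the literal `5`.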

float literals are floating point double precision decimal numbers.
### Floating point numbers

Float literals are non-negative floating point double precision decimal numbers.

Supported formats include:
- number characters followed by a period followed by more number characters `1.1` (the typical format)
- number characters followed by a period `1.`
- base-10 exponential scientific notation formats like `5.0e5` and `5.0E-5` for large numbers
- number characters followed by a period followed by zero or more number characters: `1.1` or `1.`
- base-10 exponential scientific notation formats like `5.0e5` and `5.0E-5`

float expressions may be made negative with a `-` before the number, as mentioned in the unary numbers section.
Float literals can be a part of an expression that can be made negative by prefixing them with the [negation operator `-`](#negation).

### boolean
### Boolean

boolean literals have two valid values:
- `true`
- `false`

No other values are valid booleans. They are case sensitive.
No other values are valid boolean literals. The values are case sensitive.

## Root reference

@@ -70,7 +85,7 @@ $.foo.bar

## Dot notation

The dot notation allows you to dive into an object.
The dot notation allows you to reference fields of an object.

For example, if you have an object on the root data structure named "a" with the field "b" in it, you can access it with:

@@ -80,11 +95,14 @@ $.a.b

## Bracket accessor

The bracket accessor is used for acessing values whose specifics are not known until runtime. This includes maps or lists.
The bracket accessor is used for referencing values in maps or lists.


#### List access
For list access, you specify the index of the value you want to access. If you have a list named `foo` with one value, `"Hello world!"`, as shown:
For list access, you specify the index of the value you want to access.
Lists are zero-indexed (so the first value has an index of 0).

If you have a list named `foo`:

```yaml
foo:
@@ -96,32 +114,37 @@ You can access the first value with the expression:
$.foo[0]
```

Giving the output `"Hello world!"`

#### Map access

Maps, also known as dictionaries in some languages, are key-value pair data structures.

For map access in a bracket accessor expression, instead of an integer index, the value in the brackets must match the type of the map's keys.
To use a map in an expression, the expression to the left of the brackets must be a reference to a map. That is then followed by a pair of brackets with a sub-expression between them. That sub-expression must evaluate to a valid key in the map.

Here is an example of a map with key type string, and value type integer, inside the main data structure in a field called foo:
Here is an example of a map whose keys are strings and whose values are integers. The map is stored in a field called `foo` in the root-level object:

```yml
foo:
a: 1
b: 2
```

The value at the key `"b"` can be accessed with the expression:
Given the map shown above, the following expression would yield a value of `2`:
```JavaScript
$.foo["b"]
```
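
As a rough analogy only, the list and map lookups above behave like indexing into nested Python data. The variable names are hypothetical, and Python dictionaries and lists stand in for the workflow's root data structure.

```python
# Python stand-ins mirroring the list and map examples in this section.
root_with_list = {"foo": ["Hello world!"]}
root_with_map = {"foo": {"a": 1, "b": 2}}

first_item = root_with_list["foo"][0]   # analogous to $.foo[0]
key = "b"                               # the sub-expression inside the brackets
value_at_b = root_with_map["foo"][key]  # analogous to $.foo["b"]
```

The key does not have to be a literal; any sub-expression that evaluates to a valid key plays the role of `key` here.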

## Functions

A simpler alternative to steps for built-in helper operations.
Functions are built-in tasks with pre-defined behavior and a known input and output schema.

Functions are defined by the engine.

There are currently no built-in functions. Functions are being added later.
Functions:
TO BE DEFINED BEFORE MERGE.

The format for a function is the function's identifying name, followed by `(`, followed by 0 or more comma separated expressions, followed by `)`.
The syntax for a function has multiple parts. First, you have the function's identifying name, followed by `(`, followed by 0 or more comma-separated expressions, followed by `)`.

Example:
```JavaScript
@@ -141,8 +164,8 @@ The order of operations determines which operators run first. See [Order of Oper
| `/` | [Division](#division)|
| `%` | [Modulus](#modulus)|
| `^` | [Exponentiation](#exponentiation)|
| `==` | [Equals](#equals)|
| `!=` | [Not Equals](#not-equals)|
| `==` | [Equal To](#equal-to)|
| `!=` | [Not Equal To](#not-equal-to)|
| `>` | [Greater Than](#greater-than)|
| `<` | [Less Than](#less-than)|
| `>=` | [Greater Than or Equal To](#greater-than-or-equal-to)|
@@ -162,81 +185,90 @@ For example, the expression `"a" + "b"` would output the string `"ab"`.

##### Mathematical Addition

When the `+` operator is used with a numerical input, it adds them together.
When the `+` operator is used with numerical operands, it adds them together. The operator requires numerical operands with the same type. You cannot mix float and integer operands.
For example, the expression `2 + 2` would output the integer `4`.
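
The same-type rule for `+` can be sketched in Python as follows. The helper name `add` and the error message are assumptions; the sketch only illustrates that integers, floats, and strings are accepted when both operands share a type.

```python
def add(left, right):
    """Add two numbers or concatenate two strings of exactly the same type."""
    allowed = (int, float, str)
    # type() is used instead of isinstance() so that bool, which Python
    # treats as a subclass of int, is rejected.
    if type(left) is not type(right) or type(left) not in allowed:
        raise TypeError(
            f"cannot add {type(left).__name__} and {type(right).__name__}"
        )
    return left + right
```

Under this sketch `add(1, 1.0)` raises a `TypeError`, which mirrors the rule that float and integer operands cannot be mixed.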


### Subtraction

When the `-` operator is used with a numerical input, it subtracts them. The operator requires numerical input.
When the `-` operator is used with numerical operands, it subtracts the value of the right operand from the value of the left operand. The operator requires numerical operands with the same type. You cannot mix float and integer operands.

For example, the expression `6 - 4` would output the integer `2`.
The expression `$.a + $.b` would evaluate the values of `a` and `b` within the root, and add them together.
The expression `$.a - $.b` would evaluate the values of `a` and `b` within the root, and subtract the value of `$.b` from `$.a`.

### Multiplication

When the `*` operator is used with a numerical input, it multiplies them. The operator requires numerical input.
When the `*` operator is used with numerical operands, it multiplies them. The operator requires numerical operands with the same type.

For example, the expression `3 * 3` would output the integer `9`.

### Division

When the `/` operator is used with a numerical input, it divides them. The operator requires numerical input.
When the `/` operator is used with numerical operands, it divides the value of the left operand by the value of the right operand. The operator requires numerical operands with the same type.

The output type matches the input type. Integer division rounds the result down to an integer. If a non-whole number output is required, or if different rounding logic is required, convert the inputs into floating point numbers with TO BE ADDED BEFORE MERGE.

For example, the expression `3 / 3` would output the integer `1`.
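
The difference between integer and float division can be illustrated with a short Python sketch. The helper `divide` is hypothetical, and it is only meant for non-negative operands: Python's `//` rounds toward negative infinity, and the text above does not say how the engine rounds negative results.

```python
def divide(left, right):
    """Divide left by right, keeping the operand type for the result."""
    if type(left) is not type(right):
        raise TypeError("operands must have the same type")
    if type(left) is int:
        return left // right  # integer division discards the fractional part
    if type(left) is float:
        return left / right
    raise TypeError("operands must be numeric")
```

With integer operands `7` and `2` the sketch returns `3`, while the float operands `7.0` and `2.0` give `3.5`.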

### Modulus

When the `%` operator is used with a numerical input, it outputs the remainder of the division of the operands. The operator requires numerical input.
When the `%` operator is used with numerical operands, it outputs the remainder of dividing the left operand by the right operand. The operator requires numerical operands with the same type.

For example, the expression `2 % 3` would output the integer `2`.
For example, the expression `5 % 3` would output the integer `2`.

### Exponentiation

When the `^` operator is used with numerical input, it outputs the result of the left side raised to the power of the right side.
The `^` operator outputs the result of the left side raised to the power of the right side. The operator requires numerical operands with the same type.

The mathematical expression 2<sup>3</sup> is represented in the expression language as `2^3`, which would output the integer `8`.
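
Readers coming from Python should note that the symbols differ between the two languages. The following Python comparison is only an analogy: Python spells exponentiation as `**`, and its own `^` operator is a bitwise XOR.

```python
power = 2 ** 3  # Python equivalent of the expression 2^3
xor = 2 ^ 3     # Python's ^ is bitwise XOR, a different operation

print(power, xor)
```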

### Equals
### Equal To

The `==` equality operator checks for equality between the left and right type. It returns true when the left and right match.
The `==` operator checks for equality between the left and right operands. It returns `true` when the two values match. The type must be the same for both operands, so the expression `1 == 1.0` would fail.
The operator currently supports the types `integer`, `float`, `string`, and `boolean`. If another type is required, please create an issue with the expected behavior of the operator with the needed type.

For example, `2 == 2` results in `true`, and `"a" == "b"` results in `false`.

### Not Equals
### Not Equal To

The `!=` operator is the inverse of the [==](#equals ) operator. It returns true when the values do not match.
The `!=` operator is the inverse of the [`==`](#equal-to) operator. It returns `true` when the values do not match. The type must be the same for both operands, so the expression `1 != 1.0` would fail.
The operator currently supports the types `integer`, `float`, `string`, and `boolean`. If another type is required, please create an issue with the expected behavior of the operator with the needed type.

For example, `2 != 2` results in `false`, and `"a" != "b"` results in `true`.

### Greater Than

The `>` inequality operator outputs `true` if the left side is greater than the right side, and `false` otherwise. The operator requires numerical or string input.
The `>` operator outputs `true` if the left side is greater than the right side, and `false` otherwise. The operator requires numerical or string operands. The type must be the same for both operands.
String operands are compared using the lexicographical order of the charset.

For example, the expression `3 > 3` would output the boolean `false`, and `4 > 3` would output `true`.
For an integer example, the expression `3 > 3` would output the boolean `false`, and `4 > 3` would output `true`.
For a string example, the expression `"a" > "b"` would output `false`.

### Less Than

The `<` inequality operator outputs `true` if the left side is less than the right side, and `false` otherwise. The operator requires numerical or string input.
The `<` operator outputs `true` if the left side is less than the right side, and `false` otherwise. The operator requires numerical or string operands. The type must be the same for both operands.
String operands are compared using the lexicographical order of the charset.

For example, the expression `3 < 3` would output the boolean `false`, and `1 < 2` would output `true`.
For an integer example, the expression `3 < 3` would output the boolean `false`, and `1 < 2` would output `true`.
For a string example, the expression `"a" < "b"` would output `true`.
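
Lexicographical string comparison can produce results that surprise readers who expect numeric or case-insensitive ordering. The Python sketch below compares strings by Unicode code point, which is assumed here to approximate the "order of the charset" mentioned above; the engine's exact ordering is not confirmed by this page.

```python
# Character-by-character comparison, as Python does for str operands.
a_before_b = "a" < "b"           # "a" sorts before "b"
a_after_b = "a" > "b"
upper_before_lower = "B" < "a"   # uppercase letters have lower code points
digits_as_text = "10" < "9"      # compared as text: "1" sorts before "9"

print(a_before_b, a_after_b, upper_before_lower, digits_as_text)
```

If numeric ordering is needed, compare numeric values rather than their string forms.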

### Greater Than or Equal To

The `>=` inequality operator outputs `true` if the left side is greater than or equal to the right side, and `false` otherwise. The operator requires numerical or string input.
The `>=` operator outputs `true` if the left side is greater than or equal to (not less than) the right side, and `false` otherwise. The operator requires numerical or string operands. The type must be the same for both operands.
String operands are compared using the lexicographical order of the charset.

For example, the expression `3 >= 3` would output the boolean `true`, `3 >= 4` would output `false`, and `4 >= 3` would output `true`.
For an integer example, the expression `3 >= 3` would output the boolean `true`, `3 >= 4` would output `false`, and `4 >= 3` would output `true`.

### Less Than or Equal To

The `<=` inequality operator outputs `true` if the left side is less than or equal to the right side, and `false` otherwise. The operator requires numerical or string input.
The `<=` operator outputs `true` if the left side is less than or equal to (not greater than) the right side, and `false` otherwise. The operator requires numerical or string operands. The type must be the same for both operands.
String operands are compared using the lexicographical order of the charset.

For example, the expression `3 <= 3` would output the boolean `true`, `3 <= 4` would output `true`, and `4 <= 3` would output `false`.

### Logical AND

The `&&` operator returns `true` if both the left and right sides are `true`, and `false` otherwise. This operator requires boolean input.
The `&&` operator returns `true` if both the left and right sides are `true`, and `false` otherwise. This operator requires boolean operands.
Note: There is no short-circuiting as it's currently implemented. Both the left and right are evaluated before the comparison takes place.

All cases:
@@ -249,7 +281,7 @@ All cases:

### Logical OR

The `&&` operator returns `true` if either or both of the left and right sides are `true`, and `false` otherwise. It is a non-exclusive or operator. This operator requires boolean input.
The `||` operator returns `true` if **either or both** of the left and right sides are `true`, and `false` otherwise. This operator requires boolean operands.
Note: There is no short-circuiting as it's currently implemented. Both the left and right are evaluated before the comparison takes place.
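
The practical effect of the missing short-circuit can be shown with a Python sketch. The functions below are hypothetical stand-ins for the left and right sub-expressions; `logical_or` evaluates both operands before combining them, which is the behavior the note describes, while Python's own `or` is shown for contrast.

```python
calls = []


def left_side() -> bool:
    calls.append("left")
    return True


def right_side() -> bool:
    calls.append("right")
    return False


def logical_or(left: bool, right: bool) -> bool:
    # Both arguments are already evaluated by the time this body runs.
    return left or right


result = logical_or(left_side(), right_side())
eager_calls = list(calls)  # both sides were evaluated

calls.clear()
short_circuit_result = left_side() or right_side()
lazy_calls = list(calls)   # Python's `or` skipped the right side
```

An operand that fails when evaluated (for example, a reference to a missing field) cannot be guarded by placing a check on the other side of `&&` or `||`.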

All cases:
@@ -262,7 +294,7 @@ All cases:

## Unary Operations

These are operations that have one operator before the expressions.
Unary operations are operations that have one input. They are formatted as one operator to the left of the operand expression.

| Operator | Description |
| ---------|--------------------|
@@ -271,35 +303,36 @@ These are operations that have one operator before the expressions.

### Negation

The negation operator negates the value from the input expression.
The negation operator negates the numeric value from the input expression.
The required format is a dash `-` before the expression.

If the input is not numeric, there will be a type error.
This operation requires numeric input.

Examples with integer literals: `-5`, `- 5`
Example with a float literal: `-50.0`
Example with a reference: `-$.foo`
Example with parentheses and a sub-expression: `-(5 + 5)`

### Logical complement

The logical complement unary operator logically inverts the boolean input.
The required format is an exclamation point `!` before the expression.

If the input expression is not boolean, there will be a type error.
This operation requires boolean input.

Example with a boolean literal: `!true`
Example with a reference: `!$.foo`

## Parentheses

Parentheses are used to force precedence in the expression. They do not do anything implicitly (for example, there is no implied multiplication)
Parentheses are used to force precedence in the expression. They do not do anything implicitly (for example, there is no implied multiplication).

For example, the expression `5 + 5 * 5` evaluates the `5 * 5` before the `+`, resulting in `5 + 25`, and finally `30`.
If you want the 5 + 5 to be run first, you must use parentheses. That gives you the expression `(5 + 5) * 5`, resulting in `10 * 5`, and finally `50`
If you want the 5 + 5 to be run first, you must use parentheses. That gives you the expression `(5 + 5) * 5`, resulting in `10 * 5`, and finally `50`.
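
Parentheses change the shape of the expression tree, not just the result. As an analogy, the following sketch uses Python's own `ast` module (not the Arcaflow parser) to show which operator ends up at the root of the tree for each form; Python's precedence for `+` and `*` matches the rule described above.

```python
import ast

plain = ast.parse("5 + 5 * 5", mode="eval").body
grouped = ast.parse("(5 + 5) * 5", mode="eval").body

# Without parentheses the root is the addition, and the multiplication
# is its right subtree, so 5 * 5 is evaluated first.
plain_root = type(plain.op).__name__            # "Add"
plain_right = type(plain.right.op).__name__     # "Mult"

# With parentheses the root is the multiplication, and the addition
# is its left subtree, so 5 + 5 is evaluated first.
grouped_root = type(grouped.op).__name__        # "Mult"
grouped_left = type(grouped.left.op).__name__   # "Add"

plain_value = 5 + 5 * 5
grouped_value = (5 + 5) * 5
```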

## Order of Operations

The order of operations is designed to match most programming languages, and the rules of math.
The order of operations is designed to match mathematics and most programming languages.

Order (highest to lowest):
- [`- ` negation](#negation)