Skip to content

Conversation

@dinocosta
Copy link
Contributor

This Pull Request attempts to update the grammar in order to fix the issue reported in zed-industries/zed#42091 . After pairing with @maxbrunsfeld on this, we found that, even though (( was used as a valid start to an arithmetic expansion, that does not appear to be the case, according to bash's manual.

As such, this Pull Request attempts to introduce the following changes:

  • Remove (( as a valid start to an arithmetic expansion, only $(( is now allowed
  • Add (( [expression, ]* expression )) as a valid compound statement following bash's manual

Known Failures

Although there's additions to the script/known-failures.txt file, I've confirmed that these are only here because new files have been added to the repositories used as example, as when running ./script/parse-examples the exact same known failures are added. As such, I believe we can assume that these grammar changes do not introduce new known failures, and remove some.

Formatting

Even though tests have been updated according to these new changes, there's also some that have been automatically re-formatted when running tree-sitter test --update. Let me know if you want me to try and revert some of these, manually updating the tests that need to reflect the new grammar.

No changes for now, we only re-formatted tests to make it easier to diff
them when we actually do changes.
Update the `script/known-failures.txt` document with the latest grammar
in `master`, to make it easier to check what our changes are impacting.
With the introduced changes, some of the examples were failing with a
specific pattern, namely the files:

- `examples/wild-corpus/esoteric/lishp/evaluator.sh`
- `examples/wild-corpus/esoteric/lishp/test.sh`
- `examples/wild-corpus/esoteric/lishp/variables.map.sh`
- `examples/wild-corpus/esoteric/lishp/variables.sh`

When taking a closer look it was noted that the following program was
failing to parse:

```
(( size=1, max_index=10 )
```

After taking a closer look at bash's manual, it seems this should be
possible and it should be considered a compound command instead. As
such, this commit moves this pattern from `test_command` to
`compound_statement`, updating it so that we can have multiple
expressions separated by a comma.

Note that these changes will very very very likely break the existing
tests, as we recently updated some to expect `test_command`. However,
for the time being, I'm keeping those failures since I'd like to have
someone else take another look at the changes I'm introducing with this
commit to see if they make sense.
This commit reverts the changes that had been done to `subscript` in
order to fix failing examples for the following files:

- `examples/gentoo/dev-lang/mlton/mlton-20180207.ebuild`
- `examples/gentoo/eclass/cuda.eclass`
- `examples/wild-corpus/cloud/kubernetes/cluster/lib/logging.sh`
- `examples/wild-corpus/cloud/kubernetes/hack/update-godep-licenses.sh`
- `examples/wild-corpus/git/t/t9902-completion.sh`
- `examples/wild-corpus/shell/bashdb/test/unit/test-columns.sh`
- `examples/wild-corpus/shell/bashdb/lib/frame.sh`
- `examples/gentoo/dev-qt/qt-docs/qt-docs-6.10.0_p202510021201.ebuild`
- `examples/gentoo/dev-qt/qt-docs/qt-docs-6.9.3_p202509261208.ebuild`
- `examples/gentoo/eclass/linux-mod-r1.eclass`

Namely, with the introduced changes, the following program could not be parsed:

```
echo "${array[${index}+1]}"
```

Even though I've confirmed this is valid bash syntax and, if the
variables were set, this would actually be evaluated to the value in
that specific index of the array.

Once again, not touching tests right now as I want someone to
double-check that my changes make sense.
Add the updated tests for both `test/corpus/literals.txt` and
`test/corpus/statements.txt`, while leaving the failure in
`test/corpus/commands.txt`, as the two first simply updated the
`test_command` node to `compound_statement`, which is expected.

The latter broke the binary expression in a parenthesized expression in
a subscript, which is simply returning word instead of actually parsing
the binary or unary expression, still need to figure that out.
Replace `parenthesized_expression` for `compound_statement` in
`subscript`, as that fixes the issue with patterns like
`${array[(($number+1))]}` where `(($number+1))` was being parsed as a
`(parenthesized_expression(word))`.

This removes even more known failures, while introducing a new single
on, which needs to be looked into.
Not sure if we should add `subshell` but, since the last commit removed
`parenthesized_expression`, we now need another one to allow `( ... )`
in subscript.
dinocosta and others added 2 commits December 2, 2025 16:45
@maxbrunsfeld
Copy link
Contributor

We disabled the node bindings test because it does not seem to work currently.

cc @ObserverOfTime - maybe the parser-test-action needs to be updated for the current version of Node/NPM. It doesn't seem like peer dependencies (tree-sitter) are installed now.

@maxbrunsfeld maxbrunsfeld merged commit 5d8a332 into tree-sitter:master Dec 2, 2025
4 checks passed
@ObserverOfTime
Copy link
Member

No, the bindings need to be updated.

dinocosta added a commit to zed-industries/zed that referenced this pull request Dec 2, 2025
With the merging and publishing of
tree-sitter/tree-sitter-bash#311 , we can now go
ahead and update the version of `tree-sitter-bash` that Zed relies on to
the latest version.

Closes #42091

Release Notes:

- Improved grammar for "Shell Script"
oscarvarto pushed a commit to oscarvarto/zed that referenced this pull request Dec 3, 2025
With the merging and publishing of
tree-sitter/tree-sitter-bash#311 , we can now go
ahead and update the version of `tree-sitter-bash` that Zed relies on to
the latest version.

Closes zed-industries#42091

Release Notes:

- Improved grammar for "Shell Script"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants