Skip to content

add an option to generate a partial ast definition #1854

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ydaveluy
Copy link
Contributor

@ydaveluy ydaveluy commented Mar 18, 2025

This PR adds an option in the langium-cli configuration to generate the ast with optional properties except for boolean and array types which are respectively initialized with false and [].

When optionalProperties is undefined or false the behavior is unchanged: the generated ast interfaces are as it would be expected when the document is successfully parsed (no recovered token).

When optionalProperties is true the generated ast interfaces match with the output generated by the parser (including invalid inputs which are parsed with recovered tokens generated by the parser): all properties are optional except for boolean and arrays.

edit:
The generated/ast.ts file is relevant when the input document is valid (i.e., no missing tokens) and can be safely used for code generation or any other tasks that require a valid document.

For scoping, validation, and most LSP services, relying on generated/ast.ts is very error-prone because the input document may be incomplete during editing. The parser automatically generates some recovered tokens to parse the document as much as possible. As a result, some mandatory properties in the AST may be undefined.

This PR introduces a generated/ast-partial.ts type definition that can be used as a drop-in replacement for generated/ast.ts. The same types are exported, except that properties that may be undefined after parsing are marked as optional.

As an example the arithmentic language is buggy due to false assumptions on the parsed AST:

Module basicMath
def b: 3;
b % ; 

=> An error occurred during validation: Cannot read properties of undefined (reading '$document')

@ydaveluy ydaveluy force-pushed the option_for_optional_ast_properties branch from 1577c65 to 0e954b8 Compare March 19, 2025 17:20
@ydaveluy ydaveluy marked this pull request as ready for review March 19, 2025 17:23
@ydaveluy
Copy link
Contributor Author

Hi @msujew , I have some doubts about the usability of this feature. It will generate an AST that will be good for scoping and validation but annoying for code generation or when it is assumed that the AST is valid.

So why not generating 2 ast.ts files : keeping the current ast.ts as is and generating an ast-parsial.ts with the same types but with optional properties where required.

With such design it would be possible to import types from ast-partial.ts for scoping and validation and to use ast.ts for generation.

@spoenemann
Copy link
Contributor

spoenemann commented Mar 26, 2025

I have some doubts about the usability of this feature. It will generate an AST that will be good for scoping and validation but annoying for code generation or when it is assumed that the AST is valid.

👍 yes, that's exactly why we decided not to make properties optional.

I assume it's possible to achieve what you want (all properties optional except for array and boolean) using a TypeScript wrapper type (a more sophisticated version of Partial). Have you tried that?

@msujew
Copy link
Member

msujew commented Mar 26, 2025

@spoenemann See the discussion over in #1848 (comment).

@ydaveluy I don't think we should generate 2 ASTs, as that has no benefit compared to the "normal" AST + an optional wrapper type.

@ydaveluy
Copy link
Contributor Author

ydaveluy commented Mar 26, 2025

@spoenemann Yes you're right, a more sophisticated version of DeppPartial works well;

it makes properties optional except if it starts with $ or is a boolean or an array:

export type DeepOptional<T> = T extends Reference<infer U>
  ? Reference<DeepOptional<U>>
  : T extends AstNode
  ? {
      [K in keyof T as K extends `$${string}`
        ? K
        : T[K] extends boolean
        ? K
        : never]: T[K];
    } & {
      [K in keyof T as K extends `$${string}`
        ? never
        : T[K] extends boolean
        ? never
        : K]: T[K] extends Array<infer U>
        ? Array<DeepOptional<U>>
        : DeepOptional<T[K]> | undefined;
    }
  : T;

@spoenemann
Copy link
Contributor

Nice! We could add that to our code base instead of this PR, or WDYT?

However, we'd need a name that better reflects its intent – DeepOptional is almost the same as DeepPartial.

@ydaveluy
Copy link
Contributor Author

ydaveluy commented Mar 26, 2025

@spoenemann Yes I think it could be usefull to add it to Langium. For the new name, no idea :-)

@msujew According to me it is still very error prone to rely only on DeepOptional in the validator or scoping. It is very easy to forget to add DeepOptional everywhere when the ast contains lots of types. Moreover, when using ast.isMyType(element) the element is inferred as MyType which does not provides the optional properties as DeepOptional<MyType>.

I wrote this small script that generate ast-partial.ts from ast.ts, each type is mapped with DeepOptional, functions like isMyType are redefined to return a DeepOptional type and constants are re-exported. This way the ast-partial.ts can be used like the ast.ts:

import * as fs from 'fs'

const INPUT_FILE = "./src/language/generated/ast.ts";
const OUTPUT_FILE = "./src/language/generated/ast-partial.ts";

const content = fs.readFileSync(INPUT_FILE, "utf-8");

const typeRegex = /export\s+(?:interface|type)\s+(\w+)/g;
const functionRegex = /export\s+function\s+(is\w+)\s*\(/g;
const constRegex = /export\s+const\s+(\w+)\s*=/g;

const types = [...content.matchAll(typeRegex)].map(match => match[1]);

const functions = [...content.matchAll(functionRegex)].map(match => match[1]);

const constants = [...content.matchAll(constRegex)].map(match => match[1]);

const output = `import * as base from './ast.js';
import type { AstNode } from 'langium';

type DeepOptional<T> = T extends AstNode
  ? {
      [K in keyof T as K extends \`$\${string}\`
        ? K
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        : T[K] extends boolean | any[]
        ? K
        : never]: T[K];
    } & {
      [K in keyof T as K extends \`$\${string}\`
        ? never
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        : T[K] extends boolean | any[]
        ? never
        : K]?: T[K] extends object ? DeepOptional<T[K]> : T[K];
    }
  : T;

${types.map(type => `export type ${type} = DeepOptional<base.${type}>;`).join("\n")}

${functions.map(fn => `export function ${fn}(item: unknown): item is ${fn.slice(2)} {\n    return base.${fn}(item);\n}`).join("\n")}

${constants.map(constName => `export const ${constName} = base.${constName};`).join("\n")}

`;

fs.writeFileSync(OUTPUT_FILE, output, "utf-8");

Using ast-partial.ts instead of ast.ts helped me to detect many access on potentially undefined properties in my validators.

@ydaveluy ydaveluy marked this pull request as draft March 27, 2025 06:58
@ydaveluy ydaveluy force-pushed the option_for_optional_ast_properties branch from 2771f02 to 5321655 Compare March 28, 2025 08:56
@ydaveluy ydaveluy changed the title add an option to generate optional properties in ast add an option to generate a partial ast definition Mar 28, 2025
@ydaveluy ydaveluy force-pushed the option_for_optional_ast_properties branch from 5321655 to 65fcdbe Compare March 28, 2025 09:25
@ydaveluy ydaveluy force-pushed the option_for_optional_ast_properties branch from 65fcdbe to 76b576e Compare March 28, 2025 09:30
@ydaveluy ydaveluy marked this pull request as ready for review March 28, 2025 09:34
@spoenemann
Copy link
Contributor

In my experience working with TypeScript, there are often situations where we need to make a compromise between full type safety and complexity of the solution. I would not always regard the most type-safe solution as the best one – it depends on the effort required to get there and the amount of additional complexity introduced.

In this case, I lean towards keeping things simpler rather than generating even more type definitions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants