Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add array length function, closes #3308 #3317

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

peternewman
Copy link
Contributor

Should this be targeted at main or develop?

Happy to hear other name suggestions, I've mirrored strlen for now.

I did wonder should I add push/pop at the same time (and possibly others...)

@dnmeid
Copy link
Member

dnmeid commented Feb 23, 2025

Here are my thoughts:
I think the name of strlen() is dated back to the time when variables had always and ever been strings and was chosen to make that more clear. I would keep strlen for legacy reasons and add a new generic function length(). That function length should work with strings (number of unicode graphemes), arrays (number of elements), JSON (number of properties). It could also work for integer numbers by returning the length of the string representation. We can then extend this if we introduce e.g. enums in the future.

As it is a feature that I think will not need super heaviy testing it should be targeted at main.

Feel free to add more useful functions.

Please write a little less minimalistic description in the getting started. Especially examples are very useful and the description needs to be suitable for non programmers. If you say a function "lenght" gives the lenght, that is not super useful. You need to explain more basic, like "count of elements".

@dnmeid
Copy link
Member

dnmeid commented Feb 26, 2025

@peternewman still there?

@peternewman
Copy link
Contributor Author

Here are my thoughts: I think the name of strlen() is dated back to the time when variables had always and ever been strings and was chosen to make that more clear. I would keep strlen for legacy reasons and add a new generic function length(). That function length should work with strings (number of unicode graphemes), arrays (number of elements), JSON (number of properties). It could also work for integer numbers by returning the length of the string representation. We can then extend this if we introduce e.g. enums in the future.

That sounds very sensible, I'll pivot to that.

As it is a feature that I think will not need super heaviy testing it should be targeted at main.

Great, I'll leave this targeted as is then.

Feel free to add more useful functions.

I realised pop would need to return both the modified array (or modify it in place) and the item that was popped, I'm not aware of other functions working like that currently. Obviously push would be pretty easy.

Please write a little less minimalistic description in the getting started. Especially examples are very useful and the description needs to be suitable for non programmers. If you say a function "lenght" gives the lenght, that is not super useful. You need to explain more basic, like "count of elements".

Sorry, lazy copy/paste of the strlen thing.

@peternewman still there?

Yeah sorry, just competing open source stuff and other commitments.

expect(ExpressionFunctions.length()).toBe(0)
expect(ExpressionFunctions.length('')).toBe(0)
expect(ExpressionFunctions.length('a')).toBe(1)
expect(ExpressionFunctions.length('abc')).toBe(3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect(ExpressionFunctions.length('ä')).toBe(1) // codepoint U+00E4, one grapheme
expect(ExpressionFunctions.length('̈a')).toBe(1) // codepoints U+0308 U+0061, one grapheme
expect(ExpressionFunctions.length('́̈a')).toBe(1) // codepoints U+0301 U+0308 U+0061, one grapheme

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this is pretty hard to do entirely properly, but do you think of these is "good enough"?
https://mathiasbynens.be/notes/javascript-unicode

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually node.js has the Intl.segmenter() since version 16 but that is not working 100% with combining marks.
https://www.npmjs.com/package/unicode-segmenter should do the trick and is even faster than the built in Intl.segmenter()

len = Object.keys(v).length
} else if (typeof v === 'number') {
len = (v + '').length
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all uncatched types have a lenght property, e.g. BIGINT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added BigInt. What others are you aware (RegExp maybe) of/what do you think I should do in the default case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added RegExp too now...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think string should have its own case and then it is not part of the unknown rest any more. The unknown rest could maybe return NaN.
From the JS perspective there are also iterators and generators that are interesting for length() but currently we don't use them in variables, so if you want to implement them it would be a bonus for possible future use. Also we don't use regex data type currently and I'm not sure about the value of knowing the length of a regex.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants