This is the text I originally wrote for the announcement in r/javascript on November 04, 2021 (link), right after I published version 0.5.1.
I decided I'd like to have it somewhere closer, and Discussions seems to be the perfect place, so I'm bringing it here with minor edits.
I was working on a project, and a part of it required parsing some chunks of text. (Edit: that was aspargvs.)
My initial approach was: "Ok, I can use a BNF grammar-based parser generator for that". And indeed, writing a nice grammar for what you're trying to parse is a good idea. (Edit: that's what I already did in parseley prior to that.)
Next challenge: now that I had all those chunks parsed, I had a higher-order task to handle. It might require some (state?) machine or perhaps a... parser combinator. I wasn't going to bring multiple parser dependencies into the project, so my plan was to throw together a few functions for the task. The idea is simple, and I wouldn't need many building blocks...
Then I realized the BNF-based parser generator I was going to use had some deal-breaking limitations and I would have to replace it. (Edit: I can't remember what exact limitations of nearley had me drop it.)
At that point I started looking at the available parser combinator packages, questioning how applicable they were to both of my problems (processing text, and processing a collection of arbitrary objects). It turns out all of them are made with text parsing as the goal. Some come with a lexer/tokenizer, but their implementations don't seem open and encouraging enough to use them in a less conventional way.
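To make that distinction concrete, here is a minimal sketch of what parser combinators over arbitrary items could look like - the types and names are illustrative only, not the actual API of my package:

```typescript
// A minimal sketch: parsers as plain functions over an array of arbitrary items.
// All names here are illustrative, not the API of any published package.

type Result<TValue> =
  | { matched: true; position: number; value: TValue }
  | { matched: false };

type Parser<TItem, TValue> = (items: TItem[], position: number) => Result<TValue>;

// Match a single item satisfying a predicate.
function satisfy<TItem>(test: (item: TItem) => boolean): Parser<TItem, TItem> {
  return (items, position) =>
    position < items.length && test(items[position])
      ? { matched: true, position: position + 1, value: items[position] }
      : { matched: false };
}

// Apply two parsers in sequence and combine their values.
function pair<TItem, A, B>(
  pa: Parser<TItem, A>,
  pb: Parser<TItem, B>
): Parser<TItem, [A, B]> {
  return (items, position) => {
    const ra = pa(items, position);
    if (!ra.matched) { return { matched: false }; }
    const rb = pb(items, ra.position);
    if (!rb.matched) { return { matched: false }; }
    return { matched: true, position: rb.position, value: [ra.value, rb.value] };
  };
}

// Nothing above cares whether the items are characters - they can just as well
// be tokens, AST nodes, CLI arguments, or any other objects.
const aString = satisfy<unknown>((x) => typeof x === 'string');
const aNumber = satisfy<unknown>((x) => typeof x === 'number');
console.log(pair(aString, aNumber)(['answer', 42], 0));
// -> { matched: true, position: 2, value: [ 'answer', 42 ] }
```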
At the same time, I kept thinking about my "primitive" parser combinator. If I polished it a bit more, it might be pretty useful for someone else. And after a brief look at the existing alternatives, I decided there was room for it in the "market". So be it - I started to work on it as a separate package.
"Polishing" took quite a bit of time though, as you can imagine. It went through several iterations of looking what building blocks might be useful, what blocks other parsers offer, what functional "shapes" may suggest meaningful blocks, writing docs and tests, writing examples, getting new ideas while finishing older ones...
Finally, it all settled. I left some ideas out, as they didn't lead to natural, universal implementations. So, while some of my decisions might be redundant, there are still areas that may require implementation in client code.
And that's the beauty of it. My "primitive" design is transparent and intended to be trivially extensible. At the same time, I'm quite pleased that client code remains very clean. Although I still have an uneasy feeling about it - either I made a really cool thing or a really dumb thing.
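To illustrate what "trivially extensible" means here: when parsers are plain functions of a uniform shape, client code can add its own combinators without touching the library. A small sketch, again with illustrative types rather than the package's actual definitions:

```typescript
// Sketch of client-side extensibility; types and names are illustrative only.

type Result<TValue> =
  | { matched: true; position: number; value: TValue }
  | { matched: false };

type Parser<TItem, TValue> = (items: TItem[], position: number) => Result<TValue>;

// A combinator defined entirely in client code: fall back to a default value
// instead of failing, without consuming any input.
function orDefault<TItem, TValue>(
  p: Parser<TItem, TValue>,
  fallback: TValue
): Parser<TItem, TValue> {
  return (items, position) => {
    const r = p(items, position);
    return r.matched ? r : { matched: true, position, value: fallback };
  };
}
```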
A couple of words about string parsing. The moment I realized I'd have to replace the string parser, I decided to introduce some string parsing primitives. I didn't want to stain the generic purity of the core module, so I put them into a separate one.
Some examples demanded a tokenizer. My initial plan was to use an existing one. At the time I only knew of one good one, but it was written in JS, didn't come with types, and it turned out it wasn't easy to make it work with examples written in TypeScript (because it wouldn't be me if I didn't make everything TypeScript, runnable and testable). So, I hastily made something like a 20-line lexer in the examples folder and was done with it. Or so I thought at the moment. At the last moment I decided: "Well, it's not nice, it bothers me. Someone will work off my examples and will need it as well." So, I started another package, moved my twenty lines there and expanded from them. Thankfully, it was just a few extra days of work. By that time, I had found some more alternatives - too late to drop mine, but I got something to compare with and to pick some ideas from.
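For reference, a regex-driven lexer really can fit in roughly that much code. Here is a rough sketch of the general idea - illustrative only, not the code of the published package:

```typescript
// A rough sketch of a ~20-line regex-driven lexer. Illustrative only.

type Rule = { name: string; regex: RegExp; discard?: boolean };
type Token = { name: string; text: string; offset: number };

function tokenize(rules: Rule[], input: string): Token[] {
  const tokens: Token[] = [];
  let offset = 0;
  while (offset < input.length) {
    // Sticky ('y') regexes match only at lastIndex, so anchor each rule at the
    // current offset and take the first one that matches.
    const rule = rules.find((r) => {
      r.regex.lastIndex = offset;
      return r.regex.test(input);
    });
    if (!rule) { throw new Error(`Unexpected input at offset ${offset}`); }
    rule.regex.lastIndex = offset;
    const text = rule.regex.exec(input)![0];
    if (!rule.discard) { tokens.push({ name: rule.name, text, offset }); }
    offset += text.length;
  }
  return tokens;
}

// Usage: rules must use sticky regexes so matches start exactly at `offset`.
const tokens = tokenize(
  [
    { name: 'number', regex: /\d+/y },
    { name: 'word',   regex: /[a-z]+/y },
    { name: 'space',  regex: /\s+/y, discard: true },
  ],
  'foo 42 bar'
);
console.log(tokens); // word 'foo', number '42', word 'bar'
```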
The result of this separation: when parsing string input, you have a choice between using a lexer with the core module, or the char and core modules together. I think it's pretty nice.
Since both my packages have zero dependencies (and don't depend on each other either), it was trivial to add Deno support.
To document my TypeScript packages, I prefer to use Typedoc with the markdown plugin. But for a parser combinators package that wouldn't make much sense. Serving Typedoc's HTML output via GitHub Pages works better this time. I'm not quite happy with the current state of Typedoc's default theme, but it is usable, and it puts my newborn parser combinators toolkit pretty high among the alternatives as far as documentation goes. The type definitions in my design are quite a mouthful, but they're digestible when broken down. Unfortunately, it seems VS Code can't show all the details that Typedoc can, so good documentation makes sense even if the project is fully typed.
There are a few considerations about the parser combinators toolkit design that I'm too lazy to put into words unless the right question comes up, so AMA.
Finally, I can get back to the project that spawned this one (Edit: aspargvs) (which was itself spawned from yet another project (Edit: reimplementing the CLI of html-to-text) - this quest went too far...). But the moment I opened it, I immediately noticed a couple of issues demanding creative solutions in what I thought was almost done. That felt like a burden, and I need a rest and recharge... (Edit: aspargvs took a while to figure out and get production-ready.)
End of text from November 04, 2021