
JSON-SEQ is painful #434

Open
marten-seemann opened this issue Aug 1, 2024 · 17 comments
@marten-seemann
Collaborator

When we switched to JSON-SEQ a while back (#172), we assumed that using record separators instead of newline-delimited JSON wouldn't cause too much pain. I think it's time to reevaluate this assumption.

vim

It also turns out that existing software is not that great at handling record separators. Here's a screenshot of vim messing up the character count (note the last line, where it usually tells you the cursor position: it now displays 12-13).
[screenshot omitted: vim status line showing "12-13"]

jq

jq is actually the tool that's most compatible with JSON-SEQ. However, one MUST use the --seq flag; running standard jq results in the following error, which is not very helpful for the user:

jq: parse error: Invalid numeric literal at line 1, column 2

I had to use a search engine to figure out how to convince jq to use record separators. Definitely not the end of the world, but a subpar developer experience nonetheless.

grep

Another example is grep. Try to extract the transport:connection_started event from this qlog file using grep:

�{
    "qlog_format":"JSON-SEQ",
    "qlog_version":"draft-03",
    "title":"quic-go qlog",
    "configuration":{"code_version":"(devel)"},
    "trace":{
        "vantage_point":{"type":"client"},
        "common_fields":{"ODCID":"65b480747721258697d6","group_id":"65b480747721258697d6","reference_time":1722494434168.3032,"time_format":"relative"}
    }
}
�{
    "time":0.038458,
    "name":"transport:connection_started",
    "data":{"ip_version":"ipv6","src_ip":"::","src_port":55464,"dst_ip":"127.0.0.1","dst_port":6121,"src_cid":"36ff0294","dst_cid":"65b480747721258697d6"}
}

First of all, note how GitHub renders the record separator here: � (I love seeing this character on my screen!).

But back to grep: I haven't figured out a way to grep for the transport:connection_started event, and apparently it's not easy to change the record separator.

grepping is very important when debugging: I often find myself filtering a qlog file for the event type(s) I'm interested in.
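Since grep can't easily switch its record separator, one workaround is a few lines of Python that split on the RS character (0x1E) and filter by event name. This is only a sketch under the assumption that every record is a complete, valid JSON text; it is not part of any official qlog tooling, and the helper name is made up:

```python
import json

RS = "\x1e"  # RFC 7464 record separator

def events_named(path, name):
    """Yield JSON-SEQ records whose top-level "name" field matches.

    Sketch only: reads the whole file into memory and assumes every
    record is complete, valid JSON.
    """
    with open(path, encoding="utf-8") as f:
        for chunk in f.read().split(RS):
            chunk = chunk.strip()
            if not chunk:
                continue
            record = json.loads(chunk)
            if record.get("name") == name:
                yield record
```

The counting use-case then becomes something like `sum(1 for _ in events_named("file.sqlog", "transport:connection_started"))` (filename hypothetical).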

@marten-seemann marten-seemann changed the title JSONSEQ is painful JSON-SEQ is painful Aug 1, 2024
@francoismichel

francoismichel commented Aug 1, 2024

I've been using and parsing QLOG quite extensively for my research, and I can see how it can become painful, especially for debugging purposes (sometimes you just have to browse through a file to find the exact problem that triggers some weird behaviour that appears during one run out of 1000). I'm not sure how representative my use-case is of real-world scenarios, but I had to do that a lot. It is probably close to what implementers like @marten-seemann sometimes have to do.

If it really messes up existing tools that worked fine with the previous format (I think it was close to, if not exactly, jsonlines, which seems quite popular?), I guess it makes sense to at least discuss that?

That being said, developing a simple converter from json-seq to jsonlines for debugging use-cases that require grep, vim & co may be a good middle-ground solution, without necessarily having to change the spec.
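Such a converter is indeed small. A minimal sketch in Python (hypothetical, not an existing tool): split on the RS character and re-serialize each record compactly onto one line:

```python
import json

RS = "\x1e"  # RFC 7464 record separator

def seq_to_jsonl(src, dst):
    """Convert a JSON-SEQ stream to newline-delimited JSON.

    Sketch only: reads the whole input, then re-serializes each record
    compactly, so pretty-printed (multi-line) events collapse onto a
    single line each.
    """
    for chunk in src.read().split(RS):
        chunk = chunk.strip()
        if not chunk:
            continue
        record = json.loads(chunk)
        dst.write(json.dumps(record, separators=(",", ":")) + "\n")
```

Wired up with `src=sys.stdin, dst=sys.stdout` this becomes a pipe-friendly filter, after which grep, vim & co behave exactly as they did with the old format.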

@LPardue
Member

LPardue commented Aug 1, 2024

Personally I've had no issues with the JSON-SEQ format. RFC 7464 is clear on various edge cases that might arise when parsing truncated logs. There is no equivalent IETF standard for just a single line. The default serialization format for the QLogFile is JSON, which also allows "pretty-printing" styles that include newlines, so I don't see the grep argument as too strong (although I do understand it). The majority of the time I just print each event on a single line. But I do have use cases where qlogs are meant to be eyeballed by a human, and pretty-printing does actually help those.

VS code deals with the record separator character just fine:

[screenshot omitted: VS Code rendering the record separator character]

There are users of quiche's standalone qlog crate, and users of quiche for client and server. I have heard of no issues from them with JSON-SEQ over a single line. We also provide a very simple reader class that seems to work OK, although it could probably be hardened by following more advice from RFC 7464.
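The hardening RFC 7464 suggests is mostly about tolerating malformed or truncated records instead of aborting. A hedged Python sketch of that behaviour (not quiche's actual reader class):

```python
import json

RS = "\x1e"  # RFC 7464 record separator

def read_json_seq(stream):
    """Yield parsed records from a JSON-SEQ stream.

    Following RFC 7464's advice, a malformed or truncated record
    (e.g. the tail of a log cut off mid-write) is skipped rather than
    aborting the whole parse. Sketch only; a real reader would also
    report the failures to the caller.
    """
    for chunk in stream.read().split(RS):
        chunk = chunk.strip()
        if not chunk:
            continue
        try:
            yield json.loads(chunk)
        except json.JSONDecodeError:
            continue  # truncated/corrupt record: skip it, keep going
```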

@LPardue
Member

LPardue commented Aug 1, 2024

A middle ground might be to subset JSON-SEQ and state something along the lines of "In addition to the formatting rules defined in JSON-SEQ, it is RECOMMENDED that loggers do not embed newlines inside serialized events; this makes it easier for general-purpose, line-oriented tools to consume them."

@marten-seemann might argue for a MUST-level requirement here - but the failure scenario is trivially recoverable with some preprocessing. In the cases where I find embedded newlines useful (and to be consumed by my own tooling), I'd ignore the MUST anyway. But it would mean other tools can't necessarily interop with me then, which is a little sad.

@LPardue
Member

LPardue commented Aug 1, 2024

Maybe an interesting point is that xquic logs to slog/clog and then converts to json via python: https://github.com/alibaba/xquic/blob/main/scripts/qlog_parser.py

And tangentially there's an issue on msquic for translation from their preferred format to qlog - microsoft/msquic#1158

@kazuho
Member

kazuho commented Aug 1, 2024

@LPardue makes a good point. What QLog defines is an interchange format between tools, not the format that your tools are supposed to log natively. People have the freedom to log in the format they like, then convert that to QLog. FWIW, quicly has taken the same approach.

That said, I think I agree with @marten-seemann and @francoismichel on whether there is a reason to endorse a serialization format that has rough edges compared to JSONL.

Knowing that JSONL is not standardized, one option we have is to stop short of declaring how we concatenate the events. The idea is that while it is paramount to define how each event is serialized, it is not that important to agree on one way of concatenating the events. IMO, it can be JSONL, JSON-SEQ, or one JSON array.

If most people prefer JSONL, we will end up consolidating on JSONL.

Does that sound like a better strategy compared to naming JSON-SEQ in a normative manner?

@LPardue
Member

LPardue commented Aug 2, 2024

Speaking personally with no hats, I don't think we should spend QUIC WG time relitigating all of the finer details of JSON encoding and parsing for some newline-delimited format. Leaving it woolly defined doesn't seem acceptable to me. The spec is targeted towards tool makers too; selling them short risks interop failures in the structured-logging nature of qlog. That doesn't seem like a fair tradeoff when tools working on unstructured line data can work fine if each event is contained on a line.

@LPardue
Member

LPardue commented Aug 2, 2024

As a very boring example - what media type would a hand-rolled ndjson use? Right now we have proposed to piggyback off the existing json-seq suffix to use qlog+json-seq. Defining a new suffix would be a lot of work and maybe not even viable.

@marten-seemann
Collaborator Author

I might be blissfully ignorant regarding the edge cases of JSON serialization, but in my mind, NDJSON is trivially simple to define:

Every line in and of itself is a valid JSON object.

That's it.

As a very boring example - what media type would a handrolled ndjson use?

I'm not convinced that adding a MIME type to the file itself was the right thing to do, see #435. If we remove this field, we might not need to specify anything.


As a general point, what we should optimize for is (new) developers / researchers handling qlogs. It should be dead simple to take a bunch of qlogs from different implementations, and use them to debug a QUIC transfer.

Of course the file format doesn't matter if you're only using tools that understand the intricacies of qlog, like qvis. But analysis often involves using standard unix tools. Simple example: count how many times a particular event happened: grep <event name> <file.qlog> | wc. We're just creating unnecessary frustration if simple stuff like this doesn't work.

I do realize that this might make it harder to push the spec through the IETF process. This is unfortunate, and certainly annoying for us as editors and for the QUIC WG chairs. But in the end, we're not writing a spec so that it fits neatly into the set of existing RFCs. We're writing a spec so that we have a neat serialization format that's maximally useful for devs and researchers.

@LPardue
Member

LPardue commented Aug 2, 2024

client-1928a4431b958983545fe79a00fcf7a670f39d8e.zip unzips to client-1928a4431b958983545fe79a00fcf7a670f39d8e.sqlog, which is a JSON-SEQ file that includes some pretty printing e.g.

�{"qlog_version":"0.3","qlog_format":"JSON-SEQ","title":"quiche-client qlog","description":"quiche-client qlog id=1928a4431b958983545fe79a00fcf7a670f39d8e","trace":{"vantage_point":{"type":"client"},"title":"quiche-client qlog","description":"quiche-client qlog id=1928a4431b958983545fe79a00fcf7a670f39d8e","configuration":{"time_offset":0.0}}}
�{
  "time": 0.0,
  "name": "transport:parameters_set",
  "data": {
    "owner": "local",
    "tls_cipher": "None",
    "disable_active_migration": true,
    "max_idle_timeout": 30000,
    "max_udp_payload_size": 1350,
    "ack_delay_exponent": 3,
    "max_ack_delay": 25,
    "active_connection_id_limit": 2,
    "initial_max_data": 10000000,
    "initial_max_stream_data_bidi_local": 1000000,
    "initial_max_stream_data_bidi_remote": 1000000,
    "initial_max_stream_data_uni": 1000000,
    "initial_max_streams_bidi": 100,
    "initial_max_streams_uni": 100
  }
}

Simple example: count how many times a particular event happened: grep <event name> <file.qlog> | wc. We're just creating unnecessary frustration if simple stuff like this doesn't work.

It works:

$ grep "transport:packet_sent" client-1928a4431b958983545fe79a00fcf7a670f39d8e.sqlog | wc -l
     17

or

$ jq -c --seq 'select(.name=="transport:packet_sent")' client-1928a4431b958983545fe79a00fcf7a670f39d8e.sqlog | wc -l
     17

or even

$ jq -c --seq . client-1928a4431b958983545fe79a00fcf7a670f39d8e.sqlog | grep "transport:packet_sent" | wc -l
     17

that last one even has a fragment you can use to seamlessly convert to line-delimited JSON for your own needs, e.g. here's the output of

$ jq -c --seq  . client-1928a4431b958983545fe79a00fcf7a670f39d8e.sqlog | head
{"qlog_version":"0.3","qlog_format":"JSON-SEQ","title":"quiche-client qlog","description":"quiche-client qlog id=1928a4431b958983545fe79a00fcf7a670f39d8e","trace":{"vantage_point":{"type":"client"},"title":"quiche-client qlog","description":"quiche-client qlog id=1928a4431b958983545fe79a00fcf7a670f39d8e","configuration":{"time_offset":0}}}
{"time":0,"name":"transport:parameters_set","data":{"owner":"local","tls_cipher":"None","disable_active_migration":true,"max_idle_timeout":30000,"max_udp_payload_size":1350,"ack_delay_exponent":3,"max_ack_delay":25,"active_connection_id_limit":2,"initial_max_data":10000000,"initial_max_stream_data_bidi_local":1000000,"initial_max_stream_data_bidi_remote":1000000,"initial_max_stream_data_uni":1000000,"initial_max_streams_bidi":100,"initial_max_streams_uni":100}}
{"time":0.078191,"name":"transport:packet_sent","data":{"header":{"packet_type":"initial","packet_number":0,"version":"babababa","scil":20,"dcil":16,"scid":"1928a4431b958983545fe79a00fcf7a670f39d8e","dcid":"6c5443b5ef7ebae8d7830638c8f9dbf5"},"raw":{"length":338,"payload_length":275},"send_at_time":0.078191,"frames":[{"frame_type":"crypto","offset":0,"length":271}]}}
{"time":0.078191,"name":"recovery:metrics_updated","data":{"smoothed_rtt":333,"rtt_variance":166.5,"congestion_window":13500,"bytes_in_flight":338,"ssthresh":18446744073709552000}}
{"time":10.191873,"name":"transport:packet_sent","data":{"header":{"packet_type":"initial","packet_number":1,"version":"1","scil":20,"dcil":16,"scid":"1928a4431b958983545fe79a00fcf7a670f39d8e","dcid":"6c5443b5ef7ebae8d7830638c8f9dbf5"},"raw":{"length":338,"payload_length":275},"send_at_time":10.191873,"frames":[{"frame_type":"crypto","offset":0,"length":271}]}}
{"time":28.821796,"name":"transport:packet_received","data":{"header":{"packet_type":"initial","packet_number":0,"version":"1","scil":20,"dcil":20,"scid":"01f9ec51e1be075788faf55113ec16fe80aeda58","dcid":"1928a4431b958983545fe79a00fcf7a670f39d8e"},"raw":{"length":1200,"payload_length":117},"frames":[{"frame_type":"ack","ack_delay":0.109,"acked_ranges":[[1,1]]},{"frame_type":"crypto","offset":0,"length":90}]}}
{"time":28.821796,"name":"recovery:metrics_updated","data":{"min_rtt":18.629923,"smoothed_rtt":18.629923,"latest_rtt":18.629923,"rtt_variance":9.3149605,"bytes_in_flight":0}}
{"time":29.419956,"name":"transport:packet_received","data":{"header":{"packet_type":"handshake","packet_number":0,"version":"1","scil":20,"dcil":20,"scid":"01f9ec51e1be075788faf55113ec16fe80aeda58","dcid":"1928a4431b958983545fe79a00fcf7a670f39d8e"},"raw":{"length":1033,"payload_length":984},"frames":[{"frame_type":"crypto","offset":0,"length":963}]}}
{"time":29.551624,"name":"transport:packet_received","data":{"header":{"packet_type":"handshake","packet_number":1,"version":"1","scil":20,"dcil":20,"scid":"01f9ec51e1be075788faf55113ec16fe80aeda58","dcid":"1928a4431b958983545fe79a00fcf7a670f39d8e"},"raw":{"length":689,"payload_length":640},"frames":[{"frame_type":"crypto","offset":963,"length":618}]}}
{"time":29.551624,"name":"transport:parameters_set","data":{"owner":"remote","tls_cipher":"Some(AES128_GCM)","original_destination_connection_id":"6c5443b5ef7ebae8d7830638c8f9dbf5","disable_active_migration":false,"max_idle_timeout":0,"max_udp_payload_size":65527,"ack_delay_exponent":3,"max_ack_delay":25,"active_connection_id_limit":2,"initial_max_data":10000000,"initial_max_stream_data_bidi_local":10000000,"initial_max_stream_data_bidi_remote":10000000,"initial_max_stream_data_uni":10000000,"initial_max_streams_bidi":100,"initial_max_streams_uni":100}}

Honestly, for anyone wrangling JSON, jq is ubiquitous and provides an out-of-the-box capability to filter a standardized canonical format, convert it into a multitude of formats, or pipe to other tools.

@LPardue
Member

LPardue commented Aug 2, 2024

Media types were defined in response to #158, where @kazuho said

At the moment, we use a custom ndjson format, but if we are to use qlog, there has to be a media-type.

@marten-seemann
Collaborator Author

Simple example: count how many times a particular event happened: grep <event name> <file.qlog> | wc. We're just creating unnecessary frustration if simple stuff like this doesn't work.

It works:

It works because you picked an example that's so simple that JSON-SEQ doesn't break it. Unfortunately, that doesn't apply to the vast majority of things you might want to do. For example, try counting the number of packets sent that contained a MAX_STREAM_DATA frame. You'd need something like grep packet_sent <file.qlog> | grep max_stream_data | wc. This breaks as soon as the event is not on one line.
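For comparison, the same two-condition count becomes insensitive to pretty-printing once you parse records instead of lines. A Python sketch, assuming the data.frames layout seen in the quiche log above (not an existing tool):

```python
import json

RS = "\x1e"  # RFC 7464 record separator

def count_packets_with_frame(stream, frame_type):
    """Count transport:packet_sent events carrying a given frame type,
    whether each record sits on one line or many.

    Sketch only: assumes the data.frames layout shown in the logs
    above, and that every record is valid JSON.
    """
    count = 0
    for chunk in stream.read().split(RS):
        chunk = chunk.strip()
        if not chunk:
            continue
        event = json.loads(chunk)
        if event.get("name") != "transport:packet_sent":
            continue
        frames = event.get("data", {}).get("frames", [])
        if any(f.get("frame_type") == frame_type for f in frames):
            count += 1
    return count
```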

Sure, you can use jq for that (once you've figured out that you need to use --seq, yet another annoyance for new developers). It's definitely more powerful than standard bash tools, but then I also have to ask ChatGPT to figure out selectors and filters every time...

@LPardue
Member

LPardue commented Aug 2, 2024

With regard to vim, this is expected behaviour, as described in https://vimdoc.sourceforge.net/htmldoc/options.html#%27ruler%27, because it chooses to render the record separator as ^^

If the number of characters displayed is different from the number of bytes in the text (e.g., for a TAB or a multi-byte character), both the text column (byte number) and the screen column are shown, separated with a dash. For an empty line "0-1" is shown. For an empty buffer the line number will also be zero: "0,0-1".

Anyone who likes to program with TAB would already be used to this

[screenshot omitted: vim ruler with a TAB-indented file]

@LPardue
Member

LPardue commented Aug 2, 2024

You'd need something like grep packet_sent <file.qlog> | grep max_stream_data | wc. This breaks as soon as the event is not in one line.

Adding a jq -c --seq . | prefix is not hard to do and satisfies this requirement if people don't want to learn jq syntax.

@rmarx
Contributor

rmarx commented Aug 2, 2024

As an individual without hats:

I fully agree with @marten-seemann here... I think standards should reflect how people actually use stuff, and the JSON-SEQ thing is plain annoying, even if things like jq have support for it (and we can "work around" it by having transformers and whatnot). My time at a larger company has taught me the absolute value of "sensible defaults" and "predictability through familiarity".

It really annoyed me having to switch from NDJSON to JSON-SEQ when we did, just because NDJSON doesn't have an (IETF) RFC (even if I understand the arguments).


As an editor with too many years on this already:

I have zero (0) appetite to try to switch this again at this stage, especially trying to push it through the (eventual) IESG/whatever process needed to finally get these out the door.

With the recent extensibility changes, we can easily write an extension draft that defines how to still use NDJSON; it just won't be part of the default/first set of documents (Oh RFC 9218, have we learned nothing).

I'm also perfectly fine with keeping support for NDJSON in qvis for example.

@rmarx
Contributor

rmarx commented Sep 30, 2024

Discussed during the editors meeting. Will need face-to-face discussion with @marten-seemann during the upcoming meeting to resolve this.

@rmarx
Contributor

rmarx commented Oct 17, 2024

Discussed during editors meeting with me, Marten and Lucas.

We can't agree among ourselves whether to switch to NDJSON, add NDJSON or just stick with JSON-SEQ. However, we also feel this is not the most important issue to solve right now, so we will mention it during the upcoming IETF meeting in Dublin to see if people want to come to this issue to advocate for one option or another, but we won't call for a vote/hum on this from the WG at this point.

@LPardue
Member

LPardue commented Oct 17, 2024

This needs to be raised on the QUIC mailing list before then, to reach the broadest set of people.
