A fast and simple command-line tool for common operations over JSON-lines files, such as:
- converting to and from plain-text and TSV files
- joining files on (multiple) keys
- merging files line by line
- adding, removing, selecting fields
- ...
You could use jq for some of these tasks (and in fact, jq is a far more general tool), but:
- rjp is designed for the JSON-lines format specifically
- rjp can be faster
- some common tasks are more easily done in rjp
This is my attempt to learn a bit of Rust, so don't take this tool too seriously. That said, it is pretty quick and handy, at least for me.
Get Rust:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
Clone and build rjp:
git clone https://github.com/ales-t/rjp.git
cd rjp
cargo build --release
You will find the binary in target/release/rjp. You can add it to your PATH, e.g. like this:
export PATH="$(pwd)/target/release:$PATH"
rjp < input_file [INPUT_CONVERSION] [PROCESSOR [PROCESSOR...]] [OUTPUT_CONVERSION] > output_file
rjp runs a chain of processors on each instance in the input stream (STDIN), finally printing
the processed instances to STDOUT.
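To make the processing model concrete, here is a minimal Python sketch of a processor chain over JSON lines. It is not how rjp is implemented (rjp is written in Rust); `run_chain`, `add_source` and `drop_tmp` are hypothetical names used only for illustration.

```python
import json

def run_chain(lines, processors):
    """Apply each processor, in order, to every JSON-lines instance."""
    for line in lines:
        instance = json.loads(line)
        for process in processors:
            instance = process(instance)
        yield json.dumps(instance)

# Two hypothetical processors standing in for an "add" and a "drop" step.
def add_source(instance):
    return {**instance, "source": "demo"}

def drop_tmp(instance):
    return {k: v for k, v in instance.items() if k != "tmp"}

output = list(run_chain(['{"id": 1, "tmp": "x"}'], [add_source, drop_tmp]))
print(output[0])
```

Each instance passes through the whole chain before the next line is read, so memory use stays flat regardless of input size.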
By default, rjp reads the input file as JSON lines. You can optionally specify a file conversion
as the first positional argument.
Convert TSV lines with specified field names.
Aliases: tsv_to_json, from_tsv
Examples:
rjp < in.tsv from_tsv first_field_name,second_field_name,... [PROCESSORS] [OUTPUT_CONVERSION] > output_file
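The idea behind the TSV conversion can be sketched in a few lines of Python. This is an illustration only, not rjp's code: `tsv_to_json` is a hypothetical helper, and keeping every value as a string is an assumption of the sketch.

```python
import json

def tsv_to_json(line, field_names):
    """Pair tab-separated values with the given field names."""
    values = line.rstrip("\n").split("\t")
    # Assumption: values are kept as strings, no type inference.
    return json.dumps(dict(zip(field_names, values)))

record = tsv_to_json("alice\t42\n", ["name", "age"])
print(record)
```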
Conversion from TXT treats the whole input line as a single string field; you need to specify its name.
Aliases: txt_to_json, from_txt
Examples:
rjp < in.txt from_txt field_name [PROCESSORS] [OUTPUT_CONVERSION] > output_file
The following processors are implemented (brackets list shorthand aliases):
Add new fields with constant values.
Aliases: add_fields, af, add
Examples:
rjp < in.json add_fields new_field_name:value1,another_field:value2 > out.json
Remove existing fields.
Aliases: drop_fields, df, drop
Examples:
rjp < in.json drop_fields to_drop,another_to_drop > out.json
Extract items from arrays and objects.
Aliases: extract_items, e, extract
Examples:
rjp < in.json extract_items array_field[0]:new_field,object_field[key]:another_field > out.json
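A rough Python sketch of what such an extraction spec means. The `extract_items` function and the regular expression are hypothetical, and keeping the original array and object fields in the result is an assumption, not something this README states.

```python
import re

# Matches specs of the form  field[index_or_key]:new_field
SPEC = re.compile(r"^(\w+)\[(\w+)\]:(\w+)$")

def extract_items(instance, spec):
    """Copy items out of arrays/objects into new top-level fields."""
    result = dict(instance)
    for part in spec.split(","):
        match = SPEC.match(part)
        if match is None:
            raise ValueError(f"bad spec: {part}")
        field, key, new_field = match.groups()
        container = instance[field]
        # Integer index for arrays, string key for objects.
        if isinstance(container, list):
            result[new_field] = container[int(key)]
        else:
            result[new_field] = container[key]
    return result

row = {"tags": ["a", "b"], "meta": {"lang": "en"}}
extracted = extract_items(row, "tags[0]:first_tag,meta[lang]:lang")
print(extracted)
```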
Perform inner join with another input stream (with optional file conversion).
Note on performance: the main stream is processed line by line, but the stream to join is loaded into RAM, so use the smaller file as the joined stream.
Aliases: join, j, inner_join
Examples:
rjp < in.json join file.json key_field_1,key_field_2 > out.json
- With file conversion:
rjp < in.json join file.tsv key from_tsv key,tsv_value > out.json
Identical to join, except that lines from the main stream that don't have a corresponding instance
in the joined stream are kept (and no additional fields are added to them).
Aliases: lj, left_join
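The difference between the inner and the left join, and the reason the joined stream should be the smaller one, can be sketched in Python. This is a simplified illustration, not rjp's implementation: the `join` function and its `keep_unmatched` flag are hypothetical, and the sketch assumes key values are unique in the joined stream.

```python
def join(main_stream, join_stream, keys, keep_unmatched=False):
    """Join two streams of dicts on the given key fields."""
    # The joined stream is loaded into memory, indexed by its key values.
    index = {tuple(other[k] for k in keys): other for other in join_stream}
    # The main stream is consumed one instance at a time.
    for instance in main_stream:
        match = index.get(tuple(instance[k] for k in keys))
        if match is not None:
            yield {**instance, **match}
        elif keep_unmatched:
            # Left join: keep the instance, add no fields.
            yield instance

main = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
other = [{"id": 1, "b": "z"}]

inner = list(join(main, other, ["id"]))
left = list(join(main, other, ["id"], keep_unmatched=True))
print(inner)
print(left)
```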
Merge with another input stream line-by-line, with optional file conversion.
Aliases: merge, mrg
Examples:
rjp < in.json merge file_to_merge.json > out.json
- With file conversion:
rjp < in.json merge to_merge.tsv from_tsv col_a,col_b > out.json
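Line-by-line merging amounts to pairing up the n-th instance of each stream. A minimal Python sketch follows; `merge` is a hypothetical helper, and two behaviours here are assumptions of the sketch rather than documented rjp behaviour: it stops at the end of the shorter stream, and on a field-name collision the value from the merged stream wins.

```python
def merge(main_stream, other_stream):
    """Combine the fields of the n-th instances of both streams."""
    for instance, other in zip(main_stream, other_stream):
        yield {**instance, **other}

main = [{"id": 1}, {"id": 2}]
extra = [{"score": 0.5}, {"score": 0.9}]

merged = list(merge(main, extra))
print(merged)
```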
Rename fields in instances.
Aliases: rename, rnm
Examples:
rjp < in.json rename old_name:new_name,another_old:another_new > out.json
Select a subset of fields (the rest are dropped).
Aliases: select_fields, sf, select, sel
Examples:
rjp < in.json select_fields first,second > out.json
Convert a string field to a numeric one.
Aliases: to_number, num
Examples:
rjp < in.json to_number string_field_name:new_numeric_field_name,another_string:another_numeric > out.json
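In Python terms, the conversion could look roughly like the sketch below. The `to_number` function is hypothetical; trying an integer first and falling back to a float, and keeping the original string field, are assumptions of the sketch rather than documented rjp behaviour.

```python
def to_number(instance, mapping):
    """Parse string fields into numbers stored under new field names."""
    result = dict(instance)
    for source, target in mapping.items():
        text = instance[source]
        try:
            result[target] = int(text)
        except ValueError:
            # Not an integer literal, so fall back to a float.
            result[target] = float(text)
    return result

row = {"price": "3.5", "count": "7"}
converted = to_number(row, {"price": "price_num", "count": "count_num"})
print(converted)
```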
By default, rjp will produce JSON lines. You can change that with a file conversion.
Convert into TSV lines with specified fields.
Aliases: to_tsv, json_to_tsv, tsv
Examples:
rjp < in.json to_tsv field_1,field_2 > out.tsv
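The output conversion is the mirror image of the TSV input conversion. A small Python sketch, for illustration only: `json_to_tsv` is a hypothetical helper, and rendering non-string values with `str()` is an assumption of the sketch.

```python
import json

def json_to_tsv(line, fields):
    """Write the selected fields of one instance as a TSV line."""
    instance = json.loads(line)
    # Fields appear in the order requested, not the order in the JSON.
    return "\t".join(str(instance[name]) for name in fields)

tsv_line = json_to_tsv('{"a": 1, "b": "x", "c": 2}', ["b", "a"])
print(tsv_line)
```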