Command Reference

All commands support the --verbose global flag for detailed logging.

kmtools [--version] [--verbose] <command> [args]

`kmtools chunk`

Run km find_mutation in parallel across chunked target sequences.

kmtools chunk \
    --threads <N> \
    --km-find-mutation-options "<options>" \
    --km-target-directory <dir> \
    --km-jellyfish-file <file.jf> \
    [--output-dir <dir>] \
    [--prefix <prefix>] \
    [--merge] \
    [--merge-output <file>] \
    [--merge-keep]

Argument	Required	Default	Description
`--threads`	Yes	-	Number of parallel `km` processes
`--km-find-mutation-options`	Yes	-	Options passed directly to `km find_mutation` (quote the string)
`--km-target-directory`	Yes	-	Directory containing pre-split target subdirectories. Must have exactly `--threads` subdirectories with roughly equal file counts.
`--km-jellyfish-file`	Yes	-	Path to the jellyfish k-mer counts file (`.jf`)
`--output-dir`	No	`.`	Directory to save chunk output files
`--prefix`	No	`km_find_mutation_output`	Prefix for output filenames (produces `<prefix>_0.txt`, `<prefix>_1.txt`, etc.)
`--merge`	No	`false`	Merge chunk outputs into a single file after processing
`--merge-output`	No	`km_find_mutation_merged_output.txt`	Filename for merged output when `--merge` is used
`--merge-keep`	No	`false`	Keep intermediate chunk files after merging

How it works

Validates that km is installed and the target directory has the correct structure
Splits work across threads (one subdirectory per thread)
Runs km find_mutation on each chunk in parallel via ThreadPoolExecutor
Optionally merges all results when complete

`kmtools merge`

Combine multiple km output files into a single result.

kmtools merge <input_files...> \
    --output <file> \
    [--keep] \
    [--sort-by <column>] \
    [--drop-duplicates]

Argument	Required	Default	Description
`inputs`	Yes	-	Input files (positional arguments, supports glob patterns)
`--output`	Yes	-	Output file for merged results
`--keep`	No	`false`	Keep input files after merging (default: delete them)
`--sort-by`	No	`None`	Sort merged output by this column name
`--drop-duplicates`	No	`false`	Remove exact duplicate rows from merged output

Notes

Input files are read as tab-separated values
Column consistency is validated across files (mismatches produce warnings)
Without --keep, input files are deleted after successful merge

`kmtools filter`

Filter km find_mutation output against a reference variant list.

kmtools filter \
    --reference <file> \
    --km-output <file> \
    --output <file> \
    [--output-type tsv|csv|xlsx] \
    [--count-threshold <N>] \
    [--use-alt]

Argument	Required	Default	Description
`--reference`	Yes	-	Reference file with known variants (`.csv` or `.tsv`)
`--km-output`	Yes	-	`km find_mutation` output file to filter
`--output`	Yes	-	Output file for filtered results
`--output-type`	No	`tsv`	Output format: `tsv`, `csv`, or `xlsx`
`--count-threshold`	No	`2`	Minimum k-mer count (`Min_coverage`) for a variant to pass
`--use-alt`	No	`false`	Enable alternate sequence matching mode

Filter conditions (standard mode)

A km row passes the filter when ALL of these are true:

TYPE match: km Type matches reference TYPE
COUNT threshold: km Min_coverage >= --count-threshold
ALT match: km Sequence matches the calculated expected variant sequence
Not reference type: reference TYPE is not "Reference"
vs_ref info: km Info field is "vs_ref"

Filter conditions (use-alt mode)

TYPE match: km Type matches alt reference TYPE
COUNT threshold: km Min_coverage >= --count-threshold
Sequence match: km Sequence matches ALT_SEQUENCE from the reference
vs_ref info: km Info field is "vs_ref"

Output columns

Standard mode: SAMPLE, CHROM, POS, REF, ALT, TYPE, FOUND, FILTER_NOTES, KMER_VAF, KMER_MIN_COVERAGE, KMER_EXPRESSION, REF_SEQUENCE, VARIANT_SEQUENCE

Use-alt mode: SAMPLE, CHROM, ALT_SEQUENCE, TYPE, FOUND, FILTER_NOTES, KMER_VAF, KMER_MIN_COVERAGE, KMER_EXPRESSION, REF_SEQUENCE, VARIANT_SEQUENCE

`kmtools plot`

Generate charts from filtered results.

kmtools plot <file> \
    [--output-dir <dir>] \
    [--charts <types>]

Argument	Required	Default	Description
`file`	Yes	-	Filtered results file (positional)
`--output-dir`	No	`.`	Directory to save plot images
`--charts`	No	`all`	Comma-separated chart types

Chart types

Type	Output file	Description
`vaf`	`vaf_distribution.png`	Histogram of variant allele frequencies (rVAF) for found variants
`type`	`type_distribution.png`	Bar chart of variant type counts
`sample`	`sample_summary.png`	Stacked bar: found vs not-found per sample
`overall`	`overall_summary.png`	Pie chart of overall found vs not-found ratio
`all`	All of the above	Generate all chart types

`kmtools runall`

Run the complete pipeline in one command: chunk -> merge -> filter -> plot.

kmtools runall \
    --threads <N> \
    --km-find-mutation-options "<options>" \
    --km-target-directory <dir> \
    --km-jellyfish-file <file.jf> \
    --merge-output <file> \
    --reference <file> \
    --filtered-output <file> \
    [--output-type tsv|csv|xlsx] \
    [--count-threshold <N>] \
    [--output-dir <dir>] \
    [--charts <types>]

Argument	Required	Default	Description
`--threads`	Yes	-	Number of parallel threads for chunking
`--km-find-mutation-options`	Yes	-	Options passed to `km find_mutation`
`--km-target-directory`	Yes	-	Directory with pre-split target subdirectories
`--km-jellyfish-file`	Yes	-	Path to the jellyfish k-mer counts file
`--merge-output`	Yes	-	Output file for merged chunk results
`--reference`	Yes	-	Reference file with known variants
`--filtered-output`	Yes	-	Output file for filtered results
`--output-type`	No	`tsv`	Output format for filtered results
`--count-threshold`	No	`2`	Minimum k-mer count threshold
`--output-dir`	No	`.`	Directory to save plots
`--charts`	No	`all`	Chart types to generate

Pipeline steps

Chunk: Split targets and run km find_mutation in parallel
Merge: Combine chunk outputs into --merge-output
Filter: Filter merged results against --reference into --filtered-output
Plot: Generate charts from filtered results into --output-dir

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Command Reference

`kmtools chunk`

How it works

`kmtools merge`

Notes

`kmtools filter`

Filter conditions (standard mode)

Filter conditions (use-alt mode)

Output columns

`kmtools plot`

Chart types

`kmtools runall`

Pipeline steps

FilesExpand file tree

commands.md

Latest commit

History

commands.md

File metadata and controls

Command Reference

kmtools chunk

How it works

kmtools merge

Notes

kmtools filter

Filter conditions (standard mode)

Filter conditions (use-alt mode)

Output columns

kmtools plot

Chart types

kmtools runall

Pipeline steps

`kmtools chunk`

`kmtools merge`

`kmtools filter`

`kmtools plot`

`kmtools runall`