All commands support the --verbose global flag for detailed logging.
kmtools [--version] [--verbose] <command> [args]

Run km find_mutation in parallel across chunked target sequences.
kmtools chunk \
--threads <N> \
--km-find-mutation-options "<options>" \
--km-target-directory <dir> \
--km-jellyfish-file <file.jf> \
[--output-dir <dir>] \
[--prefix <prefix>] \
[--merge] \
[--merge-output <file>] \
[--merge-keep]

| Argument | Required | Default | Description |
|---|---|---|---|
| `--threads` | Yes | - | Number of parallel km processes |
| `--km-find-mutation-options` | Yes | - | Options passed directly to km find_mutation (quote the string) |
| `--km-target-directory` | Yes | - | Directory containing pre-split target subdirectories. Must have exactly `--threads` subdirectories with roughly equal file counts. |
| `--km-jellyfish-file` | Yes | - | Path to the jellyfish k-mer counts file (.jf) |
| `--output-dir` | No | `.` | Directory to save chunk output files |
| `--prefix` | No | `km_find_mutation_output` | Prefix for output filenames (produces `<prefix>_0.txt`, `<prefix>_1.txt`, etc.) |
| `--merge` | No | `false` | Merge chunk outputs into a single file after processing |
| `--merge-output` | No | `km_find_mutation_merged_output.txt` | Filename for merged output when `--merge` is used |
| `--merge-keep` | No | `false` | Keep intermediate chunk files after merging |

- Validates that `km` is installed and the target directory has the correct structure
- Splits work across threads (one subdirectory per thread)
- Runs `km find_mutation` on each chunk in parallel via `ThreadPoolExecutor`
- Optionally merges all results when complete

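
The validate-then-fan-out behavior described above can be sketched as follows. This is a minimal illustration, not the kmtools source: `list_chunk_dirs`, `run_chunks`, and the `worker` callable are hypothetical names, and the worker stands in for whatever step actually launches `km find_mutation` on one subdirectory.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List


def list_chunk_dirs(target_dir: Path, threads: int) -> List[Path]:
    """Return the sorted subdirectories, requiring exactly one per thread."""
    subdirs = sorted(p for p in target_dir.iterdir() if p.is_dir())
    if len(subdirs) != threads:
        raise ValueError(
            f"expected {threads} subdirectories in {target_dir}, "
            f"found {len(subdirs)}"
        )
    return subdirs


def run_chunks(
    target_dir: Path,
    threads: int,
    worker: Callable[[int, Path], Path],
) -> List[Path]:
    """Run worker(index, chunk_dir) for every chunk in parallel.

    `worker` is a hypothetical stand-in for the step that runs
    km find_mutation on one subdirectory and returns its output file.
    """
    chunk_dirs = list_chunk_dirs(target_dir, threads)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves input order, so result i belongs to chunk i.
        return list(pool.map(worker, range(threads), chunk_dirs))
```

Threads are a reasonable fit here because each worker would mostly wait on an external process rather than compute in Python.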
Combine multiple km output files into a single result.
kmtools merge <input_files...> \
--output <file> \
[--keep] \
[--sort-by <column>] \
[--drop-duplicates]

| Argument | Required | Default | Description |
|---|---|---|---|
| `inputs` | Yes | - | Input files (positional arguments, supports glob patterns) |
| `--output` | Yes | - | Output file for merged results |
| `--keep` | No | `false` | Keep input files after merging (default: delete them) |
| `--sort-by` | No | `None` | Sort merged output by this column name |
| `--drop-duplicates` | No | `false` | Remove exact duplicate rows from merged output |

- Input files are read as tab-separated values
- Column consistency is validated across files (mismatches produce warnings)
- Without `--keep`, input files are deleted after successful merge

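
To make the merge semantics concrete, here is a standard-library sketch of the behavior listed above. The function name `merge_tsv` and its parameters are hypothetical and only mirror the documented flags. The string-based sort is an assumption, and the real command may well compare values numerically.

```python
import csv
import sys
from pathlib import Path
from typing import List, Optional, Sequence


def merge_tsv(
    inputs: Sequence[Path],
    output: Path,
    keep: bool = False,
    sort_by: Optional[str] = None,
    drop_duplicates: bool = False,
) -> int:
    """Merge tab-separated files into `output`; return the rows written."""
    header: Optional[List[str]] = None
    rows: List[dict] = []
    for path in inputs:
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            columns = list(reader.fieldnames or [])
            if header is None:
                header = columns
            elif columns != header:
                # Mismatched columns only warn; the first header wins.
                print(f"warning: {path} has different columns", file=sys.stderr)
            rows.extend(reader)
    if header is None:
        raise ValueError("no input files given")

    if drop_duplicates:
        seen = set()
        unique = []
        for row in rows:
            key = tuple(row.get(col) for col in header)
            if key not in seen:
                seen.add(key)
                unique.append(row)
        rows = unique

    if sort_by is not None:
        # Assumption: values are compared as strings in this sketch.
        rows.sort(key=lambda row: row.get(sort_by) or "")

    with output.open("w", newline="") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=header,
            delimiter="\t",
            extrasaction="ignore",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)

    if not keep:
        # Inputs are removed only after the merged file has been written.
        for path in inputs:
            path.unlink()
    return len(rows)
```

Deleting the inputs as the last step means a failure while reading or writing leaves the original chunk files in place.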
Filter km find_mutation output against a reference variant list.
kmtools filter \
--reference <file> \
--km-output <file> \
--output <file> \
[--output-type tsv|csv|xlsx] \
[--count-threshold <N>] \
[--use-alt]

| Argument | Required | Default | Description |
|---|---|---|---|
| `--reference` | Yes | - | Reference file with known variants (.csv or .tsv) |
| `--km-output` | Yes | - | km find_mutation output file to filter |
| `--output` | Yes | - | Output file for filtered results |
| `--output-type` | No | `tsv` | Output format: tsv, csv, or xlsx |
| `--count-threshold` | No | `2` | Minimum k-mer count (`Min_coverage`) for a variant to pass |
| `--use-alt` | No | `false` | Enable alternate sequence matching mode |

A km row passes the filter when ALL of the criteria for the active mode are true.

Standard mode:

- TYPE match: km `Type` matches reference `TYPE`
- COUNT threshold: km `Min_coverage` >= `--count-threshold`
- ALT match: km `Sequence` matches the calculated expected variant sequence
- Not reference type: reference `TYPE` is not `"Reference"`
- vs_ref info: km `Info` field is `"vs_ref"`

Use-alt mode (`--use-alt`):

- TYPE match: km `Type` matches alt reference `TYPE`
- COUNT threshold: km `Min_coverage` >= `--count-threshold`
- Sequence match: km `Sequence` matches `ALT_SEQUENCE` from the reference
- vs_ref info: km `Info` field is `"vs_ref"`

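
The standard-mode criteria can be read as one conjunction over a km row and a reference row. The sketch below is illustrative only: `check_standard` is a hypothetical helper, exact string equality for the type comparison is an assumption, and `expected_sequence` is passed in because this section does not describe how the expected variant sequence is calculated.

```python
from typing import List, Mapping, Tuple


def check_standard(
    km_row: Mapping[str, str],
    ref_row: Mapping[str, str],
    expected_sequence: str,
    count_threshold: int = 2,
) -> Tuple[bool, List[str]]:
    """Return (passed, failed_criteria) for one km row and one reference row."""
    checks = {
        # Assumption: types are compared by exact string equality.
        "TYPE match": km_row["Type"] == ref_row["TYPE"],
        "COUNT threshold": float(km_row["Min_coverage"]) >= count_threshold,
        "ALT match": km_row["Sequence"] == expected_sequence,
        "Not reference type": ref_row["TYPE"] != "Reference",
        "vs_ref info": km_row["Info"] == "vs_ref",
    }
    failed = [name for name, ok in checks.items() if not ok]
    # The row passes only when every criterion holds.
    return (not failed, failed)
```

Returning the names of the failed criteria alongside the verdict is one way a column such as `FILTER_NOTES` could be filled, though the actual contents of that column are not specified here.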
Standard mode: SAMPLE, CHROM, POS, REF, ALT, TYPE, FOUND, FILTER_NOTES, KMER_VAF, KMER_MIN_COVERAGE, KMER_EXPRESSION, REF_SEQUENCE, VARIANT_SEQUENCE
Use-alt mode: SAMPLE, CHROM, ALT_SEQUENCE, TYPE, FOUND, FILTER_NOTES, KMER_VAF, KMER_MIN_COVERAGE, KMER_EXPRESSION, REF_SEQUENCE, VARIANT_SEQUENCE
Generate charts from filtered results.
kmtools plot <file> \
[--output-dir <dir>] \
[--charts <types>]

| Argument | Required | Default | Description |
|---|---|---|---|
| `file` | Yes | - | Filtered results file (positional) |
| `--output-dir` | No | `.` | Directory to save plot images |
| `--charts` | No | `all` | Comma-separated chart types |

| Type | Output file | Description |
|---|---|---|
| `vaf` | `vaf_distribution.png` | Histogram of variant allele frequencies (rVAF) for found variants |
| `type` | `type_distribution.png` | Bar chart of variant type counts |
| `sample` | `sample_summary.png` | Stacked bar: found vs not-found per sample |
| `overall` | `overall_summary.png` | Pie chart of overall found vs not-found ratio |
| `all` | All of the above | Generate all chart types |

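
As a small illustration of how a comma-separated `--charts` value maps to output files, the sketch below expands a value into the image paths listed in the table. `CHART_FILES` and `resolve_charts` are hypothetical names. Rejecting unknown chart types with an error is an assumption about behavior that this section does not state.

```python
from pathlib import Path
from typing import List

# Chart type -> output filename, as listed in the chart types table.
CHART_FILES = {
    "vaf": "vaf_distribution.png",
    "type": "type_distribution.png",
    "sample": "sample_summary.png",
    "overall": "overall_summary.png",
}


def resolve_charts(charts: str = "all", output_dir: str = ".") -> List[Path]:
    """Expand a --charts value into the image paths it should produce."""
    requested = [name.strip() for name in charts.split(",") if name.strip()]
    if "all" in requested:
        requested = list(CHART_FILES)
    unknown = [name for name in requested if name not in CHART_FILES]
    if unknown:
        # Assumption: unknown chart names are treated as an error.
        raise ValueError(f"unknown chart type(s): {', '.join(unknown)}")
    return [Path(output_dir) / CHART_FILES[name] for name in requested]
```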
Run the complete pipeline in one command: chunk -> merge -> filter -> plot.
kmtools runall \
--threads <N> \
--km-find-mutation-options "<options>" \
--km-target-directory <dir> \
--km-jellyfish-file <file.jf> \
--merge-output <file> \
--reference <file> \
--filtered-output <file> \
[--output-type tsv|csv|xlsx] \
[--count-threshold <N>] \
[--output-dir <dir>] \
[--charts <types>]

| Argument | Required | Default | Description |
|---|---|---|---|
| `--threads` | Yes | - | Number of parallel threads for chunking |
| `--km-find-mutation-options` | Yes | - | Options passed to km find_mutation |
| `--km-target-directory` | Yes | - | Directory with pre-split target subdirectories |
| `--km-jellyfish-file` | Yes | - | Path to the jellyfish k-mer counts file |
| `--merge-output` | Yes | - | Output file for merged chunk results |
| `--reference` | Yes | - | Reference file with known variants |
| `--filtered-output` | Yes | - | Output file for filtered results |
| `--output-type` | No | `tsv` | Output format for filtered results |
| `--count-threshold` | No | `2` | Minimum k-mer count threshold |
| `--output-dir` | No | `.` | Directory to save plots |
| `--charts` | No | `all` | Chart types to generate |

- Chunk: Split targets and run `km find_mutation` in parallel
- Merge: Combine chunk outputs into `--merge-output`
- Filter: Filter merged results against `--reference` into `--filtered-output`
- Plot: Generate charts from filtered results into `--output-dir`
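
For readers who want to run the stages separately, the sketch below builds the individual command lines that correspond to the four steps, using only flags documented in this reference. It assumes `runall` simply chains the subcommands, which is not confirmed here. `runall_as_steps` and the file names in the usage example are hypothetical placeholders.

```python
import shlex
from typing import List


def runall_as_steps(
    threads: int,
    km_options: str,
    target_dir: str,
    jellyfish_file: str,
    merge_output: str,
    reference: str,
    filtered_output: str,
    output_dir: str = ".",
    charts: str = "all",
) -> List[List[str]]:
    """Build argv lists for the subcommands that mirror the runall steps."""
    # Chunk and merge are covered by one call via --merge/--merge-output.
    chunk_and_merge = [
        "kmtools", "chunk",
        "--threads", str(threads),
        "--km-find-mutation-options", km_options,
        "--km-target-directory", target_dir,
        "--km-jellyfish-file", jellyfish_file,
        "--merge",
        "--merge-output", merge_output,
    ]
    # The merged file is the input of the filter step.
    filter_step = [
        "kmtools", "filter",
        "--reference", reference,
        "--km-output", merge_output,
        "--output", filtered_output,
    ]
    # The filtered file is the positional input of the plot step.
    plot_step = [
        "kmtools", "plot", filtered_output,
        "--output-dir", output_dir,
        "--charts", charts,
    ]
    return [chunk_and_merge, filter_step, plot_step]


if __name__ == "__main__":
    # Placeholder arguments; substitute real paths and km options.
    steps = runall_as_steps(
        4, "<options>", "targets", "sample.jf",
        "merged.txt", "variants.tsv", "filtered.tsv",
    )
    for argv in steps:
        print(shlex.join(argv))
```

Keeping each step as an argv list rather than a single string avoids quoting problems when the km options string contains spaces.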