v2.5.0

shenwei356 · shenwei356 · commit df377151a0c6 · 2023-07-16T11:12:39.000+08:00
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,19 +1,19 @@
-- [SeqKit v2.5.0](https://github.com/shenwei356/seqkit/releases/tag/v2.5.0) - 2023-03-17
+- [SeqKit v2.5.0](https://github.com/shenwei356/seqkit/releases/tag/v2.5.0) - 2023-07-16
 [![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.5.0/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.5.0)
     - new command `seqkit merge-slides`: merge sliding windows generated from seqkit sliding. [#390](https://github.com/shenwei356/seqkit/issues/390)
     - `seqkit stats`:
-        - a new flag `-N/--N` for appending other N50-like stats as new columns. [#393](https://github.com/shenwei356/seqkit/issues/393)
+        - added a new flag `-N/--N` for appending other N50-like stats as new columns. [#393](https://github.com/shenwei356/seqkit/issues/393)
         - added a progress bar for > 1 input files.
-        - write the result of each file immediately (no output buffer).
+        - write the result of each file immediately (no output buffer) when using `-T/--tabular`.
     - `seqkit translate`:
         - add options `-s/--out-subseqs` and `-m/--min-len` to write ORFs longer than `x` amino acids as individual records. [#389](https://github.com/shenwei356/seqkit/issues/389)
     - `seqkit sum`:
-        - do not remove possible '*' by default. Thanks to @photocyte. [#399](https://github.com/shenwei356/seqkit/issues/399)
+        - do not remove possible '*' by default and delete confusing warnings. Thanks to @photocyte. [#399](https://github.com/shenwei356/seqkit/issues/399)
         - added a progress bar for > 1 input files.
     - `seqkit pair`:
         - remove the restriction of requiring FASTQ format, i.e., FASTA files are also supported.
     - `seqkit seq`:
-        - update help message. [#387](https://github.com/shenwei356/seqkit/issues/387)
+        - update help messages. [#387](https://github.com/shenwei356/seqkit/issues/387)
     - `seqkit fxtab`:
         - faster alphabet computation (`-a/--alphabet`) with a new data structure. Thanks to @elliotwutingfeng [#388](https://github.com/shenwei356/seqkit/pull/388)
     - `seqkit subseq`:
diff --git a/doc/docs/download.md b/doc/docs/download.md
@@ -9,39 +9,26 @@ Please cite: **W Shen**, S Le, Y Li\*, F Hu\*. SeqKit: a cross-platform and ultr
 
 ## Current Version
 
-- [SeqKit v2.4.0](https://github.com/shenwei356/seqkit/releases/tag/v2.4.0) - 2023-03-17
-[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.4.0/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.4.0)
-    - `seqkit`:
-        - **support `bzip2` format**. [#361](https://github.com/shenwei356/seqkit/issues/361)
-        - support setting compression level for `gzip`, `zstd`, and `bzip2` format via `--compress-level`. [#320](https://github.com/shenwei356/seqkit/issues/320)
-        - the global flag `--infile-list` accepts stdin (`-`) now.
-        - wrap the help message of flags.
-    - `seqkit locate`:
-        - **do not remove embedded regions when searching with regular expressions**. [#368](https://github.com/shenwei356/seqkit/issues/368)
-    - `seqkit amplicon`:
-        - fix BED coordinates for amplicons found in the minus strand. [#367](https://github.com/shenwei356/seqkit/issues/367)
-    - `seqkit split`:
-        - fix forgetting to add extension for `--two-pass`. [#332](https://github.com/shenwei356/seqkit/issues/332)
+- [SeqKit v2.5.0](https://github.com/shenwei356/seqkit/releases/tag/v2.5.0) - 2023-07-16
+[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.5.0/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.5.0)
+    - new command `seqkit merge-slides`: merge sliding windows generated from seqkit sliding. [#390](https://github.com/shenwei356/seqkit/issues/390)
     - `seqkit stats`:
-        - fix compute Q1 and Q3 of sequence length for one record. [#353](https://github.com/shenwei356/seqkit/issues/353)
-    - `seqkit grep`:
-        - fix count number (`-C`) for matching with mismatch (`-m > 0`). [#370](https://github.com/shenwei356/seqkit/issues/370)
-    - `seqkit replace`:
-        - **add some flags to match partly records to edit**; these flags are transplanted from `seqkit grep`. [#348](https://github.com/shenwei356/seqkit/issues/348)
-    - `seqkit faidx`:
-        - **allow empty lines at the end of sequences**.
-    - `seqkit faidx/sort/shuffle/split/subseq`:
-        - **new flag `-U/--update-faidx`: update the FASTA index file if it exists, to guarantee the index file matches the FASTA files**. [#364](https://github.com/shenwei356/seqkit/issues/364)
-        - improve log info and update help message. [#365](https://github.com/shenwei356/seqkit/issues/365)
-    - `seqkit seq`: 
-        - allow filtering sequences of length zero. thanks to @penglbio.
-    - `seqkit rename`:
-        - new flag `-s/--separator` for setting separator between original ID/name and the counter (default "_"). [#360](https://github.com/shenwei356/seqkit/issues/360)
-        - new flag `-N/--start-num` for setting starting count number for duplicated IDs/names (default 2). [#360](https://github.com/shenwei356/seqkit/issues/360)
-        - new flag `-1/--rename-1st-rec` for renaming the first record as well. [#360](https://github.com/shenwei356/seqkit/issues/360)
-        - do not append space if there's no description after the sequence ID.
-    - `seqkit sliding`:
-        - new flag `-S/--suffix` for changing the suffix added to the sequence ID (default: "_sliding").
+        - added a new flag `-N/--N` for appending other N50-like stats as new columns. [#393](https://github.com/shenwei356/seqkit/issues/393)
+        - added a progress bar for > 1 input files.
+        - write the result of each file immediately (no output buffer) when using `-T/--tabular`.
+    - `seqkit translate`:
+        - add options `-s/--out-subseqs` and `-m/--min-len` to write ORFs longer than `x` amino acids as individual records. [#389](https://github.com/shenwei356/seqkit/issues/389)
+    - `seqkit sum`:
+        - do not remove possible '*' by default and delete confusing warnings. Thanks to @photocyte. [#399](https://github.com/shenwei356/seqkit/issues/399)
+        - added a progress bar for > 1 input files.
+    - `seqkit pair`:
+        - remove the restriction of requiring FASTQ format, i.e., FASTA files are also supported.
+    - `seqkit seq`:
+        - update help messages. [#387](https://github.com/shenwei356/seqkit/issues/387)
+    - `seqkit fxtab`:
+        - faster alphabet computation (`-a/--alphabet`) with a new data structure. Thanks to @elliotwutingfeng [#388](https://github.com/shenwei356/seqkit/pull/388)
+    - `seqkit subseq`:
+        - accept reverse coordinates in BED/GTF. [#392](https://github.com/shenwei356/seqkit/issues/392)
 
         
 ### Links
@@ -176,6 +163,39 @@ fish:
 
 ## Release history
 
+- [SeqKit v2.4.0](https://github.com/shenwei356/seqkit/releases/tag/v2.4.0) - 2023-03-17
+[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.4.0/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.4.0)
+    - `seqkit`:
+        - **support `bzip2` format**. [#361](https://github.com/shenwei356/seqkit/issues/361)
+        - support setting compression level for `gzip`, `zstd`, and `bzip2` format via `--compress-level`. [#320](https://github.com/shenwei356/seqkit/issues/320)
+        - the global flag `--infile-list` accepts stdin (`-`) now.
+        - wrap the help message of flags.
+    - `seqkit locate`:
+        - **do not remove embedded regions when searching with regular expressions**. [#368](https://github.com/shenwei356/seqkit/issues/368)
+    - `seqkit amplicon`:
+        - fix BED coordinates for amplicons found in the minus strand. [#367](https://github.com/shenwei356/seqkit/issues/367)
+    - `seqkit split`:
+        - fix forgetting to add extension for `--two-pass`. [#332](https://github.com/shenwei356/seqkit/issues/332)
+    - `seqkit stats`:
+        - fix compute Q1 and Q3 of sequence length for one record. [#353](https://github.com/shenwei356/seqkit/issues/353)
+    - `seqkit grep`:
+        - fix count number (`-C`) for matching with mismatch (`-m > 0`). [#370](https://github.com/shenwei356/seqkit/issues/370)
+    - `seqkit replace`:
+        - **add some flags to match partly records to edit**; these flags are transplanted from `seqkit grep`. [#348](https://github.com/shenwei356/seqkit/issues/348)
+    - `seqkit faidx`:
+        - **allow empty lines at the end of sequences**.
+    - `seqkit faidx/sort/shuffle/split/subseq`:
+        - **new flag `-U/--update-faidx`: update the FASTA index file if it exists, to guarantee the index file matches the FASTA files**. [#364](https://github.com/shenwei356/seqkit/issues/364)
+        - improve log info and update help message. [#365](https://github.com/shenwei356/seqkit/issues/365)
+    - `seqkit seq`:
+        - allow filtering sequences of length zero. thanks to @penglbio.
+    - `seqkit rename`:
+        - new flag `-s/--separator` for setting separator between original ID/name and the counter (default "_"). [#360](https://github.com/shenwei356/seqkit/issues/360)
+        - new flag `-N/--start-num` for setting starting count number for duplicated IDs/names (default 2). [#360](https://github.com/shenwei356/seqkit/issues/360)
+        - new flag `-1/--rename-1st-rec` for renaming the first record as well. [#360](https://github.com/shenwei356/seqkit/issues/360)
+        - do not append space if there's no description after the sequence ID.
+    - `seqkit sliding`:
+        - new flag `-S/--suffix` for changing the suffix added to the sequence ID (default: "_sliding").
 - [SeqKit v2.3.1](https://github.com/shenwei356/seqkit/releases/tag/v2.3.1) - 2022-09-22
 [![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.3.1/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.3.1)
     - `seqkit grep/locate`: fix bug of FMIndex building for empty sequences. [#321](https://github.com/shenwei356/seqkit/issues/321)
diff --git a/doc/docs/usage.md b/doc/docs/usage.md
@@ -158,7 +158,7 @@ reproduced in different environments with same random seed.
 ``` text
 SeqKit -- a cross-platform and ultrafast toolkit for FASTA/Q file manipulation
 
-Version: 2.4.0
+Version: 2.5.0
 
 Author: Wei Shen <shenwei356@gmail.com>
 
@@ -185,7 +185,7 @@ Compression level:
   bzip     1-9     6        https://github.com/dsnet/compress
 
 Usage:
-  seqkit [command] 
+  seqkit [command]
 
 Available Commands:
   amplicon        extract amplicon (or specific region around it) via primer(s)
@@ -204,6 +204,7 @@ Available Commands:
   head            print first N FASTA/Q records
   head-genome     print sequences of the first genome with common prefixes in name
   locate          locate subsequences/motifs, mismatch allowed
+  merge-slides    merge sliding windows generated from seqkit sliding
   mutate          edit sequence (point mutation, insertion, deletion)
   pair            match up paired-end reads from two fastq files
   range           print FASTA/Q records in a range (start:end)
@@ -266,7 +267,7 @@ Human genome from [ensembl](http://uswest.ensembl.org/info/data/ftp/index.html)
 - [`Homo_sapiens.GRCh38.84.gtf.gz`](ftp://ftp.ensembl.org/pub/release-84/gtf/homo_sapiens/Homo_sapiens.GRCh38.84.gtf.gz)
 - `Homo_sapiens.GRCh38.84.bed.gz` is converted from `Homo_sapiens.GRCh38.84.gtf.gz`
 by [`gtf2bed`](http://bedops.readthedocs.org/en/latest/content/reference/file-management/conversion/gtf2bed.html?highlight=gtf2bed)
-with command
+with the command
 
         zcat Homo_sapiens.GRCh38.84.gtf.gz \
             | gtf2bed --do-not-sort \
@@ -3899,6 +3900,10 @@ Flags:
 
 ```
 
+Example:
+
+    seqkit merge-slides sliding_windows.tsv -l 50 -o sliding_windows.merged.tsv
+
 ## genautocomplete
 
 Usage