Deterministic knee radiograph cropping for PNG images.
Supports bilateral (default) and unilateral knee radiography inputs.
Images can be provided via a CSV or directly with --images.
Installation
# from the repo root (where pyproject.toml lives)
pip install -e .
Requirements
Python ≥ 3.9
Dependencies are declared in pyproject.toml. For strict reproducibility, keep versions pinned.
Usage
Command Line (CLI)
A) With CSV
knee-crop \
--input-csv path/to/your.csv \
--base-path /path/to/image_root \
--output-dir out_csv
The CSV must contain a column of image paths, named png_path by default.
If your column has a different name, pass it with --path-col.
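To make the expected CSV layout concrete, here is a minimal sketch that writes a one-column input file with the standard library. The helper name write_input_csv and the example file names are hypothetical; only the png_path column name comes from this README.

```python
import csv
from pathlib import Path


def write_input_csv(csv_path, image_paths, path_col="png_path"):
    """Write a one-column CSV of image paths (relative to the image root)."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=[path_col])
        writer.writeheader()
        for image_path in image_paths:
            writer.writerow({path_col: str(image_path)})
    return Path(csv_path)


# Hypothetical paths, for illustration only.
if __name__ == "__main__":
    write_input_csv("knees.csv", ["site_a/knee1.png", "site_a/knee2.png"])
```

The resulting file can then be passed to --input-csv, with --base-path pointing at the folder that contains site_a.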
B) With images (no CSV)
# absolute paths (simplest)
knee-crop --images /abs/path/knee1.png /abs/path/knee2.png \
--base-path / \
--output-dir out_images
# relative paths (need base-path to resolve)
knee-crop --images knee1.png knee2.png \
--base-path /path/to/image_root \
--output-dir out_images
Unilateral example:
knee-crop --images /abs/path/knee.png \
--base-path / \
--output-dir out_uni \
--input-mode unilateral \
--unilateral-laterality right
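The two --images examples differ only in how paths combine with --base-path. The sketch below shows one plausible reading, assuming the tool joins base_path and each image path the way pathlib does. That behaviour is an assumption and is not confirmed by this README, and resolve_image_path is a hypothetical helper.

```python
from pathlib import PurePosixPath


def resolve_image_path(base_path, image_path):
    """Join image_path onto base_path (assumed behaviour, see note above)."""
    # With pathlib, an absolute right-hand side replaces the base entirely,
    # which is why "/" is a safe base when every image path is absolute.
    return PurePosixPath(base_path) / image_path


print(resolve_image_path("/path/to/image_root", "knee1.png"))
# /path/to/image_root/knee1.png
print(resolve_image_path("/", "/abs/path/knee1.png"))
# /abs/path/knee1.png
```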
Python API
from knee_cropping.pipeline import run_pipeline
from knee_cropping.config import CropConfig
# Bilateral (default, CSV-based)
run_pipeline(
input_csv="path/to/your.csv",
base_path="/path/to/image_root",
path_col="png_path",
output_dir="cropped_out",
n_processes=4,
log_to="console"
)
# Unilateral
run_pipeline(
input_csv="path/to/your_unilateral.csv",
base_path="/path/to/image_root",
output_dir="cropped_out_uni",
input_mode="unilateral",
unilateral_laterality="right"
)
# With images (no CSV)
run_pipeline(
input_csv=None,
base_path="/", # use "/" if absolute paths
output_dir="cropped_out_images",
images=["/abs/path/knee1.png", "/abs/path/knee2.png"],
input_mode="bilateral"
)
Logging
Choose the destination with log_to (Python) or --log-to (CLI):
"console" (default): warnings/errors printed live.
"file": full logs written to file in output folder.
"both": console + file.
Log file
Location: inside your output_dir
Name: knee_processing_YYYYMMDD_HHMM.log
What’s logged:
Run lifecycle per image (start/finish, timing)
File I/O (input path read, output paths written)
Preprocessing details (resize, trim, kernels, thresholds)
Bilateral split location
Per-knee steps (CLAHE, ROI, black-box handling, smoothing)
Warnings/errors per image
A tqdm progress bar shows in console. With multiprocessing, log lines may interleave.
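To locate a log after a run, the file name pattern above can be reproduced with strftime. This is a sketch based only on the documented name knee_processing_YYYYMMDD_HHMM.log; the helpers log_file_path and latest_log are hypothetical and are not part of the package.

```python
from datetime import datetime
from pathlib import Path


def log_file_path(output_dir, now=None):
    """Build the expected log path for a run started at `now`."""
    now = now or datetime.now()
    return Path(output_dir) / f"knee_processing_{now:%Y%m%d_%H%M}.log"


def latest_log(output_dir):
    """Return the newest log in output_dir, or None if there is none.

    The timestamp is zero-padded, so sorting by name sorts by time.
    """
    logs = sorted(Path(output_dir).glob("knee_processing_*.log"))
    return logs[-1] if logs else None
```

Because the name only has minute resolution, two runs started in the same minute with the same output_dir would map to the same file name.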
Hyperparameters
All hyperparameters are in knee_cropping/config.py (CropConfig).
Defaults reproduce the original pipeline. Example override:
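from knee_cropping.config import CropConfig
from knee_cropping.pipeline import run_pipeline
cfg = CropConfig(clahe_clip=3.0, sobel_threshold=120)
# replace ... with your usual input/output arguments
run_pipeline(..., config=cfg)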
| Name | Default | Description |
| ----------------------- | ------------- | -------------------------------------------- |
| `resize_height` | `1100` | Target image height before processing |
| `vertical_trim` | `50` | Trim top/bottom borders after resize |
| `sobel_threshold` | `150` | Edge magnitude threshold |
| `gaussian_kernel` | `(5,5)` | Gaussian blur kernel size |
| `sobel_ksize` | `3` | Sobel kernel size (odd int) |
| `split_band`            | `(0.30,0.60)` | Bilateral separator range (width fraction)   |
| `intensity_offset` | `100` | Right-knee intensity peak offset |
| `clahe_clip` | `2.0` | CLAHE clip limit |
| `clahe_tile` | `(16,16)` | CLAHE tile grid size |
| `start_row_frac` | `0.20` | ROI start row as fraction of height |
| `end_row_frac` | `0.80` | ROI end row as fraction of height |
| `black_box_offset` | `50` | Pixels to shift ROI away from black-box rows |
| `edge_threshold_min` | `100` | Minimum edge threshold |
| `savgol_window` | `21` | Savitzky–Golay smoothing window (odd) |
| `savgol_poly` | `3` | Savitzky–Golay polynomial order |
| `black_box_min_area_px` | `10000` | Minimum contour area for black-box detection |
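The fractional parameters in the table are easier to reason about as pixel indices. The sketch below converts the default start_row_frac, end_row_frac and split_band into rows and columns for a given image size. It assumes simple rounding of fraction times dimension; the rounding the pipeline actually uses is not documented here, and both helper names are hypothetical.

```python
def roi_rows(height, start_row_frac=0.20, end_row_frac=0.80):
    """Return (first_row, last_row) of the ROI for an image of this height."""
    return round(height * start_row_frac), round(height * end_row_frac)


def split_columns(width, split_band=(0.30, 0.60)):
    """Return the (left, right) column range considered for the bilateral split."""
    low, high = split_band
    return round(width * low), round(width * high)


print(roi_rows(1000))       # (200, 800)
print(split_columns(2000))  # (600, 1200)
```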
Notes
Bilateral: crops both knees separately (default).
Unilateral: processes the whole image as a single knee (--unilateral-laterality required).
Increase thresholds (e.g., sobel_threshold) to reduce noise; decrease to detect fainter edges.
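Larger kernels and smoothing windows (gaussian_kernel, sobel_ksize, savgol_window) make the result more robust to noise but less sensitive to fine detail.
If the knees are off-center, adjust split_band so that the range still covers the gap between them.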
For scans with prominent black-box markers, increase black_box_offset.
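When tuning, a quick sanity check on overrides can catch invalid combinations before a long run. The sketch below checks the constraints stated in the table (odd sobel_ksize and savgol_window), plus two that follow from how these parameters are normally used: a Savitzky-Golay polynomial order must be smaller than the window, and split_band should be an increasing pair within [0, 1]. validate_overrides is a hypothetical helper working on a plain dict, not part of CropConfig.

```python
def validate_overrides(overrides):
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []

    # Kernel and window sizes documented as odd in the table.
    for name in ("sobel_ksize", "savgol_window"):
        value = overrides.get(name)
        if value is not None and value % 2 == 0:
            problems.append(f"{name} must be odd, got {value}")

    # Fall back to the documented defaults when a value is not overridden.
    window = overrides.get("savgol_window", 21)
    poly = overrides.get("savgol_poly", 3)
    if poly >= window:
        problems.append(f"savgol_poly ({poly}) must be < savgol_window ({window})")

    low, high = overrides.get("split_band", (0.30, 0.60))
    if not 0.0 <= low < high <= 1.0:
        problems.append(f"split_band must satisfy 0 <= low < high <= 1, got {(low, high)}")

    return problems
```

For example, validate_overrides({"savgol_window": 20}) reports the even window, while an empty dict (all defaults) reports nothing.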
Performance & Parallelism
Set n_processes according to the number of CPU cores available.
More processes give higher throughput at the cost of higher CPU load.
Disk speed (SSD vs HDD) can become the bottleneck, in which case adding processes will not help.
With multiprocessing, logs may appear out of order.
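One way to pick a starting value for n_processes is to derive it from the core count. This is a rule-of-thumb sketch, not a recommendation from the package: suggest_n_processes is a hypothetical helper, and leaving one core free is an assumption that may not suit every machine.

```python
import os


def suggest_n_processes(n_images, reserve=1):
    """Suggest a worker count: at most one per image, leaving `reserve` cores free."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, min(n_images, cores - reserve))
```

The result can be passed as n_processes=suggest_n_processes(len(paths)) and then adjusted if the disk turns out to be the limiting factor.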
Troubleshooting
No outputs / early errors → confirm png_path values are valid relative to base_path.
Failures are logged in cropping_failures.csv inside the output dir.
Logging too noisy? Use --log-to console, which prints only warnings and errors. For full detail, use file or both.
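The first two checks above can be scripted. The sketch below lists CSV entries that do not resolve to a file under base_path, and reads cropping_failures.csv if it exists. Both helpers are hypothetical; the failure report is assumed to have a header row, and its column names are not documented here, so the code does not rely on any of them.

```python
import csv
from pathlib import Path


def find_missing_images(csv_path, base_path, path_col="png_path"):
    """Return path values from the CSV that are not files under base_path."""
    missing = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            if not (Path(base_path) / row[path_col]).is_file():
                missing.append(row[path_col])
    return missing


def load_failures(output_dir, filename="cropping_failures.csv"):
    """Read the failure report as a list of dicts; empty list if it is absent."""
    report = Path(output_dir) / filename
    if not report.is_file():
        return []
    with open(report, newline="") as handle:
        return list(csv.DictReader(handle))
```

Running find_missing_images before the pipeline, and load_failures after it, gives a quick view of which inputs never loaded and which failed during processing.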
License
MIT — see LICENSE.