diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index e1648e9..e5ec53e 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -95,7 +95,11 @@ Then, we can run:
 ```sh
 samply record 'TODO(template) update with your binary e.g. ./target/release/...'
 ```
-This command will open a browser page that contains a graphic representation of where the time is being spent in our application. 
+This command will open a browser page that contains a graphic representation of where the time is being spent in our application.
+
+[Samply Kit](https://github.com/xrvdg/samply-kit) is a small toolkit for analysing and manipulating Samply and Firefox Profile data. It provides utilities for filtering stack traces and for aggregating sample counts per function:
+- Filtering is especially helpful when working with Rayon, for example. Rayon clutters stack traces with nested calls, and filtering those calls out gives you a clean stack trace again.
+- Aggregating sample counts per function helps in finding functions that look like small contributors in a regular flamegraph but are large contributors in aggregate. This is useful for finding mathematical routines, such as multiplications and hashes, that need to be optimised, or excessive memory operations that don't show up otherwise.
 
 ### Dhat
 We can add Dhat as a dependency:
@@ -124,5 +128,4 @@ Many other profiling libraries exist, please check the [Rust Performance Book](h
 But these 3 should be enough for the average application to identify bottlenecks and optimize them.
 
 For async-rust we also recommend: [Tracing](https://crates.io/crates/tracing), [Tokio-Console](https://crates.io/crates/tokio-console), and [Oha](https://crates.io/crates/oha).
-For Rayon-based parallel Rust code, we recommend Samply.
-It provides good profiling despite missing some multithreading details.
+For Rayon-based parallel Rust code, we recommend Samply in combination with [Samply Kit](https://github.com/xrvdg/samply-kit) to filter Rayon frames out of the stack traces.