-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
speed improvements to Aurora FX #4926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- converted to integer math, increasing speed on all ESPs, also shrinks code size - caching values to avoid repeated calculations - CRGBW instead or CRGB, adds white channel support when not using palette - fix for new brightness/gamma handling
WalkthroughRefactors the Aurora effect in Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45–75 minutes Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
wled00/FX.cpp (4)
4801-4803
: Be explicit with CRGBW construction to avoid ctor ambiguity.Prefer passing the W channel explicitly for clarity and future-proofing.
- CRGBW mixedRgb = CRGBW(backlight, backlight, backlight); + CRGBW mixedRgb = CRGBW(backlight, backlight, backlight, 0);
4682-4685
: Prefer constexpr over macros for type safety and scoping.These are compile-time constants;
constexpr uint32_t
avoids macro pitfalls and conveys intent.-#define AW_SHIFT 16 -#define AW_SCALE (1 << AW_SHIFT) // 65536 representing 1.0 +constexpr uint32_t AW_SHIFT = 16; +constexpr uint32_t AW_SCALE = (1u << AW_SHIFT); // 65536 representing 1.0
4694-4696
: Document/adjust fixed-point types to match “scaled by AW_SCALE” comment.
basealpha
andspeed_factor
are labeled “scaled by AW_SCALE” but stored inuint16_t
. This works due to the 99% cap, but leaves little headroom and invites accidental widening later. Consideruint32_t
for clarity and future adjustments, or tighten the comment to “Q16.16 value truncated to 16 bits”.
4789-4793
: Minor init flow tweak for safety on first frame.
update()
is called before checkingstillAlive()
. If backing storage isn’t zero-initialized on some path, uninitializedspeed_factor
could be used once. Not a bug today (allocation zeroes data), but safer to init-dead-first:- waves[i].update(SEGLEN, SEGMENT.speed); - if (!(waves[i].stillAlive())) { + if (!waves[i].stillAlive()) { waves[i].init(SEGLEN, SEGMENT.color_from_palette(hw_random8(), false, false, hw_random8(0, 3))); - } + } + waves[i].update(SEGLEN, SEGMENT.speed);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
wled00/FX.cpp
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
wled00/**/*.{cpp,h}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use spaces (2 per level) for C++ firmware files
Files:
wled00/FX.cpp
🧠 Learnings (6)
📓 Common learnings
Learnt from: DedeHai
PR: wled/WLED#4798
File: wled00/FX.cpp:7531-7533
Timestamp: 2025-08-26T11:51:21.817Z
Learning: In WLED PR #4798, DedeHai confirmed that certain gamma-related calls in FX.cpp/FX_fcn.cpp/particle systems are intentional for effect-level shaping (e.g., brightness curves, TV sim, Pride 2015 pre-mix), distinct from final output gamma. Do not flag or remove these in future reviews; add comments when feasible to clarify intent.
Learnt from: mval-sg
PR: wled/WLED#4876
File: wled00/xml.cpp:0-0
Timestamp: 2025-08-28T08:09:20.630Z
Learning: The WLED codebase has opportunities for refactoring hardcoded array bounds (like the "15" used for DMX channels) to use sizeof(array)/sizeof(array[0]) for more maintainable code, but such changes should be done consistently across the entire codebase in a dedicated refactoring effort.
Learnt from: BobLoeffler68
PR: wled/WLED#4891
File: wled00/FX.cpp:3333-3349
Timestamp: 2025-08-31T03:38:14.114Z
Learning: WLED PacMan effect (wled00/FX.cpp): Keep pacmancharacters_t position fields as signed int (not int16_t). Maintainer preference (blazoncek) prioritizes avoiding potential overhead/regressions over minor RAM savings. Avoid type shrinking here unless memory pressure is demonstrated.
Learnt from: netmindz
PR: wled/WLED#4819
File: wled00/FX.cpp:10654-10655
Timestamp: 2025-08-08T17:22:37.374Z
Learning: In WLED FX.cpp, um_data returned by getAudioData() is intentionally untyped because different usermods provide different payloads, and simulateSound() may supply dummy data when the AudioReactive usermod isn’t built. Strongly typed accessors are not feasible; prefer explicit casts (e.g., samplePeak as uint8_t at u_data[3]) or small helpers that encapsulate these casts.
Learnt from: netmindz
PR: wled/WLED#4819
File: wled00/FX.cpp:10654-10655
Timestamp: 2025-08-08T17:31:47.330Z
Learning: In WLED core, avoid introducing AR-specific helper wrappers for um_data access. Keep um_data untyped and, when reading samplePeak, prefer `(*(uint8_t*)um_data->u_data[3]) != 0` over `*(bool*)` to avoid alignment/aliasing issues, while staying decoupled from the AudioReactive usermod.
📚 Learning: 2025-08-26T11:51:21.817Z
Learnt from: DedeHai
PR: wled/WLED#4798
File: wled00/FX.cpp:7531-7533
Timestamp: 2025-08-26T11:51:21.817Z
Learning: In WLED PR #4798, DedeHai confirmed that certain gamma-related calls in FX.cpp/FX_fcn.cpp/particle systems are intentional for effect-level shaping (e.g., brightness curves, TV sim, Pride 2015 pre-mix), distinct from final output gamma. Do not flag or remove these in future reviews; add comments when feasible to clarify intent.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-08-31T03:38:14.114Z
Learnt from: BobLoeffler68
PR: wled/WLED#4891
File: wled00/FX.cpp:3333-3349
Timestamp: 2025-08-31T03:38:14.114Z
Learning: WLED PacMan effect (wled00/FX.cpp): Keep pacmancharacters_t position fields as signed int (not int16_t). Maintainer preference (blazoncek) prioritizes avoiding potential overhead/regressions over minor RAM savings. Avoid type shrinking here unless memory pressure is demonstrated.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-06-15T09:59:52.720Z
Learnt from: netmindz
PR: wled/WLED#4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-08-08T17:22:37.374Z
Learnt from: netmindz
PR: wled/WLED#4819
File: wled00/FX.cpp:10654-10655
Timestamp: 2025-08-08T17:22:37.374Z
Learning: In WLED FX.cpp, um_data returned by getAudioData() is intentionally untyped because different usermods provide different payloads, and simulateSound() may supply dummy data when the AudioReactive usermod isn’t built. Strongly typed accessors are not feasible; prefer explicit casts (e.g., samplePeak as uint8_t at u_data[3]) or small helpers that encapsulate these casts.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-09-02T19:42:57.475Z
Learnt from: freakintoddles2
PR: wled/WLED#4904
File: wled00/FX.cpp:3915-3930
Timestamp: 2025-09-02T19:42:57.475Z
Learning: WLED wled00/FX.cpp::mode_percent — For time-based options (% of Minute/Hour/Day), behavior is clock-like: at rollover (elapsed==0) intentionally show SEGLEN lit for one tick to represent completion, then move to 1 LED at the next second. This is by design (not a progress bar) and should not be “fixed”.
Applied to files:
wled00/FX.cpp
🧬 Code graph analysis (1)
wled00/FX.cpp (2)
wled00/fcn_declare.h (6)
hw_random16
(430-430)hw_random16
(431-431)hw_random16
(432-432)hw_random8
(433-433)hw_random8
(434-434)hw_random8
(435-435)wled00/colors.cpp (2)
color_add
(28-61)color_add
(28-28)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
wled00/FX.cpp (2)
4715-4727
: Clamping cached age factor resolves prior 32-bit overflow risk.Clamping ageFactor_cached to AW_SCALE-1 is a clean alternative to widening to 64-bit and addresses the earlier overflow chain. Nice.
4753-4772
: Signed bound checks fixed the left-side culling bug.Switching to signed comparisons for center vs. bounds correctly expires left-moving waves off-screen.
🧹 Nitpick comments (4)
wled00/FX.cpp (4)
4682-4685
: Good call moving to Q16 fixed-point; add a guard to prevent silent drift later.Consider adding a compile-time check so future refactors don’t accidentally change AW_SHIFT and break assumptions elsewhere.
#define AW_SHIFT 16 #define AW_SCALE (1 << AW_SHIFT) // 65536 representing 1.0 +static_assert(AW_SHIFT == 16, "Aurora fixed-point expects Q16 (AW_SHIFT == 16).");
4689-4697
: Types vs. scale comment mismatch (clarify intent).
- basealpha and speed_factor are Q16 fractions but stored in uint16_t. That’s fine with AW_SHIFT==16, yet the “scaled by AW_SCALE” comment suggests full 0..AW_SCALE which could mislead maintenance.
- Recommend clarifying comments to “Q16 fraction (< AW_SCALE)” to prevent future widenings changing behavior.
- uint16_t basealpha; // scaled by AW_SCALE - uint16_t speed_factor; // scaled by AW_SCALE + uint16_t basealpha; // Q16 fraction (< AW_SCALE) + uint16_t speed_factor; // Q16 fraction (< AW_SCALE)
4729-4746
: Per-LED path is branch-light and fixed-point correct; a couple of small nits.
- offsetFactor division relies on width ≥ 1 (it is). You can make it explicit to avoid future regressions.
- Early cull via wave_start/wave_end is great.
- uint32_t offsetFactor = offset / width; // scaled by AW_SCALE + // width is always >= 1 + uint32_t offsetFactor = (uint32_t)offset / (uint32_t)width; // Q16
4795-4800
: Backlight + inverse gamma: intent is clear; add a guardrails note.Inverse-gamma on a 0..3 backlight produces a subtle floor; good for visual consistency. A short comment that this is per-author intent (effect-level shaping) will help future reviewers avoid “optimizing” it away.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
wled00/FX.cpp
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
wled00/**/*.{cpp,h}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use spaces (2 per level) for C++ firmware files
Files:
wled00/FX.cpp
🧠 Learnings (7)
📓 Common learnings
Learnt from: DedeHai
PR: wled/WLED#4798
File: wled00/FX.cpp:7531-7533
Timestamp: 2025-08-26T11:51:21.817Z
Learning: In WLED PR #4798, DedeHai confirmed that certain gamma-related calls in FX.cpp/FX_fcn.cpp/particle systems are intentional for effect-level shaping (e.g., brightness curves, TV sim, Pride 2015 pre-mix), distinct from final output gamma. Do not flag or remove these in future reviews; add comments when feasible to clarify intent.
Learnt from: mval-sg
PR: wled/WLED#4876
File: wled00/xml.cpp:0-0
Timestamp: 2025-08-28T08:09:20.630Z
Learning: The WLED codebase has opportunities for refactoring hardcoded array bounds (like the "15" used for DMX channels) to use sizeof(array)/sizeof(array[0]) for more maintainable code, but such changes should be done consistently across the entire codebase in a dedicated refactoring effort.
Learnt from: BobLoeffler68
PR: wled/WLED#4891
File: wled00/FX.cpp:3333-3349
Timestamp: 2025-08-31T03:38:14.114Z
Learning: WLED PacMan effect (wled00/FX.cpp): Keep pacmancharacters_t position fields as signed int (not int16_t). Maintainer preference (blazoncek) prioritizes avoiding potential overhead/regressions over minor RAM savings. Avoid type shrinking here unless memory pressure is demonstrated.
Learnt from: netmindz
PR: wled/WLED#4819
File: wled00/FX.cpp:10654-10655
Timestamp: 2025-08-08T17:31:47.330Z
Learning: In WLED core, avoid introducing AR-specific helper wrappers for um_data access. Keep um_data untyped and, when reading samplePeak, prefer `(*(uint8_t*)um_data->u_data[3]) != 0` over `*(bool*)` to avoid alignment/aliasing issues, while staying decoupled from the AudioReactive usermod.
Learnt from: netmindz
PR: wled/WLED#4819
File: wled00/FX.cpp:10654-10655
Timestamp: 2025-08-08T17:22:37.374Z
Learning: In WLED FX.cpp, um_data returned by getAudioData() is intentionally untyped because different usermods provide different payloads, and simulateSound() may supply dummy data when the AudioReactive usermod isn’t built. Strongly typed accessors are not feasible; prefer explicit casts (e.g., samplePeak as uint8_t at u_data[3]) or small helpers that encapsulate these casts.
📚 Learning: 2025-08-26T11:51:21.817Z
Learnt from: DedeHai
PR: wled/WLED#4798
File: wled00/FX.cpp:7531-7533
Timestamp: 2025-08-26T11:51:21.817Z
Learning: In WLED PR #4798, DedeHai confirmed that certain gamma-related calls in FX.cpp/FX_fcn.cpp/particle systems are intentional for effect-level shaping (e.g., brightness curves, TV sim, Pride 2015 pre-mix), distinct from final output gamma. Do not flag or remove these in future reviews; add comments when feasible to clarify intent.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-08-31T03:38:14.114Z
Learnt from: BobLoeffler68
PR: wled/WLED#4891
File: wled00/FX.cpp:3333-3349
Timestamp: 2025-08-31T03:38:14.114Z
Learning: WLED PacMan effect (wled00/FX.cpp): Keep pacmancharacters_t position fields as signed int (not int16_t). Maintainer preference (blazoncek) prioritizes avoiding potential overhead/regressions over minor RAM savings. Avoid type shrinking here unless memory pressure is demonstrated.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-08-08T17:22:37.374Z
Learnt from: netmindz
PR: wled/WLED#4819
File: wled00/FX.cpp:10654-10655
Timestamp: 2025-08-08T17:22:37.374Z
Learning: In WLED FX.cpp, um_data returned by getAudioData() is intentionally untyped because different usermods provide different payloads, and simulateSound() may supply dummy data when the AudioReactive usermod isn’t built. Strongly typed accessors are not feasible; prefer explicit casts (e.g., samplePeak as uint8_t at u_data[3]) or small helpers that encapsulate these casts.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-06-15T09:59:52.720Z
Learnt from: netmindz
PR: wled/WLED#4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-08-28T08:09:20.630Z
Learnt from: mval-sg
PR: wled/WLED#4876
File: wled00/xml.cpp:0-0
Timestamp: 2025-08-28T08:09:20.630Z
Learning: The WLED codebase has opportunities for refactoring hardcoded array bounds (like the "15" used for DMX channels) to use sizeof(array)/sizeof(array[0]) for more maintainable code, but such changes should be done consistently across the entire codebase in a dedicated refactoring effort.
Applied to files:
wled00/FX.cpp
📚 Learning: 2025-09-02T19:42:57.475Z
Learnt from: freakintoddles2
PR: wled/WLED#4904
File: wled00/FX.cpp:3915-3930
Timestamp: 2025-09-02T19:42:57.475Z
Learning: WLED wled00/FX.cpp::mode_percent — For time-based options (% of Minute/Hour/Day), behavior is clock-like: at rollover (elapsed==0) intentionally show SEGLEN lit for one tick to represent completion, then move to 1 LED at the next second. This is by design (not a progress bar) and should not be “fixed”.
Applied to files:
wled00/FX.cpp
🧬 Code graph analysis (1)
wled00/FX.cpp (2)
wled00/fcn_declare.h (6)
hw_random16
(430-430)hw_random16
(431-431)hw_random16
(432-432)hw_random8
(433-433)hw_random8
(434-434)hw_random8
(435-435)wled00/colors.cpp (2)
color_add
(28-61)color_add
(28-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: wled_build / Build Environments (esp32dev)
- GitHub Check: wled_build / Build Environments (esp01_1m_full_compat)
- GitHub Check: wled_build / Build Environments (esp32s3dev_16MB_opi)
- GitHub Check: wled_build / Build Environments (esp32_wrover)
- GitHub Check: wled_build / Build Environments (esp32s3_4M_qspi)
- GitHub Check: wled_build / Build Environments (usermods)
- GitHub Check: wled_build / Build Environments (esp32s3dev_8MB_opi)
- GitHub Check: wled_build / Build Environments (nodemcuv2)
- GitHub Check: wled_build / Build Environments (nodemcuv2_160)
- GitHub Check: wled_build / Build Environments (esp32_eth)
- GitHub Check: wled_build / Build Environments (esp32c3dev)
- GitHub Check: wled_build / Build Environments (lolin_s2_mini)
- GitHub Check: wled_build / Build Environments (esp8266_2m_160)
🔇 Additional comments (3)
wled00/FX.cpp (3)
4703-4713
: Init math looks safe and overflow-aware.
- basealpha capped to 99% avoids the AW_SCALE×AW_SCALE corner.
- center initialization keeps the product < 2^32 even at max segment length, so int32_t center is OK.
4779-4785
: Allocation and first-call bootstrapping are robust.Calling update() on zeroed waves to force a quick “dead → init()” cycle avoids special-casing. Memory footprint stays bounded by aux1.
4802-4810
: Color accumulation path LGTM.Using color_add for CRGBW accumulation with per-wave pre-culling is efficient and visually safe (saturating add).
Just 2¢. Aurora is a slow effect, no need to rush it. 😉 |
yes, but 6FPS was a bit too slow for my taste ;) |
Converted all calculations to integer math. There is very little impact to visual appearance if at all: I was not able to spot any differences.
FPS improve accross all ESPs:
ESP32: ~2x speed improvement thanks to caching and avoiding float divisions
C3: ~6x speed improvement
ESP8266: ~2x speed improvement, maybe more - did only a quick test
RAM usage is the same: new class uses also 32bytes (code of old version states 24bytes but that was ignoring stuffing bytes, it was also 32 bytes)
Changes made:
Code size shrinks by ~300bytes (ESP32) or ~500 bytes on C3/ESP8266.
Can someone who is using this FX regularly please test?
Summary by CodeRabbit
New Features
Refactor
Bug Fixes