-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Improvements to heap-memory and PSRAM handling #4791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Segment `allocateData()` uses more elaborate DRAM checking to reduce fragmentation and allow for larger setups to run on low heap - Segment data allocation fails if minimum contiguous block size runs low to keep the UI working - Increased `MAX_SEGMENT_DATA` to account for better segment data handling - Memory allocation functions try to keep enough DRAM for segment data - Added constant `PSRAM_THRESHOLD` to improve PSARM usage - Increase MIN_HEAP_SIZE to reduce risk of breaking UI due to low memory for JSON response - ESP32 makes use of IRAM (no 8bit access) for pixeluffers, freeing up to 50kB of RAM - Fix to properly get available heap on all platforms: added function `getFreeHeapSize()` - Bugfix for effects that divide by SEGLEN: don't run FX in service() if segment is not active -Syntax fix in AR: calloc() uses (numelements, size) as arguments
- added `allocate_buffer()` function that can be used to allocate large buffers: takes parameters to set preferred ram location, including 32bit accessible RAM on ESP32. Returns null if heap runs low or switches to PSRAM - getFreeHeapSize() and getContiguousFreeHeap() helper functions for all platforms to correctly report free useable heap - updated some constants - updated segment data allocation to free the data if it is large
WalkthroughThis set of changes refactors memory allocation and heap management across the codebase, introducing platform-specific allocation strategies, new buffer allocation flags, and wrapper functions for heap queries. Calls to direct heap size functions are replaced with new abstractions, and memory allocation is updated to prefer PSRAM or DRAM based on platform and buffer size. Several debug outputs and effect data management routines are updated to use these new mechanisms. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes
Possibly related PRs
Suggested reviewers
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (2)
🧰 Additional context used🧠 Learnings (2)📓 Common learnings
wled00/const.h (1)Learnt from: blazoncek 🔇 Additional comments (2)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
usermods/audioreactive/audio_reactive.cpp
(1 hunks)wled00/FX.h
(2 hunks)wled00/FX_fcn.cpp
(8 hunks)wled00/cfg.cpp
(1 hunks)wled00/const.h
(1 hunks)wled00/fcn_declare.h
(2 hunks)wled00/json.cpp
(2 hunks)wled00/util.cpp
(3 hunks)wled00/wled.cpp
(7 hunks)wled00/wled_server.cpp
(1 hunks)wled00/ws.cpp
(2 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: blazoncek
PR: wled/WLED#4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like `if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))` is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: DedeHai
PR: wled/WLED#4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.200Z
Learning: In WLED, maximum segment name length varies by platform:
- ESP8266: 32 characters (WLED_MAX_SEGNAME_LEN = 32)
- ESP32: 64 characters (WLED_MAX_SEGNAME_LEN = 64)
This platform difference can cause truncation when syncing longer names from ESP32 to ESP8266. Additionally, the WLED UI has limitations regarding modified maximum segment name lengths.
wled00/cfg.cpp (1)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
wled00/ws.cpp (3)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
wled00/json.cpp (6)
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: KrX3D
PR: #4664
File: wled00/syslog.cpp:50-51
Timestamp: 2025-04-26T12:51:04.993Z
Learning: In the WLED codebase, syslog configuration variables (syslogHost, syslogPort, syslogProtocol, syslogFacility, syslogSeverity) are defined as global variables in wled.h and are accessed directly in the SyslogPrinter implementation, rather than being class members.
Learnt from: blazoncek
PR: #4658
File: wled00/const.h:140-141
Timestamp: 2025-04-26T19:19:07.600Z
Learning: In WLED, the WLED_MAX_PANELS macro is intentionally defined as a fixed constant value (18) with no redefinition mechanism, making it "unoverridable" - there's no need for a static assertion to check its maximum value.
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.200Z
Learning: In WLED, maximum segment name length varies by platform:
- ESP8266: 32 characters (WLED_MAX_SEGNAME_LEN = 32)
- ESP32: 64 characters (WLED_MAX_SEGNAME_LEN = 64)
This platform difference can cause truncation when syncing longer names from ESP32 to ESP8266. Additionally, the WLED UI has limitations regarding modified maximum segment name lengths.
usermods/audioreactive/audio_reactive.cpp (1)
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
wled00/wled.cpp (3)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
wled00/fcn_declare.h (1)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
wled00/const.h (1)
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
wled00/FX.h (14)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: willmmiles
PR: #4712
File: wled00/FX_fcn.cpp:992-993
Timestamp: 2025-06-07T15:58:42.579Z
Learning: In WLED segments, the backing pixel buffer is always allocated to the full physical segment size, but effect rendering only uses a subset when mirror or grouping are enabled. For 2D segments, this subset is vWidth() * vHeight(), and for 1D segments it's vLength(). Color processing operations like fade_out should iterate over this rendering subset, not the full allocated buffer.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.200Z
Learning: In WLED, maximum segment name length varies by platform:
- ESP8266: 32 characters (WLED_MAX_SEGNAME_LEN = 32)
- ESP32: 64 characters (WLED_MAX_SEGNAME_LEN = 64)
This platform difference can cause truncation when syncing longer names from ESP32 to ESP8266. Additionally, the WLED UI has limitations regarding modified maximum segment name lengths.
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: blazoncek
PR: #4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the getSegment()
function guards against out-of-bounds segment IDs, and getFirstSelectedSegId()
falls back to getMainSegmentId()
if no segments are selected, ensuring no crashes when used through the setValuesFromFirstSelectedSeg()
macro.
Learnt from: blazoncek
PR: #4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the getSegment()
function guards against out-of-bounds segment IDs, and getFirstSelectedSegId()
falls back to getMainSegmentId()
if no segments are selected, ensuring no crashes when used through the setValuesFromFirstSelectedSeg()
macro.
Learnt from: blazoncek
PR: #4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the getSegment()
function guards against out-of-bounds segment IDs by falling back to the main segment ID, and getFirstSelectedSegId()
falls back to getMainSegmentId()
if no segments are selected, ensuring no crashes when used through the setValuesFromFirstSelectedSeg()
macro.
Learnt from: KrX3D
PR: #4585
File: usermods/seven_segment_display_reloaded_v2/seven_segment_display_reloaded_v2.h:121-136
Timestamp: 2025-04-27T09:37:28.415Z
Learning: Using PROGMEM for the seven-segment font array (umSSDRNumbers) in the WLED SSDR usermod causes compilation problems, so it should be left as a regular array.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:343-347
Timestamp: 2025-02-19T12:41:22.676Z
Learning: The setName() method in WLED's Segment class (FX_fcn.cpp) handles all necessary validation internally, including:
- Null pointer checks
- Length validation against WLED_MAX_SEGNAME_LEN
- Memory allocation/reallocation
No additional validation or name clearing is needed before calling setName().
Learnt from: DedeHai
PR: #4682
File: wled00/FX.cpp:8997-9005
Timestamp: 2025-05-09T18:43:15.355Z
Learning: In the WLED codebase, SEGMENT.custom3 is always constrained to the range 0-31 and will not exceed this range.
Learnt from: netmindz
PR: #4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
Learnt from: netmindz
PR: #4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
Learnt from: DedeHai
PR: #4682
File: wled00/FX.cpp:9016-9039
Timestamp: 2025-05-09T18:48:21.296Z
Learning: WLED does not support matrices large enough (≥1024 px) for 32-bit overflow in dxdx + dydy to be a practical concern in particle system effects.
wled00/util.cpp (4)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: KrX3D
PR: #4585
File: usermods/seven_segment_display_reloaded_v2/seven_segment_display_reloaded_v2.h:121-136
Timestamp: 2025-04-27T09:37:28.415Z
Learning: Using PROGMEM for the seven-segment font array (umSSDRNumbers) in the WLED SSDR usermod causes compilation problems, so it should be left as a regular array.
wled00/FX_fcn.cpp (13)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: willmmiles
PR: #4712
File: wled00/FX_fcn.cpp:992-993
Timestamp: 2025-06-07T15:58:42.579Z
Learning: In WLED segments, the backing pixel buffer is always allocated to the full physical segment size, but effect rendering only uses a subset when mirror or grouping are enabled. For 2D segments, this subset is vWidth() * vHeight(), and for 1D segments it's vLength(). Color processing operations like fade_out should iterate over this rendering subset, not the full allocated buffer.
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:343-347
Timestamp: 2025-02-19T12:41:22.676Z
Learning: The setName() method in WLED's Segment class (FX_fcn.cpp) handles all necessary validation internally, including:
- Null pointer checks
- Length validation against WLED_MAX_SEGNAME_LEN
- Memory allocation/reallocation
No additional validation or name clearing is needed before calling setName().
Learnt from: blazoncek
PR: #4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the getSegment()
function guards against out-of-bounds segment IDs, and getFirstSelectedSegId()
falls back to getMainSegmentId()
if no segments are selected, ensuring no crashes when used through the setValuesFromFirstSelectedSeg()
macro.
Learnt from: blazoncek
PR: #4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the getSegment()
function guards against out-of-bounds segment IDs, and getFirstSelectedSegId()
falls back to getMainSegmentId()
if no segments are selected, ensuring no crashes when used through the setValuesFromFirstSelectedSeg()
macro.
Learnt from: blazoncek
PR: #4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the getSegment()
function guards against out-of-bounds segment IDs by falling back to the main segment ID, and getFirstSelectedSegId()
falls back to getMainSegmentId()
if no segments are selected, ensuring no crashes when used through the setValuesFromFirstSelectedSeg()
macro.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: KrX3D
PR: #4585
File: usermods/seven_segment_display_reloaded_v2/seven_segment_display_reloaded_v2.h:121-136
Timestamp: 2025-04-27T09:37:28.415Z
Learning: Using PROGMEM for the seven-segment font array (umSSDRNumbers) in the WLED SSDR usermod causes compilation problems, so it should be left as a regular array.
Learnt from: netmindz
PR: #4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
Learnt from: KrX3D
PR: #4585
File: usermods/seven_segment_display_reloaded_v2/seven_segment_display_reloaded_v2.cpp:430-435
Timestamp: 2025-04-28T20:51:29.773Z
Learning: In WLED, bri
is a global variable used for brightness control.
Learnt from: DedeHai
PR: #4682
File: wled00/FX.cpp:9016-9039
Timestamp: 2025-05-09T18:48:21.296Z
Learning: WLED does not support matrices large enough (≥1024 px) for 32-bit overflow in dxdx + dydy to be a practical concern in particle system effects.
Learnt from: netmindz
PR: #4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
🧬 Code Graph Analysis (5)
wled00/cfg.cpp (1)
wled00/fcn_declare.h (2)
getFreeHeapSize
(586-586)getFreeHeapSize
(589-589)
wled00/ws.cpp (1)
wled00/fcn_declare.h (2)
getFreeHeapSize
(586-586)getFreeHeapSize
(589-589)
wled00/wled_server.cpp (1)
wled00/fcn_declare.h (2)
getFreeHeapSize
(586-586)getFreeHeapSize
(589-589)
wled00/fcn_declare.h (2)
wled00/wled.h (1)
size
(170-178)wled00/util.cpp (2)
allocate_buffer
(719-776)allocate_buffer
(719-719)
wled00/FX.h (1)
wled00/util.cpp (2)
allocate_buffer
(719-776)allocate_buffer
(719-719)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: wled_build / Build Environments (usermods)
- GitHub Check: wled_build / Build Environments (esp32s3_4M_qspi)
- GitHub Check: wled_build / Build Environments (esp32_wrover)
- GitHub Check: wled_build / Build Environments (esp32s3dev_8MB_opi)
- GitHub Check: wled_build / Build Environments (lolin_s2_mini)
- GitHub Check: wled_build / Build Environments (esp32c3dev)
- GitHub Check: wled_build / Build Environments (esp32s3dev_16MB_opi)
- GitHub Check: wled_build / Build Environments (esp32_eth)
- GitHub Check: wled_build / Build Environments (esp32dev_V4)
- GitHub Check: wled_build / Build Environments (esp8266_2m_compat)
- GitHub Check: wled_build / Build Environments (esp32dev)
- GitHub Check: wled_build / Build Environments (esp01_1m_full_compat)
- GitHub Check: wled_build / Build Environments (esp8266_2m_160)
- GitHub Check: wled_build / Build Environments (nodemcuv2_compat)
- GitHub Check: wled_build / Build Environments (esp01_1m_full_160)
- GitHub Check: wled_build / Build Environments (esp01_1m_full)
- GitHub Check: wled_build / Build Environments (nodemcuv2_160)
- GitHub Check: wled_build / Build Environments (nodemcuv2)
- GitHub Check: wled_build / Build Environments (esp8266_2m)
🔇 Additional comments (36)
wled00/cfg.cpp (1)
204-204
: LGTM! Good platform abstraction improvement.The replacement of
ESP.getFreeHeap()
withgetFreeHeapSize()
provides better platform consistency and aligns with the broader memory management improvements in this PR.wled00/ws.cpp (2)
127-128
: LGTM! Consistent platform abstraction for heap queries.The replacement of
ESP.getFreeHeap()
withgetFreeHeapSize()
maintains the same memory management logic while improving platform compatibility. The heap size measurement and debug logging functionality remains unchanged.
137-138
: LGTM! Proper heap monitoring after buffer allocation.The changes maintain the critical memory validation logic that ensures WebSocket buffer allocation succeeded on ESP8266. Using
getFreeHeapSize()
instead ofESP.getFreeHeap()
improves platform abstraction without affecting the memory management strategy.usermods/audioreactive/audio_reactive.cpp (1)
226-229
: No other reversed calloc() argument instances foundI’ve verified that all calloc() calls use the correct
calloc(count, size)
order, including the FFT buffers in usermods/audioreactive/audio_reactive.cpp. No additional misordered calls were detected across the codebase.wled00/wled_server.cpp (1)
369-369
: LGTM - Clean abstraction of heap size queryThe replacement of
ESP.getFreeHeap()
withgetFreeHeapSize()
properly abstracts the platform-specific heap query and aligns with the broader memory management refactoring across the codebase.wled00/json.cpp (3)
815-815
: LGTM - Standardizes ESP32 max allocation heap queryThe replacement of
ESP.getMaxAllocHeap()
withgetContiguousFreeHeap()
provides a consistent cross-platform interface for querying the largest contiguous heap block available for allocation.
826-826
: LGTM - Standardizes ESP8266 max free block queryThe replacement of
ESP.getMaxFreeBlockSize()
withgetContiguousFreeHeap()
unifies the API for querying maximum contiguous heap across ESP32 and ESP8266 platforms.
832-832
: LGTM - Consistent with heap query abstractionThe replacement of
ESP.getFreeHeap()
withgetFreeHeapSize()
maintains the same functionality while using the new standardized heap query interface.wled00/wled.cpp (9)
171-171
: LGTM - Consistent heap query in connection handlingThe replacement of
ESP.getFreeHeap()
withgetFreeHeapSize()
maintains functionality while using the new standardized heap query interface.
243-247
: Enhanced debug output provides valuable memory diagnosticsThe improved debug output now shows both total free heap and contiguous free heap, plus ESP32-specific 32-bit heap information. This provides much better visibility into memory fragmentation and allocation patterns, which is essential for the PSRAM handling improvements.
373-373
: LGTM - Standardized heap reporting in setupThe replacement maintains the same debug functionality while using the new heap query abstraction.
401-401
: LGTM - Consistent with memory management refactoring
431-431
: LGTM - Heap monitoring after config deserialization
443-443
: LGTM - Heap monitoring after strip initialization
448-448
: LGTM - Heap monitoring after usermod setup
513-513
: LGTM - Heap monitoring after server initialization
519-519
: LGTM - Heap monitoring after IR initializationwled00/const.h (2)
549-554
: LGTM! Platform-appropriate minimum heap thresholds.The defined values align well with the existing
WLED_REQUEST_MIN_HEAP
constants and provide appropriate buffers for UI responsiveness while accounting for ESP8266's more limited RAM.
556-563
: Well-designed platform-specific PSRAM thresholds.The threshold values are appropriately tailored to each platform's capabilities:
- ESP32S3 (5120): Highest threshold for best PSRAM support
- ESP32 (4096): Good balance for standard ESP32
- Others (1024): Appropriate fallback, correctly noting C3/ESP8266 lack PSRAM
The allocation strategy comments clearly explain the heap-based decision logic.
wled00/FX.h (3)
91-91
: Accurate comment update reflecting new memory strategy.The updated comment correctly describes the heap memory limit approach and PSRAM reservation strategy for pixel buffers.
95-101
: Appropriate increases to segment data limits.The increased
MAX_SEGMENT_DATA
values properly scale with platform capabilities:
- ESP8266: 10KB (conservative for limited RAM)
- ESP32S2: 20KB (balanced for S2's constraints)
- Other ESP32: 60KB (leveraging full capabilities)
These increases align with the PR objective of supporting larger pixel setups while maintaining platform-appropriate limits.
602-603
: Excellent transition to new buffer allocation system.The change from
d_calloc
toallocate_buffer
is well-implemented:
BFRALLOC_PREFER_PSRAM
: Aligns with PR goal of PSRAM utilizationBFRALLOC_NOBYTEACCESS
: Appropriate for 32-bit pixel dataBFRALLOC_CLEAR
: Maintains zero-initialization behavior- Informative comment about minimal FPS impact (<2%) with PSRAM usage
This properly integrates with the enhanced memory management strategy described in the PR objectives.
wled00/fcn_declare.h (2)
585-591
: Well-designed heap query abstractions.The platform-specific implementations correctly use heap_caps functions for ESP32 variants to get accurate DRAM-only measurements, while falling back to ESP functions for ESP8266. The inline declarations ensure optimal performance.
592-598
: Clean and extensible allocation flag design.The bit flag pattern allows flexible combination of allocation preferences. The naming is clear and consistent, making the API intuitive for users.
wled00/util.cpp (4)
648-658
: Well-implemented calloc variants.The p_calloc implementation correctly mirrors the p_malloc logic while properly handling zero-initialization through heap_caps_calloc_prefer.
688-696
: Conservative DRAM threshold for calloc operations.Good design choice using
3*MIN_HEAP_SIZE
threshold (vs 2x for malloc) as calloc operations typically involve larger allocations that require zeroing.
624-634
: PSRAM_THRESHOLD correctly defined and allocation strategy is sound
- PSRAM_THRESHOLD is defined in wled00/const.h for each target (ESP32S3: 5120, ESP32: 4096, others: 1024).
- MIN_HEAP_SIZE is defined (8 KB or 12 KB) in const.h as expected.
- The p_malloc logic using these thresholds operates independently of Segment::allocateData() optimizations and poses no conflict.
718-776
: Review allocation alignment and heap exhaustion handling.Several observations about the
allocate_buffer
implementation:
Potential alignment issue (line 728): Casting to
uint32_t*
assumes 4-byte alignment, but the size parameter might not be a multiple of 4.Heap exhaustion risk: The MIN_HEAP_SIZE check occurs after allocation (line 767), which could temporarily exhaust the heap and affect system stability.
Consider these improvements:
void *allocate_buffer(size_t size, uint32_t type) { void *buffer = nullptr; + // Pre-check heap to avoid temporary exhaustion + if (getFreeHeapSize() < MIN_HEAP_SIZE + size) { + return nullptr; + } #if defined(ESP8266) || defined(CONFIG_IDF_TARGET_ESP32C3) buffer = d_malloc(size); #else #ifdef CONFIG_IDF_TARGET_ESP32 if(type & BFRALLOC_NOBYTEACCESS) { + // Ensure size is aligned to 4 bytes for 32-bit access + size_t aligned_size = (size + 3) & ~3; - buffer = static_cast<uint32_t*>(heap_caps_malloc(size, MALLOC_CAP_INTERNAL | MALLOC_CAP_32BIT)); + buffer = heap_caps_malloc(aligned_size, MALLOC_CAP_INTERNAL | MALLOC_CAP_32BIT);⛔ Skipped due to learnings
Learnt from: blazoncek PR: wled/WLED#4667 File: usermods/user_fx/user_fx.cpp:27-30 Timestamp: 2025-04-30T05:41:03.633Z Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like `if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))` is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: DedeHai PR: wled/WLED#4750 File: usermods/audioreactive/audio_reactive.cpp:311-321 Timestamp: 2025-06-27T06:04:44.652Z Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
wled00/FX_fcn.cpp (8)
71-80
: LGTM! Good PSRAM utilization for pixel buffers.The implementation correctly uses the new
allocate_buffer
function with appropriate flags for pixel buffer allocation, and properly handles allocation failures.
113-122
: Consistent implementation with copy constructor.The copy assignment operator maintains the same allocation strategy and error handling as the copy constructor, which is good for code consistency.
216-220
: Good implementation of smart buffer deallocation strategy.The code correctly implements the memory management strategy by deallocating large buffers to free memory while retaining small buffers to reduce fragmentation. The debug output is helpful for monitoring buffer management.
469-480
: Smart approach to avoid heap fragmentation.Good decision to use
free()
instead ofrealloc()
to prevent blocking larger heap segments. The pixel buffer allocation correctly uses PSRAM preference consistent with other pixel buffer allocations.
1214-1214
: Correct use of new heap abstraction function.The code properly uses the new
getFreeHeapSize()
wrapper function for platform-independent heap size queries.
1226-1230
: Appropriate PSRAM enforcement for main pixel buffer.The code correctly enforces PSRAM allocation for the main pixel buffer, which aligns with the PR's memory optimization goals. The buffer is properly cleared on allocation and the debug output uses the new heap abstraction.
1274-1274
: Important bugfix for segment transition safety.Good addition to check if the old segment is active before running its effect. This prevents potential crashes from running effects on inactive segments with invalid parameters.
147-191
: All memory-management logic looks solid—constant is defined
FAIR_DATA_PER_SEG
is declared inwled00/FX.h
(#define FAIR_DATA_PER_SEG (MAX_SEGMENT_DATA / WS2812FX::getMaxSegments())
), so no undefined-symbol issue.LGTM.
wled00/fcn_declare.h
Outdated
#ifdef CONFIG_IDF_TARGET_ESP32 | ||
void *pixelbuffer_malloc(size_t size, bool enforcePSRAM = false); // prefer IRAM for pixel buffers if possible | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify the implementation of pixelbuffer_malloc
.
The function pixelbuffer_malloc
is declared here but I need to verify its implementation exists in the codebase.
🏁 Script executed:
#!/bin/bash
# Search for the implementation of pixelbuffer_malloc
ast-grep --pattern 'void *pixelbuffer_malloc(size_t $_, $_) {
$$$
}'
# Also search with ripgrep for any definition
rg -A 5 'pixelbuffer_malloc.*\{' --type cpp
Length of output: 118
🏁 Script executed:
#!/usr/bin/env bash
# Search for any declaration or definition of pixelbuffer_malloc across the repository
rg -n 'pixelbuffer_malloc' .
Length of output: 166
Missing implementation for pixelbuffer_malloc
The function pixelbuffer_malloc(size_t, bool)
is declared under #ifdef CONFIG_IDF_TARGET_ESP32
in wled00/fcn_declare.h:555
but no matching definition exists in the repository. This will cause linker errors when building for ESP32.
• Add the corresponding function definition (for example in wled00/fcn_declare.cpp
or another appropriate source file).
• Ensure the implementation handles the enforcePSRAM
flag and IRAM allocation preference correctly.
🤖 Prompt for AI Agents
In wled00/fcn_declare.h around lines 554 to 556, the function pixelbuffer_malloc
is declared but lacks a corresponding implementation, causing linker errors for
ESP32 builds. Add the function definition in an appropriate source file like
wled00/fcn_declare.cpp, implementing the logic to allocate memory for pixel
buffers with preference for IRAM and correctly handling the enforcePSRAM flag to
decide whether to allocate from PSRAM or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO you overestimated the free ram on S2 and ESP8266.
#elif defined(CONFIG_IDF_TARGET_ESP32S2) | ||
#define MAX_NUM_SEGMENTS 20 | ||
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*512) // 10k by default (S2 is short on free RAM) | ||
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*1024) // 20k by default (S2 is short on free RAM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not cut it on every build for S2. Mine is one example, having only about 20-30k of free heap after everything is loaded and operational.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its no longer used in the same way: it now tries to keep this much ram free for FX data if there is PSRAM, otherwise it will limit the use to this amount but if heap runs low, it will respect minheap. so even if you set this to 100k nothing bad happens. one exception (which is no different in current implementation): if a usermod or something else uses heap to deplete it.
#ifdef ESP8266 | ||
#define MAX_NUM_SEGMENTS 16 | ||
/* How much data bytes all segments combined may allocate */ | ||
#define MAX_SEGMENT_DATA 5120 | ||
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*640) // 10k by default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a lot (and I mean a lot) for ESP8266.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you can give me a scenario that is critical in ram usage, I can test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 usermods (PIR, temperature, 4LD, AR) and 16x16 matrix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works fine:
Free heap: 16384, Contiguous: 11344
Main:
Free heap: 15720, Contiguous: 7272
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you enable MQTT and use GPIO3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, I dont have the hardware to do that, you'd need to check with your setup.
edit:
since this is all about free heap : if it works in main, I see no reason it should not work with this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the condition to delete and reallocate of the segmentdata should not be FAIR_DATA_PER_SEG
but something a bit larger, might help with fragmentation on ESP8266
@troyhacks - You might be interested in this |
Absolutely - I've been doing this sort of thing in all my personal WLED-MM builds for the last year and agree with the findings. 😁 PSRAM is the solution for more pixels and the speed decrease is negligible - especially compared to it not working. 🤣 @DedeHai has done it more in-depth than my liberal sprinkling of |
I've sifted through code and can say that these are solid changes. However ... Please use space after the keywords. I.e. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
wled00/util.cpp (1)
716-776
: Please verify the SOC_DRAM_HIGH comparison and add documentation.The new allocate_buffer function implements a robust multi-region allocation strategy, but the check at line 728 may be inverted:
- It currently does
Please confirm against the actual definitions of SOC_DRAM_HIGH (and SOC_IRAM_LOW) that this condition correctly detects when the pointer lies outside the desired 32-bit DRAM region. It may need to beif ((uintptr_t)buffer < SOC_DRAM_HIGH) { // “buffer did not fit into 32-bit DRAM region…” }>=
instead.Additionally, this entire allocation flow (flags, fallbacks, safety heap check) is complex—please add a doc comment above allocate_buffer explaining:
- The meaning of each BFRALLOC_* flag
- Which memory regions each path targets
- How the MIN_HEAP_SIZE safeguard works
File to update:
- wled00/util.cpp (around lines 720–740 for the macro check, and the function header around line 716 for documentation)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
wled00/FX_fcn.cpp
(8 hunks)wled00/fcn_declare.h
(2 hunks)wled00/util.cpp
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- wled00/fcn_declare.h
- wled00/FX_fcn.cpp
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: blazoncek
PR: wled/WLED#4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like `if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))` is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: DedeHai
PR: wled/WLED#4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
wled00/util.cpp (4)
Learnt from: blazoncek
PR: #4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))
is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: blazoncek
PR: #4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: DedeHai
PR: #4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: KrX3D
PR: #4585
File: usermods/seven_segment_display_reloaded_v2/seven_segment_display_reloaded_v2.h:121-136
Timestamp: 2025-04-27T09:37:28.415Z
Learning: Using PROGMEM for the seven-segment font array (umSSDRNumbers) in the WLED SSDR usermod causes compilation problems, so it should be left as a regular array.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: wled_build / Build Environments (lolin_s2_mini)
- GitHub Check: wled_build / Build Environments (esp32c3dev)
- GitHub Check: wled_build / Build Environments (usermods)
- GitHub Check: wled_build / Build Environments (nodemcuv2_compat)
- GitHub Check: wled_build / Build Environments (esp32s3dev_16MB_opi)
- GitHub Check: wled_build / Build Environments (esp32_wrover)
- GitHub Check: wled_build / Build Environments (esp32s3dev_8MB_opi)
- GitHub Check: wled_build / Build Environments (esp32s3_4M_qspi)
- GitHub Check: wled_build / Build Environments (esp01_1m_full_160)
- GitHub Check: wled_build / Build Environments (esp8266_2m_compat)
- GitHub Check: wled_build / Build Environments (esp01_1m_full_compat)
- GitHub Check: wled_build / Build Environments (esp32dev)
- GitHub Check: wled_build / Build Environments (esp32dev_V4)
- GitHub Check: wled_build / Build Environments (esp32_eth)
- GitHub Check: wled_build / Build Environments (nodemcuv2)
- GitHub Check: wled_build / Build Environments (nodemcuv2_160)
- GitHub Check: wled_build / Build Environments (esp8266_2m)
- GitHub Check: wled_build / Build Environments (esp01_1m_full)
- GitHub Check: wled_build / Build Environments (esp8266_2m_160)
🔇 Additional comments (2)
wled00/util.cpp (2)
624-658
: LGTM! Improved PSRAM allocation strategy with intelligent threshold-based decisions.The updated allocation functions now use explicit
MALLOC_CAP_INTERNAL
instead ofMALLOC_CAP_DEFAULT
and implement smart threshold logic that prefers DRAM for small allocations when heap is abundant, while falling back to PSRAM when needed. The newp_calloc
function maintains consistency with the existing allocation pattern.
688-696
: Verify the different heap threshold in d_calloc.The
d_calloc
function uses3*MIN_HEAP_SIZE
as the threshold whiled_malloc
andd_realloc
use2*MIN_HEAP_SIZE
. Is this intentional due to the additional overhead of zero-initialization, or should it be consistent?
Using my development ESP I get the following (excerpt):
When UI comes to life there is about 80k heap free.
|
heap is reported incorrectly on ESP32 unless you use this PR. |
That's this PR's code. ESP8266
|
Yes, pulled to my dev. |
ESP32 looks similar to what I saw in tests, heap should be increased by about 35k minus all ledbuffers |
ESP32 with PSRAM (rev.3) reports 75k/3972k heap/PSRAM. 32x32 matrix, 3 usermods, no MQTT. |
Debug will tell you what's happening. I would guess output failed to initialise. |
- Bugfix: `allocate_buffer` did double allocations because of a fallback else that should not be there: if no type is defined, should not allocate anything - Bugfix2: same place: checked for minheap instead of min cotiguous heap - updated comments to make the intent of each memory type clearer
Just for posterity: My tests with WROVER rev.3 ESP32 (4MB PSRAM) showed that FPS is running at 26 for 64x32 matrix using effect (Soap, Octopus) that uses effect data buffer. With buffers in DRAM FPS is at stable 42. |
Can you clarify: which buffers were in PSRAM? I also saw a FPS drop if effect-data is in PSRAM but only minor influence if just the pixel-buffers are in PSRAM. Since I have no ESP32 with PSRAM (Only S2, S3) this would be interesting to distinguish. |
Pixel buffers (effect data is always in DRAM). |
thanks for the clarification. This means that for ESP32 maybe the parameters need to be adjusted to keep pixelbuffers in DRAM and use PSRAM more carefully. |
@DedeHai I have to take my words back. It looks like my tests were flawed. EDIT: This is for rev.3 ESP32. rev.1 is still pending. |
@blazoncek There's also "things" that I've found during my extensive (and poorly documented) performance testing that were less intuitive. I'd have to rerun the tests (and actually write them down this time), but in general I've noticed some random combinations of compiler optimizations and other build alternations that caused bigger FPS losses than I'd expected. I remain positive on PSRAM because "works" versus "crash" is often down to just having enough of any kind of RAM. I haven't seen major impacts on speed, but also I approve of a lower speed over it crashing. Sure, 21845 pixels isn't the fastest on an original ESP32 CPU, but it works just fine when you give it enough RAM. (21845 isn't a random number. That's the actual max limit when you have 16-bit integers and RGB pixels. 😁 My ESP32-P4 build has mostly been modified to break that barrier - but I don't have 22000+ pixels at home so I can only test at the venue where we have an installation of that size in active use, ha.) |
Thanks @troyhacks for the input. I have been using code from this PR in my own branch (with a few modifications of my own) and only have positive experience. However I only tested set-up of 64x32 on rev.3 ESP32 WROVER (4MB PSRAM) and QSPI S3 (2MB PSRAM). FPS drop was barely noticeable when using PSRAM or IRAM and free heap was improved by the size of buffers which is good. IMO the minimum acceptable FPS for any effect is about 25-30, lower than that and you can see the lag. |
I will update this PR with some of the insights from @blazoncek but I need to wait on my WROVER to actually test the code. |
One of the things we tried with WLED-MM was to allocate the biggest memory spaces we thought we'd need right off the top at boot to avoid fragmentation. Then if we ended up needing more we'd reallocate... which could fail and crash - but on boot up it would allocate the new bigger memory spaces on boot. And you likely already know this, but there's Espressif code for checking "biggest allocation we can do with current fragmentation in a required heap type" which is much more reliable than "free memory" which absolutely lies to you because of fragmentation. This code also catches your heap issues and gives you a better sense of what's going on. I found a few spots during debugging when something tries to allocate a ton of RAM for no good reason. https://github.com/troyhacks/WLED/blob/mdev/wled00%2Fwled00.ino#L14 |
allocating a large chunk at bootup will only contribute to fragmentation in current AC implementation with large buffers for each segment. I ran many (many) tests and found that its hard if not impossible to try to control malloc(), that is why I tried defragmenting and that works much better than trying to be very smart about allocations, those are notoriously unpedictable due to wifi stack and other background processes constantly shuffeling heap around. |
I think what's being suggested is moving segment/data/pixel buffer allocation to a separate heap from the main malloc() heap, reserving the maximum amount of space available to use at bootup. The buffer heap can use a different allocator, and could potentially even do clever things like using "smarter" pointers to support moving buffers around to defragment internally. In fact, I think it might be possible to leverage the existing IDF multi heap logic by reserving some memory at compile time, then passing it in as new heap blocks via |
I did not know about |
I played around with |
Yup, that's exactly the tradeoff -- any reservation for pixel buffers we do has to guess at how much memory to leave for the rest of the system. The up side is that if it's correct, we won't accidentally OOM the core system when managing pixel buffers; the down side is that if it's wrong, the system is always unstable -- and this is especially true if it's just a little bit wrong. I don't feel strongly either way on this one, just throwing out all the options before we go to heroic efforts to tame the default allocator. Another task in the "memory management" corner is to figure out what's going on with the web server usage, both at rest and under load. Some systems (definitely seen this on S2s) seem to have an unexpectedly large at-rest allocation when the network comes up. On other systems (baseline ESP32) I've seen dynamic heap usage of over 60kb during a web transaction -- all released once the transaction is completed. I don't know where it's all going yet. ESP8266 can produce a web response with <10k of RAM -- what's the issue on the newer platforms? And of course graceful OOM handling is still a known issue. I've fixed a couple of cases in the web server since 2023, but I don't think I've got them all, and the "is it safe to handle or at least queue this request" logic is still weak. Also I have one last OOM-handling patch to upstream to AsyncTCP (pre-allocating error notification event objects, so they can't be dropped and leak connection handles). |
if the "UI stops responding if heap runs low" can be fixed, there is no need to reserve any memory. What I am seeing is that whenever heap runs below ~8k of contiguous free block-size I keep getting errors and it never recovers, even if the OOM event is just for a brief moment. |
quick test on ESP32 shows that reserving heap yields much worse results than not doing so. If reserving just a few kB too little the UI and especially settings do not work at all (settings are not saved or even reverted, UI stops responding ever so often). |
Investigating in more detail is on my to-do list. (I should probably open some issues for these to track them...) I don't currently consider it a burning issue -- if it breaks when you allocate too many LEDs/segments, thus far we've been able to get away with saying "just don't do that then". Truthfully, I'm not 100% sure it's solvable without going to IDF v5 and working with Espressif on it -- last I looked at it, while LwIP handles allocation failure gracefully, but IIRC the WiFi hardware driver tended to get confused and lock up. (I have a vague recollection when testing web server code that older versions would just immediately |
Good enough for me. We really could use some way of talking back in configuration error cases. "Not saving" is usually nothing more than "I couldn't apply that config for some reason, so I didn't." It was a little frustrating in the RMT-HI development as broken bus code could make the configuration "stuck" and unable to get out of a bad state. |
One possible way to tackle OOM issue is to prevent creation of new segments if available heap drops too low. That should also include starting transitions if enabled. This would also be the "cleanest" solution IMO. The only problem I see with this approach is the determination of "heap too low" condition as AsynTCP/AWS can eat up a hefty amount of heap while processing a request. Which is random by its nature. |
@blazoncek do you know how the |
|
So are you saying its a useless parameter and it should be a #ifdef? |
If you say so. 😄 |
Using an ifdef would save some flash. All p_alloc functions can be skipped |
Besides better use of PSRAM and free heap handling in general there is a huge improvement for ESP32: with these changes it can work up to 64x64 pixels reliably without PSRAM (UI starts failing at half that size currently). Even got it to work up to about 80x64 pixels but UI access becomes unstable
allocateData()
uses more elaborate DRAM checking to reduce fragmentation and allow for larger setups to run on low heapallocateData()
does not limit amount if PSRAM is available (put segment data in PSRAM if DRAM is depleted) -> prefer functionality over speedMAX_SEGMENT_DATA
to account for better segment data handlingPSRAM_THRESHOLD
to improve PSARM usage: anything smaller is kept in DRAM, anything larger is put in PSRAM depending on how much heap is available. If heap is plenty, no PSRAM is used_pixels
is always put in PSRAM if available (except if it fits in ESP32's 32bit IRAM)getFreeHeapSize()
andgetContiguousFreeHeap()
the code may need some polishing (add debug outputs) and some fine-tuning of parameters as I did only test for borderline large 2D setups to see how well it performs under memory starvation but tested on all platforms.
There is a significant improvement if a board has PSRAM: there is virtually no limit to effect data and the heap is kept healthy and it is very stable, even with 10 overlapping segments of 32x32 pixels each.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Chores