
Implement chunked storage for IndexedDB to optimize memory usage#116

Open
Amanmahe wants to merge 2 commits into upsidedownlabs:main from Amanmahe:main

Conversation


@Amanmahe Amanmahe commented Feb 11, 2026

Problem

The current IndexedDB implementation rewrites the entire file on every append, causing:

  • O(n) memory usage growth
  • 500ms+ per write operation
  • Cannot handle >1 hour recordings at 500Hz

Solution

Implement chunked storage architecture:

  • Split data into 1000-array chunks
  • Append only to last chunk
  • Separate metadata tracking
  • Memory usage: O(1) vs O(n)
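As a rough illustration of why appends become O(1): only the chunks overlapping the new rows are ever touched. This is a sketch — `CHUNK_SIZE = 1000` matches the 1000-array chunks above, but `planAppend` is a hypothetical helper, not code from this PR.

```typescript
// Hypothetical helper showing the chunk-index math behind an O(1) append:
// the work depends only on how many rows are appended, not on file size.
const CHUNK_SIZE = 1000;

function planAppend(totalRecords: number, newRecords: number): { startChunk: number; endChunk: number } {
  const startIndex = totalRecords;                // global index of the first new row
  const endIndex = totalRecords + newRecords - 1; // global index of the last new row
  return {
    startChunk: Math.floor(startIndex / CHUNK_SIZE),
    endChunk: Math.floor(endIndex / CHUNK_SIZE),
  };
}

// Appending 500 rows to a file that already holds 1,700 touches only
// chunks 1 and 2 — the 1,700 existing rows are never rewritten.
console.log(planAppend(1700, 500)); // { startChunk: 1, endChunk: 2 }
```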

Changes

  • Added FileMetadata and DataChunks stores
  • Composite key: [filename, chunkIndex]
  • Updated writeToIndexedDB for chunked append
  • Maintained all existing APIs
  • DB version: 2 → 3 with auto-migration
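To illustrate the composite key: IndexedDB compares array keys element-wise, so all chunks of one file sort together, ordered by `chunkIndex`. This is a sketch — `compareKeys` emulates that ordering in plain TypeScript; neither function is taken from the PR.

```typescript
// Hypothetical illustration of the [filename, chunkIndex] composite key.
function chunkKey(filename: string, chunkIndex: number): [string, number] {
  return [filename, chunkIndex];
}

// Emulates IndexedDB's element-wise array-key ordering for two-element keys.
function compareKeys(a: [string, number], b: [string, number]): number {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  return a[1] - b[1];
}

const keys = [chunkKey("b.csv", 0), chunkKey("a.csv", 1), chunkKey("a.csv", 0)];
keys.sort(compareKeys);
console.log(keys); // [["a.csv", 0], ["a.csv", 1], ["b.csv", 0]]
```

Because of this ordering, a key range over `["a.csv", 0]` … `["a.csv", Infinity]` scans exactly one file's chunks in order.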

Performance Impact

| Duration | Before | After |
| -------- | ------ | ----- |
| 1 min    | 30MB   | 2KB   |
| 1 hour   | 1.8GB  | 2KB   |

@lorforlinux, please review and merge.

Summary by CodeRabbit

  • New Features

    • Per-action loading indicators and disabled states for recording, downloading, ZIP creation, and delete actions.
    • Inline status indicator in file actions to show ongoing saves.
  • Improvements

    • More reliable local storage with faster listing, download and bulk-export (ZIP) flows.
    • Updated NPG-Lite Flasher Web link and clarified battery/firmware messaging.

- Replace single-file storage with chunk-based architecture
- Add FileMetadata store for tracking file information
- Implement DataChunks store with composite key [filename, chunkIndex]
- Update write operations to append only to relevant chunks
- Maintain backward compatibility with existing API
- Improve performance for 500Hz continuous data streams
- Reduce memory usage from O(n) to O(1) for append operations

BREAKING CHANGE: Database version upgraded to 3. Existing data will be migrated automatically.

coderabbitai bot commented Feb 11, 2026

📝 Walkthrough

Walkthrough

Refactors IndexedDB storage to a chunked, metadata-driven model and wires a background worker into the UI for chunked writes/reads, ZIP/export, and deletion; also adds per-action processing/loading state and updates NPG‑Lite firmware link text/URL in the page UI.

Changes

Cohort / File(s) Summary
NPG‑Lite UI
src/app/npg-lite/page.tsx
Added processing/loading state flags for recording, download, delete, and ZIP flows; integrated worker messaging for save/list/download/delete; updated recording pipeline to await worker acknowledgements; replaced firmware link URL/text; wired per-item and global loaders.
IndexedDB Worker: chunked storage
workers/indexedDBWorker.ts
Replaced single-record model with chunked FileMetadata/DataChunks stores (schema v3); added openIndexedDB, get/update metadata, writeToIndexedDB (chunking), readFileData (reassemble), saveDataByFilename, saveAllDataAsZip, getFileCountFromIndexedDB, deleteFilesByFilename, deleteAllDataFromIndexedDB; enhanced action payloads and error handling.

Sequence Diagram(s)

sequenceDiagram
  participant UI as Client/UI
  participant W as IndexedDB Worker
  participant DB as IndexedDB

  UI->>W: postMessage { action: "save", filename, data }
  W->>DB: openIndexedDB() / start transaction
  W->>DB: write DataChunks + upsert FileMetadata
  DB-->>W: acknowledge commit
  W-->>UI: postMessage { type: "writeComplete", filename, success }
sequenceDiagram
  participant UI as Client/UI
  participant W as IndexedDB Worker
  participant DB as IndexedDB
  participant ZIP as ZipGenerator

  UI->>W: postMessage { action: "saveAllAsZip" }
  W->>DB: openIndexedDB() / read FileMetadata list
  loop for each file
    W->>DB: read DataChunks for filename
    DB-->>W: return chunks
    W->>ZIP: convert chunks -> CSV blob
    ZIP-->>W: returns CSV blob
  end
  W->>UI: postMessage { type: "saveAsZip", blobUrl }

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 In burrows of bytes I hop and hum,
Chunks tuck tight where data come from,
Metadata maps my secret trail,
Zip and download — I wag my tail,
Hooray — saved, snacked, and on I run! 🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (2 files):

⚔️ src/app/npg-lite/page.tsx (content)
⚔️ workers/indexedDBWorker.ts (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Implement chunked storage for IndexedDB to optimize memory usage' clearly and concisely summarizes the main objective of the PR—introducing chunked storage to optimize memory usage for IndexedDB operations.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
workers/indexedDBWorker.ts (2)

75-86: ⚠️ Potential issue | 🟠 Major

deleteFile and deleteAll cases lack error handling.

Unlike other action handlers (write, saveAsZip, etc.), these two cases have no try/catch. If deleteFilesByFilename or deleteAllDataFromIndexedDB throws, it becomes an unhandled rejection in the worker — the caller never receives a response and will hang on its Promise.

🛡️ Add error handling
     case 'deleteFile':
+      try {
         if (!filename) {
           throw new Error('Filename is required for deleteFile action.');
         }
         await deleteFilesByFilename(filename);
         handlePostMessage({ success: true, action: 'deleteFile' });
+      } catch (error) {
+        handlePostMessage({ success: false, action: 'deleteFile', error: String(error) });
+      }
       break;

     case 'deleteAll':
+      try {
         await deleteAllDataFromIndexedDB();
         handlePostMessage({ success: true, action: 'deleteAll' });
+      } catch (error) {
+        handlePostMessage({ success: false, action: 'deleteAll', error: String(error) });
+      }
       break;

22-90: ⚠️ Potential issue | 🟠 Major

Missing handler for checkExistence action — regression from refactor.

Both src/components/Connection.tsx (line 410) and src/app/npg-lite/page.tsx (line 789) send { action: 'checkExistence', ... } to this worker, but the switch statement has no corresponding case. This falls through to the default case, which responds with { error: 'Invalid action' } back to the caller. While the new chunked architecture handles file existence implicitly via getFileMetadata(), the stale caller code was not removed during the refactor. Remove the checkExistence calls from both components, or add a handler to acknowledge the action.

🤖 Fix all issues with AI agents
In `@workers/indexedDBWorker.ts`:
- Around line 500-501: Both deleteFilesByFilename and deleteAllDataFromIndexedDB
currently call indexedDB.open("ChordsRecordings", 3) again; change both
functions to accept an existing IDBDatabase parameter (e.g., db: IDBDatabase)
instead of opening a new connection, update their internal logic to use that db,
and update the caller in self.onmessage to pass the already-open db handle
(consistent with writeToIndexedDB and readFileData); also update any other call
sites (lines around 547–548) to pass the db parameter to avoid redundant opens
and version-mismatch risks.
- Around line 580-595: Duplicate schema-creation logic is implemented in the
dbRequest.onupgradeneeded handler here, which diverges from the openIndexedDB
implementation (openIndexedDB) and risks getting out of sync for stores
"FileMetadata" and "DataChunks"; remove this duplicated onupgradeneeded block
and instead reuse the centralized upgrade logic in openIndexedDB (or have
dbRequest use the same helper that creates/updates the FileMetadata and
DataChunks stores) so schema creation lives in one place and both callers invoke
that single function.
- Around line 198-201: Add an early guard that returns when data.length === 0 to
avoid computing negative endChunk and performing a needless metadata
transaction; specifically, before computing
startIndex/endIndex/startChunk/endChunk (and before the loop that writes chunks
and the metadata update that uses metadata.totalRecords), check if data.length
=== 0 and return early so no chunk math or metadata += 0 update occurs. Ensure
the guard references the same variables (data, metadata, CHUNK_SIZE) and
prevents running the for-loop and subsequent metadata update.
- Around line 124-127: The empty migration block in the onupgradeneeded handler
(checking event.oldVersion < 2) will drop data from the old "ChordsRecordings"
store; implement a real migration: open a transaction that reads all entries
from the "ChordsRecordings" object store, transform each recording into the new
chunked representation (split audio/data into chunks and create corresponding
metadata), and write those chunks and metadata into the new stores within the
same upgrade transaction, or if you prefer destructive behavior, explicitly
delete the old store and emit a clear log/notification indicating old data was
removed; update the onupgradeneeded handler to perform one of these paths and
ensure it references event.oldVersion, the "ChordsRecordings" store, and the new
chunk/metadata stores so the migration happens atomically during upgrade.
- Around line 197-268: The current writeToIndexedDB logic uses
performIndexDBTransaction per chunk (reads and writes), causing non-atomic
partial commits and poor performance; refactor to open a single readwrite
transaction over both stores ("FileMetadata" and "DataChunks") for the whole
operation (from calculating startChunk..endChunk through all get/put operations)
so all chunk reads, chunk writes (existingChunk and newChunk handling) and the
metadata update occur atomically, and change the chunk read (store.get) to use
readonly when you perform isolated reads outside that batch; locate usages of
performIndexDBTransaction, the loop variables (startChunk, endChunk,
chunkIndex), existingChunk/newChunk logic and metadata.totalRecords to implement
the single multi-store transaction and update metadata.totalRecords only once
before committing.
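The empty-data guard suggested above (around lines 198–201) can be sketched as follows. This is a hypothetical reduction of `writeToIndexedDB` to its index math — the real function also writes chunks and updates metadata inside a transaction, which is omitted here.

```typescript
// Sketch of the early-return guard: with no rows to append there is no chunk
// math (which would otherwise yield a negative endChunk for an empty file)
// and no needless metadata transaction.
const CHUNK_SIZE = 1000;

function appendRows(data: number[][], totalRecords: number): number {
  if (data.length === 0) return totalRecords; // guard: nothing to write

  const endIndex = totalRecords + data.length - 1;
  const endChunk = Math.floor(endIndex / CHUNK_SIZE);
  // ...write chunks up through endChunk, then update metadata (omitted)...
  void endChunk;
  return totalRecords + data.length;
}

console.log(appendRows([], 0));         // 0 — no work done
console.log(appendRows([[1], [2]], 0)); // 2
```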
🧹 Nitpick comments (4)
workers/indexedDBWorker.ts (4)

111-114: Redundant index on the keyPath field.

The FileMetadata store uses filename as its keyPath (line 112), which already provides unique lookups by filename. Creating an additional index on filename (line 113) is unnecessary and wastes storage.

♻️ Remove redundant index
       if (!db.objectStoreNames.contains("FileMetadata")) {
         const metadataStore = db.createObjectStore("FileMetadata", { keyPath: "filename" });
-        metadataStore.createIndex("filename", "filename", { unique: true });
       }

284-311: Sequential single-chunk transactions are inefficient for reads.

readFileData opens a separate readonly transaction for every chunk. A single transaction with sequential get calls (or a cursor on the byFilename index) would be both faster and guarantee a consistent snapshot.

♻️ Use a single transaction
 const readFileData = async (db: IDBDatabase, filename: string): Promise<number[][]> => {
   const metadata = await getFileMetadata(db, filename);
   const allData: number[][] = [];
-  
-  for (let chunkIndex = 0; chunkIndex < metadata.totalChunks; chunkIndex++) {
-    const chunk = await performIndexDBTransaction(
-      db,
-      "DataChunks",
-      "readonly",
-      (store) => {
-        return new Promise<any>((resolve, reject) => {
-          const key = [filename, chunkIndex];
-          const request = store.get(key);
-          request.onsuccess = () => resolve(request.result);
-          request.onerror = () => reject(request.error);
-        });
-      }
-    );
-    
-    if (chunk && chunk.data) {
-      allData.push(...chunk.data);
-    }
-  }
+
+  const tx = db.transaction("DataChunks", "readonly");
+  const store = tx.objectStore("DataChunks");
+
+  for (let chunkIndex = 0; chunkIndex < metadata.totalChunks; chunkIndex++) {
+    const chunk = await new Promise<any>((resolve, reject) => {
+      const request = store.get([filename, chunkIndex]);
+      request.onsuccess = () => resolve(request.result);
+      request.onerror = () => reject(request.error);
+    });
+    if (chunk?.data) {
+      allData.push(...chunk.data);
+    }
+  }
   
   return allData;
 };

47-55: Dead branch: action === 'getAllData' is always false here.

This code is inside the case 'getFileCountFromIndexedDB' block, so the ternary on line 49 always resolves to getFileCountFromIndexedDB. There is no separate 'getAllData' case in the switch. If getAllDataFromIndexedDB is still needed, add a dedicated case for it; otherwise, remove the dead ternary.

♻️ Simplify
     case 'getFileCountFromIndexedDB':
       try {
-        const dataMethod = action === 'getAllData' ? getAllDataFromIndexedDB : getFileCountFromIndexedDB;
-        const allData = await dataMethod(db);
+        const allData = await getFileCountFromIndexedDB(db);
         handlePostMessage({ allData });
       } catch (error) {

384-401: saveAllDataAsZip and saveDataByFilename open their own DB connections instead of accepting the one from the message handler.

Same issue as with the delete functions — the db opened at line 12 in self.onmessage is available but not passed to these functions. Consider accepting db as a parameter for consistency and to avoid redundant connections.

Also applies to: 427-447

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/app/npg-lite/page.tsx (1)

88-90: ⚠️ Potential issue | 🟠 Major

recordingBuffers is recreated on every render — data written to it may be lost.

recordingBuffers is declared as a plain local variable (not a useRef), so it's re-initialized to empty arrays on every re-render. Since processSample captures it via closure (and processSample is recreated via useCallback with dependencies), buffer data can be lost when the component re-renders mid-recording.

This is likely a pre-existing issue, but the new stopRecording flow that flushes remaining buffer data (lines 972–979) is particularly sensitive to it — it may flush an empty, freshly-allocated buffer.

Suggested direction
-    const NUM_BUFFERS = 4;
-    const recordingBuffers = Array(NUM_BUFFERS)
-        .fill(null)
-        .map(() => [] as number[][]);
+    const NUM_BUFFERS = 4;
+    const recordingBuffersRef = useRef<number[][][]>(
+        Array(NUM_BUFFERS).fill(null).map(() => [] as number[][])
+    );
+    const recordingBuffers = recordingBuffersRef.current;
🤖 Fix all issues with AI agents
In `@src/app/npg-lite/page.tsx`:
- Around line 986-989: The await is ineffective because
getFileCountFromIndexedDB posts a message and returns void; change it to return
a Promise that resolves when the worker responds (or create a Promise wrapper at
the call site) so the caller actually awaits dataset refresh before clearing
isProcessingRecording; update getFileCountFromIndexedDB (or add a new
getFileCountFromIndexedDBAsync) to attach a one-off response listener or use a
request/response id, resolve the Promise when the worker's response arrives, and
then await that Promise before calling setIsProcessingRecording(false).
- Line 982: The linter flags the arrow callback in recordingBuffers.forEach
because the assignment expression "buffer.length = 0" returns a value; change
the callback to use a block body and a statement so it doesn't return anything
(e.g., replace the current inline arrow with a block arrow that sets
buffer.length = 0 inside braces), or alternatively replace the forEach call with
an explicit for...of loop; locate the call to recordingBuffers.forEach and
update it accordingly.
- Around line 761-816: The onmessage handler currently ignores messages that
lack an action (worker posts {error: msg}), so add a fallback branch in
workerRef.current.onmessage that checks for event.data.error when action is
falsy or unmatched and then: call toast.error with the error text, reset
relevant loading states (call setIsDownloadingFile for the filename if present,
setIsDownloadingAll(false), setIsDeletingFile for filename,
setIsDeletingAll(false) as appropriate), and refresh datasets when a delete
failure affects state (call getFileCountFromIndexedDB() or setDatasets([]) as
needed). Update the default/fallback to reference the existing symbols used in
the switch (saveAsZip, deleteFile, deleteAll, setIsDownloadingFile,
setIsDownloadingAll, setIsDeletingFile, setIsDeletingAll, toast.error,
getFileCountFromIndexedDB, setDatasets) so errors are handled even when the
worker omits action.
- Around line 836-872: The processBuffer promise currently only resolves on any
writeComplete message and never rejects or times out; update processBuffer to
reject when the worker responds with success: false (inspect msgSuccess in the
handleMessage for action 'write'/'writeComplete') and to implement a
configurable timeout (e.g., using setTimeout) that rejects if no response
arrives within a reasonable interval; ensure you remove the message listener and
clear the timeout on both success and failure paths and when timing out, so
workerRef event listeners are always cleaned up and stopRecording can handle
failures instead of hanging.
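A minimal sketch of the suggested timeout/rejection pattern — `awaitWriteAck` and the message shape are assumptions modeled on the review comment, not the page's actual worker protocol:

```typescript
// Hypothetical wrapper: resolves on { action: "writeComplete", success: true },
// rejects on success: false or after timeoutMs, and always removes the listener.
type WorkerLike = {
  addEventListener(type: "message", fn: (e: { data: any }) => void): void;
  removeEventListener(type: "message", fn: (e: { data: any }) => void): void;
};

function awaitWriteAck(worker: WorkerLike, timeoutMs = 5000): Promise<void> {
  return new Promise((resolve, reject) => {
    let timer: ReturnType<typeof setTimeout>;
    const onMessage = (e: { data: any }) => {
      if (e.data?.action !== "writeComplete") return; // ignore unrelated messages
      clearTimeout(timer);
      worker.removeEventListener("message", onMessage); // cleanup on both paths
      e.data.success ? resolve() : reject(new Error(e.data.error ?? "write failed"));
    };
    timer = setTimeout(() => {
      worker.removeEventListener("message", onMessage); // cleanup on timeout too
      reject(new Error(`no writeComplete within ${timeoutMs}ms`));
    }, timeoutMs);
    worker.addEventListener("message", onMessage);
  });
}
```

With this shape, `stopRecording` can `await awaitWriteAck(worker)` inside try/catch instead of hanging forever on a lost or failed write.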

In `@workers/indexedDBWorker.ts`:
- Around line 370-394: The bug is that the truthy check "if (channel && item[i +
1] !== undefined)" treats a valid channel value of 0 as falsy and drops its
data; update the condition in the selectedChannels.map callback (the block that
builds filteredRow) to explicitly check for presence (e.g., channel !==
undefined && channel !== null) before using item[i + 1], keep the item[i + 1]
!== undefined check, and preserve the existing warning fallback when the value
is missing.
- Around line 88-98: The deleteFile and deleteAll branches inside the async
self.onmessage handler can throw (e.g., deleteFilesByFilename,
deleteAllDataFromIndexedDB) which causes unhandled promise rejections and kills
the worker; wrap the logic in the 'deleteFile' and 'deleteAll' cases in
try/catch blocks, call the existing handlePostMessage to return an error
response (e.g., { success: false, action: 'deleteFile'|'deleteAll', error:
err.message }) instead of rethrowing, and ensure any early validation (like
missing filename) also results in a handled error response rather than throwing;
reference deleteFilesByFilename, deleteAllDataFromIndexedDB, self.onmessage, and
handlePostMessage when implementing the fix.
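The channel-0 fix described above can be sketched as follows. The row layout is an assumption modeled on the comment (`item[0]` is a counter, channel samples follow); `buildFilteredRow` is illustrative, not the file's actual code.

```typescript
// Sketch of the presence check that keeps channel 0: the original truthy test
// `if (channel && ...)` treats 0 as "no channel" and silently drops its column.
function buildFilteredRow(item: number[], selectedChannels: Array<number | null | undefined>): number[] {
  const filteredRow: number[] = [];
  selectedChannels.forEach((channel, i) => {
    // explicit presence check instead of truthiness, so channel === 0 survives
    if (channel !== undefined && channel !== null && item[i + 1] !== undefined) {
      filteredRow.push(item[i + 1]);
    } else {
      console.warn(`missing value for channel at position ${i}`); // warning fallback
    }
  });
  return filteredRow;
}

// With the truthy check, channel 0's sample (10) would have been dropped:
console.log(buildFilteredRow([42, 10, 20], [0, 1])); // [10, 20]
```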
🧹 Nitpick comments (2)
workers/indexedDBWorker.ts (2)

8-12: DB is opened for every worker message, even non-DB actions.

openIndexedDB() is called unconditionally at line 12 for every incoming message, including setCanvasCount and setSelectedChannels which don't need a database connection. Additionally, saveAllDataAsZip (line 403) and saveDataByFilename (line 449) open their own separate DB connections, ignoring the db handle from line 12.

Consider opening the DB lazily (once, cached) or only for actions that require it.


299-361: Tree-style merge still materializes all data into memory via spread operators.

The mergeArrays helper uses [...a, ...b] (line 304), which copies all elements. For a 1-hour recording at 500 Hz (1.8M rows), the tree merge still allocates ~O(n log n) temporary arrays. A simpler Array.prototype.concat or a single flat pass would be more efficient:

// After loading all chunks, just flatten once:
return chunkBuffers.flat();

This avoids all intermediate array copies.
