# Implement chunked storage for IndexedDB to optimize memory usage (#116)

**Amanmahe** wants to merge 2 commits into `upsidedownlabs:main`
Conversation
- Replace single-file storage with chunk-based architecture
- Add FileMetadata store for tracking file information
- Implement DataChunks store with composite key [filename, chunkIndex]
- Update write operations to append only to relevant chunks
- Maintain backward compatibility with existing API
- Improve performance for 500Hz continuous data streams
- Reduce memory usage from O(n) to O(1) for append operations

BREAKING CHANGE: Database version upgraded to 3. Existing data will be migrated automatically.
📝 **Walkthrough**

Refactors IndexedDB storage to a chunked, metadata-driven model and wires a background worker into the UI for chunked writes/reads, ZIP export, and deletion; also adds per-action processing/loading state and updates the NPG‑Lite firmware link text/URL in the page UI.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant UI as Client/UI
    participant W as IndexedDB Worker
    participant DB as IndexedDB
    UI->>W: postMessage { action: "save", filename, data }
    W->>DB: openIndexedDB() / start transaction
    W->>DB: write DataChunks + upsert FileMetadata
    DB-->>W: acknowledge commit
    W-->>UI: postMessage { type: "writeComplete", filename, success }
```
```mermaid
sequenceDiagram
    participant UI as Client/UI
    participant W as IndexedDB Worker
    participant DB as IndexedDB
    participant ZIP as ZipGenerator
    UI->>W: postMessage { action: "saveAllAsZip" }
    W->>DB: openIndexedDB() / read FileMetadata list
    loop for each file
        W->>DB: read DataChunks for filename
        DB-->>W: return chunks
        W->>ZIP: convert chunks -> CSV blob
        ZIP-->>W: returns CSV blob
    end
    W->>UI: postMessage { type: "saveAsZip", blobUrl }
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: 3 passed, 1 failed (warning)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
workers/indexedDBWorker.ts (2)
**75-86:** ⚠️ Potential issue | 🟠 Major

`deleteFile` and `deleteAll` cases lack error handling. Unlike other action handlers (`write`, `saveAsZip`, etc.), these two cases have no `try/catch`. If `deleteFilesByFilename` or `deleteAllDataFromIndexedDB` throws, it becomes an unhandled rejection in the worker — the caller never receives a response and will hang on its `Promise`.

🛡️ Add error handling
```diff
       case 'deleteFile':
+        try {
           if (!filename) {
             throw new Error('Filename is required for deleteFile action.');
           }
           await deleteFilesByFilename(filename);
           handlePostMessage({ success: true, action: 'deleteFile' });
+        } catch (error) {
+          handlePostMessage({ success: false, action: 'deleteFile', error: String(error) });
+        }
         break;

       case 'deleteAll':
+        try {
           await deleteAllDataFromIndexedDB();
           handlePostMessage({ success: true, action: 'deleteAll' });
+        } catch (error) {
+          handlePostMessage({ success: false, action: 'deleteAll', error: String(error) });
+        }
         break;
```
**22-90:** ⚠️ Potential issue | 🟠 Major

Missing handler for the `checkExistence` action — a regression from the refactor. Both `src/components/Connection.tsx` (line 410) and `src/app/npg-lite/page.tsx` (line 789) send `{ action: 'checkExistence', ... }` to this worker, but the switch statement has no corresponding case. This falls through to the default case, which responds with `{ error: 'Invalid action' }` back to the caller. While the new chunked architecture handles file existence implicitly via `getFileMetadata()`, the stale caller code was not removed during the refactor. Remove the `checkExistence` calls from both components, or add a handler to acknowledge the action.
🤖 Fix all issues with AI agents
In `@workers/indexedDBWorker.ts`:
- Around lines 500-501: Both `deleteFilesByFilename` and `deleteAllDataFromIndexedDB` currently call `indexedDB.open("ChordsRecordings", 3)` again. Change both functions to accept an existing `IDBDatabase` parameter (e.g., `db: IDBDatabase`) instead of opening a new connection, update their internal logic to use that `db`, and update the caller in `self.onmessage` to pass the already-open `db` handle (consistent with `writeToIndexedDB` and `readFileData`). Also update any other call sites (around lines 547-548) to pass the `db` parameter, avoiding redundant opens and version-mismatch risks.
- Around lines 580-595: Duplicate schema-creation logic in the `dbRequest.onupgradeneeded` handler diverges from the `openIndexedDB` implementation and risks the two getting out of sync for the `"FileMetadata"` and `"DataChunks"` stores. Remove this duplicated `onupgradeneeded` block and reuse the centralized upgrade logic in `openIndexedDB` (or have `dbRequest` use the same helper that creates/updates the two stores) so schema creation lives in one place and both callers invoke that single function.
- Around lines 198-201: Add an early guard that returns when `data.length === 0` to avoid computing a negative `endChunk` and performing a needless metadata transaction. Specifically, before computing `startIndex`/`endIndex`/`startChunk`/`endChunk` (and before the loop that writes chunks and the metadata update that uses `metadata.totalRecords`), check whether `data.length === 0` and return early so no chunk math or `metadata += 0` update occurs. The guard should use the same `data`, `metadata`, and `CHUNK_SIZE` variables and prevent both the for-loop and the subsequent metadata update from running.
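The guarded chunk-range math described above can be sketched as a pure function. This is a hedged illustration only — `CHUNK_SIZE` and the shape of `data` are assumptions mirroring the review comment, not the PR's exact code:

```typescript
// Assumed chunk size; the PR's actual constant may differ.
const CHUNK_SIZE = 500;

// Returns the [startChunk, endChunk] range a write would touch, or null
// when there is nothing to append — avoiding the negative endChunk that
// `totalRecords + 0 - 1` would produce for an empty file.
function appendChunkRange(
  totalRecords: number,
  data: number[][]
): [number, number] | null {
  if (data.length === 0) return null; // early guard: no chunk math, no metadata += 0 update

  const startIndex = totalRecords;
  const endIndex = totalRecords + data.length - 1;
  return [
    Math.floor(startIndex / CHUNK_SIZE),
    Math.floor(endIndex / CHUNK_SIZE),
  ];
}
```

For small batches from a 500 Hz stream the two indices are almost always equal (or one apart when the write crosses a chunk boundary), which is what keeps appends O(1) in file size.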
- Around lines 124-127: The empty migration block in the `onupgradeneeded` handler (checking `event.oldVersion < 2`) will drop data from the old `"ChordsRecordings"` store. Implement a real migration: within the upgrade transaction, read all entries from the `"ChordsRecordings"` object store, transform each recording into the new chunked representation (split the data into chunks and create corresponding metadata), and write those chunks and metadata into the new stores — or, if destructive behavior is intended, explicitly delete the old store and emit a clear log/notification that old data was removed. Either way, the handler should reference `event.oldVersion`, the `"ChordsRecordings"` store, and the new chunk/metadata stores so the migration happens atomically during the upgrade.
- Around lines 197-268: The current `writeToIndexedDB` logic uses `performIndexDBTransaction` per chunk (for both reads and writes), causing non-atomic partial commits and poor performance. Refactor to open a single `readwrite` transaction over both stores (`"FileMetadata"` and `"DataChunks"`) for the whole operation — from calculating `startChunk..endChunk` through all get/put operations — so all chunk reads, chunk writes (the `existingChunk`/`newChunk` handling), and the metadata update occur atomically; use `readonly` only for isolated reads performed outside that batch. Update `metadata.totalRecords` exactly once, just before committing.
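The single-transaction idea above can be sketched against minimal structural types, so the control flow is testable without a browser. `StoreLike` is a promisified stand-in for `IDBObjectStore` taken from one shared transaction; the store names come from the PR, but `appendChunks` and its record shapes are illustrative assumptions, not the PR's code:

```typescript
// Promisified stand-in for an IDBObjectStore; in the worker, both stores
// would come from ONE db.transaction(["FileMetadata", "DataChunks"], "readwrite").
interface StoreLike {
  get(key: unknown): Promise<any>;
  put(value: any): Promise<void>;
}

interface ChunkRecord {
  filename: string;
  chunkIndex: number;
  data: number[][];
}

// All chunk reads/writes and the single metadata update run against stores
// from the same transaction, so a failure aborts the whole append.
async function appendChunks(
  chunkStore: StoreLike,
  metaStore: StoreLike,
  filename: string,
  batches: ChunkRecord[],
  addedRecords: number
): Promise<void> {
  for (const batch of batches) {
    const existing = await chunkStore.get([filename, batch.chunkIndex]);
    const merged: ChunkRecord = existing
      ? { ...existing, data: [...existing.data, ...batch.data] }
      : batch;
    await chunkStore.put(merged);
  }
  const metadata = (await metaStore.get(filename)) ?? { filename, totalRecords: 0 };
  // totalRecords updated exactly once, just before the transaction commits.
  await metaStore.put({ ...metadata, totalRecords: metadata.totalRecords + addedRecords });
}
```

With real IndexedDB, the transaction auto-commits once all queued requests settle, so no partial append can persist.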
🧹 Nitpick comments (4)
workers/indexedDBWorker.ts (4)
**111-114:** Redundant index on the `keyPath` field.

The `FileMetadata` store uses `filename` as its `keyPath` (line 112), which already provides unique lookups by filename. Creating an additional index on `filename` (line 113) is unnecessary and wastes storage.

♻️ Remove redundant index

```diff
   if (!db.objectStoreNames.contains("FileMetadata")) {
     const metadataStore = db.createObjectStore("FileMetadata", {
       keyPath: "filename",
     });
-    metadataStore.createIndex("filename", "filename", { unique: true });
   }
```
**284-311:** Sequential single-chunk transactions are inefficient for reads.

`readFileData` opens a separate `readonly` transaction for every chunk. A single transaction with sequential `get` calls (or a cursor on the `byFilename` index) would be both faster and guarantee a consistent snapshot.

♻️ Use a single transaction

```diff
 const readFileData = async (db: IDBDatabase, filename: string): Promise<number[][]> => {
   const metadata = await getFileMetadata(db, filename);
   const allData: number[][] = [];
-
-  for (let chunkIndex = 0; chunkIndex < metadata.totalChunks; chunkIndex++) {
-    const chunk = await performIndexDBTransaction(
-      db,
-      "DataChunks",
-      "readonly",
-      (store) => {
-        return new Promise<any>((resolve, reject) => {
-          const key = [filename, chunkIndex];
-          const request = store.get(key);
-          request.onsuccess = () => resolve(request.result);
-          request.onerror = () => reject(request.error);
-        });
-      }
-    );
-
-    if (chunk && chunk.data) {
-      allData.push(...chunk.data);
-    }
-  }
+
+  const tx = db.transaction("DataChunks", "readonly");
+  const store = tx.objectStore("DataChunks");
+
+  for (let chunkIndex = 0; chunkIndex < metadata.totalChunks; chunkIndex++) {
+    const chunk = await new Promise<any>((resolve, reject) => {
+      const request = store.get([filename, chunkIndex]);
+      request.onsuccess = () => resolve(request.result);
+      request.onerror = () => reject(request.error);
+    });
+    if (chunk?.data) {
+      allData.push(...chunk.data);
+    }
+  }
   return allData;
 };
```
**47-55:** Dead branch: `action === 'getAllData'` is always false here.

This code is inside the `case 'getFileCountFromIndexedDB'` block, so the ternary on line 49 always resolves to `getFileCountFromIndexedDB`. There is no separate `'getAllData'` case in the switch. If `getAllDataFromIndexedDB` is still needed, add a dedicated case for it; otherwise, remove the dead ternary.

♻️ Simplify

```diff
     case 'getFileCountFromIndexedDB':
       try {
-        const dataMethod = action === 'getAllData' ? getAllDataFromIndexedDB : getFileCountFromIndexedDB;
-        const allData = await dataMethod(db);
+        const allData = await getFileCountFromIndexedDB(db);
         handlePostMessage({ allData });
       } catch (error) {
```
**384-401:** `saveAllDataAsZip` and `saveDataByFilename` open their own DB connections instead of accepting the one from the message handler.

Same issue as with the delete functions — the `db` opened at line 12 in `self.onmessage` is available but not passed to these functions. Consider accepting `db` as a parameter for consistency and to avoid redundant connections.

Also applies to: 427-447
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/app/npg-lite/page.tsx (1)
**88-90:** ⚠️ Potential issue | 🟠 Major

`recordingBuffers` is recreated on every render — data written to it may be lost.

`recordingBuffers` is declared as a plain local variable (not a `useRef`), so it is re-initialized to empty arrays on every re-render. Since `processSample` captures it via closure (and `processSample` is recreated via `useCallback` with dependencies), buffer data can be lost when the component re-renders mid-recording.

This is likely a pre-existing issue, but the new `stopRecording` flow that flushes remaining buffer data (lines 972-979) is particularly sensitive to it — it may flush an empty, freshly allocated buffer.

Suggested direction

```diff
-  const NUM_BUFFERS = 4;
-  const recordingBuffers = Array(NUM_BUFFERS)
-    .fill(null)
-    .map(() => [] as number[][]);
+  const NUM_BUFFERS = 4;
+  const recordingBuffersRef = useRef<number[][][]>(
+    Array(NUM_BUFFERS).fill(null).map(() => [] as number[][])
+  );
+  const recordingBuffers = recordingBuffersRef.current;
```
🤖 Fix all issues with AI agents
In `@src/app/npg-lite/page.tsx`:
- Around lines 986-989: The `await` is ineffective because `getFileCountFromIndexedDB` posts a message and returns `void`. Change it to return a Promise that resolves when the worker responds (or create a Promise wrapper at the call site) so the caller actually awaits the dataset refresh before clearing `isProcessingRecording`. Update `getFileCountFromIndexedDB` (or add a new `getFileCountFromIndexedDBAsync`) to attach a one-off response listener or use a request/response id, resolve the Promise when the worker's response arrives, and await that Promise before calling `setIsProcessingRecording(false)`.
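The request/response-id approach suggested above can be sketched as a small wrapper. Note the hedge: the PR's worker currently keys responses on `action`/`type`, so the echoed `id` field (and the `WorkerLike` interface used to keep the sketch browser-free) are added assumptions, not existing behavior:

```typescript
type Listener = (ev: { data: any }) => void;

// Minimal structural view of a Worker so the sketch is testable anywhere.
interface WorkerLike {
  postMessage(msg: any): void;
  addEventListener(type: "message", fn: Listener): void;
  removeEventListener(type: "message", fn: Listener): void;
}

let nextRequestId = 0;

// Posts a message with a unique id and resolves when the worker echoes
// that id back; the one-off listener removes itself on completion.
function requestFromWorker(
  worker: WorkerLike,
  action: string,
  payload: object = {}
): Promise<any> {
  const id = ++nextRequestId;
  return new Promise((resolve, reject) => {
    const onMessage: Listener = (ev) => {
      if (ev.data?.id !== id) return; // response to a different request
      worker.removeEventListener("message", onMessage);
      if (ev.data.success === false) {
        reject(new Error(ev.data.error ?? "worker error"));
      } else {
        resolve(ev.data);
      }
    };
    worker.addEventListener("message", onMessage);
    worker.postMessage({ id, action, ...payload });
  });
}
```

The caller could then `await requestFromWorker(workerRef.current, 'getFileCountFromIndexedDB')` before calling `setIsProcessingRecording(false)`.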
- Line 982: The linter flags the arrow callback in `recordingBuffers.forEach` because the assignment expression `buffer.length = 0` returns a value. Change the callback to use a block body so it doesn't return anything (i.e., replace the inline arrow with a block arrow that sets `buffer.length = 0` inside braces), or replace the `forEach` call with an explicit `for...of` loop.
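The lint-friendly form described above looks like this (the sample buffer contents here are illustrative; in the page, `recordingBuffers` holds the per-channel recording data):

```typescript
// Illustrative buffers standing in for the page's recordingBuffers.
const recordingBuffers: number[][][] = [[[1, 2]], [[3, 4], [5, 6]]];

// for...of (or a block-bodied arrow) means nothing returns the value of
// the assignment expression `buffer.length = 0`.
for (const buffer of recordingBuffers) {
  buffer.length = 0; // truncate in place, preserving each array's identity
}
```

Truncating with `length = 0` (rather than reassigning `[]`) matters here because `processSample` holds the same array references via closure.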
- Around lines 761-816: The `onmessage` handler currently ignores messages that lack an `action` (the worker posts `{ error: msg }`). Add a fallback branch in `workerRef.current.onmessage` that checks for `event.data.error` when `action` is falsy or unmatched, and then: call `toast.error` with the error text, reset the relevant loading states (`setIsDownloadingFile` for the filename if present, `setIsDownloadingAll(false)`, `setIsDeletingFile` for the filename, `setIsDeletingAll(false)` as appropriate), and refresh datasets when a delete failure affects state (call `getFileCountFromIndexedDB()` or `setDatasets([])` as needed). The default/fallback should reference the existing symbols used in the switch (`saveAsZip`, `deleteFile`, `deleteAll`, `setIsDownloadingFile`, `setIsDownloadingAll`, `setIsDeletingFile`, `setIsDeletingAll`, `toast.error`, `getFileCountFromIndexedDB`, `setDatasets`) so errors are handled even when the worker omits `action`.
- Around lines 836-872: The `processBuffer` promise currently resolves on any `writeComplete` message and never rejects or times out. Update `processBuffer` to reject when the worker responds with `success: false` (inspect `msgSuccess` in `handleMessage` for the `'write'`/`'writeComplete'` action), and add a configurable timeout (e.g., using `setTimeout`) that rejects if no response arrives within a reasonable interval. Remove the message listener and clear the timeout on success, failure, and timeout paths so `workerRef` event listeners are always cleaned up and `stopRecording` can handle failures instead of hanging.
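A generic timeout wrapper covers the "never rejects or times out" half of the suggestion; `withTimeout` and its parameters are a sketch, not the PR's code, and the interval would need tuning for slow devices:

```typescript
// Races a promise against a timer; the timer is always cleared so a late
// worker reply cannot leak a pending timeout.
function withTimeout<T>(p: Promise<T>, ms: number, label: string): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  return Promise.race([p, timeout]).finally(() => clearTimeout(timer));
}
```

`stopRecording` could then `await withTimeout(processBuffer(...), 5000, "writeComplete")` and surface the failure instead of hanging (the 5000 ms figure is an assumption).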
In `@workers/indexedDBWorker.ts`:
- Around lines 370-394: The truthy check `if (channel && item[i + 1] !== undefined)` treats a valid channel value of 0 as falsy and drops its data. Update the condition in the `selectedChannels.map` callback (the block that builds `filteredRow`) to explicitly check for presence (e.g., `channel !== undefined && channel !== null`) before using `item[i + 1]`, keep the `item[i + 1] !== undefined` check, and preserve the existing warning fallback when the value is missing.
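The truthy-zero pitfall can be shown in isolation. This is a hedged sketch of the fixed condition — the row layout (column 0 as counter, channel values following) mirrors the review's `item[i + 1]` indexing but is an assumption about the PR's data shape:

```typescript
// Builds the filtered row with an explicit presence check, so a valid
// channel index of 0 is kept; `channel && ...` would wrongly drop it.
function buildFilteredRow(
  item: number[],
  selectedChannels: Array<number | null | undefined>
): number[] {
  const filteredRow: number[] = [];
  for (const channel of selectedChannels) {
    if (channel !== undefined && channel !== null && item[channel + 1] !== undefined) {
      filteredRow.push(item[channel + 1]); // column 0 is assumed to be a counter
    }
  }
  return filteredRow;
}
```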
- Around lines 88-98: The `deleteFile` and `deleteAll` branches inside the async `self.onmessage` handler can throw (e.g., from `deleteFilesByFilename` or `deleteAllDataFromIndexedDB`), causing unhandled promise rejections that kill the worker. Wrap the logic in the `'deleteFile'` and `'deleteAll'` cases in `try/catch` blocks, call the existing `handlePostMessage` to return an error response (e.g., `{ success: false, action: 'deleteFile' | 'deleteAll', error: err.message }`) instead of rethrowing, and ensure any early validation (like a missing filename) also results in a handled error response rather than a throw.
🧹 Nitpick comments (2)
workers/indexedDBWorker.ts (2)
**8-12:** DB is opened for every worker message, even non-DB actions.

`openIndexedDB()` is called unconditionally at line 12 for every incoming message, including `setCanvasCount` and `setSelectedChannels`, which don't need a database connection. Additionally, `saveAllDataAsZip` (line 403) and `saveDataByFilename` (line 449) open their own separate DB connections, ignoring the `db` handle from line 12.

Consider opening the DB lazily (once, cached) or only for actions that require it.
**299-361:** Tree-style merge still materializes all data into memory via spread operators.

The `mergeArrays` helper uses `[...a, ...b]` (line 304), which copies all elements. For a 1-hour recording at 500 Hz (1.8M rows), the tree merge still allocates ~O(n log n) temporary arrays. A simpler `Array.prototype.concat` or a single flat pass would be more efficient:

```js
// After loading all chunks, just flatten once:
return chunkBuffers.flat();
```

This avoids all intermediate array copies.
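The single-pass alternative from the nitpick above, as a tiny self-contained helper (the function name is illustrative):

```typescript
// One pass over the loaded chunk buffers: each row reference is copied
// exactly once, with no pairwise [...a, ...b] intermediates.
function mergeChunksFlat(chunkBuffers: number[][][]): number[][] {
  return chunkBuffers.flat();
}
```

`flat()` with the default depth of 1 removes exactly the per-chunk nesting, leaving the row arrays intact.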
Problem

The current IndexedDB implementation rewrites the entire file on each append, which is costly for continuous 500 Hz data streams.

Solution

Implement a chunked storage architecture.

Changes

- `FileMetadata` and `DataChunks` stores with composite key `[filename, chunkIndex]`
- `writeToIndexedDB` updated for chunked append

Performance Impact
@lorforlinux, please review and merge it.