feat: cross-machine usage aggregation + automatic sync #55

Open
Yeachan-Heo wants to merge 14 commits into junhoyeo:main from Yeachan-Heo:feat/cross-machine-aggregation-sync

@Yeachan-Heo
Contributor

Yeachan-Heo commented Dec 27, 2025

Summary

Problem Solved

  • Before: Machine A=1000 tokens, Machine B=500 tokens → Result: 500 (Machine A data LOST)
  • After: Machine A=1000 tokens, Machine B=500 tokens → Result: 1500 (properly aggregated)
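To make the before/after concrete, here is a minimal sketch of per-device aggregation. The names (`DeviceContribution`, `SourceTotals`, `sumDevices`, `submitFromDevice`) are simplified stand-ins invented for illustration; they are not the PR's actual `mergeSourceBreakdowns()` or `recalculateSourceAggregate()` signatures, and the real records likely track more than a single token count.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface DeviceContribution {
  tokens: number;
}

interface SourceTotals {
  tokens: number; // aggregate across all devices
  devices: Record<string, DeviceContribution>;
}

// Recompute the aggregate by summing every device's contribution.
function sumDevices(devices: Record<string, DeviceContribution>): number {
  let total = 0;
  for (const id of Object.keys(devices)) {
    total += Number(devices[id].tokens) || 0;
  }
  return total;
}

// Replace only the submitting device's entry, keep the others, then re-sum.
function submitFromDevice(
  existing: SourceTotals | undefined,
  incoming: DeviceContribution,
  deviceId: string,
): SourceTotals {
  const devices = { ...(existing?.devices ?? {}), [deviceId]: incoming };
  return { tokens: sumDevices(devices), devices };
}

const afterA = submitFromDevice(undefined, { tokens: 1000 }, "machine-a");
const afterB = submitFromDevice(afterA, { tokens: 500 }, "machine-b");
// Machine A resubmits with an updated count: its entry is replaced, not added.
const afterResubmit = submitFromDevice(afterB, { tokens: 1200 }, "machine-a");
```

Under these assumptions `afterB.tokens` is 1500 (A + B), and the resubmit from machine A yields 1700 rather than 2700, because the aggregate is always derived from the per-device map instead of being incremented in place.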

New Commands

tokscale sync setup    # Set up hourly cron job
tokscale sync status   # Check if sync is active
tokscale sync remove   # Remove the cron job

Changes

  • Track per-device contributions in devices field
  • Added --quiet flag to tokscale submit for cron compatibility
  • Sync logs to ~/.config/tokscale/sync.log
  • Legacy data migrated to devices.__legacy__

Testing

  • 28 tests covering: cross-device aggregation, same-device replacement, legacy migration, multi-model scenarios

…e aggregation

- Add DeviceSourceData interface for per-device contribution tracking
- Add devices field to SourceBreakdownData type in helpers.ts and schema.ts
- Modify mergeSourceBreakdowns() to accept deviceId parameter
- Add recalculateSourceAggregate() helper to sum across devices
- Migrate existing data without devices field to __legacy__ device
- Update submit route to pass tokenRecord.tokenId as deviceId
- Add tests for same-device replacement, cross-device aggregation, and legacy migration
- Add sync.ts with setupSync(), removeSync(), syncStatus() functions
- Support crontab (macOS/Linux) and Task Scheduler (Windows)
- Use process.argv[1] for CLI path resolution
- Crontab entry uses --quiet flag for silent execution
- Log output to ~/.config/tokscale/sync.log
…ron cleanup

- Store devices[deviceId] on new day inserts to prevent double-counting
  when same device resubmits (was migrating to __legacy__ and adding)
- Fix shell injection risk in crontab setup using printf with escaped quotes
- Fix Windows Task Scheduler logging by wrapping in cmd.exe for redirections
- Use specific grep pattern 'tokscale submit --quiet' to avoid removing
  unrelated cron jobs
- Add tests for insert→resubmit flow to verify no double-counting
… direct tests

- Add ?? {} defensive defaults for models access in mergeSourceBreakdowns()
- Use TOKSCALE_SYNC_MANAGED marker for cron entries (prevents false matches)
- Windows sync now uses .cmd script file approach (simpler, no quoting issues)
- Add direct mergeSourceBreakdowns() function call tests for legacy data handling
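The marker-based cleanup could look roughly like the sketch below. Only the `TOKSCALE_SYNC_MANAGED` marker name and the `submit --quiet` invocation come from the commit message; the exact entry format, the helper names (`buildCronEntry`, `removeManagedEntries`), and the placement of the marker as a trailing shell comment are assumptions.

```typescript
// Marker name taken from the commit message; its placement on the line is assumed.
const SYNC_MARKER = "# TOKSCALE_SYNC_MANAGED";

// Hypothetical helper: build an hourly entry tagged with the marker.
// Assumes cliPath and logPath have already been validated as safe.
function buildCronEntry(cliPath: string, logPath: string): string {
  return `0 * * * * "${cliPath}" submit --quiet >> "${logPath}" 2>&1 ${SYNC_MARKER}`;
}

// Hypothetical helper: drop only the lines that carry the marker,
// leaving every other cron job untouched.
function removeManagedEntries(crontab: string): string {
  return crontab
    .split("\n")
    .filter((line) => line.indexOf(SYNC_MARKER) === -1)
    .join("\n");
}

const exampleCrontab = [
  "0 2 * * * /usr/bin/backup.sh",
  "*/5 * * * * echo 'tokscale submit --quiet' >> /tmp/notes.txt",
  buildCronEntry("/usr/local/bin/tokscale", "/home/me/.config/tokscale/sync.log"),
].join("\n");

const cleanedCrontab = removeManagedEntries(exampleCrontab);
```

Matching on a dedicated marker rather than on the command text means an unrelated job that merely mentions `tokscale submit --quiet` (the second line above) survives removal.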
…pe coercion

- Legacy migration now uses '__legacy__' device instead of current deviceId
- This preserves historical data separately from new device contributions
- Added Number(...) || 0 for all arithmetic in recalculateSourceAggregate
- Handles potential string values from JSON serialization
- Updated tests to expect __legacy__ entry for legacy data migrations
- Reset aggregates before empty devices check to prevent stale values
- Use Number() coercion in recalculateDayTotals for consistency
Convert all || 0 patterns to Number(...) || 0 to handle cases where
values may be strings from JSON parsing or database retrieval.
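A small sketch of why plain `|| 0` is not enough when a value arrives as a string. The helper name `toCount` and the sample values are hypothetical.

```typescript
// Hypothetical helper illustrating the `Number(...) || 0` pattern.
function toCount(value: unknown): number {
  // Number() parses numeric strings; NaN, undefined, and null all fall back to 0.
  return Number(value) || 0;
}

// A stored count that came back as a string (for example after JSON parsing).
const stored: unknown = "500";
const incomingTokens = 1000;

// `|| 0` alone keeps the string, so `+` concatenates instead of adding.
const naiveSum = ((stored as any) || 0) + incomingTokens;

// Coercing first gives real arithmetic.
const coercedSum = toCount(stored) + incomingTokens;
```

Here `naiveSum` is the string `"5001000"`, while `coercedSum` is the number `1500`.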
@vercel
Contributor

vercel bot commented Dec 27, 2025

@Yeachan-Heo is attempting to deploy a commit to the Inevitable Team on Vercel.

A member of the Team first needs to authorize it.

@vercel
Contributor

vercel bot commented Dec 27, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| tokscale | Ready | Preview, Comment | Jan 6, 2026 10:31pm |

junhoyeo force-pushed the feat/cross-machine-aggregation-sync branch from f906e18 to 6591b0e December 28, 2025 07:17
junhoyeo self-requested a review December 28, 2025 07:22
junhoyeo self-assigned this Dec 28, 2025
@junhoyeo
Owner

Hey @Yeachan-Heo, thanks for opening a PR tackling two tasks. Appreciate the contribution despite the complexity! 🙌
I'll review and merge once everything looks good.

One heads-up: the commits and PR body had the "Co-authored-by: Junho Yeo <...>" footer attached for some reason, so I removed those and force-pushed. Just wanted to let you know!

- Windows sync: Mark as experimental/disabled pending security review
- bunx detection: Prevent sync setup from temp cache paths with clear error message
- Crontab security: Add path validation to prevent injection via control chars
- Legacy migration: Fix modelId→models conversion to preserve model breakdown

Closes issues identified in PR junhoyeo#55 review.
@junhoyeo
Owner

🔧 Additional Fixes Applied

Addressed the security and stability issues identified in the review. Here's what was fixed:

1. Windows Sync → Experimental (Disabled)

  • Windows Task Scheduler support is now marked as experimental and disabled by default
  • Shows clear warning message directing users to track progress
  • setupSync, syncStatus, and removeSync all handle Windows gracefully

2. bunx Temp Path Detection

  • Added isBunxTempPath() to detect temporary cache paths like:
    • /tmp/bunx-501-tokscale/node_modules/.bin/tokscale
    • ~/.bun/install/cache/@tokscale/cli@x.x.x/...
    • /var/folders/.../T/bunx-... (macOS)
  • Shows clear error with install instructions when users try bunx tokscale sync setup
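A rough sketch of what such a check might look like, based only on the three path shapes listed above. The matching rules here are assumptions; the PR's actual `isBunxTempPath()` may use different patterns.

```typescript
// Hypothetical sketch; not the PR's actual implementation.
function isBunxTempPath(cliPath: string): boolean {
  // Normalize separators so the same checks work on any platform.
  const normalized = cliPath.replace(/\\/g, "/");

  // A directory segment starting with "bunx-", as in /tmp/bunx-501-tokscale/...
  const hasBunxSegment = /(^|\/)bunx-[^/]+\//.test(normalized);

  // Bun's install cache under the user's home directory.
  const inBunCache = normalized.indexOf("/.bun/install/cache/") !== -1;

  return hasBunxSegment || inBunCache;
}

const fromBunx = isBunxTempPath("/tmp/bunx-501-tokscale/node_modules/.bin/tokscale");
const fromGlobalInstall = isBunxTempPath("/usr/local/bin/tokscale");
```

The motivation is that a cron entry pointing at a temporary cache path would break once the cache is cleaned, so `fromBunx` being true should block setup while `fromGlobalInstall` stays false.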

3. Crontab Security

  • Added validatePathSafety() to prevent crontab injection via:
    • Newlines (\n, \r)
    • Null bytes (\0)
    • % characters (cron converts to newlines!)
    • All control characters (ASCII 0-31, DEL 0x7F)
  • Validation runs before building cron entry
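The checks listed above could be sketched as follows. This is an illustration under assumptions: the PR's `validatePathSafety()` may have a different signature and error messages, and `isPathSafe` is a hypothetical wrapper added only for the example.

```typescript
// Hypothetical sketch of the validation described above.
function validatePathSafety(candidate: string): void {
  for (let i = 0; i < candidate.length; i++) {
    const code = candidate.charCodeAt(i);
    // ASCII 0-31 covers \n, \r, and \0; 0x7f is DEL.
    if (code <= 0x1f || code === 0x7f) {
      throw new Error(`Unsafe control character 0x${code.toString(16)} at index ${i}`);
    }
  }
  // cron treats an unescaped % in the command field as a newline.
  if (candidate.indexOf("%") !== -1) {
    throw new Error("Unsafe '%' character in path");
  }
}

// Hypothetical convenience wrapper that returns a boolean instead of throwing.
function isPathSafe(candidate: string): boolean {
  try {
    validatePathSafety(candidate);
    return true;
  } catch {
    return false;
  }
}

const okPath = isPathSafe("/usr/local/bin/tokscale");
const injected = isPathSafe("/tmp/x\n* * * * * curl example.invalid | sh");
```

Rejecting the path outright, rather than trying to escape it, keeps the cron entry builder simple: anything that reaches it is known to be a single line with no `%`.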

4. Legacy modelId→models Migration Bug (Critical)

  • Fixed the bug that was creating empty models: {} during legacy migration
  • Now properly handles:
    • Legacy data with modelId only → creates correct model breakdown
    • Legacy data with empty models: {} and modelId → falls back to modelId
    • Legacy data with populated models → uses existing models
  • Added unit tests for all migration scenarios
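The three cases above can be sketched as a single fallback chain. The types and the `resolveLegacyModels` name are hypothetical and reduced to a token count; the actual fix in helpers.ts may be structured differently.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ModelTotals {
  tokens: number;
}

interface LegacySource {
  tokens: number;
  modelId?: string; // deprecated single-model field
  models?: Record<string, ModelTotals>;
}

function resolveLegacyModels(legacy: LegacySource): Record<string, ModelTotals> {
  const models = legacy.models ?? {};
  // Case 3: populated models win.
  if (Object.keys(models).length > 0) {
    return models;
  }
  // Cases 1 and 2: models missing or empty, so fall back to modelId.
  if (legacy.modelId) {
    return { [legacy.modelId]: { tokens: Number(legacy.tokens) || 0 } };
  }
  // Nothing to recover.
  return {};
}

const fromModelIdOnly = resolveLegacyModels({ tokens: 300, modelId: "model-a" });
const fromEmptyModels = resolveLegacyModels({ tokens: 300, modelId: "model-a", models: {} });
const fromPopulated = resolveLegacyModels({
  tokens: 300,
  modelId: "model-a",
  models: { "model-b": { tokens: 300 } },
});
```

The key point is that an empty `models: {}` is treated the same as a missing field, which is the case that previously produced an empty breakdown.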

Files Changed

| File | Changes |
| --- | --- |
| packages/cli/src/sync.ts | +148 lines (bunx detection, path validation, Windows experimental) |
| packages/frontend/src/lib/db/helpers.ts | +23 lines (modelId migration fix) |
| packages/frontend/__tests__/api/submit.test.ts | +53 lines (migration tests) |

Commit

5f932fc fix: address security and stability issues in cross-machine sync

All changes are surgical and independently verifiable. Ready for re-review! 🚀

@junhoyeo
Owner

@Yeachan-Heo FYI, I will hold off on merging this PR until #64 lands; there will be significant changes there!

@junhoyeo
Owner

junhoyeo commented Jan 6, 2026

Legacy data with modelId only → creates correct model breakdown

As of today, no users have the deprecated modelId field!

Resolved conflicts:
- README.md: Keep main's structure (Overview before Contents), add Automatic Sync to TOC
- cli.ts: Combine knownCommands to include both 'sync' and 'pricing'
- submit.ts: Keep quiet flag feature with main's simplified structure
The submit command already has minimal necessary logs, and output is
redirected to log file for cron jobs anyway. --no-spinner exists for
scripts that need clean stdout.
@junhoyeo
Owner

@cubic-dev-ai review this pull request

@cubic-dev-ai
Contributor

cubic-dev-ai bot commented Jan 24, 2026

@cubic-dev-ai review this pull request

@junhoyeo I have started the AI code review. It will take a few minutes to complete.

Contributor

cubic-dev-ai bot left a comment

2 issues found across 7 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="packages/cli/src/sync.ts">

<violation number="1" location="packages/cli/src/sync.ts:26">
P2: Variable `path` shadows the imported `path` module from `node:path`. This creates confusion and potential runtime errors if the function is later modified to use `path.join()` or similar methods. Rename to `cliPath` or `execPath`.</violation>
</file>

<file name="packages/cli/src/cli.ts">

<violation number="1" location="packages/cli/src/cli.ts:461">
P2: The new `--interval` flag is accepted but ignored: `setupSync` does not use the interval and the cron entry is hard-coded to hourly. Either implement interval handling or remove the flag to avoid misleading users.</violation>
</file>

Since this is your first cubic review, here's how it works:

  • cubic automatically reviews your code and comments on bugs and improvements
  • Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
  • Ask questions if you need clarification on any suggestion

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@BackGwa

BackGwa commented Feb 16, 2026

It looks like there haven’t been updates on this PR lately. Is it still active?
I’d love to help move this forward if I can.

Yeachan-Heo added a commit to Yeachan-Heo/tokscale that referenced this pull request Feb 28, 2026
Port device-level tracking from PR junhoyeo#55 into the client-based structure.
Each client now tracks per-device contributions (keyed by API token ID)
to prevent double-counting when the same user submits from multiple
machines. On resubmit, only the device's data is replaced while other
devices' data is preserved. Legacy data without device tracking is
gracefully migrated to a __legacy__ device key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Cross-machine usage not properly aggregated

3 participants