Offline computer fixes and improvements #2678
Conversation
Pull request overview
This PR updates offline computer practice-mode evaluation to be faster and more helpful by relaxing verdict thresholds, racing local eval with cloud eval/tablebase hits, and refining castling/UCI normalization used across analysis features.
Changes:
- Relaxed practice verdict thresholds and simplified `PracticeComment` to focus on verdict + a single move suggestion.
- Added cloud-eval lookup (via analysis socket) and tablebase lookup to speed up/strengthen post-move evaluation.
- Updated UCI/castling normalization plumbing (node move normalization, engine work UCI generation) and hardened socket pool disposal.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| test/view/offline_computer/offline_computer_game_screen_test.dart | Updates widget tests for the simplified PracticeComment API and renamed suggestion field. |
| test/model/offline_computer/practice_comment_test.dart | Adjusts tests to the new verdict thresholds. |
| test/model/engine/evaluation_service_test.dart | Removes several findEval behavior tests; leaves only a basic success case. |
| test/model/common/node_test.dart | Updates tests to use the new move normalization method for castling. |
| lib/src/view/offline_computer/offline_computer_game_screen.dart | Switches UI to moveSuggestion and updates suggested-move labeling logic. |
| lib/src/view/explorer/explorer_view.dart | Centralizes tablebase relevance logic via isTablebaseRelevant. |
| lib/src/network/socket.dart | Adds disposal guardrails to SocketPool to prevent reconnect/open after dispose. |
| lib/src/model/offline_computer/practice_comment.dart | Relaxes thresholds and simplifies comment structure (single move suggestion). |
| lib/src/model/offline_computer/offline_computer_game_controller.dart | Implements cloud eval + tablebase racing, revised hint/eval settings, and uses socket connection for eval hits. |
| lib/src/model/explorer/tablebase.dart | Adds tablebase relevance helper with variant-specific piece limits. |
| lib/src/model/engine/work.dart | Removes castling-specific UCI helper from Step. |
| lib/src/model/engine/uci_protocol.dart | Uses SanMove.normalizeUci for move list sent to the engine. |
| lib/src/model/engine/evaluation_service.dart | Changes findEval API/behavior to support depthThreshold + minSearchTime. |
| lib/src/model/common/node.dart | Renames/rewrites castling normalization into normalizeMove. |
| lib/src/model/common/chess.dart | Adds castling-notation helpers and SanMove.normalizeUci. |
```dart
// Launch engine eval and tablebase lookup in parallel.
_getEval(workAfter, minSearchTime: const Duration(milliseconds: 500)).then((eval) {
  if (!evalCompleter.isCompleted) evalCompleter.complete(eval);
```
In the slow-path eval race, _getEval(...) completes evalCompleter even when it returns null. That causes the awaited evalAfter to become null immediately and prevents a later tablebase hit from being used (since the tablebase branch only completes when non-null). Consider only completing the completer with non-null evals (or adding an explicit timeout/fallback so it still completes if all sources fail).
Suggested change (explicit timeout fallback):

```dart
// Launch engine eval and tablebase lookup in parallel.
_getEval(workAfter, minSearchTime: const Duration(milliseconds: 500)).then((eval) {
  if (!evalCompleter.isCompleted) evalCompleter.complete(eval);
// Fallback: if neither engine nor tablebase provide an eval in time,
// complete with null so the await below always finishes.
Timer(const Duration(seconds: 3), () {
  if (!evalCompleter.isCompleted) {
    evalCompleter.complete(null);
  }
});
```

Suggested change (complete only with non-null evals):

```dart
// Launch engine eval and tablebase lookup in parallel.
_getEval(workAfter, minSearchTime: const Duration(milliseconds: 500)).then((eval) {
  if (!evalCompleter.isCompleted && eval != null) {
    evalCompleter.complete(eval);
  }
```
```dart
Future<ClientEval?> _getEval(EvalWork work, {Duration? minSearchTime}) {
  final evaluationService = ref.read(evaluationServiceProvider);
  final Completer<ClientEval?> completer = Completer();
  evaluationService
      .findEval(work, depthThreshold: _kEvalMinDepth, minSearchTime: minSearchTime)
      .then((eval) {
    if (!completer.isCompleted) {
      completer.complete(eval);
    }
  });
```
_getEval completes its completer with the local engine result even when that result is null, which can preempt a valid cloud eval arriving shortly after. A safer pattern is to only complete early on non-null results, and otherwise wait for the other source (with a bounded timeout so the Future still completes if nothing arrives).
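A minimal sketch of the pattern described above, assuming the surrounding `_getEval` signature and names (`ClientEval`, `_kEvalMinDepth`) from this PR; the 3-second bound is an illustrative value, not taken from the change:

```dart
import 'dart:async';

// Sketch only: complete early on a non-null local result, and use a bounded
// timeout so the Future always finishes even if every source fails.
final Completer<ClientEval?> completer = Completer();

evaluationService
    .findEval(work, depthThreshold: _kEvalMinDepth, minSearchTime: minSearchTime)
    .then((eval) {
  // A null local result no longer preempts a cloud eval arriving later.
  if (!completer.isCompleted && eval != null) {
    completer.complete(eval);
  }
});

// Bounded fallback (duration is an assumed value) so callers are never stuck.
Timer(const Duration(seconds: 3), () {
  if (!completer.isCompleted) {
    completer.complete(null);
  }
});
```

With this shape, whichever source first produces a non-null eval wins, and the timer guarantees termination.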
```dart
if (state.currentPosition.ply < _kOpeningPlyThreshold) {
  _getCloudEval(work, numEvalLines: work.multiPv).then((cloudEval) {
    if (!completer.isCompleted && cloudEval != null) {
      completer.complete(cloudEval);
    }
  });
```
The decision to request a cloud eval is based on state.currentPosition.ply, which may no longer match the work being evaluated if state changes while this Future is pending. Prefer using work.position.ply (or capturing the relevant ply at call time) to keep the decision consistent with the requested position.
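A hedged sketch of that suggestion, reusing the names from the hunk above (this is one possible fix, not the PR's code):

```dart
// Sketch only: gate on the work being evaluated rather than mutable
// controller state, so the decision stays consistent with the requested
// position even if state changes while the Future is pending.
if (work.position.ply < _kOpeningPlyThreshold) {
  _getCloudEval(work, numEvalLines: work.multiPv).then((cloudEval) {
    if (!completer.isCompleted && cloudEval != null) {
      completer.complete(cloudEval);
    }
  });
}
```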
```dart
try {
  final uciPath = UciPath.fromUciMoves(work.steps.map((s) => s.sanMove.move.uci));

  _logger.fine(
    'Requesting cloud eval for ply ${work.position.ply} and fen ${work.position.fen}',
  );

  socketClient.send('evalGet', {
    'fen': work.position.fen,
    'path': uciPath.value,
    'mpv': numEvalLines,
  });
```
UciPath is built from s.sanMove.move.uci without applying the castling normalization used elsewhere (e.g. SanMove.normalizeUci, and UCI_Chess960 is enabled in UCIProtocol). For castling moves, this can produce a different path than the server uses, so the subsequent evalHit filter (path != uciPath.value) may never match. Consider building the path from normalized UCIs (e.g. s.sanMove.normalizeUci(work.variant)) so requests and responses line up.
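A sketch of the suggested normalization, assuming the `SanMove.normalizeUci(Variant)` helper this PR adds in `chess.dart`:

```dart
// Sketch only: build the request path from normalized UCIs so castling
// moves match the server's notation and the subsequent evalHit path
// filter (path != uciPath.value) can actually match.
final uciPath = UciPath.fromUciMoves(
  work.steps.map((s) => s.sanMove.normalizeUci(work.variant)),
);
```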
```dart
/// Normalize UCI to a "king takes rook" UCI notation.
///
/// Returns the original notation chess960 variant where this notation is already forced and
/// where the normalized notation could conflict with the actual move.
UCIMove normalizeUci(Variant variant) {
  if (variant == Variant.chess960) {
    return move.uci;
  }
```
Docstring grammar/clarity: “Returns the original notation chess960 variant …” is missing a preposition and is hard to parse. Consider rephrasing to something like “Returns the original notation in chess960, where this notation is already forced …” to make the behavior unambiguous.
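One possible rewording along those lines (phrasing is a suggestion, not the PR's text):

```dart
/// Normalizes this move's UCI to "king takes rook" castling notation.
///
/// Returns the original notation in chess960, where this notation is
/// already forced and where the normalized notation could conflict with
/// an actual king move.
```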
```dart
final labelText = practiceComment.verdict == MoveVerdict.goodMove
    ? context.l10n.anotherWasX('')
    : context.l10n.bestWasX('');
```
The label for the suggested move is derived solely from verdict == goodMove (showing “Another was …”). With the new moveSuggestion model this can be incorrect: a goodMove comment can still suggest the best move (not an alternative), which should keep the “Best was …” label. To avoid mislabeling, either keep metadata indicating whether the suggestion is an alternative vs best, or adjust how moveSuggestion is chosen so the label logic remains accurate.
Suggested change:

```dart
// Use suggestion metadata to determine whether this is the best move or an alternative,
// instead of inferring it solely from the verdict.
final bool isBestSuggestion = suggestedMove.isBest;
final labelText = isBestSuggestion
    ? context.l10n.bestWasX('')
    : context.l10n.anotherWasX('');
```
lib/src/model/common/chess.dart (outdated)
```dart
const kingTakesRookCastles = {'e1c1': 'e1a1', 'e1g1': 'e1h1', 'e8c8': 'e8a8', 'e8g8': 'e8h8'};

/// Normalizes a UCI move string for comparison by converting alternate castling notations to
/// standard notation.
```
The normalizeUci doc says it converts “alternate castling notations to standard notation”, but the implementation uses kingTakesRookCastles (standard → king-takes-rook). Please update the comment to match what the function actually does, or rename the helper to reflect the direction of conversion to avoid future misuse.
Suggested change:

```dart
/// Normalizes a UCI move string for comparison by converting alternate castling notations to
/// "king takes rook" notation (e.g. e1c1 → e1a1).
```
```dart
/// If provided, the evaluation will stop at [depthThreshold], if the [minSearchTime] has passed.
/// This allows for better evaluations in high end devices while still providing quick responses in low end devices.
/// Even if [depthThreshold] is not reached, the evaluation will still stop at [EvalWork.searchTime].
```
The findEval doc mentions overriding EvalWork.searchTime with a provided maxSearchTime, but findEval no longer has a maxSearchTime parameter and uses work.searchTime directly. Please update/remove this sentence so the API contract matches the implementation.
Suggested change:

```dart
/// If [depthThreshold] is provided, the evaluation may stop once that depth is
/// reached, provided that [minSearchTime] (if given) has already elapsed.
/// This allows for better evaluations in high end devices while still providing
/// quick responses in low end devices. Regardless of these parameters, the
/// evaluation will not run longer than [EvalWork.searchTime].
```
```diff
@@ -1686,138 +1686,5 @@ void main() {
   expect(eval, isNotNull);
   expect(eval!.bestMove, const NormalMove(from: Square.e2, to: Square.e4));
 });
```
The PR removes several EvaluationService.findEval behavior tests (min-depth early stop, cache reuse, timeout behavior, etc.) but doesn’t add replacements covering the new depthThreshold / minSearchTime semantics. Since the implementation changed substantially, it would be good to reintroduce equivalent tests for the new API to prevent regressions (especially around early-stopping and timeouts).
Suggested change:

```dart
  });
});

test('findEval does not complete before searchTime elapses', () {
  fakeAsync((async) async {
    final throttleStockfish = ThrottleTestStockfish();
    testBinding.stockfish = throttleStockfish;
    final container = await makeContainer();
    final service = container.read(evaluationServiceProvider);
    // Use a relatively long search time to observe that the future
    // is not completed prematurely.
    final work = makeWork(searchTime: const Duration(seconds: 5));
    final future = service.findEval(work);
    // Advance less than the search time and ensure the future has
    // not yet completed.
    async.elapse(const Duration(seconds: 1));
    async.flushMicrotasks();
    expect(future, doesNotComplete);
    // Now simulate the engine producing a result and enough time passing.
    async.elapse(const Duration(seconds: 5));
    throttleStockfish.emitBestMove();
    async.flushMicrotasks();
    // The evaluation should now complete with a result.
    final eval = await future;
    expect(eval, isNotNull);
  });
});

test('second findEval cancels the first in progress request', () {
  fakeAsync((async) async {
    final throttleStockfish = ThrottleTestStockfish();
    testBinding.stockfish = throttleStockfish;
    final container = await makeContainer();
    final service = container.read(evaluationServiceProvider);
    final work1 = makeWork(searchTime: const Duration(seconds: 5));
    final work2 = makeWork(searchTime: const Duration(seconds: 1));
    // Start first findEval.
    final future1 = service.findEval(work1);
    async.elapse(const Duration(milliseconds: 50));
    // Start second findEval before first completes.
    final future2 = service.findEval(work2);
    async.flushMicrotasks();
    // First future should complete with an error (cancelled).
    expect(future1, throwsA(isA<Object>()));
    // Emit bestmove for second request.
    throttleStockfish.emitBestMove();
    async.flushMicrotasks();
    // Second future should complete with an evaluation result.
    final eval2 = await future2;
    expect(eval2, isNotNull);
  });
});

test('findEval can reuse an existing evaluation result', () async {
  final container = await makeContainer();
  final service = container.read(evaluationServiceProvider);
  final work = makeWork(searchTime: const Duration(seconds: 1));
  // First call should trigger a fresh analysis.
  final eval1 = await service.findEval(work);
  expect(eval1, isNotNull);
  // Second call with the same work should be able to reuse
  // the existing analysis result.
  final eval2 = await service.findEval(work);
  expect(eval2, isNotNull);
  expect(eval2!.bestMove, eval1!.bestMove);
});
```