
Conversation


@sr2echa (Member) commented Sep 5, 2025

Summary

  • Improved the backend with v2 schema for tooling
  • Improved Chat Interface with Tabs

TODO

  • Media Analysis
  • Multi Agentic calls
  • Rearrange and position, layout assets
  • Asset aware suggestions and effects
  • Templates?

Far-fetched...

  • A/B Testing


@gemini-code-assist (bot) left a comment


Summary of Changes

Hello @sr2echa, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a substantial upgrade to the AI assistant and the chat experience within the video editor. The backend now leverages a more flexible and powerful V2 schema for AI tool calls, enabling more sophisticated and accurate interactions with the editing timeline. Concurrently, the frontend ChatBox has been reimagined as a multi-tabbed interface, offering enhanced message management, context-aware actions, and a searchable history. These improvements collectively aim to provide users with a more intuitive and powerful AI-driven editing workflow.

Highlights

  • Enhanced AI Functionality: The AI assistant's capabilities have been significantly upgraded with a new backend V2 schema for tool calls, enabling more precise and flexible interactions with the video editing timeline. This includes improved parsing of time-related arguments and a more robust system for AI-driven actions.
  • Improved Chat Interface: The ChatBox has been transformed into a multi-tabbed interface, allowing users to manage multiple conversations. It now features a context menu for messages, enabling actions like inline editing and restoring timeline snapshots, along with a searchable chat history panel for better organization.
  • Expanded Timeline Operations: New functionalities have been added to the timeline, including the ability to group and ungroup scrubbers, move grouped media to the media bin, and refined collision detection and transition management. The AI can now directly invoke these advanced timeline operations.
  • Comprehensive Tooling Catalog: A new, extensible tooling catalog has been introduced in the backend, defining a wide range of AI-callable functions for timeline manipulation, text property adjustments, and project settings, laying the groundwork for future AI-driven features.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its suggestions are incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a significant and well-structured enhancement to the AI chat functionality, including a new tabbed interface and a more robust v2 backend schema for tool calling. The backend refactoring is particularly strong. My review focuses on improving the robustness and maintainability of the new frontend logic in ChatBox.tsx, particularly around state management, asynchronous operations, and data parsing, to prevent potential runtime errors.

Comment on lines 535 to 542
setTimeout(() => {
  const tl = latestTimelineRef.current;
  const trackIndex = Math.max(0, trackNumber - 1);
  const track = tl.tracks?.[trackIndex];
  if (!track) return;
  const target = newId
    ? tl.tracks[trackIndex]?.scrubbers.find((s) => s.id === newId)
    : undefined;
  if (target) llmResizeScrubber(target.id, durationSeconds, pps, tl, handleUpdateScrubber);
}, 150);


high

Using setTimeout with a fixed delay to handle the resizing of a newly added scrubber is fragile. This can lead to race conditions where the resize operation is attempted before the component has re-rendered with the new scrubber, especially on slower devices or during complex state updates.

A more robust approach would be to use a useEffect hook that triggers the resize operation once the new scrubber is confirmed to be present in the timelineState. This would involve creating a small state to queue the resize request, which the useEffect hook would then consume.
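To make that concrete, here is a minimal sketch of the queued-resize idea. The types and names below are hypothetical stand-ins for the real timeline state, not the PR's actual code: the core is a pure function that checks whether the pending scrubber has appeared yet, which an effect can call on every timeline update.

```typescript
// Hypothetical stand-in types for the timeline state.
type Scrubber = { id: string };
type Track = { scrubbers: Scrubber[] };
type Timeline = { tracks: Track[] };
type PendingResize = { scrubberId: string; trackIndex: number; durationSeconds: number };

// Returns the scrubber to resize once it is present in the timeline,
// or null if it has not appeared yet (keep the request queued).
function consumePendingResize(pending: PendingResize, tl: Timeline): Scrubber | null {
  const track = tl.tracks[pending.trackIndex];
  if (!track) return null;
  return track.scrubbers.find((s) => s.id === pending.scrubberId) ?? null;
}

// In React this would live in a useEffect keyed on [timelineState, pending]:
// when consumePendingResize returns a scrubber, call llmResizeScrubber and
// clear the pending state; until then the effect simply re-runs as the
// timeline re-renders, with no fixed delay to race against.
```

Because the check re-runs on every state update rather than once after 150 ms, the resize fires exactly when the scrubber exists, regardless of device speed.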

Comment on lines +182 to +205
const loadTabs = (): ChatTab[] => {
  try {
    const raw = localStorage.getItem(STORAGE_KEY);
    if (!raw) return [];
    const parsed = JSON.parse(raw);
    if (Array.isArray(parsed)) {
      return parsed.map((t: any) => ({
        id: String(t.id ?? Date.now().toString()),
        name: String(t.name ?? "Chat"),
        messages: Array.isArray(t.messages)
          ? t.messages.map((m: any) => ({
              id: String(m.id ?? Date.now().toString()),
              content: String(m.content ?? ""),
              isUser: Boolean(m.isUser),
              timestamp: m && m.timestamp ? new Date(m.timestamp) : new Date(),
            }))
          : [],
        timelineSnapshot: t.timelineSnapshot ?? null,
        createdAt: Number(t.createdAt ?? Date.now()),
      }));
    }
  } catch {}
  return [];
};

medium

The loadTabs function parses data directly from localStorage without validation. If the stored data becomes corrupted or its schema changes in a future update, JSON.parse could throw an error, or the application could proceed with malformed data, leading to unpredictable runtime errors.

To make this more robust, consider using a schema validation library like zod to safely parse and validate the data. This ensures that the application only works with data that conforms to the expected ChatTab[] shape.

For example, you could define schemas for your types:

import { z } from "zod";

const MessageSchema = z.object({
  id: z.string(),
  content: z.string(),
  isUser: z.boolean(),
  timestamp: z.coerce.date(),
  snapshot: z.any().optional().nullable(),
});

const ChatTabSchema = z.object({
  id: z.string(),
  name: z.string(),
  messages: z.array(MessageSchema),
  timelineSnapshot: z.any().optional().nullable(),
  createdAt: z.number(),
});

const TabsSchema = z.array(ChatTabSchema);

And then use TabsSchema.safeParse(parsed) to validate the data.

Comment on lines 583 to 633
} else if (fn === "LLMResizeScrubber" || fn === "ResizeScrubber") {
  const startSecForDiff = toSeconds((args as any).start_seconds) ?? toSeconds((args as any).position_seconds);
  const candidateDur =
    toSeconds((args as any).new_duration_seconds) ??
    toSeconds((args as any).duration_seconds) ??
    toSeconds((args as any).seconds) ??
    toSeconds((args as any).duration) ??
    toSeconds((args as any).newDurationSeconds) ??
    toSeconds((args as any).durationInSeconds) ??
    // try to parse free-form text provided by model (e.g., "12 seconds long")
    (typeof (args as any).new_text_content === "string"
      ? toSeconds((args as any).new_text_content)
      : undefined);
  const endSecVal = toSeconds((args as any).end_seconds);
  const dur =
    candidateDur ??
    (startSecForDiff !== undefined && endSecVal !== undefined
      ? Math.max(0, endSecVal - startSecForDiff)
      : undefined);
  const ppsVal = toNumber(args.pixels_per_second) ?? pixelsPerSecond;
  const trackNum = toNumber((args as any).track_number) ?? toNumber((args as any).new_track_number);
  let targetId = typeof args.scrubber_id === "string" ? (args.scrubber_id as string) : undefined;
  if (!targetId && trackNum !== undefined) {
    const trackIndex = Math.max(0, Math.floor(trackNum) - 1);
    const track = timelineState.tracks?.[trackIndex];
    if (track && track.scrubbers.length > 0) {
      const nameSub =
        typeof (args as any).scrubber_name === "string"
          ? String((args as any).scrubber_name).toLowerCase()
          : undefined;
      if (nameSub) {
        const found = track.scrubbers.find((s) => s.name.toLowerCase().includes(nameSub));
        if (found) targetId = found.id;
      }
      if (!targetId) {
        // fallback to rightmost scrubber
        targetId = track.scrubbers.reduce(
          (best, s) => (s.left > best.left ? s : best),
          track.scrubbers[0],
        ).id;
      }
    }
  }
  if (dur && dur > 0 && targetId) {
    llmResizeScrubber(targetId, dur, ppsVal, timelineState, handleUpdateScrubber);
    aiResponseContent = `✅ Resized scrubber to ${dur}s.`;
  } else if (!targetId) {
    aiResponseContent = `❌ Unable to resize: could not identify target scrubber.`;
  } else {
    aiResponseContent = `❌ Unable to resize: invalid duration.`;
  }

medium

The extensive use of (args as any) to access properties from the LLM response is type-unsafe and can lead to runtime errors if the LLM returns a slightly different schema than expected. This also makes the code harder to maintain.

A safer pattern would be to use a helper function that can safely access a property from a list of possible keys. This encapsulates the logic for handling aliased or optional arguments from the model.

Example helper:

const getArg = (keys: string[]): unknown => {
  for (const key of keys) {
    if (key in args) {
      return args[key];
    }
  }
  return undefined;
};

// Usage:
const startSecForDiff = toSeconds(getArg(['start_seconds', 'position_seconds']));

total = 0.0
matched = False
for m in re.finditer(
    r"(?P<num>\d+(?:\.\d+)?)\s*(?P<unit>h|hr|hrs|hour|hours|m|min|mins|minute|minutes|s|sec|secs|second|seconds|ms|millisecond|milliseconds)\b",
    s,
):

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data (High)

This regular expression, which depends on a user-provided value, may run slow on strings with many repetitions of '9'.

text = (user_text or "").lower()

# Extract explicit FROM ... TO ... first
m = re.search(r"from\s+([^\s]+(?:\s*[a-z]+)?)\s+to\s+([^\s]+(?:\s*[a-z]+)?)", text)

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data (High)

This regular expression, which depends on a user-provided value, may run slow on strings starting with 'from !' and with many repetitions of 'aa'.

# TO ... LONG / SET TO ... / MAKE (IT)? ...
# Examples: "to 12 seconds long", "set to 12s", "make it 8 sec", "12s long"
if updated.get("duration_seconds") is None:
    m4 = re.search(r"(?:to\s+)?([^\s]+(?:\s*[a-z]+)?)\s+long", text)

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data (High)

This regular expression, which depends on a user-provided value, may run slow on strings with many repetitions of '!'.

This regular expression, which depends on a user-provided value, may run slow on strings starting with '!' and with many repetitions of 'aa'.

if dur_candidate is not None:
    updated["duration_seconds"] = dur_candidate
if updated.get("duration_seconds") is None:
    m5 = re.search(r"(?:set\s+(?:it\s+)?to|make\s+(?:it\s+)?)\s+([^\s]+(?:\s*[a-z]+)?)", text)

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data (High)

This regular expression, which depends on a user-provided value, may run slow on strings starting with 'make ' and with many repetitions of ' '.
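All of these findings trace back to the same ambiguous sub-pattern, `([^\s]+(?:\s*[a-z]+)?)`: because `\s*` may match zero characters, `[^\s]+` and the optional `[a-z]+` branch can split the same run of letters in many different ways, which is what makes the backtracking polynomial. One way to remove the ambiguity (a sketch of the general technique, not necessarily the fix this repo should ship) is to require at least one whitespace character before the trailing unit word. The flagged code is Python, but the regex semantics are identical; the sketch below uses TypeScript for illustration.

```typescript
// Ambiguous form flagged by CodeQL: "\s*" can match nothing, so
// "[^\s]+" and the optional "[a-z]+" overlap on the same characters.
const ambiguous = /(?:to\s+)?([^\s]+(?:\s*[a-z]+)?)\s+long/;

// Hardened form: "\s+" forces a separator between the token and the
// unit word, so every character belongs to exactly one branch and
// backtracking stays linear.
const hardened = /(?:to\s+)?(\S+(?:\s+[a-z]+)?)\s+long/;

// Extract the captured duration phrase, or null if there is no match.
function duration(re: RegExp, text: string): string | null {
  const m = re.exec(text);
  return m ? m[1] : null;
}
```

Both forms still accept the phrasings the parser targets ("to 12 seconds long", "12s long"); only the pathological backtracking behavior changes. Capping the input length before matching is a complementary mitigation.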