feat: enhance AI functionality + ChatBox with context menu and utili… #137
base: canary
Conversation
Summary of Changes
Hello @sr2echa, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request delivers a substantial upgrade to the AI assistant and the chat experience within the video editor. The backend now leverages a more flexible and powerful V2 schema for AI tool calls, enabling more sophisticated and accurate interactions with the editing timeline. Concurrently, the frontend ChatBox has been reimagined as a multi-tabbed interface, offering enhanced message management, context-aware actions, and a searchable history. These improvements collectively aim to provide users with a more intuitive and powerful AI-driven editing workflow.
Highlights
- Enhanced AI Functionality: The AI assistant's capabilities have been significantly upgraded with a new backend V2 schema for tool calls, enabling more precise and flexible interactions with the video editing timeline. This includes improved parsing of time-related arguments and a more robust system for AI-driven actions.
- Improved Chat Interface: The ChatBox has been transformed into a multi-tabbed interface, allowing users to manage multiple conversations. It now features a context menu for messages, enabling actions like inline editing and restoring timeline snapshots, along with a searchable chat history panel for better organization.
- Expanded Timeline Operations: New functionalities have been added to the timeline, including the ability to group and ungroup scrubbers, move grouped media to the media bin, and refined collision detection and transition management. The AI can now directly invoke these advanced timeline operations.
- Comprehensive Tooling Catalog: A new, extensible tooling catalog has been introduced in the backend, defining a wide range of AI-callable functions for timeline manipulation, text property adjustments, and project settings, laying the groundwork for future AI-driven features.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution. ↩
Code Review
This pull request introduces a significant and well-structured enhancement to the AI chat functionality, including a new tabbed interface and a more robust v2 backend schema for tool calling. The backend refactoring is particularly strong. My review focuses on improving the robustness and maintainability of the new frontend logic in ChatBox.tsx, particularly around state management, asynchronous operations, and data parsing, to prevent potential runtime errors.
app/components/chat/ChatBox.tsx
Outdated
```tsx
setTimeout(() => {
  const tl = latestTimelineRef.current;
  const trackIndex = Math.max(0, trackNumber - 1);
  const track = tl.tracks?.[trackIndex];
  if (!track) return;
  const target = newId ? tl.tracks[trackIndex]?.scrubbers.find((s) => s.id === newId) : undefined;
  if (target) llmResizeScrubber(target.id, durationSeconds, pps, tl, handleUpdateScrubber);
}, 150);
```
Using setTimeout with a fixed delay to handle the resizing of a newly added scrubber is fragile. This can lead to race conditions where the resize operation is attempted before the component has re-rendered with the new scrubber, especially on slower devices or during complex state updates.
A more robust approach would be to use a useEffect hook that triggers the resize operation once the new scrubber is confirmed to be present in the timelineState. This would involve creating a small state to queue the resize request, which the useEffect hook would then consume.
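The queued-resize pattern could look roughly like this. This is a sketch only: `timelineState`, `llmResizeScrubber`, `handleUpdateScrubber`, and `pixelsPerSecond` are assumed from the surrounding ChatBox code, and the helper names are illustrative, not the actual API.

```typescript
// Illustrative sketch of the effect-driven resize queue; names are
// assumptions, not the actual ChatBox implementation.
interface Scrubber {
  id: string;
}
interface Timeline {
  tracks?: { scrubbers: Scrubber[] }[];
}
interface PendingResize {
  scrubberId: string;
  durationSeconds: number;
}

// Pure check the effect relies on: has the new scrubber landed in state yet?
function findScrubber(tl: Timeline, id: string): Scrubber | undefined {
  for (const track of tl.tracks ?? []) {
    const hit = track.scrubbers.find((s) => s.id === id);
    if (hit) return hit;
  }
  return undefined;
}

// Inside ChatBox this would be driven by React state instead of a timer:
//
//   const [pending, setPending] = useState<PendingResize | null>(null);
//   useEffect(() => {
//     if (pending && findScrubber(timelineState, pending.scrubberId)) {
//       llmResizeScrubber(pending.scrubberId, pending.durationSeconds,
//                         pixelsPerSecond, timelineState, handleUpdateScrubber);
//       setPending(null); // consume the queued request
//     }
//   }, [pending, timelineState]);
//
// The tool-call handler then calls setPending({ scrubberId: newId,
// durationSeconds }) rather than scheduling a setTimeout.
```

Because the effect re-runs whenever `timelineState` changes, the resize fires exactly once the scrubber is actually present, with no fixed-delay guessing.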
```tsx
const loadTabs = (): ChatTab[] => {
  try {
    const raw = localStorage.getItem(STORAGE_KEY);
    if (!raw) return [];
    const parsed = JSON.parse(raw);
    if (Array.isArray(parsed)) {
      return parsed.map((t: any) => ({
        id: String(t.id ?? Date.now().toString()),
        name: String(t.name ?? "Chat"),
        messages: Array.isArray(t.messages)
          ? t.messages.map((m: any) => ({
              id: String(m.id ?? Date.now().toString()),
              content: String(m.content ?? ""),
              isUser: Boolean(m.isUser),
              timestamp: m && m.timestamp ? new Date(m.timestamp) : new Date(),
            }))
          : [],
        timelineSnapshot: t.timelineSnapshot ?? null,
        createdAt: Number(t.createdAt ?? Date.now()),
      }));
    }
  } catch {}
  return [];
};
```
The loadTabs function parses data directly from localStorage without validation. If the stored data becomes corrupted or its schema changes in a future update, JSON.parse could throw an error, or the application could proceed with malformed data, leading to unpredictable runtime errors.
To make this more robust, consider using a schema validation library like zod to safely parse and validate the data. This ensures that the application only works with data that conforms to the expected ChatTab[] shape.
For example, you could define schemas for your types:
```ts
import { z } from "zod";

const MessageSchema = z.object({
  id: z.string(),
  content: z.string(),
  isUser: z.boolean(),
  timestamp: z.coerce.date(),
  snapshot: z.any().optional().nullable(),
});

const ChatTabSchema = z.object({
  id: z.string(),
  name: z.string(),
  messages: z.array(MessageSchema),
  timelineSnapshot: z.any().optional().nullable(),
  createdAt: z.number(),
});

const TabsSchema = z.array(ChatTabSchema);
```

And then use `TabsSchema.safeParse(parsed)` to validate the data.
```tsx
} else if (fn === "LLMResizeScrubber" || fn === "ResizeScrubber") {
  const startSecForDiff = toSeconds((args as any).start_seconds) ?? toSeconds((args as any).position_seconds);
  const candidateDur =
    toSeconds((args as any).new_duration_seconds) ??
    toSeconds((args as any).duration_seconds) ??
    toSeconds((args as any).seconds) ??
    toSeconds((args as any).duration) ??
    toSeconds((args as any).newDurationSeconds) ??
    toSeconds((args as any).durationInSeconds) ??
    // try to parse free-form text provided by model (e.g., "12 seconds long")
    (typeof (args as any).new_text_content === "string"
      ? toSeconds((args as any).new_text_content)
      : undefined);
  const endSecVal = toSeconds((args as any).end_seconds);
  const dur =
    candidateDur ??
    (startSecForDiff !== undefined && endSecVal !== undefined
      ? Math.max(0, endSecVal - startSecForDiff)
      : undefined);
  const ppsVal = toNumber(args.pixels_per_second) ?? pixelsPerSecond;
  const trackNum = toNumber((args as any).track_number) ?? toNumber((args as any).new_track_number);
  let targetId = typeof args.scrubber_id === "string" ? (args.scrubber_id as string) : undefined;
  if (!targetId && trackNum !== undefined) {
    const trackIndex = Math.max(0, Math.floor(trackNum) - 1);
    const track = timelineState.tracks?.[trackIndex];
    if (track && track.scrubbers.length > 0) {
      const nameSub =
        typeof (args as any).scrubber_name === "string"
          ? String((args as any).scrubber_name).toLowerCase()
          : undefined;
      if (nameSub) {
        const found = track.scrubbers.find((s) => s.name.toLowerCase().includes(nameSub));
        if (found) targetId = found.id;
      }
      if (!targetId) {
        // fallback to rightmost scrubber
        targetId = track.scrubbers.reduce(
          (best, s) => (s.left > best.left ? s : best),
          track.scrubbers[0],
        ).id;
      }
    }
  }
  if (dur && dur > 0 && targetId) {
    llmResizeScrubber(targetId, dur, ppsVal, timelineState, handleUpdateScrubber);
    aiResponseContent = `✅ Resized scrubber to ${dur}s.`;
  } else if (!targetId) {
    aiResponseContent = `❌ Unable to resize: could not identify target scrubber.`;
  } else {
    aiResponseContent = `❌ Unable to resize: invalid duration.`;
  }
```
The extensive use of (args as any) to access properties from the LLM response is type-unsafe and can lead to runtime errors if the LLM returns a slightly different schema than expected. This also makes the code harder to maintain.
A safer pattern would be to use a helper function that can safely access a property from a list of possible keys. This encapsulates the logic for handling aliased or optional arguments from the model.
Example helper:
```ts
const getArg = (keys: string[]): unknown => {
  for (const key of keys) {
    if (key in args) {
      return args[key];
    }
  }
  return undefined;
};

// Usage:
const startSecForDiff = toSeconds(getArg(['start_seconds', 'position_seconds']));
```

```python
total = 0.0
matched = False
for m in re.finditer(r"(?P<num>\d+(?:\.\d+)?)\s*(?P<unit>h|hr|hrs|hour|hours|m|min|mins|minute|minutes|s|sec|secs|second|seconds|ms|millisecond|milliseconds)\b",
                     s):
```
Check failure — Code scanning / CodeQL: Polynomial regular expression used on uncontrolled data (High). This regular expression depends on a user-provided value.
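The CodeQL finding above arises because ambiguous quantifiers (e.g. `[^\s]+(?:\s*[a-z]+)?`) run over raw user text, where backtracking can grow polynomially on adversarial input. One common mitigation, sketched below under assumed names (this is not the project's actual parser), is to cap the length of user-controlled input before matching and keep whitespace bounded so the pattern stays unambiguous:

```python
import re

MAX_LEN = 256  # cap user-controlled text before any regex work

# Bounded whitespace (\s{0,4}) and a fixed unit alternation avoid the
# ambiguous overlap that lets backtracking blow up on adversarial input.
DURATION_RE = re.compile(r"(?P<num>\d+(?:\.\d+)?)\s{0,4}(?P<unit>h|m|s|ms)\b")

def parse_durations(user_text: str) -> list[tuple[float, str]]:
    """Extract (value, unit) pairs from free-form text, defensively."""
    s = (user_text or "")[:MAX_LEN].lower()
    return [(float(m["num"]), m["unit"]) for m in DURATION_RE.finditer(s)]
```

Truncating first guarantees a hard upper bound on matching cost regardless of the pattern, which is usually the simplest way to satisfy this class of CodeQL alert.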
```python
text = (user_text or "").lower()

# Extract explicit FROM ... TO ... first
m = re.search(r"from\s+([^\s]+(?:\s*[a-z]+)?)\s+to\s+([^\s]+(?:\s*[a-z]+)?)", text)
```
Check failure — Code scanning / CodeQL: Polynomial regular expression used on uncontrolled data (High). This regular expression depends on a user-provided value.
```python
# TO ... LONG / SET TO ... / MAKE (IT)? ...
# Examples: "to 12 seconds long", "set to 12s", "make it 8 sec", "12s long"
if updated.get("duration_seconds") is None:
    m4 = re.search(r"(?:to\s+)?([^\s]+(?:\s*[a-z]+)?)\s+long", text)
```
Check failure — Code scanning / CodeQL: Polynomial regular expression used on uncontrolled data (High). This regular expression depends on a user-provided value.
```python
if dur_candidate is not None:
    updated["duration_seconds"] = dur_candidate
if updated.get("duration_seconds") is None:
    m5 = re.search(r"(?:set\s+(?:it\s+)?to|make\s+(?:it\s+)?)\s+([^\s]+(?:\s*[a-z]+)?)", text)
```
Check failure — Code scanning / CodeQL: Polynomial regular expression used on uncontrolled data (High).
TODO
Far fetched...