@Sempre0721 Sempre0721 commented Sep 22, 2025

Description

This PR adds a backend API endpoint using Next.js to interact with yt-dlp for searching videos, retrieving video metadata, and preparing download functionality. It enables the application to fetch video information from platforms like YouTube programmatically via server-side execution.

The motivation is to provide a robust, type-safe, and secure way to integrate rich video metadata and search capabilities into the frontend without exposing sensitive logic or dependencies to the client. This lays the foundation for future features such as video downloading, playlist parsing, and media aggregation.

This change does not yet include the full download streaming implementation but prepares the infrastructure for it.

Type of change

  • New feature (non-breaking change which adds functionality)
  • Code refactoring
  • This change requires a documentation update

How Has This Been Tested?

The API was tested locally using curl commands and Postman to verify all three actions (search, info, and placeholder download) work as expected.

Test Commands:

  1. Search videos:
    curl -X POST "http://localhost:3000/api/video?action=search" \
      -H "Content-Type: application/json" \
      -d '{"keyword": "react tutorial", "page": 1, "pageSize": 5}'

  2. Get video info:
    curl -X POST "http://localhost:3000/api/video?action=info" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'

  3. Download (placeholder):
    curl -X POST "http://localhost:3000/api/video?action=download" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'

All endpoints returned correctly structured JSON responses with proper error handling for invalid inputs and subprocess failures.

Test Configuration:

  • Node version: 18.17.0
  • Operating System: macOS Ventura / Ubuntu 22.04
  • Python version: 3.11
  • yt-dlp installed via: pip install yt-dlp

⚠️ Note: Testing requires yt-dlp and Python to be installed and available in the system PATH.

Screenshots (if applicable)

N/A (Backend API change, no UI impact)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have added screenshots if the UI has been changed
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Additional context

This API relies on yt-dlp being available in the runtime environment. It will not work on Vercel or other serverless platforms unless yt-dlp is bundled via a custom Docker setup.

Future improvements:

  • Implement full download support with progress tracking and file streaming.
  • Add caching (Redis) for search and info results to reduce load and improve performance.
  • Add rate limiting and input sanitization for production security.
  • Support more extractors and formats.
  • Consider creating a Dockerfile to bundle Python + yt-dlp + Node.js for consistent deployment.

Summary by CodeRabbit

  • New Features
    • Introduces a POST endpoint at /api/download-videos with three actions: search, info, and download.
    • Search supports keyword, pagination (page, pageSize), and returns structured JSON results.
    • Info returns detailed metadata for a provided URL.
    • Download currently queues a request and returns a downloadId with timestamp.
    • Strong input validation with clear 400 responses for invalid requests.
    • Consistent, descriptive error handling with appropriate status codes (including 500 for server errors).

netlify bot commented Sep 22, 2025

👷 Deploy request for appcut pending review.

Visit the deploys page to approve it

🔨 Latest commit: 4abc0f5

vercel bot commented Sep 22, 2025

@Sempre0721 is attempting to deploy a commit to the OpenCut OSS Team on Vercel.

A member of the Team first needs to authorize it.

Contributor

coderabbitai bot commented Sep 22, 2025

Walkthrough

Introduces a new Next.js POST route at /api/download-videos that dispatches actions (search, info, download) based on a query parameter. Uses zod for input validation, spawns yt-dlp for search and info, streams and parses JSON output, and returns structured JSON responses with centralized error handling.

Changes

Cohort / File(s): New API route: video ops (apps/web/src/app/api/download-videos/route.ts)
Change summary: Adds POST handler routing by action (search, info, download). Implements zod validation for search/info bodies. Integrates yt-dlp process execution for search/info, with stdout/stderr streaming, JSON parsing, and error handling. Provides placeholder download response. Exports POST function.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant API as /api/download-videos (POST)
  participant Validator as Zod Validator
  participant YT as yt-dlp (child process)

  rect rgba(230,240,255,0.4)
    Note over Client,API: Request with ?action=search|info|download
    Client->>API: POST /api/download-videos?action=...
    API->>Validator: Validate body (per action)
    alt Invalid input
      Validator-->>API: Errors
      API-->>Client: 400 JSON {error, details}
    else Valid input
      opt action=search or info
        API->>YT: spawn yt-dlp with args
        YT-->>API: stdout/stderr streams
        API->>API: Aggregate output, parse JSON
        alt Non-zero exit / parse fail
          API-->>Client: 500 JSON {error, details}
        else Success
          API-->>Client: 200 JSON {success, data}
        end
      end
      opt action=download
        API->>API: Generate downloadId, timestamp
        API-->>Client: 202 JSON {queued, id, url}
      end
    end
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I thump the keys—request in flight,
Three paths diverge in server light.
Search and info, streams that flow,
JSON crumbs from yt-dlp’s glow.
Downloads queued, a future treat—
A rabbit’s backlog, swift yet sweet.
Hop, parse, respond—complete! 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

  • Title Check: ✅ Passed. The title "Feature/video download" is concise and directly related to the primary change in this PR, which adds a Next.js backend API for video search, metadata retrieval, and a download pathway using yt-dlp. It communicates the overall intent to reviewers scanning history. The title is slightly generic and could be more specific about the API surface and yt-dlp integration.
  • Description Check: ✅ Passed. The PR description largely follows the repository template: it includes a clear summary, motivation, type of change, detailed test commands, test configuration, and notes about runtime requirements and deployment caveats. However it is missing a "Fixes #" reference (if applicable) and contains an inconsistency between the implemented route (apps/web/src/app/api/download-videos/route.ts) and the example curl commands that use /api/video, which should be corrected to avoid confusion. The checklist is present but left unchecked and should be updated to reflect completed verification steps.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

startIndex.toString(),
"--playlist-end",
endIndex.toString(),
`ytsearch${pageSize}:${keyword}`,

Suggested change
-`ytsearch${pageSize}:${keyword}`,
+`ytsearch${endIndex}:${keyword}`,

The pagination logic is broken - pages beyond the first will return empty results because yt-dlp is only searching for pageSize total results instead of enough results to support pagination.

Analysis

Pagination bug in handleSearch() returns empty results for pages beyond first

What fails: handleSearch() in apps/web/src/app/api/download-videos/route.ts uses ytsearch${pageSize}:${keyword} which limits yt-dlp to find only pageSize total results, but pagination logic tries to extract ranges beyond that limit

How to reproduce:

# Page 2 with pageSize=20 tries to get results 21-40 from only 20 total results
yt-dlp --playlist-start 21 --playlist-end 40 "ytsearch20:python tutorial"

Result: Returns empty array: {"entries": [], "requested_entries": []}

Expected: Should return 20 results (items 21-40) like when using ytsearch40:python tutorial

Root cause: Line 95 searches for pageSize results but needs endIndex results to support the requested page range
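
The arithmetic behind this fix can be sketched as a small pure function. This is only an illustration: `buildSearchArgs` is a hypothetical helper name that does not exist in the PR, and the flags are the ones already shown in the reviewed code.

```typescript
// Hypothetical helper (not part of the PR): builds the yt-dlp argument
// list for one page of search results.
function buildSearchArgs(keyword: string, page: number, pageSize: number): string[] {
  const startIndex = (page - 1) * pageSize + 1; // first item of the page, 1-based
  const endIndex = page * pageSize; // last item of the page, inclusive
  return [
    "--dump-single-json",
    "--flat-playlist",
    "--no-warnings",
    "--playlist-start",
    String(startIndex),
    "--playlist-end",
    String(endIndex),
    // Request endIndex results in total so the requested slice exists.
    `ytsearch${endIndex}:${keyword}`,
  ];
}

const args = buildSearchArgs("python tutorial", 2, 20);
console.log(args.slice(3).join(" "));
// --playlist-start 21 --playlist-end 40 ytsearch40:python tutorial
```

One consequence of this approach is that deeper pages ask yt-dlp for more results in total (page 5 with pageSize 20 requests 100), so the existing pageSize cap of 50 alone does not bound the work per request.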

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (8)
apps/web/src/app/api/download-videos/route.ts (8)

1-2: Pin the runtime to Node to prevent accidental Edge deployment.

child_process is unavailable on Edge. Make this explicit.

 // apps/web/src/app/api/download-videos/route.ts
+
+export const runtime = "nodejs";

17-21: Coerce numeric inputs and trim keyword.

This makes the API tolerant of JSON string numbers and stray whitespace.

-const searchRequestSchema = z.object({
-  keyword: z.string().min(1, "Keyword is required"),
-  page: z.number().int().positive().optional().default(1),
-  pageSize: z.number().int().min(1).max(50).optional().default(20),
-});
+const searchRequestSchema = z.object({
+  keyword: z.string().trim().min(1, "Keyword is required"),
+  page: z.coerce.number().int().positive().default(1),
+  pageSize: z.coerce.number().int().min(1).max(50).default(20),
+});

46-47: Avoid console in server code.

Replace with your logger or remove.

-    console.error("Video API error:", error);
+    // TODO: route to structured logger/observability

86-96: Add a hard timeout/abort for yt-dlp to prevent hanging requests.

Use AbortController and clear on close.

-    return new Promise((resolve) => {
-      const ytDlp = spawn("yt-dlp", [
+    return new Promise((resolve) => {
+      const controller = new AbortController();
+      const timeout = setTimeout(() => controller.abort(), 15000);
+      const ytDlp = spawn("yt-dlp", [
         "--dump-single-json",
         "--flat-playlist",
         "--no-warnings",
         "--playlist-start",
         startIndex.toString(),
         "--playlist-end",
         endIndex.toString(),
         `ytsearch${endIndex}:${keyword}`,
-      ]);
+      ], { signal: controller.signal });
@@
-      ytDlp.on("close", (code) => {
+      ytDlp.on("close", (code) => {
+        clearTimeout(timeout);
@@
-      ytDlp.on("error", (error) => {
+      ytDlp.on("error", (error) => {
+        clearTimeout(timeout);

Also applies to: 109-122, 124-135, 137-157, 159-171
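
As a standalone illustration of the timeout pattern in the diff above, here is a minimal sketch that wraps `spawn` with an `AbortController`. The names `runWithTimeout` and `RunResult` are hypothetical and not part of the PR; the sketch assumes a Node.js version whose `spawn` accepts the `signal` option.

```typescript
import { spawn } from "node:child_process";

// Hypothetical result shape for this sketch.
interface RunResult {
  code: number | null;
  stdout: string;
  timedOut: boolean;
}

// Hypothetical helper: runs a command and aborts it after timeoutMs.
function runWithTimeout(command: string, args: string[], timeoutMs: number): Promise<RunResult> {
  return new Promise((resolve) => {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    const child = spawn(command, args, { signal: controller.signal });

    let stdout = "";
    let settled = false;
    // Resolve once, whichever of "error" or "close" arrives first.
    const finish = (code: number | null) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve({ code, stdout, timedOut: controller.signal.aborted });
    };

    child.stdout.on("data", (chunk: Buffer) => {
      stdout += chunk.toString();
    });
    // An abort or a failed spawn (for example, a missing binary) surfaces here.
    child.on("error", () => finish(null));
    child.on("close", (code) => finish(code));
  });
}
```

A route handler could then call something like `runWithTimeout("yt-dlp", args, 15000)` and map `timedOut` to an error response. Which status code to return in that case is a design choice this sketch leaves open.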


210-218: Mirror the same timeout/abort for info.

Prevents stuck metadata requests.

-    return new Promise((resolve) => {
-      const ytDlp = spawn("yt-dlp", [
+    return new Promise((resolve) => {
+      const controller = new AbortController();
+      const timeout = setTimeout(() => controller.abort(), 15000);
+      const ytDlp = spawn("yt-dlp", [
         "--dump-single-json",
         "--no-warnings",
         "--compat-options",
         "no-youtube-channel-redirect",
         url,
-      ]);
+      ], { signal: controller.signal });
@@
-      ytDlp.on("close", (code) => {
+      ytDlp.on("close", (code) => {
+        clearTimeout(timeout);
@@
-      ytDlp.on("error", (error) => {
+      ytDlp.on("error", (error) => {
+        clearTimeout(timeout);

Also applies to: 230-278, 280-291


334-346: Use randomUUID from node:crypto and return 202 Accepted for queued work.

Avoid relying on a global and reflect queue semantics.

-    return NextResponse.json({
-      success: true,
-      message: "Download started (placeholder)",
-      data: {
-        url,
-        status: "queued",
-        downloadId: crypto.randomUUID(),
-        startedAt: new Date().toISOString(),
-      },
-    });
+    return NextResponse.json(
+      {
+        success: true,
+        message: "Download started (placeholder)",
+        data: {
+          url,
+          status: "queued",
+          downloadId: randomUUID(),
+          startedAt: new Date().toISOString(),
+        },
+      },
+      { status: 202 }
+    );

334-336: Unify TODO language.

Prefer English comments for consistency.

-    // TODO: 实际调用 yt-dlp 下载视频文件
-    // 当前返回模拟响应
+    // TODO: Call yt-dlp to download the video file (placeholder response for now)

27-45: Add basic guardrails (rate limiting, allowlist, and concurrency caps).

External process calls can be abused; add per-IP rate limits, a simple domain allowlist, and a small worker queue before enabling download.

Would you like a follow-up PR sketch using an in-memory semaphore + upstream reverse-proxy rate limits?
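
One of these guardrails, the domain allowlist, could look roughly like the sketch below. It is only an illustration: `ALLOWED_HOSTS` and `isAllowedVideoUrl` are hypothetical names, and the host list is an assumption rather than anything defined in the PR.

```typescript
// Hypothetical allowlist; the PR does not define one.
const ALLOWED_HOSTS = ["youtube.com", "youtu.be"];

// Returns true only for http(s) URLs whose host is an allowed domain
// or a subdomain of one.
function isAllowedVideoUrl(rawUrl: string, allowedHosts: string[] = ALLOWED_HOSTS): boolean {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (parsed.protocol !== "https:" && parsed.protocol !== "http:") {
    return false; // rejects file:, ftp:, and similar schemes
  }
  const host = parsed.hostname.toLowerCase();
  // Match the exact host or a dot-separated subdomain, so that
  // "youtube.com.evil.example" does not pass.
  return allowedHosts.some((allowed) => host === allowed || host.endsWith(`.${allowed}`));
}

console.log(isAllowedVideoUrl("https://www.youtube.com/watch?v=dQw4w9WgXcQ")); // true
console.log(isAllowedVideoUrl("https://youtube.com.evil.example/watch")); // false
```

Running a check like this before spawning yt-dlp keeps arbitrary URLs from reaching the external process. Rate limiting and concurrency caps would still need to be handled separately.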

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a346bdf and 4abc0f5.

📒 Files selected for processing (1)
  • apps/web/src/app/api/download-videos/route.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Don't use TypeScript enums.
Don't export imported variables.
Don't add type annotations to variables, parameters, and class properties that are initialized with literal expressions.
Don't use TypeScript namespaces.
Don't use non-null assertions with the ! postfix operator.
Don't use parameter properties in class constructors.
Don't use user-defined types.
Use as const instead of literal types and type annotations.
Use either T[] or Array<T> consistently.
Initialize each enum member value explicitly.
Use export type for types.
Use import type for types.
Make sure all enum members are literal values.
Don't use TypeScript const enum.
Don't declare empty interfaces.
Don't let variables evolve into any type through reassignments.
Don't use the any type.
Don't misuse the non-null assertion operator (!) in TypeScript files.
Don't use implicit any type on variable declarations.
Don't merge interfaces and classes unsafely.
Don't use overload signatures that aren't next to each other.
Use the namespace keyword instead of the module keyword to declare TypeScript namespaces.
Don't use empty type parameters in type aliases and interfaces.
Don't use any or unknown as type constraints.
Don't use the TypeScript directive @ts-ignore.
Use consistent accessibility modifiers on class properties and methods.
Use function types instead of object types with call signatures.
Don't use void type outside of generic or return types.

**/*.{ts,tsx}: Don't use primitive type aliases or misleading types
Don't use the TypeScript directive @ts-ignore
Don't use TypeScript enums
Use either T[] or Array<T> consistently
Don't use the any type

Files:

  • apps/web/src/app/api/download-videos/route.ts
**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{js,jsx,ts,tsx}: Don't use the return value of React.render.
Don't use consecutive spaces in regular expression literals.
Don't use the arguments object.
Don't use primitive type aliases or misleading types.
Don't use the comma operator.
Don't write functions that exceed a given Cognitive Complexity score.
Don't use unnecessary boolean casts.
Don't use unnecessary callbacks with flatMap.
Use for...of statements instead of Array.forEach.
Don't create classes that only have static members (like a static namespace).
Don't use this and super in static contexts.
Don't use unnecessary catch clauses.
Don't use unnecessary constructors.
Don't use unnecessary continue statements.
Don't export empty modules that don't change anything.
Don't use unnecessary escape sequences in regular expression literals.
Don't use unnecessary labels.
Don't use unnecessary nested block statements.
Don't rename imports, exports, and destructured assignments to the same name.
Don't use unnecessary string or template literal concatenation.
Don't use String.raw in template literals when there are no escape sequences.
Don't use useless case statements in switch statements.
Don't use ternary operators when simpler alternatives exist.
Don't use useless this aliasing.
Don't initialize variables to undefined.
Don't use the void operators (they're not familiar).
Use arrow functions instead of function expressions.
Use Date.now() to get milliseconds since the Unix Epoch.
Use .flatMap() instead of map().flat() when possible.
Use literal property access instead of computed property access.
Don't use parseInt() or Number.parseInt() when binary, octal, or hexadecimal literals work.
Use concise optional chaining instead of chained logical expressions.
Use regular expression literals instead of the RegExp constructor when possible.
Don't use number literal object member names that aren't base 10 or use underscore separators.
Remove redundant terms from logical expressions.
Use while loops instead of...

Files:

  • apps/web/src/app/api/download-videos/route.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)

**/*.{ts,tsx,js,jsx}: Don't use the comma operator
Use for...of statements instead of Array.forEach
Don't initialize variables to undefined
Use .flatMap() instead of map().flat() when possible
Don't assign a value to itself
Avoid unused imports and variables
Don't use await inside loops
Don't hardcode sensitive data like API keys and tokens
Don't use the delete operator
Don't use global eval()
Use String.slice() instead of String.substr() and String.substring()
Don't use else blocks when the if block breaks early
Put default function parameters and optional function parameters last
Use new when throwing an error
Use String.trimStart() and String.trimEnd() over String.trimLeft() and String.trimRight()

Files:

  • apps/web/src/app/api/download-videos/route.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Vercel Agent Review
🔇 Additional comments (1)
apps/web/src/app/api/download-videos/route.ts (1)

10-12: LGTM: URL validation is adequate.

The schema prevents malformed URLs.

Comment on lines +3 to +5
import { NextRequest, NextResponse } from "next/server";
import { z } from "zod";
import { spawn } from "child_process";

🛠️ Refactor suggestion

Use node: protocol for built-ins and import types correctly. Also import randomUUID.

Aligns with guidelines and avoids relying on a possibly-missing global crypto.

-import { NextRequest, NextResponse } from "next/server";
+import type { NextRequest } from "next/server";
+import { NextResponse } from "next/server";
-import { spawn } from "child_process";
+import { spawn } from "node:child_process";
+import { randomUUID } from "node:crypto";
🤖 Prompt for AI Agents
In apps/web/src/app/api/download-videos/route.ts lines 3-5, change the built-in
imports to use the node: protocol and import types correctly: replace
"child_process" with "node:child_process" and import randomUUID from
"node:crypto"; import NextRequest as a type (using import type) from
"next/server" so it is erased at runtime, and keep NextResponse as a regular
value import because NextResponse.json is called at runtime. Update the
top-of-file imports accordingly and ensure you only use spawn from
node:child_process and randomUUID from node:crypto in the implementation.

Comment on lines +82 to +96
const { keyword, page, pageSize } = validationResult.data;
const startIndex = (page - 1) * pageSize + 1;
const endIndex = page * pageSize;

return new Promise((resolve) => {
const ytDlp = spawn("yt-dlp", [
"--dump-single-json",
"--flat-playlist",
"--no-warnings",
"--playlist-start",
startIndex.toString(),
"--playlist-end",
endIndex.toString(),
`ytsearch${pageSize}:${keyword}`,
]);

⚠️ Potential issue

Search pagination bug: ytsearch limit must reflect endIndex.

With page > 1, ytsearch${pageSize}: returns too few items; --playlist-start shifts past the available results, yielding empty pages.

-    const { keyword, page, pageSize } = validationResult.data;
-    const startIndex = (page - 1) * pageSize + 1;
-    const endIndex = page * pageSize;
+    const { keyword, page, pageSize } = validationResult.data;
+    const startIndex = (page - 1) * pageSize + 1;
+    const endIndex = page * pageSize;
@@
-      const ytDlp = spawn("yt-dlp", [
+      const ytDlp = spawn("yt-dlp", [
         "--dump-single-json",
         "--flat-playlist",
         "--no-warnings",
         "--playlist-start",
         startIndex.toString(),
         "--playlist-end",
         endIndex.toString(),
-        `ytsearch${pageSize}:${keyword}`,
+        `ytsearch${endIndex}:${keyword}`,
       ]);

Comment on lines +137 to +144
const result = JSON.parse(stdoutData);
resolve(
NextResponse.json({
success: true,
data: Array.isArray(result) ? result : [result],
})
);
} catch (parseError) {

⚠️ Potential issue

Parse yt-dlp search output via entries array.

--dump-single-json --flat-playlist ytsearch… returns an object with entries. Current code wraps the object in an array.

-          const result = JSON.parse(stdoutData);
-          resolve(
-            NextResponse.json({
-              success: true,
-              data: Array.isArray(result) ? result : [result],
-            })
-          );
+          const parsed = JSON.parse(stdoutData);
+          const entries = Array.isArray(parsed?.entries) ? parsed.entries : [];
+          resolve(
+            NextResponse.json({
+              success: true,
+              data: entries,
+              pagination: { page, pageSize },
+            })
+          );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const result = JSON.parse(stdoutData);
resolve(
NextResponse.json({
success: true,
data: Array.isArray(result) ? result : [result],
})
);
} catch (parseError) {
const parsed = JSON.parse(stdoutData);
const entries = Array.isArray(parsed?.entries) ? parsed.entries : [];
resolve(
NextResponse.json({
success: true,
data: entries,
pagination: { page, pageSize },
})
);
} catch (parseError) {
🤖 Prompt for AI Agents
In apps/web/src/app/api/download-videos/route.ts around lines 137 to 144, the
code parses yt-dlp JSON and always wraps the parsed object in an array, but
yt-dlp with --dump-single-json --flat-playlist returns an object with an entries
array; change the logic to detect if parsed result has an entries property (and
it's an array) and use that array as the data, otherwise if parsed result is
already an array use it, and only wrap a non-array, non-entries object in an
array before returning in NextResponse.json.
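
The fallback order described in this prompt can be sketched as a small function. `extractEntries` is a hypothetical name, and the sketch follows the prompt's wording: unlike the diff above, which returns an empty array when `entries` is missing, it wraps a lone object in an array.

```typescript
// Hypothetical helper: normalizes parsed yt-dlp JSON into an array of entries.
function extractEntries(parsed: unknown): unknown[] {
  // Already an array: use it as-is.
  if (Array.isArray(parsed)) return parsed;
  // Playlist-style object: use its entries array when present.
  if (typeof parsed === "object" && parsed !== null && "entries" in parsed) {
    const entries = (parsed as { entries: unknown }).entries;
    if (Array.isArray(entries)) return entries;
  }
  // Anything else (for example, a single video object): wrap it.
  return [parsed];
}

const sample = JSON.parse('{"entries": [{"id": "a"}, {"id": "b"}]}');
console.log(extractEntries(sample).length); // 2
```

Checking for a plain array first also sidesteps the fact that every JavaScript array has an `entries` method, which would otherwise satisfy the `"entries" in parsed` test.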
