@xiaoxianhjy

Video Generate

A workflow for short AI science-popularization videos. It supports both a fully automatic mode and a human-in-the-loop mode, produces the script, voice-over, illustrations/animations, and subtitles, and composites them into a finished video.

Quick checks (read this first)

Before the first run, it is recommended to verify the following:

  1. Runtime environment
  • Windows / Python 3.10+ (recommended)
  • FFmpeg installed and on PATH (ffmpeg -version runs)
  • Manim available (manim -h runs)
  2. Python dependencies (if not yet installed)
  • Listed in the repository's requirements, or install as needed: moviepy, Pillow, edge-tts, matplotlib, etc.
  3. Asset files (shipped with the repository)
  • Custom font and background music: projects/video_generate/core/asset/
    • bg_audio.mp3
    • 字小魂扶摇手书(商用需授权).ttf (the filename notes that commercial use requires a license)
  4. Optional API key (typically needed for fully automatic mode)
  • MODELSCOPE_API_KEY: used for ModelScope model calls

Tip: you can still run the compose-only and human modes without the key, but fully automatic mode may fail because LLM capabilities are unavailable.
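
A quick sanity check of the environment from a PowerShell prompt (the pip line is only needed if dependencies are missing):

ffmpeg -version
manim -h
python --version
pip install moviepy Pillow edge-tts matplotlib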

Run mode 1: fully automatic (auto)

Generate and compose a video from scratch, starting from a topic:

# Optional: set the API key
$env:MODELSCOPE_API_KEY="<your-ModelScope-key>"

# Run the three-step workflow (script → assets → composition)
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode auto --trust_remote_code true

The output will be written to ms-agent/projects/video_generate/output/<topic>/

Run mode 2: human-in-the-loop (human)

Suited to workflows where a person needs to control the animations: the pipeline automatically produces the script, voice-over, illustrations, subtitles, and placeholder foregrounds; you then build and approve each segment's foreground animation in the human animation studio, and finally trigger the full composition in one step.

  1. First generate the assets (without automatically rendering Manim)
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode human --trust_remote_code true
  2. Open the human animation studio (pointing at the topic directory generated in the previous step)
# Make sure the ms-agent package directory is on PYTHONPATH
$env:PYTHONPATH="<local-project-dir>\ms-agent"

# Launch the interactive studio as a module
python -m projects.video_generate.core.human_animation_studio "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>"
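
On bash/zsh, the equivalent (using forward slashes) is:

export PYTHONPATH="<local-project-dir>/ms-agent"
python -m projects.video_generate.core.human_animation_studio "<local-project-dir>/ms-agent/projects/video_generate/output/<topic>"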

Inside the studio:

  • 1 View pending tasks → 2 Start producing an animation → generate/improve the Manim code → create a preview → approve the animation
  • Once every segment is finished, the studio automatically merges the foregrounds and runs the full composition (background + subtitles + audio + foreground + music) to produce the final video

Run mode 3: compose only (assets already exist)

If the directory already contains asset_info.json (or you just want to re-compose):

ms-agent run --config "ms-agent/projects/video_generate/workflow_from_assets.yaml" `
	--query "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>\asset_info.json" `
	--animation_mode human `
	--trust_remote_code true

This flow only performs composition; it does not regenerate the script, illustrations, or animations. If approved transparent foregrounds exist (finals/scene_*_final.mov), they are used by preference.
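
For reference, the asset_info.json written by the asset-generation step carries roughly the following top-level fields (as seen in video_agent.py; a sketch, not a formal schema):

  • topic, output_dir, animation_mode ("auto" or "human")
  • segments: the parsed script segments (type, content, audio_duration, …)
  • asset_paths: audio_paths, foreground_paths, subtitle_paths, illustration_paths, subtitle_segments_list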

Directory layout

  • video_agent.py: agent wrapper around the three-step logic
  • workflow.yaml: three-step orchestration; workflow_from_assets.yaml: compose-only orchestration
  • core/workflow.py: main pipeline; core/human_animation_studio.py: the human animation studio
  • core/asset/: font and background music
  • output/: run artifacts
  • scripts/compose_from_asset_info.py: helper script that composes directly from an existing asset_info.json

@gemini-code-assist

Summary of Changes

Hello @xiaoxianhjy, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a powerful "Video Generate" feature aimed at automating and assisting the creation of short AI science-popularization videos. By offering a flexible choice of animation modes (fully automatic or human-assisted), it greatly improves both the efficiency and controllability of video production. The core changes include extended command-line arguments, propagation of the mode via an environment variable, and a set of newly developed modules that work together to handle animation tasks, generate high-quality visual assets, and enforce layout and content quality for Manim animation code. Clear documentation and workflow configurations are also provided so users can get started quickly and customize the video generation process.

Highlights

  • New feature: video generation workflow: introduces a complete workflow for AI science-popularization short videos, supporting fully automatic and human-assisted modes and covering generation of the script, voice-over, illustrations/animations, and subtitles plus final composition.
  • Animation mode control: adds a new --animation_mode command-line argument in ms_agent/cli/run.py, letting users choose auto (fully automatic) or human (human-assisted) animation production; the chosen mode is passed to downstream agents via an environment variable.
  • Core module enhancements: adds several new core Python modules, including animation production-mode management, background image generation, Manim spatial-layout checking, an enhanced Manim prompt system, and visual quality assessment, to ensure quality and flexibility of the generated videos.
  • New agent and workflow definitions: introduces video_agent.py as a new agent and defines new YAML workflow files (workflow.yaml, workflow_from_assets.yaml, etc.) to orchestrate the stages of the video generation process.
  • Detailed documentation: adds a detailed README.md under projects/video_generate with a complete guide to environment setup, the three run modes (fully automatic, human, compose-only), directory structure, and common issues.

@gemini-code-assist (bot) left a comment

Code Review

This PR adds a powerful "AI science-popularization short video" workflow. The code is well structured and covers both fully automatic and human-in-the-loop modes, which is excellent. The multi-round loop of code generation, static analysis, and self-repair is a particularly advanced design.

That said, there are some areas that could be improved:

  1. Circular dependencies: several modules under core import each other, for example workflow.py and balanced_spatial_system.py, which will make the program fail at startup. Consider extracting the shared functions (such as modai_model_request) into a standalone utility module.
  2. Code quality: some of the newly added code has problems; for example, the ManimQualityController class in manim_quality_controller.py appears unfinished or broken and calls methods that do not exist.
  3. Hard-coded configuration: model names and API endpoints are hard-coded in several places; moving them into a configuration file would make them easier to change and manage.

Overall this is a great feature; once the issues above are addressed it will be much more robust and maintainable.


try:
    # Call the LLM to perform the fix
    from .workflow import modai_model_request

critical

The import from .workflow import modai_model_request here creates a circular dependency. workflow.py imports balanced_spatial_system.py at the top of the module, and this line imports workflow.py back, so Python will raise an ImportError at startup. Consider moving the modai_model_request function into a standalone utility module (for example core/utils.py) and having both workflow.py and balanced_spatial_system.py import it from there. The same issue also exists in manim_quality_controller.py.
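
A minimal sketch of that restructuring (the helper body and signature below are illustrative; the real modai_model_request may differ):

# core/utils.py (hypothetical new module): single home for the shared LLM helper
import os
from openai import OpenAI

def modai_model_request(messages, model):
    """Illustrative signature: call a ModelScope-hosted model via the OpenAI-compatible API."""
    client = OpenAI(
        base_url='https://api-inference.modelscope.cn/v1',  # endpoint already used in this PR
        api_key=os.environ.get('MODELSCOPE_API_KEY'),
    )
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

# workflow.py and balanced_spatial_system.py would then both do
#     from .utils import modai_model_request
# so neither module needs to import the other.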

import ast
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass
from optimized_manim_prompts import OptimizedManimPrompts, FixContext

critical

The import from optimized_manim_prompts import OptimizedManimPrompts, FixContext is problematic. Since manim_quality_controller.py and optimized_manim_prompts.py live in the same core directory, this should be a relative import: from .optimized_manim_prompts import .... The current form only works if the core directory happens to be on PYTHONPATH, which is a fragile practice and can lead to module-not-found errors.

Suggested change
from optimized_manim_prompts import OptimizedManimPrompts, FixContext
from .optimized_manim_prompts import OptimizedManimPrompts, FixContext

Comment on lines +95 to +119
def process_manim_code(self, raw_code, scene_name, content_description = ""):
    """Main processing flow."""

    log = []
    log.append(f" 开始处理 {scene_name}")

    # Step 1: code pre-processing
    log.append(" 步骤1: 代码预处理...")
    current_code = self.preprocessor.get_clean_code(raw_code)

    # Step 2: quality check
    log.append(" 步骤2: 质量检查...")
    report = self.preprocessor.preprocess_code(current_code, scene_name)

    # If the quality is already good, return directly
    if not report.needs_llm_fix:
        log.append(" 代码质量良好,无需修复")
        return ProcessingResult(
            success=True,
            final_code=current_code,
            attempts_used=0,
            issues_resolved=[],
            remaining_issues=report.layout_issues,
            processing_log=log
        )

critical

The process_manim_code method appears to be broken or unfinished.

  • It calls self.preprocessor.get_clean_code(raw_code) and self.preprocessor.preprocess_code(current_code, scene_name), but the ManimCodePreprocessor class has no get_clean_code method, and preprocess_code takes only one argument.
  • It accesses report.needs_llm_fix, report.layout_issues, report.complexity_score, and report.confidence, none of which are defined on the CodeQualityReport dataclass.

This will raise AttributeError at runtime. Please fix the implementation of this class.

Comment on lines +102 to +103
if getattr(self.args, 'animation_mode', None):
    os.environ['MS_ANIMATION_MODE'] = self.args.animation_mode

medium

Passing configuration through the MS_ANIMATION_MODE environment variable is an implicit dependency, which makes the code harder to trace, test, and maintain. Other developers reading video_agent.py may not know where this environment variable is set. Consider passing the value explicitly down the call chain, for example via engine.run or **kwargs.
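
A sketch of explicit passing under assumed names (the VideoAgent constructor and CLI glue below are illustrative, not the actual ms-agent API):

# Illustrative only: hand animation_mode to the agent directly instead of via os.environ.
class VideoAgent:
    def __init__(self, config, animation_mode: str = 'auto'):
        self.config = config
        self.animation_mode = animation_mode  # 'auto' or 'human'

def run_from_cli(args):
    # args.animation_mode comes from the --animation_mode flag added in ms_agent/cli/run.py
    return VideoAgent(config=args.config,
                      animation_mode=getattr(args, 'animation_mode', 'auto'))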

Comment on lines +36 to +71
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode auto --trust_remote_code true
```

The output will be written to `ms-agent/projects/video_generate/output/<topic>/`

## Run mode 2: human-in-the-loop (human)

Suited to workflows where a person needs to control the animations: the pipeline automatically produces the script, voice-over, illustrations, subtitles, and placeholder foregrounds; you then build and approve each segment's foreground animation in the human animation studio, and finally trigger the full composition in one step.

1) First generate the assets (without automatically rendering Manim)
```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode human --trust_remote_code true
```

2) Open the human animation studio (pointing at the topic directory generated in the previous step)
```powershell
# Make sure the ms-agent package directory is on PYTHONPATH
$env:PYTHONPATH="<local-project-dir>\ms-agent"
# Launch the interactive studio as a module
python -m projects.video_generate.core.human_animation_studio "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>"
```

In the studio:
- 1 View pending tasks → 2 Start producing an animation → generate/improve the Manim code → create a preview → approve the animation
- Once every segment is finished, the studio automatically merges the foregrounds and runs the full composition (background + subtitles + audio + foreground + music) to produce the final video

## Run mode 3: compose only (assets already exist)

If the directory already contains `asset_info.json` (or you just want to re-compose):

```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow_from_assets.yaml" `
--query "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>\asset_info.json" `
--animation_mode human `
--trust_remote_code true

medium

The example command paths and environment-variable setup in the README may confuse users.

  • In ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml", the ms-agent/projects/... path looks odd. If the user runs the command from the repository root, the path should be projects/video_generate/workflow.yaml. Please state the expected working directory and fix the paths accordingly.
  • $env:PYTHONPATH="<local-project-dir>\ms-agent" is PowerShell syntax. Although the docs recommend Windows, providing the bash/zsh equivalent (export PYTHONPATH="<local-project-dir>/ms-agent") would be friendlier to cross-platform users.
  • The \ path separator used in the commands is Windows-specific; using / consistently would improve cross-platform compatibility.

Comment on lines +152 to +164
try:
    with open(self.tasks_file, 'r', encoding='utf-8') as f:
        tasks_data = json.load(f)

    for task_id, task_dict in tasks_data.items():
        # Restore the enum types
        task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
        task_dict['status'] = AnimationStatus(task_dict['status'])

        self.tasks[task_id] = AnimationTask(**task_dict)

except Exception as e:
    print(f"加载任务文件失败: {e}")

medium

The load_tasks method uses an overly broad except Exception as e. This catches every exception type and can mask real bugs, such as json.JSONDecodeError (corrupted file), FileNotFoundError (missing file), or KeyError (unexpected JSON structure). Consider catching more specific exception types and logging clearer error messages to make debugging easier.

Suggested change
try:
    with open(self.tasks_file, 'r', encoding='utf-8') as f:
        tasks_data = json.load(f)
    for task_id, task_dict in tasks_data.items():
        # Restore the enum types
        task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
        task_dict['status'] = AnimationStatus(task_dict['status'])
        self.tasks[task_id] = AnimationTask(**task_dict)
except Exception as e:
    print(f"加载任务文件失败: {e}")
try:
    with open(self.tasks_file, 'r', encoding='utf-8') as f:
        tasks_data = json.load(f)
    for task_id, task_dict in tasks_data.items():
        # Restore the enum types
        task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
        task_dict['status'] = AnimationStatus(task_dict['status'])
        self.tasks[task_id] = AnimationTask(**task_dict)
except FileNotFoundError:
    # A missing file is normal on first run; no error message needed
    return
except (json.JSONDecodeError, KeyError) as e:
    print(f"加载或解析任务文件失败 ({self.tasks_file}): {e}")
except Exception as e:
    print(f"加载任务时发生未知错误: {e}")

# Add the placeholder text
try:
    # Try to use the custom font
    font_path = os.path.join(os.path.dirname(__file__), 'asset', '字魂龙吟手书(商用需授权).ttf')

medium

The font file name used here, 字魂龙吟手书(商用需授权).ttf, is inconsistent with 字小魂扶摇手书(商用需授权).ttf used in README.md and background_image.py. Please unify the file name to avoid asset-loading failures.

Suggested change
font_path = os.path.join(os.path.dirname(__file__), 'asset', '字魂龙吟手书(商用需授权).ttf')
font_path = os.path.join(os.path.dirname(__file__), 'asset', '字小魂扶摇手书(商用需授权).ttf')

Comment on lines +254 to +256
except Exception as e:
    print(f"创建占位符视频失败: {e}")
    return None

medium

When the ffmpeg process fails, subprocess.run raises CalledProcessError. The current handler catches the error but does not print ffmpeg's stderr, which makes debugging difficult. Consider printing e.stderr when catching the exception.

Suggested change
except Exception as e:
    print(f"创建占位符视频失败: {e}")
    return None
except subprocess.CalledProcessError as e:
    print(f"创建占位符视频失败: {e}")
    if e.stderr:
        print(f"FFmpeg stderr: {e.stderr.decode('utf-8', errors='ignore')}")
    return None
except Exception as e:
    print(f"创建占位符视频失败: {e}")
    return None

Comment on lines +98 to +101
client = OpenAI(
    base_url='https://api-inference.modelscope.cn/v1',
    api_key=os.environ.get('MODELSCOPE_API_KEY'),
)

medium

The ModelScope API endpoint and model name are hard-coded here. For flexibility and maintainability, consider moving these values into a configuration file or passing them in through the class constructor.
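
One possible shape for this, assuming a hypothetical MODELSCOPE_BASE_URL override (the default is the endpoint already used in the PR):

import os
from openai import OpenAI

# Hypothetical override: let deployments change the endpoint without editing code.
MODELSCOPE_BASE_URL = os.environ.get('MODELSCOPE_BASE_URL',
                                     'https://api-inference.modelscope.cn/v1')

def make_modelscope_client() -> OpenAI:
    return OpenAI(
        base_url=MODELSCOPE_BASE_URL,
        api_key=os.environ.get('MODELSCOPE_API_KEY'),
    )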

Comment on lines +64 to +341
def _generate_assets_from_script(self, script_path: str, topic: str) -> str:
    """
    Parses the script, generates TTS, animations, and subtitles.
    This function is a wrapper around the core logic in workflow.py.
    """
    print("[video_agent] Starting asset generation from script...")

    with open(script_path, 'r', encoding='utf-8') as f:
        script = f.read()

    # Resolve topic from meta if available to keep consistent with original query
    try:
        meta_path = os.path.join(os.path.dirname(script_path), 'meta.json')
        if os.path.exists(meta_path):
            meta = json.load(open(meta_path, 'r', encoding='utf-8'))
            topic = meta.get('topic', topic)
    except Exception as e:
        print(f"[video_agent] Failed to read topic from meta.json: {e}")

    # Use the script's directory as the output directory for this topic
    full_output_dir = os.path.dirname(script_path)
    os.makedirs(full_output_dir, exist_ok=True)

    # 1. Parse script into segments
    print("[video_agent] Parsing script into segments...")
    segments = video_workflow.parse_structured_content(script)

    # Further split long text segments
    final_segments = []
    for segment in segments:
        if segment['type'] == 'text' and len(segment['content']) > 100:
            subsegments = video_workflow.split_text_by_punctuation(segment['content'])
            for subseg_dict in subsegments:
                if subseg_dict['content'].strip():
                    final_segments.append({
                        'content': subseg_dict['content'].strip(),
                        'type': 'text',
                        'parent_segment': segment
                    })
        else:
            final_segments.append(segment)
    segments = final_segments
    print(f"[video_agent] Script parsed into {len(segments)} segments.")

    # 2. Generate assets for each segment
    asset_paths = {
        "audio_paths": [],
        "foreground_paths": [],
        "subtitle_paths": [],
        "illustration_paths": [],
        "subtitle_segments_list": []
    }

    tts_dir = os.path.join(full_output_dir, "audio")
    os.makedirs(tts_dir, exist_ok=True)

    subtitle_dir = os.path.join(full_output_dir, "subtitles")
    os.makedirs(subtitle_dir, exist_ok=True)

    # Prepare illustration paths list aligned to segments
    illustration_paths: List[str] = []

    for i, segment in enumerate(segments):
        print(f"[video_agent] Processing segment {i+1}/{len(segments)}: {segment['type']}")

        # Clean content to avoid issues with markers
        tts_text = video_workflow.clean_content(segment.get('content', ''))

        # Generate TTS
        audio_path = os.path.join(tts_dir, f"segment_{i+1}.mp3")
        if tts_text:
            if video_workflow.edge_tts_generate(tts_text, audio_path):
                segment['audio_duration'] = video_workflow.get_audio_duration(audio_path)
            else:
                video_workflow.create_silent_audio(audio_path, duration=3.0)
                segment['audio_duration'] = 3.0
        else:
            video_workflow.create_silent_audio(audio_path, duration=2.0)
            segment['audio_duration'] = 2.0
        asset_paths["audio_paths"].append(audio_path)

        # Generate Animation (only for non-text types)
        if segment['type'] != 'text' and self.animation_mode != 'human':
            manim_code = video_workflow.generate_manim_code(
                content=video_workflow.clean_content(segment['content']),
                content_type=segment['type'],
                scene_number=i + 1,
                audio_duration=segment.get('audio_duration', 8.0),
                main_theme=topic,
                context_segments=segments,
                segment_index=i,
                total_segments=segments
            )
            video_path = None
            if manim_code:
                scene_name = f"Scene{i+1}"
                scene_dir = os.path.join(full_output_dir, f"scene_{i+1}")
                video_path = video_workflow.render_manim_scene(manim_code, scene_name, scene_dir)
            asset_paths["foreground_paths"].append(video_path)
        else:
            # In human mode, skip auto manim rendering (leave placeholders)
            asset_paths["foreground_paths"].append(None)

        # Initialize placeholders for subtitles; will fill after loop
        illustration_paths.append(None)
        asset_paths["subtitle_paths"].append(None)
        asset_paths["subtitle_segments_list"].append([])

    # Generate illustrations for text segments (mirrors original logic)
    try:
        text_segments = [seg for seg in segments if seg.get('type') == 'text']
        if text_segments:
            illustration_prompts_path = os.path.join(full_output_dir, 'illustration_prompts.json')
            if os.path.exists(illustration_prompts_path):
                illustration_prompts = json.load(open(illustration_prompts_path, 'r', encoding='utf-8'))
            else:
                illustration_prompts = video_workflow.generate_illustration_prompts([seg['content'] for seg in text_segments])
                json.dump(illustration_prompts, open(illustration_prompts_path, 'w', encoding='utf-8'), ensure_ascii=False, indent=2)

            images_dir = os.path.join(full_output_dir, 'images')
            os.makedirs(images_dir, exist_ok=True)
            image_paths_path = os.path.join(images_dir, 'image_paths.json')
            if os.path.exists(image_paths_path):
                image_paths = json.load(open(image_paths_path, 'r', encoding='utf-8'))
            else:
                image_paths = video_workflow.generate_images(illustration_prompts, output_dir=full_output_dir)
                # move to images folder for consistent paths
                for i, img_path in enumerate(image_paths):
                    if os.path.exists(img_path):
                        new_path = os.path.join(images_dir, f'illustration_{i+1}.png' if img_path.lower().endswith('.png') else f'illustration_{i+1}.jpg')
                        try:
                            os.replace(img_path, new_path)
                        except Exception:
                            try:
                                import shutil
                                shutil.move(img_path, new_path)
                            except Exception:
                                new_path = img_path
                        image_paths[i] = new_path
                json.dump(image_paths, open(image_paths_path, 'w', encoding='utf-8'), ensure_ascii=False, indent=2)

            fg_out_dir = os.path.join(images_dir, 'output_black_only')
            os.makedirs(fg_out_dir, exist_ok=True)
            # process background removal if needed
            if len([f for f in os.listdir(fg_out_dir) if f.lower().endswith('.png')]) < len(image_paths):
                video_workflow.keep_only_black_for_folder(images_dir, fg_out_dir)

            # map illustrations back to segment indices
            text_idx = 0
            for idx, seg in enumerate(segments):
                if seg.get('type') == 'text':
                    if text_idx < len(image_paths):
                        transparent_path = os.path.join(fg_out_dir, f'illustration_{text_idx+1}.png')
                        if os.path.exists(transparent_path):
                            illustration_paths[idx] = transparent_path
                        else:
                            illustration_paths[idx] = image_paths[text_idx]
                        text_idx += 1
                    else:
                        illustration_paths[idx] = None
                else:
                    illustration_paths[idx] = None
        else:
            illustration_paths = [None] * len(segments)
    except Exception as e:
        print(f"[video_agent] Illustration generation failed: {e}")
        illustration_paths = [None] * len(segments)

    # Attach illustration paths to asset_paths
    asset_paths["illustration_paths"] = illustration_paths

    # Generate bilingual subtitles
    def _split_subtitles(text: str, max_chars: int = 30) -> List[str]:
        import re
        sentences = re.split(r'([。!?;,、])', text)
        subs, cur = [], ""
        for s in sentences:
            if not s.strip():
                continue
            test = cur + s
            if len(test) <= max_chars:
                cur = test
            else:
                if cur:
                    subs.append(cur.strip())
                cur = s
        if cur.strip():
            subs.append(cur.strip())
        return subs

    for i, seg in enumerate(segments):
        try:
            if seg.get('type') != 'text':
                zh_text = seg.get('explanation', '') or seg.get('content', '')
                parts = _split_subtitles(zh_text, max_chars=30)
                img_list = []
                for idx_p, part in enumerate(parts):
                    sub_en = video_workflow.translate_text_to_english(part)
                    temp_path, _h = video_workflow.create_bilingual_subtitle_image(
                        zh_text=part,
                        en_text=sub_en,
                        width=1720,
                        height=120
                    )
                    if temp_path and os.path.exists(temp_path):
                        final_sub_path = os.path.join(subtitle_dir, f"bilingual_subtitle_{i+1}_{idx_p+1}.png")
                        try:
                            os.replace(temp_path, final_sub_path)
                        except Exception:
                            import shutil
                            shutil.move(temp_path, final_sub_path)
                        img_list.append(final_sub_path)
                asset_paths["subtitle_segments_list"][i] = img_list
                asset_paths["subtitle_paths"][i] = img_list[0] if img_list else None
            else:
                zh_text = seg.get('content', '')
                en_text = video_workflow.translate_text_to_english(zh_text)
                temp_path, _h = video_workflow.create_bilingual_subtitle_image(
                    zh_text=zh_text,
                    en_text=en_text,
                    width=1720,
                    height=120
                )
                if temp_path and os.path.exists(temp_path):
                    final_sub_path = os.path.join(subtitle_dir, f"bilingual_subtitle_{i+1}.png")
                    try:
                        os.replace(temp_path, final_sub_path)
                    except Exception:
                        import shutil
                        shutil.move(temp_path, final_sub_path)
                    asset_paths["subtitle_paths"][i] = final_sub_path
                    asset_paths["subtitle_segments_list"][i] = [final_sub_path]
        except Exception as e:
            print(f"[video_agent] Subtitle generation failed at segment {i+1}: {e}")

    # Save all necessary info for the next step
    asset_info = {
        "topic": topic,
        "output_dir": full_output_dir,
        "segments": segments,
        "asset_paths": asset_paths,
        "animation_mode": self.animation_mode
    }
    asset_info_path = os.path.join(full_output_dir, "asset_info.json")
    with open(asset_info_path, 'w', encoding='utf-8') as f:
        json.dump(asset_info, f, ensure_ascii=False, indent=2)

    # For compatibility with the studio's full composition step, also write segments.json
    try:
        with open(os.path.join(full_output_dir, 'segments.json'), 'w', encoding='utf-8') as sf:
            json.dump(segments, sf, ensure_ascii=False, indent=2)
    except Exception as _e:
        print(f"[video_agent] 写入 segments.json 失败: {_e}")

    # In human mode, drop a short README to guide manual studio
    if self.animation_mode == 'human':
        readme_path = os.path.join(full_output_dir, 'HUMAN_README.txt')
        try:
            with open(readme_path, 'w', encoding='utf-8') as rf:
                rf.write(
                    "本目录为人工动画模式生成的素材预备目录\n"
                    "- 已生成脚本、语音、插画、字幕与占位前景(无自动动画)\n"
                    "- 下一步:进入互动动画工作室制作每个动画片段\n\n"
                    "启动命令示例:\n"
                    "# 先确保将 ms-agent 目录加入 PYTHONPATH 环境变量\n"
                    "# PowerShell:\n"
                    "# $env:PYTHONPATH=\"{}\"\n"
                    "# 然后以模块方式启动工作室:\n"
                    "python -m projects.video_generate.core.human_animation_studio \"{}\"\n".format(
                        os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')),  # ms-agent root directory
                        full_output_dir
                    )
                )
        except Exception as _e:
            print(f"[video_agent] Failed to write HUMAN_README: {_e}")

    print(f"[video_agent] Asset generation complete. Info saved to {asset_info_path}")
    return asset_info_path

medium

The _generate_assets_from_script method is too large and complex and takes on too many responsibilities: parsing the script, generating TTS, generating illustrations, generating subtitles, handling file paths, and more. This violates the single-responsibility principle and makes the code hard to read, test, and maintain. Consider refactoring it into smaller, single-purpose helpers such as _generate_tts_for_segments, _generate_illustrations_for_segments, and _generate_subtitles_for_segments.
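
A rough skeleton of that decomposition (the helper methods below follow the reviewer's suggested names and do not exist in the PR as written):

def _generate_assets_from_script(self, script_path: str, topic: str) -> str:
    # Illustrative orchestration only; each helper owns a single responsibility.
    segments = self._parse_segments(script_path)
    audio_paths = self._generate_tts_for_segments(segments)
    illustration_paths = self._generate_illustrations_for_segments(segments)
    subtitle_paths = self._generate_subtitles_for_segments(segments)
    return self._write_asset_info(topic, segments, audio_paths,
                                  illustration_paths, subtitle_paths)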
