@xiaoxianhjy

Video Generate

A workflow for short AI science-popularization videos. It supports both a fully automatic mode and a human-in-the-loop mode, produces the script, voice-over, illustrations/animations, and subtitles, and composites them into a finished video.

Quick checks (read this first)

Before the first run, it is recommended to verify the following:

  1. Runtime environment
  • Windows / Python 3.10+ (recommended)
  • FFmpeg installed and on PATH (ffmpeg -version runs)
  • Manim available (manim -h runs)
  2. Python dependencies (if not yet installed)
  • Listed in the repository's requirements, or install as needed: moviepy, Pillow, edge-tts, matplotlib, etc.
  3. Asset files (shipped with the repository)
  • Custom font and background music: projects/video_generate/core/asset/
    • bg_audio.mp3
    • 字小魂扶摇手书(商用需授权).ttf (the filename notes that commercial use requires a license)
  4. Optional API key (typically needed for fully automatic mode)
  • MODELSCOPE_API_KEY: used for ModelScope model calls

Tip: you can still run the compose-only and human modes without the key, but fully automatic mode may fail because LLM capabilities are unavailable.
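
A quick sanity check of the environment from a PowerShell prompt (the pip line is only needed if dependencies are missing):

ffmpeg -version
manim -h
python --version
pip install moviepy Pillow edge-tts matplotlib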

Run mode 1: fully automatic (auto)

Generate and compose a video from scratch, starting from a topic:

# Optional: set the API key
$env:MODELSCOPE_API_KEY="<your-ModelScope-key>"

# Run the three-step workflow (script → assets → composition)
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode auto --trust_remote_code true

The output will be written to ms-agent/projects/video_generate/output/<topic>/

Run mode 2: human-in-the-loop (human)

Suited to workflows where a person needs to control the animations: the pipeline automatically produces the script, voice-over, illustrations, subtitles, and placeholder foregrounds; you then build and approve each segment's foreground animation in the human animation studio, and finally trigger the full composition in one step.

  1. First generate the assets (without automatically rendering Manim)
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode human --trust_remote_code true
  2. Open the human animation studio (pointing at the topic directory generated in the previous step)
# Make sure the ms-agent package directory is on PYTHONPATH
$env:PYTHONPATH="<local-project-dir>\ms-agent"

# Launch the interactive studio as a module
python -m projects.video_generate.core.human_animation_studio "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>"
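
On bash/zsh, the equivalent (using forward slashes) is:

export PYTHONPATH="<local-project-dir>/ms-agent"
python -m projects.video_generate.core.human_animation_studio "<local-project-dir>/ms-agent/projects/video_generate/output/<topic>"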

Inside the studio:

  • 1 View pending tasks → 2 Start producing an animation → generate/improve the Manim code → create a preview → approve the animation
  • Once every segment is finished, the studio automatically merges the foregrounds and runs the full composition (background + subtitles + audio + foreground + music) to produce the final video

Run mode 3: compose only (assets already exist)

If the directory already contains asset_info.json (or you just want to re-compose):

ms-agent run --config "ms-agent/projects/video_generate/workflow_from_assets.yaml" `
	--query "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>\asset_info.json" `
	--animation_mode human `
	--trust_remote_code true

This flow only performs composition; it does not regenerate the script, illustrations, or animations. If approved transparent foregrounds exist (finals/scene_*_final.mov), they are used by preference.
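
For reference, the asset_info.json written by the asset-generation step carries roughly the following top-level fields (as seen in video_agent.py; a sketch, not a formal schema):

  • topic, output_dir, animation_mode ("auto" or "human")
  • segments: the parsed script segments (type, content, audio_duration, …)
  • asset_paths: audio_paths, foreground_paths, subtitle_paths, illustration_paths, subtitle_segments_list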

Directory layout

  • video_agent.py: agent wrapper around the three-step logic
  • workflow.yaml: three-step orchestration; workflow_from_assets.yaml: compose-only orchestration
  • core/workflow.py: main pipeline; core/human_animation_studio.py: the human animation studio
  • core/asset/: font and background music
  • output/: run artifacts
  • scripts/compose_from_asset_info.py: helper script that composes directly from an existing asset_info.json

@gemini-code-assist

Summary of Changes

Hello @xiaoxianhjy, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a powerful "Video Generate" feature aimed at automating and assisting the creation of short AI science-popularization videos. By offering a flexible choice of animation modes (fully automatic or human-assisted), it greatly improves both the efficiency and controllability of video production. The core changes include extended command-line arguments, propagation of the mode via an environment variable, and a set of newly developed modules that work together to handle animation tasks, generate high-quality visual assets, and enforce layout and content quality for Manim animation code. Clear documentation and workflow configurations are also provided so users can get started quickly and customize the video generation process.

Highlights

  • New feature: video generation workflow: introduces a complete workflow for AI science-popularization short videos, supporting fully automatic and human-assisted modes and covering generation of the script, voice-over, illustrations/animations, and subtitles plus final composition.
  • Animation mode control: adds a new --animation_mode command-line argument in ms_agent/cli/run.py, letting users choose auto (fully automatic) or human (human-assisted) animation production; the chosen mode is passed to downstream agents via an environment variable.
  • Core module enhancements: adds several new core Python modules, including animation production-mode management, background image generation, Manim spatial-layout checking, an enhanced Manim prompt system, and visual quality assessment, to ensure quality and flexibility of the generated videos.
  • New agent and workflow definitions: introduces video_agent.py as a new agent and defines new YAML workflow files (workflow.yaml, workflow_from_assets.yaml, etc.) to orchestrate the stages of the video generation process.
  • Detailed documentation: adds a detailed README.md under projects/video_generate with a complete guide to environment setup, the three run modes (fully automatic, human, compose-only), directory structure, and common issues.

@gemini-code-assist (bot) left a comment

Code Review

This PR adds a powerful "AI science-popularization short video" workflow. The code is well structured and covers both fully automatic and human-in-the-loop modes, which is excellent. The multi-round loop of code generation, static analysis, and self-repair is a particularly advanced design.

That said, there are some areas that could be improved:

  1. Circular dependencies: several modules under core import each other, for example workflow.py and balanced_spatial_system.py, which will make the program fail at startup. Consider extracting the shared functions (such as modai_model_request) into a standalone utility module.
  2. Code quality: some of the newly added code has problems; for example, the ManimQualityController class in manim_quality_controller.py appears unfinished or broken and calls methods that do not exist.
  3. Hard-coded configuration: model names and API endpoints are hard-coded in several places; moving them into a configuration file would make them easier to change and manage.

Overall this is a great feature; once the issues above are addressed it will be much more robust and maintainable.


try:
    # Call the LLM to perform the fix
    from .workflow import modai_model_request

critical

The import from .workflow import modai_model_request here creates a circular dependency. workflow.py imports balanced_spatial_system.py at the top of the module, and this line imports workflow.py back, so Python will raise an ImportError at startup. Consider moving the modai_model_request function into a standalone utility module (for example core/utils.py) and having both workflow.py and balanced_spatial_system.py import it from there. The same issue also exists in manim_quality_controller.py.
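
A minimal sketch of that restructuring (the helper body and signature below are illustrative; the real modai_model_request may differ):

# core/utils.py (hypothetical new module): single home for the shared LLM helper
import os
from openai import OpenAI

def modai_model_request(messages, model):
    """Illustrative signature: call a ModelScope-hosted model via the OpenAI-compatible API."""
    client = OpenAI(
        base_url='https://api-inference.modelscope.cn/v1',  # endpoint already used in this PR
        api_key=os.environ.get('MODELSCOPE_API_KEY'),
    )
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

# workflow.py and balanced_spatial_system.py would then both do
#     from .utils import modai_model_request
# so neither module needs to import the other.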

import ast
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass
from optimized_manim_prompts import OptimizedManimPrompts, FixContext

critical

The import from optimized_manim_prompts import OptimizedManimPrompts, FixContext is problematic. Since manim_quality_controller.py and optimized_manim_prompts.py live in the same core directory, this should be a relative import: from .optimized_manim_prompts import .... The current form only works if the core directory happens to be on PYTHONPATH, which is a fragile practice and can lead to module-not-found errors.

Suggested change
from optimized_manim_prompts import OptimizedManimPrompts, FixContext
from .optimized_manim_prompts import OptimizedManimPrompts, FixContext

Comment on lines +95 to +119
def process_manim_code(self, raw_code, scene_name, content_description = ""):
    """Main processing flow."""

    log = []
    log.append(f" 开始处理 {scene_name}")

    # Step 1: code pre-processing
    log.append(" 步骤1: 代码预处理...")
    current_code = self.preprocessor.get_clean_code(raw_code)

    # Step 2: quality check
    log.append(" 步骤2: 质量检查...")
    report = self.preprocessor.preprocess_code(current_code, scene_name)

    # If the quality is already good, return directly
    if not report.needs_llm_fix:
        log.append(" 代码质量良好,无需修复")
        return ProcessingResult(
            success=True,
            final_code=current_code,
            attempts_used=0,
            issues_resolved=[],
            remaining_issues=report.layout_issues,
            processing_log=log
        )

critical

The process_manim_code method appears to be broken or unfinished.

  • It calls self.preprocessor.get_clean_code(raw_code) and self.preprocessor.preprocess_code(current_code, scene_name), but the ManimCodePreprocessor class has no get_clean_code method, and preprocess_code takes only one argument.
  • It accesses report.needs_llm_fix, report.layout_issues, report.complexity_score, and report.confidence, none of which are defined on the CodeQualityReport dataclass.

This will raise AttributeError at runtime. Please fix the implementation of this class.

Comment on lines +102 to +103
if getattr(self.args, 'animation_mode', None):
    os.environ['MS_ANIMATION_MODE'] = self.args.animation_mode

medium

Passing configuration through the MS_ANIMATION_MODE environment variable is an implicit dependency, which makes the code harder to trace, test, and maintain. Other developers reading video_agent.py may not know where this environment variable is set. Consider passing the value explicitly down the call chain, for example via engine.run or **kwargs.
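
A sketch of explicit passing under assumed names (the VideoAgent constructor and CLI glue below are illustrative, not the actual ms-agent API):

# Illustrative only: hand animation_mode to the agent directly instead of via os.environ.
class VideoAgent:
    def __init__(self, config, animation_mode: str = 'auto'):
        self.config = config
        self.animation_mode = animation_mode  # 'auto' or 'human'

def run_from_cli(args):
    # args.animation_mode comes from the --animation_mode flag added in ms_agent/cli/run.py
    return VideoAgent(config=args.config,
                      animation_mode=getattr(args, 'animation_mode', 'auto'))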

Comment on lines +36 to +71
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode auto --trust_remote_code true
```

The output will be written to `ms-agent/projects/video_generate/output/<topic>/`

## Run mode 2: human-in-the-loop (human)

Suited to workflows where a person needs to control the animations: the pipeline automatically produces the script, voice-over, illustrations, subtitles, and placeholder foregrounds; you then build and approve each segment's foreground animation in the human animation studio, and finally trigger the full composition in one step.

1) First generate the assets (without automatically rendering Manim)
```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode human --trust_remote_code true
```

2) Open the human animation studio (pointing at the topic directory generated in the previous step)
```powershell
# Make sure the ms-agent package directory is on PYTHONPATH
$env:PYTHONPATH="<local-project-dir>\ms-agent"
# Launch the interactive studio as a module
python -m projects.video_generate.core.human_animation_studio "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>"
```

In the studio:
- 1 View pending tasks → 2 Start producing an animation → generate/improve the Manim code → create a preview → approve the animation
- Once every segment is finished, the studio automatically merges the foregrounds and runs the full composition (background + subtitles + audio + foreground + music) to produce the final video

## Run mode 3: compose only (assets already exist)

If the directory already contains `asset_info.json` (or you just want to re-compose):

```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow_from_assets.yaml" `
--query "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>\asset_info.json" `
--animation_mode human `
--trust_remote_code true

medium

The example command paths and environment-variable setup in the README may confuse users.

  • In ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml", the ms-agent/projects/... path looks odd. If the user runs the command from the repository root, the path should be projects/video_generate/workflow.yaml. Please state the expected working directory and fix the paths accordingly.
  • $env:PYTHONPATH="<local-project-dir>\ms-agent" is PowerShell syntax. Although the docs recommend Windows, providing the bash/zsh equivalent (export PYTHONPATH="<local-project-dir>/ms-agent") would be friendlier to cross-platform users.
  • The \ path separator used in the commands is Windows-specific; using / consistently would improve cross-platform compatibility.

Comment on lines +152 to +164
try:
    with open(self.tasks_file, 'r', encoding='utf-8') as f:
        tasks_data = json.load(f)

    for task_id, task_dict in tasks_data.items():
        # Restore the enum types
        task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
        task_dict['status'] = AnimationStatus(task_dict['status'])

        self.tasks[task_id] = AnimationTask(**task_dict)

except Exception as e:
    print(f"加载任务文件失败: {e}")

medium

The load_tasks method uses an overly broad except Exception as e. This catches every exception type and can mask real bugs, such as json.JSONDecodeError (corrupted file), FileNotFoundError (missing file), or KeyError (unexpected JSON structure). Consider catching more specific exception types and logging clearer error messages to make debugging easier.

Suggested change
try:
    with open(self.tasks_file, 'r', encoding='utf-8') as f:
        tasks_data = json.load(f)
    for task_id, task_dict in tasks_data.items():
        # Restore the enum types
        task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
        task_dict['status'] = AnimationStatus(task_dict['status'])
        self.tasks[task_id] = AnimationTask(**task_dict)
except Exception as e:
    print(f"加载任务文件失败: {e}")
try:
    with open(self.tasks_file, 'r', encoding='utf-8') as f:
        tasks_data = json.load(f)
    for task_id, task_dict in tasks_data.items():
        # Restore the enum types
        task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
        task_dict['status'] = AnimationStatus(task_dict['status'])
        self.tasks[task_id] = AnimationTask(**task_dict)
except FileNotFoundError:
    # A missing file is normal on first run; no error message needed
    return
except (json.JSONDecodeError, KeyError) as e:
    print(f"加载或解析任务文件失败 ({self.tasks_file}): {e}")
except Exception as e:
    print(f"加载任务时发生未知错误: {e}")

# Add the placeholder text
try:
    # Try to use the custom font
    font_path = os.path.join(os.path.dirname(__file__), 'asset', '字魂龙吟手书(商用需授权).ttf')

medium

The font file name used here, 字魂龙吟手书(商用需授权).ttf, is inconsistent with 字小魂扶摇手书(商用需授权).ttf used in README.md and background_image.py. Please unify the file name to avoid asset-loading failures.

Suggested change
font_path = os.path.join(os.path.dirname(__file__), 'asset', '字魂龙吟手书(商用需授权).ttf')
font_path = os.path.join(os.path.dirname(__file__), 'asset', '字小魂扶摇手书(商用需授权).ttf')

Comment on lines +254 to +256
except Exception as e:
    print(f"创建占位符视频失败: {e}")
    return None

medium

When the ffmpeg process fails, subprocess.run raises CalledProcessError. The current handler catches the error but does not print ffmpeg's stderr, which makes debugging difficult. Consider printing e.stderr when catching the exception.

Suggested change
except Exception as e:
    print(f"创建占位符视频失败: {e}")
    return None
except subprocess.CalledProcessError as e:
    print(f"创建占位符视频失败: {e}")
    if e.stderr:
        print(f"FFmpeg stderr: {e.stderr.decode('utf-8', errors='ignore')}")
    return None
except Exception as e:
    print(f"创建占位符视频失败: {e}")
    return None

Comment on lines +98 to +101
client = OpenAI(
    base_url='https://api-inference.modelscope.cn/v1',
    api_key=os.environ.get('MODELSCOPE_API_KEY'),
)

medium

The ModelScope API endpoint and model name are hard-coded here. For flexibility and maintainability, consider moving these values into a configuration file or passing them in through the class constructor.
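
One possible shape for this, assuming a hypothetical MODELSCOPE_BASE_URL override (the default is the endpoint already used in the PR):

import os
from openai import OpenAI

# Hypothetical override: let deployments change the endpoint without editing code.
MODELSCOPE_BASE_URL = os.environ.get('MODELSCOPE_BASE_URL',
                                     'https://api-inference.modelscope.cn/v1')

def make_modelscope_client() -> OpenAI:
    return OpenAI(
        base_url=MODELSCOPE_BASE_URL,
        api_key=os.environ.get('MODELSCOPE_API_KEY'),
    )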

Comment on lines +64 to +341
def _generate_assets_from_script(self, script_path: str, topic: str) -> str:
    """
    Parses the script, generates TTS, animations, and subtitles.
    This function is a wrapper around the core logic in workflow.py.
    """
    print("[video_agent] Starting asset generation from script...")

    with open(script_path, 'r', encoding='utf-8') as f:
        script = f.read()

    # Resolve topic from meta if available to keep consistent with original query
    try:
        meta_path = os.path.join(os.path.dirname(script_path), 'meta.json')
        if os.path.exists(meta_path):
            meta = json.load(open(meta_path, 'r', encoding='utf-8'))
            topic = meta.get('topic', topic)
    except Exception as e:
        print(f"[video_agent] Failed to read topic from meta.json: {e}")

    # Use the script's directory as the output directory for this topic
    full_output_dir = os.path.dirname(script_path)
    os.makedirs(full_output_dir, exist_ok=True)

    # 1. Parse script into segments
    print("[video_agent] Parsing script into segments...")
    segments = video_workflow.parse_structured_content(script)

    # Further split long text segments
    final_segments = []
    for segment in segments:
        if segment['type'] == 'text' and len(segment['content']) > 100:
            subsegments = video_workflow.split_text_by_punctuation(segment['content'])
            for subseg_dict in subsegments:
                if subseg_dict['content'].strip():
                    final_segments.append({
                        'content': subseg_dict['content'].strip(),
                        'type': 'text',
                        'parent_segment': segment
                    })
        else:
            final_segments.append(segment)
    segments = final_segments
    print(f"[video_agent] Script parsed into {len(segments)} segments.")

    # 2. Generate assets for each segment
    asset_paths = {
        "audio_paths": [],
        "foreground_paths": [],
        "subtitle_paths": [],
        "illustration_paths": [],
        "subtitle_segments_list": []
    }

    tts_dir = os.path.join(full_output_dir, "audio")
    os.makedirs(tts_dir, exist_ok=True)

    subtitle_dir = os.path.join(full_output_dir, "subtitles")
    os.makedirs(subtitle_dir, exist_ok=True)

    # Prepare illustration paths list aligned to segments
    illustration_paths: List[str] = []

    for i, segment in enumerate(segments):
        print(f"[video_agent] Processing segment {i+1}/{len(segments)}: {segment['type']}")

        # Clean content to avoid issues with markers
        tts_text = video_workflow.clean_content(segment.get('content', ''))

        # Generate TTS
        audio_path = os.path.join(tts_dir, f"segment_{i+1}.mp3")
        if tts_text:
            if video_workflow.edge_tts_generate(tts_text, audio_path):
                segment['audio_duration'] = video_workflow.get_audio_duration(audio_path)
            else:
                video_workflow.create_silent_audio(audio_path, duration=3.0)
                segment['audio_duration'] = 3.0
        else:
            video_workflow.create_silent_audio(audio_path, duration=2.0)
            segment['audio_duration'] = 2.0
        asset_paths["audio_paths"].append(audio_path)

        # Generate Animation (only for non-text types)
        if segment['type'] != 'text' and self.animation_mode != 'human':
            manim_code = video_workflow.generate_manim_code(
                content=video_workflow.clean_content(segment['content']),
                content_type=segment['type'],
                scene_number=i + 1,
                audio_duration=segment.get('audio_duration', 8.0),
                main_theme=topic,
                context_segments=segments,
                segment_index=i,
                total_segments=segments
            )
            video_path = None
            if manim_code:
                scene_name = f"Scene{i+1}"
                scene_dir = os.path.join(full_output_dir, f"scene_{i+1}")
                video_path = video_workflow.render_manim_scene(manim_code, scene_name, scene_dir)
            asset_paths["foreground_paths"].append(video_path)
        else:
            # In human mode, skip auto manim rendering (leave placeholders)
            asset_paths["foreground_paths"].append(None)

        # Initialize placeholders for subtitles; will fill after loop
        illustration_paths.append(None)
        asset_paths["subtitle_paths"].append(None)
        asset_paths["subtitle_segments_list"].append([])

    # Generate illustrations for text segments (mirrors original logic)
    try:
        text_segments = [seg for seg in segments if seg.get('type') == 'text']
        if text_segments:
            illustration_prompts_path = os.path.join(full_output_dir, 'illustration_prompts.json')
            if os.path.exists(illustration_prompts_path):
                illustration_prompts = json.load(open(illustration_prompts_path, 'r', encoding='utf-8'))
            else:
                illustration_prompts = video_workflow.generate_illustration_prompts([seg['content'] for seg in text_segments])
                json.dump(illustration_prompts, open(illustration_prompts_path, 'w', encoding='utf-8'), ensure_ascii=False, indent=2)

            images_dir = os.path.join(full_output_dir, 'images')
            os.makedirs(images_dir, exist_ok=True)
            image_paths_path = os.path.join(images_dir, 'image_paths.json')
            if os.path.exists(image_paths_path):
                image_paths = json.load(open(image_paths_path, 'r', encoding='utf-8'))
            else:
                image_paths = video_workflow.generate_images(illustration_prompts, output_dir=full_output_dir)
                # move to images folder for consistent paths
                for i, img_path in enumerate(image_paths):
                    if os.path.exists(img_path):
                        new_path = os.path.join(images_dir, f'illustration_{i+1}.png' if img_path.lower().endswith('.png') else f'illustration_{i+1}.jpg')
                        try:
                            os.replace(img_path, new_path)
                        except Exception:
                            try:
                                import shutil
                                shutil.move(img_path, new_path)
                            except Exception:
                                new_path = img_path
                        image_paths[i] = new_path
                json.dump(image_paths, open(image_paths_path, 'w', encoding='utf-8'), ensure_ascii=False, indent=2)

            fg_out_dir = os.path.join(images_dir, 'output_black_only')
            os.makedirs(fg_out_dir, exist_ok=True)
            # process background removal if needed
            if len([f for f in os.listdir(fg_out_dir) if f.lower().endswith('.png')]) < len(image_paths):
                video_workflow.keep_only_black_for_folder(images_dir, fg_out_dir)

            # map illustrations back to segment indices
            text_idx = 0
            for idx, seg in enumerate(segments):
                if seg.get('type') == 'text':
                    if text_idx < len(image_paths):
                        transparent_path = os.path.join(fg_out_dir, f'illustration_{text_idx+1}.png')
                        if os.path.exists(transparent_path):
                            illustration_paths[idx] = transparent_path
                        else:
                            illustration_paths[idx] = image_paths[text_idx]
                        text_idx += 1
                    else:
                        illustration_paths[idx] = None
                else:
                    illustration_paths[idx] = None
        else:
            illustration_paths = [None] * len(segments)
    except Exception as e:
        print(f"[video_agent] Illustration generation failed: {e}")
        illustration_paths = [None] * len(segments)

    # Attach illustration paths to asset_paths
    asset_paths["illustration_paths"] = illustration_paths

    # Generate bilingual subtitles
    def _split_subtitles(text: str, max_chars: int = 30) -> List[str]:
        import re
        sentences = re.split(r'([。!?;,、])', text)
        subs, cur = [], ""
        for s in sentences:
            if not s.strip():
                continue
            test = cur + s
            if len(test) <= max_chars:
                cur = test
            else:
                if cur:
                    subs.append(cur.strip())
                cur = s
        if cur.strip():
            subs.append(cur.strip())
        return subs

    for i, seg in enumerate(segments):
        try:
            if seg.get('type') != 'text':
                zh_text = seg.get('explanation', '') or seg.get('content', '')
                parts = _split_subtitles(zh_text, max_chars=30)
                img_list = []
                for idx_p, part in enumerate(parts):
                    sub_en = video_workflow.translate_text_to_english(part)
                    temp_path, _h = video_workflow.create_bilingual_subtitle_image(
                        zh_text=part,
                        en_text=sub_en,
                        width=1720,
                        height=120
                    )
                    if temp_path and os.path.exists(temp_path):
                        final_sub_path = os.path.join(subtitle_dir, f"bilingual_subtitle_{i+1}_{idx_p+1}.png")
                        try:
                            os.replace(temp_path, final_sub_path)
                        except Exception:
                            import shutil
                            shutil.move(temp_path, final_sub_path)
                        img_list.append(final_sub_path)
                asset_paths["subtitle_segments_list"][i] = img_list
                asset_paths["subtitle_paths"][i] = img_list[0] if img_list else None
            else:
                zh_text = seg.get('content', '')
                en_text = video_workflow.translate_text_to_english(zh_text)
                temp_path, _h = video_workflow.create_bilingual_subtitle_image(
                    zh_text=zh_text,
                    en_text=en_text,
                    width=1720,
                    height=120
                )
                if temp_path and os.path.exists(temp_path):
                    final_sub_path = os.path.join(subtitle_dir, f"bilingual_subtitle_{i+1}.png")
                    try:
                        os.replace(temp_path, final_sub_path)
                    except Exception:
                        import shutil
                        shutil.move(temp_path, final_sub_path)
                    asset_paths["subtitle_paths"][i] = final_sub_path
                    asset_paths["subtitle_segments_list"][i] = [final_sub_path]
        except Exception as e:
            print(f"[video_agent] Subtitle generation failed at segment {i+1}: {e}")

    # Save all necessary info for the next step
    asset_info = {
        "topic": topic,
        "output_dir": full_output_dir,
        "segments": segments,
        "asset_paths": asset_paths,
        "animation_mode": self.animation_mode
    }
    asset_info_path = os.path.join(full_output_dir, "asset_info.json")
    with open(asset_info_path, 'w', encoding='utf-8') as f:
        json.dump(asset_info, f, ensure_ascii=False, indent=2)

    # For compatibility with the studio's full composition step, also write segments.json
    try:
        with open(os.path.join(full_output_dir, 'segments.json'), 'w', encoding='utf-8') as sf:
            json.dump(segments, sf, ensure_ascii=False, indent=2)
    except Exception as _e:
        print(f"[video_agent] 写入 segments.json 失败: {_e}")

    # In human mode, drop a short README to guide manual studio
    if self.animation_mode == 'human':
        readme_path = os.path.join(full_output_dir, 'HUMAN_README.txt')
        try:
            with open(readme_path, 'w', encoding='utf-8') as rf:
                rf.write(
                    "本目录为人工动画模式生成的素材预备目录\n"
                    "- 已生成脚本、语音、插画、字幕与占位前景(无自动动画)\n"
                    "- 下一步:进入互动动画工作室制作每个动画片段\n\n"
                    "启动命令示例:\n"
                    "# 先确保将 ms-agent 目录加入 PYTHONPATH 环境变量\n"
                    "# PowerShell:\n"
                    "# $env:PYTHONPATH=\"{}\"\n"
                    "# 然后以模块方式启动工作室:\n"
                    "python -m projects.video_generate.core.human_animation_studio \"{}\"\n".format(
                        os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')),  # ms-agent root directory
                        full_output_dir
                    )
                )
        except Exception as _e:
            print(f"[video_agent] Failed to write HUMAN_README: {_e}")

    print(f"[video_agent] Asset generation complete. Info saved to {asset_info_path}")
    return asset_info_path

medium

The _generate_assets_from_script method is too large and complex and takes on too many responsibilities: parsing the script, generating TTS, generating illustrations, generating subtitles, handling file paths, and more. This violates the single-responsibility principle and makes the code hard to read, test, and maintain. Consider refactoring it into smaller, single-purpose helpers such as _generate_tts_for_segments, _generate_illustrations_for_segments, and _generate_subtitles_for_segments.
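
A rough skeleton of that decomposition (the helper methods below follow the reviewer's suggested names and do not exist in the PR as written):

def _generate_assets_from_script(self, script_path: str, topic: str) -> str:
    # Illustrative orchestration only; each helper owns a single responsibility.
    segments = self._parse_segments(script_path)
    audio_paths = self._generate_tts_for_segments(segments)
    illustration_paths = self._generate_illustrations_for_segments(segments)
    subtitle_paths = self._generate_subtitles_for_segments(segments)
    return self._write_asset_info(topic, segments, audio_paths,
                                  illustration_paths, subtitle_paths)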
