Feat/Added the GenerateVideo feature #759
Conversation
Summary of Changes

Hello @xiaoxianhjy, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a powerful "video generation" feature aimed at automating and assisting the creation of AI popular-science short videos. By offering a flexible choice of animation modes (fully automatic or human-assisted), it greatly improves the efficiency and controllability of video production. The core changes include extended command-line arguments, intelligent propagation of environment variables, and a set of newly developed modules that work together to handle animation tasks, generate high-quality visual assets, and ensure the layout and content quality of the Manim animation code. Clear documentation and workflow configuration are also provided so users can get started and customize the video generation process easily.

Highlights
Code Review
This PR adds a powerful "AI popular-science short video" workflow. The code is well structured and covers both fully automatic and human-in-the-loop modes, which is excellent. In particular, it includes multi-round code generation, static analysis, and self-repair mechanisms, which is an advanced design.

That said, there are a few areas for improvement:
- Circular dependencies: several modules under the `core` directory have circular import problems, e.g. `workflow.py` and `balanced_spatial_system.py`, which will make the program fail at startup. Consider extracting shared functions (such as `modai_model_request`) into a standalone utility module.
- Code quality: some of the newly added code has issues; for example, the `ManimQualityController` class in `manim_quality_controller.py` appears incomplete or broken and calls methods that do not exist.
- Hardcoded configuration: model names and API endpoints are hardcoded in several places; consider moving them into a configuration file for easier modification and management.

Overall this is an excellent feature; fixing the issues above will make it more robust and maintainable.
```python
try:
    # Call the LLM to perform the fix
    from .workflow import modai_model_request
```
The `from .workflow import modai_model_request` here creates a circular dependency: `workflow.py` imports `balanced_spatial_system.py` at the top of the module, and this line imports `workflow.py` back. This will make Python raise an `ImportError` at startup. Consider moving the `modai_model_request` function into a standalone utility module (e.g. `core/utils.py`) and having both `workflow.py` and `balanced_spatial_system.py` import it from there. The same problem exists in `manim_quality_controller.py`.
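A minimal sketch of the suggested extraction. The module name `core/utils.py` and the function signature shown here are assumptions, not part of the PR; the real body would simply move over from `workflow.py` unchanged:

```python
# core/utils.py -- hypothetical shared-helper module (name is an assumption)
# Both workflow.py and balanced_spatial_system.py would import from here,
# so neither needs to import the other and the import cycle disappears.

def modai_model_request(prompt, system_prompt="", max_retries=3):
    """LLM request helper relocated from workflow.py.

    Shown as a stub here: the original implementation (which calls the
    model API) would move into this module unchanged.
    """
    raise NotImplementedError

# In workflow.py and balanced_spatial_system.py, the cyclic imports are
# then replaced with:
#     from .utils import modai_model_request
```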
```python
import ast
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass
from optimized_manim_prompts import OptimizedManimPrompts, FixContext
```
The import `from optimized_manim_prompts import OptimizedManimPrompts, FixContext` is problematic. Since `manim_quality_controller.py` and `optimized_manim_prompts.py` live in the same `core` directory, a relative import should be used: `from .optimized_manim_prompts import ...`. The current form requires the `core` directory to be on `PYTHONPATH`, which is a fragile practice and can lead to module-not-found errors.
Suggested change:
```diff
-from optimized_manim_prompts import OptimizedManimPrompts, FixContext
+from .optimized_manim_prompts import OptimizedManimPrompts, FixContext
```
```python
def process_manim_code(self, raw_code, scene_name, content_description=""):
    """Main processing pipeline"""

    log = []
    log.append(f" Starting processing for {scene_name}")

    # Step 1: preprocessing cleanup
    log.append(" Step 1: code preprocessing...")
    current_code = self.preprocessor.get_clean_code(raw_code)

    # Step 2: quality check
    log.append(" Step 2: quality check...")
    report = self.preprocessor.preprocess_code(current_code, scene_name)

    # If quality is good, return immediately
    if not report.needs_llm_fix:
        log.append(" Code quality is good; no fix needed")
        return ProcessingResult(
            success=True,
            final_code=current_code,
            attempts_used=0,
            issues_resolved=[],
            remaining_issues=report.layout_issues,
            processing_log=log
        )
```
The `process_manim_code` method appears broken or unfinished.
- It calls `self.preprocessor.get_clean_code(raw_code)` and `self.preprocessor.preprocess_code(current_code, scene_name)`, but the `ManimCodePreprocessor` class has no `get_clean_code` method, and `preprocess_code` accepts only one argument.
- It accesses `report.needs_llm_fix`, `report.layout_issues`, `report.complexity_score`, and `report.confidence`, but these attributes are not defined in the `CodeQualityReport` dataclass.

This will raise an `AttributeError` at runtime. Please fix the implementation of this class.
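As one way to make the attribute accesses resolve, `CodeQualityReport` would need at least the fields listed above. A hedged sketch follows; the field names come from the attribute accesses in `process_manim_code`, but the types and defaults are guesses for illustration only, not taken from the PR:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CodeQualityReport:
    # Names mirror the attribute accesses in process_manim_code;
    # types and defaults are assumptions.
    needs_llm_fix: bool = False                              # gates the LLM-fix path
    layout_issues: List[str] = field(default_factory=list)   # surfaced as remaining_issues
    complexity_score: float = 0.0                            # read by the processing pipeline
    confidence: float = 1.0                                  # read by the processing pipeline
```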
```python
if getattr(self.args, 'animation_mode', None):
    os.environ['MS_ANIMATION_MODE'] = self.args.animation_mode
```
```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode auto --trust_remote_code true
```

Output will be written to `ms-agent/projects/video_generate/output/<topic>/`.

## Mode 2: human mode (human)

For workflows where a human curates the animations: the pipeline automatically produces the script, voice-over, illustrations, subtitles, and placeholder foregrounds; you then produce and approve each foreground animation segment in the "human studio", and finally run a one-click full composition.

1) First generate the assets (no automatic Manim rendering)
```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "<topic>" --animation_mode human --trust_remote_code true
```

2) Open the human studio (pointing at the topic directory generated in the previous step)
```powershell
# Make sure the ms-agent package directory is on PYTHONPATH
$env:PYTHONPATH="<local-project-dir>\ms-agent"
# Launch the interactive studio as a module
python -m projects.video_generate.core.human_animation_studio "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>"
```

In the studio:
- 1 View pending tasks → 2 Start producing an animation → generate/refine Manim code → create a preview → approve the animation
- Once all segments are done, the system automatically merges the foregrounds and runs the full composition (background + subtitles + audio + foreground + music) to produce the final video

## Mode 3: composition only (assets already exist)

If the directory already contains `asset_info.json` (or you just want to re-compose):

```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow_from_assets.yaml" `
  --query "<local-project-dir>\ms-agent\projects\video_generate\output\<topic>\asset_info.json" `
  --animation_mode human `
  --trust_remote_code true
```
The example command paths and environment-variable setup in the README may confuse users.
- The path `ms-agent/projects/video_generate/workflow.yaml` in the command `ms-agent run --config ...` looks odd. If the user runs from the repository root, the path should be `projects/video_generate/workflow.yaml`. Please clarify the expected working directory and fix the paths accordingly.
- `$env:PYTHONPATH="<local-project-dir>\ms-agent"` is PowerShell syntax. Although the docs target Windows, providing the bash/zsh equivalent (`export PYTHONPATH="<local-project-dir>/ms-agent"`) would be friendlier to cross-platform users.
- The `\` path separator in the commands is Windows-specific; consider using `/` consistently for better cross-platform compatibility.
```python
try:
    with open(self.tasks_file, 'r', encoding='utf-8') as f:
        tasks_data = json.load(f)

    for task_id, task_dict in tasks_data.items():
        # Restore enum types
        task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
        task_dict['status'] = AnimationStatus(task_dict['status'])

        self.tasks[task_id] = AnimationTask(**task_dict)

except Exception as e:
    print(f"Failed to load tasks file: {e}")
```
The `load_tasks` method uses an overly broad `except Exception as e`. This catches every exception type and can mask latent bugs, e.g. `json.JSONDecodeError` (corrupted file), `FileNotFoundError` (missing file), or `KeyError` (unexpected JSON structure). Consider catching more specific exception types and logging clearer errors to ease debugging.
Suggested change:
```diff
 try:
     with open(self.tasks_file, 'r', encoding='utf-8') as f:
         tasks_data = json.load(f)
     for task_id, task_dict in tasks_data.items():
         # Restore enum types
         task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
         task_dict['status'] = AnimationStatus(task_dict['status'])
         self.tasks[task_id] = AnimationTask(**task_dict)
-except Exception as e:
-    print(f"Failed to load tasks file: {e}")
+except FileNotFoundError:
+    # A missing tasks file is normal; nothing to report
+    return
+except (json.JSONDecodeError, KeyError) as e:
+    print(f"Failed to load or parse tasks file ({self.tasks_file}): {e}")
+except Exception as e:
+    print(f"Unknown error while loading tasks: {e}")
```
```python
# Add placeholder text
try:
    # Try to use a custom font
    font_path = os.path.join(os.path.dirname(__file__), 'asset', '字魂龙吟手书(商用需授权).ttf')
```
```python
except Exception as e:
    print(f"Failed to create placeholder video: {e}")
    return None
```
When the `ffmpeg` process fails, `subprocess.run` raises `CalledProcessError`. The current handler catches the error but does not print ffmpeg's actual error output (`stderr`), which makes debugging difficult. Consider printing the contents of `e.stderr` when catching the exception.
Suggested change:
```diff
-except Exception as e:
-    print(f"Failed to create placeholder video: {e}")
-    return None
+except subprocess.CalledProcessError as e:
+    print(f"Failed to create placeholder video: {e}")
+    if e.stderr:
+        print(f"FFmpeg stderr: {e.stderr.decode('utf-8', errors='ignore')}")
+    return None
+except Exception as e:
+    print(f"Failed to create placeholder video: {e}")
+    return None
```
```python
client = OpenAI(
    base_url='https://api-inference.modelscope.cn/v1',
    api_key=os.environ.get('MODELSCOPE_API_KEY'),
)
```
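The hardcoded endpoint in the snippet above is one instance of the "hardcoded configuration" point raised in the review summary. A minimal sketch of reading these values from the environment instead; the variable name `MODELSCOPE_BASE_URL` is an assumption (only `MODELSCOPE_API_KEY` appears in the PR):

```python
import os

# Fall back to the currently hardcoded endpoint when the env var is unset,
# so existing behavior is preserved.
MODELSCOPE_BASE_URL = os.environ.get(
    'MODELSCOPE_BASE_URL', 'https://api-inference.modelscope.cn/v1'
)
MODELSCOPE_API_KEY = os.environ.get('MODELSCOPE_API_KEY')

# The client would then be constructed from configuration:
# client = OpenAI(base_url=MODELSCOPE_BASE_URL, api_key=MODELSCOPE_API_KEY)
```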
```python
def _generate_assets_from_script(self, script_path: str, topic: str) -> str:
    """
    Parses the script, generates TTS, animations, and subtitles.
    This function is a wrapper around the core logic in workflow.py.
    """
    print("[video_agent] Starting asset generation from script...")

    with open(script_path, 'r', encoding='utf-8') as f:
        script = f.read()

    # Resolve topic from meta if available to keep consistent with original query
    try:
        meta_path = os.path.join(os.path.dirname(script_path), 'meta.json')
        if os.path.exists(meta_path):
            meta = json.load(open(meta_path, 'r', encoding='utf-8'))
            topic = meta.get('topic', topic)
    except Exception as e:
        print(f"[video_agent] Failed to read topic from meta.json: {e}")

    # Use the script's directory as the output directory for this topic
    full_output_dir = os.path.dirname(script_path)
    os.makedirs(full_output_dir, exist_ok=True)

    # 1. Parse script into segments
    print("[video_agent] Parsing script into segments...")
    segments = video_workflow.parse_structured_content(script)

    # Further split long text segments
    final_segments = []
    for segment in segments:
        if segment['type'] == 'text' and len(segment['content']) > 100:
            subsegments = video_workflow.split_text_by_punctuation(segment['content'])
            for subseg_dict in subsegments:
                if subseg_dict['content'].strip():
                    final_segments.append({
                        'content': subseg_dict['content'].strip(),
                        'type': 'text',
                        'parent_segment': segment
                    })
        else:
            final_segments.append(segment)
    segments = final_segments
    print(f"[video_agent] Script parsed into {len(segments)} segments.")

    # 2. Generate assets for each segment
    asset_paths = {
        "audio_paths": [],
        "foreground_paths": [],
        "subtitle_paths": [],
        "illustration_paths": [],
        "subtitle_segments_list": []
    }

    tts_dir = os.path.join(full_output_dir, "audio")
    os.makedirs(tts_dir, exist_ok=True)

    subtitle_dir = os.path.join(full_output_dir, "subtitles")
    os.makedirs(subtitle_dir, exist_ok=True)

    # Prepare illustration paths list aligned to segments
    illustration_paths: List[str] = []

    for i, segment in enumerate(segments):
        print(f"[video_agent] Processing segment {i+1}/{len(segments)}: {segment['type']}")

        # Clean content to avoid issues with markers
        tts_text = video_workflow.clean_content(segment.get('content', ''))

        # Generate TTS
        audio_path = os.path.join(tts_dir, f"segment_{i+1}.mp3")
        if tts_text:
            if video_workflow.edge_tts_generate(tts_text, audio_path):
                segment['audio_duration'] = video_workflow.get_audio_duration(audio_path)
            else:
                video_workflow.create_silent_audio(audio_path, duration=3.0)
                segment['audio_duration'] = 3.0
        else:
            video_workflow.create_silent_audio(audio_path, duration=2.0)
            segment['audio_duration'] = 2.0
        asset_paths["audio_paths"].append(audio_path)

        # Generate Animation (only for non-text types)
        if segment['type'] != 'text' and self.animation_mode != 'human':
            manim_code = video_workflow.generate_manim_code(
                content=video_workflow.clean_content(segment['content']),
                content_type=segment['type'],
                scene_number=i + 1,
                audio_duration=segment.get('audio_duration', 8.0),
                main_theme=topic,
                context_segments=segments,
                segment_index=i,
                total_segments=segments
            )
            video_path = None
            if manim_code:
                scene_name = f"Scene{i+1}"
                scene_dir = os.path.join(full_output_dir, f"scene_{i+1}")
                video_path = video_workflow.render_manim_scene(manim_code, scene_name, scene_dir)
            asset_paths["foreground_paths"].append(video_path)
        else:
            # In human mode, skip auto manim rendering (leave placeholders)
            asset_paths["foreground_paths"].append(None)

        # Initialize placeholders for subtitles; will fill after loop
        illustration_paths.append(None)
        asset_paths["subtitle_paths"].append(None)
        asset_paths["subtitle_segments_list"].append([])

    # Generate illustrations for text segments (mirrors original logic)
    try:
        text_segments = [seg for seg in segments if seg.get('type') == 'text']
        if text_segments:
            illustration_prompts_path = os.path.join(full_output_dir, 'illustration_prompts.json')
            if os.path.exists(illustration_prompts_path):
                illustration_prompts = json.load(open(illustration_prompts_path, 'r', encoding='utf-8'))
            else:
                illustration_prompts = video_workflow.generate_illustration_prompts([seg['content'] for seg in text_segments])
                json.dump(illustration_prompts, open(illustration_prompts_path, 'w', encoding='utf-8'), ensure_ascii=False, indent=2)

            images_dir = os.path.join(full_output_dir, 'images')
            os.makedirs(images_dir, exist_ok=True)
            image_paths_path = os.path.join(images_dir, 'image_paths.json')
            if os.path.exists(image_paths_path):
                image_paths = json.load(open(image_paths_path, 'r', encoding='utf-8'))
            else:
                image_paths = video_workflow.generate_images(illustration_prompts, output_dir=full_output_dir)
                # move to images folder for consistent paths
                for i, img_path in enumerate(image_paths):
                    if os.path.exists(img_path):
                        new_path = os.path.join(images_dir, f'illustration_{i+1}.png' if img_path.lower().endswith('.png') else f'illustration_{i+1}.jpg')
                        try:
                            os.replace(img_path, new_path)
                        except Exception:
                            try:
                                import shutil
                                shutil.move(img_path, new_path)
                            except Exception:
                                new_path = img_path
                        image_paths[i] = new_path
                json.dump(image_paths, open(image_paths_path, 'w', encoding='utf-8'), ensure_ascii=False, indent=2)

            fg_out_dir = os.path.join(images_dir, 'output_black_only')
            os.makedirs(fg_out_dir, exist_ok=True)
            # process background removal if needed
            if len([f for f in os.listdir(fg_out_dir) if f.lower().endswith('.png')]) < len(image_paths):
                video_workflow.keep_only_black_for_folder(images_dir, fg_out_dir)

            # map illustrations back to segment indices
            text_idx = 0
            for idx, seg in enumerate(segments):
                if seg.get('type') == 'text':
                    if text_idx < len(image_paths):
                        transparent_path = os.path.join(fg_out_dir, f'illustration_{text_idx+1}.png')
                        if os.path.exists(transparent_path):
                            illustration_paths[idx] = transparent_path
                        else:
                            illustration_paths[idx] = image_paths[text_idx]
                        text_idx += 1
                    else:
                        illustration_paths[idx] = None
                else:
                    illustration_paths[idx] = None
        else:
            illustration_paths = [None] * len(segments)
    except Exception as e:
        print(f"[video_agent] Illustration generation failed: {e}")
        illustration_paths = [None] * len(segments)

    # Attach illustration paths to asset_paths
    asset_paths["illustration_paths"] = illustration_paths

    # Generate bilingual subtitles
    def _split_subtitles(text: str, max_chars: int = 30) -> List[str]:
        import re
        sentences = re.split(r'([。!?;,、])', text)
        subs, cur = [], ""
        for s in sentences:
            if not s.strip():
                continue
            test = cur + s
            if len(test) <= max_chars:
                cur = test
            else:
                if cur:
                    subs.append(cur.strip())
                cur = s
        if cur.strip():
            subs.append(cur.strip())
        return subs

    for i, seg in enumerate(segments):
        try:
            if seg.get('type') != 'text':
                zh_text = seg.get('explanation', '') or seg.get('content', '')
                parts = _split_subtitles(zh_text, max_chars=30)
                img_list = []
                for idx_p, part in enumerate(parts):
                    sub_en = video_workflow.translate_text_to_english(part)
                    temp_path, _h = video_workflow.create_bilingual_subtitle_image(
                        zh_text=part,
                        en_text=sub_en,
                        width=1720,
                        height=120
                    )
                    if temp_path and os.path.exists(temp_path):
                        final_sub_path = os.path.join(subtitle_dir, f"bilingual_subtitle_{i+1}_{idx_p+1}.png")
                        try:
                            os.replace(temp_path, final_sub_path)
                        except Exception:
                            import shutil
                            shutil.move(temp_path, final_sub_path)
                        img_list.append(final_sub_path)
                asset_paths["subtitle_segments_list"][i] = img_list
                asset_paths["subtitle_paths"][i] = img_list[0] if img_list else None
            else:
                zh_text = seg.get('content', '')
                en_text = video_workflow.translate_text_to_english(zh_text)
                temp_path, _h = video_workflow.create_bilingual_subtitle_image(
                    zh_text=zh_text,
                    en_text=en_text,
                    width=1720,
                    height=120
                )
                if temp_path and os.path.exists(temp_path):
                    final_sub_path = os.path.join(subtitle_dir, f"bilingual_subtitle_{i+1}.png")
                    try:
                        os.replace(temp_path, final_sub_path)
                    except Exception:
                        import shutil
                        shutil.move(temp_path, final_sub_path)
                    asset_paths["subtitle_paths"][i] = final_sub_path
                    asset_paths["subtitle_segments_list"][i] = [final_sub_path]
        except Exception as e:
            print(f"[video_agent] Subtitle generation failed at segment {i+1}: {e}")

    # Save all necessary info for the next step
    asset_info = {
        "topic": topic,
        "output_dir": full_output_dir,
        "segments": segments,
        "asset_paths": asset_paths,
        "animation_mode": self.animation_mode
    }
    asset_info_path = os.path.join(full_output_dir, "asset_info.json")
    with open(asset_info_path, 'w', encoding='utf-8') as f:
        json.dump(asset_info, f, ensure_ascii=False, indent=2)

    # For compatibility with the studio's full composition, also write segments.json
    try:
        with open(os.path.join(full_output_dir, 'segments.json'), 'w', encoding='utf-8') as sf:
            json.dump(segments, sf, ensure_ascii=False, indent=2)
    except Exception as _e:
        print(f"[video_agent] Failed to write segments.json: {_e}")

    # In human mode, drop a short README to guide manual studio
    if self.animation_mode == 'human':
        readme_path = os.path.join(full_output_dir, 'HUMAN_README.txt')
        try:
            with open(readme_path, 'w', encoding='utf-8') as rf:
                rf.write(
                    "This directory holds the prepared assets for human animation mode\n"
                    "- Script, voice-over, illustrations, subtitles, and placeholder foregrounds have been generated (no automatic animation)\n"
                    "- Next step: open the interactive animation studio to produce each animation segment\n\n"
                    "Example launch commands:\n"
                    "# First make sure the ms-agent directory is on the PYTHONPATH environment variable\n"
                    "# PowerShell:\n"
                    "# $env:PYTHONPATH=\"{}\"\n"
                    "# Then launch the studio as a module:\n"
                    "python -m projects.video_generate.core.human_animation_studio \"{}\"\n".format(
                        os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')),  # ms-agent root
                        full_output_dir
                    )
                )
        except Exception as _e:
            print(f"[video_agent] Failed to write HUMAN_README: {_e}")

    print(f"[video_agent] Asset generation complete. Info saved to {asset_info_path}")
    return asset_info_path
```
Video Generate

An "AI popular-science short video" workflow. Supports fully automatic and human-in-the-loop modes; it produces a script, voice-over, illustrations/animations, and subtitles, and composes them into a finished video.

Quick checks (required reading)

Before the first run, verify the following:
- `projects/video_generate/core/asset/bg_audio.mp3`
- `字小魂扶摇手书(商用需授权).ttf`

Note: "composition only / human mode" can run without a key set, but fully automatic mode may fail for lack of LLM capability.

Mode 1: fully automatic (auto)

Automatically generate and compose a video from scratch for a given topic:

Output will be written to `ms-agent/projects/video_generate/output/<topic>/`.

Mode 2: human mode (human)

For workflows where a human curates the animations: the pipeline automatically produces the script, voice-over, illustrations, subtitles, and placeholder foregrounds; you then produce and approve each foreground animation segment in the "human studio", and finally run a one-click full composition.

In the studio:

Mode 3: composition only (assets already exist)

If the directory already contains `asset_info.json` (or you just want to re-compose): this flow only performs composition and does not regenerate the script/illustrations/animations. If approved transparent foregrounds exist (`finals/scene_*_final.mov`), they are used preferentially.

Directory layout
- `video_agent.py`: Agent wrapper around the three-step logic
- `workflow.yaml`: three-step orchestration; `workflow_from_assets.yaml`: composition-only orchestration
- `core/workflow.py`: main pipeline; `core/human_animation_studio.py`: human studio
- `core/asset/`: fonts and background music
- `output/`: run artifacts
- `scripts/compose_from_asset_info.py`: helper script to compose directly from an existing `asset_info.json`