huggingface · AmadeusGB · Mar 11, 2025 · Mar 12, 2025
diff --git a/units/cn/_toctree.yml b/units/cn/_toctree.yml
@@ -0,0 +1,260 @@
+- title: Unit 0. Welcome to the course
+  sections:
+  - local: unit0/introduction
+    title: Welcome to the course 🤗
+  - local: unit0/setup
+    title: Setup
+  - local: unit0/discord101
+    title: Discord 101
+- title: Unit 1. Introduction to Deep Reinforcement Learning
+  sections:
+  - local: unit1/introduction
+    title: Introduction
+  - local: unit1/what-is-rl
+    title: What is Reinforcement Learning?
+  - local: unit1/rl-framework
+    title: The Reinforcement Learning Framework
+  - local: unit1/tasks
+    title: The Types of Tasks
+  - local: unit1/exp-exp-tradeoff
+    title: The Exploration/Exploitation Trade-off
+  - local: unit1/two-methods
+    title: The Two Main Approaches for Solving RL Problems
+  - local: unit1/deep-rl
+    title: The "Deep" in Deep Reinforcement Learning
+  - local: unit1/summary
+    title: Summary
+  - local: unit1/glossary
+    title: Glossary
+  - local: unit1/hands-on
+    title: Hands-on
+  - local: unit1/quiz
+    title: Quiz
+  - local: unit1/conclusion
+    title: Conclusion
+  - local: unit1/additional-readings
+    title: Additional Readings
+- title: Bonus Unit 1. Introduction to Deep Reinforcement Learning with Huggy
+  sections:
+  - local: unitbonus1/introduction
+    title: Introduction
+  - local: unitbonus1/how-huggy-works
+    title: How Does Huggy Work?
+  - local: unitbonus1/train
+    title: Train Huggy
+  - local: unitbonus1/play
+    title: Play with Huggy
+  - local: unitbonus1/conclusion
+    title: Conclusion
+- title: Live 1. How the course works, Q&A, and playing with Huggy
+  sections:
+  - local: live1/live1
+    title: Live 1. How the course works, Q&A, and playing with Huggy 🐶
+- title: Unit 2. Introduction to Q-Learning
+  sections:
+  - local: unit2/introduction
+    title: Introduction
+  - local: unit2/what-is-rl
+    title: What is Reinforcement Learning? A Short Recap
+  - local: unit2/two-types-value-based-methods
+    title: The Two Types of Value-Based Methods
+  - local: unit2/bellman-equation
+    title: The Bellman Equation, Simplifying Our Value Estimation
+  - local: unit2/mc-vs-td
+    title: Monte Carlo vs Temporal Difference Learning
+  - local: unit2/mid-way-recap
+    title: Mid-way Recap
+  - local: unit2/mid-way-quiz
+    title: Mid-way Quiz
+  - local: unit2/q-learning
+    title: Introducing Q-Learning
+  - local: unit2/q-learning-example
+    title: A Q-Learning Example
+  - local: unit2/q-learning-recap
+    title: Q-Learning Recap
+  - local: unit2/glossary
+    title: Glossary
+  - local: unit2/hands-on
+    title: Hands-on
+  - local: unit2/quiz2
+    title: Q-Learning Quiz
+  - local: unit2/conclusion
+    title: Conclusion
+  - local: unit2/additional-readings
+    title: Additional Readings
+- title: Unit 3. Deep Q-Learning with Atari Games
+  sections:
+  - local: unit3/introduction
+    title: Introduction
+  - local: unit3/from-q-to-dqn
+    title: From Q-Learning to Deep Q-Learning
+  - local: unit3/deep-q-network
+    title: The Deep Q-Network (DQN)
+  - local: unit3/deep-q-algorithm
+    title: The Deep Q Algorithm
+  - local: unit3/glossary
+    title: Glossary
+  - local: unit3/hands-on
+    title: Hands-on
+  - local: unit3/quiz
+    title: Quiz
+  - local: unit3/conclusion
+    title: Conclusion
+  - local: unit3/additional-readings
+    title: Additional Readings
+- title: Bonus Unit 2. Automatic Hyperparameter Tuning with Optuna
+  sections:
+  - local: unitbonus2/introduction
+    title: Introduction
+  - local: unitbonus2/optuna
+    title: Optuna
+  - local: unitbonus2/hands-on
+    title: Hands-on
+- title: Unit 4. Policy Gradient with PyTorch
+  sections:
+  - local: unit4/introduction
+    title: Introduction
+  - local: unit4/what-are-policy-based-methods
+    title: What Are Policy-Based Methods?
+  - local: unit4/advantages-disadvantages
+    title: The Advantages and Disadvantages of Policy-Gradient Methods
+  - local: unit4/policy-gradient
+    title: Diving Deeper into Policy Gradient
+  - local: unit4/pg-theorem
+    title: (Optional) The Policy Gradient Theorem
+  - local: unit4/glossary
+    title: Glossary
+  - local: unit4/hands-on
+    title: Hands-on
+  - local: unit4/quiz
+    title: Quiz
+  - local: unit4/conclusion
+    title: Conclusion
+  - local: unit4/additional-readings
+    title: Additional Readings
+- title: Unit 5. Introduction to Unity ML-Agents
+  sections:
+  - local: unit5/introduction
+    title: Introduction
+  - local: unit5/how-mlagents-works
+    title: How Does ML-Agents Work?
+  - local: unit5/snowball-target
+    title: The SnowballTarget Environment
+  - local: unit5/pyramids
+    title: The Pyramids Environment
+  - local: unit5/curiosity
+    title: (Optional) What Is Curiosity in Deep Reinforcement Learning?
+  - local: unit5/hands-on
+    title: Hands-on
+  - local: unit5/bonus
+    title: Bonus. Learn to Create Your Own Environments with Unity and MLAgents
+  - local: unit5/quiz
+    title: Quiz
+  - local: unit5/conclusion
+    title: Conclusion
+- title: Unit 6. Actor-Critic Methods with Robotics Environments
+  sections:
+  - local: unit6/introduction
+    title: Introduction
+  - local: unit6/variance-problem
+    title: The Problem of Variance in Reinforce
+  - local: unit6/advantage-actor-critic
+    title: Advantage Actor-Critic (A2C)
+  - local: unit6/hands-on
+    title: Advantage Actor-Critic (A2C) for Robotics Simulation with Panda-Gym 🤖
+  - local: unit6/quiz
+    title: Quiz
+  - local: unit6/conclusion
+    title: Conclusion
+  - local: unit6/additional-readings
+    title: Additional Readings
+- title: Unit 7. Introduction to Multi-Agents and AI vs AI
+  sections:
+  - local: unit7/introduction
+    title: Introduction
+  - local: unit7/introduction-to-marl
+    title: An Introduction to Multi-Agent Reinforcement Learning (MARL)
+  - local: unit7/multi-agent-setting
+    title: Designing Multi-Agent Systems
+  - local: unit7/self-play
+    title: Self-Play
+  - local: unit7/hands-on
+    title: Train Our Soccer Team to Beat Your Classmates' Teams (AI vs AI)
+  - local: unit7/quiz
+    title: Quiz
+  - local: unit7/conclusion
+    title: Conclusion
+  - local: unit7/additional-readings
+    title: Additional Readings
+- title: Unit 8. Part 1 Proximal Policy Optimization (PPO)
+  sections:
+  - local: unit8/introduction
+    title: Introduction
+  - local: unit8/intuition-behind-ppo
+    title: The Intuition Behind PPO
+  - local: unit8/clipped-surrogate-objective
+    title: Introducing the Clipped Surrogate Objective Function
+  - local: unit8/visualize
+    title: Visualize the Clipped Surrogate Objective Function
+  - local: unit8/hands-on-cleanrl
+    title: PPO with CleanRL
+  - local: unit8/conclusion
+    title: Conclusion
+  - local: unit8/additional-readings
+    title: Additional Readings
+- title: Unit 8. Part 2 Proximal Policy Optimization (PPO) with Doom
+  sections:
+  - local: unit8/introduction-sf
+    title: Introduction
+  - local: unit8/hands-on-sf
+    title: PPO with Sample Factory and Doom
+  - local: unit8/conclusion-sf
+    title: Conclusion
+- title: Bonus Unit 3. Advanced Topics in Reinforcement Learning
+  sections:
+  - local: unitbonus3/introduction
+    title: Introduction
+  - local: unitbonus3/model-based
+    title: Model-Based Reinforcement Learning
+  - local: unitbonus3/offline-online
+    title: Offline vs. Online Reinforcement Learning
+  - local: unitbonus3/generalisation
+    title: Generalisation in Reinforcement Learning
+  - local: unitbonus3/rlhf
+    title: Reinforcement Learning from Human Feedback
+  - local: unitbonus3/decision-transformers
+    title: Decision Transformers and Offline RL
+  - local: unitbonus3/language-models
+    title: Language Models in RL
+  - local: unitbonus3/curriculum-learning
+    title: (Automatic) Curriculum Learning for RL
+  - local: unitbonus3/envs-to-try
+    title: Interesting Environments to Try
+  - local: unitbonus3/learning-agents
+    title: An Introduction to Unreal Learning Agents
+  - local: unitbonus3/godotrl
+    title: An Introduction to Godot RL
+  - local: unitbonus3/student-works
+    title: Student Works
+  - local: unitbonus3/rl-documentation
+    title: A Brief Introduction to RL Documentation
+- title: Bonus Unit 5. Imitation Learning with Godot RL Agents
+  sections:
+  - local: unitbonus5/introduction
+    title: Introduction
+  - local: unitbonus5/the-environment
+    title: The Environment
+  - local: unitbonus5/getting-started
+    title: Getting Started
+  - local: unitbonus5/train-our-robot
+    title: Train Our Robot
+  - local: unitbonus5/customize-the-environment
+    title: (Optional) Customize the Environment
+  - local: unitbonus5/conclusion
+    title: Conclusion
+- title: Certification and Congratulations
+  sections:
+  - local: communication/conclusion
+    title: Congratulations
+  - local: communication/certification
+    title: Get Your Certificate of Completion
diff --git a/units/cn/communication/certification.mdx b/units/cn/communication/certification.mdx
@@ -0,0 +1,29 @@
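+# The certification process
+
+The certification process is **completely free**:
+
+- To get a *certificate of completion*: you need **to pass 80% of the assignments**.
+- To get a *certificate of excellence*: you need **to pass 100% of the assignments**.
+
+There is **no deadline: the course is self-paced**.
+
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit0/certification.jpg" alt="Course certification" width="100%"/>
+
+When we say pass, **we mean that your model must be pushed to the Hub and get a result equal to or above the minimum requirement**.
+
+To check your progress and see which units you have passed or not passed: https://huggingface.co/spaces/ThomasSimonini/Check-my-progress-Deep-RL-Course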
+
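+Now that you're ready for the certification process, you need to:
+
+1. Go here: https://huggingface.co/spaces/huggingface-projects/Deep-RL-Course-Certification/
+2. Enter your *Hugging Face username*, *first name*, and *last name*.
+
+3. Click "Generate my certificate".
+  - If you passed 80% of the assignments, **congratulations**, you've earned the certificate of completion.
+  - If you passed 100% of the assignments, **congratulations**, you've earned the certificate of excellence.
+  - If you are below 80%, don't be discouraged! Check which units you need to redo to get your certificate.
+
+4. You can download your certificate in PDF and PNG formats.
+
+Feel free to share your certificate on Twitter (tag @ThomasSimonini and @huggingface) and on LinkedIn.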
+
diff --git a/units/cn/communication/conclusion.mdx b/units/cn/communication/conclusion.mdx
@@ -0,0 +1,24 @@
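+# Congratulations
+
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/communication/thumbnail.png" alt="Thumbnail"/>
+
+
+**Congratulations on finishing this course!** With perseverance, hard work, and determination, **you've acquired a solid background in Deep Reinforcement Learning**.
+
+But finishing this course is **not the end of your journey**. It's just the start: don't hesitate to explore Bonus Unit 3, where we present topics you may be interested in. And don't hesitate to **share what you're doing and ask questions in the Discord server**.
+
+**Thank you** for being part of this course. **I hope you liked this course as much as I loved writing it**.
+
+Don't hesitate to **give us feedback on how we can improve the course** using [this form](https://forms.gle/BzKXWzLAGZESGNaE9).
+
+Don't forget to **check the next section to see how to get (if you pass) your certificate of completion 🎓.**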
+
+One last thing, to keep in touch with the Reinforcement Learning team and with me:
+
+- [Follow me on Twitter](https://twitter.com/thomassimonini)
+- [Follow the Hugging Face Twitter account](https://twitter.com/huggingface)
+- [Join the Hugging Face Discord](https://www.hf.co/join/discord)
+
+## Keep Learning, Stay Awesome 🤗
+
+Thomas Simonini
diff --git a/units/cn/live1/live1.mdx b/units/cn/live1/live1.mdx
@@ -0,0 +1,9 @@
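+# Live 1: How the course works, Q&A, and playing with Huggy
+
+In this first live stream, we explained how the course works (scope, units, challenges, and more) and answered your questions.
+
+Finally, we looked at some of the LunarLander agents you trained and played with your Huggies 🐶.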
+
+<Youtube id="JeJIswxyrsM" />
+
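+To know when the next live is scheduled, **check the Discord server**. We will also **send you an email**. If you can't participate, don't worry, we record the live sessions.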
diff --git a/units/cn/unit0/discord101.mdx b/units/cn/unit0/discord101.mdx
@@ -0,0 +1,37 @@
+# Discord 101 [[discord-101]]
+
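+Hey there! I'm Huggy, a dog 🐕, and I'm looking forward to training with you during this Reinforcement Learning course!
+Although I don't know much about fetching sticks (yet), I know a thing or two about Discord. So I wrote this guide to help you learn about it!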
+
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit0/huggy-logo.jpg" alt="Huggy Logo"/>
+
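+Discord is a free chat platform. If you've used Slack, **it's quite similar**. There is a Hugging Face Community Discord server with 50000 members that you can <a href="https://discord.gg/ydHrjt3WP5">join with a single click here</a>. So many humans to play with!
+
+Starting in Discord can be a bit intimidating, so let me take you through it.
+
+When you [sign up to our Discord server](http://hf.co/join/discord), you'll choose your interests. Make sure to **click "Reinforcement Learning"**, and you'll get access to the Reinforcement Learning category containing all the course-related channels. If you feel like joining even more channels, go for it! 🚀
+
+Then click next, and you'll get to **introduce yourself** in the `#introduce-yourself` channel.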
+
+
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit0/discord2.jpg" alt="Discord"/>
+
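+The course channels are in the Reinforcement Learning category. **Don't forget to sign up to these channels by clicking on 🤖 Reinforcement Learning in `role-assigment`**.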
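+- `rl-announcements`: where we give **the latest information about the course**.
+- `rl-discussions`: where you can **discuss Reinforcement Learning and share information**.
+- `rl-study-group`: where you can **ask questions and exchange with your classmates**.
+- `rl-i-made-this`: where you can **share your projects and models**.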
+
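+The HF Community Server has a thriving community of human beings interested in many areas, so you can also learn from them. There are paper discussions, events, and many other things.
+
+Was this useful? Here are a few tips I can share:
+
+- There are also **voice channels** you can use, although most people prefer text chat.
+- You can **use markdown style** for text chats. So if you're writing code, you can use that style. Sadly, this does not work as well for links.
+- You can open threads as well! It's a good idea when **it's a long conversation**.
+
+I hope this is useful! And if you have questions, just ask!
+
+See you later!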
+
+Huggy 🐶