demosmith-mcp

An MCP (Model Context Protocol) server that records browser demos and generates video, screenshots, and documentation from them. It is designed for building product demos, tutorials, and documentation with AI agents.

Demo

Demo: GitHub login flow with animated cursor, click effects, and auto-generated documentation

Features

- Video Recording - Automatic screen recording of browser sessions
- Screenshot Capture - Automatic screenshots at each step
- Animated Cursor - Smooth cursor animations with click effects and sounds
- TTS Narration - AI-powered voiceover with multiple providers (OpenAI, ElevenLabs, Azure, Edge)
- Multiple Output Formats:
  - Video (WebM)
  - Video with Audio (MP4)
  - Playwright Trace (interactive replay)
  - Markdown Guide
  - JSON Steps
  - Narration Script + JSON (with timestamps)
  - Subtitles (SRT/VTT)
  - Interactive HTML Tutorial
  - GIF Preview
- Multi-language Support - English and Chinese
- Multi-tab Support - Work with multiple browser tabs
- Flexible Element Selection - By ref, text, label, placeholder, role, test ID, CSS, XPath, alt text, title

Installation

npm install demosmith-mcp
npx playwright install chromium

Usage

As MCP Server

Add to your Claude Code MCP configuration (~/.claude/mcp.json):

{
  "mcpServers": {
    "demosmith": {
      "command": "npx",
      "args": ["demosmith-mcp"]
    }
  }
}

CLI Mode

# Replay a recorded demo
demosmith replay ./steps.json -o ./output --video

# Generate documentation from steps
demosmith generate ./steps.json -l zh -o ./docs

# Serve generated files locally
demosmith serve ./output

MCP Tools

Session Management

Tool	Description
`demosmith_start`	Start a new demo recording session
`demosmith_end`	End session and generate all deliverables
`demosmith_status`	Get current session status

Navigation & Discovery

Tool	Description
`demosmith_navigate`	Navigate to a URL
`demosmith_snapshot`	Get accessibility tree snapshot for element refs

Core Actions

Tool	Description
`demosmith_click`	Click an element (with animated cursor)
`demosmith_fill`	Fill a text input (with typing animation)
`demosmith_select`	Select from dropdown
`demosmith_press_key`	Press keyboard key or combination
`demosmith_hover`	Hover over element (for tooltips/menus)
`demosmith_drag`	Drag and drop
`demosmith_upload`	Upload file

Page Actions

Tool	Description
`demosmith_scroll`	Scroll page or element
`demosmith_wait`	Wait for condition
`demosmith_screenshot`	Take manual screenshot

Verification

Tool	Description
`demosmith_assert`	Verify conditions (text, visibility, URL, etc.)

Tab Management

Tool	Description
`demosmith_new_tab`	Open new browser tab
`demosmith_switch_tab`	Switch to different tab
`demosmith_close_tab`	Close a tab
`demosmith_list_tabs`	List all open tabs

Element Selectors

demosmith supports multiple ways to locate elements:

# By ref (from snapshot)
"1", "2", "3"

# By visible text
"text:Submit"
"text:/Submit|Cancel/"  # regex

# By label
"label:Email"

# By placeholder
"placeholder:Enter your name"

# By role and name
"role:button:Submit"
"role:textbox"

# By test ID
"testid:submit-btn"

# By CSS selector
"css:.btn-primary"

# By XPath
"xpath://button[@type='submit']"

# By alt text
"alt:Logo"

# By title
"title:Close"

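To make the prefix convention above concrete, here is a minimal sketch of how such selector strings could be split into a kind and a value. The `parseSelector` function and the shape of its result are hypothetical and written only for illustration; demosmith's own parsing may differ.

```javascript
// Hypothetical parser for the selector strings listed above.
// This is NOT demosmith's implementation, only an illustration of the prefixes.
const PREFIXES = ["text", "label", "placeholder", "role", "testid", "css", "xpath", "alt", "title"];

function parseSelector(selector) {
  const colon = selector.indexOf(":");
  const prefix = colon === -1 ? "" : selector.slice(0, colon);
  if (!PREFIXES.includes(prefix)) {
    // No known prefix: treat the string as a snapshot ref such as "1".
    return { kind: "ref", value: selector };
  }
  const rest = selector.slice(colon + 1);
  if (prefix === "role") {
    // "role:button:Submit" -> role "button" with name "Submit"; the name is optional.
    const sep = rest.indexOf(":");
    return sep === -1
      ? { kind: "role", role: rest }
      : { kind: "role", role: rest.slice(0, sep), name: rest.slice(sep + 1) };
  }
  if (prefix === "text") {
    // "text:/Submit|Cancel/" -> regular expression; anything else is literal text.
    const match = /^\/(.+)\/([a-z]*)$/.exec(rest);
    if (match) return { kind: "text", pattern: new RegExp(match[1], match[2]) };
  }
  return { kind: prefix, value: rest };
}

console.log(parseSelector("role:button:Submit"));
console.log(parseSelector("xpath://button[@type='submit']"));
```

Only the first colon separates the prefix from the value, so XPath and CSS expressions that contain colons of their own pass through unchanged.
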
Example Workflow

1. demosmith_start(startUrl="https://example.com/login", title="Login Demo")
2. demosmith_snapshot()  → Get element refs
3. demosmith_fill(ref="label:Email", value="user@example.com", description="Enter email")
4. demosmith_fill(ref="label:Password", value="password123", description="Enter password")
5. demosmith_click(ref="text:Sign In", description="Click sign in button")
6. demosmith_assert(type="url", expected="/dashboard", description="Verify redirect")
7. demosmith_end()  → Returns all deliverables
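An AI agent normally issues these calls for you, but it can help to see what they look like on the wire. The sketch below builds the workflow as MCP `tools/call` JSON-RPC requests. The `toToolCalls` helper is hypothetical, and the argument names are assumptions: they follow the Start Session Options table (`startUrl`, `title`) and the workflow above.

```javascript
// Build MCP "tools/call" JSON-RPC requests from [toolName, arguments] pairs.
// toToolCalls is an illustrative helper, not part of demosmith.
function toToolCalls(steps) {
  return steps.map(([name, args], index) => ({
    jsonrpc: "2.0",
    id: index + 1,
    method: "tools/call",
    params: { name, arguments: args },
  }));
}

const loginDemo = toToolCalls([
  ["demosmith_start", { startUrl: "https://example.com/login", title: "Login Demo" }],
  ["demosmith_snapshot", {}],
  ["demosmith_fill", { ref: "label:Email", value: "user@example.com", description: "Enter email" }],
  ["demosmith_fill", { ref: "label:Password", value: "password123", description: "Enter password" }],
  ["demosmith_click", { ref: "text:Sign In", description: "Click sign in button" }],
  ["demosmith_assert", { type: "url", expected: "/dashboard", description: "Verify redirect" }],
  ["demosmith_end", {}],
]);

console.log(JSON.stringify(loginDemo[0], null, 2));
```

Each request would be sent in order over the MCP transport, waiting for the response before sending the next one.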

Output Files

After ending a session, the following files are generated:

output/
├── demo.webm              # Screen recording video
├── demo-with-audio.mp4    # Video with TTS narration (if TTS enabled)
├── demo.gif               # Animated GIF preview
├── trace.zip              # Playwright trace (interactive replay)
├── guide.md               # Markdown documentation
├── steps.json             # Structured step data
├── narration.txt          # Voiceover script
├── narration.json         # Timed narration for TTS APIs
├── narration.mp3          # Generated audio (if TTS enabled)
├── subtitles.srt          # SRT subtitles
├── subtitles.vtt          # VTT subtitles
├── tutorial.html          # Interactive HTML tutorial
├── animated-preview.html  # HTML preview (fallback)
└── assets/
    ├── step-001.png
    ├── step-002.png
    └── ...

See examples/github-login-demo/ for a complete example output.
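If you script around the output directory, a small check for missing files can be useful. The sketch below takes its file names from the tree above; which option gates which file is an assumption made for illustration, and `demo.gif` and `animated-preview.html` are left out because the latter is described as a fallback.

```javascript
// File names come from the output tree above. Which options gate which files
// is an assumption made for this sketch, not documented demosmith behavior.
function expectedDeliverables({ video = true, trace = true, tts = false } = {}) {
  const files = [
    "guide.md", "steps.json", "narration.txt", "narration.json",
    "subtitles.srt", "subtitles.vtt", "tutorial.html",
  ];
  if (video) files.push("demo.webm");
  if (trace) files.push("trace.zip");
  if (tts) files.push("narration.mp3");
  if (video && tts) files.push("demo-with-audio.mp4");
  return files;
}

// presentFiles is a list of file names, for example the result of reading the output directory.
function missingDeliverables(presentFiles, options) {
  const present = new Set(presentFiles);
  return expectedDeliverables(options).filter((name) => !present.has(name));
}

console.log(missingDeliverables(["guide.md", "steps.json"], { video: false, trace: false }));
```

In a real script the first argument could come from Node's `fs.readdirSync("./output")`.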

Configuration Options

Start Session Options

Option	Type	Default	Description
`title`	string	required	Demo title
`startUrl`	string	required	Starting URL
`outputDir`	string	temp dir	Output directory
`video`	boolean	true	Record video
`trace`	boolean	true	Record Playwright trace
`screenshotOnStep`	boolean	true	Auto-screenshot each step
`headless`	boolean	false	Run browser headless
`viewport`	object	1280x720	Browser viewport size
`storageState`	string	-	Path to login state file
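As a rough illustration of how the defaults in this table combine with caller-supplied options, consider the sketch below. The `resolveStartOptions` helper is hypothetical, and the `{ width, height }` shape for `viewport` is an assumption borrowed from Playwright's viewport option.

```javascript
// Defaults copied from the Start Session Options table.
// The merge logic and the viewport shape are illustrative assumptions.
const START_DEFAULTS = {
  video: true,
  trace: true,
  screenshotOnStep: true,
  headless: false,
  viewport: { width: 1280, height: 720 },
};

function resolveStartOptions(options) {
  // title and startUrl are marked "required" in the table.
  for (const key of ["title", "startUrl"]) {
    if (typeof options[key] !== "string" || options[key] === "") {
      throw new Error(`Missing required option: ${key}`);
    }
  }
  return {
    ...START_DEFAULTS,
    ...options,
    // Merge viewport field by field so a caller can override only the width or height.
    viewport: { ...START_DEFAULTS.viewport, ...options.viewport },
  };
}

console.log(resolveStartOptions({ title: "Login Demo", startUrl: "https://example.com/login", headless: true }));
```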

Animation Options

Click and fill actions support animation options:

Option	Type	Default	Description
`animated`	boolean	true	Enable cursor animation
`moveDuration`	number	500	Cursor movement duration (ms)
`typeDelay`	number	50	Delay between keystrokes (ms)
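These values add up over a long demo, so a back-of-the-envelope estimate can help when pacing narration. The sketch below assumes a fill step takes roughly the cursor move plus one `typeDelay` per character, and that `animated: false` skips only the cursor move. Both are assumptions; real timings will also include page and rendering overhead.

```javascript
// Rough estimate of how long an animated fill step takes, in milliseconds.
// Assumption: total = cursor move + one typeDelay per typed character.
function estimateFillMs(value, { animated = true, moveDuration = 500, typeDelay = 50 } = {}) {
  const move = animated ? moveDuration : 0;
  return move + value.length * typeDelay;
}

console.log(estimateFillMs("user@example.com")); // 16 characters: 500 + 16 * 50 = 1300
console.log(estimateFillMs("user@example.com", { moveDuration: 200, typeDelay: 20 })); // 200 + 320 = 520
```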

Assert Types

The demosmith_assert tool supports these verification types:

Type	Description
`text`	Check element text content
`visible`	Check element is visible
`hidden`	Check element is hidden
`url`	Check current URL
`title`	Check page title
`value`	Check input value
`checked`	Check checkbox is checked
`enabled`	Check element is enabled
`disabled`	Check element is disabled
`count`	Check number of matching elements

Multi-language Support

Generated content can be produced in English or Chinese. Select the language with the CLI -l flag:

demosmith generate ./steps.json -l zh  # Chinese
demosmith generate ./steps.json -l en  # English (default)

Custom Templates

You can provide custom templates for output generation using Handlebars-style syntax ({{#each}} and {{#if}} blocks):

# {{session.title}}

{{#each steps}}
## Step {{this.id}}: {{this.description}}

{{#if this.screenshotRelative}}
![Screenshot]({{this.screenshotRelative}})
{{/if}}
{{/each}}
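To show how a template like this expands, here is a deliberately small renderer for the three constructs used above: `{{path}}`, `{{#each list}}`, and `{{#if path}}`. It is a sketch written for this README, not demosmith's template engine, and it does not handle nested blocks of the same type. The sample data fields mirror the template and are otherwise assumptions.

```javascript
// Resolve a dotted path such as "session.title" or "this.id" against a scope object.
function lookup(scope, path) {
  return path.split(".").reduce((value, key) => (value == null ? undefined : value[key]), scope);
}

// Minimal illustrative renderer; not demosmith's actual template engine.
function render(template, scope) {
  return template
    // {{#each list}}...{{/each}}: render the body once per item, bound to "this".
    .replace(/\{\{#each ([\w.]+)\}\}([\s\S]*?)\{\{\/each\}\}/g, (_, path, body) =>
      (lookup(scope, path) || []).map((item) => render(body, { ...scope, this: item })).join(""))
    // {{#if path}}...{{/if}}: keep the body only when the value is truthy.
    .replace(/\{\{#if ([\w.]+)\}\}([\s\S]*?)\{\{\/if\}\}/g, (_, path, body) =>
      (lookup(scope, path) ? render(body, scope) : ""))
    // {{path}}: substitute the value, or an empty string when it is missing.
    .replace(/\{\{([\w.]+)\}\}/g, (_, path) => String(lookup(scope, path) ?? ""));
}

const template = [
  "# {{session.title}}",
  "{{#each steps}}## Step {{this.id}}: {{this.description}}",
  "{{#if this.screenshotRelative}}![Screenshot]({{this.screenshotRelative}})",
  "{{/if}}{{/each}}",
].join("\n");

console.log(render(template, {
  session: { title: "Login Demo" },
  steps: [
    { id: 1, description: "Enter email", screenshotRelative: "assets/step-001.png" },
    { id: 2, description: "Click sign in" },
  ],
}));
```

The second step has no screenshot, so its `{{#if}}` block is dropped from the output.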

Login Session Support

Save a login session with Playwright:

await context.storageState({ path: 'auth.json' });

Use in demo:

demosmith_start(startUrl="...", title="...", storageState="auth.json")
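A saved login state can go stale between recording sessions. Playwright's storage state file holds a `cookies` array whose `expires` field is a Unix time in seconds, with `-1` for session cookies, so a quick check before starting a demo might look like the sketch below. The `expiredCookies` helper is hypothetical and not part of demosmith.

```javascript
// List the names of cookies in a Playwright storage state that have already expired.
// Session cookies use expires === -1 and are never reported.
function expiredCookies(storageState, nowMs = Date.now()) {
  const nowSeconds = nowMs / 1000;
  return (storageState.cookies || [])
    .filter((cookie) => cookie.expires !== -1 && cookie.expires < nowSeconds)
    .map((cookie) => cookie.name);
}

// Example with an in-memory state; in practice, parse the contents of auth.json.
const sampleState = {
  cookies: [
    { name: "sid", expires: -1 },
    { name: "token", expires: 100 },
    { name: "fresh", expires: 4102444800 },
  ],
  origins: [],
};
console.log(expiredCookies(sampleState, 200000));
```

If the list is not empty, log in again and re-save the state before recording.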

TTS Narration

Generate AI voiceover for your demos by passing TTS options to demosmith_end:

demosmith_end(tts={
  provider: "openai",
  apiKey: "sk-...",
  voice: "alloy"
})

Supported TTS Providers

Provider	API Key Required	Voices	Notes
`openai`	Yes	alloy, echo, fable, onyx, nova, shimmer	Best quality
`elevenlabs`	Yes	Various voice IDs	Most natural
`azure`	Yes	en-US-JennyNeural, etc.	SSML support
`edge`	No	en-US-AriaNeural, etc.	Free, requires `edge-tts` CLI

TTS Options

Option	Type	Description
`provider`	string	TTS provider (openai, elevenlabs, azure, edge)
`apiKey`	string	API key (not needed for edge)
`voice`	string	Voice ID or name
`language`	string	Language code (e.g., en-US, zh-CN)
`speed`	number	Speech speed multiplier

Environment Variables

For Azure TTS, set the region:

export AZURE_SPEECH_REGION=eastus
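The provider table, the TTS options, and the Azure region variable can be combined into a simple pre-flight check that runs before `demosmith_end`. The sketch below is an illustration under stated assumptions: `checkTtsOptions` is a hypothetical helper, and treating `AZURE_SPEECH_REGION` as mandatory for Azure and requiring a positive `speed` are guesses, not documented rules.

```javascript
// Key requirements copied from the Supported TTS Providers table.
const PROVIDERS = new Map([
  ["openai", { needsKey: true }],
  ["elevenlabs", { needsKey: true }],
  ["azure", { needsKey: true }],
  ["edge", { needsKey: false }],
]);

// Return a list of human-readable problems; an empty list means the options look usable.
function checkTtsOptions(tts, env = {}) {
  const provider = PROVIDERS.get(tts.provider);
  if (!provider) return [`Unknown provider: ${tts.provider}`];

  const problems = [];
  if (provider.needsKey && !tts.apiKey) problems.push(`${tts.provider} requires apiKey`);
  // Assumption: Azure cannot run without a region.
  if (tts.provider === "azure" && !env.AZURE_SPEECH_REGION) {
    problems.push("azure requires AZURE_SPEECH_REGION");
  }
  // Assumption: the speed multiplier must be a positive number.
  if (tts.speed !== undefined && !(tts.speed > 0)) problems.push("speed must be a positive number");
  return problems;
}

console.log(checkTtsOptions({ provider: "edge", voice: "en-US-AriaNeural" }));
console.log(checkTtsOptions({ provider: "azure", apiKey: "example-key" }, {}));
```

Passing `process.env` as the second argument checks the real environment, and it also gives you a place to read the API key from instead of hard-coding it.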

Narration JSON Format

The generated narration.json contains timed segments for custom TTS integration:

{
  "title": "Login Demo",
  "totalDurationMs": 15000,
  "segments": [
    {
      "stepId": 1,
      "startMs": 2000,
      "endMs": 4500,
      "durationMs": 2500,
      "text": "Click the login button"
    }
  ]
}
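Because each segment carries `startMs`, `endMs`, and `text`, the file is easy to convert into other timed formats. As an illustration, the sketch below turns the segments into SRT cues. It relies only on the fields shown above and is not how demosmith itself writes `subtitles.srt`.

```javascript
// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(ms) {
  const pad = (n, width) => String(n).padStart(width, "0");
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(ms % 1000, 3)}`;
}

// Convert narration.json content into SRT text: index, time range, text, blank line.
function narrationToSrt(narration) {
  return narration.segments
    .map((seg, i) => `${i + 1}\n${toSrtTime(seg.startMs)} --> ${toSrtTime(seg.endMs)}\n${seg.text}\n`)
    .join("\n");
}

const narration = {
  title: "Login Demo",
  totalDurationMs: 15000,
  segments: [
    { stepId: 1, startMs: 2000, endMs: 4500, durationMs: 2500, text: "Click the login button" },
  ],
};
console.log(narrationToSrt(narration));
```

WebVTT output would need a `WEBVTT` header line and a period instead of a comma before the milliseconds.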

Development

# Install dependencies
pnpm install

# Build
pnpm build

# Run MCP server
pnpm start

# Run CLI
pnpm cli help

License

MIT
