Commits (49)
ff1b7c9
First iteration of proposal draft to introduce khr avatar extensions
Kjakubzak Jul 26, 2025
1413d20
Updated Contributes list as requested
Kjakubzak Jul 28, 2025
7b4723e
Addressing some comments
Kjakubzak Jul 29, 2025
de2169f
Update README.md
Kjakubzak Jul 29, 2025
de03ddb
Update README.md
Kjakubzak Jul 29, 2025
4a1f58a
Grammar am bad
Kjakubzak Jul 29, 2025
685b1c3
Updated KHR_avatar_skeleton_biped readme to be more prescriptive
Kjakubzak Jul 30, 2025
5effdee
Update the expressions extension descriptions to be more descriptive
Kjakubzak Aug 9, 2025
22b69b9
Updating the mapping extensions
Kjakubzak Aug 21, 2025
f89a0e3
Addressing minor self-nit
Kjakubzak Aug 21, 2025
a520102
Changing namespaces to KHR_character and KHR_character_avatar
Kjakubzak Sep 2, 2025
dd2159a
Small updates
Kjakubzak Sep 2, 2025
55bd217
More minor updates
Kjakubzak Sep 2, 2025
c461bab
Add virtual joint example for non-rotation-respecting virtual joint
Kjakubzak Sep 2, 2025
3822282
Update typed expression extensions to reflect a centralized expressio…
Kjakubzak Sep 2, 2025
085360f
Fit nits with last commit
Kjakubzak Sep 2, 2025
59993b8
Iterated on KHR_character_expression_procedural
Kjakubzak Sep 2, 2025
18d2d88
Testing mesh annotation refactor
Kjakubzak Sep 3, 2025
2ec0909
Changed virtual joints to virtual transforms
Kjakubzak Sep 3, 2025
c321afa
Minor nit fixes
Kjakubzak Sep 3, 2025
985455a
More implementation notes
Kjakubzak Sep 3, 2025
f560e6e
Update extensions/2.0/Khronos/KHR_character_virtual_transforms/README.md
Kjakubzak Sep 4, 2025
639579c
Minor nits and moving virtual transform to KHR
Kjakubzak Sep 9, 2025
3b34038
Update README.md
Kjakubzak Sep 9, 2025
527faa9
Merge branch 'kjakubzak/avatar_ext_mesh_annotation_consolidated' into…
Kjakubzak Sep 9, 2025
de83236
Update README.md
Kjakubzak Sep 9, 2025
9e54e51
Update extensions/2.0/Khronos/KHR_character_skeleton_biped/README.md
Kjakubzak Sep 10, 2025
522371b
Update extensions/2.0/Khronos/KHR_character_skeleton_biped/README.md
Kjakubzak Sep 10, 2025
e5c010d
Updates to bindpose, mapping, and virtual transform extension readmes
Kjakubzak Sep 21, 2025
7994f0e
Initial Draft Schemas
Kjakubzak Sep 22, 2025
8a09213
Update khr_mesh_annotation readme
Kjakubzak Sep 22, 2025
99861a3
Fixed readme typo
Kjakubzak Sep 22, 2025
d2e7cca
Addressing bad copy/paste and stale documentation
Kjakubzak Oct 7, 2025
f13ade3
Minor formatting consistency fixes
Kjakubzak Oct 7, 2025
c7ae046
Fixes to khr_character_expression_mapping
Kjakubzak Oct 13, 2025
53935f5
Change KHR_character_skeleton_biped README to be clearer
Kjakubzak Oct 16, 2025
df66d5f
Fixing KHR_character_expression_morphtarget inconsistencies
Kjakubzak Dec 10, 2025
1812744
Delaying KHR_character_avatar until we have more specific use-cases.
Kjakubzak Dec 10, 2025
3e4bff5
Delaying KHR_mesh_annotation
Kjakubzak Dec 10, 2025
8caa030
Delaying KHR_mesh_annotation_renderview
Kjakubzak Dec 10, 2025
1e93bfa
Delaying KHR_character_skeleton_biped
Kjakubzak Dec 17, 2025
56868be
Updated KHR_character_expression_morphtarget to explicitly depend on …
Kjakubzak Dec 17, 2025
34c6689
KHR_character_skeleton_bindpose update
Kjakubzak Dec 17, 2025
9dcf075
Update KHR_character to use rootNode instead of sceneIndex
Kjakubzak Dec 17, 2025
607430e
Individual Contributor ->Independent Contributor
Kjakubzak Dec 17, 2025
1c91e28
Updating README contributor lists with new TSG contributors
Kjakubzak Dec 17, 2025
3eb2b6d
KHR_character_skeleton_mapping - Inverted key/value pairs to align wi…
Kjakubzak Dec 17, 2025
f3f2c90
Updated schema references to https://json-schema.org/draft/2020-12/sc…
Kjakubzak Jan 12, 2026
ee572d4
Update KHR_character_expression.KHR_character_expression_procedural.s…
Kjakubzak Jan 12, 2026
151 changes: 151 additions & 0 deletions extensions/2.0/Khronos/KHR_avatar/README.md
@@ -0,0 +1,151 @@
# KHR_avatar

## Contributors

- Ken Jakubzak, Meta
- Hideaki Eguchi / VirtualCast, Inc.
- K. S. Ernest (iFire) Lee, Individual Contributor / https://github.com/fire
- Shinnosuke Iwaki / VirtualCast, Inc.
- 0b5vr / pixiv Inc.
- Leonard Daly, Individual Contributor

## Status

**Draft** – This extension is not yet ratified by the Khronos Group and is subject to change.

## Dependencies

Written against the glTF 2.0 specification.

Depends on: `KHR_xmp_json_ld`

This extension leverages the `KHR_xmp_json_ld` pattern for attaching extensible metadata as JSON-LD blocks within glTF assets. For background on this approach, see
[KHR_xmp_json_ld](https://github.com/KhronosGroup/glTF/tree/main/extensions/2.0/Khronos/KHR_xmp_json_ld).

## Overview

The `KHR_avatar` extension designates a glTF asset as representing an avatar. This top-level marker enables tools and runtimes to interpret the asset as containing avatar-specific content such as rigging, blendshapes, animation retargeting, or metadata.

This extension does not define avatar features directly; rather, it acts as a root declaration that avatar-related extensions may be present and that consumers should process the asset using avatar-specific logic and pipelines. It is part of the wider set of KHR avatar extensions, which serve as building blocks for expressing a contract of functionality and data requirements between a given model and an endpoint.

The extension supports referencing the source `scene` that represents the avatar and optionally includes structured metadata through the `KHR_xmp_json_ld` mechanism.

## Extension Schema

```json
{
  "extensions": {
    "KHR_avatar": {
      "sceneIndex": 0
    }
  }
}
```

### Properties

| Property | Type | Description |
|--------------|---------|-----------------------------------------------------------------------------|
| `sceneIndex` | integer | Index of the glTF `scene` representing the avatar. Used to distinguish the avatar root when multiple scenes exist. |

## Metadata Attachment: KHR_xmp_json_ld

Avatar metadata should be expressed using the `KHR_xmp_json_ld` format, a structured mechanism for attaching JSON-LD metadata blocks to glTF files. In the context of `KHR_avatar`, this allows consistent expression of avatar provenance, licensing, creator, versioning, and intended use, among others.

The `KHR_xmp_json_ld` block is placed at the root level of the glTF asset as part of the defined extension usage. Metadata keys and structures are defined in the shared Khronos Avatar Metadata schema (TBD).

| DC/XMP JSON-LD Property | Why | Required |
|-------------------------|------------------------------------------------------------------------------|----------|
| dc:title | Human-readable title of the avatar asset | Yes |
| dc:creator | Identifies the author(s) of the asset | Yes |
| dc:license | License under which the asset may be used | No |
| dc:rights | Copyright or other rights statement for the asset | No |
| dc:created | Date on which the asset was created | No |
| dc:publisher | Identifies the entity responsible for making the resource available; important for understanding the source and authority of the content | No |
| dc:description | Context and a summary of the content | No |
| dc:subject | Can potentially be used for content tagging/association | No |
| dc:source | Important for tracing provenance and ensuring proper attribution | Yes |
| khr:version | Version of the avatar asset | No |
| khr:thumbnailImage | Index of a glTF image to use as a preview thumbnail | No |

## Example

```json
{
  "asset": {
    "version": "2.0"
  },
  "scene": 0,
  "scenes": [
    {
      "nodes": [0]
    }
  ],
  "nodes": [
    {
      "name": "AvatarRoot"
    }
  ],
  "extensionsUsed": [
    "KHR_avatar",
    "KHR_xmp_json_ld"
  ],
  "extensions": {
    "KHR_avatar": {
      "sceneIndex": 0
    },
    "KHR_xmp_json_ld": {
      "packets": [
        {
          "@context": {
            "dc": "http://purl.org/dc/elements/1.1/",
            "vrm": "https://github.com/vrm-c/vrm-specification/blob/master/specification/VRMC_vrm-1.0/meta.md"
          },
          "dc:title": "Example Model",
          "dc:creator": {
            "@list": [
              "Author1",
              "AuthorEmail1@email.com",
              "Author2",
              "AuthorEmail2@email.com"
            ]
          },
          "dc:license": {
            "@list": [
              "https://vrm.dev/licenses/1.0/",
              "https://example.com/third-party-license"
            ]
          },
          "dc:created": "2023-05-05",
          "dc:rights": "Copyright information about the model",
          "dc:publisher": "Imaginary Corporation A, LLC",
          "dc:description": "A sentence, or paragraph describing the avatar at hand",
          "dc:subject": {
            "@list": [
              "Example trait",
              "Another example trait"
            ]
          },
          "dc:source": "imaginaryCompany.com/avatarl",
          "khr:version": "1.0",
          "khr:thumbnailImage": 0
        }
      ]
    }
  }
}
```

## Implementation Notes

- `sceneIndex` is required and indicates which scene the avatar belongs to.
- Consumers should use this marker as a signal to search for additional avatar-related extensions, including skeletal, expression, and other Khronos avatar extensions.
- Support for `KHR_xmp_json_ld` is encouraged to ensure interoperable metadata across tools and runtimes.
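To make the consumer-side behavior concrete, here is a minimal, hypothetical Python sketch of how a runtime might detect the `KHR_avatar` marker and resolve the avatar scene. The helper name `resolve_avatar_scene` is an illustration, not part of the specification.

```python
import json

def resolve_avatar_scene(gltf: dict) -> dict:
    """Return the scene flagged as the avatar root, or raise if the
    KHR_avatar marker is missing. (Illustrative helper only.)"""
    ext = gltf.get("extensions", {}).get("KHR_avatar")
    if ext is None:
        raise ValueError("asset does not declare KHR_avatar")
    index = ext["sceneIndex"]  # required by this extension
    return gltf["scenes"][index]

# A trimmed-down version of the example asset above
gltf = json.loads("""
{
  "asset": {"version": "2.0"},
  "scenes": [{"nodes": [0]}],
  "nodes": [{"name": "AvatarRoot"}],
  "extensionsUsed": ["KHR_avatar"],
  "extensions": {"KHR_avatar": {"sceneIndex": 0}}
}
""")
print(resolve_avatar_scene(gltf))  # -> {'nodes': [0]}
```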

## Known Implementations


## License

This extension specification is licensed under the Khronos Group Extension License.
See: https://www.khronos.org/registry/gltf/license.html
101 changes: 101 additions & 0 deletions extensions/2.0/Khronos/KHR_avatar_expression_joint/README.md
@@ -0,0 +1,101 @@
# KHR_avatar_expression_joint

## Contributors

- Ken Jakubzak, Meta
- Hideaki Eguchi / VirtualCast, Inc.
- K. S. Ernest (iFire) Lee, Individual Contributor / https://github.com/fire
- Shinnosuke Iwaki / VirtualCast, Inc.
- 0b5vr / pixiv Inc.
- Leonard Daly, Individual Contributor

## Status

**Draft** – This extension is not yet ratified by the Khronos Group and is subject to change.

## Dependencies

Written against the glTF 2.0 specification.
Dependent on: `KHR_avatar`
Typically used in conjunction with: `KHR_avatar_expression_mapping`

## Overview

The `KHR_avatar_expression_joint` extension provides a semantic mapping between facial expressions and joint transformations in the glTF node hierarchy. It enables tools and runtimes to associate expressions like `blink`, `smile`, or `jawOpen` with specific nodes whose transforms are animated using standard glTF animation channels.

This extension is purely descriptive: it does not define or store animation data itself.

## Expression Vocabulary

Expression types include:

- **Emotions**: `happy`, `angry`, `surprised`, etc.
- **Visemes**: `aa`, `ee`, `th`, `oo`, etc.
> **Review thread on the viseme list:**
>
> **Contributor:** It's not sufficient to just define "etc". The extension should define a large list of interoperable names, or else the extension does not do much to further interoperability.
>
> **Reply:** Expressions are allowed to define anything in the categories of morph shapes, joints, or textures, so we put the VRM defaults here. You are correct. https://github.com/vrm-c/vrm-specification/blob/master/specification/VRMC_vrm-1.0/expressions.md#lip-sync-procedural
>
> **Contributor:** Example of a potential list of visemes: [screenshot: Screenshot 2025-07-28 at 7 50 25 PM]
>
> **@fire (Jul 29, 2025):** I personally don't want to define the possible visemes. For example, in Godot Engine we decided to use Unified Expressions. https://docs.vrcft.io/docs/tutorial-avatars/tutorial-avatars-extras/unified-blendshapes
>
> **Author:** This is the same problem we're currently facing across the industry. We don't have standards or prevalent shared vocabularies. Once we have wider adoption, I truly believe that those who use these extensions can come together to establish those vocabularies.
>
> For now, though, establishing one without feedback from the groups that would use it would end in frustration. I'd much rather try to establish something that is flexible, interoperable, and ready for when the community comes together to form those vocabularies.
>
> The expression and joint mapping extensions are meant to provide mechanisms to map creator expressions to endpoint expressions. This extension is more to denote what an expression "is" (animation/channel-wise). Once you have the creator/producing pipeline's concept of the relevant expressions, mapping them to an endpoint's desired/expected set becomes easier.
>
> **@aaronfranke (Contributor, Dec 2, 2025):** If anyone is interested in further discussion on viseme blend shape naming standardization, discuss here: meshula/LabRCSF#5
- **Modifiers**: `left`, `right`, `upper`, `lower`, etc.
- **Gestures and Actions**: `blink`, `smile`, `jawOpen`, etc.

Optionally, these may be aligned with industry standards, such as [Facial Action Coding System (FACS)](https://en.wikipedia.org/wiki/Facial_Action_Coding_System).

## Extension Schema

```json
{
  "extensions": {
    "KHR_avatar_expression_joint": {
      "expressions": [
        {
          "expression": "smile",
          "animation": 0,
          "channels": [0, 1, 2]
        },
        {
          "expression": "frown",
          "animation": 1,
          "channels": [0, 1]
        }
      ]
    }
  }
}
```

> **Review thread on the schema:**
>
> **Contributor:** Extension on... what? The root of the document? If it's tied to one animation, does it make sense to extend animation? Why or why not? The other extensions in this pull request similarly do not explicitly define what they are extending.
>
> **Author:** This was a bad copy-paste; I was putting this together in a way that I've clearly made some mistakes. Fixing.
>
> **Author:** Root extensions in the majority of cases. I'll do another pass to make it more clear in the next couple of days. Apologies.

### Properties

| Property | Type | Description |
|--------------|---------|-----------------------------------------------------------------------------|
| `expressions`| array | Array of mappings between animation/channels and expression labels. |
| `animation` | integer | Index into the glTF `animations[]` array representing an expression animation. |
| `expression` | string | Expression name this joint contributes to. |
| `channels` | array | Indices into the referenced animation's `channels[]` array; each referenced channel's target path must be `"rotation"`, `"translation"`, or `"scale"`. |
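As an illustration of how a consumer might enforce the channel constraint above, here is a hypothetical Python validator. The dict layout mirrors glTF's `animations[].channels[].target.path` structure; the helper name is an assumption, not part of the specification.

```python
VALID_PATHS = {"rotation", "translation", "scale"}

def validate_expression_channels(gltf: dict, expr: dict) -> None:
    """Check that every channel an expression references targets a
    transform path. (Hypothetical validator, not normative.)"""
    animation = gltf["animations"][expr["animation"]]
    for ci in expr["channels"]:
        path = animation["channels"][ci]["target"]["path"]
        if path not in VALID_PATHS:
            raise ValueError(
                f"expression '{expr['expression']}' channel {ci} "
                f"targets '{path}', expected one of {sorted(VALID_PATHS)}"
            )

gltf = {
    "animations": [{
        "channels": [
            {"target": {"node": 2, "path": "rotation"}},
            {"target": {"node": 2, "path": "translation"}}
        ]
    }]
}
# Passes silently: both referenced channels target transform paths
validate_expression_channels(
    gltf, {"expression": "smile", "animation": 0, "channels": [0, 1]}
)
```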

## Animation Integration

- Expression timing, blending, and control must use glTF `animations` channels.
- Animations targeting expression-driven `rotation`, `translation`, or `scale` must conform to glTF 2.0's animation model.
- This ensures consistency, ease of implementation, and interoperability across runtimes.

Each animation channel used to drive an expression should operate within a **normalized 0-to-1 range**, where:
- `0.0` indicates the expression is fully inactive.
- `1.0` indicates the expression is fully active.

The transformation values themselves (e.g., degree of rotation or distance of translation) should scale proportionally with the normalized input range.

This approach simplifies avatar implementation by centralizing expression playback in the glTF animation system and unifying runtime logic for blending and prioritization.
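The proportional-scaling rule can be sketched in a few lines of Python. This is illustrative only (a real runtime would interpolate quaternions rather than scale a single angle), and `drive_expression_rotation` is a hypothetical helper name.

```python
def drive_expression_rotation(full_angle_deg: float, weight: float) -> float:
    """Scale a joint's expression rotation by a normalized weight:
    0.0 -> joint at rest, 1.0 -> full expression pose.
    (Illustrative sketch; not part of the specification.)"""
    w = min(max(weight, 0.0), 1.0)  # clamp input to the normalized range
    return full_angle_deg * w

# Hypothetical 'jawOpen' expression rotating a jaw joint up to 30 degrees:
print(drive_expression_rotation(30.0, 0.5))  # -> 15.0
```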

### Recommended Interpolation for Binary Expressions

For expressions that represent binary or toggle states (such as `blinkLeft`, `blinkRight`, or `jawOpen`), the use of glTF animation channels with `"interpolation": "STEP"` is strongly recommended.

STEP interpolation ensures that an expression toggles cleanly between fully off (`0.0`) and fully on (`1.0`) states, providing crisp visual transitions and avoiding interpolation artifacts that could occur with `LINEAR` interpolation in binary scenarios.


## Implementation Notes

- Multiple joints may be assigned to the same expression.
- Expression states should be normalized to the [0.0–1.0] range for consistent runtime interpretation.
- This extension does not conflict with standard rigging or skinning systems.

## License

This extension is licensed under the Khronos Group Extension License.
See: https://www.khronos.org/registry/gltf/license.html
104 changes: 104 additions & 0 deletions extensions/2.0/Khronos/KHR_avatar_expression_mapping/README.md
@@ -0,0 +1,104 @@
# KHR_avatar_expression_mapping

## Contributors

- Ken Jakubzak, Meta
- Hideaki Eguchi / VirtualCast, Inc.
- K. S. Ernest (iFire) Lee, Individual Contributor / https://github.com/fire
- Shinnosuke Iwaki / VirtualCast, Inc.
- 0b5vr / pixiv Inc.
- Leonard Daly, Individual Contributor

## Status

**Draft** – This extension is not yet ratified by the Khronos Group and is subject to change.

## Dependencies

Written against the glTF 2.0 specification.
Dependent on: `KHR_avatar`
Can be used alongside: `KHR_avatar_expression_morphtargets` or other expression sources

## Overview

The `KHR_avatar_expression_mapping` extension provides a general-purpose mechanism for mapping expression names used in an avatar's mesh to a known expression vocabulary or rig specification. This allows different authoring pipelines and runtimes to translate between heterogeneous expression sets.

## Reference Expression Vocabulary

Expression names may be grouped into categories including:

- **Emotions** (for example, `happy`, `angry`, `surprised`)
- **Visemes** (for example, `aa`, `oo`, `th`)
- **Modifiers** (for example, `left`, `right`, `upper`, `lower`)
- **Gestures and Actions** (for example, `blink`, `smile`, `jawOpen`)

> **Review thread on the viseme list:**
>
> **@aaronfranke (Contributor, Jul 29, 2025):** Same here, should give a full list. Also, inconsistent wording: etc vs e.g. (I recommend etc, or go full English words and use phrases like "for example", "such as", "and so on", "and more").
>
> **Reply:** I prefer "for example" and "etc.".

Implementers are encouraged to use this vocabulary directly or map custom expressions to it using `KHR_avatar_expression_mapping`.

## Extension Schema

```json
{
  "extensions": {
    "KHR_avatar_expression_mapping": {
      "mappings": {
        "smileLeft": [
          { "target": "Smile", "weight": 0.8 },
          { "target": "LeftCheekRaise", "weight": 0.2 }
        ],
        "jawOpen": [
          { "target": "MouthOpen", "weight": 1.0 }
        ]
      }
    }
  }
}
```

### Properties

| Property | Type | Description |
|--------------|---------|-----------------------------------------------------------------------------|
| `mappings` | object | Dictionary mapping expression names to reference vocabulary terms. |
| `target` | string | Name of the expression in the target vocabulary. |
| `weight` | number | Influence of this target. Weights must sum to 1.0 per expression key. |

> **Review thread on the properties table:**
>
> **Contributor:** What is "target vocabulary"? This isn't explained anywhere. The list of visemes etc. is in a section for "Reference Expression Vocabulary", which is the description of mappings, but this term for target is undefined.
>
> **Reply:** Expression names can be anything: FACS or the VRM blend shape standard we know today, content creator definitions, or a future standard we have no idea about.


### Mapping Types

This extension supports both one-to-one and one-to-many mappings:

- **One-to-one**: An expression maps directly to a single reference vocabulary term with weight 1.0.
- **One-to-many**: An expression is composed from multiple reference terms, blended using assigned weights.

This allows developers to bridge between custom expression sets and shared vocabularies.
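The weighted translation described above can be sketched as a small Python helper. This is an illustrative sketch following the README's example data; the function name `apply_expression_mapping` is an assumption, not defined by the extension.

```python
def apply_expression_mapping(mappings: dict, source_weights: dict) -> dict:
    """Translate source expression weights into target-vocabulary weights
    using KHR_avatar_expression_mapping-style data. (Illustrative only.)"""
    target_weights: dict = {}
    for expression, value in source_weights.items():
        for entry in mappings.get(expression, []):
            t = entry["target"]
            # Accumulate, since several sources may feed the same target
            target_weights[t] = target_weights.get(t, 0.0) + entry["weight"] * value
    return target_weights

# The one-to-many mapping from the schema example, driven at half strength:
mappings = {
    "smileLeft": [
        {"target": "Smile", "weight": 0.8},
        {"target": "LeftCheekRaise", "weight": 0.2}
    ]
}
print(apply_expression_mapping(mappings, {"smileLeft": 0.5}))
```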

## Implementation Notes

- This extension is typically used at the top level of the glTF file.
- Expression names should match those used in `KHR_avatar_expression_morphtargets`, animation tracks, or tracking pipelines.
- Tools can interpret this mapping to apply automatic translation between expression sets.

## Example

```json
{
  "extensionsUsed": [
    "KHR_avatar_expression_mapping"
  ],
  "extensions": {
    "KHR_avatar_expression_mapping": {
      "mappings": {
        "smileLeft": [
          { "target": "Smile", "weight": 0.8 },
          { "target": "LeftCheekRaise", "weight": 0.2 }
        ]
      }
    }
  }
}
```

## License

This extension specification is licensed under the Khronos Group Extension License.
See: https://www.khronos.org/registry/gltf/license.html