Added LayoutLMv3 #2178

Open

carrycooldude

Description

This PR fixes the LayoutLMv3 checkpoint conversion script to properly handle different spatial embedding dimensions between the base and large models. The base model uses 128 dimensions for all spatial embeddings, while the large model uses 171 dimensions for x/y coordinates and 170 dimensions for height/width.

Changes Made

  • Added dynamic detection of spatial embedding dimensions from the Hugging Face model
  • Implemented padding for smaller embeddings to match the maximum dimension
  • Updated projection matrices to use consistent dimensions
  • Added detailed debug output for spatial embedding shapes

Technical Details

The conversion script now:

  1. Detects individual dimensions for x, y, h, w embeddings
  2. Uses the maximum detected dimension (171 for the large model) as the target for all spatial embeddings
  3. Zero-pads the smaller embeddings (170 for height/width) to match that target dimension
  4. Creates projection matrices with consistent dimensions
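
The detection and padding steps above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper name `pad_to_common_dim`, the dictionary keys, and the table size of 8 rows are assumptions made for the example, and random arrays stand in for the real Hugging Face weights.

```python
import numpy as np


def pad_to_common_dim(embeddings):
    """Zero-pad each (rows, dim) table on the last axis to the largest dim.

    `embeddings` maps a name (hypothetical keys: "x", "y", "h", "w")
    to a 2D NumPy array. Returns the padded tables and the target dim.
    """
    # Step 1-2: detect each table's dimension and take the maximum.
    target = max(table.shape[-1] for table in embeddings.values())
    padded = {}
    for name, table in embeddings.items():
        extra = target - table.shape[-1]
        # Step 3: append `extra` zero columns; rows are left untouched.
        padded[name] = np.pad(table, ((0, 0), (0, extra)))
    return padded, target


# Stand-in tables using the large model's dimensions quoted above
# (171 for x/y, 170 for height/width). The row count is arbitrary.
rng = np.random.default_rng(0)
tables = {
    "x": rng.normal(size=(8, 171)),
    "y": rng.normal(size=(8, 171)),
    "h": rng.normal(size=(8, 170)),
    "w": rng.normal(size=(8, 170)),
}
padded, target_dim = pad_to_common_dim(tables)
for name, table in padded.items():
    print(name, table.shape)
```

For the base model, where every table already has 128 columns, the same helper would leave the weights unchanged because `extra` is 0 for each table.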

Testing

  • Successfully converted both base and large models
  • Verified output shapes match expected dimensions
  • Confirmed no dimension mismatch errors during conversion

Output Example

[Screenshot of the conversion output, taken 2025-03-30 12:50:29]

@divyashreepathihalli
Collaborator

@carrycooldude Thank you for the PR. The code structure does not match KerasHub style.
Please go through the guide here: https://github.com/keras-team/keras-hub/blob/master/CONTRIBUTING_MODELS.md
Take a look at the other model folders for reference.
What would the task model look like?
The preset file should contain only metadata and the Kaggle Hub path.
Can you provide a usage example for the model code?
