
Conversation

@GandalfTea (Contributor) commented Nov 25, 2025

Summary

Add support for new model architectures:

  • Olmo3
  • GLM4
  • Qwen3-MoE

Changes

 src/dnet/core/models/__init__.py  |   6 +++
 src/dnet/core/models/glm4.py      | 119 ++++++++++++++++++++++++++++++++++++++++++
 src/dnet/core/models/olmo3.py     | 120 ++++++++++++++++++++++++++++++++++++++++++
 src/dnet/core/models/qwen3_moe.py | 187 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 432 insertions(+)
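The six insertions in `src/dnet/core/models/__init__.py` presumably just expose the new architecture modules. A hypothetical sketch of what such a diff typically looks like; the class names and `__all__` entries are illustrative, not dnet's actual API:

```python
# Hypothetical content of the 6-line __init__.py addition: import and
# re-export the new architecture modules so the model loader can resolve
# them. Names are illustrative, not dnet's actual API.
from .glm4 import Model as GLM4Model
from .olmo3 import Model as Olmo3Model
from .qwen3_moe import Model as Qwen3MoEModel

__all__ = ["GLM4Model", "Olmo3Model", "Qwen3MoEModel"]
```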

Testing

  Olmo 3
! mlx-community/Olmo-3-1025-7B-4bit          [FAIL] (junk output) 
+ mlx-community/Olmo-3-7B-Think-4bit
+ mlx-community/Olmo-3-7B-Think-SFT-4bit
+ mlx-community/Olmo-3-7B-Instruct-4bit
+ mlx-community/Olmo-3-7B-Instruct-SFT-4bit

+ mlx-community/Olmo-3-1025-7B-8bit
+ mlx-community/Olmo-3-7B-Think-8bit
+ mlx-community/Olmo-3-7B-Think-SFT-8bit
+ mlx-community/Olmo-3-7B-Instruct-8bit
+ mlx-community/Olmo-3-7B-Instruct-SFT-8bit

! mlx-community/Olmo-3-7B-Instruct-bf16         [FAIL] (bf16 fails sampling)
! mlx-community/Olmo-3-7B-Instruct-SFT-bfloat16 [FAIL]
! mlx-community/Olmo-3-7B-Think-bfloat16        [FAIL]
! mlx-community/Olmo-3-7B-Think-SFT-bfloat16    [FAIL]
! mlx-community/Olmo-3-1025-7B-bfloat16         [FAIL]

+ mlx-community/Olmo-3-1125-32B-4bit
+ mlx-community/Olmo-3-1125-32B-8bit


  GLM
+ mlx-community/GLM-4-9B-0414-4bit
+ mlx-community/GLM-Z1-9B-0414-4bit
+ mlx-community/GLM-4-9B-0414-8bit 
+ mlx-community/GLM-Z1-9B-0414-8bit
+ mlx-community/GLM-4-32B-0414-4bit
+ mlx-community/GLM-Z1-32B-0414-4bit
! mlx-community/GLM-Z1-9B-0414-bf16 [FAIL] (failed sampling)
! mlx-community/GLM-4-9B-0414-bf16  [FAIL] (failed sampling)

  Qwen3-MoE
+ mlx-community/Qwen3-30B-A3B-4bit
+ mlx-community/Qwen3-Coder-30B-A3B-Instruct-4bit
+ mlx-community/Qwen3-Coder-30B-A3B-Instruct-8bit
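For context on what `qwen3_moe.py` implements: Qwen3-MoE layers replace the dense feed-forward block with a bank of experts and route each token through a top-k subset of them. Below is a generic, minimal sketch of that routing pattern in MLX; the class, sizes, and naive dispatch loop are illustrative, not dnet's actual implementation:

```python
import mlx.core as mx
import mlx.nn as nn

class MoEBlock(nn.Module):
    """Generic top-k mixture-of-experts feed-forward sketch (illustrative)."""

    def __init__(self, dims: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dims, num_experts, bias=False)  # router
        self.experts = [
            nn.Sequential(nn.Linear(dims, 4 * dims), nn.SiLU(), nn.Linear(4 * dims, dims))
            for _ in range(num_experts)
        ]

    def __call__(self, x: mx.array) -> mx.array:
        scores = mx.softmax(self.gate(x), axis=-1)  # (..., num_experts)
        # Indices of the top_k highest-scoring experts per token.
        idx = mx.argpartition(-scores, kth=self.top_k - 1, axis=-1)[..., : self.top_k]
        weights = mx.take_along_axis(scores, idx, axis=-1)
        weights = weights / weights.sum(axis=-1, keepdims=True)  # renormalize
        # Naive dense dispatch: evaluate every expert, mask to the selected
        # ones, and mix. Real implementations gather tokens per expert.
        out = mx.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k : k + 1] == e).astype(x.dtype)
                out = out + mask * weights[..., k : k + 1] * expert(x)
        return out
```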

Also modified an existing catalogue entry:

- mlx-community/Qwen3-8B-bf16 (failed sampling)

Dependencies

This commit depends on the distilp PRs firstbatchxyz/distilp#18 and firstbatchxyz/distilp#17.

@GandalfTea force-pushed the oto/add-models branch 2 times, most recently from bb41004 to a282bef on November 25, 2025 14:54
@GandalfTea (Contributor, Author) commented:

mlx-community/GLM-Z1-9B-0414-bf16 fails with:

2025-11-25 07:52:38,633 - dnet - ERROR - fit_in_memory.py:164 - End-shard sampling failed: [matmul] Last dimension of first input with shape (1,14,4096) must match second to last dimension of second input with shape (151552,4096).
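For reference, the failing shapes look consistent with the output projection being applied with a (vocab, hidden) weight, as tied embeddings are usually stored, without a transpose. A minimal sketch of just the shapes from the log, assuming MLX; the zero tensors are purely illustrative:

```python
import mlx.core as mx

# Shapes from the error log: hidden states (1, 14, 4096) and a weight
# stored as (vocab, hidden) = (151552, 4096).
h = mx.zeros((1, 14, 4096), dtype=mx.bfloat16)
w = mx.zeros((151552, 4096), dtype=mx.bfloat16)

# h @ w reproduces the failure: inner dimensions 4096 vs 151552 mismatch.
# Projecting onto the vocabulary requires the transpose:
logits = h @ w.T
print(logits.shape)  # (1, 14, 151552)
```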

@andthattoo (Member) commented:

Models are now tracked within the catalogue; see the catalog.

@GandalfTea (Contributor, Author) commented:

Added the working models to the catalogue. I'll look into the 6-bit and bf16 quantization problems.
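As a starting point for the bf16 investigation, one common mitigation (hypothetical here, not a confirmed fix for this failure) is upcasting logits to float32 before sampling, since some sampling paths misbehave on bfloat16. A minimal MLX sketch; the function name is illustrative:

```python
import mlx.core as mx

def sample_token(logits: mx.array, temperature: float = 1.0) -> mx.array:
    # Upcast from bfloat16 before sampling; softmax/categorical are
    # numerically more robust in float32.
    logits = logits.astype(mx.float32)
    if temperature == 0:
        return mx.argmax(logits, axis=-1)
    return mx.random.categorical(logits / temperature, axis=-1)
```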

@GandalfTea marked this pull request as ready for review November 25, 2025 18:07
@GandalfTea marked this pull request as draft November 26, 2025 05:08