
Question/request: High-Level Model Architecture Visualization Tool #36069

Open
d-kleine opened this issue Feb 6, 2025 · 4 comments
Labels: Feature request

Comments

@d-kleine (Contributor) commented Feb 6, 2025

Feature request

I hope this is the right place to make this feature request: I propose adding a high-level visualization tool for model architectures hosted on HuggingFace Hub. The tool should provide an intuitive, abstracted view of model architectures, focusing on major components and their interactions rather than low-level operations.

The implementation could be either:

  • A standalone function within the transformers library (e.g., model.visualize_architecture())
  • A separate lightweight package that integrates with the transformers library

While starting with transformer-based models would be logical, the tool should be extensible to other architectures in the future.

This request applies to the entire model catalog on the HuggingFace Hub - not just transformer-based models or LLMs specifically, but ideally all model architectures, including computer vision, speech, and multimodal models.
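
To make the first option a bit more concrete, here is a rough, text-only sketch of the collapsing idea. Everything in it is hypothetical (summarize_architecture is not an existing transformers API); a real tool would render a diagram rather than print text:

```python
# Hypothetical sketch only - nothing like this exists in transformers today.
# Walk named_children() and collapse nn.ModuleLists of repeated blocks.
import torch.nn as nn
from transformers import AutoModel


def summarize_architecture(module: nn.Module, depth: int = 0, max_depth: int = 2) -> None:
    """Print major components, collapsing repeated blocks into 'N x BlockType'."""
    for name, child in module.named_children():
        indent = "  " * depth
        if isinstance(child, nn.ModuleList) and len(child) > 0:
            block_types = sorted({type(sub).__name__ for sub in child})
            print(f"{indent}{name}: {len(child)} x {' / '.join(block_types)}")
        else:
            print(f"{indent}{name}: {type(child).__name__}")
            if depth + 1 < max_depth:
                summarize_architecture(child, depth + 1, max_depth)


model = AutoModel.from_pretrained("openai-community/gpt2")
summarize_architecture(model)
# Expected output (roughly):
# wte: Embedding
# wpe: Embedding
# drop: Dropout
# h: 12 x GPT2Block
# ln_f: LayerNorm
```

For GPT-2 this already reduces the twelve decoder blocks to a single "12 x GPT2Block" entry, which is the kind of abstraction the diagrams should provide.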

Motivation

I have searched the HuggingFace forum and GitHub issues for similar feature requests or discussions. While there are conversations about model visualization using tools like torchview (see for example here) and discussions about model architecture understanding, I could not find any direct feature requests for a high-level architecture visualization tool. This suggests this feature would fill an important gap in the current HuggingFace ecosystem.

Current visualization tools like torchview provide technically accurate but overwhelming visualizations that:

  • Display repetitive components (like transformer blocks) individually
  • Show low-level operations as separate components
  • Generate complex graphs that are difficult for newcomers to understand

A high-level visualization tool would:

  • Help users quickly understand model architectures, saving time by avoiding deep dives into code and papers
  • Provide clear visual representation of major components and their relationships
  • Support educational purposes and architecture comparison
  • Make model selection more intuitive for practitioners

The tool should:

  • Generate clean, hierarchical diagrams showing major architectural components and how they are chained together
  • Collapse repetitive structures (like transformer blocks) into single units with count indicators
  • Show key parameters and dimensions (hidden size, layer and head counts, etc.); see the config-based sketch after this list
  • Quickly generate visualizations, even for complex model architectures
  • Allow different levels of detail through zoom/expansion and display settings
  • Support export to common output formats (PNG, SVG)
  • Ideally, include hover tooltips with component descriptions
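
On the "key parameters and dimensions" point: much of that metadata is already available from the model config, so annotations could be derived without tracing the model at all. A small sketch (key_dimensions is a hypothetical helper; attribute names differ between architectures, hence the getattr fallbacks):

```python
# Sketch: derive count indicators and key dimensions from the config alone.
from transformers import AutoConfig


def key_dimensions(model_id: str) -> dict:
    config = AutoConfig.from_pretrained(model_id)
    return {
        "model_type": config.model_type,
        "layers": getattr(config, "num_hidden_layers", None),
        "hidden_size": getattr(config, "hidden_size", None),
        "attention_heads": getattr(config, "num_attention_heads", None),
        "vocab_size": getattr(config, "vocab_size", None),
    }


print(key_dimensions("openai-community/gpt2"))
# e.g. {'model_type': 'gpt2', 'layers': 12, 'hidden_size': 768,
#       'attention_heads': 12, 'vocab_size': 50257}
```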

Your contribution

The exact nature of my involvement would naturally depend on your vision and needs. Generally speaking, I could help with:

  • Providing ideas on how to make the tool user-friendly
  • Testing the tool and providing constructive feedback
  • Contributing code
d-kleine added the Feature request label on Feb 6, 2025
@Rocketknight1 (Member)

This sounds like a cool feature! There may be some tricky implementation details, but if you can get it working I think we'd be happy to include it in the library (cc @ArthurZucker to confirm)

We probably don't have the bandwidth to do it internally, but if you want to open a PR and try adding it yourself, we could probably offer some support

@marthos1 commented Feb 7, 2025

Please don't make ghosts come out.

@d-kleine (Contributor, Author)

Just to get a general idea, this is how torchview visualizes GPT-2:

# pip install torch transformers graphviz torchview
from transformers import AutoModel, AutoTokenizer
from torchview import draw_graph
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# (Optional) quick sanity check that torchview and graphviz are set up correctly
model = torch.nn.Linear(in_features=1, out_features=1, device=device)
model_graph = draw_graph(model, input_size=(32, 1), device=device)

# Visualize GPT-2 as loaded from the Hub
model = AutoModel.from_pretrained("openai-community/gpt2")
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

inputs = tokenizer("Hello world!", return_tensors="pt")

model_graph = draw_graph(model, input_data=inputs, device=device, expand_nested=True)

model_graph.visual_graph.node_attr["fontname"] = "Helvetica"
model_graph.visual_graph  # renders the Graphviz graph in a notebook

[Image: torchview graph of GPT-2]

The cool thing about the above implementation is that it can already visualize models hosted on the HuggingFace Hub. It can also visually "wrap" certain components into blocks (see the dashed lines) using the information from the model files, but it would still need some refinement and decluttering (technical labels for operations, repeated blocks are not collapsed, etc.).
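
If I read the torchview documentation correctly, part of that decluttering may already be achievable through draw_graph's own options (depth to limit nesting, roll to fold repeated blocks). Reusing the model/inputs from above; treat this as an untested sketch rather than a verified recipe:

```python
# Assumes torchview's depth / roll / hide_inner_tensors options behave as
# documented; not yet verified on larger Hub models.
model_graph = draw_graph(
    model,
    input_data=inputs,
    device=device,
    depth=2,                  # stop expanding below the block level
    roll=True,                # fold repeated blocks into a single rolled node
    hide_inner_tensors=True,  # drop intermediate tensor nodes from the graph
)
model_graph.visual_graph
```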

@Rocketknight1 (Member)

Yeah, for sure - if you could get something like this working cleanly for Transformers models that'd be great.

The major difficulty, though, is that we have a lot of conventions that aren't 100% universal rules, so any code you write will probably need edge-case handling, or may not support all models. Still, it's a nice visualization if you can handle things like repeating blocks, so I'm excited about adding something like this to the library!
