MagellaX
Summary

• After setup_model_and_optimizer returns the model, compute the total parameter count with the existing get_parameters_in_billions utility.
• Print a single formatted line on args.rank == 0.
• No functional or performance impact; this is purely a log addition.

Implementation details
• Leverages get_parameters_in_billions(model) to keep the counting logic consistent.
• Converts billions → exact count (int(total_params_B * 1e9)) for clarity.
• flush=True ensures the line appears promptly even in buffered environments.
• Guarded by args.rank == 0 so it prints once per job, regardless of model/data/pipeline parallelism.
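A minimal sketch of the change described above. The stand-in get_parameters_in_billions below reduces the real utility to plain arithmetic so the snippet is self-contained, and the exact log prefix and format are illustrative, not taken from the PR:

```python
from types import SimpleNamespace

def get_parameters_in_billions(model):
    # Stand-in for the existing utility: here `model` is just a raw
    # parameter count, so the real per-tensor summation is elided.
    return model / 1e9

def log_total_parameters(args, model):
    # Computed on every rank, printed only on rank 0, so the line appears
    # exactly once per job regardless of model/data/pipeline parallelism.
    total_params_b = get_parameters_in_billions(model)
    if args.rank == 0:
        # billions -> exact count for clarity; flush=True so the line shows
        # promptly even in buffered environments.
        print(f"Total number of parameters: {int(total_params_b * 1e9):,} "
              f"({total_params_b:.3f} B)", flush=True)

# Illustrative call: rank 0 prints one line, rank 1 stays silent.
log_total_parameters(SimpleNamespace(rank=0), 6_700_000_000)
log_total_parameters(SimpleNamespace(rank=1), 6_700_000_000)
```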

Testing
• Unit tests: pytest -q tests → all green.
• Smoke test (examples/pretrain_gpt_tiny.sh): confirmed the new line appears once on rank 0 and nowhere else, and training proceeds unaltered.
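The smoke-test expectation, exactly one occurrence of the line across all ranks, can also be checked in a unit test. A hedged sketch of such a test (the function stub, test name, and log prefix are assumptions, not part of the PR):

```python
import io
import contextlib
from types import SimpleNamespace

def emit_param_log(args, total_params_b):
    # Stub of the guarded print under test (log prefix is illustrative).
    if args.rank == 0:
        print(f"Total number of parameters: {int(total_params_b * 1e9):,}",
              flush=True)

def test_prints_once_across_ranks(world_size=8):
    # Simulate every rank in the job writing to one captured stream.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        for rank in range(world_size):
            emit_param_log(SimpleNamespace(rank=rank), 0.125)
    hits = [line for line in buf.getvalue().splitlines()
            if line.startswith("Total number of parameters")]
    assert len(hits) == 1  # once on rank 0, nowhere else

test_prints_once_across_ranks()
```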

Backward compatibility
None of the existing outputs were removed or changed; only one new line is added.
