Commit 98b5d45

Minor
1 parent fa0cfc4 commit 98b5d45

2 files changed

Lines changed: 8 additions & 4 deletions

profiling/README.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ on how to install and start using NSight.

 ### Code usage

-Here is a basic example how to use the Nsight API for a PyTorch model:
+Here is a basic template how to use the Nsight API for a PyTorch model:

 torch.cuda.cudart().cudaProfilerStart()
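The README template above starts the CUDA profiler with `torch.cuda.cudart().cudaProfilerStart()`. A minimal sketch of how the start and stop calls might be paired around a region of interest is shown here. The helper name `cuda_profiler_range` and its `start`/`stop` parameters are hypothetical and not part of this commit; the defaults assume PyTorch with CUDA is installed.

```python
from contextlib import contextmanager


@contextmanager
def cuda_profiler_range(start=None, stop=None):
    """Run the enclosed block between a profiler start and stop call.

    `start` and `stop` are hypothetical injection points so the helper can be
    exercised without a GPU. When omitted, the CUDA runtime bindings exposed
    by PyTorch are used.
    """
    if start is None or stop is None:
        import torch  # imported lazily; only needed for the default callbacks

        cudart = torch.cuda.cudart()
        start = start or cudart.cudaProfilerStart
        stop = stop or cudart.cudaProfilerStop

    start()
    try:
        yield
    finally:
        # Stop the profiler even if the profiled code raises.
        stop()
```

With the defaults, `with cuda_profiler_range(): model(inputs)` would limit the capture to that block, provided Nsight Systems is launched so that it honors the profiler API (for example with `nsys profile --capture-range=cudaProfilerApi`).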

profiling/profile_vllm_nsight.py

Lines changed: 7 additions & 3 deletions
@@ -13,15 +13,16 @@ def sample_token_sequences(vocab_size:int, num_seq: int, len_mean:int, len_std:i
 def main(args: argparse.Namespace):

     print(args)
-
+
     llm = LLM(
         model=args.model,
         quantization=args.quantization,
         tensor_parallel_size=args.tensor_parallel_size,
-        #max_num_seqs=args.batch_size,
-        #max_num_batched_tokens=args.batch_size * args.input_len,
+        #max_num_seqs=args.batch_size,
+        #max_num_batched_tokens=args.batch_size * (args.input_len + args.output_len),
         trust_remote_code=args.trust_remote_code,
         dtype=args.dtype,
+        disable_log_stats=not args.enable_log_stats
     )

     vocab_size = llm.llm_engine.tokenizer.vocab_size
@@ -98,6 +99,9 @@ def run_to_completion(profile:bool=False):
     parser.add_argument('--profile',
                         action='store_true',
                         help='profile CUDA code')
+    parser.add_argument('--enable-log-stats',
+                        action='store_true',
+                        help='log vLLM statistics')

     args = parser.parse_args()
     main(args)
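The commit adds an opt-in `--enable-log-stats` flag and inverts it into `disable_log_stats`, and it changes the commented-out token budget to count output tokens as well. The following stand-alone sketch illustrates both pieces of logic without importing vLLM. The helpers `build_parser`, `engine_kwargs`, and `batched_token_budget` are hypothetical names introduced for illustration only, and the budget line remains commented out in the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Only the two flags visible in the diff are reproduced here.
    parser = argparse.ArgumentParser(description="flag-handling sketch")
    parser.add_argument('--profile',
                        action='store_true',
                        help='profile CUDA code')
    parser.add_argument('--enable-log-stats',
                        action='store_true',
                        help='log vLLM statistics')
    return parser


def engine_kwargs(args: argparse.Namespace) -> dict:
    # store_true defaults to False, so statistics logging stays disabled
    # unless the user passes --enable-log-stats.
    return {"disable_log_stats": not args.enable_log_stats}


def batched_token_budget(batch_size: int, input_len: int, output_len: int) -> int:
    # Mirrors the revised (still commented-out) expression in the diff:
    # each sequence may hold its prompt plus its generated tokens.
    return batch_size * (input_len + output_len)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(engine_kwargs(args))
```

Because argparse converts `--enable-log-stats` to the attribute `enable_log_stats`, the expression `not args.enable_log_stats` in the diff resolves without any extra `dest` argument.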
