GeistHaus
[feature suggestion] instead of using random datasets, it should use real datasets · Issue #359 · SemiAnalysisAI/InferenceX

github.com

Almost all benchmark configurations set max_model_len (or, for TensorRT, --max_seq_len), which controls the maximum supported length of a request (inclusive of the prompt and any generated output). It is...
