Description
When running an NVTabular workflow with Categorify operations in Triton Inference Server, performance degrades significantly when dealing with high-cardinality data.
Environment
Steps to Reproduce
tritonserver --model-repository=./ensemble/
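For reference, a minimal client-side timing sketch (this is not the referenced encode.sh script) that can be used to observe per-request latency against the deployed ensemble. The model name ensemble_model, input column item_id, BYTES dtype, batch size, and port 8000 are assumptions; adjust them to match the exported ensemble's config.pbtxt.

```python
# Hypothetical timing loop against the running Triton server.
# If category data were cached, only the first request should be slow.
import time
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Assumed: the ensemble takes a single string column named "item_id".
batch = np.array([[f"user_{i}"] for i in range(4096)], dtype=object)
inp = httpclient.InferInput("item_id", list(batch.shape), "BYTES")
inp.set_data_from_numpy(batch)

for i in range(5):
    start = time.perf_counter()
    client.infer(model_name="ensemble_model", inputs=[inp])
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"request {i}: {elapsed_ms:.1f} ms")
```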
Expected Behavior
The Categorify operation should perform efficiently, with category data being cached between requests, resulting in performance similar to that observed in a Jupyter notebook environment.
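For context, this is the kind of notebook baseline the comparison refers to: the fitted workflow, and the category mapping that Categorify builds during fit, stays in memory across repeated transform calls, so every call is fast. The DataFrame contents, column name item_id, and batch size below are illustrative assumptions, not the original benchmark.

```python
# Notebook-style baseline sketch: fit Categorify once, then time repeated transforms.
import time
import cudf
import nvtabular as nvt
from nvtabular import ops

df = cudf.DataFrame({"item_id": [f"user_{i}" for i in range(4096)]})

# Fit once; the category mapping is built here and kept with the workflow.
graph = ["item_id"] >> ops.Categorify()
workflow = nvt.Workflow(graph)
workflow.fit(nvt.Dataset(df))

# Repeated transforms reuse the in-memory workflow, so each call
# should take roughly the same (small) amount of time.
for i in range(5):
    start = time.perf_counter()
    workflow.transform(nvt.Dataset(df)).to_ddf().compute()
    print(f"transform {i}: {(time.perf_counter() - start) * 1000:.1f} ms")
```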
Actual Behavior
The Categorify operation is slow, with each request taking as long as the first request, suggesting that category data is not being effectively cached between requests.
Results
Below are the results from the benchmarking script encode.sh (latency per request):

| Cardinality | Ensemble (Triton) | TransformWorkflow (Jupyter) |
| --- | --- | --- |
| 50 | 30 ms | 38 ms |
| 5k | 30 ms | 43 ms |
| 5M | 1270 ms | 88.8 ms |
| 50M | 15833 ms | 550 ms |