Set gpu per model #3085

eliteprox · 2024-06-25T05:30:31Z

What does this pull request do? Explain your changes. (required)

Adds the ability to set the preferred GPU(s) per model in the aimodels config via a new `gpus` field.

    {
        "pipeline": "text-to-image",
        "model_id": "SG161222/RealVisXL_V4.0_Lightning",
        "price_per_unit": 190000,
        "warm": false,
        "gpus": [0]
    },

Currently this only works for warm models. More changes are needed before the gpu flag is honored when loading cold models.
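For reviewers, here is a minimal sketch of how a config entry with the new `gpus` field could be decoded and queried. It is not the code from this PR: the `ModelConfig` struct, `parseModelConfigs`, and `preferredGPU` are hypothetical names chosen for illustration, and treating the first listed index as the preferred GPU is an assumption. The example entry uses `"warm": true` because the flag is only honored for warm models at this stage.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModelConfig mirrors one entry of the aimodels config shown above.
// The Go type and field names are illustrative assumptions, not go-livepeer's.
type ModelConfig struct {
	Pipeline     string `json:"pipeline"`
	ModelID      string `json:"model_id"`
	PricePerUnit int64  `json:"price_per_unit"`
	Warm         bool   `json:"warm"`
	// GPUs lists preferred GPU indices; empty or missing means no preference.
	GPUs []int `json:"gpus,omitempty"`
}

// parseModelConfigs decodes a JSON array of model entries.
func parseModelConfigs(data []byte) ([]ModelConfig, error) {
	var configs []ModelConfig
	if err := json.Unmarshal(data, &configs); err != nil {
		return nil, fmt.Errorf("parsing aimodels config: %w", err)
	}
	return configs, nil
}

// preferredGPU returns the first listed GPU index for a model.
// ok is false when the entry does not set a preference.
// Picking the first index is an assumption made for this sketch.
func preferredGPU(c ModelConfig) (gpu int, ok bool) {
	if len(c.GPUs) == 0 {
		return 0, false
	}
	return c.GPUs[0], true
}

func main() {
	raw := []byte(`[
		{
			"pipeline": "text-to-image",
			"model_id": "SG161222/RealVisXL_V4.0_Lightning",
			"price_per_unit": 190000,
			"warm": true,
			"gpus": [0]
		}
	]`)

	configs, err := parseModelConfigs(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range configs {
		if gpu, ok := preferredGPU(c); ok {
			fmt.Printf("%s: preferred GPU %d\n", c.ModelID, gpu)
		} else {
			fmt.Printf("%s: no GPU preference\n", c.ModelID)
		}
	}
}
```

Entries that omit `gpus` decode to an empty slice, so existing configs would keep working under this sketch and simply report no preference.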

Linked to changes in livepeer/ai-worker#111

Specific updates (required)

How did you test each of these updates (required)

Does this pull request close any open issues?

AI-134

Checklist:

Read the contribution guide
make runs successfully
All tests in ./test.sh pass
README and other documentation updated
Pending changelog updated

* Add speech-to-text pipeline, refactor processAIRequest and handleAIRequest to allow for various response types
* Pin gomod to ai-runner for testing
* Revert "Pin gomod to ai-runner for testing"

  This reverts commit d4ba500.
* Update go mod dep for ai-worker
* Calculate pixel value of audio file
* fix go-mod deps
* Adjust price calculation
* one second per pixel
* cleanup, fix missing duration
* Add supported file types, calculate price by milliseconds
* Add bad request response for unsupported file types
* Update name of function
* Update go mod to ai-runner
* Use ffmpeg to get duration
* update install_ffmpeg.sh to parse audio better
* Check for audio codec instead of video codec
* gomod edits
* add docker file
* Update install_ffmpeg.sh to improve audio support, Add duration validation and logging, pin lpms
* rename speech-to-text to audio-to-text
* Update go-mod
* cleanup
* update go mod
* remove comment
* update gomod
* Update lpms mod
* Update to latest lpms
* Update lpms
* feat(ai): apply code improvements to AudioToText pipeline

  This commit applies several code improvements to the AudioToText codebase.
* Remove unnecessary logic
* Remove unused error
* Fix missing err
* Update go.mod and tidy
* chore(ai): update ai-worker and lpms to latest version

  This commit ensures that the ai-worker and lpms are at the latest versions which contain the changes needed for the audio-to-text pipeline.

---------

Co-authored-by: 0xb79orch <[email protected]>
Co-authored-by: Rick Staa <[email protected]>

* Add gateway metric for roundtrip ai times by model and pipeline
* Rename metrics and add unique manifest
* Fix name mismatch
* modelsRequested not working correctly
* feat: add initial POC AI gateway metrics

  This commit adds the initial AI gateway metrics so that they can be reviewed by others. The code still needs to be cleaned up and the buckets adjusted.
* feat: improve AI metrics

  This commit improves the AI metrics so that they are easier to work with.
* feat(ai): log no capacity error to metrics

  This commit ensures that an error is logged when the Gateway could not find orchestrators for a given model and capability.
* feat(ai): add TicketValueSent and TicketsSent metrics

  This commit ensures that the `ticket_value_sent` and `tickets_sent` metrics are also created for an AI Gateway.
* fix(ai): ensure that AI metrics have orch address label

  This commit ensures that the AI gateway metrics contain the orch address label.
* fix(ai): fix incorrect Gateway pricing metric

  This commit ensures that the AI job pricing is calculated correctly and cleans up the codebase.
* refactor(ai): remove Orch label from ai_request_price metric

  This commit removes the Orch label from the ai_request_price metrics since that information is better retrieved from another endpoint.

---------

Co-authored-by: Elite Encoder <[email protected]>

* Add gateway metric for roundtrip ai times by model and pipeline
* Rename metrics and add unique manifest
* Fix name mismatch
* modelsRequested not working correctly
* feat: add initial POC AI gateway metrics

  This commit adds the initial AI gateway metrics so that they can be reviewed by others. The code still needs to be cleaned up and the buckets adjusted.
* feat: improve AI metrics

  This commit improves the AI metrics so that they are easier to work with.
* feat(ai): log no capacity error to metrics

  This commit ensures that an error is logged when the Gateway could not find orchestrators for a given model and capability.
* feat(ai): add TicketValueSent and TicketsSent metrics

  This commit ensures that the `ticket_value_sent` and `tickets_sent` metrics are also created for an AI Gateway.
* fix(ai): ensure that AI metrics have orch address label

  This commit ensures that the AI gateway metrics contain the orch address label.
* feat(ai): add orchestrator AI census metrics

  This commit introduces a suite of AI orchestrator metrics to the census module, mirroring those received by the Gateway. The newly added metrics include `ai_models_requested`, `ai_request_latency_score`, `ai_request_price`, and `ai_request_errors`, facilitating comprehensive tracking and analysis of AI request handling performance on the orchestrator side.
* refactor: improve orchestrator metrics tags

  This commit ensures that the right tags are attached to the Orchestrator AI metrics.
* refactor(ai): improve latency score calculations

  This commit ensures that no divide-by-zero errors can occur in the latency score calculations.

---------

Co-authored-by: Elite Encoder <[email protected]>

eliteprox · 2024-08-12T21:43:00Z

Closing in favor of changes coming in #3106

eliteprox added 2 commits June 24, 2024 21:41

adding gpu flag to models config

ed5a108

Finish testing of warm models with gpu parameter, Cold model support wip

45c4837

github-actions bot added the "AI" label (Issues and PR related to the AI-video branch.) Jun 25, 2024

ad-astra-video and others added 10 commits July 26, 2024 11:36

chore: update Image2Image and Upscale OS storage to use requestID similar to Text2Image and Image2Video (livepeer#3092)

82ee7c6

fix(ai): account for number of images in I2I latency score (livepeer#3093)

c4b0182

This commit ensures that the I2I pipeline latency score calculation now considers the number of images.

feat(ai): add 'num_inference_steps' to I2I, I2V and upscale pipelines (livepeer#3099)

307bf53

This commit adds support for the `num_inference_steps` parameter to the I2I, I2V and upscale pipelines. It also fixes an incorrect latencyScore calculation for the bytedance model.

feat(ai): add A2T gateway metrics (livepeer#3100)

f8a1cd0

This commit adds the gateway metrics to the Audio-to-text pipeline.

ci(ai): improve ci comments

4aefc25

This commit applies some small comment changes to ease the conflicts between the main and ai-video branch.

adding gpu flag to models config

780f9cc

rebase

0685c10

eliteprox closed this Aug 12, 2024
