[backend] Retry not working in 2.3.0 #11329

Open
asaff1 opened this issue Oct 25, 2024 · 0 comments

asaff1 commented Oct 25, 2024

Environment

Steps to reproduce

Create any pipeline that fails; I used the simple pipeline below. Upload it from the UI with "Upload pipeline", then start a run of this pipeline with fail = true. The run fails as expected. Click the "Retry" button in the UI: the workflow gets stuck at Pending and no retry is performed.

import os
from kfp import dsl
from kfp.dsl import Dataset, Model, Input, Output
from typing import Optional, List

@dsl.component(base_image="python:3.9")
def say_hello(name: str, number: int, opt_int: Optional[int]) -> List[str]:
    hello_text = f'Hello, {name}! number={number} opt_int={opt_int}'
    print(hello_text)
    return ["hello", name, str(number), str(opt_int)]
    #return hello_text


@dsl.component(base_image="python:3.9")
def say_hello_list(list_hello: List[str]) -> str:
    hello_text = f'Hello list_hello={list_hello}'
    print(hello_text)
    return hello_text


@dsl.component(base_image="python:3.9", packages_to_install=['pandas==1.3.5', "numpy==1.*"])
def create_dataset(iris_dataset: Output[Dataset], fail: bool):
    import pandas as pd
    import random
    r = random.random()
    # print("Failing at random < 0.5. r=", r)
    # if r < 0.5:
    #     raise ValueError(f"Failing at random! r={r}")
    if fail:
        raise ValueError("fail=true, failing!")
    df = pd.DataFrame({"name": ["a", "b", "c"], "age": [11, 12, 13]})
    with open(iris_dataset.path, 'w') as f:
        df.to_csv(f)

@dsl.component(base_image="python:3.9", packages_to_install=['pandas==1.3.5', "numpy==1.*"])
def print_dataset(ds: Input[Dataset], a: int):
    print(f"hello a={a}")
    print(ds)
   
@dsl.pipeline
def hello_pipeline(recipient: str, number: int, opt_int: Optional[int], fail: bool) -> str:
    hello_task = say_hello(name=recipient, number=number, opt_int=opt_int)
    hello_list_task = say_hello_list(list_hello=hello_task.output)
    create_dataset_task = create_dataset(fail=fail)
    create_dataset_task.set_caching_options(False)
    print_dataset(ds=create_dataset_task.output, a=44)
    return hello_list_task.output


# Compile the pipeline to a YAML file named after this script (e.g. foo.py -> foo.yaml).
from kfp import compiler
compiler.Compiler().compile(hello_pipeline, os.path.basename(__file__).replace(".py", ".yaml"))
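
To help narrow down whether the hang comes from the UI's "Retry" button or from the backend itself, the retry could also be requested directly over REST. The following is only a minimal sketch under assumptions: the route `/apis/v2beta1/runs/{run_id}:retry` is assumed to be the v2beta1 retry endpoint and should be checked against the API reference for the deployed version; `host` and `run_id` are hypothetical placeholders, and any authentication the deployment requires is not handled.

```python
import urllib.request


def build_retry_request(host: str, run_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the backend to retry a run.

    Assumption: the retry route is /apis/v2beta1/runs/{run_id}:retry.
    """
    url = f"{host.rstrip('/')}/apis/v2beta1/runs/{run_id}:retry"
    return urllib.request.Request(
        url,
        data=b"{}",  # empty JSON body
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def retry_run(host: str, run_id: str) -> int:
    """Send the retry request and return the HTTP status code."""
    with urllib.request.urlopen(build_retry_request(host, run_id)) as resp:
        return resp.status
```

For example, `retry_run("http://localhost:8080", "<run-id>")` against a port-forwarded API endpoint (hypothetical address) would issue the request. If the run also stays pending after this call, the problem is more likely in the backend than in the UI.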

Expected result

Retry should work.
By the way, I previously deployed Kubeflow Pipelines 2.0.3, and retry worked as expected there, so this looks like a regression introduced in one of the more recent releases.

Materials and Reference


Impacted by this bug? Give it a 👍.
