[bug] Example Data passing pipeline not working with AWS setup #11310

Open
asaff1 opened this issue Oct 18, 2024 · 1 comment

asaff1 commented Oct 18, 2024

Environment

Steps to reproduce

Run the example "Data passing in python" pipeline. It fails at the first step, 'preprocess', with the error:
"preprocess/output_dataset_one": failed to close Writer for bucket: blob (key "preprocess/output_dataset_one") (code=Unknown): MissingRegion: could not find region configuration

Additional info

The S3 bucket exists, and the region and AWS keys were set correctly in params.env and secret.env.
I do see main.log created in the S3 bucket, which means the connection to S3 works.

Full Log of the "preprocess" step:

time="2024-10-18T09:17:32.751Z" level=info msg="capturing logs" argo=true
I1018 09:17:32.772345      15 cache.go:116] Connecting to cache endpoint 10.100.224.77:8887
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: 
https://pip.pypa.io/warnings/venv
[KFP Executor 2024-10-18 09:17:40,567 INFO]: Looking for component `preprocess` in --component_module_path `/tmp/tmp.wI3ac7VQmd/ephemeral_component.py`
[KFP Executor 2024-10-18 09:17:40,567 INFO]: Loading KFP component "preprocess" from /tmp/tmp.wI3ac7VQmd/ephemeral_component.py (directory "/tmp/tmp.wI3ac7VQmd" and module name "ephemeral_component")
[KFP Executor 2024-10-18 09:17:40,568 INFO]: Got executor_input:
{
    "inputs": {
        "parameterValues": {
            "message": "message"
        }
    },
    "outputs": {
        "parameters": {
            "output_bool_parameter_path": {
                "outputFile": "/tmp/kfp/outputs/output_bool_parameter_path"
            },
            "output_dict_parameter_path": {
                "outputFile": "/tmp/kfp/outputs/output_dict_parameter_path"
            },
            "output_list_parameter_path": {
                "outputFile": "/tmp/kfp/outputs/output_list_parameter_path"
            },
            "output_parameter_path": {
                "outputFile": "/tmp/kfp/outputs/output_parameter_path"
            }
        },
        "artifacts": {
            "output_dataset_one": {
                "artifacts": [
                    {
                        "type": {
                            "schemaTitle": "system.Dataset",
                            "schemaVersion": "0.0.1"
                        },
                        "uri": "s3://my-kubeflow-bucket/tutorial-data-passing/d5e52188-0cff-4cdc-a4ad-88fa9f0a9fbd/preprocess/output_dataset_one"
                    }
                ]
            },
            "output_dataset_two_path": {
                "artifacts": [
                    {
                        "type": {
                            "schemaTitle": "system.Dataset",
                            "schemaVersion": "0.0.1"
                        },
                        "uri": "s3://my-kubeflow-bucket/tutorial-data-passing/d5e52188-0cff-4cdc-a4ad-88fa9f0a9fbd/preprocess/output_dataset_two_path"
                    }
                ]
            }
        },
        "outputFile": "/tmp/kfp_outputs/output_metadata.json"
    }
}
I1018 09:17:40.639542      15 launcher_v2.go:704] ExecutorOutput: {
  "artifacts": {
    "output_dataset_one": {
      "artifacts": [
        {
          "name": "",
          "uri": "s3://my-kubeflow-bucket/tutorial-data-passing/d5e52188-0cff-4cdc-a4ad-88fa9f0a9fbd/preprocess/output_dataset_one",
          "metadata": {}
        }
      ]
    },
    "output_dataset_two_path": {
      "artifacts": [
        {
          "name": "",
          "uri": "s3://my-kubeflow-bucket/tutorial-data-passing/d5e52188-0cff-4cdc-a4ad-88fa9f0a9fbd/preprocess/output_dataset_two_path",
          "metadata": {}
        }
      ]
    }
  }
}
I1018 09:17:40.660890      15 launcher_v2.go:150] publish success.
F1018 09:17:40.660930      15 main.go:49] failed to execute component: failed to upload output artifact "output_dataset_one" to remote storage URI "s3://my-kubeflow-bucket/tutorial-data-passing/d5e52188-0cff-4cdc-a4ad-88fa9f0a9fbd/preprocess/output_dataset_one": uploadFile(): unable to complete copying "/s3/my-kubeflow-bucket/tutorial-data-passing/d5e52188-0cff-4cdc-a4ad-88fa9f0a9fbd/preprocess/output_dataset_one" to remote storage "preprocess/output_dataset_one": failed to close Writer for bucket: blob (key "preprocess/output_dataset_one") (code=Unknown): MissingRegion: could not find region configuration
time="2024-10-18T09:17:40.753Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 1

Expected result

The pipeline should run successfully.

Impacted by this bug? Give it a 👍.

asaff1 commented Oct 20, 2024

UPDATE:
The solution was to add the following to the kfp-launcher ConfigMap:

  providers: |-
    s3:
      default:
        endpoint: s3.amazonaws.com
        disableSSL: false
        region: us-east-2
        credentials:
          fromEnv: false
          secretRef:
            secretName: mlpipeline-minio-artifact
            accessKeyKey: accesskey
            secretKeyKey: secretkey

This is in addition to the defaultPipelineRoot.
I must say that I would expect this to be generated from the AWS configuration I linked above.
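
For anyone hitting the same MissingRegion error, here is a rough sketch of how the complete kfp-launcher ConfigMap might look with both keys set. The `providers` value is the one from the fix above. The namespace is a hypothetical placeholder, and the `defaultPipelineRoot` value is only inferred from the artifact URIs in the log, so treat both as assumptions and adjust them to your deployment.

```yaml
# Sketch only: metadata.namespace and defaultPipelineRoot are assumptions,
# not values confirmed in this issue.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kfp-launcher
  namespace: my-profile-namespace   # hypothetical: the namespace the run executes in
data:
  # Assumed root, inferred from the s3://my-kubeflow-bucket/... URIs in the log
  defaultPipelineRoot: s3://my-kubeflow-bucket
  # Explicit S3 provider settings; the region here is what resolved MissingRegion
  providers: |-
    s3:
      default:
        endpoint: s3.amazonaws.com
        disableSSL: false
        region: us-east-2
        credentials:
          fromEnv: false
          secretRef:
            secretName: mlpipeline-minio-artifact
            accessKeyKey: accesskey
            secretKeyKey: secretkey
```

The secret named in `secretRef` has to exist in the same namespace and contain the `accesskey` and `secretkey` keys.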

@asaff1 asaff1 closed this as completed Oct 20, 2024
@asaff1 asaff1 reopened this Oct 20, 2024