[bug] MLFlow Validator traces are orphaned from parent traces when running as async #1163

CalebCourier · 2024-11-14T16:55:47Z

Describe the bug
When validators run asynchronously, whether in an AsyncGuard or in a Guard when an async event loop is available (the default behaviour), the spans created during a validator's validate method are orphaned from their parent spans. This is likely because the function runs in an executor that is not given the context holding the parent span.
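To illustrate the suspected mechanism, here is a minimal stdlib-only sketch. It assumes the tracer stores its current span in a contextvars.ContextVar, which is how OpenTelemetry's Python context works by default as far as I know. The names current_span, validate and run_validators are hypothetical stand-ins, not Guardrails or MLflow APIs.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the tracer's "current span" slot.
current_span = contextvars.ContextVar("current_span", default=None)


def validate():
    # Stand-in for Validator.validate: a child span would look up its parent here.
    return current_span.get()


async def run_validators():
    loop = asyncio.get_running_loop()
    current_span.set("guard-span")  # the parent span set by the Guard
    with ThreadPoolExecutor(max_workers=1) as pool:
        # run_in_executor does not copy the caller's context, so on a standard
        # CPython build the worker thread sees the default value (no parent).
        orphaned = await loop.run_in_executor(pool, validate)
        # Copying the caller's context and running the function inside it
        # makes the parent visible in the worker thread.
        ctx = contextvars.copy_context()
        attached = await loop.run_in_executor(pool, ctx.run, validate)
    return orphaned, attached


result = asyncio.run(run_validators())
print(result)
```

The first call loses the parent and the second keeps it, which matches the orphaned-trace symptom. Whether the actual fix should copy the context or move the instrumentation is a separate design question.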

To Reproduce

import os
import mlflow
from rich import print
from guardrails import Guard
from guardrails.hub import RestrictToTopic, ValidLength
from guardrails.integrations.databricks import MlFlowInstrumentor

# Setup some environment variables for the LLM
os.environ["DATABRICKS_API_KEY"] = os.environ.get("DATABRICKS_TOKEN", "your-databricks-key")
os.environ["DATABRICKS_API_BASE"] = os.environ.get("DATABRICKS_HOST", "https://abc-123ab12a-1234.cloud.databricks.com")

# Run this in the terminal
# ! mlflow server --host localhost --port 8080

mlflow.set_tracking_uri(uri="http://localhost:8080")

MlFlowInstrumentor(experiment_name="MLFlow x Guardrails Async").instrument()


guard = Guard(name='content-guard').use_many(
    RestrictToTopic(valid_topics=["computer programming", "computer science", "algorithms"], disable_llm=True, on_fail="exception"),
    ValidLength(min=1, max=150, on_fail="exception")
)


instructions = { "role": "system", "content": "You are a helpful assistant that gives advice about writing clean code and other programming practices." }
prompt = "Write a short summary about recursion in less than 100 characters."

try:
    response = guard(
        model="databricks/databricks-dbrx-instruct",
        messages=[instructions, { "role":"user", "content": prompt }],
    )
    print(response)
except Exception as e:
    print(e)


# Go to http://localhost:8080
# Select the MLFlow x Guardrails Async experiment
# Go to Traces tab and see that there are 3 traces not just one, the latter two of which are from the validators and are unattached from the first trace which holds the Guard hierarchy.

Expected behavior
Spans should maintain relationships regardless of sync vs async.

Library version:
0.6.0
Likely all versions since https://github.com/guardrails-ai/guardrails/releases/tag/v0.5.10 for Guard
All versions before 0.5.10 for AsyncGuard if run_in_separate_process was set to True

Additional context
This behaviour was noticed with the MlFlowInstrumentor, but is likely present with all OTEL instrumentations, though this should be tested.


CalebCourier · 2024-11-14T17:44:12Z

Upon further investigation, this issue is not present in our other instrumentations because the function being wrapped for tracing in those cases is outside of the ThreadPoolExecutor; i.e., there we wrap the function that calls Validator.validate, not Validator.validate itself as we do in the MlFlowInstrumentor.
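The difference between the two wrapping points can be sketched as follows. This is an illustration under assumptions, not the Guardrails code: traced is a hypothetical decorator that records a span and the parent it found, and run_validator stands in for the function that submits Validator.validate to the pool.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor
from functools import wraps

# Hypothetical stand-in for the tracer's "current span" slot.
current_span = contextvars.ContextVar("current_span", default=None)
recorded = []  # (span name, parent found when the span was opened)


def traced(name):
    """Hypothetical tracing decorator: opens a span around the wrapped call."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            recorded.append((name, current_span.get()))
            token = current_span.set(name)
            try:
                return fn(*args, **kwargs)
            finally:
                current_span.reset(token)
        return wrapper
    return decorator


def validate(value):
    # Stand-in for Validator.validate.
    return len(value) <= 150


def run_validator(value, pool, fn):
    # Stand-in for the caller that hands the validator to the executor.
    return pool.submit(fn, value).result()


current_span.set("guard")
with ThreadPoolExecutor(max_workers=1) as pool:
    # Wrapping validate itself: the span opens in the worker thread.
    run_validator("hello", pool, traced("validate")(validate))
    # Wrapping the caller: the span opens in the thread that holds the parent.
    traced("run_validator")(run_validator)("hello", pool, validate)

print(recorded)
```

On a standard CPython build the span opened inside the worker thread finds no parent, while the span opened around the caller finds "guard".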

CalebCourier added the bug label Nov 14, 2024

CalebCourier mentioned this issue Nov 14, 2024

Hotfix for MLFlow validator spans during async execution #1164

Merged
