
[bug] Async streaming validation duplicates output in the presence of multiple validators #1090

Open
JosephCatrambone opened this issue Sep 25, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@JosephCatrambone
Contributor

Quick thank you to Discord user new for reporting this. Link to the thread: https://discord.com/channels/1085077079697150023/1288085320805388298/1288085864521666591

Describe the bug
When AsyncGuard.use_many is given multiple validators (in this case, DetectPII and ToxicLanguage) with on_fail="fix", the streamed output is duplicated.

To Reproduce
Steps to reproduce the behavior:

import asyncio
import os

from dotenv import load_dotenv
import litellm
import openai
import guardrails

from guardrails.hub import DetectPII, ToxicLanguage


# Load environment variables
load_dotenv()
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_key = os.getenv("OPENAI_API_KEY")


# VERSION 1: a single validator
guard = guardrails.AsyncGuard().use_many(
    DetectPII(pii_entities="pii", on_fail="fix")
)

# VERSION 2: two validators (reproduces the duplicated output)
# guard = guardrails.AsyncGuard().use_many(
#     DetectPII(pii_entities="pii", on_fail="fix"), ToxicLanguage(on_fail="fix")
# )

async def generate_text():
    # Stream completions from the model through the guard.
    fragment_generator = await guard(
        litellm.acompletion,
        api_key=openai.api_key,
        api_base=openai.api_base,
        model="openai/mistralai/Mistral-Nemo-Instruct-2407",
        messages=[
            {"role": "system", "content": "Only write my sentences provided please and nothing else please."},
            {
                "role": "user",
                "content": """Peter is funny and lives in New York. My name is Peter. Who are you Brian ?""",
            },
        ],
        max_tokens=1024,
        temperature=0,
        stream=True,
    )

    # Accumulate the validated output chunk by chunk.
    text = ""
    async for op in fragment_generator:
        print(op)
        await asyncio.sleep(0)
        text += op.validated_output

    print(text)


# Run the async function to generate text
asyncio.run(generate_text())

Expected behavior
Example model output: "My friend Alex is a researcher at Purdue University. (Numerous Obscenities)"
Expected cleaned output: "My friend is a researcher at ."
Observed output: "My friend My friend friend is a researcher at friend is a researcher at ."
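
Possible client-side workaround while this is open (a minimal sketch, not part of the Guardrails API; append_non_overlapping is a hypothetical helper): before concatenating, trim the part of each chunk's validated_output that overlaps the text already accumulated. This removes simple repeated prefixes, though it may not recover the exact expected text if validators re-emit fragments out of order.

def append_non_overlapping(accumulated: str, chunk: str) -> str:
    """Append `chunk`, skipping the longest prefix of `chunk` that already
    matches a suffix of `accumulated`. Hypothetical helper, not a library fix."""
    if not chunk:
        return accumulated
    # Find the longest overlap between the end of `accumulated`
    # and the start of `chunk`, then append only the new suffix.
    max_overlap = min(len(accumulated), len(chunk))
    for size in range(max_overlap, 0, -1):
        if accumulated.endswith(chunk[:size]):
            return accumulated + chunk[size:]
    return accumulated + chunk

# In the repro loop, replace `text += op.validated_output` with:
#     text = append_non_overlapping(text, op.validated_output)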

Library version:
Guardrails 0.5.10

Additional context
Happens in a notebook and in a terminal.

@JosephCatrambone JosephCatrambone added the bug Something isn't working label Sep 25, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the Stale label Oct 26, 2024
@JosephCatrambone
Contributor Author

CC @nichwch I think your async changes fixed this, right?

@github-actions github-actions bot removed the Stale label Nov 1, 2024