Having trouble replicating IrisVectorStore Llama Index demo from iris-vector-search for my program's user table #10

ericmariasis opened this issue Jul 22, 2024 · 0 comments

I'll preface this by saying I'm not sure whether I've found a bug or whether I'm somehow misusing IRISVectorStore. Also, for testing you'll probably need an OpenAI token.

Basically, I have code using the regular llama-index module working in my Python project, with SimpleDirectoryReader objects similar in nature to the demo I mentioned (https://github.com/intersystems-community/iris-vector-search/blob/main/d...). I also have other code working (not shown) that can add new users to a SQL table in IRIS.
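
For context, here is a minimal sketch of the shape of that (not shown) insert code; the connection string, column names, and values are hypothetical placeholders, not my actual schema:

from sqlalchemy import create_engine, text

# Hypothetical sketch only; the real columns/values differ in my project
url = "iris://username:password@hostname:1972/NAMESPACE"
engine = create_engine(url)
with engine.begin() as conn:  # begin() commits automatically on success
    conn.execute(
        # "user" is quoted because USER is a reserved word in SQL
        text('INSERT INTO "user" (name, email) VALUES (:name, :email)'),
        {"name": "example", "email": "example@example.com"},
    )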

I tried to use IRISVectorStore in a manner similar to the excerpt from the demo code below, except that I changed the table name to the name of my user table and swapped the documents object in that code for my own SimpleDirectoryReader output.

However, no matter how many times I run it with those changes, I get a flurry of exceptions whose trace makes little sense to me. I can confirm that the code I have in place to connect to my user table locally does work. I'll include the trace at the bottom.

# StorageContext captures how vectors will be stored
vector_store = IRISVectorStore.from_params(
    connection_string = url,
    table_name = "paul_graham_essay",
    embed_dim = 1536,  # openai embedding dimension
    engine_args = { "connect_args": {"sslcontext": sslcontext} }
)

Below is the entire code module where I use LlamaIndex. You can see the block of commented-out code I tried to add in run_query_on_files, in addition to the setup steps above, similar to the demo.

import textwrap

import nest_asyncio
from openai import OpenAIError
from pydantic import ValidationError

nest_asyncio.apply()

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.llms.openai import OpenAI

from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_iris import IRISVectorStore

import os
from .myconfig import *

os.environ["OPENAI_API_KEY"] = f'{OPENAI_API_KEY}'

username = f'{DB_USER}'
password = f'{DB_PASS}'
hostname = os.getenv('IRIS_HOSTNAME', f'{DB_URL}')
port = f'{DB_PORT}'
namespace = f'{DB_NAMESPACE}'

from llama_index.core import Settings

Settings.llm = OpenAI(temperature=0.2, model="gpt-3.5-turbo")

import ssl

certificateFile = "/usr/cert-demo/certificateSQLaaS.pem"

if os.path.exists(certificateFile):
    print("Located SSL certificate at '%s', initializing SSL configuration" % certificateFile)
    sslcontext = ssl.create_default_context(cafile=certificateFile)
else:
    print("No certificate file found, continuing with insecure connection")
    sslcontext = None

from sqlalchemy import create_engine, text

url = f"iris://{username}:{password}@{hostname}:{port}/{namespace}"

engine = create_engine(url, connect_args={"sslcontext": sslcontext})
with engine.connect() as conn:
    print(conn.execute(text("SELECT 'hello world!'")).first()[0])

# StorageContext captures how vectors will be stored
vector_store = IRISVectorStore.from_params(
    connection_string = url,
    table_name = "user",
    embed_dim = 1536,  # openai embedding dimension
    engine_args = { "connect_args": {"sslcontext": sslcontext} }
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
def get_filename_before_dot(filename):
    name, extension = os.path.splitext(filename)
    return name


def run_query_on_files(files, query):
    # Check if the OpenAI API key is provided
    if not os.getenv("OPENAI_API_KEY"):
        return "Cannot run model. No API key provided."

    try:
        queryEngineTools = []
        for file in files:
            curDoc = SimpleDirectoryReader(input_files=[file]).load_data()
            # index = VectorStoreIndex.from_documents(
            #     curDoc,
            #     storage_context=storage_context,
            #     show_progress=True,
            # )
            # query_engine = index.as_query_engine()
            # userResp = query_engine.query("Summarize this content.")
            # print(textwrap.fill(str(userResp), 100))
            curVectorStore = VectorStoreIndex.from_documents(curDoc)
            curEngine = curVectorStore.as_query_engine(similarity_top_k=3)
            curTool = QueryEngineTool(query_engine=curEngine, metadata=ToolMetadata(
                name=get_filename_before_dot(file),
                description=get_filename_before_dot(file)
            ))
            queryEngineTools.append(curTool)

        if len(files) > 0:
            s_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=queryEngineTools)
            response = s_engine.query(query)
            return response
        return ''
    except OpenAIError as e:
        return "Cannot run model. Invalid API key or other OpenAI error."
    except ValidationError as e:
        print(f"Validation error: {str(e)}")
        return "Validation error occurred."
    except Exception as e:
        print(f"An unexpected error occurred: {str(e)}")
        return "An unexpected error occurred."

And here is a trace of the error I get.

Parsing nodes: 100%|████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 991.33it/s]
Generating embeddings: 100%|█████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.29s/it]
An unexpected error occurred: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)
Parsing nodes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1009.22it/s]
Generating embeddings: 100%|█████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.74it/s] 
An unexpected error occurred: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)
Parsing nodes: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s] 
Generating embeddings: 100%|█████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  8.55it/s] 
An unexpected error occurred: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)
Parsing nodes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1009.70it/s] 
Generating embeddings: 100%|█████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  5.21it/s] 
An unexpected error occurred: 1 validation error for NodeWithScore
node
  Can't instantiate abstract class BaseNode with abstract methods get_content, get_metadata_str, get_type, hash, set_content (type=type_error)

My question is basically: does anybody know for sure whether IRISVectorStore can successfully extract information from a user table, or might I have hit some weird edge case in trying to use it this way?
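
For anyone trying to reproduce, this is the minimal path I would expect to show the problem, lifted from the commented-out block in the module above (so it only uses objects already defined there; the file path is just a placeholder):

# Assumes the imports, storage_context, etc. from the module above
curDoc = SimpleDirectoryReader(input_files=["somefile.pdf"]).load_data()
index = VectorStoreIndex.from_documents(
    curDoc,
    storage_context=storage_context,  # backed by IRISVectorStore on the "user" table
    show_progress=True,
)
query_engine = index.as_query_engine()
print(textwrap.fill(str(query_engine.query("Summarize this content.")), 100))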
