Integration with OCI Generative AI and OpenSearch

Added in version 2.9.1.

OCI Generative Embedding

The Generative AI Embedding Models convert textual input, ranging from phrases and sentences to entire paragraphs, into a structured format known as embeddings: each piece of input text is transformed into a numerical array of 1024 values. The following pretrained model is available for creating text embeddings:

  • embed-english-light-v2.0

To find out which embedding models are currently supported, check the documentation.

The following code snippet shows how to use the Generative AI Embedding Models:

import ads
from ads.llm import GenerativeAIEmbeddings

ads.set_auth("resource_principal")

oci_embeddings = GenerativeAIEmbeddings(
    compartment_id="ocid1.compartment.####",
    client_kwargs=dict(service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
)
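
The resulting object implements the standard LangChain embeddings interface. As a quick check, you can embed a single query string (the text below is just a placeholder):

# Returns one vector of 1024 floats for the input string.
vector = oci_embeddings.embed_query("What is a large language model?")
print(len(vector))  # 1024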

Retrieval QA with OpenSearch

OCI OpenSearch

OCI Search with OpenSearch is a fully managed service that makes it fast and easy to search vast datasets. In the large language model world, you can use it as a vector store: store your documents in it and run keyword or semantic search with the help of a text embedding model. For a complete walkthrough on spinning up an OCI OpenSearch cluster, see Search and visualize data using OCI Search Service with OpenSearch.

Semantic Search with OCI OpenSearch

With OCI OpenSearch and the OCI Generative AI Embedding model, you can perform semantic search using LangChain. The following code snippet shows how to do semantic search with OCI OpenSearch:

from langchain.vectorstores import OpenSearchVectorSearch
import os
# Saving credentials as environment variables is not recommended. In production, store them in Vault instead.
os.environ['OCI_OPENSEARCH_USERNAME'] = "username"
os.environ['OCI_OPENSEARCH_PASSWORD'] = "password"
os.environ['OCI_OPENSEARCH_VERIFY_CERTS'] = "False"

# specify the index name that you would like to conduct semantic search on.
INDEX_NAME = "your_index_name"

opensearch_vector_search = OpenSearchVectorSearch(
    "https://localhost:9200", # your OCI OpenSearch private endpoint
    embedding_function=oci_embeddings,
    index_name=INDEX_NAME,
    engine="lucene",
    http_auth=(os.environ["OCI_OPENSEARCH_USERNAME"], os.environ["OCI_OPENSEARCH_PASSWORD"]),
    verify_certs=os.environ["OCI_OPENSEARCH_VERIFY_CERTS"] == "True", # convert the "True"/"False" string to a bool
)
opensearch_vector_search.similarity_search("your query", k=2, size=2)
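
The snippet above assumes the index already contains embedded documents. If you are starting from an empty index, one way to ingest documents is through the same LangChain class; a minimal sketch with placeholder texts:

# add_texts embeds the inputs with oci_embeddings and bulk-indexes them.
opensearch_vector_search.add_texts(
    ["your first document", "your second document"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
)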

Retrieval QA Using OCI OpenSearch as a Retriever

The raw search results usually cannot be used directly to answer a specific question. A more practical solution is to send the original query, along with the retrieved results, to a large language model to get a more coherent answer. To do that, you can use OCI OpenSearch as a retriever for retrieval QA. The following code snippet shows how to use OCI OpenSearch as a retriever:

from langchain.chains import RetrievalQA
from ads.llm import GenerativeAI
import ads

ads.set_auth("resource_principal")

oci_llm = GenerativeAI(
    compartment_id="ocid1.compartment.####",
    client_kwargs=dict(service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
)

retriever = opensearch_vector_search.as_retriever(search_kwargs={"vector_field": "embeds",
                                                                "text_field": "text",
                                                                "k": 3,
                                                                "size": 3})
qa = RetrievalQA.from_chain_type(
    llm=oci_llm,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={
        "verbose": True
    }
)
qa.run("your question")
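
qa.run returns only the final answer string. If you also want to inspect which chunks the answer was based on, RetrievalQA can return the source documents; pass return_source_documents=True and call the chain with a dict:

qa = RetrievalQA.from_chain_type(
    llm=oci_llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)
result = qa({"query": "your question"})
result["result"]            # the generated answer
result["source_documents"]  # the retrieved chunks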

Retrieval QA with FAISS

FAISS as Vector DB

Often, your documents are not that large and you don't have an OCI OpenSearch cluster set up. In that case, you can use FAISS as your in-memory vector store, which can also do similarity search very efficiently.

The following code snippet shows how to use FAISS along with OCI Embedding Model to do semantic search:

from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

loader = TextLoader("your.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
docs = text_splitter.split_documents(documents)

l = len(docs)
embeddings = []
# Embed the documents in batches of 16 to stay within the service's per-request input limit.
for i in range(0, l, 16):
    subdocs = [item.page_content for item in docs[i : i + 16]]
    embeddings.extend(oci_embeddings.embed_documents(subdocs))

texts = [item.page_content for item in docs]
text_embedding_pairs = list(zip(texts, embeddings))
db = FAISS.from_embeddings(text_embedding_pairs, oci_embeddings)
db.similarity_search("your query", k=2)  # FAISS takes k; "size" applies only to OpenSearch
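
Since re-embedding a large document set costs additional service calls, you may want to persist the FAISS index to disk and reload it later. A minimal sketch (the folder name is arbitrary):

db.save_local("faiss_index")
db = FAISS.load_local("faiss_index", oci_embeddings)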

Retrieval QA Using FAISS Vector Store as a Retriever

Similarly, you can use the FAISS vector store as a retriever to build a retrieval QA engine using LangChain. The following code snippet shows how to use FAISS as a retriever:

from langchain.chains import RetrievalQA
from ads.llm import GenerativeAI
import ads

ads.set_auth("resource_principal")

oci_llm = GenerativeAI(
    compartment_id="ocid1.compartment.####",
    client_kwargs=dict(service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
)
retriever = db.as_retriever()
qa = RetrievalQA.from_chain_type(
    llm=oci_llm,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={
        "verbose": True
    }
)
qa.run("your question")
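
By default, db.as_retriever() returns the top 4 most similar chunks for each question. You can tune this through search_kwargs, for example:

retriever = db.as_retriever(search_kwargs={"k": 3})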

Deployment of Retrieval QA

As of version 0.0.346, LangChain does not support serialization of any vector stores. This is a problem when you want to deploy a retrieval QA LangChain application. To solve this problem, we extended serialization support to the following vector stores:

  • OpenSearchVectorSearch

  • FAISS

OpenSearchVectorSearch Serialization

LangChain does not support serialization of OpenSearchVectorSearch out of the box, so ADS provides a way to serialize it. To serialize OpenSearchVectorSearch, you need to store the credentials in environment variables. The following parameters can be passed in through the corresponding environment variables:

  • http_auth: (OCI_OPENSEARCH_USERNAME, OCI_OPENSEARCH_PASSWORD)

  • verify_certs: OCI_OPENSEARCH_VERIFY_CERTS

  • ca_certs: OCI_OPENSEARCH_CA_CERTS

The following code snippet shows how to use OpenSearchVectorSearch with environment variables:

from langchain.vectorstores import OpenSearchVectorSearch
import os

os.environ['OCI_OPENSEARCH_USERNAME'] = "username"
os.environ['OCI_OPENSEARCH_PASSWORD'] = "password"
os.environ['OCI_OPENSEARCH_VERIFY_CERTS'] = "False"

INDEX_NAME = "your_index_name"
opensearch_vector_search = OpenSearchVectorSearch(
    "https://localhost:9200",
    embedding_function=oci_embeddings,
    index_name=INDEX_NAME,
    engine="lucene",
    http_auth=(os.environ["OCI_OPENSEARCH_USERNAME"], os.environ["OCI_OPENSEARCH_PASSWORD"]),
    verify_certs=os.environ["OCI_OPENSEARCH_VERIFY_CERTS"] == "True", # convert the "True"/"False" string to a bool
)

During deployment, it is very important that you remember to pass in those environment variables as well, or better, retrieve them from the Vault in score.py, which is the recommended and more secure approach:

model.deploy(deployment_log_group_id="ocid1.loggroup.####",
        deployment_access_log_id="ocid1.log.####",
        deployment_predict_log_id="ocid1.log.####",
        environment_variables={"OCI_OPENSEARCH_USERNAME": "<oci_opensearch_username>",
                               "OCI_OPENSEARCH_PASSWORD": "<oci_opensearch_password>",
                               "OCI_OPENSEARCH_VERIFY_CERTS": "<oci_opensearch_verify_certs>"})

Deployment of Retrieval QA with OpenSearch

Here is an example code snippet for deployment of Retrieval QA using OpenSearch as a retriever:

from ads.llm import GenerativeAIEmbeddings, GenerativeAI
from ads.llm.deploy import ChainDeployment
from langchain.chains import RetrievalQA
from langchain.vectorstores import OpenSearchVectorSearch

import ads
import os

ads.set_auth("resource_principal")

oci_embeddings = GenerativeAIEmbeddings(
    compartment_id="ocid1.compartment.####",
    client_kwargs=dict(service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
)

oci_llm = GenerativeAI(
    compartment_id="ocid1.compartment.####",
    client_kwargs=dict(service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
)
# Saving credentials as environment variables is not recommended. In production, store them in Vault instead.
os.environ['OCI_OPENSEARCH_USERNAME'] = "username"
os.environ['OCI_OPENSEARCH_PASSWORD'] = "password"
os.environ['OCI_OPENSEARCH_VERIFY_CERTS'] = "True" # make sure this is capitalized.
os.environ['OCI_OPENSEARCH_CA_CERTS'] = "path/to/oci_opensearch_ca.pem"

INDEX_NAME = "your_index_name"
opensearch_vector_search = OpenSearchVectorSearch(
    "https://localhost:9200", # your endpoint
    embedding_function=oci_embeddings,
    index_name=INDEX_NAME,
    engine="lucene",
    http_auth=(os.environ["OCI_OPENSEARCH_USERNAME"], os.environ["OCI_OPENSEARCH_PASSWORD"]),
    verify_certs=os.environ["OCI_OPENSEARCH_VERIFY_CERTS"] == "True", # convert the "True"/"False" string to a bool
    ca_certs=os.environ["OCI_OPENSEARCH_CA_CERTS"],
)

retriever = opensearch_vector_search.as_retriever(search_kwargs={"vector_field": "embeds",
                                                                "text_field": "text",
                                                                "k": 3,
                                                                "size": 3})
qa = RetrievalQA.from_chain_type(
    llm=oci_llm,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={
        "verbose": True
    }
)

model = ChainDeployment(qa)
model.prepare(force_overwrite=True,
        inference_conda_env="<custom_conda_environment_uri>",
        inference_python_version="<python_version>",
        )

model.save()
res = model.verify("your prompt")
model.deploy(deployment_log_group_id="ocid1.loggroup.####",
        deployment_access_log_id="ocid1.log.####",
        deployment_predict_log_id="ocid1.log.####",
        environment_variables={"OCI_OPENSEARCH_USERNAME": "<oci_opensearch_username>",
                               "OCI_OPENSEARCH_PASSWORD": "<oci_opensearch_password>",
                               "OCI_OPENSEARCH_VERIFY_CERTS": "<oci_opensearch_verify_certs>",
                               "OCI_OPENSEARCH_CA_CERTS": "<oci_opensearch_ca_certs>"})

model.predict("your prompt")

Deployment of Retrieval QA with FAISS

Here is an example code snippet for deployment of Retrieval QA using FAISS as a retriever:

from ads.llm import GenerativeAIEmbeddings, GenerativeAI
from ads.llm.deploy import ChainDeployment
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

import ads

ads.set_auth("resource_principal")
oci_embeddings = GenerativeAIEmbeddings(
    compartment_id="ocid1.compartment.####",
    client_kwargs=dict(service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
)

oci_llm = GenerativeAI(
    compartment_id="ocid1.compartment.####",
    client_kwargs=dict(service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com") # this can be omitted after Generative AI service is GA.
)

loader = TextLoader("your.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
docs = text_splitter.split_documents(documents)

l = len(docs)
embeddings = []
# Embed the documents in batches of 16 to stay within the service's per-request input limit.
for i in range(0, l, 16):
    subdocs = [item.page_content for item in docs[i : i + 16]]
    embeddings.extend(oci_embeddings.embed_documents(subdocs))

texts = [item.page_content for item in docs]
text_embedding_pairs = list(zip(texts, embeddings))
db = FAISS.from_embeddings(text_embedding_pairs, oci_embeddings)

retriever = db.as_retriever()
qa = RetrievalQA.from_chain_type(
    llm=oci_llm,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={
        "verbose": True
    }
)

model = ChainDeployment(qa)
model.prepare(force_overwrite=True,
        inference_conda_env="<custom_conda_environment_uri>",
        inference_python_version="<python_version>",
        )

model.save()
res = model.verify("your prompt")
model.deploy(deployment_log_group_id="ocid1.loggroup.####",
        deployment_access_log_id="ocid1.log.####",
        deployment_predict_log_id="ocid1.log.####")

model.predict("your prompt")