Overview
In a previous article, we highlighted all of the new AI features Matillion has released as part of their cloud offering, the Data Productivity Cloud (article here). Like any conversation around AI, things are changing weekly. Thus, the purpose of this article is to take a closer look at one specific type of functionality: Retrieval Augmented Generation (RAG).
What is RAG?
Simply put, RAG is an approach to making LLMs more dynamic. It is a technique in the AI and natural language processing (NLP) fields that combines the strengths of two approaches: information retrieval and generative modeling. A RAG design enhances LLM outputs by fetching the most relevant and recent data from external knowledge bases outside of the model’s training datasets and using that data to ground the generated responses (e.g. bringing specific business data to existing LLMs).
How Does RAG Work?
There are two sides to a RAG design: (1) the front-end inputs from a user (e.g. ticket creation, a source system, or a chat application) and (2) the back-end data management that keeps the knowledge base current. A typical workflow looks like this: a user submits a question, the application retrieves the most relevant context from the knowledge base, and the LLM generates a response grounded in that context.
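To make the request path concrete, here is a minimal sketch in Python. It uses the OpenAI Python SDK for embeddings and chat completions, while `vector_store` is a hypothetical stand-in for whatever vector database backs the knowledge base (Pinecone, Snowflake, Postgres, etc.); model names and prompt wording are illustrative assumptions, not Matillion-specific APIs.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def answer_question(question: str, vector_store) -> str:
    """Minimal RAG request path: embed the question, retrieve context, generate."""
    # 1. Embed the user's question with the same model used to embed the documents.
    question_vector = client.embeddings.create(
        model="text-embedding-3-small",
        input=question,
    ).data[0].embedding

    # 2. Retrieve the most relevant chunks from the external knowledge base.
    #    `vector_store.query` is a placeholder for the vector database client.
    chunks = vector_store.query(vector=question_vector, top_k=5)

    # 3. Augment the prompt with the retrieved context so the LLM answers from
    #    business data rather than only its training set.
    context = "\n\n".join(chunk["text"] for chunk in chunks)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```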
Why is RAG growing in adoption?
RAG has become the go-to framework for reducing hallucinations and for bringing unstructured data alongside existing structured datasets. It is most often used with chatbots and automated response generation, where an LLM drafts the reply to an inquiry. Let’s look at a few use-cases of AI applications being built in the market that run on top of RAG pipelines:
- Chatbots – employees can self-serve information about a given domain; the relevant data is readily available for an LLM to search and quickly surface.
- Accelerate Response Generation – augment an LLM with internal knowledge repositories and prompt the model to generate responses to inquiries.
- New Hire Onboarding – new hires can interact with an application that pulls from a repository of company-specific documents, training materials, and past queries to provide real-time, contextually relevant information.
- Update Documentation / Content Creation – integrate survey, review, and transcript data to better understand your customers’ needs and update product and internal documentation.
With regard to Matillion, they are uniquely positioned for these operational use-cases due to their ingestion, transformation, prompting, and reverse ETL capabilities.
Matillion’s RAG Capabilities
Historically, Matillion has been incredibly strong at integrating and transforming semi-structured data into highly structured datasets thanks to their native connectors and push-down architecture. Additionally, they’ve had ‘reverse ETL’ capabilities to push data from your data warehouse back to source systems.
Now, they have released a suite of features that positions them well for RAG use-cases. With these features, users can ingest unstructured datasets, prompt LLMs from within a pipeline, store the results in a vector store, and continuously update the vectors so the LLMs have the most up-to-date information. Together, these capabilities allow customers to confidently leverage the power of LLMs to generate responses with guardrails that reduce hallucinations. A brief sketch of what the vector upsert and query steps look like in code follows the feature list below.
Matillion features for RAG pipelines include:
- Azure Document Intelligence – ingest data from PDFs and free-hand text
- Azure Speech Transcribe – convert speech to text
- Azure OpenAI Prompt – an API call to leverage OpenAI within a client’s infrastructure
- AWS Textract components – ingest data from PDFs and free-hand text
- AWS Transcribe – extract speech from audio and media files and convert it into text
- AWS Bedrock Prompt – an API to leverage LLMs supported by Bedrock
- Snowflake Vector Upsert – write embeddings into a Snowflake table
- Postgres Vector Upsert – write embeddings into a PostgreSQL vector store
- Pinecone Vector Query – retrieve the most similar vectors from a Pinecone index
- Pinecone Vector Upsert – write embeddings into a Pinecone index
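To illustrate what the chunking, embedding, and vector upsert/query steps do under the hood, here is a rough sketch using the Pinecone and OpenAI Python clients. The index name, chunking logic, and metadata fields are illustrative assumptions, not Matillion’s implementation; the Matillion components handle these operations without writing code.

```python
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")  # placeholder credentials
index = pc.Index("support-knowledgebase")        # illustrative index name


def chunk_text(text: str, size: int = 1000) -> list[str]:
    """Naive fixed-size chunking; real pipelines often split on sentences or tokens."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def upsert_document(doc_id: str, text: str) -> None:
    """Chunk a document, embed each chunk, and upsert the vectors with metadata."""
    vectors = []
    for n, chunk in enumerate(chunk_text(text)):
        embedding = openai_client.embeddings.create(
            model="text-embedding-3-small", input=chunk
        ).data[0].embedding
        vectors.append({
            "id": f"{doc_id}-{n}",
            "values": embedding,
            "metadata": {"doc_id": doc_id, "text": chunk},
        })
    index.upsert(vectors=vectors)  # re-running keeps the vectors up to date


def query_knowledgebase(question: str, top_k: int = 5):
    """Embed the question and return the closest chunks with their metadata."""
    q_vec = openai_client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    return index.query(vector=q_vec, top_k=top_k, include_metadata=True)
```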
Example Use-case
If we take a closer look at a ticketing system use-case, here’s a high-level architecture for an end-to-end RAG pipeline.
The use-case is to help employees respond faster by leveraging LLMs to generate responses to an inquiry, referencing both the text from the ticket and internal knowledge base documents. In this architecture, Matillion can:
- Ingest data from source systems
- Transform & standardize formats
- Create chunks for large files & store chunks as embeddings
- Provide context for the prompt
- Call the API of Bedrock, OpenAI, or Snowflake Arctic and pass the prompt through to the service (sketched in code after this list)
- Store the generated responses in a table
- Push the responses back to source systems
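For the prompt-and-store steps above, here is a rough sketch assuming the OpenAI API as the LLM endpoint and a Snowflake table named generated_responses. The table name, credentials, and prompt wording are illustrative, and the same pattern would apply if the prompt were routed to Bedrock or Snowflake Arctic instead.

```python
import snowflake.connector
from openai import OpenAI

client = OpenAI()


def draft_ticket_response(ticket_text: str, context_chunks: list[str]) -> str:
    """Generate a suggested reply grounded in retrieved knowledge-base chunks."""
    context = "\n\n".join(context_chunks)
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # could equally be a Bedrock or Snowflake Arctic model
        messages=[
            {"role": "system",
             "content": "Draft a support reply using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nTicket:\n{ticket_text}"},
        ],
    )
    return completion.choices[0].message.content


def store_response(ticket_id: str, response: str) -> None:
    """Persist the generated reply so it can be reviewed and pushed back to the ticketing system."""
    conn = snowflake.connector.connect(
        account="YOUR_ACCOUNT", user="YOUR_USER", password="YOUR_PASSWORD",
        warehouse="COMPUTE_WH", database="SUPPORT", schema="PUBLIC",
    )
    try:
        conn.cursor().execute(
            "INSERT INTO generated_responses (ticket_id, response) VALUES (%s, %s)",
            (ticket_id, response),
        )
    finally:
        conn.close()
```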
Conclusion
Matillion’s new features allow businesses to leverage a RAG framework to accelerate genAI use-cases. Specifically, their integrations with cloud services to ingest unstructured data, chunking and embedding capabilities, AI-prompting components, and vector query and upsert functionality mean teams can start executing AI projects without needing a team of AI engineers.