Introduction
What is RAG and why is it useful?
RAG (Retrieval-Augmented Generation) helps your chatbot answer questions more accurately by retrieving relevant information from a predefined set of documents (your "knowledge base") before generating a response. Instead of relying solely on the LLM's pre-trained knowledge, RAG supplies the LLM with specific, up-to-date context, reducing hallucinations and improving relevance.
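To make the idea concrete before we bring in any libraries, here's a toy, dependency-free sketch of the RAG pattern: retrieve the most relevant snippet for a question, then hand it to the "generator" as context. The keyword-overlap retrieval and the string-template "LLM" below are illustrative stand-ins only; the real pipeline later in this post replaces them with embeddings and an actual model.

```python
# Toy knowledge base: three facts, standing in for real documents.
knowledge_base = [
    "The Eiffel Tower is located in Paris.",
    "Python was created by Guido van Rossum.",
    "FAISS performs fast similarity search over vectors.",
]

def retrieve(question, docs):
    # Naive retrieval: score each document by shared words with the question.
    q_words = set(question.lower().replace("?", "").split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def answer(question):
    context = retrieve(question, knowledge_base)
    # A real system would send this augmented prompt to an LLM.
    return f"Answer using only this context: {context}"

print(answer("Who created Python?"))
```

The two-step shape — fetch context first, then generate from it — is exactly what the full pipeline below does, just with much better retrieval.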
Open Source Components We'll Use:
- `ollama`: A fantastic tool for running open-source LLMs and embedding models locally. It simplifies model management immensely.
- `llama3` (or similar): A powerful open-source LLM from Meta, served by `ollama`.
- `nomic-embed-text`: An open-source embedding model, also served by `ollama`, used to convert text into numerical vectors for searching.
- `FAISS`: Facebook AI Similarity Search, an open-source library for efficient similarity search and clustering of dense vectors. We'll use it as our vector store.
- `langchain`: An open-source framework designed to build applications with LLMs, making RAG pipelines very straightforward.
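Since the embedding model does the heavy lifting for retrieval, it's worth seeing why vectors enable search at all: texts with similar meanings get vectors pointing in similar directions, and cosine similarity ranks them. The 3-dimensional vectors below are made up purely for illustration; a real model like `nomic-embed-text` produces vectors with hundreds of dimensions.

```python
import math

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Pretend embeddings, with invented axes: (finance-ness, animal-ness, tech-ness)
vectors = {
    "bank interest rates": [0.9, 0.1, 0.2],
    "river otters":        [0.1, 0.9, 0.1],
    "GPU programming":     [0.2, 0.1, 0.9],
}

query = [0.8, 0.2, 0.3]  # a query about loans, "embedded" the same way
best = max(vectors, key=lambda text: cosine(query, vectors[text]))
print(best)  # the finance-related text ranks highest
```

FAISS does essentially this comparison, but over thousands or millions of high-dimensional vectors, fast.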
Getting Started: Setup Steps
Before running the code, you'll need to set up your environment.
1. Install `ollama`:
Download and install `ollama` from their official website: [ollama.com](https://ollama.com/). Follow the instructions for your operating system.
2. Pull LLM and Embedding Models with `ollama`:
Once `ollama` is installed and running, open your terminal and pull the necessary models. We'll use `llama3` for the LLM and `nomic-embed-text` for embeddings.
ollama pull llama3
ollama pull nomic-embed-text
(You can choose other `ollama` models if you prefer, just ensure they are pulled and update the model names in the Python code accordingly.)
3. Install Python Libraries:
You'll need `langchain-community`, `langchain-core`, and `faiss-cpu`.
pip install langchain-community langchain-core faiss-cpu
The Shortest RAG Chatbot Code
Here's the Python code for your RAG chatbot. It's designed to be as minimal as possible while demonstrating the full RAG workflow.
# 1. Import necessary components from langchain
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.output_parsers import StrOutputParser
# --- Configuration ---
LLM_MODEL = "llama3" # Ensure this model is pulled in Ollama
EMBEDDING_MODEL = "nomic-embed-text" # Ensure this model is pulled in Ollama
# 2. Define your knowledge base (sample documents)
# In a real-world scenario, these would be loaded from files, databases, etc.
documents = [
    Document(page_content="GlobalTech Innovations is a leading technology company specializing in AI, cloud computing, and sustainable energy solutions."),
    Document(page_content="The company was founded by Dr. Evelyn Reed and Mr. David Chen on 15 March 2005."),
    Document(page_content="GlobalTech's headquarters are located in Silicon Valley, California, with major offices in London and Singapore."),
    Document(page_content="Their flagship product, 'Nexus AI', provides advanced data analytics and machine learning capabilities for enterprises."),
    Document(page_content="The current CEO of GlobalTech Innovations is Sarah Jenkins, appointed in January 2023."),
    Document(page_content="GlobalTech is committed to fostering innovation and developing ethical AI practices."),
    Document(page_content="They recently launched 'EcoCloud', a new cloud platform designed for energy efficiency and reduced carbon footprint."),
    Document(page_content="GlobalTech Innovations employs over 10,000 people worldwide and serves a diverse global client base."),
]
# 3. Initialize Embedding Model (via Ollama)
print(f"Initializing embedding model: {EMBEDDING_MODEL}...")
embeddings = OllamaEmbeddings(model=EMBEDDING_MODEL)
# 4. Create and populate the Vector Store (FAISS)
# This step embeds your documents and stores them for efficient retrieval.
print("Creating FAISS vector store from documents...")
vectorstore = FAISS.from_documents(documents, embeddings)
retriever = vectorstore.as_retriever() # A retriever helps fetch relevant documents
# 5. Initialize the Large Language Model (via Ollama)
print(f"Initializing LLM: {LLM_MODEL}...")
llm = Ollama(model=LLM_MODEL)
# 6. Define the RAG Prompt Template
# This template instructs the LLM on how to use the retrieved context.
template = """You are a helpful assistant.
Answer the question based ONLY on the following context.
If you cannot find the answer in the context, politely state that you don't have enough information.
Context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
# 7. Construct the RAG Chain
# This chain defines the flow: retrieve -> format prompt -> invoke LLM -> parse output.
print("Setting up the RAG chain...")
# Helper to join the retrieved Documents into a single context string;
# without this, the raw Document objects (metadata and all) get dumped
# into the prompt.
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | RunnableLambda(format_docs), "question": RunnablePassthrough()}
    | prompt             # Apply prompt template
    | llm                # Invoke the LLM
    | StrOutputParser()  # Parse LLM output to string
)
# 8. Start the Chat Loop
print("\n--- RAG Chatbot Ready! ---")
print(f"Using LLM: {LLM_MODEL}, Embeddings: {EMBEDDING_MODEL}")
print("Type your questions, or 'exit' to quit.")
while True:
    user_query = input("\nYou: ")
    if user_query.lower() == 'exit':
        print("Goodbye! Stay productive!")
        break
    print("Bot (thinking...): ", end="", flush=True)
    try:
        response = rag_chain.invoke(user_query)
        print(response)
    except Exception as e:
        print(f"An error occurred: {e}. Please ensure Ollama is running and models are pulled correctly.")
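If the pipe syntax in step 7 looks opaque, here's what it amounts to in plain Python, with toy stand-ins (all hypothetical) for the retriever, prompt template, and LLM. The dict builds the prompt's inputs — `RunnablePassthrough` just forwards the raw question — and each `|` feeds one stage's output into the next.

```python
def toy_retriever(question):
    # Stand-in for the FAISS retriever: returns the "relevant" context.
    return "GlobalTech was founded in 2005."

def toy_prompt(inputs):
    # Stand-in for ChatPromptTemplate: fills the template slots.
    return f"Context: {inputs['context']}\nQuestion: {inputs['question']}"

def toy_llm(prompt_text):
    # Stand-in for the LLM: just echoes the prompt it received.
    return f"[LLM sees] {prompt_text}"

def rag_invoke(question):
    # Mirrors: {"context": retriever, "question": RunnablePassthrough()} | prompt | llm
    inputs = {"context": toy_retriever(question), "question": question}
    return toy_llm(toy_prompt(inputs))

print(rag_invoke("When was GlobalTech founded?"))
```

Seeing the chain as ordinary function composition makes it easier to debug: you can call any single stage (like `retriever.invoke(...)`) on its own to inspect intermediate results.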
How to Run the Chatbot
1. Save the code above as a Python file (e.g., `rag_chatbot.py`).
2. Open your terminal or command prompt.
3. Navigate to the directory where you saved the file.
4. Run the script:
python rag_chatbot.py
5. The chatbot will initialize, and then you can start asking questions!
Example Interactions:
You: Who founded GlobalTech Innovations?
Bot: GlobalTech Innovations was founded by Dr. Evelyn Reed and Mr. David Chen.
You: Where is GlobalTech's headquarters?
Bot: GlobalTech's headquarters are located in Silicon Valley, California, with major offices in London and Singapore.
You: What is Nexus AI?
Bot: 'Nexus AI' is GlobalTech Innovations' flagship product, providing advanced data analytics and machine learning capabilities for enterprises.
You: What is the capital of France?
Bot: I don't have enough information to answer that question.
This setup provides a concise yet complete foundation for a RAG-powered chatbot built entirely from open-source tools. Enjoy streamlining your information retrieval!