Friday, May 01, 2026

The Shortest LLM and RAG Chatbot




Introduction

How can we build the shortest possible LLM chatbot with RAG support? Here is my attempt. Can you beat it?


What is RAG and why is it useful?


Retrieval-Augmented Generation (RAG) helps your chatbot answer questions more accurately by retrieving relevant information from a predefined set of documents (your "knowledge base") before generating a response. Instead of relying solely on the LLM's pre-trained knowledge, the pipeline gives the LLM specific, up-to-date context, which reduces hallucinations and improves relevance.
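To make the retrieve-then-generate idea concrete, here is a minimal sketch in plain Python. It replaces the embedding model with a toy word-overlap score, which is an assumption made purely for illustration; the helper names `tokenize`, `score`, `retrieve`, and `build_prompt` are hypothetical and not part of any library.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(question: str, passage: str) -> float:
    """Toy relevance score: fraction of question words found in the passage."""
    q = tokenize(question)
    return len(q & tokenize(passage)) / len(q) if q else 0.0

def retrieve(question: str, knowledge_base: list[str], k: int = 1) -> list[str]:
    """Return the k passages with the highest toy relevance score."""
    ranked = sorted(knowledge_base, key=lambda p: score(question, p), reverse=True)
    return ranked[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved passages in front of the question."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}\n"

knowledge_base = [
    "The company was founded by Dr. Evelyn Reed and Mr. David Chen on 15 March 2005.",
    "GlobalTech's headquarters are located in Silicon Valley, California, "
    "with major offices in London and Singapore.",
]
question = "Who founded the company?"
context = retrieve(question, knowledge_base)
prompt = build_prompt(question, context)
print(prompt)  # This text, not the bare question, is what the LLM would receive.
```

A real pipeline swaps the word-overlap score for embedding similarity, but the shape stays the same: rank passages, keep the best few, and prepend them to the question.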


Open Source Components We'll Use:


  • `ollama`: A fantastic tool for running open-source LLMs and embedding models locally. It simplifies model management immensely.
  • `llama3` (or similar): A powerful open-source LLM from Meta, served by `ollama`.
  • `nomic-embed-text`: An open-source embedding model, also served by `ollama`, used to convert text into numerical vectors for searching.
  • `FAISS`: Facebook AI Similarity Search, an open-source library for efficient similarity search and clustering of dense vectors. We'll use it as our vector store.
  • `langchain`: An open-source framework designed to build applications with LLMs, making RAG pipelines very straightforward.
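Conceptually, a vector store answers one question: which stored vectors lie closest to the query vector? The sketch below does this with an exhaustive Euclidean search in plain Python. The three-dimensional vectors are made-up stand-ins for real embeddings, and the function names are hypothetical; FAISS offers the same kind of exact search as well as faster approximate indexes.

```python
import math

def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query: list[float], vectors: list[list[float]], k: int = 2) -> list[int]:
    """Return the indices of the k stored vectors closest to the query."""
    order = sorted(range(len(vectors)), key=lambda i: l2_distance(query, vectors[i]))
    return order[:k]

# Hypothetical 3-dimensional "embeddings"; real models produce hundreds of dimensions.
vectors = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
]
query = [1.0, 0.0, 0.0]
top = nearest(query, vectors, k=2)
print(top)  # indices of the two closest vectors, nearest first
```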


Getting Started: Setup Steps


Before running the code, you'll need to set up your environment.


1.  Install `ollama`:

Download and install `ollama` from their official website: [ollama.com](https://ollama.com/). Follow the instructions for your operating system.


2.  Pull LLM and Embedding Models with `ollama`:

Once `ollama` is installed and running, open your terminal and pull the necessary models. We'll use `llama3` for the LLM and `nomic-embed-text` for embeddings.


    ollama pull llama3
    ollama pull nomic-embed-text

    

(You can choose other `ollama` models if you prefer; just make sure they are pulled, and update the model names in the Python code accordingly.)
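If you want to confirm that both models are available before going further, a small check like the following may help. It is a sketch that assumes Ollama is listening on its default local address (`http://localhost:11434`) and that its `/api/tags` endpoint returns a JSON object with a `models` list. The helper names are hypothetical, and tags such as `:latest` are ignored when comparing names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Assumed default address of a local Ollama server

def base_name(model: str) -> str:
    """Strip the tag from a model name, e.g. 'llama3:latest' becomes 'llama3'."""
    return model.split(":", 1)[0]

def missing_models(installed: list[str], required: list[str]) -> list[str]:
    """Return the required models that do not appear among the installed ones."""
    have = {base_name(m) for m in installed}
    return [m for m in required if base_name(m) not in have]

def installed_models(url: str = OLLAMA_URL) -> list[str]:
    """Ask the local Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{url}/api/tags", timeout=5) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

if __name__ == "__main__":
    try:
        missing = missing_models(installed_models(), ["llama3", "nomic-embed-text"])
        print("Missing models:", ", ".join(missing) if missing else "none")
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers an unexpected response body.
        print(f"Could not query Ollama at {OLLAMA_URL}: {exc}")
```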


3.  Install Python Libraries:

You'll need `langchain-community`, `langchain-core`, and `faiss-cpu`.


    pip install langchain-community langchain-core faiss-cpu

   


The Shortest RAG Chatbot Code


Here's the Python code for your RAG chatbot. It's designed to be as minimal as possible while demonstrating the full RAG workflow.



# 1. Import necessary components from langchain
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# --- Configuration ---
LLM_MODEL = "llama3"  # Ensure this model is pulled in Ollama
EMBEDDING_MODEL = "nomic-embed-text"  # Ensure this model is pulled in Ollama

# 2. Define your knowledge base (sample documents)
# In a real-world scenario, these would be loaded from files, databases, etc.
documents = [
    Document(page_content="GlobalTech Innovations is a leading technology company specializing in AI, cloud computing, and sustainable energy solutions."),
    Document(page_content="The company was founded by Dr. Evelyn Reed and Mr. David Chen on 15 March 2005."),
    Document(page_content="GlobalTech's headquarters are located in Silicon Valley, California, with major offices in London and Singapore."),
    Document(page_content="Their flagship product, 'Nexus AI', provides advanced data analytics and machine learning capabilities for enterprises."),
    Document(page_content="The current CEO of GlobalTech Innovations is Sarah Jenkins, appointed in January 2023."),
    Document(page_content="GlobalTech is committed to fostering innovation and developing ethical AI practices."),
    Document(page_content="They recently launched 'EcoCloud', a new cloud platform designed for energy efficiency and reduced carbon footprint."),
    Document(page_content="GlobalTech Innovations employs over 10,000 people worldwide and serves a diverse global client base."),
]

# 3. Initialize Embedding Model (via Ollama)
print(f"Initializing embedding model: {EMBEDDING_MODEL}...")
embeddings = OllamaEmbeddings(model=EMBEDDING_MODEL)

# 4. Create and populate the Vector Store (FAISS)
# This step embeds your documents and stores them for efficient retrieval.
print("Creating FAISS vector store from documents...")
vectorstore = FAISS.from_documents(documents, embeddings)
retriever = vectorstore.as_retriever()  # A retriever helps fetch relevant documents

# 5. Initialize the Large Language Model (via Ollama)
print(f"Initializing LLM: {LLM_MODEL}...")
llm = Ollama(model=LLM_MODEL)

# 6. Define the RAG Prompt Template
# This template instructs the LLM on how to use the retrieved context.
template = """You are a helpful assistant.
Answer the question based ONLY on the following context.
If you cannot find the answer in the context, politely state that you don't have enough information.

Context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

# 7. Construct the RAG Chain
# This chain defines the flow: retrieve -> format prompt -> invoke LLM -> parse output.
print("Setting up the RAG chain...")
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}  # Retrieve context, pass question
    | prompt             # Apply prompt template
    | llm                # Invoke the LLM
    | StrOutputParser()  # Parse LLM output to string
)

# 8. Start the Chat Loop
print("\n--- RAG Chatbot Ready! ---")
print(f"Using LLM: {LLM_MODEL}, Embeddings: {EMBEDDING_MODEL}")
print("Type your questions, or 'exit' to quit.")

while True:
    user_query = input("\nYou: ")
    if user_query.lower() == 'exit':
        print("Goodbye! Stay productive!")
        break

    print("Bot (thinking...): ", end="", flush=True)
    try:
        response = rag_chain.invoke(user_query)
        print(response)
    except Exception as e:
        print(f"An error occurred: {e}. Please ensure Ollama is running and models are pulled correctly.")
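One optional refinement, at the cost of a few extra lines: the chain fills `{context}` with whatever the retriever returns, which is a list of `Document` objects rendered in their string form. A small formatting helper can hand the LLM only the text instead. The sketch below uses a stand-in `Doc` class so it runs without LangChain installed; `format_docs` is a hypothetical helper name.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document so the sketch runs on its own."""
    page_content: str

def format_docs(docs) -> str:
    """Join the text of each retrieved document, separated by blank lines."""
    return "\n\n".join(d.page_content for d in docs)

retrieved = [
    Doc("The company was founded by Dr. Evelyn Reed and Mr. David Chen on 15 March 2005."),
    Doc("The current CEO of GlobalTech Innovations is Sarah Jenkins, appointed in January 2023."),
]
context = format_docs(retrieved)
print(context)
```

In the chain itself, this would mean writing `"context": retriever | format_docs` in place of `"context": retriever`, since LangChain wraps a plain function in a runnable when it is piped.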




How to Run the Chatbot


1.  Save the code above as a Python file (e.g., `rag_chatbot.py`).

2.  Open your terminal or command prompt.

3.  Navigate to the directory where you saved the file.

4.  Run the script:

    

    python rag_chatbot.py

    

5.  The chatbot will initialize, and then you can start asking questions!



Example Interactions:



You: Who founded GlobalTech Innovations?

Bot: GlobalTech Innovations was founded by Dr. Evelyn Reed and Mr. David Chen.


You: Where is GlobalTech's headquarters?

Bot: GlobalTech's headquarters are located in Silicon Valley, California, with major offices in London and Singapore.


You: What is Nexus AI?

Bot: 'Nexus AI' is GlobalTech Innovations' flagship product, providing advanced data analytics and machine learning capabilities for enterprises.


You: What is the capital of France?

Bot: I don't have enough information to answer that question.



This setup provides a concise, working foundation for a RAG-powered chatbot built entirely from open-source tools. Enjoy streamlining your information retrieval!

Thursday, April 30, 2026

THE ART OF PROMPT ENGINEERING: MASTERING CODE, IMAGE, AND VIDEO GENERATION ACROSS AI MODELS

 



In the rapidly evolving landscape of artificial intelligence, the ability to communicate effectively with AI models has become an essential skill. Whether you are generating code, creating stunning visuals, or producing dynamic video content, the quality of your output depends heavily on the quality of your input. This comprehensive guide explores the science and art of crafting excellent prompts that work across different AI models, providing you with parameterizable templates that are both efficient and specific enough to produce professional results.


UNDERSTANDING THE FUNDAMENTALS OF EFFECTIVE PROMPTING


Before diving into specific prompt templates, it is crucial to understand what makes a prompt effective. The best prompts share several key characteristics that transcend the specific domain or model being used. First and foremost, clarity is paramount. Your prompt should leave no room for ambiguity about what you want to achieve. Second, context matters immensely. Providing relevant background information helps the model understand not just what you want, but why you want it, which often leads to more appropriate solutions. Third, specificity in requirements ensures that the output matches your expectations in terms of style, format, and functionality. Finally, structure in your prompts helps the model parse your request efficiently and respond with organized, coherent output.


ADVANCED CODE GENERATION PROMPTS


The Comprehensive Function Generator


When you need to generate a specific function or method, using a well-structured prompt can save hours of development time. Consider this parameterizable template: "Create a {LANGUAGE} function named {FUNCTION_NAME} that {PRIMARY_PURPOSE}. The function should accept {NUMBER} parameters: {PARAMETER_LIST_WITH_TYPES}. It should return {RETURN_TYPE} representing {RETURN_DESCRIPTION}. Include error handling for {ERROR_CASES}. The implementation should prioritize {OPTIMIZATION_GOAL} and follow {CODING_STANDARD} conventions. Add comprehensive docstrings explaining the purpose, parameters, return value, and potential exceptions."


For example, you might fill this template as follows: "Create a Python function named calculate_portfolio_risk that analyzes investment portfolio volatility. The function should accept three parameters: portfolio_weights as a numpy array, covariance_matrix as a two-dimensional numpy array, and time_horizon as an integer representing days. It should return a float representing the annualized portfolio standard deviation. Include error handling for mismatched dimensions, negative weights, and invalid time horizons. The implementation should prioritize numerical stability and follow PEP 8 conventions. Add comprehensive docstrings explaining the purpose, parameters, return value, and potential exceptions."
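Filling such a template by hand invites forgotten placeholders. As a minimal sketch, the following uses Python's standard `string.Formatter` to list the `{NAMES}` in a template and refuses to build a prompt while any are missing. The shortened `TEMPLATE` and the helper names `placeholders` and `fill` are illustrative assumptions, not part of the templates above.

```python
from string import Formatter

# A shortened version of the function-generator template, for illustration.
TEMPLATE = (
    "Create a {LANGUAGE} function named {FUNCTION_NAME} that {PRIMARY_PURPOSE}. "
    "It should return {RETURN_TYPE} representing {RETURN_DESCRIPTION}."
)

def placeholders(template: str) -> list[str]:
    """List the {NAMES} used in a template, in order of first appearance."""
    seen: list[str] = []
    for _, field, _, _ in Formatter().parse(template):
        if field and field not in seen:
            seen.append(field)
    return seen

def fill(template: str, **values: str) -> str:
    """Substitute values into the template, failing loudly if any are missing."""
    missing = [name for name in placeholders(template) if name not in values]
    if missing:
        raise ValueError(f"Missing values for: {', '.join(missing)}")
    return template.format(**values)

prompt = fill(
    TEMPLATE,
    LANGUAGE="Python",
    FUNCTION_NAME="calculate_portfolio_risk",
    PRIMARY_PURPOSE="analyzes investment portfolio volatility",
    RETURN_TYPE="a float",
    RETURN_DESCRIPTION="the annualized portfolio standard deviation",
)
print(prompt)
```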


The Full-Stack Application Architect


When building complete applications, a more comprehensive prompt structure is necessary. Try this approach: "Design and implement a {APPLICATION_TYPE} application for {USE_CASE} using {TECHNOLOGY_STACK}. The application should have {NUMBER} main components: {COMPONENT_LIST}. For the backend, implement {BACKEND_FEATURES} with {DATABASE_TYPE} for data persistence. The frontend should include {UI_COMPONENTS} with {STYLING_APPROACH}. Implement authentication using {AUTH_METHOD} and ensure {SECURITY_REQUIREMENTS}. Include API endpoints for {ENDPOINT_LIST}. Structure the code following {ARCHITECTURE_PATTERN} and include configuration for {DEPLOYMENT_TARGET}. Provide setup instructions and environment variable documentation."


A concrete example would be: "Design and implement a RESTful API application for task management using Node.js with Express and React. The application should have four main components: user management, task CRUD operations, team collaboration, and notification system. For the backend, implement JWT-based authentication, role-based access control, and real-time updates with Socket.io using PostgreSQL for data persistence. The frontend should include a dashboard, kanban board, calendar view, and user profile pages with Tailwind CSS for styling. Implement authentication using JWT tokens with refresh token rotation and ensure all endpoints validate input, sanitize data, and use parameterized queries. Include API endpoints for user registration, login, task creation, task updates, task deletion, team invitations, and notification preferences. Structure the code following the MVC pattern and include configuration for Docker containerization. Provide setup instructions and environment variable documentation."


The Algorithm Optimizer


For performance-critical code, use this specialized template: "Implement an optimized {ALGORITHM_NAME} algorithm in {LANGUAGE} that solves {PROBLEM_DESCRIPTION}. The input will be {INPUT_DESCRIPTION} with typical size of {SIZE_RANGE}. Target time complexity of {TIME_COMPLEXITY} and space complexity of {SPACE_COMPLEXITY}. Use {DATA_STRUCTURES} for efficient operations. Include {OPTIMIZATION_TECHNIQUES} to improve performance. Provide both an iterative and recursive version if applicable. Add benchmarking code that tests performance with inputs of size {TEST_SIZES}. Include comments explaining the algorithmic approach and complexity analysis."
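To give a feel for the kind of output this template asks for, here is a small sketch with binary search as the example algorithm (my choice for illustration, not one named above). It provides an iterative and a recursive version plus simple benchmarking with the standard `timeit` module; the test sizes are arbitrary.

```python
import timeit

def binary_search_iterative(items, target):
    """Return the index of target in sorted items, or -1. O(log n) time, O(1) space."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def binary_search_recursive(items, target, lo=0, hi=None):
    """Same search, recursive. O(log n) time, O(log n) stack space."""
    if hi is None:
        hi = len(items) - 1
    if lo > hi:
        return -1
    mid = (lo + hi) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        return binary_search_recursive(items, target, mid + 1, hi)
    return binary_search_recursive(items, target, lo, mid - 1)

# Benchmark both versions on sorted inputs of two sizes.
for size in (1_000, 100_000):
    data = list(range(size))
    target = size - 1
    for fn in (binary_search_iterative, binary_search_recursive):
        seconds = timeit.timeit(lambda: fn(data, target), number=1_000)
        print(f"{fn.__name__:<26} n={size:>7,}: {seconds:.4f} s for 1,000 calls")
```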


The Test Suite Generator


Quality code requires comprehensive testing. Use this prompt structure: "Generate a complete test suite for {CODE_DESCRIPTION} using {TESTING_FRAMEWORK}. Create tests covering {TEST_CATEGORIES} including happy path scenarios, edge cases, error conditions, and boundary values. For each function {FUNCTION_LIST}, test with {TEST_CASE_TYPES}. Include setup and teardown procedures for {RESOURCE_REQUIREMENTS}. Mock {EXTERNAL_DEPENDENCIES} to ensure unit test isolation. Aim for {COVERAGE_TARGET} code coverage. Structure tests using {TEST_ORGANIZATION_PATTERN} and include descriptive test names that explain what is being tested and expected outcome."
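As a rough illustration of what a response to this template might look like, the sketch below tests a hypothetical `average_price` function with Python's built-in `unittest`. It covers a happy path, a boundary value, and an error condition, and it mocks the external data source so the tests stay isolated. The function and its tests are invented for this example.

```python
import unittest
from unittest.mock import Mock

def average_price(fetch_prices, symbol):
    """Hypothetical function under test: mean of the prices a data source returns."""
    prices = fetch_prices(symbol)
    if not prices:
        raise ValueError(f"no prices for {symbol}")
    return sum(prices) / len(prices)

class AveragePriceTests(unittest.TestCase):
    def setUp(self):
        # Mock the external dependency so no test touches a real data source.
        self.fetch = Mock()

    def test_returns_mean_for_typical_prices(self):  # happy path
        self.fetch.return_value = [10.0, 20.0, 30.0]
        self.assertEqual(average_price(self.fetch, "ABC"), 20.0)
        self.fetch.assert_called_once_with("ABC")

    def test_single_price_is_returned_unchanged(self):  # boundary value
        self.fetch.return_value = [42.0]
        self.assertEqual(average_price(self.fetch, "ABC"), 42.0)

    def test_empty_price_list_raises_value_error(self):  # error condition
        self.fetch.return_value = []
        with self.assertRaises(ValueError):
            average_price(self.fetch, "ABC")

# Load and run the suite explicitly so the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AveragePriceTests)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

The descriptive test names do much of the documentation work: each one states the scenario and the expected outcome, as the template requests.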


MASTERFUL IMAGE GENERATION PROMPTS


The Photorealistic Scene Creator


For generating lifelike images, precision in description is essential. Use this template: "Create a photorealistic image of {MAIN_SUBJECT} in {SETTING}. The scene takes place during {TIME_OF_DAY} with {WEATHER_CONDITIONS}. The {MAIN_SUBJECT} should be {SUBJECT_DESCRIPTION} positioned {SPATIAL_POSITION}. Use {CAMERA_ANGLE} perspective with {LENS_TYPE} characteristics. Lighting should be {LIGHTING_STYLE} creating {MOOD}. Include {BACKGROUND_ELEMENTS} in the background. The color palette should emphasize {COLOR_SCHEME}. Render in {RESOLUTION_STYLE} with {DETAIL_LEVEL} attention to textures and materials. Style should evoke {ARTISTIC_REFERENCE}."


For instance: "Create a photorealistic image of a vintage motorcycle in an abandoned industrial warehouse. The scene takes place during golden hour with dust particles visible in the light beams streaming through broken windows. The motorcycle should be a 1960s cafe racer with worn leather seat and chrome details positioned at a three-quarter angle in the center of the frame. Use eye-level perspective with 50mm prime lens characteristics creating natural depth of field. Lighting should be dramatic side-lighting creating a nostalgic and melancholic mood. Include rusted machinery, scattered tools, and overgrown vegetation in the background. The color palette should emphasize warm oranges, deep browns, and desaturated teals. Render in 8K quality with meticulous attention to rust textures, leather grain, and chrome reflections. Style should evoke the cinematography of Denis Villeneuve films."


The Conceptual Art Designer


For more abstract or stylized imagery, try this approach: "Design a {ART_STYLE} illustration depicting {CONCEPT}. The composition should use {COMPOSITION_TECHNIQUE} to draw focus to {FOCAL_POINT}. Incorporate symbolic elements including {SYMBOL_LIST} representing {MEANINGS}. The color scheme should be {COLOR_DESCRIPTION} to convey {EMOTIONAL_TONE}. Use {TEXTURE_STYLE} textures and {LINE_QUALITY} linework. The level of detail should be {DETAIL_SPECIFICATION} with {SIMPLIFICATION_APPROACH} for secondary elements. Include {DECORATIVE_ELEMENTS} as embellishments. The overall aesthetic should blend {STYLE_INFLUENCES} while maintaining {UNIQUE_CHARACTERISTIC}."


A practical example: "Design a minimalist vector illustration depicting the concept of digital transformation in business. The composition should use the rule of thirds to draw focus to a central figure transitioning from analog to digital form. Incorporate symbolic elements including circuit board patterns, flowing data streams, traditional office tools morphing into digital interfaces, and interconnected nodes representing collaboration and connectivity. The color scheme should be a gradient from warm analog colors like sepia and burnt orange transitioning to cool digital blues and electric purples to convey innovation and progress. Use smooth gradient textures and clean geometric linework. The level of detail should be selectively detailed with high precision on the central transformation moment and simplified, iconic representations for secondary elements. Include subtle geometric patterns and light rays as embellishments. The overall aesthetic should blend mid-century modern design sensibilities with contemporary tech aesthetics while maintaining a sense of human-centered optimism."


The Character Design Specialist


Character creation requires attention to personality and consistency: "Create a character design for {CHARACTER_TYPE} named {CHARACTER_NAME} who is {CHARACTER_DESCRIPTION}. Physical appearance: {AGE_RANGE}, {BODY_TYPE}, {HEIGHT_DESCRIPTOR}, with {DISTINCTIVE_FEATURES}. Clothing style is {FASHION_DESCRIPTION} appropriate for {SETTING_CONTEXT}. Color palette for this character: {CHARACTER_COLORS}. Personality traits visible in design: {PERSONALITY_INDICATORS}. Include {NUMBER} views: {VIEW_ANGLES}. Background should be {BACKGROUND_TYPE}. Art style: {STYLE_SPECIFICATION} with {RENDERING_TECHNIQUE}. Ensure design is suitable for {INTENDED_USE}."


The Product Visualization Expert


For commercial and product imagery: "Generate a professional product photograph of {PRODUCT_NAME} which is {PRODUCT_DESCRIPTION}. The product should be displayed {PRESENTATION_STYLE} on {SURFACE_TYPE}. Use {LIGHTING_SETUP} to highlight {PRODUCT_FEATURES}. Background should be {BACKGROUND_DESCRIPTION} to ensure {VISUAL_GOAL}. Include {PROP_ELEMENTS} to provide scale and context. Camera angle: {ANGLE_SPECIFICATION} to showcase {SHOWCASE_PRIORITY}. Depth of field: {DOF_SPECIFICATION}. Color grading: {GRADING_STYLE} to appeal to {TARGET_AUDIENCE}. Ensure {BRAND_GUIDELINES} are followed. Final image should be suitable for {USAGE_CONTEXT}."


COMPELLING VIDEO GENERATION PROMPTS


The Narrative Scene Builder


Video generation requires temporal thinking and motion description: "Generate a {DURATION} video showing {SCENE_DESCRIPTION}. Opening shot: {OPENING_DESCRIPTION} lasting {OPENING_DURATION}. Camera movement: {CAMERA_MOTION} transitioning to {SECONDARY_ANGLE}. Main action: {ACTION_SEQUENCE} occurring {ACTION_TIMING}. The subject {SUBJECT_DESCRIPTION} performs {SUBJECT_ACTIONS}. Environment: {ENVIRONMENT_DETAILS} with {ATMOSPHERIC_ELEMENTS}. Lighting transitions from {INITIAL_LIGHTING} to {FINAL_LIGHTING} creating {LIGHTING_EFFECT}. Color grading: {COLOR_TREATMENT}. Pacing: {PACING_DESCRIPTION}. Mood: {EMOTIONAL_TONE}. Style reference: {STYLE_INSPIRATION}. End with {CLOSING_SHOT}."


For example: "Generate a ten-second video showing a time-lapse of a flower blooming in a misty forest clearing. Opening shot: close-up of a closed bud covered in morning dew lasting two seconds. Camera movement: slow push-in combined with slight upward tilt transitioning to a macro perspective of the petals. Main action: petals gradually unfurling and opening to reveal the flower's center occurring over six seconds with accelerated time-lapse effect. The subject, a vibrant purple iris, performs a natural blooming motion with petals curling outward gracefully. Environment: soft-focused forest background with dappled sunlight filtering through leaves and gentle mist rolling across the ground. Lighting transitions from cool blue morning light to warm golden sunlight creating a magical, ethereal effect. Color grading: enhanced saturation with emphasis on purples and greens with a slight film grain overlay. Pacing: smooth and meditative with gradual acceleration during peak bloom. Mood: peaceful, hopeful, and rejuvenating. Style reference: BBC Earth nature documentaries with cinematic quality. End with a slow pull-back revealing the flower in its full glory with a subtle lens flare."
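Timed prompts like this are easy to get wrong by a second or two. The example above allots two seconds to the opening shot and six to the bloom within a ten-second video, which leaves two for the closing pull-back. A quick check along these lines can confirm the budget before you submit the prompt. This sketch assumes the segments run back to back without overlap; the `Segment` class and `remaining_time` helper are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    name: str
    seconds: float

def remaining_time(total_seconds: float, segments: list[Segment]) -> float:
    """Return the time left after the listed segments, or fail if they overrun."""
    used = sum(s.seconds for s in segments)
    if used > total_seconds:
        raise ValueError(f"Segments use {used} s but the video is only {total_seconds} s long")
    return total_seconds - used

segments = [
    Segment("opening close-up of the bud", 2),
    Segment("time-lapse of the petals unfurling", 6),
]
left = remaining_time(10, segments)
print(f"{left} seconds remain for the closing pull-back")
```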


The Motion Graphics Animator


For animated explainer content: "Create a {DURATION} motion graphics video explaining {TOPIC}. Visual style: {ANIMATION_STYLE} using {DESIGN_APPROACH}. Color scheme: {COLORS} on {BACKGROUND}. The video should progress through {NUMBER} key sections: {SECTION_LIST}. For each section, use {TRANSITION_TYPE} transitions. Animate {ELEMENT_TYPES} with {MOTION_CHARACTERISTICS}. Text should appear using {TEXT_ANIMATION} in {FONT_STYLE}. Include {VISUAL_METAPHORS} to illustrate {CONCEPTS}. Pacing: {PACING_STYLE} with {TIMING_EMPHASIS}. Add {DECORATIVE_ELEMENTS} for visual interest. Ensure {ACCESSIBILITY_REQUIREMENTS}. Overall tone: {TONE_DESCRIPTION}."


The Cinematic Sequence Designer


For dramatic, story-driven content: "Produce a {DURATION} cinematic sequence depicting {NARRATIVE_MOMENT}. Genre: {GENRE} with {SUBGENRE} elements. Setting: {LOCATION} during {TIME_PERIOD}. Protagonist: {CHARACTER_DESCRIPTION} experiencing {EMOTIONAL_STATE}. Shot sequence: Begin with {ESTABLISHING_SHOT}, cut to {SECOND_SHOT}, then {THIRD_SHOT}. Camera techniques: use {CAMERA_TECHNIQUES} to convey {NARRATIVE_PURPOSE}. Lighting: {LIGHTING_MOOD} with {LIGHTING_SOURCES}. Color palette: {COLOR_PSYCHOLOGY} emphasizing {THEMATIC_ELEMENTS}. Include {ENVIRONMENTAL_STORYTELLING} details. Sound design considerations: {AUDIO_DESCRIPTION}. Editing rhythm: {EDITING_PACE} with {CUT_STYLE}. Visual effects: {VFX_ELEMENTS}. Inspired by {DIRECTOR_REFERENCE} cinematography."


The Product Demo Showcase


For commercial video content: "Create a {DURATION} product demonstration video for {PRODUCT_NAME}. Opening: {HOOK_DESCRIPTION} to capture attention. Show the product {PRESENTATION_SEQUENCE} highlighting {KEY_FEATURES}. Demonstrate {USE_CASES} in {CONTEXT_SETTINGS}. Use {CAMERA_MOVEMENTS} to showcase {PRODUCT_ANGLES}. Include {GRAPHIC_OVERLAYS} to emphasize {SPECIFICATIONS}. Lighting: {LIGHTING_STYLE} to make product appear {DESIRED_APPEARANCE}. Background: {BACKGROUND_CHOICE} appropriate for {TARGET_MARKET}. Pacing: {PACING_STRATEGY} to maintain {VIEWER_ENGAGEMENT}. Brand integration: {BRAND_ELEMENTS} positioned {PLACEMENT_STRATEGY}. Call-to-action: {CTA_DESCRIPTION}. Overall style: {PRODUCTION_STYLE} matching {BRAND_PERSONALITY}."


CROSS-MODEL OPTIMIZATION STRATEGIES


Different AI models have varying strengths and respond differently to prompt structures. GPT-based models tend to do well with conversational, context-rich prompts that spell out reasoning steps. Claude models often perform better with structured, well-organized prompts that clearly delineate the different aspects of the request. Stable Diffusion and similar image models benefit from comma-separated descriptive tags combined with quality modifiers and style references. Video generation models typically require clear temporal sequencing and explicit motion descriptions.
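One way to handle these differences is to keep the prompt's content in a single structured form and render it per model family. The sketch below shows the idea for an image prompt: the same fields become either full sentences or a comma-separated tag list. The function names and the default quality modifiers are illustrative assumptions; which rendering works better depends on the model you target.

```python
def as_sentences(subject: str, setting: str, lighting: str, style: str) -> str:
    """Render the fields as a descriptive, sentence-style prompt."""
    return (
        f"Create an image of {subject} in {setting}. "
        f"Lighting should be {lighting}. Style should evoke {style}."
    )

def as_tags(subject: str, setting: str, lighting: str, style: str,
            quality: tuple[str, ...] = ("highly detailed", "sharp focus")) -> str:
    """Render the same fields as comma-separated tags followed by quality modifiers."""
    return ", ".join([subject, setting, lighting, style, *quality])

spec = {
    "subject": "a vintage motorcycle",
    "setting": "an abandoned industrial warehouse",
    "lighting": "dramatic side-lighting",
    "style": "cinematic",
}
print(as_sentences(**spec))
print(as_tags(**spec))
```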


To make your prompts truly model-agnostic, focus on these universal principles: Start with a clear statement of the desired output type and primary purpose. Provide comprehensive context about the use case and constraints. Break complex requests into logical components or steps. Use specific, concrete language rather than vague descriptors. Include quality indicators and style references appropriate to the domain. Specify technical requirements like format, resolution, or performance characteristics. When possible, provide examples or references that illustrate your vision.


ADVANCED PARAMETERIZATION TECHNIQUES


The true power of these prompt templates lies in their parameterization. By identifying the variable components and creating a systematic approach to filling them, you can build a personal library of reusable prompts that dramatically accelerate your workflow. Consider creating a simple template system where you maintain a document with your favorite prompt structures and a separate list of common values for each parameter category.


For code generation, maintain lists of common languages, frameworks, architectural patterns, and coding standards you frequently use. For image generation, keep collections of art styles, color schemes, lighting setups, and composition techniques that align with your aesthetic preferences or brand guidelines. For video generation, document your preferred pacing styles, transition types, and cinematic references that match your content goals.
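A personal prompt library of this kind can be as simple as two dictionaries: one of templates and one of preferred values per parameter. The following sketch generates every combination of the listed values with `itertools.product`. The template is a shortened version of the conceptual-art template, and the option values and the `variations` helper are hypothetical examples.

```python
from itertools import product

TEMPLATES = {
    "concept_art": (
        "Design a {ART_STYLE} illustration depicting {CONCEPT}. "
        "The color scheme should be {COLOR_DESCRIPTION}."
    ),
}

# Favorite values per parameter category; extend these lists over time.
OPTIONS = {
    "ART_STYLE": ["minimalist vector", "watercolor"],
    "COLOR_DESCRIPTION": ["warm sepia tones", "cool digital blues"],
}

def variations(template: str, fixed: dict[str, str], options: dict[str, list[str]]):
    """Yield one filled prompt per combination of the option values."""
    names = list(options)
    for combo in product(*(options[name] for name in names)):
        yield template.format(**fixed, **dict(zip(names, combo)))

prompts = list(variations(
    TEMPLATES["concept_art"],
    {"CONCEPT": "digital transformation in business"},
    OPTIONS,
))
for p in prompts:
    print(p)
```

Two styles and two color schemes yield four prompts here; the count grows multiplicatively, so it is worth keeping each option list short.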


ITERATIVE REFINEMENT AND PROMPT CHAINING


Even the best initial prompt rarely produces perfect results on the first attempt. Developing skill in iterative refinement is crucial. When the output is close but not quite right, analyze what specific aspects need adjustment and create focused follow-up prompts that build on the initial result. For example, if an image has the right composition but wrong lighting, your follow-up might be: "Maintain the exact composition and subject positioning, but change the lighting to {NEW_LIGHTING_DESCRIPTION}."


Prompt chaining is another powerful technique where you break complex tasks into sequential steps, using the output of one prompt as input or context for the next. For code generation, you might first prompt for the overall architecture, then generate individual components, then create tests, and finally generate documentation. For image generation, you might first generate a concept sketch, then refine it into a detailed illustration, then create variations, and finally upscale the best version.
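The code-generation chain described above can be sketched in a few lines. Here `run_chain` feeds each step's output into the next prompt through a `{previous}` placeholder. The `fake_model` function is a stand-in that only echoes part of its input, so you would replace it with a call to whichever model you use. All names in this sketch are hypothetical.

```python
from typing import Callable

def run_chain(model: Callable[[str], str], steps: list[str], task: str) -> list[str]:
    """Run the prompts in order, passing each output to the next step as {previous}."""
    outputs: list[str] = []
    previous = task
    for step in steps:
        prompt = step.format(task=task, previous=previous)
        previous = model(prompt)
        outputs.append(previous)
    return outputs

def fake_model(prompt: str) -> str:
    """Stand-in for a real model call: echoes a snippet so the data flow is visible."""
    return f"<output for: {prompt[:40]}>"

STEPS = [
    "Propose an architecture for: {task}",
    "Implement the components described here: {previous}",
    "Write tests for this code: {previous}",
    "Write documentation for this code and its tests: {previous}",
]

for number, output in enumerate(run_chain(fake_model, STEPS, "a task management API"), start=1):
    print(f"Step {number}: {output}")
```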


CONCLUSION: THE FUTURE OF PROMPT ENGINEERING


As AI models continue to evolve and improve, the art of prompt engineering will remain essential. The templates and techniques outlined in this article provide a solid foundation, but the field is constantly advancing. The most successful practitioners treat prompt engineering as an ongoing learning process, continuously experimenting with new approaches, documenting what works, and building increasingly sophisticated prompt libraries.


Remember that the best prompt is not necessarily the longest or most complex, but rather the one that most efficiently communicates your intent while providing the model with the context and constraints needed to generate excellent results. By mastering these parameterizable templates and understanding the principles behind effective prompting, you position yourself to leverage AI tools at their full potential, whether you are generating code, creating images, or producing video content. The future belongs to those who can effectively collaborate with AI systems, and that collaboration begins with excellent prompts.