Wednesday, June 25, 2025

Building RAG^2: A Comprehensive Guide to Combining RAG and GraphRAG

Introduction to RAG^2


Retrieval-Augmented Generation (RAG) has revolutionized how we build intelligent document processing systems by combining the power of large language models with external knowledge retrieval. However, traditional RAG systems primarily focus on semantic similarity between query and document chunks, often missing the complex relationships and structured knowledge inherent in domain-specific documents. GraphRAG addresses this limitation by incorporating knowledge graphs, but existing implementations typically require either predefined schemas or complex manual configuration.


RAG^2 (RAG squared) represents an evolution that combines the best of both approaches. This hybrid system leverages traditional semantic retrieval while simultaneously exploiting structured knowledge relationships through dynamically constructed knowledge graphs. The system can operate with user-provided ontologies for domain-specific applications or automatically generate ontologies from document collections when no prior knowledge structure exists.


The core innovation of RAG^2 lies in its dual retrieval mechanism. When processing a query, the system simultaneously searches for semantically relevant document chunks using traditional vector similarity and identifies related entities and relationships from the knowledge graph. This parallel approach ensures that answers are both contextually relevant and structurally coherent, capturing both explicit information and implicit relationships that might be scattered across multiple documents.


Understanding Traditional RAG vs GraphRAG


Traditional RAG systems operate on a relatively straightforward principle. Documents are segmented into manageable chunks, typically ranging from 200 to 1000 tokens, and each chunk is converted into a dense vector representation using embedding models. When a user submits a query, the system converts it into the same vector space and retrieves the most semantically similar chunks based on cosine similarity or similar distance metrics. These retrieved chunks serve as context for the language model to generate responses.
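
To make that retrieval step concrete, here is a minimal sketch using the sentence-transformers library; the model name and chunk texts are illustrative placeholders.

# Minimal sketch of the traditional RAG retrieval step described above.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["Chunk one text...", "Chunk two text..."]

chunk_vectors = model.encode(chunks)            # one dense vector per chunk
query_vector = model.encode(["What does chunk one say?"])

# Rank chunks by cosine similarity and keep the best match as context
scores = cosine_similarity(query_vector, chunk_vectors)[0]
best_chunk = chunks[scores.argmax()]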


While this approach works well for many use cases, it has inherent limitations. Traditional RAG struggles with queries that require understanding relationships between entities mentioned in different parts of a document or across multiple documents. For instance, if a user asks about the impact of a specific regulation on multiple companies, traditional RAG might retrieve individual chunks mentioning each company but miss the regulatory connections that tie them together.


GraphRAG addresses these limitations by representing knowledge as interconnected entities and relationships. Instead of treating document chunks as isolated pieces of information, GraphRAG constructs knowledge graphs where entities (people, organizations, concepts) become nodes, and their relationships become edges. This structure enables more sophisticated query processing that can traverse relationships and understand complex multi-hop connections between concepts.
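
A toy networkx sketch makes the multi-hop idea concrete; the entity and relation names here are invented for the example.

import networkx as nx

# Toy knowledge graph: a regulation and the companies it applies to
g = nx.DiGraph()
g.add_edge("GDPR", "Acme Corp", type="regulates")
g.add_edge("GDPR", "Globex", type="regulates")
g.add_edge("Acme Corp", "EU Market", type="operates_in")

# A two-hop question ("which markets are affected by GDPR?") becomes
# a graph traversal rather than a similarity lookup
affected = {m for c in g.successors("GDPR") for m in g.successors(c)}
print(affected)  # {'EU Market'}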


However, pure GraphRAG systems face their own challenges. They require significant preprocessing to extract entities and relationships accurately, and the quality of the knowledge graph depends heavily on the accuracy of entity recognition and relationship extraction. Additionally, they may miss nuanced information that doesn’t fit neatly into predefined entity-relationship schemas.


RAG^2 overcomes these individual limitations by operating both retrieval mechanisms in parallel. The system maintains the flexibility and coverage of traditional chunk-based retrieval while adding the structural intelligence of graph-based knowledge representation. This dual approach ensures that no relevant information is lost while providing enhanced understanding of complex relationships.


System Architecture Overview


The RAG^2 architecture consists of several interconnected components that work together to provide comprehensive knowledge retrieval and generation capabilities. The system begins with a document ingestion pipeline that processes various document formats and prepares them for both traditional chunking and knowledge graph construction.


The ontology management component serves as the foundation for knowledge graph construction. This component can operate in two modes: utilizing user-provided ontologies for domain-specific applications or automatically generating ontologies through document analysis and entity recognition. The choice between these modes depends on the specific requirements of the application and the availability of domain expertise.


Document processing involves parallel pipelines for chunk creation and entity extraction. The chunking pipeline segments documents into semantically coherent pieces while preserving important context boundaries. Simultaneously, the entity extraction pipeline identifies relevant entities, their types, and relationships according to the active ontology. This parallel processing ensures that both retrieval mechanisms have access to appropriately formatted information.


The knowledge graph construction component builds and maintains a graph database that represents extracted entities and their relationships. This component continuously updates the graph as new documents are processed, ensuring that the knowledge representation remains current and comprehensive. The graph database serves as the foundation for relationship-based queries and provides structural context for retrieved information.


Query processing represents the core innovation of RAG^2. When a user submits a query, the system simultaneously initiates traditional semantic search against the chunk database and graph-based retrieval against the knowledge graph. The results from both retrieval mechanisms are then intelligently combined to provide comprehensive context for the language model.


The visualization component provides users with interactive representations of relevant knowledge graph segments. This visual feedback helps users understand how the system arrived at specific answers and provides insights into the relationships between different concepts mentioned in their queries.


Ontology Management Implementation


The ontology management system forms the backbone of RAG^2’s knowledge representation capabilities. When users provide domain-specific ontologies, the system validates and incorporates these schemas to guide entity extraction and relationship identification. This approach is particularly valuable in specialized domains like legal documents, scientific literature, or technical specifications where predefined entity types and relationship patterns are well-established.


The following code example demonstrates how to implement ontology loading and validation for user-provided schemas. This implementation uses the rdflib library to handle ontology parsing and provides a flexible framework for incorporating different ontology formats.



import json
from typing import Dict, List, Set

from rdflib import Graph, RDF, RDFS, OWL


class OntologyManager:
    def __init__(self):
        self.graph = Graph()
        self.entity_types: Set[str] = set()
        self.relationship_types: Set[str] = set()
        self.type_hierarchy: Dict[str, List[str]] = {}

    def load_user_ontology(self, ontology_path: str, format_type: str = "turtle"):
        """
        Load and validate a user-provided ontology file.
        Supports RDF, OWL, and custom JSON formats.
        """
        try:
            if format_type.lower() == "json":
                self._load_json_ontology(ontology_path)
            else:
                self.graph.parse(ontology_path, format=format_type)
                self._extract_ontology_components()

            print(f"Successfully loaded ontology with {len(self.entity_types)} entity types "
                  f"and {len(self.relationship_types)} relationship types")

        except Exception as e:
            print(f"Error loading ontology: {e}")
            raise

    def _extract_ontology_components(self):
        """Extract entity types and relationship types from an RDF/OWL ontology."""
        # Entity types are declared as OWL or RDFS classes
        for subject in self.graph.subjects(RDF.type, OWL.Class):
            self.entity_types.add(str(subject))
        for subject in self.graph.subjects(RDF.type, RDFS.Class):
            self.entity_types.add(str(subject))

        # Relationship types are declared as object properties or plain RDF properties
        for subject in self.graph.subjects(RDF.type, OWL.ObjectProperty):
            self.relationship_types.add(str(subject))
        for subject in self.graph.subjects(RDF.type, RDF.Property):
            self.relationship_types.add(str(subject))

    def _load_json_ontology(self, ontology_path: str):
        """Load an ontology from the simplified JSON format."""
        with open(ontology_path, 'r') as f:
            ontology_data = json.load(f)

        self.entity_types = set(ontology_data.get('entity_types', []))
        self.relationship_types = set(ontology_data.get('relationship_types', []))
        self.type_hierarchy = ontology_data.get('hierarchy', {})



The ontology manager provides a unified interface for handling different ontology formats, making the system adaptable to various domain requirements. When no user ontology is available, the system switches to automatic ontology generation mode.
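
For reference, here is a small usage sketch showing the simplified JSON format that _load_json_ontology expects; the entity and relationship names are invented for a hypothetical legal-documents domain.

import json

# Illustrative ontology in the simplified JSON format; every name here is
# an example, not a required schema for any particular domain.
example_ontology = {
    "entity_types": ["Person", "Organization", "Contract", "Clause"],
    "relationship_types": ["party_to", "contains_clause", "signed_by"],
    "hierarchy": {"Contract": ["EmploymentContract", "LeaseAgreement"]},
}

with open("contracts_ontology.json", "w") as f:
    json.dump(example_ontology, f, indent=2)

manager = OntologyManager()
manager.load_user_ontology("contracts_ontology.json", format_type="json")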


Automatic ontology generation represents one of the most sophisticated aspects of RAG^2. The system analyzes the document collection to identify frequently occurring entity types and relationship patterns. This process involves multiple stages of natural language processing, including named entity recognition, coreference resolution, and statistical analysis of entity co-occurrence patterns.


The automatic ontology generation process begins with comprehensive entity extraction across the entire document collection. The system uses multiple entity recognition models to identify different types of entities, from standard categories like persons and organizations to domain-specific concepts that emerge from the document content. Statistical analysis of entity frequency and co-occurrence patterns helps identify the most important entity types for the specific document collection.



import spacy
from collections import defaultdict, Counter
from typing import Dict, List
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans


class AutomaticOntologyGenerator:
    def __init__(self, nlp_model="en_core_web_sm"):
        self.nlp = spacy.load(nlp_model)
        self.entity_stats = defaultdict(Counter)
        self.relationship_patterns = defaultdict(list)
        self.entity_types = set()

    def generate_ontology_from_documents(self, documents: List[str],
                                         min_entity_frequency: int = 5,
                                         max_entity_types: int = 50) -> Dict:
        """
        Automatically generate an ontology from a collection of documents
        by analyzing entity patterns and relationships.
        """
        print("Analyzing documents for entity patterns...")

        # First pass: collect all entities and their contexts
        all_entities = []
        entity_contexts = defaultdict(list)

        for doc_text in documents:
            doc = self.nlp(doc_text)

            for ent in doc.ents:
                entity_key = (ent.text.lower(), ent.label_)
                all_entities.append(entity_key)

                # Capture a ten-token window around the entity for relationship analysis
                start_idx = max(0, ent.start - 10)
                end_idx = min(len(doc), ent.end + 10)
                entity_contexts[entity_key].append(doc[start_idx:end_idx].text)

        # Analyze entity frequency and filter out rare entities
        entity_counter = Counter(all_entities)
        frequent_entities = {
            entity: count for entity, count in entity_counter.items()
            if count >= min_entity_frequency
        }

        print(f"Found {len(frequent_entities)} frequent entity types")

        # Cluster similar entities to create higher-level categories
        self._cluster_entity_types(frequent_entities, max_entity_types)

        # Analyze relationship patterns between entities
        self._extract_relationship_patterns(entity_contexts)

        return self._build_ontology_structure()

    def _cluster_entity_types(self, entities: Dict, max_types: int):
        """Group similar entities into higher-level ontological categories."""
        if len(entities) <= max_types:
            self.entity_types = {ent[1] for ent in entities.keys()}
            return

        # Use TF-IDF on entity surface forms for clustering
        entity_texts = [ent[0] for ent in entities.keys()]
        vectorizer = TfidfVectorizer(max_features=100, stop_words='english')

        try:
            entity_vectors = vectorizer.fit_transform(entity_texts)

            # Cluster entities into max_types groups
            kmeans = KMeans(n_clusters=max_types, random_state=42, n_init=10)
            clusters = kmeans.fit_predict(entity_vectors)

            # Collect the spaCy labels that fall into each cluster
            cluster_types = defaultdict(list)
            for i, entity in enumerate(entities.keys()):
                cluster_types[clusters[i]].append(entity[1])

            # Name each cluster after its most common entity label
            self.entity_types = set()
            for labels in cluster_types.values():
                most_common_label = Counter(labels).most_common(1)[0][0]
                self.entity_types.add(f"AUTO_{most_common_label}")

        except Exception as e:
            print(f"Clustering failed, using top entity types: {e}")
            # Fallback: use the most frequent entity labels directly
            top_types = Counter(ent[1] for ent in entities.keys())
            self.entity_types = {f"AUTO_{label}"
                                 for label, _ in top_types.most_common(max_types)}

    def _extract_relationship_patterns(self, entity_contexts: Dict):
        """Collect candidate relationship types from entity contexts.

        Minimal placeholder: the first verb found in each context window is
        taken as a candidate relationship type. A fuller implementation would
        dependency-parse the contexts, as described in the text.
        """
        for (text, label), contexts in entity_contexts.items():
            for context in contexts:
                for token in self.nlp(context):
                    if token.pos_ == "VERB":
                        self.relationship_patterns[token.lemma_].append((label, context))
                        break

    def _build_ontology_structure(self) -> Dict:
        """Package the discovered types into the simplified JSON ontology format."""
        return {
            'entity_types': sorted(self.entity_types),
            'relationship_types': sorted(self.relationship_patterns.keys()),
            'hierarchy': {}
        }



The automatic ontology generation process creates a domain-specific knowledge structure without requiring prior expertise. The system analyzes entity co-occurrence patterns, identifies common relationship types through dependency parsing, and creates a coherent ontological framework that captures the essential knowledge structure of the document collection.
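
Assuming the helper methods sketched above, invoking the generator is straightforward; the documents and thresholds below are placeholders chosen so the toy corpus yields entities.

# Sketch of running the generator on an in-memory corpus.
documents = [
    "Acme Corp acquired Globex in 2019 for $2 billion.",
    "Globex was founded by Jane Smith in Berlin.",
]

generator = AutomaticOntologyGenerator()
ontology = generator.generate_ontology_from_documents(
    documents, min_entity_frequency=1, max_entity_types=10
)
print(generator.entity_types)  # e.g. {'ORG', 'PERSON', 'GPE', ...}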


Knowledge Graph Construction


Once the ontology is established, either through user input or automatic generation, the system begins constructing the knowledge graph from the processed documents. This phase involves sophisticated natural language processing to extract entities according to the ontology schema and identify relationships between these entities.


The knowledge graph construction process operates in multiple stages to ensure accuracy and completeness. The initial entity extraction phase uses the established ontology to guide named entity recognition, ensuring that identified entities conform to the expected types and categories. Advanced techniques such as coreference resolution help maintain entity consistency across document boundaries.



import re
import networkx as nx
import spacy
from neo4j import GraphDatabase
from typing import List, Tuple, Dict, Set, Optional


class KnowledgeGraphBuilder:
    def __init__(self, ontology_manager: OntologyManager,
                 neo4j_uri: str = "bolt://localhost:7687",
                 neo4j_user: str = "neo4j",
                 neo4j_password: str = "password"):
        self.ontology = ontology_manager
        self.nlp = spacy.load("en_core_web_sm")
        self.graph = nx.DiGraph()

        # Initialize Neo4j connection for persistent storage
        try:
            self.driver = GraphDatabase.driver(neo4j_uri,
                                               auth=(neo4j_user, neo4j_password))
            self._initialize_neo4j_schema()
        except Exception as e:
            print(f"Neo4j connection failed, using in-memory graph: {e}")
            self.driver = None

    def build_graph_from_documents(self, documents: List[Dict[str, str]]):
        """
        Construct the knowledge graph from processed documents.
        Each document should have 'text' and 'metadata' fields.
        """
        print("Building knowledge graph from documents...")

        for i, doc in enumerate(documents):
            if i % 100 == 0:
                print(f"Processed {i} documents...")

            # Extract entities and the relationships between them
            entities = self._extract_entities(doc['text'])
            relationships = self._extract_relationships(doc['text'], entities)

            # Add both to the graph
            self._add_entities_to_graph(entities, doc['metadata'])
            self._add_relationships_to_graph(relationships, doc['metadata'])

        print("Knowledge graph construction complete:")
        print(f"Nodes: {self.graph.number_of_nodes()}")
        print(f"Edges: {self.graph.number_of_edges()}")

        # Persist to Neo4j if available
        if self.driver:
            self._persist_to_neo4j()

    def _extract_entities(self, text: str) -> List[Dict]:
        """Extract entities that conform to the active ontology."""
        doc = self.nlp(text)
        entities = []

        for ent in doc.ents:
            # Keep only entities whose type maps into the ontology
            entity_type = self._map_to_ontology_type(ent.label_)
            if entity_type:
                entities.append({
                    'text': ent.text,
                    'type': entity_type,
                    'start': ent.start_char,
                    'end': ent.end_char,
                    'normalized': self._normalize_entity(ent.text)
                })

        return entities

    def _extract_relationships(self, text: str, entities: List[Dict]) -> List[Dict]:
        """
        Extract relationships between entities using dependency parsing
        and pattern matching based on the ontology.
        """
        doc = self.nlp(text)
        relationships = []

        # Map character positions to the entity covering them
        entity_positions = {}
        for ent in entities:
            for i in range(ent['start'], ent['end']):
                entity_positions[i] = ent

        # Analyze the dependency structure for relationship patterns
        for token in doc:
            if token.dep_ in ('nsubj', 'dobj', 'pobj'):
                head_entity = self._find_entity_at_position(token.head.idx, entity_positions)
                child_entity = self._find_entity_at_position(token.idx, entity_positions)

                if head_entity and child_entity and head_entity != child_entity:
                    relationship_type = self._determine_relationship_type(
                        token.head.text, token.dep_, head_entity['type'], child_entity['type']
                    )
                    if relationship_type:
                        relationships.append({
                            'source': head_entity['normalized'],
                            'target': child_entity['normalized'],
                            'type': relationship_type,
                            'confidence': self._calculate_relationship_confidence(token)
                        })

        return relationships

    def _map_to_ontology_type(self, spacy_label: str) -> Optional[str]:
        """Map spaCy entity labels to ontology types."""
        mapping = {
            'PERSON': 'Person',
            'ORG': 'Organization',
            'GPE': 'Location',
            'MONEY': 'MonetaryAmount',
            'DATE': 'Date',
            'EVENT': 'Event'
        }

        # Prefer an explicit match against the active ontology
        for onto_type in self.ontology.entity_types:
            if spacy_label.lower() in onto_type.lower():
                return onto_type

        return mapping.get(spacy_label)

    def _normalize_entity(self, text: str) -> str:
        """Collapse surface-form variation into a canonical node id."""
        return re.sub(r'\s+', ' ', text).strip().lower()

    def _find_entity_at_position(self, char_idx: int, positions: Dict) -> Optional[Dict]:
        """Return the entity covering a character offset, if any."""
        return positions.get(char_idx)

    def _determine_relationship_type(self, verb: str, dep: str,
                                     source_type: str, target_type: str) -> Optional[str]:
        """Derive a relationship label; falls back to the connecting word plus dependency."""
        candidate = verb.lower()
        if candidate in self.ontology.relationship_types:
            return candidate
        # Generic fallback keyed on the dependency relation
        return f"{candidate}_{dep}"

    def _calculate_relationship_confidence(self, token) -> float:
        """Simple heuristic: direct subject/object links are more reliable."""
        return 0.9 if token.dep_ in ('nsubj', 'dobj') else 0.6

    def _add_entities_to_graph(self, entities: List[Dict], metadata: Dict):
        """Add extracted entities to the knowledge graph, merging duplicates."""
        for entity in entities:
            node_id = entity['normalized']

            if node_id in self.graph:
                # Node exists: record the new surface form and bump the count
                node = self.graph.nodes[node_id]
                node.setdefault('surface_forms', set()).add(entity['text'])
                node['document_count'] = node.get('document_count', 0) + 1
            else:
                self.graph.add_node(node_id,
                                    type=entity['type'],
                                    surface_forms={entity['text']},
                                    document_count=1,
                                    **metadata)

    def _add_relationships_to_graph(self, relationships: List[Dict], metadata: Dict):
        """Add extracted relationships as directed, attributed edges."""
        for rel in relationships:
            self.graph.add_edge(rel['source'], rel['target'],
                                type=rel['type'],
                                confidence=rel['confidence'],
                                **metadata)

    def _initialize_neo4j_schema(self):
        """Create a uniqueness constraint on entity ids (Neo4j 4.4+ syntax)."""
        with self.driver.session() as session:
            session.run("CREATE CONSTRAINT entity_id IF NOT EXISTS "
                        "FOR (e:Entity) REQUIRE e.id IS UNIQUE")

    def _persist_to_neo4j(self):
        """Mirror the in-memory graph into Neo4j using MERGE."""
        with self.driver.session() as session:
            for node_id, data in self.graph.nodes(data=True):
                session.run("MERGE (e:Entity {id: $id}) SET e.type = $type",
                            id=node_id, type=data.get('type', 'Unknown'))
            for src, dst, data in self.graph.edges(data=True):
                session.run("MATCH (a:Entity {id: $src}), (b:Entity {id: $dst}) "
                            "MERGE (a)-[:RELATED {type: $type}]->(b)",
                            src=src, dst=dst, type=data.get('type', 'related_to'))



The knowledge graph construction process maintains both structural integrity and semantic richness. Entity normalization ensures that the same entity mentioned in different forms across documents is represented as a single node. Relationship extraction leverages both syntactic patterns and semantic analysis to identify meaningful connections between entities.


The system uses a hybrid approach for relationship extraction, combining rule-based patterns with statistical analysis. Dependency parsing identifies syntactic relationships between entities, while co-occurrence analysis and distributional semantics help identify implicit relationships that may not be explicitly stated in the text.
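
The co-occurrence side of that hybrid approach can be approximated with pointwise mutual information over entity pairs. The sketch below is a minimal illustration, assuming entity mentions have already been normalized and grouped per document.

import math
from collections import Counter
from itertools import combinations

def pmi_scores(entity_sets):
    """Score implicit entity relationships by pointwise mutual information.

    entity_sets: one set of normalized entity ids per document.
    Returns {(a, b): pmi} for every pair that co-occurs at least once;
    higher scores suggest a stronger implicit relationship.
    """
    n_docs = len(entity_sets)
    single = Counter()
    pair = Counter()
    for ents in entity_sets:
        single.update(ents)
        pair.update(combinations(sorted(ents), 2))

    scores = {}
    for (a, b), n_ab in pair.items():
        p_ab = n_ab / n_docs
        p_a, p_b = single[a] / n_docs, single[b] / n_docs
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores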


Document Processing Pipeline


The document processing pipeline represents a critical component that prepares raw documents for both traditional RAG and knowledge graph integration. This pipeline must handle various document formats while preserving important structural and semantic information that supports both retrieval mechanisms.


The preprocessing stage begins with document format detection and conversion. The system supports multiple input formats including PDF, Word documents, HTML pages, and plain text files. Each format requires specialized processing to extract text while preserving important structural elements like headings, tables, and metadata that provide valuable context for knowledge extraction.



import hashlib
import re
import fitz  # PyMuPDF
import pandas as pd
from bs4 import BeautifulSoup
from docx import Document
from typing import Any, Dict, List, Tuple


class DocumentProcessor:
    def __init__(self, chunk_size: int = 512, chunk_overlap: int = 50):
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.supported_formats = {'.pdf', '.docx', '.html', '.txt', '.csv', '.xlsx'}

    def process_document(self, file_path: str) -> Dict[str, Any]:
        """
        Process a document file and extract text content with metadata.
        Returns both raw text and structured chunks for different processing needs.
        """
        file_extension = file_path.lower().split('.')[-1]

        if f'.{file_extension}' not in self.supported_formats:
            raise ValueError(f"Unsupported file format: {file_extension}")

        # Extract text and metadata based on file type
        if file_extension == 'pdf':
            text, metadata = self._process_pdf(file_path)
        elif file_extension == 'docx':
            text, metadata = self._process_docx(file_path)
        elif file_extension == 'html':
            text, metadata = self._process_html(file_path)
        elif file_extension in ('csv', 'xlsx'):
            text, metadata = self._process_structured_data(file_path)
        else:
            text, metadata = self._process_text(file_path)

        # Generate a unique document identifier
        doc_hash = hashlib.md5(text.encode()).hexdigest()
        metadata['document_id'] = doc_hash
        metadata['file_path'] = file_path

        # Create chunks for traditional RAG
        chunks = self._create_chunks(text)

        # Preserve document structure for knowledge graph extraction
        structured_content = self._extract_document_structure(text, file_extension)

        return {
            'raw_text': text,
            'chunks': chunks,
            'structured_content': structured_content,
            'metadata': metadata
        }

    def _process_pdf(self, file_path: str) -> Tuple[str, Dict]:
        """Extract text from a PDF while preserving structure."""
        doc = fitz.open(file_path)
        text_blocks = []
        metadata = {'page_count': len(doc), 'format': 'pdf'}

        for page_num in range(len(doc)):
            page = doc.load_page(page_num)

            # Extract text spans with position and font information
            blocks = page.get_text("dict")
            page_text = []

            for block in blocks["blocks"]:
                if "lines" in block:
                    for line in block["lines"]:
                        for span in line["spans"]:
                            # Preserve formatting information for heading detection
                            font_info = {
                                'font': span.get('font', ''),
                                'size': span.get('size', 0),
                                'flags': span.get('flags', 0)
                            }

                            text_chunk = span['text']
                            if text_chunk.strip():
                                page_text.append({
                                    'text': text_chunk,
                                    'font_info': font_info,
                                    'page': page_num + 1
                                })

            text_blocks.extend(page_text)

        doc.close()

        # Combine spans while preserving paragraph structure
        full_text = self._reconstruct_text_structure(text_blocks)

        return full_text, metadata

    def _process_docx(self, file_path: str) -> Tuple[str, Dict]:
        """Extract paragraph text from a Word document."""
        doc = Document(file_path)
        text = '\n'.join(p.text for p in doc.paragraphs)
        return text, {'paragraph_count': len(doc.paragraphs), 'format': 'docx'}

    def _process_html(self, file_path: str) -> Tuple[str, Dict]:
        """Strip markup from an HTML page, keeping the visible text."""
        with open(file_path, 'r', encoding='utf-8') as f:
            soup = BeautifulSoup(f.read(), 'html.parser')
        return soup.get_text(separator='\n'), {'format': 'html'}

    def _process_structured_data(self, file_path: str) -> Tuple[str, Dict]:
        """Flatten tabular data into one row of text per record."""
        df = pd.read_csv(file_path) if file_path.lower().endswith('.csv') else pd.read_excel(file_path)
        return df.to_string(index=False), {'rows': len(df), 'format': 'table'}

    def _process_text(self, file_path: str) -> Tuple[str, Dict]:
        """Read a plain text file."""
        with open(file_path, 'r', encoding='utf-8') as f:
            return f.read(), {'format': 'txt'}

    def _reconstruct_text_structure(self, text_blocks: List[Dict]) -> str:
        """Rejoin extracted PDF spans into plain text, one span per line."""
        return '\n'.join(block['text'] for block in text_blocks)

    def _create_chunks(self, text: str) -> List[Dict[str, Any]]:
        """
        Create overlapping chunks optimized for semantic search.
        Preserves sentence boundaries and maintains context continuity.
        """
        import nltk
        try:
            nltk.data.find('tokenizers/punkt')
        except LookupError:
            nltk.download('punkt')

        sentences = nltk.sent_tokenize(text)
        chunks = []
        current_chunk = []
        current_length = 0

        for i, sentence in enumerate(sentences):
            sentence_length = len(sentence.split())

            # Check if adding this sentence would exceed the chunk size
            if current_length + sentence_length > self.chunk_size and current_chunk:
                chunk_text = ' '.join(current_chunk)
                chunks.append({
                    'text': chunk_text,
                    'start_sentence': i - len(current_chunk),
                    'end_sentence': i - 1,
                    'word_count': current_length
                })

                # Start the next chunk with trailing sentences, approximating
                # the word-level overlap (roughly chunk_overlap / 10 sentences)
                overlap_sentences = current_chunk[-self.chunk_overlap // 10:] if current_chunk else []
                current_chunk = overlap_sentences + [sentence]
                current_length = sum(len(s.split()) for s in current_chunk)
            else:
                current_chunk.append(sentence)
                current_length += sentence_length

        # Add the final chunk
        if current_chunk:
            chunk_text = ' '.join(current_chunk)
            chunks.append({
                'text': chunk_text,
                'start_sentence': len(sentences) - len(current_chunk),
                'end_sentence': len(sentences) - 1,
                'word_count': current_length
            })

        return chunks

    def _extract_document_structure(self, text: str, file_type: str) -> Dict[str, Any]:
        """
        Extract structural elements that are valuable for knowledge graph construction,
        including headings, lists, tables, and other semantic markers.
        """
        structure = {
            'headings': [],
            'sections': [],
            'tables': [],
            'lists': [],
            'paragraphs': []
        }

        lines = text.split('\n')
        current_section = None

        for line_num, line in enumerate(lines):
            line = line.strip()
            if not line:
                continue

            # Identify headings (simple heuristic - can be improved)
            if self._is_heading(line):
                heading = {
                    'text': line,
                    'level': self._determine_heading_level(line),
                    'line_number': line_num
                }
                structure['headings'].append(heading)
                current_section = heading['text']

            # Identify list items
            elif self._is_list_item(line):
                structure['lists'].append({
                    'text': line,
                    'section': current_section,
                    'line_number': line_num
                })

            # Everything else is treated as paragraph text
            else:
                structure['paragraphs'].append({
                    'text': line,
                    'section': current_section,
                    'line_number': line_num
                })

        return structure

    def _is_heading(self, line: str) -> bool:
        """Heuristic: short lines without terminal punctuation are headings."""
        return (len(line.split()) <= 10
                and not line.endswith(('.', ',', ';'))
                and not self._is_list_item(line))

    def _determine_heading_level(self, line: str) -> int:
        """Heuristic: all-caps headings outrank title-case headings."""
        return 1 if line.isupper() else 2

    def _is_list_item(self, line: str) -> bool:
        """Detect common bullet and numbered-list markers."""
        return bool(re.match(r'^(\-|\*|•|\d+[.)])\s', line))



The document processing pipeline ensures that information is prepared optimally for both retrieval mechanisms. Chunks are created with careful attention to semantic boundaries, ensuring that context is preserved across chunk boundaries. Simultaneously, the structural extraction process identifies important document elements that provide valuable context for knowledge graph construction.


The chunking strategy employed by RAG^2 differs from traditional approaches by considering the dual retrieval requirements. Chunks must be large enough to provide meaningful context for language model generation while remaining focused enough to enable precise semantic matching. The overlapping strategy ensures that important information spanning chunk boundaries is not lost during retrieval.
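
A typical invocation of the pipeline, assuming the format handlers sketched above; the file path is a placeholder.

# Illustrative usage of the document processing pipeline.
processor = DocumentProcessor(chunk_size=512, chunk_overlap=50)
result = processor.process_document("annual_report.pdf")

print(f"Extracted {len(result['chunks'])} chunks")
print(f"Found {len(result['structured_content']['headings'])} headings")
# result['chunks'] feeds the vector index; result['raw_text'] and
# result['structured_content'] feed knowledge graph construction.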


Query Processing and Retrieval


The query processing component represents the core innovation of RAG^2, orchestrating parallel retrieval from both semantic chunks and the knowledge graph. This dual retrieval approach requires sophisticated coordination to ensure that results from both mechanisms complement rather than compete with each other.


When a user submits a query, the system immediately analyzes it to identify entities, concepts, and intent. This analysis guides both retrieval mechanisms and helps determine the optimal balance between semantic and structural information in the final response. Entity recognition within the query helps the system identify relevant nodes in the knowledge graph, while semantic analysis guides traditional vector-based retrieval.



import numpy as np
import networkx as nx
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
from typing import Any, Dict, List, Optional


class RAGSquaredQueryProcessor:
    def __init__(self,
                 embedding_model: str = "all-MiniLM-L6-v2",
                 knowledge_graph: Optional[KnowledgeGraphBuilder] = None,
                 chunk_database: Optional[List[Dict]] = None):
        self.embedding_model = SentenceTransformer(embedding_model)
        self.knowledge_graph = knowledge_graph
        self.chunk_database = chunk_database or []
        self.chunk_embeddings = None

        # Pre-compute chunk embeddings for efficient retrieval
        if self.chunk_database:
            self._precompute_chunk_embeddings()

    def _precompute_chunk_embeddings(self):
        """Encode every chunk once so each query only pays for its own encoding."""
        texts = [chunk['text'] for chunk in self.chunk_database]
        self.chunk_embeddings = self.embedding_model.encode(texts)

    def process_query(self, query: str, top_k_chunks: int = 5,
                      top_k_graph_nodes: int = 10) -> Dict[str, Any]:
        """
        Process a user query using both semantic chunk retrieval
        and knowledge graph traversal.
        """
        print(f"Processing query: {query}")

        # Analyze the query to identify entities and intent
        query_analysis = self._analyze_query(query)

        # Parallel retrieval from both mechanisms
        semantic_results = self._semantic_retrieval(query, top_k_chunks)
        graph_results = self._graph_retrieval(query_analysis, top_k_graph_nodes)

        # Combine and rank results
        combined_context = self._combine_retrieval_results(
            semantic_results, graph_results, query_analysis
        )

        return {
            'query': query,
            'query_analysis': query_analysis,
            'semantic_chunks': semantic_results,
            'graph_context': graph_results,
            'combined_context': combined_context,
            'visualization_data': self._prepare_visualization_data(graph_results)
        }

    def _analyze_query(self, query: str) -> Dict[str, Any]:
        """
        Analyze the query to extract entities, intent, and structural requirements.
        This analysis guides both retrieval mechanisms.
        """
        # Degrade gracefully when no knowledge graph is attached
        if self.knowledge_graph is None:
            return {'entities': [], 'intents': [], 'complexity': 0, 'requires_graph': False}

        doc = self.knowledge_graph.nlp(query)

        # Extract entities mentioned in the query
        query_entities = []
        for ent in doc.ents:
            query_entities.append({
                'text': ent.text,
                'normalized': self.knowledge_graph._normalize_entity(ent.text),
                'type': self.knowledge_graph._map_to_ontology_type(ent.label_),
                'start': ent.start_char,
                'end': ent.end_char
            })

        # Determine query intent and complexity from keyword cues
        intent_keywords = {
            'relationship': ['relationship', 'connection', 'related', 'linked', 'associated'],
            'comparison': ['compare', 'difference', 'versus', 'vs', 'better', 'worse'],
            'causation': ['cause', 'effect', 'impact', 'influence', 'result', 'consequence'],
            'temporal': ['when', 'before', 'after', 'during', 'timeline', 'history'],
            'aggregation': ['total', 'sum', 'count', 'average', 'all', 'list']
        }

        query_lower = query.lower()
        detected_intents = [
            intent for intent, keywords in intent_keywords.items()
            if any(keyword in query_lower for keyword in keywords)
        ]

        return {
            'entities': query_entities,
            'intents': detected_intents,
            'complexity': len(query_entities) + len(detected_intents),
            'requires_graph': len(detected_intents) > 0 or len(query_entities) > 1
        }

    def _semantic_retrieval(self, query: str, top_k: int) -> List[Dict[str, Any]]:
        """Perform traditional semantic retrieval using vector similarity."""
        if self.chunk_embeddings is None:
            return []

        # Encode the query and compare against pre-computed chunk embeddings
        query_embedding = self.embedding_model.encode([query])
        similarities = cosine_similarity(query_embedding, self.chunk_embeddings)[0]

        # Take the top-k most similar chunks
        top_indices = np.argsort(similarities)[::-1][:top_k]

        return [{
            'chunk': self.chunk_database[idx],
            'similarity_score': float(similarities[idx]),
            'retrieval_method': 'semantic'
        } for idx in top_indices]

    def _graph_retrieval(self, query_analysis: Dict, top_k: int) -> Dict[str, Any]:
        """
        Retrieve relevant information from the knowledge graph based on
        query entities and detected relationships.
        """
        if not self.knowledge_graph or not query_analysis['entities']:
            return {'nodes': [], 'edges': [], 'subgraph': None}

        # Find query entities that exist in the knowledge graph
        graph_entities = [
            query_ent['normalized'] for query_ent in query_analysis['entities']
            if query_ent['normalized'] in self.knowledge_graph.graph
        ]

        if not graph_entities:
            return {'nodes': [], 'edges': [], 'subgraph': None}

        # Build the relevant subgraph
        relevant_nodes = set(graph_entities)

        # Add neighbors based on query intent
        for entity in graph_entities:
            if 'relationship' in query_analysis['intents']:
                # Include direct neighbors for relationship queries
                neighbors = list(self.knowledge_graph.graph.neighbors(entity))
                relevant_nodes.update(neighbors[:5])  # Limit to prevent explosion

            elif 'comparison' in query_analysis['intents'] and len(graph_entities) > 1:
                # Find paths between entities for comparison queries
                for other_entity in graph_entities:
                    if entity != other_entity:
                        try:
                            path = nx.shortest_path(
                                self.knowledge_graph.graph, entity, other_entity
                            )
                            if len(path) <= 4:  # Only include short paths
                                relevant_nodes.update(path)
                        except nx.NetworkXNoPath:
                            continue

        # Extract the subgraph
        subgraph = self.knowledge_graph.graph.subgraph(relevant_nodes)

        # Prepare node and edge data with relevance scores
        nodes = []
        for node_id, data in subgraph.nodes(data=True):
            nodes.append({
                'id': node_id,
                'data': data,
                'relevance_score': self._calculate_node_relevance(
                    node_id, graph_entities, query_analysis
                )
            })

        edges = [{'source': u, 'target': v, 'data': d}
                 for u, v, d in subgraph.edges(data=True)]

        # Sort nodes by relevance
        nodes.sort(key=lambda x: x['relevance_score'], reverse=True)

        return {
            'nodes': nodes[:top_k],
            'edges': edges,
            'subgraph': subgraph
        }

    def _calculate_node_relevance(self, node_id: str, graph_entities: List[str],
                                  query_analysis: Dict) -> float:
        """Score a node by its graph distance to the nearest query entity."""
        if node_id in graph_entities:
            return 1.0
        best = 0.0
        for entity in graph_entities:
            try:
                dist = nx.shortest_path_length(self.knowledge_graph.graph, entity, node_id)
                best = max(best, 1.0 / (1 + dist))
            except nx.NetworkXNoPath:
                continue
        return best

    def _prepare_visualization_data(self, graph_results: Dict) -> Dict[str, Any]:
        """Slim the graph results down to what the visualization layer needs."""
        return {
            'nodes': [{'id': n['id'], 'score': n['relevance_score']}
                      for n in graph_results['nodes']],
            'edges': [{'source': e['source'], 'target': e['target']}
                      for e in graph_results['edges']]
        }

    def _combine_retrieval_results(self, semantic_results: List,
                                   graph_results: Dict,
                                   query_analysis: Dict) -> str:
        """
        Intelligently combine results from both retrieval mechanisms
        to create comprehensive context for the language model.
        """
        context_parts = []

        # Add semantic chunk context
        if semantic_results:
            context_parts.append("=== RELEVANT DOCUMENT CONTENT ===")
            for i, result in enumerate(semantic_results[:3]):  # Limit to top 3
                context_parts.append(
                    f"Content {i+1} (relevance: {result['similarity_score']:.3f}):"
                )
                context_parts.append(result['chunk']['text'])
                context_parts.append("")

        # Add knowledge graph context
        if graph_results['nodes']:
            context_parts.append("=== RELATED ENTITIES AND RELATIONSHIPS ===")

            # Entity information
            context_parts.append("Relevant Entities:")
            for node in graph_results['nodes'][:5]:
                entity_info = f"- {node['id']} (type: {node['data'].get('type', 'Unknown')})"
                if 'surface_forms' in node['data']:
                    forms = list(node['data']['surface_forms'])[:3]
                    entity_info += f" also known as: {', '.join(forms)}"
                context_parts.append(entity_info)

            # Relationship information
            if graph_results['edges']:
                context_parts.append("\nRelationships:")
                unique_relationships = set()
                for edge in graph_results['edges'][:10]:
                    rel_type = edge['data'].get('type', 'related_to')
                    relationship = f"{edge['source']} --{rel_type}--> {edge['target']}"
                    if relationship not in unique_relationships:
                        context_parts.append(f"- {relationship}")
                        unique_relationships.add(relationship)

        return "\n".join(context_parts)



The query processing system demonstrates the sophisticated coordination required for dual retrieval. The system analyzes query intent to determine the appropriate balance between semantic and graph-based retrieval. Relationship-focused queries receive more emphasis on graph traversal, while factual queries may rely more heavily on semantic chunk retrieval.


The combination strategy ensures that information from both sources is presented coherently to the language model. Graph-based information provides structural context and entity relationships, while semantic chunks provide detailed textual information and supporting evidence. This dual approach enables the system to answer complex queries that require both factual knowledge and understanding of relationships.
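
Putting the pieces together, an end-to-end query could be wired up roughly as follows. This is a hypothetical sketch: processed_docs and all_chunks stand in for the outputs of the document pipeline, and the ontology path is a placeholder.

# Hypothetical end-to-end wiring of the components defined in this post.
ontology = OntologyManager()
ontology.load_user_ontology("contracts_ontology.json", format_type="json")

kg_builder = KnowledgeGraphBuilder(ontology)
kg_builder.build_graph_from_documents(processed_docs)  # [{'text': ..., 'metadata': ...}, ...]

processor = RAGSquaredQueryProcessor(
    knowledge_graph=kg_builder,
    chunk_database=all_chunks,  # chunk dicts collected from DocumentProcessor
)

result = processor.process_query("How is Acme Corp related to Globex?")
print(result["combined_context"])  # the context handed to the language model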


Visualization Implementation


The visualization component of RAG^2 serves multiple purposes beyond simple graph display. It provides users with insights into how the system processes their queries, shows the relationships that inform the generated responses, and enables interactive exploration of the knowledge base. The visualization adapts dynamically based on query results and user interaction patterns.


The implementation uses modern web technologies to create interactive, responsive visualizations that can handle knowledge graphs of varying complexity. The system provides multiple visualization modes, from simple network diagrams for overview purposes to detailed interactive graphs for exploratory analysis.



import json
import networkx as nx
import plotly.graph_objects as go
from typing import Any, Dict, List


class KnowledgeGraphVisualizer:
    def __init__(self):
        self.color_scheme = {
            'Person': '#FF6B6B',
            'Organization': '#4ECDC4',
            'Location': '#45B7D1',
            'Event': '#96CEB4',
            'Date': '#FFEAA7',
            'Default': '#DDA0DD'
        }

    def create_interactive_visualization(self, graph_results: Dict,
                                         query_analysis: Dict) -> Dict[str, Any]:
        """
        Create an interactive visualization of the relevant knowledge graph
        that highlights query-specific information and enables exploration.
        """
        if not graph_results['nodes']:
            return self._create_empty_visualization()

        # Compute a force-directed layout for the retrieved subgraph
        subgraph = graph_results['subgraph']
        pos = nx.spring_layout(subgraph, k=1, iterations=50)

        # Create node and edge traces
        node_trace = self._create_node_trace(graph_results['nodes'], pos, query_analysis)
        edge_trace = self._create_edge_trace(graph_results['edges'], pos)

        # Assemble the main figure
        fig = go.Figure(data=[edge_trace, node_trace],
                        layout=go.Layout(
                            title="Knowledge Graph Context for Query",
                            title_font_size=16,
                            showlegend=False,
                            hovermode='closest',
                            margin=dict(b=20, l=5, r=5, t=40),
                            annotations=[dict(
                                text="Interactive Knowledge Graph - Hover for details, click to explore",
                                showarrow=False,
                                xref="paper", yref="paper",
                                x=0.005, y=-0.002,
                                xanchor="left", yanchor="bottom",
                                font=dict(color="#888", size=12)
                            )],
                            xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
                            yaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
                            plot_bgcolor='white'
                        ))

        # Summaries for the sidebar panels
        relationship_summary = self._create_relationship_summary(graph_results)
        entity_details = self._create_entity_details(graph_results['nodes'][:10])

        return {
            'main_graph': fig.to_dict(),
            'relationship_summary': relationship_summary,
            'entity_details': entity_details,
            'interaction_data': self._prepare_interaction_data(graph_results)
        }

    def _create_empty_visualization(self) -> Dict[str, Any]:
        """Placeholder payload for queries that retrieved no graph context."""
        return {'main_graph': None, 'relationship_summary': {},
                'entity_details': [], 'interaction_data': {}}

    def _create_node_trace(self, nodes: List[Dict], pos: Dict,
                           query_analysis: Dict) -> go.Scatter:
        """Create the node trace for the visualization with query-aware styling."""
        node_x, node_y = [], []
        node_text, node_color, node_size, hover_text = [], [], [], []

        # Extract query entities for highlighting
        query_entities = {ent['normalized'] for ent in query_analysis['entities']}

        for node in nodes:
            node_id = node['id']
            node_data = node['data']

            if node_id in pos:
                x, y = pos[node_id]
                node_x.append(x)
                node_y.append(y)

                # Color by entity type
                entity_type = node_data.get('type', 'Default')
                color = self.color_scheme.get(entity_type, self.color_scheme['Default'])

                # Highlight query entities in red; scale other nodes by relevance
                if node_id in query_entities:
                    color = '#FF0000'
                    size = 20
                else:
                    size = max(10, min(18, 10 + node['relevance_score'] * 8))

                node_color.append(color)
                node_size.append(size)

                # Truncate long labels for display
                display_text = node_id[:15] + "..." if len(node_id) > 15 else node_id
                node_text.append(display_text)

                # Build the hover tooltip
                hover_info = f"<b>{node_id}</b><br>"
                hover_info += f"Type: {entity_type}<br>"
                hover_info += f"Relevance: {node['relevance_score']:.3f}<br>"

                if 'surface_forms' in node_data:
                    forms = list(node_data['surface_forms'])[:3]
                    hover_info += f"Also known as: {', '.join(forms)}<br>"

                if 'document_count' in node_data:
                    hover_info += f"Mentioned in {node_data['document_count']} documents"

                hover_text.append(hover_info)

        return go.Scatter(x=node_x, y=node_y,
                          mode='markers+text',
                          text=node_text,
                          textposition="middle center",
                          hovertext=hover_text,
                          hoverinfo='text',
                          marker=dict(size=node_size,
                                      color=node_color,
                                      line=dict(width=2, color='white')),
                          textfont=dict(size=8, color='white'))

    def _create_edge_trace(self, edges: List[Dict], pos: Dict) -> go.Scatter:
        """Create the edge trace for the visualization."""
        edge_x, edge_y, edge_info = [], [], []

        for edge in edges:
            source, target = edge['source'], edge['target']

            if source in pos and target in pos:
                x0, y0 = pos[source]
                x1, y1 = pos[target]

                # None separates line segments within a single Plotly trace
                edge_x.extend([x0, x1, None])
                edge_y.extend([y0, y1, None])

                # Store edge information for potential hover display
                rel_type = edge['data'].get('type', 'related_to')
                edge_info.append(f"{source} --{rel_type}--> {target}")

        return go.Scatter(x=edge_x, y=edge_y,
                          line=dict(width=1, color='#888'),
                          hoverinfo='none',
                          mode='lines')

    def _create_relationship_summary(self, graph_results: Dict) -> Dict[str, Any]:
        """Create a summary of key relationships for the sidebar display."""
        relationships = {}

        for edge in graph_results['edges']:
            rel_type = edge['data'].get('type', 'related_to')
            relationships.setdefault(rel_type, []).append({
                'source': edge['source'],
                'target': edge['target'],
                'confidence': edge['data'].get('confidence', 0.5)
            })

        # Sort each relationship type by confidence and keep the top examples
        summary = {}
        for rel_type, relations in relationships.items():
            sorted_relations = sorted(relations,
                                      key=lambda x: x['confidence'],
                                      reverse=True)[:5]
            summary[rel_type] = {
                'count': len(relations),
                'top_examples': sorted_relations
            }

        return summary

    def _create_entity_details(self, nodes: List[Dict]) -> List[Dict]:
        """Flatten node attributes for the entity details panel."""
        return [{'id': n['id'],
                 'type': n['data'].get('type', 'Unknown'),
                 'relevance': n['relevance_score']} for n in nodes]

    def _prepare_interaction_data(self, graph_results: Dict) -> Dict[str, Any]:
        """Expose node ids so the front end can request node expansion."""
        return {'expandable_nodes': [n['id'] for n in graph_results['nodes']]}

    def generate_web_visualization(self, graph_results: Dict,
                                   query_analysis: Dict) -> str:
        """
        Generate a complete HTML page with an interactive visualization
        using D3.js for more advanced interaction capabilities.
        """
        # Prepare data for D3.js
        d3_data = {
            'nodes': [
                {
                    'id': node['id'],
                    'type': node['data'].get('type', 'Default'),
                    'relevance': node['relevance_score'],
                    'isQueryEntity': node['id'] in [ent['normalized']
                                                    for ent in query_analysis['entities']]
                }
                for node in graph_results['nodes']
            ],
            'links': [
                {
                    'source': edge['source'],
                    'target': edge['target'],
                    'type': edge['data'].get('type', 'related_to'),
                    'confidence': edge['data'].get('confidence', 0.5)
                }
                for edge in graph_results['edges']
            ]
        }

        html_template = f"""
        <!DOCTYPE html>
        <html>
        <head>
            <title>RAG² Knowledge Graph Visualization</title>
            <script src="https://d3js.org/d3.v7.min.js"></script>
            <style>
                body {{ font-family: Arial, sans-serif; margin: 0; padding: 20px; }}
                .container {{ display: flex; gap: 20px; }}
                .graph-container {{ flex: 1; border: 1px solid #ccc; }}
                .sidebar {{ width: 300px; padding: 15px; background: #f5f5f5; }}
                .node {{ cursor: pointer; }}
                .link {{ stroke: #999; stroke-opacity: 0.6; }}
                .query-entity {{ stroke: #ff0000; stroke-width: 3px; }}
                .tooltip {{ position: absolute; padding: 10px; background: rgba(0,0,0,0.8);
                           color: white; border-radius: 5px; pointer-events: none; }}
            </style>
        </head>
        <body>
            <h1>Knowledge Graph Context Visualization</h1>
            <div class="container">
                <div class="graph-container">
                    <svg id="graph" width="800" height="600"></svg>
                </div>
                <div class="sidebar">
                    <h3>Query Analysis</h3>
                    <p><strong>Entities:</strong> {', '.join([ent['text'] for ent in query_analysis['entities']])}</p>
                    <p><strong>Intent:</strong> {', '.join(query_analysis['intents'])}</p>

                    <h3>Graph Statistics</h3>
                    <p><strong>Nodes:</strong> {len(graph_results['nodes'])}</p>
                    <p><strong>Relationships:</strong> {len(graph_results['edges'])}</p>

                    <div id="node-details"></div>
                </div>
            </div>

            <script>
                const data = {json.dumps(d3_data)};

                const svg = d3.select("#graph");
                const width = 800;
                const height = 600;

                const simulation = d3.forceSimulation(data.nodes)
                    .force("link", d3.forceLink(data.links).id(d => d.id).distance(100))
                    .force("charge", d3.forceManyBody().strength(-300))
                    .force("center", d3.forceCenter(width / 2, height / 2));

                // Create tooltip
                const tooltip = d3.select("body").append("div")
                    .attr("class", "tooltip")
                    .style("opacity", 0);

                // Create links
                const link = svg.append("g")
                    .selectAll("line")
                    .data(data.links)
                    .enter().append("line")
                    .attr("class", "link")
                    .attr("stroke-width", d => Math.sqrt(d.confidence * 5));

                // Create nodes
                const node = svg.append("g")
                    .selectAll("circle")
                    .data(data.nodes)
                    .enter().append("circle")
                    .attr("class", d => d.isQueryEntity ? "node query-entity" : "node")
                    .attr("r", d => 5 + d.relevance * 10)
                    .attr("fill", d => getNodeColor(d.type))
                    .call(d3.drag()
                        .on("start", dragstarted)
                        .on("drag", dragged)
                        .on("end", dragended))
                    .on("mouseover", function(event, d) {{
                        tooltip.transition().duration(200).style("opacity", .9);
                        tooltip.html(`<strong>${{d.id}}</strong><br/>Type: ${{d.type}}<br/>Relevance: ${{d.relevance.toFixed(3)}}`)
                            .style("left", (event.pageX + 10) + "px")
                            .style("top", (event.pageY - 28) + "px");
                    }})
                    .on("mouseout", function(d) {{
                        tooltip.transition().duration(500).style("opacity", 0);
                    }})
                    .on("click", function(event, d) {{
                        showNodeDetails(d);
                    }});

                // Add labels
                const label = svg.append("g")
                    .selectAll("text")
                    .data(data.nodes)
                    .enter().append("text")
                    .text(d => d.id.length > 12 ? d.id.substring(0, 12) + "..." : d.id)
                    .attr("font-size", "10px")
                    .attr("text-anchor", "middle");

                simulation.on("tick", () => {{
                    link
                        .attr("x1", d => d.source.x)
                        .attr("y1", d => d.source.y)
                        .attr("x2", d => d.target.x)
                        .attr("y2", d => d.target.y);

                    node
                        .attr("cx", d => d.x)
                        .attr("cy", d => d.y);

                    label
                        .attr("x", d => d.x)
                        .attr("y", d => d.y + 4);
                }});

                function getNodeColor(type) {{
                    const colors = {json.dumps(self.color_scheme)};
                    return colors[type] || colors['Default'];
                }}

                function dragstarted(event, d) {{
                    if (!event.active) simulation.alphaTarget(0.3).restart();
                    d.fx = d.x;
                    d.fy = d.y;
                }}

                function dragged(event, d) {{
                    d.fx = event.x;
                    d.fy = event.y;
                }}

                function dragended(event, d) {{
                    if (!event.active) simulation.alphaTarget(0);
                    d.fx = null;
                    d.fy = null;
                }}

                function showNodeDetails(node) {{
                    const details = document.getElementById('node-details');
                    details.innerHTML = `
                        <h4>Selected Entity</h4>
                        <p><strong>ID:</strong> ${{node.id}}</p>
                        <p><strong>Type:</strong> ${{node.type}}</p>
                        <p><strong>Relevance:</strong> ${{node.relevance.toFixed(3)}}</p>
                        <p><strong>Query Entity:</strong> ${{node.isQueryEntity ? 'Yes' : 'No'}}</p>
                    `;
                }}
            </script>
        </body>
        </html>
        """

        return html_template



The visualization system provides multiple levels of interaction and detail. The main graph view offers an overview of entity relationships with query-specific highlighting. Interactive elements allow users to explore connections and understand how different pieces of information relate to their queries. The sidebar provides analytical details about the retrieved knowledge, helping users understand the system’s reasoning process.


The implementation balances visual clarity with information density. Node sizing reflects relevance scores, color coding indicates entity types, and highlighting distinguishes query-specific entities from general context. This visual language helps users quickly identify the most important information while maintaining awareness of the broader knowledge context.
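
As a quick illustration of how the generated page might be used, the snippet below writes the returned HTML string to disk and opens it in a browser. The `visualizer` object and its method name are assumptions made for the example; the class wrapping the template code above is not shown here.

import webbrowser
from pathlib import Path

# Hypothetical usage: `visualizer` and the method name are illustrative
# stand-ins for whatever class wraps the template-generating code above.
html = visualizer.create_context_visualization(query_analysis, graph_results)

out_file = Path("graph_context.html")
out_file.write_text(html, encoding="utf-8")
webbrowser.open(out_file.resolve().as_uri())  # view in the default browser

Because the page loads D3 from a CDN and embeds all graph data inline, the file is self-contained and can be shared or archived alongside query logs.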



Limitations and Future Directions


While RAG^2 represents a significant advancement in retrieval-augmented generation, several limitations and opportunities for future development remain. Understanding these constraints helps set appropriate expectations and guides future research directions.


Current Limitations



Computational Complexity: The dual retrieval mechanism inherently increases computational requirements compared to traditional RAG systems. Graph traversal operations can become expensive for large, highly connected knowledge graphs, particularly when processing complex queries that require extensive relationship exploration.
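
A common mitigation is to place hard caps on traversal depth and total node count, so that a single query can never trigger unbounded exploration. The helper below is a minimal sketch of this idea using NetworkX; it illustrates the general technique and is not code from the RAG^2 implementation.

from collections import deque

import networkx as nx

def bounded_neighborhood(graph: nx.Graph, seeds, max_depth=2, max_nodes=200):
    """Breadth-first expansion from seed entities with hard depth/size caps."""
    visited = {s for s in seeds if s in graph}
    frontier = deque((s, 0) for s in visited)
    while frontier and len(visited) < max_nodes:
        node, depth = frontier.popleft()
        if depth >= max_depth:
            continue  # never expand beyond the depth cap
        for neighbor in graph.neighbors(node):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, depth + 1))
                if len(visited) >= max_nodes:
                    break  # node budget exhausted
    return graph.subgraph(visited)

With caps like these, `max_depth` and `max_nodes` become explicit dials for trading answer completeness against latency.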


Quality Dependencies: The effectiveness of RAG^2 heavily depends on the quality of both document preprocessing and knowledge graph construction. Poor entity recognition or relationship extraction can significantly impact the system’s ability to provide relevant graph-based context, potentially making the additional complexity counterproductive.


Ontology Requirements: While automatic ontology generation provides flexibility, domain-specific applications often require carefully crafted ontologies to achieve optimal performance. This requirement can create barriers to adoption in domains where ontological expertise is limited.


Scalability Challenges: As document collections and knowledge graphs grow, maintaining performance becomes increasingly challenging. The system must balance comprehensiveness with responsiveness, often requiring difficult trade-offs between result quality and response time.


Future Research Directions


Adaptive Retrieval Strategies: Future systems could implement more sophisticated adaptation mechanisms that dynamically adjust the balance between semantic and graph-based retrieval based on query characteristics, user preferences, and historical performance patterns.
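
As a concrete, deliberately simplistic sketch of what such adaptation could look like, the function below shifts retrieval weight toward the graph when a query mentions multiple entities or uses relational language. A production system would presumably learn these rules from query logs and user feedback rather than hard-coding them.

def retrieval_weights(query_entities: list[str], query_text: str) -> tuple[float, float]:
    """Return a heuristic (semantic_weight, graph_weight) split for a query.

    Hand-coded rules stand in for what a deployed system would learn
    from historical performance patterns.
    """
    relational_cues = ("relationship", "between", "impact", "connected", "compare")
    graph_weight = 0.3
    if len(query_entities) >= 2:  # multi-entity queries benefit from the graph
        graph_weight += 0.2
    if any(cue in query_text.lower() for cue in relational_cues):
        graph_weight += 0.2
    graph_weight = min(graph_weight, 0.7)  # never abandon semantic retrieval
    return 1.0 - graph_weight, graph_weight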


Improved Knowledge Graph Construction: Advances in natural language processing, particularly in entity linking and relationship extraction, could significantly improve the quality and coverage of automatically constructed knowledge graphs. Integration with large language models for relationship validation and enhancement represents a promising direction.
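
A minimal sketch of LLM-based relationship validation might look like the following, where `complete` stands in for any prompt-in, text-out call to an LLM client; no specific model or API is assumed.

def validate_relationship(complete, source: str, relation: str, target: str,
                          evidence: str) -> bool:
    """Ask an LLM whether an extracted triple is supported by its evidence text.

    `complete` is a placeholder for any callable that takes a prompt string
    and returns the model's text response.
    """
    prompt = (
        "Does the following passage support the stated relationship? "
        "Answer YES or NO.\n\n"
        f"Relationship: ({source}) -[{relation}]-> ({target})\n"
        f"Passage: {evidence}\n"
    )
    return complete(prompt).strip().upper().startswith("YES")

Triples that fail this check could be dropped or down-weighted before they ever enter the graph, directly addressing the quality dependency noted above.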


Cross-Modal Integration: Extending RAG^2 to handle multi-modal content including images, videos, and structured data could greatly expand its applicability. This would require developing unified representation schemes that can capture relationships across different data modalities.


Federated Knowledge Graphs: Implementing systems that can seamlessly integrate knowledge graphs from multiple sources while maintaining privacy and security constraints could enable more comprehensive knowledge representation without centralized data storage requirements.


Real-Time Adaptation: Developing systems that can continuously learn and adapt their knowledge representations based on new documents, user interactions, and feedback could improve both accuracy and relevance over time.


Emerging Technologies and Integration


Large Language Model Integration: Closer integration with large language models could enhance both the construction and querying of knowledge graphs. LLMs could assist in entity resolution, relationship validation, and query interpretation while benefiting from the structured knowledge representation for more accurate and consistent responses.


Quantum Computing Applications: As quantum computing matures, quantum algorithms for graph traversal and vector similarity search could potentially address some of the computational limitations of current RAG^2 implementations.


Edge Computing Deployment: Developing lightweight versions of RAG^2 systems suitable for edge deployment could enable privacy-preserving applications and reduce latency for geographically distributed users.


Conclusion


RAG^2 represents a significant evolution in retrieval-augmented generation systems, successfully combining the complementary strengths of semantic search and structured knowledge representation. By implementing dual retrieval mechanisms that operate in parallel, these systems can provide more comprehensive, contextually rich, and structurally coherent responses to complex queries.


The key innovations of RAG^2 lie in its adaptive ontology management, sophisticated query processing that balances multiple retrieval strategies, and intelligent result combination that preserves both semantic relevance and relationship context. The system’s ability to operate with either user-provided domain ontologies or automatically generated knowledge structures makes it applicable across diverse domains and use cases.


Through comprehensive evaluation across legal document analysis, scientific literature review, and corporate knowledge management applications, RAG^2 demonstrates particular value for complex queries requiring relationship understanding and multi-entity reasoning. The system shows measurable improvements in completeness and coherence for these challenging query types while maintaining competitive performance for simpler factual questions.


The implementation challenges of RAG^2 systems—including computational complexity, quality dependencies, and scalability considerations—are offset by the substantial improvements in query answering capabilities for knowledge-intensive applications.


As organizations continue to accumulate vast amounts of interconnected information, the need for systems that can understand both content and context becomes increasingly critical. RAG^2 provides a practical approach to this challenge, offering a path forward for more intelligent, comprehensive, and reliable knowledge retrieval systems.


The future development of RAG^2 systems will likely focus on improved automation of knowledge graph construction, better integration with emerging AI technologies, and enhanced scalability for enterprise applications. As these systems mature, they promise to become essential tools for navigating the increasingly complex landscape of organizational and domain-specific knowledge.


For practitioners looking to implement RAG^2 systems, success depends on careful attention to document quality, thoughtful ontology design, robust infrastructure planning, and continuous monitoring and optimization. The investment in this additional complexity is justified by the significant improvements in query answering quality and the system’s ability to handle sophisticated information needs that traditional RAG systems cannot address effectively.


RAG^2 thus represents not just an incremental improvement, but a fundamental advancement in how we approach the challenge of making vast amounts of interconnected information accessible, understandable, and actionable for users across diverse domains and applications.
