Friday, May 16, 2025

The Emergence of AI-Powered Coding Assistants: A Deep Dive into CodeAgent

Introduction 

In the rapidly evolving landscape of software development, artificial intelligence has begun to transform how programmers approach their craft. Among the most promising developments is the rise of AI-powered coding assistants - intelligent systems designed to augment human developers by providing real-time analysis, generating code, and serving as knowledgeable programming partners. Today, we explore one such implementation: CodeAgent, a specialized AI assistant built on the Mistral language model through Ollama that focuses exclusively on helping programmers write, analyze, and improve their code.


Understanding AI Agents


Before delving into the specifics of CodeAgent, it's important to understand what an AI agent actually is. An AI agent represents a significant evolution beyond the traditional chatbot or simple query-response system. Where earlier systems were designed primarily to answer questions in isolation, an AI agent possesses a degree of autonomy and persistence, capable of taking actions on behalf of users and maintaining context through extended interactions.


An AI agent combines several key capabilities that distinguish it from simpler AI systems. It maintains awareness of its environment, processes user inputs through sophisticated language understanding, makes decisions about appropriate actions, executes those actions independently, and learns from interactions to improve future performance. In the context of programming, this translates to a system that can understand code, identify issues, suggest improvements, and even generate new implementations based on natural language descriptions.


The most advanced AI agents implement a form of reasoning that allows them to break complex problems down into manageable steps, evaluate alternative approaches, and select optimal solutions. They also maintain a conversational memory, allowing them to reference previous exchanges and build upon established context rather than treating each interaction as isolated.


Introducing CodeAgent


CodeAgent represents an implementation of these agent principles specifically oriented toward programming tasks. Built using Python and leveraging the powerful Mistral language model through the Ollama framework, CodeAgent offers programmers a specialized assistant that focuses exclusively on code-related tasks. Unlike general-purpose AI assistants, CodeAgent's training and system prompts are optimized for understanding programming concepts, recognizing code patterns, and generating high-quality, well-documented source code across numerous programming languages.


The application provides several core functionalities that make it valuable to developers of varying experience levels. It can thoroughly analyze existing code to identify strengths, weaknesses, potential bugs, and optimization opportunities. It generates fresh code implementations based on natural language descriptions of desired functionality. It searches the internet for programming references, ensuring its recommendations reflect current best practices. Perhaps most importantly, it engages in ongoing, interactive dialogue with programmers about their code and development challenges.


Prerequisites and System Requirements


Before using CodeAgent, several prerequisites must be satisfied. First and foremost, the system requires Python 3.8 or higher installed on your computer. The application has been designed to work across multiple operating systems including Linux, macOS, and Windows, though the user experience may vary slightly between platforms.


Ollama serves as the foundational infrastructure for running the Mistral language model locally, and it must be installed and properly configured on your system before CodeAgent can function. Ollama runs models behind a lightweight local server, eliminating the need for cloud-based API calls and ensuring all processing happens on your own hardware. The official Ollama installation guide at https://github.com/ollama/ollama provides detailed instructions for each operating system.


Several Python packages are required for CodeAgent to function properly: the ollama package for interacting with the Ollama service, requests for handling HTTP communications during web searches, and beautifulsoup4 for parsing web content. These can be installed via pip, Python's package manager. The script also uses argparse for its command-line arguments, but that module ships with the Python standard library and needs no separate installation.


Hardware requirements vary depending on the chosen configuration. At minimum, running the Mistral model through Ollama requires approximately 8GB of RAM, though 16GB or more is recommended for optimal performance. Storage requirements include approximately 4-8GB for the Mistral model itself, which will be downloaded the first time CodeAgent runs. For those wishing to leverage GPU acceleration, an NVIDIA GPU with CUDA support or an Apple Silicon processor with Metal support is required, though the application will fall back to CPU processing if no compatible GPU is detected.


Internet connectivity is required for the web search functionality and for the initial model download, though once the model is downloaded, the core code analysis and generation features can function offline.


Installation and Setup


Setting up CodeAgent involves several straightforward steps. First, ensure that Python 3.8+ is installed on your system by running `python --version` in your terminal or command prompt. If Python is not installed or is an older version, download and install the latest version from the official Python website.


Next, install Ollama by following the instructions for your operating system at the Ollama GitHub repository. For Linux systems, this typically involves running a shell script. For macOS, Ollama can be installed via Homebrew or by downloading the application directly. For Windows, an installer is provided.
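As a convenience, a typical Linux install plus an optional pre-download of the Mistral model looks like the following (these commands reflect the Ollama project at the time of writing; check the repository if they have changed):

curl -fsSL https://ollama.com/install.sh | sh
ollama pull mistral

Pre-pulling the model is optional but means the first run of CodeAgent won't pause for a multi-gigabyte download.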


Once Ollama is installed, save the codeagent.py file (reproduced in full at the end of this post) to your preferred location. This single file contains the entire implementation and can be placed anywhere on your filesystem that you have write access to. No additional configuration files are required.


Install the required Python dependencies by running the following command in your terminal:


pip install ollama requests beautifulsoup4


At this point, the basic setup is complete and CodeAgent is ready to use. The first time you run the application, it will automatically download the Mistral model through Ollama, which may take several minutes depending on your internet connection speed. This download only happens once; subsequent runs will use the locally stored model.
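If you'd like to verify the setup before launching CodeAgent, the Ollama CLI can list the models it has stored locally; after the first run (or a manual pull) you should see mistral in the output:

ollama list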


Running CodeAgent


To start CodeAgent with default settings, simply navigate to the directory containing codeagent.py in your terminal and run:


python codeagent.py



This launches an interactive session where you can enter queries and receive responses. The application displays a welcome message indicating whether GPU acceleration is active, and provides brief instructions on how to structure your requests.


For users who prefer more control over the configuration, several command-line arguments are available:


python codeagent.py --model mistral --temperature 0.7 --verbose --search-limit 3 --context-window 5 --no-gpu


The `--model` argument allows you to specify which Ollama model to use, with "mistral" as the default. Advanced users might experiment with other models like "codellama" if they have them installed in Ollama.


The `--temperature` argument controls the randomness in the model's responses, with higher values (closer to 1.0) producing more creative but potentially less focused answers, and lower values (closer to 0.0) producing more deterministic and conservative responses.
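To make this concrete, here is a minimal sketch of how a temperature setting reaches the model through the Ollama Python client; CodeAgent does the equivalent internally, though the exact parameter shape (the options mapping) can vary between client versions:

import ollama

response = ollama.chat(
    model="mistral",
    messages=[{"role": "user", "content": "Write a one-line docstring for a sort function."}],
    options={"temperature": 0.2},  # low temperature: more deterministic output
)
print(response["message"]["content"])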


The `--verbose` flag enables detailed logging of operations, which can be helpful for troubleshooting or understanding the application's decision-making process.


The `--search-limit` argument determines how many web search results will be retrieved when looking for programming references, with the default set to 3.


The `--context-window` argument controls how many previous exchanges are included in the conversation context, affecting the model's ability to reference earlier discussions.
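Internally this is a simple sliding window over the conversation history, along these lines (with toy data standing in for real exchanges):

# keep only the last N exchanges when building the next prompt
conversation_history = [{"user": f"question {i}", "assistant": f"answer {i}"} for i in range(8)]
context_window = 5
context = conversation_history[-context_window:]  # only the 5 most recent exchanges survive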


The `--no-gpu` flag disables GPU acceleration even if compatible hardware is detected, which can be useful for testing or if you're experiencing GPU-related issues.


For users who want to process a single query without entering an interactive session, the `--query` argument allows you to pass your question directly:


python codeagent.py --query "Create a quicksort algorithm in Python"



This processes the query and outputs the response before exiting, which is particularly useful for integration with scripts or other automation.
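For example, a response can be captured to a file as one step in a larger workflow:

python codeagent.py --query "Create a quicksort algorithm in Python" > quicksort_notes.md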


Technical Architecture


The implementation of CodeAgent demonstrates thoughtful engineering practices throughout its architecture. At its core sits the Mistral language model, accessed through Ollama, which provides the foundation for understanding both natural language and programming syntax. The application wraps this capability in a purposeful interface designed specifically for code-related tasks.


CodeAgent implements a class-based architecture with methods specialized for different aspects of its functionality. The main CodeAgent class handles initialization, model loading, and conversation management. Specialized methods within this class manage specific tasks like code analysis, code generation, web searching, and webpage content extraction.


Performance optimization receives particular attention in the implementation. The application includes the ability to leverage GPU acceleration through either CUDA (for NVIDIA graphics cards) or MPS (for Apple Silicon), automatically detecting available hardware capabilities and configuring itself accordingly. For systems without specialized hardware, it gracefully falls back to CPU processing.


The conversation management system maintains context through previous exchanges, allowing for coherent, ongoing interactions rather than treating each query in isolation. This enables the agent to build upon previous discussions, reference earlier code snippets, and maintain awareness of the programmer's overall goals throughout a session.
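In outline, each request is assembled from the system prompt, the most recent exchanges, and the new query, much as the generate_response method in the full source below does:

system_prompt = "You are a specialized programming assistant..."
conversation_history = [
    {"user": "What does zip() do?", "assistant": "It pairs items from iterables."},
]
context_window = 5
new_prompt = "Show an example with three lists."

messages = [{"role": "system", "content": system_prompt}]
for exchange in conversation_history[-context_window:]:
    messages.append({"role": "user", "content": exchange["user"]})
    messages.append({"role": "assistant", "content": exchange["assistant"]})
messages.append({"role": "user", "content": new_prompt})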


Interacting with CodeAgent


Users interact with CodeAgent through a straightforward command-line interface that accepts natural language queries about programming topics. The system uses pattern recognition to determine whether the user is requesting code analysis, code creation, or information retrieval, then processes the request accordingly.
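The routing itself is deliberately simple keyword matching rather than a separate classifier; condensed from the process_query method in the full source, it works roughly like this:

query = "Please analyze this code for bugs"
q = query.lower()
if "analyze" in q and ("code" in q or "function" in q):
    route = "code analysis"
elif any(k in q for k in ("create", "write", "implement", "develop", "generate")):
    route = "code creation"
elif "search" in q or "look up" in q:
    route = "web search"
else:
    route = "general chat"
print(route)  # -> code analysis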


For code analysis, users can paste code snippets directly into their query, preferably using triple backtick formatting for clarity. CodeAgent then examines the code, evaluating its structure, identifying potential issues, suggesting optimizations, and explaining complex sections. The analysis considers not just syntactic correctness but also adherence to best practices, security implications, and performance characteristics.
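For instance, a request in the interactive session might look like the following (the language tag after the backticks is optional; the shadowed sum builtin gives the agent something to flag):

Analyze this code:
```python
def total(items):
    sum = 0
    for i in items:
        sum += i
    return sum
```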


When users request code creation, they describe their requirements in natural language, optionally specifying the target programming language, desired functionality, and any constraints. CodeAgent responds by first explaining its approach to the problem, then providing well-commented, efficient code that implements the requested functionality. It frequently includes usage examples to demonstrate how the code should be invoked.


For information retrieval, CodeAgent searches the web for programming references relevant to the user's query. It extracts content from webpages with particular emphasis on code blocks and examples, synthesizes this information into a coherent response, and provides source attributions so users can explore further if desired.
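The extraction step boils down to a few BeautifulSoup selectors that pull code-bearing elements out of the page; this condensed sketch of the fetch_webpage_content method (with a toy HTML string) shows the idea:

from bs4 import BeautifulSoup

html = "<html><body><p>Intro</p><pre>print('hello')</pre><script>ads()</script></body></html>"
soup = BeautifulSoup(html, "html.parser")
for tag in soup(["script", "style"]):
    tag.extract()  # drop non-content elements
code_blocks = [t.get_text().strip() for t in soup.select("pre, code, .highlight, .code")]
print(code_blocks)  # -> ["print('hello')"]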


Throughout all these interactions, CodeAgent maintains a conversational memory that allows it to reference previous exchanges, building a more coherent and contextually aware experience than would be possible with isolated queries.


Advanced Features


Several sophisticated features distinguish CodeAgent from simpler implementations. The system attempts to automatically start the Ollama service if it's not already running, simplifying setup for users. It implements robust error handling throughout its codebase, degrading gracefully when faced with unexpected situations rather than failing completely.


The web search functionality implements specialized parsing for programming-related content, prioritizing code snippets and examples in its extraction process. This ensures that when CodeAgent references external sources, it focuses on the most immediately applicable information rather than general descriptions.


Command-line arguments provide extensive customization options, allowing users to specify their preferred model, adjust temperature settings for response generation, enable verbose logging, limit search results, configure context window size, and toggle GPU acceleration.


Performance monitoring is integrated throughout the application, with response times calculated and displayed for each interaction. This provides users with transparency about processing requirements and helps identify potential bottlenecks.
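The measurement is plain wall-clock timing, lifted almost verbatim from the interactive loop in the full source (agent and user_input as defined there):

import time

start_time = time.time()
response = agent.process_query(user_input)
print(f"(response time: {time.time() - start_time:.2f}s)")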


Educational and Productivity Applications


CodeAgent offers significant value across the spectrum of programming experience levels. For beginners, it serves as an interactive programming tutor, explaining concepts, generating examples, analyzing practice code, and providing gentle correction when mistakes are made. The ability to ask follow-up questions makes it particularly valuable for clarifying confusing topics.


Intermediate programmers benefit from CodeAgent's ability to suggest optimizations and best practices that might otherwise be overlooked. By having code regularly analyzed for improvement opportunities, these developers can accelerate their progression toward advanced techniques and patterns.


For experienced developers, CodeAgent functions more as a productivity accelerator, quickly generating boilerplate code, researching API details, proposing implementations for described functionality, and serving as a sounding board for architectural decisions. These programmers often use the agent to explore implementation alternatives more rapidly than they could through manual research.


Across all experience levels, CodeAgent excels at explaining unfamiliar code. When encountering a codebase for the first time, developers can ask the agent to analyze specific functions or classes, receiving plain-language explanations that highlight the purpose and approach of the code rather than just reciting the implementation details.


Limitations and Ethical Considerations


Despite its capabilities, CodeAgent inherits certain limitations from its underlying language model. Like all current AI systems, it lacks true understanding of program semantics in the way human programmers develop through experience. Its suggestions, while often valuable, should be critically evaluated rather than blindly implemented, particularly for security-sensitive applications.


The web search functionality, while useful for finding references and examples, cannot guarantee the quality or correctness of sources it discovers. Information retrieved from the internet should be verified, especially when implementing critical systems or security features.


Ethical considerations around AI coding assistants generally apply to CodeAgent as well. Questions about appropriate attribution when implementing AI-suggested code, the potential impact on programming education, and the long-term effects on the programming profession deserve thoughtful consideration. CodeAgent represents a tool to augment human programmers rather than replace them, but its implications for the industry continue to evolve alongside the technology itself.


Future Directions


Several promising directions exist for further development of CodeAgent and similar systems. Integration with version control systems would allow the agent to understand project context more thoroughly, considering not just isolated code snippets but their relationship to the broader codebase. Connection with IDE extensions could provide a more seamless experience than the current command-line interface.


More sophisticated static analysis capabilities could enhance the depth of code reviews, identifying subtle bugs and security vulnerabilities that might elude current detection methods. Support for more specialized programming domains like embedded systems, machine learning frameworks, or game development could provide targeted assistance for these areas.


The implementation of more advanced reasoning capabilities represents perhaps the most promising frontier. Future versions might incorporate structured approaches to breaking down complex programming challenges, evaluating multiple potential solutions, or even generating comprehensive test suites to verify implemented functionality.


Conclusion


CodeAgent exemplifies the potential of specialized AI agents to transform specific domains through focused application of large language models. By tailoring its capabilities specifically to programming tasks, it provides more valuable assistance than general-purpose AI systems while maintaining the conversational fluidity that makes these tools accessible.


For programmers seeking to enhance their productivity, learn new concepts, or improve their code quality, tools like CodeAgent represent valuable additions to their development environment. As these technologies continue to evolve, the partnership between human programmers and AI assistants promises to unlock new levels of creativity and efficiency in software development.


The future of programming likely involves this kind of human-AI collaboration, with each party contributing their unique strengths: humans providing creativity, intent, and ethical judgment; AI assistants offering knowledge breadth, pattern recognition, and implementation assistance. CodeAgent offers an early glimpse of this collaborative future, demonstrating both the current capabilities and the future potential of AI-powered programming assistants.


Full Source Code (copy to codeagent.py)


#!/usr/bin/env python3
"""
Code Analysis and Creation Agent using Mistral on Ollama

This script creates an LLM agent specialized in code analysis and creation
using the Mistral model on Ollama. It supports:
- Analyzing existing code
- Creating new code based on requirements
- Searching the internet for programming references
- Interactive dialogue with users about code
- GPU acceleration via CUDA or Apple MPS when available

Requirements:
- Python 3.8+
- ollama package
- requests package
- beautifulsoup4 package
(argparse is part of the Python standard library)

Install dependencies:
pip install ollama requests beautifulsoup4
"""

import argparse
import os
import platform
import re
import subprocess
import sys
import time
import urllib.parse
from typing import Any, Dict, List

import ollama
import requests
from bs4 import BeautifulSoup


class CodeAgent:
    """An LLM agent specialized in code analysis and creation using Mistral on Ollama."""

    def __init__(self, model: str = "mistral", temperature: float = 0.7,
                 verbose: bool = False, search_limit: int = 3,
                 context_window: int = 5, use_gpu: bool = True):
        """
        Initialize the code agent.

        Args:
            model: Name of the Ollama model to use (default: "mistral")
            temperature: Sampling temperature (default: 0.7)
            verbose: Whether to print verbose output (default: False)
            search_limit: Maximum number of web search results (default: 3)
            context_window: Number of previous exchanges to include in context (default: 5)
            use_gpu: Whether to use GPU acceleration if available (default: True)
        """
        self.model = model
        self.temperature = temperature
        self.verbose = verbose
        self.search_limit = search_limit
        self.context_window = context_window
        self.conversation_history = []
        self.use_gpu = use_gpu

        # Check for GPU availability
        self.gpu_info = self._check_gpu()

        # System prompt that specializes the agent for code tasks
        self.system_prompt = """You are a specialized programming assistant focused on analyzing and creating code.
Your primary capabilities include:
1. Analyzing code for readability, efficiency, security issues, and best practices
2. Creating code based on user requirements
3. Searching for programming references when needed
4. Explaining code concepts clearly and accurately

When analyzing code:
- Identify patterns and anti-patterns
- Suggest optimizations where appropriate
- Comment on code structure and organization
- Highlight potential security issues
- Explain complex parts in simple terms

When creating code:
- Start with a brief explanation of your approach
- Write clean, well-documented code
- Follow language-specific best practices
- Provide usage examples when appropriate

Always respond in a structured way:
1. First, understand the request and clarify what is needed
2. If analyzing code, provide a detailed assessment
3. If creating code, explain your approach before presenting the solution
4. Include relevant references or sources when appropriate

Return code in properly formatted code blocks with appropriate language syntax highlighting.
"""

        # Check if Ollama is installed and the model is available
        self._check_ollama_model()

        if verbose:
            print(f"CodeAgent initialized with model: {self.model}, temperature: {self.temperature}")

    def _check_gpu(self) -> Dict[str, Any]:
        """
        Check for GPU availability (CUDA or Apple MPS).

        Returns:
            Dictionary with GPU information
        """
        gpu_info = {
            "available": False,
            "type": None,
            "device": None,
            "name": None
        }

        system = platform.system()

        # Check for CUDA
        if system in ["Linux", "Windows"]:
            try:
                # Try to detect an NVIDIA GPU using nvidia-smi
                result = subprocess.run(
                    ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
                    capture_output=True,
                    text=True,
                    check=False
                )

                if result.returncode == 0 and result.stdout.strip():
                    gpu_info["available"] = True
                    gpu_info["type"] = "CUDA"
                    gpu_info["name"] = result.stdout.strip()

                    # Hint to Ollama to use CUDA (recent Ollama builds detect
                    # GPUs automatically, so this may be a no-op)
                    os.environ["OLLAMA_USE_CUDA"] = "1"

                    if self.verbose:
                        print(f"CUDA GPU detected: {gpu_info['name']}")

            except Exception as e:
                if self.verbose:
                    print(f"Error checking CUDA GPU: {e}")

        # Check for Apple MPS (Metal Performance Shaders)
        elif system == "Darwin":  # macOS
            try:
                # Check if running on Apple Silicon
                result = subprocess.run(
                    ["sysctl", "-n", "machdep.cpu.brand_string"],
                    capture_output=True,
                    text=True,
                    check=False
                )

                if "Apple" in result.stdout:
                    gpu_info["available"] = True
                    gpu_info["type"] = "MPS"
                    gpu_info["name"] = "Apple Silicon GPU"

                    # Hint to Ollama to use Metal (likewise may be a no-op)
                    os.environ["OLLAMA_USE_METAL"] = "1"

                    if self.verbose:
                        print("Apple Silicon GPU (MPS) detected")

            except Exception as e:
                if self.verbose:
                    print(f"Error checking Apple MPS: {e}")

        # If a GPU is available but the user doesn't want to use it, warn them
        if gpu_info["available"] and not self.use_gpu:
            if self.verbose:
                print(f"Warning: {gpu_info['type']} GPU available but not being used due to use_gpu=False")

            # Unset environment variables if GPU is disabled
            if gpu_info["type"] == "CUDA":
                os.environ.pop("OLLAMA_USE_CUDA", None)
            elif gpu_info["type"] == "MPS":
                os.environ.pop("OLLAMA_USE_METAL", None)

            gpu_info["available"] = False

        # If a GPU was requested but none is available, warn the user
        if self.use_gpu and not gpu_info["available"]:
            print("Warning: GPU acceleration requested but no compatible GPU found. Using CPU instead.")

        return gpu_info

    def _check_ollama_model(self):
        """Check if Ollama is installed and the specified model is available."""
        try:
            # Check if Ollama is running
            try:
                models = ollama.list()
            except Exception:
                print("Ollama service not detected. Attempting to start Ollama...")
                self._start_ollama_service()

                # Try again after starting
                models = ollama.list()

            # List available models (names may carry a tag, e.g. "mistral:latest")
            model_names = [model.get('name', '') for model in models.get('models', [])]

            if not any(name == self.model or name.split(':')[0] == self.model
                       for name in model_names):
                print(f"Model '{self.model}' not found. Attempting to pull it...")

                # Pull the model; whether the GPU is used for inference is
                # decided by the Ollama server itself, not by pull options
                ollama.pull(self.model)
                print(f"Successfully pulled model '{self.model}'")

                if self.gpu_info["available"]:
                    print(f"Model configured to use {self.gpu_info['type']} acceleration")

        except Exception as e:
            print(f"Error checking Ollama model: {e}")
            print("Please ensure Ollama is installed and running.")
            print("Installation guide: https://github.com/ollama/ollama")
            sys.exit(1)

    def _start_ollama_service(self):
        """Attempt to start the Ollama service if not running."""
        system = platform.system()

        try:
            if system == "Linux":
                # Try the systemd service first
                subprocess.run(["systemctl", "start", "ollama"], check=False)
            elif system == "Darwin":  # macOS
                subprocess.run(["open", "-a", "Ollama"], check=False)
            elif system == "Windows":
                # On Windows, try to start via the Start Menu
                subprocess.Popen(
                    ["powershell", "-Command", "Start-Process 'Ollama'"],
                    stdout=subprocess.DEVNULL,
                    stderr=subprocess.DEVNULL
                )

            # Wait a moment for the service to start
            print("Waiting for Ollama service to start...")
            time.sleep(5)

        except Exception as e:
            print(f"Error starting Ollama service: {e}")
            print("You may need to start Ollama manually before running this script.")

    def search_web(self, query: str) -> List[Dict[str, str]]:
        """
        Search the web for programming references.

        Args:
            query: Search query string

        Returns:
            List of search results with titles and snippets
        """
        # Add programming-specific terms to the query for better results
        programming_query = f"{query} programming code example"

        try:
            # Use the DuckDuckGo lite HTML endpoint
            headers = {
                'User-Agent': 'CodeAgent/1.0 (Educational Programming Assistant)'
            }
            search_url = f"https://lite.duckduckgo.com/lite/?q={urllib.parse.quote_plus(programming_query)}"
            response = requests.get(search_url, headers=headers, timeout=10)

            if response.status_code != 200:
                print(f"Search error: {response.status_code}")
                return []

            # Parse results
            soup = BeautifulSoup(response.text, 'html.parser')
            results = []

            # DuckDuckGo lite results are in a simple HTML structure
            for i, result in enumerate(soup.select('a[href^="https"]')):
                if i >= self.search_limit:
                    break

                title = result.text.strip()
                url = result.get('href')

                # Get the snippet (text following the link)
                snippet_elem = result.find_next('td')
                snippet = snippet_elem.text.strip() if snippet_elem else ""

                if title and url and not url.startswith("https://duckduckgo.com"):
                    results.append({
                        'title': title,
                        'url': url,
                        'snippet': snippet
                    })

            if self.verbose:
                print(f"Found {len(results)} search results for query: {query}")

            return results

        except Exception as e:
            print(f"Error during web search: {e}")
            return []

    def fetch_webpage_content(self, url: str) -> str:
        """
        Fetch content from a webpage, focusing on code sections.

        Args:
            url: URL to fetch

        Returns:
            Extracted content with emphasis on code blocks
        """
        try:
            headers = {
                'User-Agent': 'CodeAgent/1.0 (Educational Programming Assistant)'
            }
            response = requests.get(url, headers=headers, timeout=10)

            if response.status_code != 200:
                return f"Error fetching page: {response.status_code}"

            soup = BeautifulSoup(response.text, 'html.parser')

            # Remove script and style elements
            for script in soup(["script", "style"]):
                script.extract()

            # Extract code blocks with priority
            code_blocks = []

            # Common code block elements
            for code_tag in soup.select('pre, code, .highlight, .code'):
                code_blocks.append(f"```\n{code_tag.get_text().strip()}\n```")

            # Extract main content
            text = soup.get_text()

            # Clean up whitespace
            lines = (line.strip() for line in text.splitlines())
            chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
            text = '\n'.join(chunk for chunk in chunks if chunk)

            # Limit content length
            max_length = 5000
            if len(text) > max_length:
                text = text[:max_length] + "...(content truncated)"

            # Combine code blocks and text
            full_content = "\n\n".join(code_blocks) + "\n\n" + text

            if self.verbose:
                print(f"Fetched {len(full_content)} characters from {url}")

            return full_content

        except Exception as e:
            return f"Error fetching page: {e}"

    def analyze_code(self, code: str) -> str:
        """
        Analyze the provided code snippet.

        Args:
            code: Code snippet to analyze

        Returns:
            Analysis report
        """
        prompt = f"""Analyze the following code thoroughly:

```
{code}
```

Please provide a detailed analysis covering:
1. Overview of what the code does
2. Code structure and organization
3. Potential bugs or issues
4. Performance considerations
5. Security concerns (if applicable)
6. Adherence to best practices
7. Suggested improvements
"""

        return self.generate_response(prompt)

    def create_code(self, requirements: str) -> str:
        """
        Create code based on the provided requirements.

        Args:
            requirements: Code requirements

        Returns:
            Generated code with explanation
        """
        prompt = f"""Create code based on these requirements:

{requirements}

First explain your approach, then provide well-commented, efficient, and clean code that meets these requirements.
Include usage examples when appropriate.
"""

        return self.generate_response(prompt)

    def generate_response(self, prompt: str) -> str:
        """
        Generate a response using the Ollama Mistral model.

        Args:
            prompt: User prompt

        Returns:
            Generated response
        """
        # Build context from the most recent conversation history
        context = self.conversation_history[-self.context_window:]

        try:
            # Prepare the full prompt with system message and context
            messages = [
                {
                    "role": "system",
                    "content": self.system_prompt
                }
            ]

            # Add conversation context
            for exchange in context:
                messages.append({
                    "role": "user",
                    "content": exchange["user"]
                })
                messages.append({
                    "role": "assistant",
                    "content": exchange["assistant"]
                })

            # Add the current prompt
            messages.append({
                "role": "user",
                "content": prompt
            })

            # Generate the response; sampling parameters such as temperature
            # are passed through the options mapping of the Ollama client
            response = ollama.chat(
                model=self.model,
                messages=messages,
                options={"temperature": self.temperature},
                stream=False
            )

            # Extract the response content
            response_content = response['message']['content']

            # Update conversation history
            self.conversation_history.append({
                "user": prompt,
                "assistant": response_content
            })

            return response_content

        except Exception as e:
            error_msg = f"Error generating response: {e}"
            print(error_msg)
            return error_msg

    def process_query(self, query: str) -> str:
        """
        Process a user query to determine the appropriate action.

        Args:
            query: User query

        Returns:
            Response to the query
        """
        query_lower = query.lower()

        # Check if this is a code analysis request
        if "analyze" in query_lower and ("code" in query_lower or "function" in query_lower):
            # Extract code from the query (assuming code is the largest code-formatted block)
            code_pattern = r"```[\w]*\n([\s\S]*?)\n```"
            code_matches = re.findall(code_pattern, query)

            if code_matches:
                # Use the largest code block
                code = max(code_matches, key=len)
                return self.analyze_code(code)
            else:
                # Try to find unformatted code
                code_pattern = r"(?:function|def|class|import|from|public|private)[\s\S]*?(?=\n\n|\Z)"
                code_matches = re.findall(code_pattern, query)

                if code_matches:
                    code = max(code_matches, key=len)
                    return self.analyze_code(code)
                else:
                    return "I didn't find any code to analyze in your message. Please provide code within triple backticks (```code```) or make sure your code is clearly identifiable."

        # Check if this is a code creation request
        elif any(keyword in query_lower for keyword in ["create", "write", "implement", "develop", "generate"]) and \
             any(keyword in query_lower for keyword in ["code", "function", "program", "script", "class", "algorithm"]):
            return self.create_code(query)

        # Check if this is a web search request
        elif "search" in query_lower or "look up" in query_lower or "find information" in query_lower:
            search_results = self.search_web(query)

            if not search_results:
                return "I couldn't find any relevant information. Could you please rephrase your query?"

            # Format search results
            result_text = "Here's what I found:\n\n"
            for i, result in enumerate(search_results):
                result_text += f"{i+1}. **{result['title']}**\n"
                result_text += f"   {result['snippet']}\n"
                result_text += f"   URL: {result['url']}\n\n"

            # Fetch content from the most relevant result
            best_result = search_results[0]
            content = self.fetch_webpage_content(best_result['url'])

            # Generate a response based on the search results and fetched content
            synthesis_prompt = f"""Based on the search results for "{query}",
particularly the content from {best_result['title']} ({best_result['url']}),
provide a helpful response.

Here's the content from the top result:

{content[:3000]}

Create a comprehensive response that addresses the original query: "{query}"
"""

            synthesis = self.generate_response(synthesis_prompt)

            # Combine search results with the synthesized answer
            return f"{synthesis}\n\n---\n\n{result_text}"

        # Default: general code-related query
        else:
            return self.generate_response(query)

    def interactive_session(self):
        """Start an interactive session with the agent."""
        gpu_status = ""
        if self.gpu_info["available"]:
            gpu_status = f" with {self.gpu_info['type']} acceleration"

        print(f"🤖 Code Agent ({self.model}{gpu_status}) initialized. Enter 'exit' or 'quit' to end the session.")
        print("You can ask me to analyze code, create code, or search for programming information.")
        print("For code analysis, paste your code between triple backticks (```code```).")

        while True:
            try:
                # Get user input
                user_input = input("\n👤 You: ")

                if user_input.lower() in ["exit", "quit", "bye"]:
                    print("🤖 Code Agent: Goodbye! Happy coding!")
                    break

                print("\n🤔 Thinking...")
                start_time = time.time()

                # Process the query
                response = self.process_query(user_input)

                # Calculate the response time
                duration = time.time() - start_time

                # Print the response
                print(f"\n🤖 Code Agent: (response time: {duration:.2f}s)")
                print(response)

            except KeyboardInterrupt:
                print("\n\n🤖 Code Agent: Session interrupted. Goodbye!")
                break
            except Exception as e:
                print(f"\n🤖 Code Agent: I encountered an error: {e}")


def main():
    """Main function to run the code agent."""
    parser = argparse.ArgumentParser(description="Code Analysis and Creation Agent using Mistral on Ollama")
    parser.add_argument("--model", type=str, default="mistral",
                        help="Name of the Ollama model to use (default: mistral)")
    parser.add_argument("--temperature", type=float, default=0.7,
                        help="Sampling temperature (default: 0.7)")
    parser.add_argument("--verbose", action="store_true",
                        help="Print verbose output")
    parser.add_argument("--search-limit", type=int, default=3,
                        help="Maximum number of web search results (default: 3)")
    parser.add_argument("--context-window", type=int, default=5,
                        help="Number of previous exchanges to include in context (default: 5)")
    parser.add_argument("--query", type=str,
                        help="Single query to process (if not provided, interactive mode is started)")
    parser.add_argument("--no-gpu", action="store_true",
                        help="Disable GPU acceleration even if available")

    args = parser.parse_args()

    # Initialize the agent
    agent = CodeAgent(
        model=args.model,
        temperature=args.temperature,
        verbose=args.verbose,
        search_limit=args.search_limit,
        context_window=args.context_window,
        use_gpu=not args.no_gpu
    )

    # Either process a single query or start an interactive session
    if args.query:
        response = agent.process_query(args.query)
        print(response)
    else:
        agent.interactive_session()


if __name__ == "__main__":
    main()
