BUILDING AGENTIC AI SYSTEMS IN GO A Deep Technical Tutorial Based on the GoAgent Framework
By a practitioner, for practitioners. This tutorial walks you through the complete design and implementation of a production-grade agentic AI system written in Go. We use the GoAgent codebase as our running example throughout, examining every layer of the architecture from the LLM provider abstraction down to the messaging adapter plug-in system. Agents like Hermes (a general-purpose self-improving assistant) and OpenClaw (a meticulous research agent) are built and explained from first principles.
Readers are expected to be comfortable with Go and to have a working understanding of how large language models (LLMs) work at the API level. Knowledge of OpenAI's chat-completions API format is helpful but not required.
PART 1: WHAT IS AN AGENTIC AI SYSTEM, AND WHY BUILD ONE IN GO?
The term "agent" in AI has been overloaded to the point of meaninglessness in marketing material, so let us be precise. An agentic AI system is a program that uses a large language model not merely to answer a single question, but to reason about a goal, decide which actions to take in order to achieve it, take those actions by calling external tools or APIs, observe the results, and continue reasoning until the goal is reached or a resource limit is hit. The key insight is that the LLM is used as a reasoning engine inside a loop, not as a one-shot question-answering oracle.
This is fundamentally different from a chatbot. A chatbot takes a message, sends it to an LLM, and returns the reply. An agent takes a goal, asks the LLM what to do next, does that thing, feeds the result back to the LLM, asks what to do next again, and repeats until the job is done. The agent can search the web, read files, run calculations, write code, send messages, and call any API you give it access to. The LLM provides the intelligence; the framework provides the scaffolding.
Why Go? Several reasons make Go an excellent choice for this kind of system. Go's goroutine model makes it trivially easy to run multiple agents in parallel, to execute several tool calls concurrently within a single agent iteration, and to handle long-polling adapters alongside a CLI without blocking. Go's interfaces make it straightforward to swap LLM providers, tools, and messaging adapters without touching the core agent logic. Go compiles to a single static binary, which makes deployment simple. Go's standard library is rich enough that you need very few external dependencies. And Go's explicit error handling forces you to think carefully about failure modes, which matters a great deal in a system that is making HTTP calls to external APIs in a tight loop.
GoAgent is a production-ready framework that demonstrates all of these properties. Let us look at its architecture before diving into the code.
PART 2: THE ARCHITECTURE OF GOAGENT
The system is organized into eight major subsystems, each living in its own Go package. Understanding how they relate to each other is the key to understanding the whole system.
The LLM package defines a single Provider interface and ships with one implementation: an OpenAI-compatible HTTP client that works with OpenAI, Ollama, Groq, Mistral, and any other service that speaks the OpenAI chat-completions wire format. This is the only place in the entire codebase that knows about HTTP and JSON serialization of LLM requests.
The tools package defines the Tool interface and ships with five built-in implementations: web search (DuckDuckGo or SerpAPI), a mathematical expression evaluator, a sandboxed file reader, a sandboxed file writer, and a whitelisted shell executor. The Registry type holds a collection of tools and can serialize them into the OpenAI function-calling format so the LLM knows what it can call.
The agent package is the heart of the system. It contains the Config struct that describes an agent's personality and resource limits, the Memory struct that manages the conversation history, and the Agent struct itself, which implements the ReACT loop. The ReACT loop is the algorithmic core: Reason, Act, Observe, repeat.
The learning package implements four distinct learning mechanisms: an episodic memory store that records past task outcomes and extracts reusable lessons, a user profile that accumulates facts and preferences about the user, a retrieval-augmented generation (RAG) store that uses TF-IDF scoring to find relevant document chunks, and a reflector that calls the LLM after each task to extract lessons and update all three stores.
The skills package manages Markdown files that describe reusable behavioral patterns. A skill is injected into the system prompt when the agent is initialized, giving it domain-specific knowledge without requiring fine-tuning.
The scheduler package manages the lifecycle of agent jobs. A job is an agent paired with a task and a schedule. Schedules can be "run once immediately", "run once at a specific time", "run every day at a specific time", or "run continuously in a background loop". Each job runs in its own goroutine and reports its status and results through a thread-safe data structure.
The adapters package defines the Adapter interface for external messaging channels and ships with two implementations: a Telegram bot adapter that uses long-polling, and an iMessage adapter that polls the macOS Messages SQLite database. A central message Bus routes incoming messages from all adapters through a single handler function, which calls the gateway agent and routes the reply back to the originating adapter.
The agentconfig package provides YAML-based agent definitions. An operator can define a complete agent, including its personality, tools, schedule, and task, in a YAML file without writing a single line of Go code.
The main.go file ties everything together: it reads environment variables, creates the LLM provider, starts the scheduler, loads YAML agent definitions, starts adapters, and runs a command-line interface that lets the operator interact with the system in real time.
Here is a high-level picture of how these subsystems relate to each other:
+------------------+ +------------------+ +------------------+
| agentconfig | | scheduler | | adapters |
| (YAML loader) |---->| (job lifecycle) | | (Telegram, iMsg)|
+------------------+ +--------+---------+ +--------+---------+
| |
v v
+------------------+ +--------+---------+ +--------+---------+
| learning | | agent |<----| message bus |
| (episodic, RAG, |<--->| (ReACT loop, | | (dispatch loop) |
| profile, refl.) | | memory, steps) | +------------------+
+------------------+ +--------+---------+
|
+--------+---------+
| tools |
| (web, calc, file,|
| shell, custom) |
+--------+---------+
|
+--------+---------+
| llm |
| (Provider iface, |
| OpenAI compat.) |
+------------------+
The dependency graph flows downward. The agent depends on tools and llm. The scheduler depends on agent. The adapters communicate with the agent through the message bus. The learning package is used by the agent but does not depend on it. The agentconfig package is used by main.go to configure agents and the scheduler. This clean layering is what makes the system easy to extend.
PART 3: THE PROJECT STRUCTURE
Before writing any code, it is worth understanding the file layout. GoAgent follows the standard Go project layout with one package per directory.
goagent/
|-- main.go Entry point and CLI
|-- go.mod Module definition and dependencies
|-- .env.example Environment variable documentation
|-- Makefile Common build and run targets
|-- Dockerfile Container image definition
|-- docker-compose.yml Multi-container orchestration
|-- agents/ YAML agent definitions (no code needed)
| |-- hermes.yaml
| `-- openclaw.yaml
|-- agentconfig/ YAML config loader package
| `-- agentconfig.go
|-- agent/ Core agent logic
| |-- agent.go ReACT loop and orchestration
| |-- config.go Config struct and defaults
| `-- memory.go Conversation history management
|-- llm/ LLM provider abstraction
| |-- provider.go Interface and message types
| `-- openai.go OpenAI-compatible HTTP client
|-- tools/ Tool implementations
| |-- tool.go Tool interface and Registry
| |-- web_search.go DuckDuckGo / SerpAPI search
| |-- calculator.go Mathematical expression evaluator
| |-- file_read.go Sandboxed file reader
| |-- file_write.go Sandboxed file writer
| `-- shell.go Whitelisted shell executor
|-- skills/ Skill management
| |-- skill.go Skill loader and manager
| `-- skills/ Skill definition files
| |-- summarizer.md
| `-- researcher.md
|-- learning/ Learning and memory subsystems
| |-- utils.go Shared helpers (atomic write, tokenize)
| |-- episodic.go Episode recording and lesson retrieval
| |-- userprofile.go Persistent user facts and preferences
| |-- rag.go TF-IDF retrieval-augmented generation
| `-- reflection.go Post-task LLM-driven reflection
|-- scheduler/ Agent job lifecycle management
| |-- scheduler.go Manager, AgentJob, execution modes
| `-- cron.go Schedule parsing
|-- adapters/ External messaging channel adapters
| |-- adapter.go Interface, Registry, and message Bus
| |-- telegram/telegram.go Telegram Bot API (long-polling)
| `-- imessage/imessage.go Apple iMessage (SQLite polling)
`-- examples/ Standalone usage examples
|-- basic_agent/main.go
`-- multi_tool_agent/main.go
The go.mod file declares the module path and its dependencies. The dependency list is intentionally minimal, which is a deliberate design choice. Fewer dependencies mean fewer supply-chain risks, faster builds, and easier auditing.
module github.com/ms1963/goagent
go 1.22
require (
github.com/PaesslerAG/gval v1.2.2 // Mathematical expression eval
gopkg.in/yaml.v3 v3.0.1 // YAML parsing for agent configs
modernc.org/sqlite v1.29.9 // Pure-Go SQLite for iMessage
)
The gval library evaluates mathematical expressions like "1337 * 42" or "sqrt(144)" without requiring a full scripting language runtime. The yaml.v3 library handles YAML parsing for agent configuration files. The modernc.org/ sqlite library is a pure-Go SQLite driver with no CGO dependency, which is important for the iMessage adapter that needs to read the macOS Messages database.
PART 4: THE LLM PROVIDER ABSTRACTION
The most important design decision in any agentic AI framework is how to abstract the LLM. If you hard-code calls to the OpenAI API, you lock yourself into one vendor and one pricing model. If you define a clean interface, you can swap providers without touching the agent logic. GoAgent defines this interface in llm/provider.go.
The interface is deliberately minimal. It has exactly two methods: Complete, which sends a conversation to the LLM and returns a response, and ModelName, which returns the identifier of the model being used. Everything else, including authentication, retry logic, and response parsing, is the responsibility of the concrete implementation.
// Package llm defines the LLM provider interface and message types.
// All agent logic depends on this interface, never on a concrete provider.
package llm
import "context"
// Role constants define the four message roles in a conversation.
// These map directly to the OpenAI chat-completions message roles.
const (
RoleSystem = "system" // Instructions and context for the LLM
RoleUser = "user" // Input from the human or the agent framework
RoleAssistant = "assistant" // The LLM's response
RoleTool = "tool" // The result of a tool call
)
// Message represents a single turn in a conversation.
// ToolCallID is set when Role is "tool" to link the result to a specific call.
// ToolCalls is set when Role is "assistant" and the LLM wants to call tools.
type Message struct {
Role string
Content string
ToolCallID string
ToolCalls []ToolCall
}
// ToolCall represents a single tool invocation requested by the LLM.
// The Arguments field contains a JSON-encoded string of the tool's parameters.
type ToolCall struct {
ID string
Name string
Arguments string // JSON-encoded, e.g. {"query": "Go 1.22 release notes"}
}
// CompletionRequest is the complete input to the LLM.
// Tools is a slice of OpenAI-format function definitions that tell the LLM
// what tools are available and what arguments each one expects.
type CompletionRequest struct {
Messages []Message
Tools []map[string]any // OpenAI function-calling format
Model string
}
// CompletionResponse is the output from the LLM.
// StopReason tells us why the LLM stopped: "stop" means it gave a final
// answer, "tool_calls" means it wants to call one or more tools.
type CompletionResponse struct {
Content string
ToolCalls []ToolCall
StopReason string // "stop", "tool_calls", "length", etc.
}
// Provider is the interface every LLM backend must implement.
// The agent package depends only on this interface, never on any concrete type.
type Provider interface {
Complete(ctx context.Context, req CompletionRequest) (*CompletionResponse, error)
ModelName() string
}
This interface is the seam between the agent logic and the outside world. The agent package imports only the llm package, never the openai package. This means you can write a mock provider for testing, a local Ollama provider, a Groq provider, or an Anthropic provider, and the agent will work with any of them without modification.
The concrete implementation in llm/openai.go speaks the OpenAI chat-completions wire format. Because Ollama, Groq, Mistral, and many other providers implement this same format, a single implementation covers a wide range of backends.
// OpenAIProvider implements Provider for any OpenAI-compatible API.
// This includes OpenAI itself, Ollama (local), Groq, Mistral, and others.
type OpenAIProvider struct {
baseURL string // e.g. "https://api.openai.com/v1" or "http://localhost:11434/v1"
apiKey string // Bearer token; empty string is fine for local Ollama
model string // e.g. "gpt-4o", "llama3.2", "llama-3.3-70b-versatile"
client *http.Client // Reused across requests for connection pooling
}
// NewOpenAIProvider creates a provider. The 120-second timeout accommodates
// slow local models running on CPU or a single GPU.
func NewOpenAIProvider(baseURL, apiKey, model string) *OpenAIProvider {
return &OpenAIProvider{
baseURL: baseURL,
apiKey: apiKey,
model: model,
client: &http.Client{Timeout: 120 * time.Second},
}
}
// Complete sends a chat completion request and returns the response.
// It serializes the request to JSON, sends it via HTTP POST, deserializes
// the response, and maps it back to the internal CompletionResponse type.
func (p *OpenAIProvider) Complete(
ctx context.Context,
req CompletionRequest,
) (*CompletionResponse, error) {
model := req.Model
if model == "" {
model = p.model
}
// Build the request body in the OpenAI wire format.
body := map[string]any{
"model": model,
"messages": toAPIMessages(req.Messages),
}
// Only include the tools field if there are tools to offer.
// Some models behave poorly when given an empty tools array.
if len(req.Tools) > 0 {
body["tools"] = req.Tools
body["tool_choice"] = "auto"
}
payload, err := json.Marshal(body)
if err != nil {
return nil, fmt.Errorf("marshal request: %w", err)
}
httpReq, err := http.NewRequestWithContext(
ctx,
http.MethodPost,
p.baseURL+"/chat/completions",
bytes.NewReader(payload),
)
if err != nil {
return nil, fmt.Errorf("build request: %w", err)
}
httpReq.Header.Set("Content-Type", "application/json")
if p.apiKey != "" {
httpReq.Header.Set("Authorization", "Bearer "+p.apiKey)
}
resp, err := p.client.Do(httpReq)
if err != nil {
return nil, fmt.Errorf("HTTP request: %w", err)
}
defer resp.Body.Close()
rawBody, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("read response: %w", err)
}
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("API error %d: %s", resp.StatusCode, string(rawBody))
}
// Parse the OpenAI response structure.
var apiResp struct {
Choices []struct {
Message struct {
Content string `json:"content"`
ToolCalls []struct {
ID string `json:"id"`
Function struct {
Name string `json:"name"`
Arguments string `json:"arguments"`
} `json:"function"`
} `json:"tool_calls"`
} `json:"message"`
FinishReason string `json:"finish_reason"`
} `json:"choices"`
}
if err := json.Unmarshal(rawBody, &apiResp); err != nil {
return nil, fmt.Errorf("parse response: %w", err)
}
if len(apiResp.Choices) == 0 {
return nil, fmt.Errorf("no choices in response")
}
choice := apiResp.Choices[0]
out := &CompletionResponse{
Content: choice.Message.Content,
StopReason: choice.FinishReason,
}
for _, tc := range choice.Message.ToolCalls {
out.ToolCalls = append(out.ToolCalls, ToolCall{
ID: tc.ID,
Name: tc.Function.Name,
Arguments: tc.Function.Arguments,
})
}
return out, nil
}
The toAPIMessages helper function is worth examining because it handles the subtleties of the OpenAI message format. Tool result messages need a tool_call_id field. Assistant messages that contain tool calls need to embed the tool_calls array. Getting this right is essential for the function-calling protocol to work correctly.
// toAPIMessages converts internal Message types to the OpenAI wire format.
// This is the only place in the codebase that knows about the JSON structure
// of the OpenAI messages array.
func toAPIMessages(msgs []Message) []map[string]any {
result := make([]map[string]any, 0, len(msgs))
for _, m := range msgs {
msg := map[string]any{
"role": m.Role,
"content": m.Content,
}
// Tool result messages must include the ID that links them to the call.
if m.ToolCallID != "" {
msg["tool_call_id"] = m.ToolCallID
}
// Assistant messages that invoked tools must embed the tool_calls array.
if len(m.ToolCalls) > 0 {
tcs := make([]map[string]any, 0, len(m.ToolCalls))
for _, tc := range m.ToolCalls {
tcs = append(tcs, map[string]any{
"id": tc.ID,
"type": "function",
"function": map[string]any{
"name": tc.Name,
"arguments": tc.Arguments,
},
})
}
msg["tool_calls"] = tcs
}
result = append(result, msg)
}
return result
}
Supporting local and remote LLMs with different GPU backends is simply a matter of pointing the provider at the right base URL. Ollama, which supports CUDA, Apple MLX (Metal Performance Shaders), Vulkan, and CPU backends depending on the platform and the installed drivers, exposes an OpenAI-compatible API on localhost port 11434. The GoAgent provider does not need to know anything about the underlying GPU; that is Ollama's job.
// To use a local Ollama instance running on Apple Silicon with MLX:
provider := llm.NewOpenAIProvider(
"http://localhost:11434/v1",
"", // No API key needed for local Ollama
"llama3.2", // Any model pulled with "ollama pull llama3.2"
)
// To use a local Ollama instance with CUDA on Linux:
provider := llm.NewOpenAIProvider(
"http://localhost:11434/v1",
"",
"mistral", // Ollama handles CUDA automatically if nvidia drivers are present
)
// To use Groq's cloud inference (extremely fast, uses custom hardware):
provider := llm.NewOpenAIProvider(
"https://api.groq.com/openai/v1",
os.Getenv("GROQ_API_KEY"),
"llama-3.3-70b-versatile",
)
// To use OpenAI's API directly:
provider := llm.NewOpenAIProvider(
"https://api.openai.com/v1",
os.Getenv("OPENAI_API_KEY"),
"gpt-4o",
)
If you need to support a provider that does not speak the OpenAI format, such as Anthropic's Claude API or Google's Gemini API, you implement the Provider interface with a new struct. The agent code does not change at all.
// AnthropicProvider would implement the Provider interface for Claude.
// The agent package never needs to know this exists.
type AnthropicProvider struct {
apiKey string
model string
client *http.Client
}
func (p *AnthropicProvider) ModelName() string { return p.model }
func (p *AnthropicProvider) Complete(
ctx context.Context,
req llm.CompletionRequest,
) (*llm.CompletionResponse, error) {
// Translate from OpenAI format to Anthropic format here.
// The agent package never sees this translation.
// ... implementation omitted for brevity ...
return nil, nil
}
This is the power of interface-based design. The seam is clean, the dependencies flow in one direction, and adding a new backend requires touching exactly one file.
PART 5: THE TOOL SYSTEM
Tools are what make an agent useful. Without tools, an agent is just a chatbot with extra steps. With tools, it can search the web, read and write files, execute code, call APIs, query databases, and do anything else you can express as a Go function.
GoAgent defines the Tool interface in tools/tool.go. The interface has four methods. Name returns a unique identifier that the LLM uses to refer to the tool. Description returns a natural-language explanation of what the tool does and when to use it; this is what the LLM reads when deciding which tool to call. Schema returns a JSON Schema object that describes the tool's parameters; this is what the LLM uses to construct valid arguments. Execute takes a JSON-encoded string of arguments and returns a string result or an error.
// Tool is the core interface every agent capability must implement.
// Implementing this interface is all that is required to add a new tool.
type Tool interface {
// Name returns the unique identifier used by the LLM to call this tool.
Name() string
// Description explains what the tool does and when to use it.
// Write this as if you were explaining it to a smart but literal-minded
// colleague who will read it before deciding whether to use the tool.
Description() string
// Schema returns a JSON Schema object describing the tool's parameters.
// This is sent to the LLM as part of the function-calling specification.
Schema() map[string]any
// Execute runs the tool with the given JSON-encoded arguments.
// It returns the tool's output as a plain string, which the LLM will read.
Execute(ctx context.Context, args string) (string, error)
}
The Registry type is a thread-safe map from tool name to Tool. It provides methods to register tools, retrieve them by name, list all registered names, and serialize the entire registry into the OpenAI function-calling format.
// Registry holds all tools available to an agent.
// It is safe for concurrent use by multiple goroutines.
type Registry struct {
mu sync.RWMutex
tools map[string]Tool
}
// NewRegistry creates an empty tool registry.
func NewRegistry() *Registry {
return &Registry{tools: make(map[string]Tool)}
}
// Register adds a tool to the registry. Calling Register with a tool whose
// name is already registered silently overwrites the previous tool.
func (r *Registry) Register(t Tool) {
r.mu.Lock()
defer r.mu.Unlock()
r.tools[t.Name()] = t
}
// ToOpenAITools converts the registry to the OpenAI tools wire format.
// This is called once per agent iteration to tell the LLM what it can do.
func (r *Registry) ToOpenAITools() []map[string]any {
r.mu.RLock()
defer r.mu.RUnlock()
result := make([]map[string]any, 0, len(r.tools))
for _, t := range r.tools {
result = append(result, map[string]any{
"type": "function",
"function": map[string]any{
"name": t.Name(),
"description": t.Description(),
"parameters": t.Schema(),
},
})
}
return result
}
The MustParseArgs helper deserializes the JSON argument string that the LLM produces into a typed Go struct. This is used by every tool implementation to avoid boilerplate JSON parsing.
// MustParseArgs deserializes JSON args into a target struct.
// Returns an error if args is empty or cannot be parsed.
func MustParseArgs(args string, target any) error {
if args == "" {
return fmt.Errorf("tool arguments cannot be empty")
}
if err := json.Unmarshal([]byte(args), target); err != nil {
return fmt.Errorf("invalid tool arguments %q: %w", args, err)
}
return nil
}
Let us look at the built-in tools in detail, because they illustrate important design patterns that you will use when building your own tools.
The CalculatorTool is the simplest tool in the system. It takes a mathematical expression as a string and evaluates it using the gval library. The gval library supports the four arithmetic operators, exponentiation, and a range of mathematical functions including sqrt, abs, floor, ceil, and others. The tool is stateless, has no side effects, and cannot fail in any interesting way.
// CalculatorTool evaluates mathematical expressions.
// It is stateless and safe for concurrent use.
type CalculatorTool struct{}
func (c *CalculatorTool) Name() string { return "calculator" }
func (c *CalculatorTool) Description() string {
return "Evaluate a mathematical expression. " +
"Supports +, -, *, /, ^, sqrt(), abs(), and more."
}
func (c *CalculatorTool) Schema() map[string]any {
return map[string]any{
"type": "object",
"properties": map[string]any{
"expression": map[string]any{
"type": "string",
"description": "The mathematical expression to evaluate, " +
"e.g. '1337 * 42' or 'sqrt(144)'",
},
},
"required": []string{"expression"},
}
}
func (c *CalculatorTool) Execute(_ context.Context, args string) (string, error) {
// Parse the JSON arguments into a typed struct.
var params struct {
Expression string `json:"expression"`
}
if err := MustParseArgs(args, ¶ms); err != nil {
return "", err
}
if params.Expression == "" {
return "", fmt.Errorf("expression cannot be empty")
}
// Evaluate the expression. gval returns an interface{} which we format
// as a string for the LLM to read.
result, err := gval.Evaluate(params.Expression, nil)
if err != nil {
return "", fmt.Errorf("evaluation error: %w", err)
}
return fmt.Sprintf("%v", result), nil
}
The FileReadTool and FileWriteTool demonstrate an important security pattern: sandboxing. Both tools accept a relative path and resolve it against a configured allowed directory. Before reading or writing, they check that the resolved absolute path is still inside the allowed directory. This prevents path traversal attacks where a malicious or confused LLM might try to read /etc/passwd or write to /usr/bin/something.
// FileReadTool reads files from a sandboxed directory.
// AllowedDir is the root directory outside of which no reads are permitted.
type FileReadTool struct {
AllowedDir string
}
func (f *FileReadTool) Name() string { return "file_read" }
func (f *FileReadTool) Description() string {
return "Read the contents of a file from the workspace directory."
}
func (f *FileReadTool) Schema() map[string]any {
return map[string]any{
"type": "object",
"properties": map[string]any{
"path": map[string]any{
"type": "string",
"description": "Relative path to the file within the workspace",
},
},
"required": []string{"path"},
}
}
func (f *FileReadTool) Execute(_ context.Context, args string) (string, error) {
var params struct {
Path string `json:"path"`
}
if err := MustParseArgs(args, ¶ms); err != nil {
return "", err
}
// Resolve and validate the path before reading.
safePath, err := f.safePath(params.Path)
if err != nil {
return "", err
}
data, err := os.ReadFile(safePath)
if err != nil {
return "", fmt.Errorf("read file: %w", err)
}
return string(data), nil
}
// safePath resolves a relative path against AllowedDir and verifies that
// the result is still inside AllowedDir. This prevents path traversal.
func (f *FileReadTool) safePath(rel string) (string, error) {
base, err := filepath.Abs(f.AllowedDir)
if err != nil {
return "", fmt.Errorf("resolve allowed dir: %w", err)
}
// filepath.Clean("/"+rel) normalizes ".." components before joining.
target := filepath.Join(base, filepath.Clean("/"+rel))
if !strings.HasPrefix(target, base+string(filepath.Separator)) &&
target != base {
return "", fmt.Errorf("path %q is outside the allowed directory", rel)
}
return target, nil
}
The ShellTool is the most powerful and therefore the most dangerous tool in the system. It executes shell commands, but only from a whitelist of allowed commands. The whitelist is configured per-agent, so a research agent might be allowed to run ls, cat, and wc, while a code execution agent might be allowed to run go run and python3. The tool also enforces a timeout to prevent runaway processes.
// ShellTool executes whitelisted shell commands in a sandboxed manner.
// TimeoutSeconds limits how long a command can run.
// AllowedCommands is the whitelist of permitted command names.
type ShellTool struct {
TimeoutSeconds int
AllowedCommands []string
}
func (s *ShellTool) Execute(ctx context.Context, args string) (string, error) {
var params struct {
Command string `json:"command"`
}
if err := MustParseArgs(args, ¶ms); err != nil {
return "", err
}
// Split the command into its name and arguments.
parts := strings.Fields(params.Command)
if len(parts) == 0 {
return "", fmt.Errorf("command cannot be empty")
}
cmdName := parts[0]
// Check the command against the whitelist.
allowed := false
for _, a := range s.AllowedCommands {
if a == cmdName {
allowed = true
break
}
}
if !allowed {
return "", fmt.Errorf(
"command %q is not in the allowed list: %s",
cmdName,
strings.Join(s.AllowedCommands, ", "),
)
}
// Create a context with a timeout and run the command.
timeout := time.Duration(s.TimeoutSeconds) * time.Second
cmdCtx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
var stdout, stderr bytes.Buffer
cmd := exec.CommandContext(cmdCtx, parts[0], parts[1:]...)
cmd.Stdout = &stdout
cmd.Stderr = &stderr
if err := cmd.Run(); err != nil {
if stderr.Len() > 0 {
return "", fmt.Errorf("command error: %w\nstderr: %s", err, stderr.String())
}
return "", fmt.Errorf("command error: %w", err)
}
result := stdout.String()
if stderr.Len() > 0 {
result += "\nstderr: " + stderr.String()
}
return result, nil
}
Adding a custom tool to GoAgent requires implementing the Tool interface, which typically takes about fifteen lines of code. The multi_tool_agent example demonstrates this with a JokeTool:
// JokeTool demonstrates how to add a custom tool in about fifteen lines.
// It takes no arguments and always returns the same joke.
type JokeTool struct{}
func (j *JokeTool) Name() string { return "tell_joke" }
func (j *JokeTool) Description() string {
return "Tells a programming joke. No arguments needed."
}
func (j *JokeTool) Schema() map[string]any {
// An empty properties object signals that this tool takes no arguments.
return map[string]any{
"type": "object",
"properties": map[string]any{},
}
}
func (j *JokeTool) Execute(_ context.Context, _ string) (string, error) {
return "Why do Go developers wear glasses? Because they can't C#!", nil
}
Once you have implemented the Tool interface, registering the tool is a single line of code. The agent will automatically include it in the function-calling specification sent to the LLM, and the LLM will be able to call it by name.
registry := tools.NewRegistry()
registry.Register(&JokeTool{})
PART 6: THE REACT LOOP -- THE HEART OF THE AGENT
The ReACT loop is the algorithmic core of every agentic AI system. The name stands for Reason, Act, Observe. In each iteration of the loop, the agent asks the LLM to reason about the current state of the conversation and decide what to do next. If the LLM decides to call a tool, the agent executes the tool, records the result as an observation, and feeds it back to the LLM in the next iteration. If the LLM decides it has enough information to give a final answer, the loop ends and the answer is returned.
The ReACT pattern was introduced in a 2022 paper by Yao et al. and has since become the dominant paradigm for tool-using language model agents. The key insight is that by interleaving reasoning steps with action steps in the conversation history, you give the LLM a scratchpad where it can think through multi-step problems, observe the results of its actions, and revise its plan.
In GoAgent, the ReACT loop is implemented in the Run method of the Agent struct. Let us walk through it step by step.
First, the agent struct itself. It holds all the components needed for the loop: the configuration, the LLM provider, the tool registry, the skills manager, the conversation memory, and the four learning components.
// Agent is the central orchestrator. It holds all components needed
// to execute the ReACT loop and manage learning across sessions.
type Agent struct {
config Config // Name, system prompt, limits, flags
provider llm.Provider // The LLM backend (OpenAI, Ollama, etc.)
registry *tools.Registry // Available tools
skills *skills.Manager // Behavioral skill definitions
memory *Memory // Conversation history (sliding window)
episodes *learning.EpisodicStore // Past task outcomes and lessons
profile *learning.UserProfile // Persistent user facts and prefs
ragStore *learning.RAGStore // TF-IDF document retrieval
reflector *learning.Reflector // Post-task reflection engine
}
The Step type records what happened in each iteration of the loop. This trace is invaluable for debugging and for the reflection system, which analyzes it after the task completes to extract lessons.
// StepType describes what happened in a single ReACT step.
type StepType string
const (
StepToolCall StepType = "tool_call" // LLM requested a tool
StepObservation StepType = "observation" // Tool returned a result
StepFinalAnswer StepType = "final_answer" // LLM gave a final answer
)
// Step is a single entry in the agent's reasoning trace.
// The trace is used for debugging and for post-task reflection.
type Step struct {
Type StepType
ToolName string // Set when Type is StepToolCall or StepObservation
ToolArgs string // JSON-encoded arguments; set when Type is StepToolCall
Content string // The LLM's text content; set when Type is StepFinalAnswer
Observation string // The tool's output; set when Type is StepObservation
}
Now the Run method itself. This is the most important function in the entire codebase. It is worth reading carefully.
// Run executes the full ReACT loop for a given user input.
// It returns the final answer, the full reasoning trace, and any error.
func (a *Agent) Run(
ctx context.Context,
userInput string,
) (string, []Step, error) {
// Step 1: Update the system prompt with fresh context from the RAG store.
// The RAG store finds the most relevant document chunks for this query
// and injects them into the system prompt so the LLM has access to them.
ragContext := ""
if a.ragStore != nil && a.config.EnableRAG {
ragContext = a.ragStore.ToContextString(userInput, 3)
}
a.memory.UpdateSystemPrompt(a.buildSystemPrompt(ragContext))
// Step 2: Inject relevant lessons from past episodes.
// The episodic store finds lessons from similar past tasks and injects
// them as a system message so the LLM can learn from experience.
if a.episodes != nil {
lessons := a.episodes.RelevantLessons(userInput, 3)
if len(lessons) > 0 {
var sb strings.Builder
sb.WriteString("Relevant lessons from past experience:\n")
for _, l := range lessons {
sb.WriteString(fmt.Sprintf("- %s\n", l))
}
a.memory.Add(llm.Message{
Role: llm.RoleSystem,
Content: sb.String(),
})
}
}
// Step 3: Add the user's input to the conversation history.
a.memory.Add(llm.Message{Role: llm.RoleUser, Content: userInput})
var trace []Step
toolDefs := a.registry.ToOpenAITools()
// Step 4: The main ReACT loop. Each iteration is one round of
// Reason -> Act -> Observe.
for iteration := 0; iteration < a.config.MaxIterations; iteration++ {
// Respect context cancellation at the start of each iteration.
select {
case <-ctx.Done():
return "", trace, ctx.Err()
default:
}
if a.config.Verbose {
fmt.Printf("[%s] Iteration %d/%d\n",
a.config.Name, iteration+1, a.config.MaxIterations)
}
// REASON: Ask the LLM what to do next, given the full conversation
// history and the list of available tools.
resp, err := a.provider.Complete(ctx, llm.CompletionRequest{
Messages: a.memory.All(),
Tools: toolDefs,
Model: a.provider.ModelName(),
})
if err != nil {
return "", trace, fmt.Errorf(
"LLM error on iteration %d: %w", iteration+1, err)
}
// FINAL ANSWER: If the LLM stopped without requesting tools,
// it has given us its final answer. Record it and return.
if resp.StopReason == "stop" || len(resp.ToolCalls) == 0 {
step := Step{Type: StepFinalAnswer, Content: resp.Content}
trace = append(trace, step)
a.memory.Add(llm.Message{
Role: llm.RoleAssistant,
Content: resp.Content,
})
// Persist the conversation to disk if a memory file is configured.
if a.config.MemoryFile != "" {
_ = a.memory.SaveToFile(a.config.MemoryFile)
}
// Trigger post-task reflection asynchronously so it does not
// block the response to the user.
if a.config.EnableLearning && a.reflector != nil {
go a.runReflection(
context.Background(), userInput, trace, resp.Content, true)
}
return resp.Content, trace, nil
}
// ACT: The LLM wants to call one or more tools. Record the tool
// calls in the trace and in the conversation history.
a.memory.Add(llm.Message{
Role: llm.RoleAssistant,
Content: resp.Content,
ToolCalls: resp.ToolCalls,
})
for _, tc := range resp.ToolCalls {
step := Step{
Type: StepToolCall,
ToolName: tc.Name,
ToolArgs: tc.Arguments,
}
trace = append(trace, step)
}
// Execute all tool calls in parallel using goroutines.
// This is a significant performance win when the LLM requests
// multiple tools in a single iteration.
results := a.executeToolsParallel(ctx, resp.ToolCalls)
// OBSERVE: Add each tool result to the conversation history
// and to the trace. The LLM will read these in the next iteration.
for _, result := range results {
obsStep := Step{
Type: StepObservation,
ToolName: result.toolName,
Observation: result.observation,
}
trace = append(trace, obsStep)
a.memory.Add(llm.Message{
Role: llm.RoleTool,
Content: result.observation,
ToolCallID: result.callID,
})
}
}
// If we reach here, the agent exceeded its iteration limit without
// producing a final answer. Trigger a failure reflection and return an error.
if a.config.EnableLearning && a.reflector != nil {
go a.runReflection(
context.Background(), userInput, trace,
"max iterations exceeded", false)
}
return "", trace, fmt.Errorf(
"agent '%s' exceeded max iterations (%d) without a final answer",
a.config.Name, a.config.MaxIterations,
)
}
The parallel tool execution is implemented using a goroutine per tool call and a WaitGroup to synchronize the results. The results slice is pre-allocated with the correct length so that each goroutine can write to its own index without any locking.
// toolResult holds the result of a single parallel tool execution.
type toolResult struct {
callID string
toolName string
observation string
}
// executeToolsParallel runs all tool calls concurrently and returns
// ordered results. The order matches the order of the input calls slice,
// which is important for correct message ordering in the conversation.
func (a *Agent) executeToolsParallel(
ctx context.Context,
calls []llm.ToolCall,
) []toolResult {
results := make([]toolResult, len(calls))
var wg sync.WaitGroup
for i, tc := range calls {
wg.Add(1)
// Capture loop variables by value to avoid the classic Go closure bug.
go func(idx int, call llm.ToolCall) {
defer wg.Done()
// Check for cancellation before starting the tool.
select {
case <-ctx.Done():
results[idx] = toolResult{
callID: call.ID,
toolName: call.Name,
observation: fmt.Sprintf("ERROR: context cancelled: %v", ctx.Err()),
}
return
default:
}
obs := a.executeTool(ctx, call)
results[idx] = toolResult{
callID: call.ID,
toolName: call.Name,
observation: obs,
}
}(i, tc)
}
wg.Wait()
return results
}
// executeTool runs a single tool and returns its output as a string.
// Errors are converted to error strings so the LLM can read them and
// decide how to proceed.
func (a *Agent) executeTool(ctx context.Context, tc llm.ToolCall) string {
tool, ok := a.registry.Get(tc.Name)
if !ok {
return fmt.Sprintf(
"ERROR: Unknown tool '%s'. Available: %s",
tc.Name,
strings.Join(a.registry.Names(), ", "),
)
}
result, err := tool.Execute(ctx, tc.Arguments)
if err != nil {
return fmt.Sprintf("ERROR executing '%s': %v", tc.Name, err)
}
return result
}
Notice that tool errors are returned as strings rather than Go errors. This is intentional. If a tool fails, we want the LLM to know about it so it can try a different approach, retry with different arguments, or tell the user that the operation failed. Returning a Go error would terminate the loop; returning an error string lets the LLM handle the failure gracefully.
The flow of a typical multi-step agent interaction looks like this:
User: "What is the square root of the number of results for 'Go programming'?"
Iteration 1:
REASON: LLM sees the question and decides to search the web first.
ACT: LLM calls web_search({"query": "Go programming", "num_results": 5})
OBSERVE: "1. The Go Programming Language... 2. Go by Example... [5 results]"
Iteration 2:
REASON: LLM sees the search results and decides to count them and calculate.
ACT: LLM calls calculator({"expression": "sqrt(5)"})
OBSERVE: "2.23606797749979"
Iteration 3:
REASON: LLM has all the information it needs.
FINAL ANSWER: "The square root of 5 (the number of search results for
'Go programming') is approximately 2.24."
Each iteration adds messages to the conversation history, building up a rich context that the LLM can reason about. The conversation history is managed by the Memory struct.
PART 7: CONVERSATION MEMORY
The Memory struct manages the sliding window of conversation history that is sent to the LLM on each iteration. It is thread-safe, supports a configurable maximum number of turns, and can be persisted to and loaded from a JSON file.
The sliding window is important because LLMs have a context window limit. If you accumulate messages indefinitely, you will eventually exceed the model's context length and get an error. By keeping only the most recent N turns, you ensure that the conversation history always fits within the context window. The system prompt is always kept, regardless of the window size, because it contains the agent's identity and instructions.
// Memory stores the conversation history for an agent session.
// It is safe for concurrent use by multiple goroutines.
type Memory struct {
mu sync.RWMutex
messages []llm.Message
maxTurns int // 0 means unlimited; positive values enforce a sliding window
}
// NewMemory creates a new memory store with the given maximum turn count.
func NewMemory(maxTurns int) *Memory {
return &Memory{maxTurns: maxTurns}
}
// Add appends a message to the conversation history.
// If maxTurns is positive and the history exceeds the limit, the oldest
// non-system messages are dropped to make room.
func (m *Memory) Add(msg llm.Message) {
m.mu.Lock()
defer m.mu.Unlock()
m.messages = append(m.messages, msg)
m.trim()
}
// trim enforces the sliding window. It always keeps the system prompt
// (the first message) and drops the oldest non-system messages when the
// window is exceeded.
func (m *Memory) trim() {
if m.maxTurns <= 0 || len(m.messages) <= m.maxTurns {
return
}
// Find the system messages (always at the beginning).
systemEnd := 0
for systemEnd < len(m.messages) &&
m.messages[systemEnd].Role == llm.RoleSystem {
systemEnd++
}
// Calculate how many non-system messages to keep.
nonSystem := m.messages[systemEnd:]
keep := m.maxTurns - systemEnd
if keep < 0 {
keep = 0
}
if len(nonSystem) > keep {
nonSystem = nonSystem[len(nonSystem)-keep:]
}
// Rebuild the messages slice.
m.messages = append(m.messages[:systemEnd], nonSystem...)
}
// All returns a copy of all messages in the conversation history.
// The copy prevents the caller from modifying the internal slice.
func (m *Memory) All() []llm.Message {
m.mu.RLock()
defer m.mu.RUnlock()
out := make([]llm.Message, len(m.messages))
copy(out, m.messages)
return out
}
// UpdateSystemPrompt replaces the system message at the beginning of the
// history. This is called at the start of each Run to inject fresh RAG
// context and user profile information.
func (m *Memory) UpdateSystemPrompt(content string) {
m.mu.Lock()
defer m.mu.Unlock()
if len(m.messages) > 0 && m.messages[0].Role == llm.RoleSystem {
m.messages[0].Content = content
} else {
m.messages = append(
[]llm.Message{{Role: llm.RoleSystem, Content: content}},
m.messages...,
)
}
}
The memory can be persisted to a JSON file so that conversations survive process restarts. The atomicWriteFile helper writes to a temporary file and then renames it, which ensures that the file is never in a partially-written state even if the process is killed mid-write.
// SaveToFile persists the conversation history to a JSON file.
// It uses an atomic write (temp file + rename) to prevent corruption.
func (m *Memory) SaveToFile(path string) error {
m.mu.RLock()
data, err := json.MarshalIndent(m.messages, "", " ")
m.mu.RUnlock()
if err != nil {
return fmt.Errorf("marshal memory: %w", err)
}
return atomicWriteFile(path, data)
}
// LoadFromFile restores the conversation history from a JSON file.
// If the file does not exist, it returns nil (fresh start).
func (m *Memory) LoadFromFile(path string) error {
data, err := os.ReadFile(path)
if err != nil {
if os.IsNotExist(err) {
return nil
}
return fmt.Errorf("read memory file: %w", err)
}
m.mu.Lock()
defer m.mu.Unlock()
return json.Unmarshal(data, &m.messages)
}
// atomicWriteFile writes data to path atomically via temp-file + rename.
// This is safe against partial writes and process crashes.
func atomicWriteFile(path string, data []byte) error {
dir := filepath.Dir(path)
if err := os.MkdirAll(dir, 0755); err != nil {
return fmt.Errorf("mkdir: %w", err)
}
// Create a temporary file in the same directory as the target.
// Using the same directory ensures the rename is atomic (same filesystem).
tmp, err := os.CreateTemp(dir, ".mem-write-*")
if err != nil {
return fmt.Errorf("create temp: %w", err)
}
tmpName := tmp.Name()
if _, err := tmp.Write(data); err != nil {
tmp.Close()
os.Remove(tmpName)
return err
}
if err := tmp.Close(); err != nil {
os.Remove(tmpName)
return err
}
// Rename is atomic on POSIX systems when source and destination are
// on the same filesystem.
return os.Rename(tmpName, path)
}
PART 8: THE LEARNING SYSTEM
One of the most sophisticated aspects of GoAgent is its learning system. Most agent frameworks treat each conversation as completely independent. GoAgent accumulates knowledge across sessions through four distinct mechanisms that work together to make the agent progressively more effective over time.
The first mechanism is the episodic memory store. After each task, the agent records what happened: the task description, the outcome (success or failure), the lessons learned, and a timestamp. On subsequent tasks, the agent retrieves the most relevant past episodes and injects their lessons into the conversation as system messages. This is analogous to how a human expert draws on past experience when faced with a familiar type of problem.
// Episode records the outcome of a single task execution.
// It is stored persistently and used to inform future task executions.
type Episode struct {
Timestamp time.Time `json:"timestamp"`
Task string `json:"task"`
Outcome string `json:"outcome"` // "success" or "failure"
Lessons []string `json:"lessons"` // Reusable insights extracted by reflection
ToolsUsed []string `json:"tools_used"`
}
// EpisodicStore persists episodes and retrieves relevant ones for new tasks.
type EpisodicStore struct {
mu sync.RWMutex
episodes []*Episode
path string
}
// RelevantLessons returns the lessons from the n most relevant past episodes.
// Relevance is determined by simple keyword overlap between the query and
// the task descriptions of past episodes.
func (s *EpisodicStore) RelevantLessons(query string, n int) []string {
s.mu.RLock()
defer s.mu.RUnlock()
queryWords := strings.Fields(strings.ToLower(query))
type scored struct {
episode *Episode
score int
}
var candidates []scored
for _, ep := range s.episodes {
taskWords := strings.Fields(strings.ToLower(ep.Task))
score := 0
for _, qw := range queryWords {
for _, tw := range taskWords {
if qw == tw {
score++
}
}
}
if score > 0 {
candidates = append(candidates, scored{ep, score})
}
}
// Sort by relevance score, highest first.
sort.Slice(candidates, func(i, j int) bool {
return candidates[i].score > candidates[j].score
})
var lessons []string
seen := make(map[string]bool)
for _, c := range candidates {
for _, l := range c.episode.Lessons {
if !seen[l] && len(lessons) < n {
lessons = append(lessons, l)
seen[l] = true
}
}
if len(lessons) >= n {
break
}
}
return lessons
}
The second mechanism is the user profile. The user profile accumulates facts and preferences about the user across sessions. Facts are things like "User works with Python code" or "User prefers metric units". Preferences are key-value pairs like "response_style=concise" or "language=German". The profile is injected into the system prompt at the start of each conversation, allowing the agent to personalize its responses without the user having to re-state their preferences every time.
// UserProfile stores persistent facts and preferences about the user.
// It is updated by the reflection system after each task.
type UserProfile struct {
mu sync.RWMutex
Facts []string `json:"facts"`
Preferences map[string]string `json:"preferences"`
UpdatedAt time.Time `json:"updated_at"`
path string
}
// AddFact adds a new fact to the profile if it is not already present.
func (p *UserProfile) AddFact(fact string) error {
p.mu.Lock()
defer p.mu.Unlock()
for _, f := range p.Facts {
if f == fact {
return nil // Already known
}
}
p.Facts = append(p.Facts, fact)
p.UpdatedAt = time.Now()
return p.save()
}
// Summary returns a human-readable summary of the profile for injection
// into the system prompt.
func (p *UserProfile) Summary() string {
p.mu.RLock()
defer p.mu.RUnlock()
if len(p.Facts) == 0 && len(p.Preferences) == 0 {
return ""
}
var sb strings.Builder
if len(p.Facts) > 0 {
sb.WriteString("Known facts: " + strings.Join(p.Facts, "; ") + "\n")
}
if len(p.Preferences) > 0 {
prefs := make([]string, 0, len(p.Preferences))
for k, v := range p.Preferences {
prefs = append(prefs, k+"="+v)
}
sb.WriteString("Preferences: " + strings.Join(prefs, ", ") + "\n")
}
return sb.String()
}
The third mechanism is the RAG store. RAG stands for Retrieval-Augmented Generation. The idea is to give the agent access to a large body of documents without putting all of them in the context window at once. Instead, you index the documents and retrieve only the most relevant chunks for each query. GoAgent implements this using TF-IDF (Term Frequency - Inverse Document Frequency), which is a classical information retrieval technique that scores documents based on how frequently query terms appear in them relative to how frequently those terms appear across all documents.
TF-IDF is not as powerful as embedding-based retrieval (which uses vector similarity), but it has significant advantages: it requires no external service, no GPU, no embedding model, and no additional dependencies. For many practical use cases, keyword-based retrieval is good enough.
// Chunk is a piece of text with its TF scores pre-computed.
// Pre-computing TF scores at ingestion time makes retrieval fast.
type Chunk struct {
Source string `json:"source"` // Filename or URL
Text string `json:"text"` // The chunk content
TFScores map[string]float64 `json:"tf_scores"` // Term frequency scores
}
// IngestText splits text into chunks and adds them to the store.
// The TF scores are computed at ingestion time for fast retrieval later.
func (s *RAGStore) IngestText(source, text string) (int, error) {
rawChunks := splitIntoChunks(text, 150) // 150 words per chunk
s.mu.Lock()
defer s.mu.Unlock()
for _, raw := range rawChunks {
tokens := tokenize(raw)
tf := computeTF(tokens)
s.chunks = append(s.chunks, &Chunk{
Source: source,
Text: raw,
TFScores: tf,
})
}
return len(rawChunks), s.save()
}
// ToContextString retrieves the top-n most relevant chunks for a query
// and formats them as a context string for injection into the system prompt.
func (s *RAGStore) ToContextString(query string, n int) string {
s.mu.RLock()
defer s.mu.RUnlock()
if len(s.chunks) == 0 {
return ""
}
queryTokens := tokenize(query)
idf := computeIDF(s.chunks)
type scored struct {
chunk *Chunk
score float64
}
candidates := make([]scored, 0, len(s.chunks))
for _, c := range s.chunks {
score := tfidfScore(queryTokens, c.TFScores, idf)
if score > 0 {
candidates = append(candidates, scored{chunk: c, score: score})
}
}
sort.Slice(candidates, func(i, j int) bool {
return candidates[i].score > candidates[j].score
})
var sb strings.Builder
for i, c := range candidates {
if i >= n {
break
}
sb.WriteString(fmt.Sprintf("[%s]: %s\n\n", c.chunk.Source, c.chunk.Text))
}
return sb.String()
}
// computeTF calculates term frequency scores for a list of tokens.
// TF(t) = count(t) / total_tokens
func computeTF(tokens []string) map[string]float64 {
counts := make(map[string]int, len(tokens))
for _, t := range tokens {
counts[t]++
}
tf := make(map[string]float64, len(counts))
total := float64(len(tokens))
if total == 0 {
return tf
}
for term, count := range counts {
tf[term] = float64(count) / total
}
return tf
}
// computeIDF calculates inverse document frequency scores across all chunks.
// IDF(t) = log(total_docs / docs_containing_t) + 1
func computeIDF(chunks []*Chunk) map[string]float64 {
docCount := float64(len(chunks))
termDocs := make(map[string]int)
for _, c := range chunks {
for term := range c.TFScores {
termDocs[term]++
}
}
idf := make(map[string]float64, len(termDocs))
for term, count := range termDocs {
idf[term] = math.Log(docCount/float64(count)) + 1
}
return idf
}
// tfidfScore computes the TF-IDF score for a query against a single chunk.
func tfidfScore(
queryTokens []string,
tf map[string]float64,
idf map[string]float64,
) float64 {
var score float64
for _, token := range queryTokens {
if tfVal, ok := tf[token]; ok {
score += tfVal * idf[token]
}
}
return score
}
The fourth mechanism is the reflection system. After each task completes, the reflector calls the LLM with a structured prompt that asks it to analyze what happened and extract lessons, user facts, user preferences, and potentially a new reusable skill. This is done asynchronously in a goroutine so it does not delay the response to the user.
The reflection system is a beautiful example of using the LLM's own reasoning capabilities to improve the agent's future performance. The LLM that just completed a task is in the best position to know what went well, what went poorly, and what a future agent should do differently.
// ReflectionInput is the data passed to the reflector after a task completes.
type ReflectionInput struct {
Task string // The original user request
Steps int // Number of ReACT iterations taken
ToolsUsed []string // Names of tools that were called
Trace string // Human-readable trace of the reasoning steps
Answer string // The final answer given to the user
Success bool // Whether the task completed successfully
}
// ReflectionOutput is the structured output from the reflection LLM call.
type ReflectionOutput struct {
Lessons []string `json:"lessons"` // Reusable insights
UserFacts []string `json:"user_facts"` // Facts about the user
UserPrefs map[string]string `json:"user_preferences"` // User preferences
NewSkillName string `json:"new_skill_name"` // Name of a new skill
NewSkillContent string `json:"new_skill_content"` // Markdown content
}
// Reflect analyzes a completed task and updates the learning stores.
// It calls the LLM with a structured prompt and parses the JSON response.
func (r *Reflector) Reflect(
ctx context.Context,
caller LLMCaller,
input ReflectionInput,
) (*ReflectionOutput, error) {
outcome := "success"
if !input.Success {
outcome = "failure"
}
systemPrompt := `You are a reflection engine for an AI agent.
Analyze the completed task and respond with a JSON object containing:
lessons: 1-3 reusable lessons for future similar tasks
user_facts: only clearly inferable facts (e.g. "User works with Python code")
user_preferences: only clearly inferable preferences (e.g. {"response_style": "concise"})
new_skill_name: only if a highly reusable pattern was found; else ""
new_skill_content: full .md skill file if new_skill_name is non-empty, else ""`
userPrompt := fmt.Sprintf( "Task: %s\nOutcome: %s\nSteps taken: %d\nTools used: %s\n"+ "Final answer: %s\n\nTrace:\n%s", input.Task, outcome, input.Steps, strings.Join(input.ToolsUsed, ", "), input.Answer, input.Trace, ) raw, err := caller.CallLLM(ctx, systemPrompt, userPrompt) if err != nil { return nil, fmt.Errorf("reflection LLM call failed: %w", err) } // Strip markdown code fences if the LLM wrapped the JSON in them. raw = strings.TrimSpace(raw) if strings.HasPrefix(raw, "```") { lines := strings.Split(raw, "\n") if len(lines) > 2 { raw = strings.Join(lines[1:len(lines)-1], "\n") } } var output ReflectionOutput if err := json.Unmarshal([]byte(raw), &output); err != nil { return nil, fmt.Errorf("parse reflection output: %w (raw: %.200s)", err, raw) } // Persist the episode to the episodic store. episode := &Episode{ Timestamp: time.Now(), Task: input.Task, Outcome: outcome, Lessons: output.Lessons, ToolsUsed: input.ToolsUsed, } _ = r.episodes.Add(episode) return &output, nil}
The system prompt is assembled dynamically at the start of each Run call by the buildSystemPrompt method. It combines the base system prompt from the agent configuration with the user profile summary and the RAG context. This is what gives the agent its personality, its knowledge of the user, and its access to relevant documents.
// buildSystemPrompt constructs the full system prompt with all context injected.
// This is called at the start of each Run to ensure fresh context.
func (a *Agent) buildSystemPrompt(ragContext string) string {
var sb strings.Builder
// Start with the base system prompt (the agent's personality).
sb.WriteString(a.config.SystemPrompt)
sb.WriteString("\n\n")
// Inject the user profile if it has any content.
if a.profile != nil {
if summary := a.profile.Summary(); summary != "" {
sb.WriteString("## User Profile\n")
sb.WriteString(summary)
sb.WriteString("\n")
}
}
// Inject the RAG context if it has any content.
if ragContext != "" {
sb.WriteString("## Relevant Context\n")
sb.WriteString(ragContext)
sb.WriteString("\n")
}
return sb.String()
}
PART 9: AGENT CONFIGURATION AND THE YAML DEFINITION SYSTEM
One of GoAgent's most practical features is the ability to define agents in YAML without writing any Go code. This makes it possible for non-programmers to create new agents, and it makes it easy for programmers to experiment with different agent personalities and tool configurations without recompiling.
The Config struct in agent/config.go defines all the runtime parameters for a single agent instance. Every field has a sensible default provided by the DefaultConfig function.
// Config holds all runtime configuration for a single agent instance.
type Config struct {
Name string // Display name, used in log messages and prompts
SystemPrompt string // The agent's personality and instructions
MaxIterations int // Maximum ReACT loop iterations (prevents infinite loops)
MaxMemoryTurns int // Sliding window size for conversation history
Verbose bool // Whether to print iteration progress to stdout
MemoryFile string // Path to persist conversation history (empty = no persistence)
LearningDir string // Directory for episodic, profile, and RAG data
SkillsDir string // Directory containing skill Markdown files
EnableLearning bool // Whether to run post-task reflection
EnableRAG bool // Whether to inject RAG context into the system prompt
}
// DefaultConfig returns sensible production defaults.
// These can be overridden field by field after calling DefaultConfig().
func DefaultConfig() Config {
return Config{
Name: "GoAgent",
SystemPrompt: "You are a helpful, precise AI assistant. Think step by step.",
MaxIterations: 15,
MaxMemoryTurns: 50,
Verbose: true,
LearningDir: "./data/learning",
SkillsDir: "./skills/skills",
EnableLearning: true,
EnableRAG: true,
}
}
The AgentDefinition struct in agentconfig/agentconfig.go mirrors the Config struct but is designed to be loaded from YAML. It adds scheduling and tool selection fields that are not part of the runtime Config.
// AgentDefinition is the schema for a YAML agent configuration file.
// Every field maps 1-to-1 with agent.Config plus scheduling and tool selection.
type AgentDefinition struct {
// Identity
Name string `yaml:"name"`
Description string `yaml:"description"`
// Behaviour
SystemPrompt string `yaml:"system_prompt"`
MaxIterations int `yaml:"max_iterations"`
MaxMemoryTurns int `yaml:"max_memory_turns"`
Verbose bool `yaml:"verbose"`
EnableLearning bool `yaml:"enable_learning"`
EnableRAG bool `yaml:"enable_rag"`
// Tool selection (names from the built-in tool registry)
Tools []string `yaml:"tools"`
ShellAllowedCommands []string `yaml:"shell_allowed_commands"`
// Scheduling (optional)
Schedule string `yaml:"schedule"` // "once" | "background" | "daily at HH:MM"
Task string `yaml:"task"` // The task/prompt to run on a schedule
// Auto-start: if true, GoAgent starts this agent automatically on launch
AutoStart bool `yaml:"auto_start"`
}
The Hermes agent definition demonstrates how to configure a general-purpose assistant with the full tool suite and learning enabled:
# agents/hermes.yaml
# Hermes is a general-purpose self-improving assistant.
# Set auto_start: true to have GoAgent run this agent immediately on launch.
name: Hermes
description: "A highly capable general-purpose AI assistant"
system_prompt: |
You are Hermes, a highly capable, self-improving AI assistant.
You are precise, thorough, and always verify facts using your tools.
You learn from every interaction and adapt to the user's needs and preferences.
When doing math, use the calculator. When you need current information, search the web.
When asked to save or create files, use the file_write tool.
max_iterations: 15
max_memory_turns: 50
verbose: true
enable_learning: true
enable_rag: true
tools:
- web_search
- calculator
- file_read
- file_write
- shell
shell_allowed_commands:
- ls
- cat
- echo
- pwd
- date
- wc
The OpenClaw agent definition shows how to configure a specialized research agent with a different personality, a higher iteration limit, and a more restricted tool set:
# agents/openclaw.yaml
# OpenClaw is a meticulous research agent that writes structured reports.
name: OpenClaw
description: "A research-focused agent that searches the web and writes structured reports"
system_prompt: |
You are OpenClaw, a meticulous research agent.
Your job is to find accurate, up-to-date information using web search,
synthesize it into clear structured reports, and save them to files.
Always cite your sources. Never fabricate information.
Prefer bullet points and headers for readability.
max_iterations: 20
max_memory_turns: 40
verbose: false
enable_learning: true
enable_rag: true
tools:
- web_search
- calculator
- file_read
- file_write
auto_start: false
The key difference between Hermes and OpenClaw is not just their system prompts but their tool sets and iteration limits. OpenClaw does not have shell access because a research agent does not need to execute system commands. It has a higher iteration limit because research tasks often require more steps than general-purpose tasks. Its verbose flag is false because it runs in the background and does not need to print progress to the terminal.
The LoadDir function scans a directory for YAML files and loads each one as an AgentDefinition. It validates each definition and skips any that are invalid, printing a warning to stderr.
// LoadDir reads all *.yaml and *.yml files from dir and returns valid definitions.
// Invalid files are skipped with a warning rather than causing a fatal error.
func LoadDir(dir string) ([]*AgentDefinition, error) {
entries, err := os.ReadDir(dir)
if err != nil {
if os.IsNotExist(err) {
return nil, nil // No agents directory is fine
}
return nil, fmt.Errorf("read agents dir %q: %w", dir, err)
}
var defs []*AgentDefinition
for _, entry := range entries {
if entry.IsDir() {
continue
}
ext := strings.ToLower(filepath.Ext(entry.Name()))
if ext != ".yaml" && ext != ".yml" {
continue
}
path := filepath.Join(dir, entry.Name())
def, err := LoadFile(path)
if err != nil {
fmt.Fprintf(os.Stderr, "Skipping agent config %q: %v\n", path, err)
continue
}
defs = append(defs, def)
}
return defs, nil
}
PART 10: THE SKILLS SYSTEM
Skills are reusable behavioral patterns defined in Markdown files. They are injected into the system prompt when the agent is initialized, giving it domain-specific knowledge and procedural guidance without requiring fine-tuning or prompt engineering in the YAML configuration.
A skill file is a Markdown document that describes how to approach a particular type of task. The summarizer skill, for example, tells the agent how to structure a summary. The researcher skill tells it how to conduct web research systematically.
# skills/skills/summarizer.md
# Summarizer Skill
When asked to summarize content:
1. Identify the main topic and key points.
2. Extract the most important facts and insights.
3. Present the summary in a clear, structured format.
4. Keep the summary concise but comprehensive.
5. If the content is technical, preserve technical accuracy.
Always indicate the source of the content being summarized.
# skills/skills/researcher.md
# Researcher Skill
When asked to research a topic:
1. Search for current information using web_search.
2. Cross-reference multiple sources to verify facts.
3. Identify key facts, statistics, and expert opinions.
4. Present findings in a structured format with sources cited.
Always note the date of information and flag anything that may be outdated.
The reflection system can generate new skill files automatically. When the reflector identifies a highly reusable pattern in a completed task, it asks the LLM to write a new skill file and saves it to the skills directory. On the next run, the agent will load this skill and benefit from the accumulated knowledge. This is a form of automated prompt engineering that improves the agent's performance over time without any human intervention.
PART 11: THE SCHEDULER
The scheduler manages the lifecycle of agent jobs. A job is an agent paired with a task and a schedule. The scheduler supports three run modes: once (run the task immediately or at a specific time), daily (run the task every day at a specific time), and background (run the task in a continuous loop with a brief pause between runs).
The Schedule type is parsed from a human-readable string by the ParseSchedule function. The string format is designed to be easy to type at a CLI prompt.
// RunMode defines how an agent job is scheduled.
type RunMode string
const (
RunBackground RunMode = "background" // Continuous loop with brief pauses
RunOnce RunMode = "once" // Run once, immediately or at a time
RunDaily RunMode = "daily" // Run every day at a specific time
)
// Schedule holds the parsed scheduling information for a job.
type Schedule struct {
Mode RunMode
AtTime time.Time // For RunOnce: the scheduled time (zero = immediate)
Hour int // For RunDaily: the hour (0-23)
Minute int // For RunDaily: the minute (0-59)
}
// ParseSchedule parses a schedule string into a Schedule.
// Accepted formats:
// "once" -- run immediately, once
// "background" -- run continuously
// "daily at HH:MM" -- run every day at the given time
// "once at HH:MM" -- run once at the given time today (or tomorrow if past)
func ParseSchedule(s string) (Schedule, error) {
s = strings.TrimSpace(s)
switch strings.ToLower(s) {
case "once", "":
return Schedule{Mode: RunOnce}, nil
case "background":
return Schedule{Mode: RunBackground}, nil
}
// ... regex matching for "daily at HH:MM" and "once at HH:MM" ...
}
The AgentJob struct holds all the state for a running job: the job configuration, the agent instance, the current status, the history of results, and the goroutine cancellation function.
// AgentJob represents a running or scheduled agent with lifecycle management.
type AgentJob struct {
mu sync.RWMutex
ID string // Unique identifier (sanitized agent name)
Config AgentJobConfig // Everything needed to recreate the agent
Schedule Schedule // When and how often to run
Status Status // Current state: pending, running, done, stopped, error
Results []JobResult // History of all run results
StartedAt time.Time // When the job was created
agent *agent.Agent // The agent instance
cancel context.CancelFunc // Cancels the job's goroutine
done chan struct{} // Closed when the goroutine exits
}
The Manager type holds all jobs in a thread-safe map and provides methods to add, stop, and list jobs. Adding a job creates the agent, starts a goroutine to run it, and returns immediately.
// Manager manages the lifecycle of all agent jobs.
type Manager struct {
mu sync.RWMutex
jobs map[string]*AgentJob
}
// Add creates and starts a new agent job.
// It returns immediately; the job runs in a background goroutine.
func (m *Manager) Add(cfg AgentJobConfig) (*AgentJob, error) {
// Create the agent from the job configuration.
a := agent.New(
cfg.AgentConfig,
cfg.Provider,
cfg.Registry,
cfg.SkillsManager,
cfg.Episodes,
cfg.Profile,
cfg.RAGStore,
)
// Create a cancellable context for the job's goroutine.
ctx, cancel := context.WithCancel(context.Background())
job := &AgentJob{
ID: cfg.ID,
Config: cfg,
Schedule: cfg.Schedule,
Status: StatusPending,
StartedAt: time.Now(),
agent: a,
cancel: cancel,
done: make(chan struct{}),
}
m.mu.Lock()
m.jobs[cfg.ID] = job
m.mu.Unlock()
// Start the job's goroutine.
go job.run(ctx)
return job, nil
}
The run method dispatches to the appropriate execution mode. The daily mode calculates the time until the next scheduled run and sleeps until then. The background mode runs the task repeatedly with a brief pause between runs. The once mode optionally waits until a scheduled time and then runs the task once.
// run is the job's internal goroutine. It dispatches to the appropriate
// execution mode and closes the done channel when it exits.
func (j *AgentJob) run(ctx context.Context) {
defer close(j.done)
switch j.Schedule.Mode {
case RunOnce:
j.runOnce(ctx)
case RunDaily:
j.runDaily(ctx)
case RunBackground:
j.runBackground(ctx)
default:
j.runOnce(ctx)
}
}
// runDaily runs the task every day at the configured time.
// It calculates the next scheduled time on each iteration, which correctly
// handles daylight saving time transitions.
func (j *AgentJob) runDaily(ctx context.Context) {
for {
now := time.Now()
next := time.Date(
now.Year(), now.Month(), now.Day(),
j.Schedule.Hour, j.Schedule.Minute, 0, 0, now.Location(),
)
// If the scheduled time has already passed today, schedule for tomorrow.
if next.Before(now) {
next = next.Add(24 * time.Hour)
}
select {
case <-ctx.Done():
j.setStatus(StatusStopped)
return
case <-time.After(time.Until(next)):
}
j.execute(ctx)
if ctx.Err() != nil {
j.setStatus(StatusStopped)
return
}
}
}
// execute runs the agent's task once and records the result.
func (j *AgentJob) execute(ctx context.Context) {
j.setStatus(StatusRunning)
start := time.Now()
answer, _, err := j.agent.Run(ctx, j.Config.Task)
duration := time.Since(start)
result := JobResult{
RunAt: start,
Answer: answer,
Error: err,
Duration: duration,
}
j.mu.Lock()
j.Results = append(j.Results, result)
if err != nil {
j.Status = StatusError
} else {
j.Status = StatusDone
}
j.mu.Unlock()
}
PART 12: THE ADAPTER SYSTEM
The adapter system allows GoAgent to receive messages from and send messages to external communication channels. The design is a classic plug-in architecture: a central interface, a registry, and a message bus that routes messages between adapters and the agent.
The Adapter interface defines the contract that every communication channel must implement. Start begins receiving messages and sends them to the provided channel. Send delivers a reply to a specific recipient. Stop shuts down the adapter gracefully.
// Adapter is the interface every external communication channel must implement.
// Implementing this interface is all that is required to add a new channel.
type Adapter interface {
// Name returns the unique identifier for this adapter.
Name() string
// Start begins receiving messages and sends them to the incoming channel.
// It should return quickly and do its work in a background goroutine.
Start(ctx context.Context, incoming chan<- IncomingMessage) error
// Send delivers a reply to a specific recipient on this channel.
Send(ctx context.Context, msg OutgoingMessage) error
// Stop shuts down the adapter gracefully.
Stop() error
}
// IncomingMessage represents a message arriving from an external channel.
type IncomingMessage struct {
AdapterName string // Which adapter received this message
SenderID string // Channel-specific sender identifier
Text string // The message text
Raw map[string]any // Channel-specific metadata
}
// OutgoingMessage represents a reply to be sent back through an adapter.
type OutgoingMessage struct {
AdapterName string // Which adapter should send this message
RecipientID string // Channel-specific recipient identifier
Text string // The reply text
}
The Telegram adapter implements the Adapter interface using the Telegram Bot API's long-polling mechanism. Long-polling means the adapter makes an HTTP request to the Telegram servers and waits up to 30 seconds for new messages. When messages arrive, it processes them and immediately makes another request. This approach requires no webhook server and works behind NAT and firewalls.
// TelegramAdapter connects GoAgent to Telegram via the Bot API (long-polling).
type TelegramAdapter struct {
token string
baseURL string
client *http.Client
mu sync.Mutex
lastUpdate int64 // The offset for the next getUpdates call
stopOnce sync.Once
stopCh chan struct{}
}
// Start begins long-polling for Telegram updates.
// It verifies the token by calling getMe and then starts the poll loop.
func (t *TelegramAdapter) Start(
ctx context.Context,
incoming chan<- adapters.IncomingMessage,
) error {
if t.token == "" {
return fmt.Errorf("telegram adapter: GOAGENT_TELEGRAM_TOKEN is not set")
}
me, err := t.getMe()
if err != nil {
return fmt.Errorf("telegram adapter: getMe failed (check token): %w", err)
}
log.Printf("Telegram bot connected: @%s (ID: %d)", me.Username, me.ID)
go t.pollLoop(ctx, incoming)
return nil
}
// pollLoop continuously polls for updates and sends them to the incoming channel.
// It handles errors gracefully by retrying after a delay.
func (t *TelegramAdapter) pollLoop(
ctx context.Context,
incoming chan<- adapters.IncomingMessage,
) {
for {
select {
case <-ctx.Done():
return
case <-t.stopCh:
return
default:
}
updates, err := t.getUpdates(ctx)
if err != nil {
// On error, wait before retrying to avoid hammering the API.
select {
case <-ctx.Done():
return
case <-t.stopCh:
return
case <-time.After(retryDelay):
}
continue
}
for _, upd := range updates {
if upd.Message == nil || upd.Message.Text == "" {
t.advanceOffset(upd.UpdateID)
continue
}
chatID := strconv.FormatInt(upd.Message.Chat.ID, 10)
select {
case incoming <- adapters.IncomingMessage{
AdapterName: adapterName,
SenderID: chatID,
Text: upd.Message.Text,
Raw: map[string]any{
"update_id": upd.UpdateID,
"chat_id": upd.Message.Chat.ID,
},
}:
case <-ctx.Done():
return
case <-t.stopCh:
return
}
t.advanceOffset(upd.UpdateID)
}
}
}
The iMessage adapter is more exotic. It reads from the macOS Messages SQLite database at ~/Library/Messages/chat.db. The database is updated by the Messages app whenever a new message arrives. The adapter polls the database every three seconds and sends new messages to the incoming channel. Replies are sent using AppleScript via the osascript command.
// IMessageAdapter polls ~/Library/Messages/chat.db for new messages.
// It requires macOS and Full Disk Access permission for the terminal.
type IMessageAdapter struct {
handle string // The phone number or email address to monitor
db *sql.DB // Read-only connection to chat.db
mu sync.Mutex
lastROWID int64 // The ROWID of the last processed message
stopOnce sync.Once
stopCh chan struct{}
}
// pollLoop polls the database on a timer and sends new messages to the channel.
func (a *IMessageAdapter) pollLoop(
ctx context.Context,
incoming chan<- adapters.IncomingMessage,
) {
ticker := time.NewTicker(pollInterval) // 3 seconds
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-a.stopCh:
return
case <-ticker.C:
msgs, err := a.fetchNewMessages()
if err != nil {
log.Printf("iMessage poll error: %v", err)
continue
}
for _, m := range msgs {
select {
case incoming <- m:
case <-ctx.Done():
return
case <-a.stopCh:
return
}
}
}
}
}
// Send delivers a reply via osascript (AppleScript).
// The message text and handle are escaped to prevent AppleScript injection.
func (a *IMessageAdapter) Send(_ context.Context, msg adapters.OutgoingMessage) error {
safeText := strings.ReplaceAll(msg.Text, `"`, `\"`)
safeHandle := strings.ReplaceAll(msg.RecipientID, `"`, `\"`)
script := fmt.Sprintf(`tell application "Messages"
set targetService to 1st service whose service type = iMessage
set targetBuddy to buddy "%s" of targetService
send "%s" to targetBuddy
end tell`, safeHandle, safeText)
cmd := exec.Command("osascript", "-e", script)
if out, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("osascript send failed: %w\noutput: %s", err, string(out))
}
return nil
}
The message Bus ties the adapter system together. It holds a reference to the adapter registry, a message handler function, and an incoming message channel. When Start is called, it starts all registered adapters and launches a dispatch loop goroutine. The dispatch loop reads messages from the incoming channel and calls the handler in a separate goroutine for each message, so slow agent responses do not block the receipt of new messages.
// Bus is the central message router.
type Bus struct {
registry *Registry
handler MessageHandler
incoming chan IncomingMessage
wg sync.WaitGroup
}
// dispatchLoop reads from the incoming channel and dispatches to the handler.
// Each message is handled in its own goroutine for concurrency.
func (b *Bus) dispatchLoop(ctx context.Context) {
defer b.wg.Done()
for {
select {
case <-ctx.Done():
return
case msg, ok := <-b.incoming:
if !ok {
return
}
// Handle each message concurrently so slow responses do not
// block the receipt of subsequent messages.
go b.handleMessage(ctx, msg)
}
}
}
// handleMessage calls the handler and routes the reply back to the adapter.
func (b *Bus) handleMessage(ctx context.Context, msg IncomingMessage) {
reply, err := b.handler(ctx, msg)
if err != nil {
log.Printf("Handler error for [%s/%s]: %v",
msg.AdapterName, msg.SenderID, err)
reply = fmt.Sprintf("Sorry, I encountered an error: %v", err)
}
if reply == "" {
return
}
adapter, ok := b.registry.Get(msg.AdapterName)
if !ok {
log.Printf("No adapter found for %q to send reply", msg.AdapterName)
return
}
if err := adapter.Send(ctx, OutgoingMessage{
AdapterName: msg.AdapterName,
RecipientID: msg.SenderID,
Text: reply,
}); err != nil {
log.Printf("Send error [%s -> %s]: %v",
msg.AdapterName, msg.SenderID, err)
}
}
The adapter system also supports Go plugin-based adapters. A plugin is a shared library (.so file on Linux, .dylib on macOS) that exports an Adapter symbol. The LoadPlugins function scans a directory for .so files and loads each one using Go's plugin package. This allows third-party adapters to be added without recompiling the main binary.
// To write a custom adapter plugin, implement the Adapter interface and
// export an "Adapter" symbol. The plugin loader will find it automatically.
package myadapter
import "github.com/ms1963/goagent/adapters"
// Adapter is the exported symbol that the plugin loader looks for.
var Adapter adapters.Adapter = &MyAdapter{}
type MyAdapter struct{}
func (m *MyAdapter) Name() string { return "myadapter" }
func (m *MyAdapter) Start(
ctx context.Context,
in chan<- adapters.IncomingMessage,
) error {
// Start receiving messages and sending them to the 'in' channel.
go m.receiveLoop(ctx, in)
return nil
}
func (m *MyAdapter) Send(
ctx context.Context,
msg adapters.OutgoingMessage,
) error {
// Deliver msg.Text to msg.RecipientID on your channel.
return nil
}
func (m *MyAdapter) Stop() error {
// Shut down gracefully.
return nil
}
To build the plugin and install it, you use the Go plugin build mode:
go build -buildmode=plugin -o adapters/myadapter.so ./path/to/myadapter/
PART 13: THE MAIN ENTRY POINT AND THE CLI
The main.go file is the glue that holds everything together. It implements a restart loop, which allows the system to be restarted without killing the process. This is useful for reloading configuration changes without downtime.
// main is the outer restart loop. When run() returns true, the system
// restarts. When it returns false, the process exits.
func main() {
for {
shouldRestart := run()
if !shouldRestart {
break
}
fmt.Println("\nRestarting GoAgent...\n")
time.Sleep(500 * time.Millisecond)
}
}
The run function initializes all subsystems, starts the adapters, and runs the CLI loop. It returns true if a restart was requested and false if the user asked to exit.
// run starts the system and returns true if a restart was requested.
func run() (restart bool) {
// Set up signal handling for graceful shutdown.
ctx, stop := signal.NotifyContext(
context.Background(), os.Interrupt, syscall.SIGTERM)
defer stop()
// 1. Create the LLM provider from environment variables.
baseURL := getEnv("GOAGENT_BASE_URL", "https://api.openai.com/v1")
apiKey := getEnv("GOAGENT_API_KEY", os.Getenv("OPENAI_API_KEY"))
model := getEnv("GOAGENT_MODEL", "gpt-4o")
provider := llm.NewOpenAIProvider(baseURL, apiKey, model)
// 2. Ensure base directories exist.
for _, dir := range []string{
"./workspace", "./data", "./data/learning",
"./skills/skills", "./adapters", "./agents",
} {
if err := os.MkdirAll(dir, 0755); err != nil {
log.Fatalf("Could not create directory %s: %v", dir, err)
}
}
// 3. Create the scheduler manager.
mgr := scheduler.NewManager()
defer mgr.StopAll()
// 4. Load YAML agent definitions.
agentDefs, err := agentconfig.LoadDir("./agents")
if err != nil {
log.Printf("Agent config loading: %v", err)
}
// 5. Auto-start agents that have auto_start: true.
for _, def := range agentDefs {
if def.AutoStart {
if err := startAgentFromDef(ctx, def, mgr, provider); err != nil {
log.Printf("Auto-start failed for %q: %v", def.Name, err)
}
}
}
// 6. Set up the adapter registry and message bus.
adapterRegistry := adapters.NewRegistry()
if tg := telegram.NewFromEnv(); tg != nil {
adapterRegistry.Register(tg)
}
if im := imessage.NewFromEnv(); im != nil {
adapterRegistry.Register(im)
}
// 7. Build the gateway agent that handles adapter messages.
gatewayAgent := buildGatewayAgent(provider)
handler := func(handlerCtx context.Context, msg adapters.IncomingMessage) (string, error) {
return gatewayAgent.Chat(handlerCtx, msg.Text)
}
// 8. Start the message bus.
bus := adapters.NewBus(adapterRegistry, handler)
if err := bus.Start(ctx); err != nil {
log.Printf("Bus start error: %v", err)
}
defer bus.Stop()
// 9. Run the CLI loop.
return runCLI(ctx, mgr, provider, agentDefs)
}
The CLI loop reads commands from stdin and dispatches them to handler functions. The commands are: create (create and schedule an agent), list (list running jobs), stop (stop a job), chat (start an interactive chat session), agents (list YAML-defined agents), adapters (show adapter status), restart (restart the system), help (show help), and exit (shut down).
The chat command is particularly interesting because it creates a full agent with all learning components enabled and runs an interactive conversation loop. The user can type "reset" to clear the conversation history, "profile" to see the accumulated user profile, "episodes" to see past task records, and "learn
PART 14: PUTTING IT ALL TOGETHER -- A COMPLETE EXAMPLE
Let us trace through a complete example to see how all the pieces work together. Suppose the user types "chat Hermes" at the CLI prompt, and then asks: "Research the latest developments in Go generics and save a report to workspace/go-generics.md"
Step 1: The cmdChat function is called with the argument "Hermes". It looks up the Hermes definition in the loaded YAML definitions, finds agents/hermes.yaml, and uses it to configure the agent. It creates the episodic store, user profile, RAG store, skills manager, and tool registry from the YAML definition. It then calls agent.New with all these components.
Step 2: agent.New creates the Memory, initializes the Reflector (because both episodes and profile are non-nil), loads the conversation history from the memory file if it exists, and adds the initial system prompt to memory.
Step 3: The user's input is received by the CLI loop and passed to a.Chat, which calls a.Run.
Step 4: Run updates the system prompt with fresh RAG context (empty on first run), injects any relevant episodic lessons (none on first run), and adds the user's message to memory.
Step 5: The first iteration of the ReACT loop begins. The LLM receives the system prompt, the user's message, and the tool definitions for web_search, calculator, file_read, file_write, and shell. It decides to search the web first and returns a tool call: web_search({"query": "Go generics 2024 latest developments", "num_results": 5}).
Step 6: executeToolsParallel is called with the single tool call. A goroutine is spawned that calls the WebSearchTool's Execute method. The tool makes an HTTP request to DuckDuckGo (or SerpAPI if configured) and returns the top 5 results as a formatted string.
Step 7: The tool result is added to memory as a tool message. The second iteration begins. The LLM reads the search results and decides to search for more specific information: web_search({"query": "Go 1.21 1.22 generics improvements type inference", "num_results": 5}).
Step 8: After several iterations of searching and synthesizing, the LLM has enough information to write the report. It calls file_write({"path": "go-generics.md", "content": "# Go Generics: Latest Developments\n\n..."}).
Step 9: The FileWriteTool validates the path (ensuring it is inside ./workspace), creates any necessary directories, and writes the file atomically. It returns "File written successfully: go-generics.md".
Step 10: The LLM reads the success message and gives a final answer: "I've researched the latest developments in Go generics and saved a comprehensive report to workspace/go-generics.md. The report covers..."
Step 11: Run returns the final answer and the trace. The answer is printed to the terminal.
Step 12: In a background goroutine, runReflection is called. It builds a ReflectionInput from the trace and calls the Reflector. The Reflector asks the LLM to analyze the task and extract lessons. The LLM might return something like: lessons: ["When researching technical topics, start with broad searches and then narrow down to specific version numbers", "file_write requires the path to be relative to the workspace directory"]. These lessons are stored in the episodic store and will be injected into future conversations when the user asks about similar topics.
This complete flow demonstrates how all eight subsystems work together to produce a result that is more than the sum of its parts: the agent not only completes the task but learns from it, improving its performance on future similar tasks.
PART 15: DESIGN DECISIONS AND LESSONS LEARNED
Several design decisions in GoAgent are worth examining explicitly, because they reflect hard-won lessons about building reliable agentic systems.
The decision to use the OpenAI function-calling format rather than a custom tool-calling protocol is the most important design decision in the system. The function-calling format is well-understood by all major LLMs, is supported by Ollama, Groq, Mistral, and other providers, and has a clear specification. By adopting it, GoAgent gets tool-calling support from any compatible LLM without any prompt engineering.
The decision to execute tool calls in parallel is a significant performance optimization. When the LLM requests multiple tools in a single iteration, which happens frequently with capable models like GPT-4o, executing them sequentially would add unnecessary latency. Parallel execution using goroutines is trivial in Go and can reduce iteration time by a factor equal to the number of concurrent tool calls.
The decision to return tool errors as strings rather than Go errors is crucial for robustness. If a tool fails, the LLM needs to know about it so it can adapt. Returning a Go error would terminate the loop and give the user an unhelpful error message. Returning an error string lets the LLM decide how to handle the failure: retry with different arguments, try a different tool, or tell the user that the operation is not possible.
The decision to use TF-IDF for RAG rather than embedding-based retrieval is a pragmatic choice that prioritizes simplicity and zero external dependencies over retrieval quality. For many use cases, keyword-based retrieval is good enough. If you need better retrieval quality, you can replace the RAGStore with an embedding-based implementation without changing any other code, because the interface is the same.
The decision to run reflection asynchronously in a goroutine is important for user experience. Reflection involves an additional LLM call, which can take several seconds. If reflection were synchronous, the user would have to wait for it before seeing the agent's response. By running it in the background, the user gets the response immediately and the reflection happens in parallel.
The decision to use a sliding window for conversation memory rather than summarization is a simplicity tradeoff. Summarization would allow the agent to maintain context over very long conversations, but it requires an additional LLM call and introduces the risk of losing important details in the summary. A sliding window is simpler, predictable, and sufficient for most use cases.
The decision to support YAML-based agent definitions without requiring code changes is a significant usability improvement. It allows non-programmers to create new agents and allows programmers to experiment with different configurations without the compile-edit-run cycle. The YAML format is simple enough to be self-documenting.
PART 16: EXTENDING GOAGENT
GoAgent is designed to be extended at every layer. Here is a summary of the extension points and how to use them.
To add a new LLM provider, implement the llm.Provider interface. The interface has two methods: Complete and ModelName. Your implementation can use any protocol, any authentication scheme, and any response format, as long as it translates to and from the internal llm.Message and llm.CompletionResponse types. For example, to support a hypothetical local model server that uses a different protocol:
// LocalModelProvider implements llm.Provider for a hypothetical local server
// that uses a custom binary protocol over a Unix socket.
type LocalModelProvider struct {
socketPath string
modelName string
}
func (p *LocalModelProvider) ModelName() string { return p.modelName }
func (p *LocalModelProvider) Complete(
ctx context.Context,
req llm.CompletionRequest,
) (*llm.CompletionResponse, error) {
// Connect to the Unix socket, serialize the request in the server's
// protocol, send it, receive the response, and deserialize it.
// The agent code never knows this is happening.
conn, err := net.Dial("unix", p.socketPath)
if err != nil {
return nil, fmt.Errorf("connect to local model: %w", err)
}
defer conn.Close()
// ... protocol-specific serialization and deserialization ...
return &llm.CompletionResponse{
Content: "response from local model",
StopReason: "stop",
}, nil
}
To add a new tool, implement the tools.Tool interface and register it with the registry. The interface has four methods: Name, Description, Schema, and Execute. The Schema method returns a JSON Schema object that the LLM uses to construct valid arguments. The Execute method receives a JSON-encoded argument string and returns a result string.
// DatabaseQueryTool allows the agent to query a PostgreSQL database.
// This demonstrates how to add a stateful tool that holds a connection pool.
type DatabaseQueryTool struct {
db *sql.DB
maxRows int // Maximum rows to return per query
readOnly bool // If true, only SELECT queries are allowed
}
func (d *DatabaseQueryTool) Name() string { return "database_query" }
func (d *DatabaseQueryTool) Description() string {
return "Execute a SQL query against the application database. " +
"Only SELECT queries are permitted."
}
func (d *DatabaseQueryTool) Schema() map[string]any {
return map[string]any{
"type": "object",
"properties": map[string]any{
"query": map[string]any{
"type": "string",
"description": "The SQL SELECT query to execute",
},
},
"required": []string{"query"},
}
}
func (d *DatabaseQueryTool) Execute(
ctx context.Context,
args string,
) (string, error) {
var params struct {
Query string `json:"query"`
}
if err := tools.MustParseArgs(args, ¶ms); err != nil {
return "", err
}
// Enforce read-only access by checking the query prefix.
if d.readOnly {
trimmed := strings.TrimSpace(strings.ToUpper(params.Query))
if !strings.HasPrefix(trimmed, "SELECT") {
return "", fmt.Errorf("only SELECT queries are permitted")
}
}
rows, err := d.db.QueryContext(ctx, params.Query)
if err != nil {
return "", fmt.Errorf("query error: %w", err)
}
defer rows.Close()
// Format the results as a simple text table.
cols, _ := rows.Columns()
var sb strings.Builder
sb.WriteString(strings.Join(cols, " | ") + "\n")
sb.WriteString(strings.Repeat("-", 40) + "\n")
count := 0
vals := make([]any, len(cols))
ptrs := make([]any, len(cols))
for i := range vals {
ptrs[i] = &vals[i]
}
for rows.Next() && count < d.maxRows {
if err := rows.Scan(ptrs...); err != nil {
continue
}
parts := make([]string, len(cols))
for i, v := range vals {
parts[i] = fmt.Sprintf("%v", v)
}
sb.WriteString(strings.Join(parts, " | ") + "\n")
count++
}
return sb.String(), nil
}
To add a new messaging adapter, implement the adapters.Adapter interface and either register it directly or build it as a Go plugin. The interface has four methods: Name, Start, Send, and Stop. Start should return quickly and do its work in a background goroutine. Send should be safe to call from multiple goroutines concurrently.
To add a new skill, create a Markdown file in the skills/skills directory. The file will be loaded automatically on the next startup. The content of the file is injected into the system prompt, so write it as instructions to the agent.
To add a new learning mechanism, you can extend the learning package with a new store type and integrate it into the agent's Run method. The existing stores (episodic, profile, RAG) demonstrate the pattern: a struct with a mutex, a path for persistence, load and save methods using atomic writes, and a method that returns a string for injection into the system prompt.
PART 17: CONFIGURATION AND DEPLOYMENT
GoAgent is configured entirely through environment variables, which makes it easy to deploy in containers and cloud environments. The .env.example file documents all available variables.
# LLM Provider (OpenAI-compatible)
GOAGENT_BASE_URL=https://api.openai.com/v1
GOAGENT_MODEL=gpt-4o
GOAGENT_API_KEY=sk-...
OPENAI_API_KEY=sk-... # Fallback if GOAGENT_API_KEY is not set
# Optional: SerpAPI for higher-quality web search results
SERPAPI_KEY=...
# Optional: Telegram bot adapter
GOAGENT_TELEGRAM_TOKEN=...
# Optional: Apple iMessage adapter (macOS only)
GOAGENT_IMESSAGE_HANDLE=+1234567890
# Optional: Custom agents directory
GOAGENT_AGENTS_DIR=./agents
To use a local Ollama instance with CUDA on Linux, set GOAGENT_BASE_URL to http://localhost:11434/v1, leave GOAGENT_API_KEY empty, and set GOAGENT_MODEL to the name of the model you have pulled with "ollama pull". Ollama automatically uses CUDA if the NVIDIA drivers and CUDA toolkit are installed.
To use a local Ollama instance on Apple Silicon with MLX (Metal Performance Shaders), the setup is identical: set GOAGENT_BASE_URL to http://localhost:11434/v1 and GOAGENT_MODEL to a model that Ollama supports on Apple Silicon. Ollama automatically uses the Metal backend on macOS.
To use Ollama with Vulkan on Linux (for AMD or Intel GPUs), install Ollama with Vulkan support and set the same environment variables. Ollama handles the GPU backend selection transparently.
The Makefile provides common build and run targets:
# Run with default settings (uses OPENAI_API_KEY from environment)
make run
# Run with a local Ollama instance
make run-ollama
# Build a Docker image
make docker-build
# Run in Docker with environment variables from .env
make docker-run
The Dockerfile uses a multi-stage build to produce a minimal production image:
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o goagent .
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /app
COPY --from=builder /app/goagent .
COPY agents/ ./agents/
COPY skills/ ./skills/
CMD ["./goagent"]
The docker-compose.yml file orchestrates the GoAgent container with persistent volumes for the data and workspace directories:
version: "3.9"
services:
goagent:
build: .
environment:
- GOAGENT_BASE_URL=${GOAGENT_BASE_URL:-https://api.openai.com/v1}
- GOAGENT_MODEL=${GOAGENT_MODEL:-gpt-4o}
- GOAGENT_API_KEY=${GOAGENT_API_KEY}
- GOAGENT_TELEGRAM_TOKEN=${GOAGENT_TELEGRAM_TOKEN}
- SERPAPI_KEY=${SERPAPI_KEY}
volumes:
- ./data:/app/data # Persistent learning and memory data
- ./workspace:/app/workspace # Agent output files
stdin_open: true
tty: true
PART 18: TESTING AGENTIC SYSTEMS
Testing agentic systems is harder than testing conventional software because the behavior depends on the LLM, which is non-deterministic. GoAgent's interface-based design makes testing tractable by allowing you to replace the LLM provider with a mock.
A mock provider can return pre-scripted responses for specific inputs, allowing you to test the agent's behavior in a controlled and reproducible way. You can test that the agent correctly handles tool call responses, that it terminates when the LLM returns a final answer, that it respects the maximum iteration limit, and that it correctly persists and loads memory.
// MockProvider is a test double for the LLM provider.
// It returns pre-scripted responses in order.
type MockProvider struct {
responses []*llm.CompletionResponse
index int
mu sync.Mutex
}
func (m *MockProvider) ModelName() string { return "mock-model" }
func (m *MockProvider) Complete(
_ context.Context,
_ llm.CompletionRequest,
) (*llm.CompletionResponse, error) {
m.mu.Lock()
defer m.mu.Unlock()
if m.index >= len(m.responses) {
return &llm.CompletionResponse{
Content: "No more scripted responses",
StopReason: "stop",
}, nil
}
resp := m.responses[m.index]
m.index++
return resp, nil
}
// TestAgentToolCall verifies that the agent correctly executes a tool call
// and feeds the result back to the LLM.
func TestAgentToolCall(t *testing.T) {
// Script the LLM to first request a tool call, then give a final answer.
provider := &MockProvider{
responses: []*llm.CompletionResponse{
{
// First response: request the calculator tool.
StopReason: "tool_calls",
ToolCalls: []llm.ToolCall{
{
ID: "call_001",
Name: "calculator",
Arguments: `{"expression": "6 * 7"}`,
},
},
},
{
// Second response: give the final answer.
Content: "6 times 7 is 42.",
StopReason: "stop",
},
},
}
registry := tools.NewRegistry()
registry.Register(&tools.CalculatorTool{})
cfg := agent.DefaultConfig()
cfg.Verbose = false
cfg.EnableLearning = false
cfg.EnableRAG = false
a := agent.New(cfg, provider, registry, nil, nil, nil, nil)
answer, trace, err := a.Run(context.Background(), "What is 6 times 7?")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if answer != "6 times 7 is 42." {
t.Errorf("unexpected answer: %q", answer)
}
if len(trace) != 3 { // tool_call + observation + final_answer
t.Errorf("unexpected trace length: %d", len(trace))
}
}
You can also test the tools in isolation, since each tool is a pure function that takes a JSON string and returns a string. Testing the calculator tool is straightforward:
func TestCalculatorTool(t *testing.T) {
calc := &tools.CalculatorTool{}
tests := []struct {
args string
expected string
wantErr bool
}{
{`{"expression": "1337 * 42"}`, "56154", false},
{`{"expression": "sqrt(144)"}`, "12", false},
{`{"expression": "2^10"}`, "1024", false},
{`{"expression": ""}`, "", true},
{`{"expression": "invalid"}`, "", true},
}
for _, tt := range tests {
result, err := calc.Execute(context.Background(), tt.args)
if (err != nil) != tt.wantErr {
t.Errorf("args=%q: got err=%v, wantErr=%v", tt.args, err, tt.wantErr)
}
if !tt.wantErr && result != tt.expected {
t.Errorf("args=%q: got %q, want %q", tt.args, result, tt.expected)
}
}
}
Testing the learning system requires verifying that episodes are correctly stored and retrieved, that the TF-IDF scoring returns the most relevant chunks, and that the user profile correctly accumulates facts and preferences. These tests are straightforward because the learning stores are pure data structures with deterministic behavior.
CONCLUSION
GoAgent demonstrates that building a production-grade agentic AI system in Go is not only feasible but natural. Go's concurrency primitives make parallel tool execution trivial. Go's interfaces make the system extensible at every layer. Go's explicit error handling forces careful thinking about failure modes. And Go's minimal dependency philosophy keeps the system auditable and fast to build.
The key insights from this tutorial are worth summarizing. The ReACT loop is the algorithmic core of every agentic system: reason, act, observe, repeat. The LLM provider abstraction is the most important design decision: define a clean interface and never let the agent code depend on a concrete provider. Tools are the agent's hands: define a simple interface, implement a registry, and let the LLM decide which tools to use. Memory management is non-trivial: a sliding window is simple and sufficient for most use cases, but you need to think carefully about what to keep and what to discard. Learning across sessions is what separates a truly useful agent from a sophisticated chatbot: episodic memory, user profiles, RAG, and reflection work together to make the agent progressively more effective. The adapter system is what makes an agent accessible: a clean interface and a message bus allow any communication channel to be added without touching the core logic.
Agents like Hermes and OpenClaw are not magic. They are well-configured instances of a general-purpose framework, given specific personalities, tool sets, and behavioral guidelines through YAML files and Markdown skill definitions. The framework does the heavy lifting; the configuration gives each agent its character.
The most important thing to understand about agentic AI systems is that they are software systems first and AI systems second. The same principles that make good software -- clean interfaces, separation of concerns, explicit error handling, testability, and minimal dependencies -- also make good agentic systems. The LLM is a powerful component, but it is just one component among many. The quality of the system as a whole depends on the quality of the scaffolding around it.
Go is an excellent language for building that scaffolding. We hope this tutorial has given you the knowledge and the confidence to build your own.
No comments:
Post a Comment