8 min read·Updated 2026-04-13

Architecture

Runtime internals: one JVM process, primitives, turn pipeline, tool loop, extension points. Hive is optional.

Audience and mental model

This page is for developers who will read or change the runtime code. It describes how the pieces fit together internally. If you only operate the runtime, the User Guide concept pages are a better starting point.

The mental model: the runtime is a single long-lived JVM process that owns a workspace volume, exposes a dashboard and one or more channels, and runs a tool loop per turn. All extensibility — skills, plugins, MCP servers — plugs into the same loop.

Bot and Hive

Two components, not a cluster

The product is Bot (the runtime) plus an optional Hive (fleet orchestrator). Bot stands alone; Hive is worth adding only when you run more than one Bot and need coordinated approvals or shared inspection.

GolemCore Bot

One JVM process, one dashboard, one workspace volume. Runs agent sessions, tools, skills, memory, traces, channels, and delayed actions.

Standalone

GolemCore Hive

Separate process that tracks many Bots and provides approvals, lifecycle signals, and read-only cross-runtime inspection. Optional.

Optional

Runtime primitives

Everything inside the runtime is one of these primitives. The rest of the code is plumbing between them.

Session

A persistent conversation with its own tool-call history and state. The unit of work the user interacts with.

Conversation

Turn

One request/response cycle in a session: input, tier selection, model resolution, tool loop, response, persistence.

Cycle

Skill

Sticky overlay on a session: instructions plus an optional MCP server and variables. Loaded from workspace/skills/.

Behavior

Plugin

Capability pack contributed at startup — tools, channels, voice, RAG backends. Loaded from JARs via ServiceLoader into isolated child Spring contexts; exposes SPIs like ToolProvider, RagProvider, SttProvider, TtsProvider, ChannelPort.

Capability

Memory

Four-layer store (Working, Episodic, Semantic, Procedural) with an extract/normalize/append/promote pipeline and progressive disclosure at retrieval time.

State

Model router

Tier-to-model mapping stored in preferences/model-router.json. Resolves abstract tiers to concrete provider+model pairs.

Routing

Channel

Input sources — dashboard chat, Telegram, webhooks, Hive commands. Each channel can start a turn and emits a response event stream.

Input

Trace

Replayable execution snapshots captured per turn and embedded in the session payload on disk. Bounded by TracingConfig budgets (sessionTraceBudgetMb, maxSnapshotSizeKb).

Observability
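To make the Model router primitive concrete, here is a minimal sketch of tier selection followed by tier-to-model resolution. Every name in it (TierRouterSketch, ModelRef, the tier names "fast" and "deep", the provider and model strings) is an assumption made for illustration. The actual router classes and the schema of preferences/model-router.json are not shown on this page.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: the real router types and the JSON schema of
// preferences/model-router.json are not documented on this page.
final class TierRouterSketch {

    /** A concrete provider+model pair, the output of resolution. */
    record ModelRef(String provider, String model) {}

    private final Map<String, ModelRef> tiers;
    private final String defaultTier;

    TierRouterSketch(Map<String, ModelRef> tiers, String defaultTier) {
        if (!tiers.containsKey(defaultTier)) {
            throw new IllegalArgumentException("default tier is not mapped: " + defaultTier);
        }
        this.tiers = Map.copyOf(tiers);
        this.defaultTier = defaultTier;
    }

    /** Tier select: an active skill's tier wins, otherwise the session default applies. */
    String selectTier(Optional<String> skillTier) {
        return skillTier.orElse(defaultTier);
    }

    /** Resolve: map an abstract tier to a provider+model pair, failing loudly if unmapped. */
    ModelRef resolve(String tier) {
        ModelRef ref = tiers.get(tier);
        if (ref == null) {
            throw new IllegalStateException("no model configured for tier: " + tier);
        }
        return ref;
    }

    public static void main(String[] args) {
        TierRouterSketch router = new TierRouterSketch(
                Map.of("fast", new ModelRef("provider-a", "small-model"),
                       "deep", new ModelRef("provider-b", "large-model")),
                "fast");
        // A skill that asks for the "deep" tier overrides the session default.
        System.out.println(router.resolve(router.selectTier(Optional.of("deep"))));
        // With no active skill, the session default tier is used.
        System.out.println(router.resolve(router.selectTier(Optional.empty())));
    }
}
```

Keeping selection and resolution as separate steps means a skill only ever names an abstract tier, and the mapping to a provider can change without touching the skill.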

The turn pipeline

A turn follows a strict pipeline. Knowing the order helps when reading the code or debugging unexpected behavior.

Pipeline stages

1. Channel     — receives input, starts a session context
2. Tier select — active skill or session default picks a tier
3. Resolve     — model router picks a concrete provider+model
4. Prepare     — memory loaded (headers), skills merged, tools declared
5. Tool loop   — model ↔ runtime until model stops requesting tools
6. Finalize    — response emitted to the channel
7. Persist     — session written (with embedded trace), memory updates flushed

Every stage produces events on the internal event bus. The dashboard and logs are just subscribers — nothing in the pipeline is hidden from instrumentation.
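The ordering and the event-bus behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the runtime's code: the Stage enum, StageEvent record, and EventBus class are hypothetical names, and the sketch assumes one "started" and one "completed" event per stage, which the page does not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the seven-stage turn pipeline with an in-process event bus.
final class TurnPipelineSketch {

    /** Enum order is the execution order and mirrors the stage list above. */
    enum Stage { CHANNEL, TIER_SELECT, RESOLVE, PREPARE, TOOL_LOOP, FINALIZE, PERSIST }

    /** Assumed event shape: which stage, and whether it started or completed. */
    record StageEvent(Stage stage, String phase) {}

    /** Minimal synchronous bus: every subscriber sees every event, in order. */
    static final class EventBus {
        private final List<Consumer<StageEvent>> subscribers = new ArrayList<>();

        void subscribe(Consumer<StageEvent> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(StageEvent event) {
            subscribers.forEach(s -> s.accept(event));
        }
    }

    /** Runs the stages in order, publishing an event before and after each one. */
    static void runTurn(EventBus bus) {
        for (Stage stage : Stage.values()) {
            bus.publish(new StageEvent(stage, "started"));
            // The real work of the stage would run here.
            bus.publish(new StageEvent(stage, "completed"));
        }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        // A dashboard or logger is just another subscriber.
        bus.subscribe(e -> System.out.println(e.stage() + " " + e.phase()));
        runTurn(bus);
    }
}
```

Because subscribers attach to the bus rather than to individual stages, adding instrumentation in this shape does not require changing the pipeline itself.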

Inside the tool loop

The tool loop is where most of the complexity lives. Each iteration does the same thing:

Tool loop iteration

while (model returns tool_use):
    collect requested tool calls
    for each tool call:
        dispatch to tool registry (core, plugin, MCP)
        execute with timeout and sandbox
        capture result or error
    feed all results back to model
emit model's final assistant message

The tool registry is the integration point for plugins and MCP servers: both contribute tools into the same registry, and the dispatcher does not care whether the tool is native Java, a plugin-provided adapter, or a JSON-RPC call to an MCP subprocess.
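The loop and the single shared registry can be sketched in Java as below. All types here (Model, Tool, ToolRegistry, ToolCall, ToolResult, ModelReply) are hypothetical stand-ins rather than the runtime's real interfaces. The sketch implements the timeout with CompletableFuture and leaves sandboxing out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the tool loop; names and signatures are assumptions.
final class ToolLoopSketch {

    record ToolCall(String name, String input) {}
    record ToolResult(String name, String output, boolean error) {}

    /** A model reply either requests tools or carries the final assistant text. */
    record ModelReply(List<ToolCall> toolCalls, String text) {
        boolean requestsTools() { return !toolCalls.isEmpty(); }
    }

    interface Model { ModelReply next(List<ToolResult> results); }

    /** Core, plugin, and MCP-backed tools would all implement this one interface. */
    interface Tool { String execute(String input) throws Exception; }

    static final class ToolRegistry {
        private final Map<String, Tool> tools = new HashMap<>();

        void register(String name, Tool tool) { tools.put(name, tool); }

        /** Executes one call with a timeout; failures become error results, not exceptions. */
        ToolResult dispatch(ToolCall call, long timeoutMs) {
            Tool tool = tools.get(call.name());
            if (tool == null) {
                return new ToolResult(call.name(), "unknown tool", true);
            }
            CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
                try {
                    return tool.execute(call.input());
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            try {
                return new ToolResult(call.name(), future.get(timeoutMs, TimeUnit.MILLISECONDS), false);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return new ToolResult(call.name(), "interrupted", true);
            } catch (ExecutionException | TimeoutException e) {
                // Note: on timeout the underlying task is not stopped in this sketch.
                return new ToolResult(call.name(), e.toString(), true);
            }
        }
    }

    /** Keeps feeding tool results back until the model stops requesting tools. */
    static String runLoop(Model model, ToolRegistry registry, long timeoutMs) {
        ModelReply reply = model.next(List.of());
        while (reply.requestsTools()) {
            List<ToolResult> results = new ArrayList<>();
            for (ToolCall call : reply.toolCalls()) {
                results.add(registry.dispatch(call, timeoutMs));
            }
            reply = model.next(results);
        }
        return reply.text();
    }

    public static void main(String[] args) {
        ToolRegistry registry = new ToolRegistry();
        registry.register("echo", input -> input.toUpperCase());
        // Scripted model: requests one tool call, then answers using its result.
        Model model = results -> results.isEmpty()
                ? new ModelReply(List.of(new ToolCall("echo", "hi")), "")
                : new ModelReply(List.of(), "tool said: " + results.get(0).output());
        System.out.println(runLoop(model, registry, 1000));
    }
}
```

Returning errors as ToolResult values rather than throwing keeps the loop going, so the model gets a chance to react to a failed or unknown tool on its next turn.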

Extension points

There are three places to add functionality to the runtime, in increasing order of coupling to the core code:

Skills

No code change. Write SKILL.md files under workspace/skills/. Good for per-deployment behavior tweaks, MCP integrations, and cross-skill composition.

Zero-code

Plugins

Compile a plugin against the plugin SPI and drop the JAR in the plugins directory. The PluginManager loads it into an isolated child Spring context at startup and registers its beans into the host.

In-process

Core changes

Modify Bot itself — new primitives, new pipeline stages, new event kinds. Requires a fork or an upstream PR. Reserve for capability gaps that cannot be closed any other way.

Core

Most real work is a skill. Some of it becomes a plugin. Very little of it becomes a core change.
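For the plugin path, the JAR discovery step can be sketched with the standard java.util.ServiceLoader. This is only an approximation under stated assumptions: the ToolProvider interface below is a hypothetical stand-in for the real SPI, whose signature is not shown on this page, and the child Spring context that the PluginManager creates is omitted.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical sketch of plugin discovery; not the PluginManager's actual code.
final class PluginDiscoverySketch {

    /** Stand-in for the SPI named in the text; the real method set is an assumption. */
    public interface ToolProvider {
        List<String> toolNames();
    }

    /** Finds ToolProvider implementations declared in the JARs of a plugins directory. */
    static List<ToolProvider> discover(Path pluginsDir) throws IOException {
        List<URL> jarUrls = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(pluginsDir, "*.jar")) {
            for (Path jar : jars) {
                jarUrls.add(jar.toUri().toURL());
            }
        }
        // The loader is left open on purpose: providers need it for as long as they live.
        ClassLoader loader = new URLClassLoader(
                jarUrls.toArray(new URL[0]),
                PluginDiscoverySketch.class.getClassLoader());
        List<ToolProvider> providers = new ArrayList<>();
        // ServiceLoader reads META-INF/services entries (or module declarations) from the JARs.
        for (ToolProvider provider : ServiceLoader.load(ToolProvider.class, loader)) {
            providers.add(provider);
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Path.of(args.length > 0 ? args[0] : "plugins");
        if (!Files.isDirectory(dir)) {
            System.out.println("no plugins directory at " + dir.toAbsolutePath());
            return;
        }
        for (ToolProvider provider : discover(dir)) {
            System.out.println(provider.getClass().getName() + " -> " + provider.toolNames());
        }
    }
}
```

A plugin JAR would opt in by listing its implementation class in a META-INF/services file named after the fully qualified SPI interface, which is the standard ServiceLoader convention.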

When Hive is worth it

Hive is a separate process that subscribes to lifecycle events from one or more Bots and adds cross-runtime coordination. The Bot does not depend on Hive at runtime; the Hive connection is configured in preferences/hive.json and can be removed without restarting the Bot.

One Bot, one operator, one dashboard: do not add Hive.
Multiple Bots that need approval gates or cross-runtime search: Hive earns its keep.
You want Hive's inspection views but do not need approvals: Hive runs in read-only mode.

What to do next

Model Routing

The router is where tier-to-model decisions live. Understand it before touching the resolve stage.

Concept

Glossary

Canonical names for the runtime primitives referenced throughout this page.

Reference
