- Durable session graph: conversation records, tool events, mode events, and plugin nodes live in one `SessionGraph` per session. `SessionReadView` and `ChronologicalProjection` read it; nothing else owns persistence.
- Per-turn atomic commit: `RuntimeCommit` writes the graph delta, checkpoint blobs, usage deltas, and head revision in one SQLite transaction with an optimistic CAS on `expected_head_revision`. A partial turn means no commit.
- Typed plugin capabilities: tools see a named `ToolContext` surface (`tasks()`, `sessions()`, `direct_completion`, `tool_catalog`, snapshot reads), not a general host handle. There is no `ToolContext::host()` escape hatch.
- Sandboxed code execution: RLM mode runs model-emitted Lashlang programs in a VM with no filesystem, process, or network surface. Every effect crosses `ToolHost`. Use it when the model should compose multiple tool calls per turn instead of one.
- Subagent capability boundaries: `Capability::resolve(parent_policy)` returns a constrained `SessionSpec` for the child. Interactive-only tools are stripped from every subagent surface regardless of capability.
- Semantic event stream: identity-bearing `TurnActivity` items: assistant prose deltas, reasoning deltas, tool start/complete pairs with correlation ids, code-block start/complete, terminal `SubmittedValue`/`ToolValue`, per-call and rolling usage. Apps fold this directly into product state.
- Tracing as a first-class sink: every provider call across every session emits `TraceRecord`s through `TraceSink` implementations. JSONL by default, OTel optional.
- Snapshot and restore seams: plugins, tool state, and the Lashlang VM each persist through versioned snapshot writers, so a parked session resumes intact across process restarts.
What sits outside this crate, on purpose: tenancy boundaries, retention and lifecycle for long-lived artifacts, discovery of agent-authored procedures, a shared coherent image across many sessions. lash is the kernel and runtime; the platform-shaped pieces belong to whatever embeds it.
lash vs lash-core boundary
```rust
use std::sync::Arc;

use lash::{tools::*, LashCore, ModeId, ModePreset, PluginStack, TurnEvent, TurnInput};

let store_factory = Arc::new(lash_sqlite_store::SqliteSessionStoreFactory::new("lash-sessions"));

let core = LashCore::builder()
    .install_mode(ModePreset::standard())
    .install_mode(ModePreset::rlm())
    .default_mode(ModeId::rlm())
    .provider(provider)
    .model("anthropic/claude-sonnet-4.6", None)
    .max_context_tokens(200_000)
    .plugins(PluginStack::runtime())
    .tools(Arc::new(AppTools) as Arc<dyn ToolProvider>)
    .store_factory(store_factory)
    .build()?;

let session = core.session("chat-123").rlm().open().await?;
let result = session.turn(TurnInput::text("Use the app tools.")).run().await?;

let assistant_text: String = result
    .activities
    .iter()
    .filter_map(|activity| match &activity.event {
        TurnEvent::AssistantProseDelta { text } => Some(text.as_str()),
        _ => None,
    })
    .collect();
println!("{assistant_text}");
```
LashCore
- Cloneable shared configuration: provider, model, installed modes, tool providers, plugin factories, store factory, attachment store, and tracing. Runtime internals such as residency and termination policy live behind `.advanced()`.
ModePreset
- Installs execution modes. Use `ModePreset::standard()` for native provider tool calls, `ModePreset::rlm()` for Lashlang-driven RLM turns, or install both and choose the default with `default_mode`.
SessionSpec
- Reusable public configuration overlay for provider, model/variant, execution mode, max context tokens, max turns, and prompt layer. Root cores use `SessionSpec::new()`; child sessions and subagents usually use `SessionSpec::inherit()`.
PluginStack
- Ordered plugin factory list. `LashCore::standard()` and `LashCore::rlm()` include `PluginStack::runtime()`; a raw `LashCore::builder()` stays explicit and needs modes and plugins installed by the host.
LashSession
- One app conversation or task. Sessions wrap a parked/resumable runtime, can use a per-session store, and expose `turn(TurnInput)`, `run(TurnInput)`, `read_view()`, and lower-level control groups through `control()`.
TurnBuilder
- Per-turn configuration: cancellation, mode options, typed plugin input, and RLM-projected bindings. Call `.stream(&sink)` for live events or `.run()` for a collected, ordered activity log.
Root defaults
```rust
use std::sync::Arc;

use lash::{plugins::PluginFactory, LashCore, PluginStack, SessionSpec};

let root_spec = SessionSpec::new()
    .provider(provider)
    .model("gpt-5.4", None)
    .max_context_tokens(200_000);

let core = LashCore::rlm()
    .session_spec(root_spec)
    .configure_plugins(|plugins| {
        plugins.push(Arc::new(AppPluginFactory) as Arc<dyn PluginFactory>);
    })
    .build()?;
```
Explicit stacks
```rust
// Continues from the root defaults above: Arc, PluginFactory, PluginStack,
// and root_spec are already in scope.
use lash::{ModeId, ModePreset};

let plugins = PluginStack::runtime().configure(|plugins| {
    plugins.replace(Arc::new(CustomBudgetPlugin) as Arc<dyn PluginFactory>);
    plugins.push(Arc::new(AppPluginFactory) as Arc<dyn PluginFactory>);
});

let core = LashCore::builder()
    .install_mode(ModePreset::rlm())
    .default_mode(ModeId::rlm())
    .session_spec(root_spec)
    .plugins(plugins)
    .build()?;
```
`.plugin(...)` appends one factory to the current stack, `.plugins(...)` replaces the full stack, and `.configure_plugins(...)` mutates the current stack in place. This gives hosts a default runtime set while still allowing precise removal, replacement, or insertion.
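The ordering semantics above can be sketched with a stand-in stack model. This is only an illustration: `StackModel` and its plain-string factories are hypothetical stand-ins for the real builder and its `Arc<dyn PluginFactory>` entries, but the append/replace/mutate behavior mirrors what the text describes.

```rust
// Stand-in model of the builder's stack semantics: `plugin` appends one
// factory, `plugins` replaces the full stack, `configure_plugins` mutates
// the current stack in place. Types here are illustrative, not lash's.
#[derive(Default)]
struct StackModel {
    factories: Vec<&'static str>,
}

impl StackModel {
    fn plugin(mut self, f: &'static str) -> Self {
        self.factories.push(f); // append one factory
        self
    }
    fn plugins(mut self, fs: Vec<&'static str>) -> Self {
        self.factories = fs; // replace the full stack
        self
    }
    fn configure_plugins(mut self, f: impl FnOnce(&mut Vec<&'static str>)) -> Self {
        f(&mut self.factories); // surgical in-place edit
        self
    }
}

fn demo() -> Vec<&'static str> {
    StackModel::default()
        .plugins(vec!["runtime-a", "runtime-b"]) // start from a default set
        .plugin("app")                           // append the app factory
        .configure_plugins(|p| {
            p.retain(|name| *name != "runtime-b"); // precise removal
        })
        .factories
}
```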
Collected result
```rust
let collected = session
    .turn(TurnInput::text("Summarize this task."))
    .run()
    .await?;

let visible_answer: String = collected
    .activities
    .iter()
    .filter_map(|activity| match &activity.event {
        TurnEvent::AssistantProseDelta { text } => Some(text.as_str()),
        _ => None,
    })
    .collect();

let parent_usage = collected.result.usage;       // parent's own LLM tokens
let children = &collected.result.children_usage; // per-(source, model) child entries
let total = collected.result.total_usage();      // parent + children
let outcome = collected.result.outcome;
```
Live stream
```rust
let ui_sink = Arc::new(AppEvents::new(tx));

let turn = session
    .turn(TurnInput::text(user_text))
    .stream(ui_sink.as_ref())
    .await?;

persist(app_turn_state.assistant_text(), turn.total_usage())?;
```
Apps own their projection policy. Fold `TurnActivity` directly when persisting assistant prose, terminal values, tool summaries, or timelines. Treat `TurnActivity.correlation_id` as the stable identity for multi-phase UI rows: start events insert rows, completion events update those same rows.
TurnActivity sink
```rust
use async_trait::async_trait;
use lash::{TurnActivity, TurnActivitySink, TurnEvent};

struct AppEvents {
    tx: AppUiTx,
    turn_state: std::sync::Mutex<TurnUiState>,
}

#[derive(Default)]
struct TurnUiState {
    reasoning: Option<UiRowId>,
    tools: std::collections::HashMap<String, UiRowId>,
    code: Option<UiRowId>,
}

#[async_trait]
impl TurnActivitySink for AppEvents {
    async fn emit(&self, activity: TurnActivity) {
        let correlation_id = activity.correlation_id.0.clone();
        match activity.event {
            TurnEvent::AssistantProseDelta { text } => {
                append_live_text(text).await;
            }
            TurnEvent::ReasoningDelta { text } => {
                let row = self.turn_state.lock().unwrap().reasoning.clone();
                let row = upsert_reasoning_row(row, text).await;
                self.turn_state.lock().unwrap().reasoning = Some(row);
            }
            TurnEvent::ToolCallStarted { name, args, .. } => {
                let row = insert_tool_row(name, args).await;
                self.turn_state
                    .lock()
                    .unwrap()
                    .tools
                    .insert(correlation_id, row);
            }
            TurnEvent::ToolCallCompleted { name, result, success, .. } => {
                let row = self.turn_state.lock().unwrap().tools.remove(&correlation_id);
                update_or_insert_tool_row(row, name, result, success).await;
            }
            TurnEvent::CodeBlockStarted { language, code } => {
                let row = insert_code_row(language, code).await;
                self.turn_state.lock().unwrap().code = Some(row);
            }
            TurnEvent::CodeBlockCompleted { language, output, error, success, .. } => {
                let row = self.turn_state.lock().unwrap().code.take();
                update_or_insert_code_row(row, language, output, error, success).await;
            }
            TurnEvent::SubmittedValue { value } => {
                append_live_text(render_terminal_value(&value)).await;
            }
            TurnEvent::ToolValue { tool_name, value } => {
                append_live_text(render_terminal_value(&value)).await;
                record_terminal_tool(tool_name).await;
            }
            TurnEvent::Usage { usage, cumulative, .. } => {
                update_usage(usage, cumulative).await;
            }
            TurnEvent::ChildUsage { source, usage, cumulative, .. } => {
                update_child_usage(source, usage, cumulative).await;
            }
            // Errors and any other variants are ignored by this sink.
            _ => {}
        }
    }
}

fn render_terminal_value(value: &serde_json::Value) -> String {
    match value {
        serde_json::Value::Null => String::new(),
        serde_json::Value::String(text) => text.clone(),
        other => serde_json::to_string_pretty(other).unwrap_or_else(|_| other.to_string()),
    }
}
```
Use event identity, not duplicate detection. `ToolCallStarted` and `ToolCallCompleted` describe the same logical row when their `correlation_id` matches; code-block events work the same way. `TurnEvent::SubmittedValue` and `TurnEvent::ToolValue` mean "a new terminal value was authored by a control path." They are not emitted for a normal assistant prose finish, because that prose already streamed as `AssistantProseDelta`.
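The insert-on-start, update-on-complete pattern can be shown as a self-contained fold. The event and row types below are simplified stand-ins (the real events carry names, args, results, and more), but the identity rule is the same: both halves of a pair share one correlation id, so the fold never needs duplicate detection.

```rust
use std::collections::HashMap;

// Stand-in events: start and completion share a correlation id.
enum Ev {
    ToolStarted { correlation_id: String, name: String },
    ToolCompleted { correlation_id: String, success: bool },
}

#[derive(Debug, PartialEq)]
struct Row {
    name: String,
    done: bool,
    success: bool,
}

// Insert a row on start; update that same row on completion.
fn fold(events: Vec<Ev>) -> HashMap<String, Row> {
    let mut rows = HashMap::new();
    for ev in events {
        match ev {
            Ev::ToolStarted { correlation_id, name } => {
                rows.insert(correlation_id, Row { name, done: false, success: false });
            }
            Ev::ToolCompleted { correlation_id, success } => {
                if let Some(row) = rows.get_mut(&correlation_id) {
                    row.done = true;
                    row.success = success;
                }
            }
        }
    }
    rows
}
```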
TraceSink
- Every provider call across every session in the runtime. Right for billing, audit, and offline analysis; heavier than necessary for plain totals.
TurnEvent::Usage / TurnEvent::ChildUsage
- Live during a turn, one event per LLM iteration. `Usage` is the parent's own model call; `ChildUsage` carries `session_id` and `source` so a UI can group child traffic (subagent, compaction, observer). Right for live counters.
TurnResult.usage / TurnResult.children_usage
- Per-turn snapshot at completion. `usage` is parent-only; `children_usage` is a per-(source, model) breakdown for any child sessions that ran during the turn. `TurnResult::total_usage()` sums both.
session.usage_report() → SessionUsageReport
- Aggregate across the whole session, broken down by source × model. Right for dashboards and "session so far."
The full re-export and well-known source label constants live in `lash::usage`.
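The accounting rule from the list above, sketched with stand-in types: `usage` is parent-only and `total_usage()` adds every per-(source, model) child entry on top. The `Usage` field names here (`input_tokens`, `output_tokens`) are assumptions for illustration; only the arithmetic is the point.

```rust
// Stand-in usage accounting: total = parent + sum of child entries.
// Field names are illustrative, not the crate's real definitions.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
}

struct TurnResultModel {
    usage: Usage,                                   // parent's own model calls
    children_usage: Vec<((String, String), Usage)>, // keyed by (source, model)
}

impl TurnResultModel {
    fn total_usage(&self) -> Usage {
        self.children_usage.iter().fold(self.usage, |acc, (_, u)| Usage {
            input_tokens: acc.input_tokens + u.input_tokens,
            output_tokens: acc.output_tokens + u.output_tokens,
        })
    }
}
```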
Per-turn RLM options
```rust
let submitted = session
    .turn(TurnInput::text("Move on the board."))
    .require_submit()?
    .stream(&sink)
    .await?;

let prose_or_submit = session
    .turn(TurnInput::text("Answer directly if no code is needed."))
    .allow_prose_or_submit()?
    .run()
    .await?;
```
Outcome shape
```rust
match result.outcome {
    lash::TurnOutcome::Finished(lash::TurnFinish::SubmittedValue { value }) => {
        // Same value already arrived as TurnEvent::SubmittedValue.
        persist_typed_value(value)?;
    }
    lash::TurnOutcome::Finished(lash::TurnFinish::AssistantMessage { text }) => {
        persist_text(text)?;
    }
    other => handle_other_outcome(other)?,
}
```
Template layout is separate from slot content
```rust
use lash::{
    PromptBuiltin, PromptContribution, PromptSlot, PromptTemplate,
    PromptTemplateEntry, PromptTemplateSection, TurnInput,
};

let template = PromptTemplate::new(vec![
    PromptTemplateSection::untitled(vec![
        PromptTemplateEntry::builtin(PromptBuiltin::MainAgentIntro),
        PromptTemplateEntry::slot(PromptSlot::Intro),
    ]),
    PromptTemplateSection::titled(
        "Guidance",
        vec![PromptTemplateEntry::slot(PromptSlot::Guidance)],
    ),
]);

let core = lash::LashCore::standard()
    .provider(provider)
    .model("gpt-5.4", None)
    .max_context_tokens(200_000)
    .prompt_template(template)
    .prompt_contribution(PromptContribution::guidance(
        "App",
        "Answer as the host application assistant.",
    ))
    .build()?;

let session = core
    .session("customer-42")
    .replace_prompt_slot(
        PromptSlot::Guidance,
        [PromptContribution::guidance(
            "Tenant",
            "Use the tenant's support policy.",
        )],
    )
    .open()
    .await?;

let result = session
    .turn(TurnInput::text("Draft the response."))
    .prompt_contribution(PromptContribution::guidance(
        "Turn",
        "Keep this reply under 120 words.",
    ))
    .run()
    .await?;
```
App state
Own chat tables, account ids, frontend board state, request auth, and transport formats. The example app stores chat messages in SQLite and streams newline-delimited JSON to the browser.
Runtime state
Pass an explicit store factory such as `lash_sqlite_store::SqliteSessionStoreFactory::new(...)` to `LashCoreBuilder::store_factory`, or pass a concrete store to `SessionBuilder::store`, when sessions need durable runtime state across process restarts.
```rust
use std::sync::Arc;

use lash::{plugins::PluginFactory, LashCore, SessionSpec};
use lash_subagents::{default_registry, SubagentsPluginFactory};

let registry = Arc::new(default_registry(&tier_models));
let host = Arc::new(AppSubagentHost::new(child_store_factory));

let subagents = SubagentsPluginFactory::new(registry, host)
    .with_session_spec(SessionSpec::inherit().max_turns(8));

let core = LashCore::rlm()
    .provider(provider)
    .model(model, None)
    .max_context_tokens(200_000)
    .plugin(Arc::new(subagents) as Arc<dyn PluginFactory>)
    .build()?;
```
Capability implementations return `SessionSpec` overlays. `StaticCapability` is for exact child authority, while `TierCapability` implements the built-in explore and peer model/mode selection. Tool authors should not construct `SessionPolicy` for child configuration; it remains the resolved runtime artifact.
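The narrowing pattern behind capability resolution can be sketched with stand-in types: a capability may only tighten the parent's limits, and interactive-only tools never survive into a child surface. Everything below (`PolicyModel`, `TierCapabilityModel`, the `interactive_` prefix convention) is a hypothetical illustration, not the crate's real definitions.

```rust
// Stand-in capability resolution: the child overlay can only tighten the
// parent's budget, and interactive-only tools are always stripped.
#[derive(Clone)]
struct PolicyModel {
    max_turns: u32,
    tools: Vec<&'static str>,
}

struct TierCapabilityModel {
    max_turns: u32,
}

impl TierCapabilityModel {
    fn resolve(&self, parent: &PolicyModel) -> PolicyModel {
        PolicyModel {
            // Never exceed the parent's turn budget.
            max_turns: self.max_turns.min(parent.max_turns),
            // Interactive-only tools are stripped regardless of capability.
            tools: parent
                .tools
                .iter()
                .copied()
                .filter(|t| !t.starts_with("interactive_"))
                .collect(),
        }
    }
}
```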
```rust
use std::collections::BTreeMap;

use lash::LashCore;
use lash_plugin_mcp::{McpPluginFactory, McpServerConfig};

let mut servers = BTreeMap::new();
servers.insert(
    "docs".to_string(),
    McpServerConfig::stdio("uvx", vec!["mcp-server-docs".into()]),
);
servers.insert(
    "web".to_string(),
    McpServerConfig::streamable_http("https://mcp.example.com/rpc"),
);

let mcp = McpPluginFactory::new(servers).await?;

let core = LashCore::rlm()
    .provider(provider)
    .model(model, None)
    .max_context_tokens(200_000)
    .plugin(std::sync::Arc::new(mcp))
    .build()?;
```
Tools are surfaced under `mcp__<server>__<tool>` names with their original input and output schemas preserved. The factory's `attach_server` / `detach_server` methods let hosts add or remove servers at runtime without rebuilding the core.
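The naming scheme is mechanical, so a small helper pair captures its shape. These two functions are illustrative, not part of the crate; note that `split_mcp_tool_name` splits at the first `__` after the prefix, so it assumes server names themselves contain no double underscore.

```rust
// Build and split the mcp__<server>__<tool> surface names described above.
fn mcp_tool_name(server: &str, tool: &str) -> String {
    format!("mcp__{server}__{tool}")
}

fn split_mcp_tool_name(name: &str) -> Option<(&str, &str)> {
    // Strip the mcp__ prefix, then split server from tool at the first "__".
    let rest = name.strip_prefix("mcp__")?;
    rest.split_once("__")
}
```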
```rust
let core = LashCore::rlm()
    .provider(provider)
    .model("anthropic/claude-sonnet-4.6", None)
    .max_context_tokens(200_000)
    .store_factory(store_factory)
    .advanced()
    .residency(Residency::ActivePathOnly)
    .build()?;
```
Turn streaming is semantic by default: `TurnBuilder::stream` emits `TurnActivity` items and resolves with a rich `TurnResult`. Raw runtime telemetry belongs in tracing and lower-level runtime debugging, not the normal lash API surface.
```sh
OPENROUTER_API_KEY=... cargo run -p agent-service
# then open http://127.0.0.1:3000
```
Source: examples/agent-service. The dedicated walkthrough is Agent Service.