Every time I tried to build a serious LLM workflow, I ended up juggling scripts, notebooks, glue services, and half a dozen “context” hacks. I wanted a language where async orchestration, validation, RAG, agents, and now structured memory are first-class—not bolted on after the fact. Lexon is my answer: an LLM-first programming language with a deterministic runtime, strong governance, and batteries included. This RC introduces the biggest addition so far, Structured Project Memory, but it sits next to peers like MCP, sessions, merge/fallback/ensemble, arbitrage, multioutput, and advanced RAG. Here’s the tour.
Run one command, watch async orchestration + merge happen, and be done before the kettle boils:
cargo run -q -p lexc-cli -- compile --run samples/01-async-parallel.lx

pub fn main() {
    set_default_model("simulated");
    let [outline, slogans] = ask_parallel([
        ask { user: "Outline the release checklist for Lexon RC.1"; temperature: 0.1; },
        ask { user: "Give me two motivating one-liners for the launch"; temperature: 0.4; }
    ]);
    let merged = ask_merge(outline, slogans, "Return two concise bullet points for kickoff");
    print(merged);
}
No shell scripts, no YAML pipelines: lexc-cli compiles
and runs the IR, and the deterministic runtime simulates the LLMs until
you provide real API keys.
Setup takes four steps:

- Toolchain: rustup override set 1.82.0 && rustup component add clippy rustfmt.
- Build: cargo build --workspace --locked.
- Providers: export OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, or declare custom [providers] blocks in lexon.toml.
- Sandbox: --workspace . gates file I/O, --allow-exec unlocks execute(), LEXON_ALLOW_HTTP=1 enables the HTTP client.

If you're tired of stitching together Python notebooks, LangChain pipelines, or orchestration DAGs just to run prompts with context and validation, Lexon is the opposite experience: a real language with LLM-first primitives, governance, and structured memory built in. Instead of spinning up extra frameworks and services for every new feature, Lexon gives you:
- One config file: everything lives in lexon.toml, so you don't have to wire half a dozen services just to get started. That lexon.toml (shipped in v1.0.0-rc.1/lexon.toml) is where you declare [system] default_provider, [providers.<name>] blocks, web_search presets, sandbox flags, and structured-memory backends. No extra bootstrap layer required.
- Core language: modules (modules/ roots + aliasing), if/while/for/match, public/private functions, typed/inferred variables, predictable truthiness, JSON-like structs, error primitives (Ok/Error/is_ok/unwrap).
- Async runtime: task.spawn/await, join_all, join_any, select_any, channels, rate limiter, retry policies, timeouts, cooperative cancellation.
- Data utilities: map, filter, reduce, range, zip, chunk, strings.join, inline expressions for dataset pipelines.
- Governance: execute() disabled by default, absolute paths gated by --workspace, budgets per provider, telemetry hooks (Prometheus, OTEL).

Hello world stays simple:
pub fn main() {
    set_default_model("simulated");
    let message = ask("Say hello from Lexon");
    print(message);
}
Execution goes through lexc-cli. Offline simulations are
the default; setting OPENAI_API_KEY,
ANTHROPIC_API_KEY, GOOGLE_API_KEY, or custom
provider blocks in lexon.toml seamlessly switches to real
models.
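For reference, a minimal lexon.toml could look like the sketch below. Only [system] default_provider and the [providers.<name>] block shape are named in this post; the individual keys inside each block are illustrative assumptions, not documented settings:

```toml
# Hedged sketch of lexon.toml; key names inside the blocks are assumptions.
[system]
default_provider = "openai"   # without API keys, the runtime simulates responses

[providers.openai]
model = "gpt-4o-mini"         # hypothetical per-provider default

[web_search]
provider = "duckduckgo"       # hypothetical web_search preset
```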
What you get out of the box:

- LLM-first primitives (ask*, task.spawn, before_action).
- Tooling: lexon.toml, VS Code extension, tree-sitter.
- Orchestration builtins: ask_parallel, ask_merge.

The orchestration surface is intentionally broad because real apps need more than a lone ask():
- Orchestration: ask, ask_parallel, ask_merge (summarize/synthesize), ask_with_fallback, ask_ensemble, model_arbitrage, model_dialogue.
- Validation: ask_safe, ask_with_validation, quality.* gates for schema/PII/confidence, configurable retries/budgets.
- Multioutput: ask_multioutput emits primary text plus multiple deterministic files (JSON/CSV/Markdown/binary stubs), with helpers get_multioutput_* and save_multioutput_file.
- Sessions: session_start/ask/history/summarize/compress/extract_key_points, TTL/GC via sessions.gc_now, context window management.
- RAG: memory_index.ingest_chunks, hybrid search (SQLite + Qdrant), rerank (LLM + cross-encoder), semantic fusion with citations, rag.optimize_window.
- Data and I/O: load_csv, save_json, Parquet via Polars/Arrow, web.search (DuckDuckGo, Brave, SerpAPI, custom endpoints), HTTP client (opt-in via LEXON_ALLOW_HTTP), string/regex helpers for parsing.

Example pipeline:
let ds = load_csv("samples/triage/tickets.csv");
let urgent = filter(ds, 'priority == "high"');
save_json(urgent, "output/high_tickets.json");
let brief = ask_safe {
    user: "Summarize the high priority tickets",
    validation: "basic",
    max_attempts: 2
};
print(brief);
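The fallback and ensemble primitives deserve a sketch too. The argument shapes below are my assumptions, inferred from the ask_merge(a, b, instruction) call and the ask block syntax shown earlier; they are not documented signatures:

```
// Hedged sketch: argument shapes for ask_with_fallback/ask_ensemble are assumed.
pub fn main() {
    set_default_model("simulated");
    // Fall back to the simulated model if the primary call errors out.
    let headline = ask_with_fallback(
        ask { user: "One headline for the RC.1 launch"; model: "openai:gpt-4o-mini"; },
        ask { user: "One headline for the RC.1 launch"; model: "simulated"; }
    );
    // Fan out the same question and let the runtime reconcile the answers.
    let consensus = ask_ensemble([
        ask { user: "Name the riskiest RC.1 subsystem"; temperature: 0.2; },
        ask { user: "Name the riskiest RC.1 subsystem"; temperature: 0.8; }
    ]);
    print(headline);
    print(consensus);
}
```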
Dual-role prompt (system + user) with guardrails:
let summary = ask {
    system: "You are Lexon's semantic memory layer. Be precise.";
    user: "Summarize the runtime guide in two bullet points.";
    model: "openai:gpt-4o-mini";
    temperature: 0.2;
    max_tokens: 256;
};
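Sessions follow the same shape. This sketch assumes handle-passing calls named after the session_start/ask/history/summarize surface; the exact call style is my assumption, not a confirmed API:

```
// Hedged sketch of the session surface; call names and style are assumptions.
let s = session_start("runtime_qna");
let first = session_ask(s, "What changed in RC.1?");
let deeper = session_ask(s, "How does that affect my lexon.toml?");
print(session_summarize(s));   // compress the history into a short recap
sessions.gc_now();             // TTL/GC pass over expired sessions
```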
Hybrid search + fusion + answer:
let hits = memory_index.hybrid_search("before_action hook", 5);
let context = rag.fuse_passages(hits, 3);
let answer = ask {
    system: "Use only the provided context.";
    user: strings.join([
        "Context:\n", context, "\n\nQuestion: How do before_action hooks enrich agents?"
    ], "")
};
print(answer);
Agents aren’t useful if you can’t govern them, cancel them, or see what they’re doing. MCP support comes built-in—you can launch stdio or WebSocket MCP servers directly from Lexon, register tools with quotas, and stream progress/cancelation signals without extra glue:
- MCP: stdio/WebSocket servers, tool quotas, cancellation (rpc.cancel), heartbeats, streaming progress.
- Agents: agent_create/run, parallel/chained flows, supervisors, budgets, deadlines, telemetry spans, on_tool_call/on_tool_error.
- Context: before_action use_context automatically pulls structured-memory bundles before any agent step.
- Config: lexon.toml drives providers, web search, sandbox toggles (LEXON_ALLOW_HTTP, LEXON_ALLOW_NET), memory paths.

Everything funnels through the same governance rails: structured memory, RAG queries, HTTP calls, MCP tools, and multioutput share telemetry, budgets, and deterministic behavior.
Start servers and hook context:
# stdio server
cargo run -q -p lexc -- --mcp-stdio
# WebSocket server with custom addr
cargo run -q -p lexc -- --mcp-ws --mcp-addr 127.0.0.1:9443

before_action use_context project="lexon_demo" topic="runtime";
let supervisor = agent_create("runtime_supervisor", """{"model":"openai:gpt-4o-mini","budget_usd":0.15}""");
let report = agent_run(supervisor, "Produce the deployment checklist for RC.1", """{"deadline_ms": 30000}""");
print(report);
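The on_tool_call/on_tool_error hooks mentioned above can be pictured like this. Only the hook names come from the feature list; the registration style and closure syntax are assumptions:

```
// Hedged sketch: handler registration style and closure syntax are assumed.
let auditor = agent_create("runtime_auditor", """{"model":"openai:gpt-4o-mini","budget_usd":0.15}""");
on_tool_call(auditor, fn(tool, args) {
    print("tool start: " + tool);   // fires alongside the telemetry span
});
on_tool_error(auditor, fn(tool, err) {
    print("tool failed: " + tool);  // budget/deadline violations land here
});
let audit = agent_run(auditor, "Audit the sandbox flags", """{"deadline_ms": 30000}""");
print(audit);
```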
Lexon’s OTEL hooks are baked into the runtime. Flip a single env var
and every scheduler hop, ask call, structured-memory write,
or MCP tool execution emits spans:
LEXON_OTEL=1 OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
cargo run -q -p lexc-cli -- compile --run samples/01-async-parallel.lx

The bundled OTLP smoke collector (cargo make otel-smoke)
surfaces spans such as lexon.scheduler.execute,
lexon.ask.request, and
lexon.memory.remember_raw, each annotated with model,
tokens, duration, and budget metadata—proof that this stack is
observability-ready.
This RC’s headline feature is a native “second brain” for each Lexon project. It doesn’t replace RAG; it sits above it, curating the knowledge humans actually care about before any retrieval call.
The data model is MemoryObjects with path, kind, raw, summary_micro/short/long, tags, metadata, relevance, pinned, timestamps, and policies. Four retrieval backends are selectable via LEXON_MEMORY_BACKEND=basic|patricia|raptor|hybrid:

- basic: heuristic scoring (relevance, pinning, topic/kind/tag matches).
- patricia: compressed trie for fast path-prefix lookups.
- raptor: RAPTOR-style clustering using tags + recency.
- hybrid (GraphRAG/MemTree): entity-token overlap + clustering.

let _ = memory_space.create("lexon_demo", """{"reset": true}""");
let obj = remember_raw(
    "lexon_demo",
    "decision",
    read_file("docs/runtime_decision.md"),
    """{"project": "runtime", "path_hint": "lexon/runtime/decisions"}"""
);
let bundle = recall_context(
    "lexon_demo",
    "runtime",
    """{"limit": 3, "include_raw": true, "freeze_clock": "2025-01-01T00:00:00Z"}"""
);
print(obj);
print(bundle);
remember_raw talks to the configured provider (OpenAI,
Anthropic, Google, Ollama, HF, custom) to infer structure, tags,
relevance, pinning suggestions, and metadata;
remember_structured ingests pre-built payloads. Both obey
budgets, retries, telemetry, and deterministic testing features such as
freeze_clock.
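remember_structured skips the inference step and ingests a payload you built yourself. A sketch, assuming the payload mirrors the MemoryObject fields described above (path, kind, summaries, tags, relevance, pinned); the exact payload schema is my assumption:

```
// Hedged sketch: payload shape assumed from the MemoryObject description.
let obj = remember_structured(
    "lexon_demo",
    """{
        "path": "lexon/runtime/decisions/sandbox",
        "kind": "decision",
        "summary_micro": "execute() stays disabled by default",
        "tags": ["sandbox", "governance"],
        "relevance": 0.9,
        "pinned": true
    }"""
);
print(obj);
```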
- It rides the same provider plumbing as ask, so it works with every supported backend.
- recall_context yields global_summary + sections + raw; chain with memory_index.hybrid_search for large corpora.
- before_action use_context auto-injects bundles so pinned guides/configs always enter the prompt.
- Pre-built payloads land via remember_structured, giving each project a durable storyline.
- Pair it with ask_multioutput to produce multi-file briefings.
- remember_* / recall_* emit spans, respect budgets, and surface in OTEL/Prometheus dashboards.

Structured memory is a peer to RAG, MCP, sessions, merge/fallback/ensemble, arbitrage, multioutput, and other orchestration features—one more first-class capability, not the only story.
If you just need the checklist, here it is:
- Language core: if/while/for/match, functions, structs, error handling, string/data utilities.
- Async: task.spawn/await, join_all, select_any, channels, rate limiter, retries, timeouts.
- Collections: map, filter, reduce, zip, chunk, flatten, unique, find, count.
- Sandbox/testing: execute gating, deterministic goldens.
- Web: web.search, custom endpoints via lexon.toml.
- LLM calls: ask_safe, ask_with_validation, multioutput, model arbitrage/dialogue.
- Governance: quality.* gates for PII/schema/confidence, sessions.gc_now, budgets per provider.
- Memory: RAG (memory_index.*), structured project memory (memory_space.*, remember_*, recall_*, pin/policy APIs), before_action use_context.
- Context management: rag.optimize_window, per-model tokenization.
- Tooling: Python bridge (lexon_py), cargo-make tasks, fuzz harnesses.

GA exit criteria: p95 runtime <1.2× baseline (samples/apps/research_analyst), <1% token-budget regression on samples/memory/structured_semantic, and OTEL spans present for every tool/ask call in CI smoke tests.
These are the programs I run to prove Lexon still “gets it right” end-to-end:
- samples/00-hello-lexon.lx: syntax and CLI workflow.
- samples/01-async-parallel.lx: scheduler, task.spawn, select_any.
- samples/apps/research_analyst/main.lx: full MCP + web search + RAG + sessions demo.
- samples/memory/structured_semantic.lx: deterministic structured-memory smoke test + golden/memory/structured_semantic.txt.

Commands:
cargo build --workspace
cargo run -q -p lexc-cli -- compile --run samples/00-hello-lexon.lx
cargo make samples-smoke
cargo make samples-snapshot

Switch structured memory backend:

LEXON_MEMORY_BACKEND=hybrid cargo run --bin lexc -- samples/memory/structured_semantic.lx

Use real providers by exporting API keys and editing lexon.toml (default_provider, per-provider defaults). Without keys, everything is deterministic and runs offline.
On the roadmap:

- Per-call backend overrides ({"backend": "patricia"} hints).
- A memory CLI (lexc memory browse) and better introspection.
- From ROADMAP.md: Qdrant presets, telemetry dashboards, provider expansion, IR optimizations, richer stdlib/networking, sockets, CI hardening, and DX tooling.

We gate GA on three metrics: median cargo make samples-smoke duration, p95 runtime for samples/apps/research_analyst, and structured-memory recall accuracy on the golden sample. When those stay inside budget for two consecutive runs, RC graduates to 1.0.

To try it:

- Clone github.com/lexon-lang/lexon (RC pinned to Rust 1.82).
- Build with cargo build --workspace.
- Run samples/memory/structured_semantic.lx using different backends (basic, patricia, raptor, hybrid).
- Read README.md, DOCUMENTATION.md, and communication/lexon_memory_features.md for the deep details.

Lexon already drives MCP agents, ETL pipelines, copilots, RAG flows, and now high-signal project memory—all from one language. If you try it, let me know what you build; I'm still the only person maintaining this thing, and your feedback shapes the backlog.
Repository: github.com/lexon-lang/lexon
Docs: README.md, DOCUMENTATION.md,
communication/lexon_memory_features.md
Contact: open an issue/PR or share your demo referencing Lexon
+ Structured Project Memory.
A few ways to get the most out of RC.1:

- Run samples/memory/structured_semantic.lx, pin the insights you care about, then wire before_action use_context into your go-to agent template so every step sees the curated context.
- Start lexc --mcp-ws --mcp-addr 127.0.0.1:9443 and hand that endpoint to Cursor, Claude Desktop, or your own supervisor—Lexon handles streaming, quotas, and cancellation.
- Call recall_context first, reuse its summaries as a system prompt, then run memory_index.hybrid_search for long-tail retrieval. Less hallucination, more grounded answers.
- Run cargo make samples-smoke and watch structured memory, agents, and MCP emit spans with budgets so you can spot regressions early.
- Bridge from Python with lexon_py, compile .lx snippets inline, and keep your existing notebooks while you migrate orchestrations into Lexon.
- The good first issue queue is the fastest way to contribute feedback.
- cargo run -q -p lexc-cli -- compile --run samples/apps/research_analyst/main.lx hits MCP, web search, RAG, sessions, and structured memory end-to-end—perfect demo fodder.