
What is Moltis? The Rust-Native AI Assistant Server Explained

June 5, 2026 · 26 min read · Nebula Deck

Tags: moltis, ai, rust, self-hosted, llm, nebula deck

Quick answer: Moltis is an open-source, self-hostable AI assistant server written in Rust. It compiles into a single binary with no runtime dependencies - no Node.js, no Python, no external database - and idles at ~256MB RAM. It provides persistent memory (hybrid vector + full-text search), multi-channel messaging (Telegram, WhatsApp, Discord, Slack, Matrix, Teams, Signal, Nostr, phone, and web), tool execution, MCP integration, voice I/O, local LLM support, and browser automation. Nebula Deck offers managed Moltis hosting with on-demand coding agents starting at $7/mo.

What is Moltis?
Moltis vs OpenClaw vs Hermes Agent: Architecture Comparison
Key Features
How Moltis Works: Technical Architecture
Self-Hosting Moltis
Managed Moltis with Nebula Deck
Importing from OpenClaw
FAQ

What is Moltis?

Moltis is a persistent personal agent server. Think of it as an AI assistant that's always running on your infrastructure - you interact with it through a web UI or the messaging platforms you already use, and it remembers everything across sessions.

It's written in Rust, ships as a single compiled binary, and stores all data locally in SQLite. No external database, no Redis, no Kafka, no node_modules. The whole thing starts in milliseconds.

The core idea: instead of an AI assistant that lives inside someone else's SaaS, Moltis gives you one that runs on your own machine (or managed infrastructure like Nebula Deck), stores data in files you control, and talks to LLM providers directly using your own API keys.

Moltis was created by Fabien Penso, a systems engineer with over two decades of experience building high-throughput infrastructure - including the push system behind BBC mobile apps (32M+ devices, 120M+ notifications/day at peak), Facebook OpenGraph integrations used by hundreds of millions, and Constellations, the fastest Rust-based Cosmos blockchain indexer. He built Moltis because he wanted an AI assistant he could trust, run himself, and understand end to end.

"Infrastructure you depend on should be yours. No telemetry, no phone-home, no vendor lock-in. If I disappear tomorrow, your stack still works."

Alpha software disclosure: Moltis is officially alpha software. The project's own documentation advises running it in isolated environments, reviewing enabled tools and providers, keeping secrets scoped and rotated, and avoiding public exposure without strong authentication. This is par for the course with self-hosted AI tooling in 2026.

Key characteristics:

Language: Rust (stable, safe ownership model, no garbage collector)
Memory footprint: ~256MB RAM at idle (including web server, chat engine, SQLite, memory system)
Binary: Single self-contained executable - web UI, LLM providers, tools, and all assets compiled in
Storage: SQLite + vector embeddings + FTS5 full-text search (all local files)
Codebase: ~270K lines of Rust across 59 workspace crates, 470+ test files (grown from 150K lines / 27 crates at launch in Feb 2026)
LLM providers: Bring your own keys - 15+ providers supported, plus local models
Source: Open-source at github.com/moltis-org/moltis (2.7k+ stars)
Channels: Telegram, WhatsApp, Discord, Slack, Matrix, Microsoft Teams, Signal, Nostr, phone calls (Twilio/Telnyx/Plivo), and web

Moltis vs OpenClaw vs Hermes Agent: Architecture Comparison

If you're evaluating Moltis, you've probably looked at OpenClaw and Hermes Agent too. All three are open-source personal agent servers, but they take fundamentally different approaches.

	Moltis	OpenClaw	Hermes Agent
Language	Rust	TypeScript (+ Swift/Kotlin apps)	Python (+ TypeScript TUI/web)
Runtime	Single compiled binary	Node.js 22.19+ (24 recommended) + npm/pnpm/bun	Python 3.11 + uv/pip
Codebase size	~270K Rust LoC	~1M+ TypeScript LoC	~144K Python LoC
Architecture	59 Rust workspace crates	npm packages, extensions, apps	Python packages, plugins, tools
Storage	SQLite (embedded, no external DB)	File-based workspace; Gateway-managed local state	File-based workspace (~/.hermes)
Memory system	SQLite + FTS5 + vector embeddings	Builtin, QMD, and Honcho engines; active memory; dreaming; compaction	Agent-curated memory + FTS5 session search + Honcho user modeling
Sandbox	Docker/Podman + Apple Container + WASM + Vercel + Daytona + remote	Docker (default), SSH, OpenShell backends	Local, Docker, SSH, Daytona, Singularity, Modal
Auth	Password + Passkey + scoped API keys + Vault	Pairing + gateway controls	CLI + messaging gateway
Startup time	Milliseconds	Seconds (Node.js boot)	Seconds (Python boot)

The runtime difference matters in practice. Moltis compiles everything - web UI, provider integrations, tools, assets - into one binary. No Node.js to babysit, no Python virtual environment to manage, no V8 garbage collector introducing latency spikes. Secrets are zeroed on drop rather than waiting for garbage collection.

All three projects support sandboxed tool execution. Moltis offers the widest variety of isolation backends (Docker/Podman, Apple Container, WASM, Vercel, Daytona, remote SSH), while OpenClaw supports Docker, SSH, and OpenShell, and Hermes Agent covers local, Docker, SSH, Daytona, Singularity, and Modal. That breadth, together with the small idle footprint, is why Nebula Deck chose Moltis as its engine - the math works for multi-tenant hosting at scale.

Key Features

Persistent Memory

Moltis remembers conversations across sessions using a two-layer memory system:

MEMORY.md - core identity facts, loaded into every conversation (kept short on purpose)
memory/<topic>.md - detailed notes, project context, decisions (loaded on demand via search)

When you tell Moltis to remember something, it writes to the appropriate file. When you ask a question, it searches across all memory files using hybrid search.

Two memory backends are available:

Feature	Built-in (default)	QMD (optional sidecar)
Search	Hybrid (vector + FTS5 keyword)	Hybrid (BM25 + vector + LLM reranking)
External deps	None - pure Rust	Requires QMD binary (Node.js/Bun)
Local embeddings	GGUF models via llama.cpp	GGUF models
Remote embeddings	OpenAI, Ollama, custom endpoints	Built-in
Embedding cache	SQLite with LRU eviction	Built-in
LLM reranking	Optional	Built-in

Both backends support offline operation with local embedding models. The built-in backend is the default and requires zero external dependencies - everything is embedded in the Moltis binary.
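To make "hybrid search" concrete, here is a minimal Python sketch that blends a keyword score with a vector similarity score. It is an illustration under stated assumptions, not Moltis's ranking code: the bag-of-words `embed` function stands in for a real embedding model, `keyword_score` stands in for FTS5 ranking, and the `alpha` weight and all function names are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real tokenizer."""
    return text.lower().split()

def keyword_score(query, doc):
    """Fraction of query terms found in the document (stand-in for FTS5 ranking)."""
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(doc))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(tokenize(text))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def hybrid_search(query, docs, alpha=0.5):
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score."""
    q_vec = embed(query)
    scored = [
        (alpha * cosine(q_vec, embed(d)) + (1 - alpha) * keyword_score(query, d), d)
        for d in docs
    ]
    return [d for score, d in sorted(scored, key=lambda s: s[0], reverse=True)]

notes = [
    "deploy script lives in the infra repo",
    "prefers dark roast coffee in the morning",
    "weekly review happens every friday",
]
print(hybrid_search("coffee preference", notes)[0])
```

With a real embedding model the vector score would also catch notes that share meaning but no words with the query, which is the case keyword search alone misses.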

Multi-Channel Messaging

Moltis connects to messaging platforms simultaneously. Your assistant is available on all of these at once, backed by the same memory and configuration:

Channel	Connection Mode	Notable Capabilities
Telegram	Polling	Streaming, voice ingest, reactions, OTP, location
Discord	WebSocket Gateway	Streaming, threads, voice ingest, reactions
Matrix	Sync loop	Streaming, encrypted chats, device verification, OTP
WhatsApp	WebSocket Gateway	Streaming, voice ingest, OTP, pairing
Slack	Socket Mode	Streaming, threads, interactive messages
Microsoft Teams	Webhook	Streaming, threads, reactions
Signal	signal-cli SSE	OTP, DMs, groups
Nostr	Relay subscription	OTP, encrypted DMs (NIP-04)
Phone (Twilio/Telnyx/Plivo)	Webhook	Outbound/inbound calls, TTS, speech recognition, DTMF

Most channels don't require a public URL - only Microsoft Teams and telephony need an inbound webhook endpoint. This makes self-hosting on a home network or behind NAT straightforward for messaging.

The web UI also works as an installable PWA with push notifications on mobile, so you get a native-feeling app experience without a separate frontend.

Tools, Skills, and Sandboxed Execution

Moltis can execute tools - shell commands, file operations, web fetching, browser automation, and more. Shell commands run inside isolated containers by default:

Docker/Podman - full container isolation with filesystem boundaries
Apple Container - native macOS sandboxing
WASM - lightweight in-process isolation
Vercel / Daytona - serverless and remote sandbox targets
Restricted host - fallback when no container runtime is available

The skills system provides reusable workflows for common tasks. Skills can be bundled (shipped with Moltis), workspace-specific (your own), or auto-generated through the self-improvement feature. Moltis also supports automatic checkpoints before skill and memory mutations - you can roll back without touching git history.

When the LLM requests multiple tool calls in a single turn, they execute concurrently via futures::join_all rather than sequentially. This matters when an agent requests five independent tools in one turn - total wall time is roughly that of the slowest tool, not the sum of all of them.
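The effect is easy to reproduce outside Rust. The sketch below uses Python's `asyncio.gather` as a rough analog of `futures::join_all`; the `run_tool` function and the tool names are hypothetical, and the sleeps simply simulate I/O-bound tools.

```python
import asyncio
import time

async def run_tool(name, seconds):
    """Hypothetical tool: sleeps to simulate I/O-bound work, then returns a result."""
    await asyncio.sleep(seconds)
    return f"{name} done"

async def run_tools_concurrently(calls):
    """Start every tool call at once; results come back in request order."""
    start = time.perf_counter()
    results = await asyncio.gather(*(run_tool(n, s) for n, s in calls))
    return results, time.perf_counter() - start

# Five independent tool calls; run one after another they would take about 0.30s.
calls = [("fetch", 0.02), ("grep", 0.04), ("read", 0.06), ("lint", 0.08), ("build", 0.10)]
results, elapsed = asyncio.run(run_tools_concurrently(calls))
print(results, f"{elapsed:.2f}s")
```

The elapsed time should land near the slowest call (0.10s here) rather than the 0.30s total, because the waits overlap.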

Environment variables injected into sandboxes are automatically redacted from output in plain text, base64, and hex forms. Secrets reach the container, but they don't leak back through tool output.
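A simplified sketch of this kind of redaction, in Python, might look like the following. The function name and mask string are assumptions, and the approach is deliberately naive: it only catches a secret that was encoded on its own, not one embedded in a larger base64 or hex blob, and it only matches lowercase hex.

```python
import base64

def redact(output, secrets, mask="[REDACTED]"):
    """Replace each secret in plain, base64, and hex form with a mask."""
    for secret in secrets:
        if not secret:
            continue  # an empty string would match everywhere
        raw = secret.encode("utf-8")
        variants = {
            secret,
            base64.b64encode(raw).decode("ascii"),
            raw.hex(),
        }
        # Replace longer variants first so a short form never clips a longer one.
        for variant in sorted(variants, key=len, reverse=True):
            output = output.replace(variant, mask)
    return output

token = "s3cr3t-token"  # hypothetical secret injected into the sandbox
tool_output = (
    f"plain={token} "
    f"b64={base64.b64encode(token.encode()).decode()} "
    f"hex={token.encode().hex()}"
)
print(redact(tool_output, [token]))
```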

MCP (Model Context Protocol)

Moltis supports the Model Context Protocol - an open standard for connecting AI models to external tools and data sources. Both stdio and HTTP/SSE transports are supported, meaning you can connect any MCP-compatible server: databases, APIs, internal tools, browser automation, whatever you need.
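For a sense of what travels over those transports: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The Python sketch below builds such a request and frames it for the stdio transport, where each message is a single line of JSON. The tool name `query_database` and its arguments are hypothetical; a real server advertises its tools through `tools/list`. This shows the wire format only, not how Moltis implements its client.

```python
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

def frame_for_stdio(message):
    """Serialize as one line; the stdio transport delimits messages with newlines."""
    return json.dumps(message, separators=(",", ":")) + "\n"

# Hypothetical tool name and arguments, for illustration only.
line = frame_for_stdio(mcp_tool_call(1, "query_database", {"sql": "SELECT 1"}))
print(line, end="")
```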

In the Nebula Deck architecture, MCP is also how the Go backend communicates with Moltis containers - spawning coding agents, managing browser sessions, and handling billing-related operations all flow through MCP endpoints.

Local LLM Support

Moltis can run LLM inference entirely on your machine - no API key, no internet connection required. Two backends are supported:

Backend	Format	Platform	GPU Acceleration
GGUF (llama.cpp)	`.gguf` files	macOS, Linux, Windows	Metal (macOS), CUDA (NVIDIA), Vulkan
MLX	MLX model repos	macOS (Apple Silicon only)	Metal (Apple Silicon GPU)

Models are organized by memory tier - from 4GB RAM (1B parameter models) to 32GB+ (14B+ parameter models). You can search and download models directly from HuggingFace within the Moltis web UI.

This means you can run a fully offline AI assistant: Moltis on your hardware, the LLM on your hardware, all data local. Zero external API calls.

Voice I/O

Moltis includes built-in speech-to-text and text-to-speech. Multiple TTS providers are supported, and voice messages from Telegram, Discord, Matrix, and WhatsApp can be automatically transcribed.

Proactive Heartbeat

Moltis runs a periodic heartbeat (every 30 minutes by default, configurable) where the LLM checks if anything needs your attention - inbox, calendar, reminders - and only notifies you when something actually does. Instead of you checking five dashboards, the assistant watches them and pings you on your preferred channel when something matters.

Additional Features

Browser automation - headless Chromium control for web scraping and interaction
Cron scheduling - schedule recurring tasks and reminders
Webhooks - receive events from external services (GitHub, Stripe, Sentry, etc.)
Hooks - 15 lifecycle events where you can observe, modify, or block actions
Session branching - fork conversations to explore alternatives
CalDAV - integrate calendar data
GraphQL API - programmatic access to all Moltis functionality
Cross-session recall - search past sessions for relevant context without dumping raw history into every prompt
Encryption at rest - vault-backed secret storage with secrecy::Secret zeroing
SSRF protection - DNS-resolved blocking of loopback, private, link-local, and CGNAT ranges
Tailscale integration - expose the gateway over your tailnet via Tailscale Serve or Funnel, with status monitoring and mode switching from the web UI
Observability - Prometheus metrics, OpenTelemetry tracing with OTLP export, and structured logging for production deployments
PWA - installable web app with push notifications on mobile devices
Onboarding wizard - first-run setup walks you through configuring agent identity (name, emoji, creature, vibe) and your user profile, with TOML config validation and typo detection

How Moltis Works: Technical Architecture

Under the hood, Moltis is structured as a Rust workspace with 59 crates. The main components fit together like this:

 INGRESS: Web UI (PWA)        INGRESS: Channels
 +-------------------+        +--------------------------+
 |  HTTP Gateway     |        | Telegram  Discord  Slack |
 |  (TLS, WebSocket, |        | WhatsApp  Matrix  Teams  |
 |   REST API)       |        | Signal   Nostr    Phone  |
 +--------+----------+        +-------------+------------+
          |                                 |
          +----------------+----------------+
                           |
                           v
          +--------------------------------+
          |        Agent Loop              |
          |  (orchestrates turns,          |
          |   streams responses back       |
          |   to all channels)             |
          +----------------+---------------+
                           |
       +-----------+-------+--+----------+----------+
       |           |          |          |          |
       v           v          v          v          v
 +-----------+ +--------+ +--------+ +--------+ +-------+
 | Provider  | | Memory | | Tools  | | MCP    | | Cron  |
 | Registry  | | (FTS5  | | &      | | Bridge | | Heart |
 |           | | + vec) | | Skills | |        | | beat  |
 | Anthropic | +--------+ +---+----+ +--------+ +-------+
 | OpenAI    |                |
 | Gemini    |                v
 | + 12 more |       +------------------+
 +-----------+       |     Sandbox      |
                     | Docker / Podman  |
                     | Apple Container  |
                     | WASM / Remote    |
                     +--------+---------+
                              |
              +---------------+---------------+
              |       SQLite Database         |
              |  sessions - memory - vectors  |
              |  hooks - cron - embeddings    |
              +-------------------------------+

Provider abstraction: Moltis doesn't lock you into one LLM. It supports 15+ providers through a trait-based architecture - Anthropic, OpenAI, Google Gemini, DeepSeek, Mistral, Groq, xAI, OpenRouter, Cerebras, MiniMax, Moonshot, Venice, Z.AI, plus OAuth providers (OpenAI Codex, GitHub Copilot) and local providers (Ollama, LM Studio, built-in GGUF/MLX). You switch providers by changing a config value. Any OpenAI-compatible endpoint can be added with a custom- prefix. OpenAI Batch API support gives you 50% cost savings on eligible calls.

Streaming-first design: Token streaming works on every provider, including when tools are enabled. Tool call arguments stream as deltas as they arrive - you see output immediately rather than waiting for the full response to complete. This applies across all channels, not just the web UI.

Memory architecture: All persistence lives in SQLite. Vector embeddings are stored alongside the database file. Full-text search uses SQLite's FTS5. No external vector database, no Redis, no separate search service. The agent runner itself is ~7.5K lines of Rust, with provider implementations in ~19K more. The remaining ~243K lines cover the web UI, tools, channels, sandbox backends, and supporting infrastructure.

Security model: Moltis generates a self-signed CA on first run for local TLS. Authentication uses password + WebAuthn passkeys + scoped API keys. Secrets go through secrecy::Secret wrappers that zero memory on drop. The SSRF filter resolves DNS and blocks loopback, private, link-local, and CGNAT ranges. Rate limiting is built in with per-IP throttling and strict login protection. There are 15 lifecycle hook events where actions can be inspected, modified, or blocked before execution. Release artifacts are signed with Sigstore/Cosign (keyless signing), Docker images ship with SBOM and provenance attestations, and workspace lints deny unsafe_code, unwrap_used, and expect_used by default.
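The SSRF check described above can be sketched in a few lines of Python using the standard `ipaddress` module. This is an illustration of the technique, not the Moltis implementation: the function names are hypothetical, and a production filter would also need to connect to the exact address it validated, so that a second DNS lookup cannot return something different.

```python
import ipaddress
import socket

CGNAT = ipaddress.ip_network("100.64.0.0/10")  # carrier-grade NAT range (RFC 6598)

def is_blocked_ip(ip_text):
    """True if the address is loopback, private, link-local, or CGNAT."""
    ip = ipaddress.ip_address(ip_text.split("%")[0])  # drop any IPv6 zone id
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it is judged as IPv4.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local
        or (ip.version == 4 and ip in CGNAT)
    )

def is_url_host_allowed(hostname):
    """Resolve the hostname and reject it if any resolved address is blocked."""
    infos = socket.getaddrinfo(hostname, None)
    return all(not is_blocked_ip(info[4][0]) for info in infos)

for candidate in ["127.0.0.1", "169.254.169.254", "100.64.0.1", "8.8.8.8"]:
    print(candidate, "blocked" if is_blocked_ip(candidate) else "allowed")
```

Checking the resolved addresses rather than the hostname string is the important part: a hostname that looks public can still resolve to `127.0.0.1` or a cloud metadata address.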

Self-Hosting Moltis

Moltis is open-source and designed to be self-hosted. Here's what you need.

Installation

The fastest way to get started on macOS or Linux:

curl -fsSL https://www.moltis.org/install.sh | sh

Or via Homebrew:

brew install moltis-org/tap/moltis

Or via Docker:

docker pull ghcr.io/moltis-org/moltis:latest

Linux packages (.deb, .rpm, .pkg.tar.zst, Snap, AppImage) are also available through the installer script. You can even build from source with Rust 1.91+.

Running with Docker

docker run -d \
  --name moltis \
  -p 13131:13131 \
  -p 13132:13132 \
  -p 1455:1455 \
  -v moltis-config:/home/moltis/.config/moltis \
  -v moltis-data:/home/moltis/.moltis \
  ghcr.io/moltis-org/moltis:latest

Ports:

Port	Purpose
13131	Gateway - web UI, API, WebSocket (HTTPS by default)
13132	CA certificate download for local TLS trust
1455	OAuth callback (required for OpenAI Codex and similar providers)

Volume mounts:

Path	Contents
`/home/moltis/.config/moltis`	Configuration: `moltis.toml`, credentials, MCP server config
`/home/moltis/.moltis`	Runtime data: databases, sessions, memory files, models, logs

Then open https://localhost:13131 in your browser and follow the onboarding: configure your LLM provider, optionally add a passkey, and start chatting.

Note: On localhost, no authentication is required. If you access Moltis from a different machine, a setup code is printed to the container logs.

First Run

After installation, simply run:

moltis

You'll see:

Moltis gateway starting...
Open http://localhost:13131 in your browser

Configure a provider (API key, OAuth, or local LLM), and you're chatting. The fastest path is setting an environment variable like ANTHROPIC_API_KEY or OPENROUTER_API_KEY before starting - models appear automatically in the picker.

What you manage yourself

Self-hosting means you handle:

TLS certificates - for public-facing deployments, use a reverse proxy (Caddy, nginx, Traefik) with Let's Encrypt
Updates - pull new images or re-run the install script
Backups - the SQLite database + memory files in the data directory
Security - network exposure, firewall, container isolation if multi-user
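For the backup item, copying a SQLite file while the server is writing to it can produce an inconsistent copy. One hedged option is SQLite's online backup API, which Python exposes as `Connection.backup`. The sketch below demonstrates it against a throwaway database in a temporary directory; the file names are placeholders, since the actual database file names inside the Moltis data directory are not stated here. Memory files are plain Markdown and can be copied with ordinary file tools.

```python
import os
import sqlite3
import tempfile

def backup_sqlite(src_path, dest_path):
    """Copy a live SQLite database using SQLite's online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # takes a consistent snapshot of the source
    finally:
        dest.close()
        src.close()

# Demo with a throwaway database; substitute your real data directory paths.
with tempfile.TemporaryDirectory() as tmp:
    live = os.path.join(tmp, "live.db")
    copy = os.path.join(tmp, "backup.db")

    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE notes (body TEXT)")
    conn.execute("INSERT INTO notes VALUES ('remember this')")
    conn.commit()
    conn.close()

    backup_sqlite(live, copy)

    check = sqlite3.connect(copy)
    restored_rows = check.execute("SELECT body FROM notes").fetchall()
    check.close()

print(restored_rows)
```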

Managed Moltis with Nebula Deck

If self-hosting sounds like work but you still want the benefits of Moltis - your own instance, your own API keys, persistent memory, multi-channel access - Nebula Deck provides managed hosting.

What Nebula Deck handles

Infrastructure - each user gets their own Moltis container with gVisor isolation
TLS - automatic via Caddy + Cloudflare DNS-01
Updates - self-service with rollback (you choose when, not forced reboots)
Security - gVisor-isolated containers, no Docker socket access for tenants, SSRF-filtered networking
Backups - planned for post-launch

What you get beyond vanilla Moltis

Coding agents - on-demand OpenHands instances for autonomous coding tasks (Standard headless: $0.05/hr, Full GUI: $0.15/hr, per-second billing with 60s minimum)
Browser sessions - isolated Chromium for web automation and scraping ($0.05/hr)
Web search - shared SearXNG instance providing unlimited search at zero marginal cost (rolling out at launch)
MCP bridge - agent and browser session orchestration through MCP endpoints
Polar.sh billing - integrated usage-based billing
Zitadel OAuth - platform-level authentication

Pricing

Component	Cost
Deck (always-on Moltis, 256MB, 1GB storage)	$7/mo
Compute tiers (agents + browser, per-second billing, 60s min)
- Developer (3 concurrent, $10 credit)	$15/mo
- Studio (5 concurrent, $35 credit)	$39/mo
- Observatory (10 concurrent, $100 credit)	$99/mo
Standard container rate (headless agent or browser)	$0.05/hr
Full container rate (GUI agent + browser)	$0.15/hr
LLM tokens	BYOK - provider rates, no markup
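To see what per-second billing with a 60-second minimum means for a single session, here is a small worked calculation in Python. The rates come from the table above; the function name is hypothetical, and the sketch assumes no rounding beyond the minimum charge, which may differ from how invoices are actually computed.

```python
def session_cost(seconds, hourly_rate, minimum_seconds=60):
    """Cost of one session under per-second billing with a minimum charge."""
    billed = max(seconds, minimum_seconds)
    return billed * hourly_rate / 3600

STANDARD = 0.05  # $/hr, headless agent or browser
FULL = 0.15      # $/hr, GUI agent + browser

# A 20-second run is billed as the 60-second minimum.
print(f"20s standard: ${session_cost(20, STANDARD):.5f}")
# A 45-minute GUI session is billed for exactly 2700 seconds.
print(f"45min full:   ${session_cost(45 * 60, FULL):.4f}")
```

At these rates, and assuming credit is drawn down at the listed hourly prices, the $10 Developer credit would cover roughly 200 hours of Standard container time or about 66 hours of Full container time.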

Importing from OpenClaw

Moltis has a built-in OpenClaw import system - it's a core feature of Moltis itself, not just Nebula Deck. If you have an existing OpenClaw installation, Moltis automatically detects it and can import:

Category	What gets imported
Identity	Agent name, theme, timezone
Providers	API keys (mapped to Moltis equivalents)
Skills	All skill directories with `SKILL.md`
Memory	`MEMORY.md` and all `memory/*.md` files
Channels	Telegram and Discord bot configs
Sessions	Full conversation history (JSONL -> Moltis format)
MCP Servers	Server configurations
Workspace files	`SOUL.md`, `IDENTITY.md`, `USER.md`, `TOOLS.md`, `AGENTS.md`, etc.

The import is strictly read-only - your OpenClaw installation is never modified. You can import via:

Web UI - during onboarding or from Settings > OpenClaw Import
CLI - moltis import detect, moltis import all, or moltis import select -c providers,skills,memory
RPC - programmatic access via openclaw.detect and openclaw.import methods

There's even automatic background syncing: if you continue using OpenClaw after import, Moltis watches for new session files and syncs them incrementally within seconds.

FAQ

Is Moltis free and open-source?

Yes. Moltis is open-source at github.com/moltis-org/moltis with 2.7k+ stars. You can self-host it for free - you only pay for your server and LLM API usage. If you want managed hosting with extra features like coding agents and browser sessions, Nebula Deck provides that starting at $7/mo.

Is Moltis production-ready?

Moltis is officially alpha software. The project's documentation advises treating it accordingly: run in isolated environments, review your tool/provider configuration, keep secrets rotated, and don't expose it publicly without strong authentication. That said, it's actively developed with 3,700+ commits and a growing community.

Can I import my data from OpenClaw?

Yes - and it's built into Moltis itself, not just the managed offering. Moltis automatically detects an existing OpenClaw installation and can import identity, providers, skills, memory, sessions, channels, MCP servers, and workspace files. The import is read-only and idempotent, so you can safely run it alongside OpenClaw and re-import to pick up new data.

Does Moltis work with local LLMs?

Yes. Moltis has two built-in local inference backends: GGUF (powered by llama.cpp, cross-platform with GPU acceleration) and MLX (Apple Silicon native). It also works with Ollama, LM Studio, or any OpenAI-compatible local server. This gives you a fully offline AI assistant - the agent and the LLM both run on your hardware, with zero external API calls.

How is Moltis different from Hermes Agent?

Both are self-hostable AI assistants with multi-channel messaging, persistent memory, MCP support, and self-improving skills. The key difference is focus: Moltis prioritizes a minimal trusted runtime (Rust binary, ~256MB, fast startup) and multi-tenant density. Hermes Agent is Python-first with a research-oriented learning loop - trajectory generation, RL environments, and user modeling. Both have autonomous skill improvement (Moltis enables it by default). The practical difference: Moltis is easier to deploy on small VPS instances with no runtime to install, while Hermes is more interesting if your priority is the research and reinforcement feedback loop.

How is Moltis different from OpenClaw?

OpenClaw is a TypeScript-based ecosystem with companion apps (macOS, iOS, Android), a broad plugin system, and ~1M+ lines of code. Moltis is a Rust single binary with ~270K lines, designed for minimal runtime overhead. Both support channels, tools, skills, and memory. The practical difference: OpenClaw has a larger feature surface and native mobile apps, while Moltis has a smaller trusted computing base, faster startup, and lower resource requirements - making it better suited for dense multi-tenant hosting.

Want to try Moltis without setting up a server? Nebula Deck provides managed Moltis instances with on-demand coding agents, browser sessions, and SearXNG-powered web search - starting at $7/mo.
