What is Moltis? The Rust-Native AI Assistant Server Explained
Quick answer: Moltis is an open-source, self-hostable AI assistant server written in Rust. It compiles into a single binary with no runtime dependency - no Node.js, no Python, no external database. It idles at ~256MB RAM. It provides persistent memory (hybrid vector + full-text search), multi-channel messaging (Telegram, WhatsApp, Discord, Slack, Matrix, Teams, Signal, Nostr, phone, and web), tool execution, MCP integration, voice I/O, local LLM support, and browser automation. Nebula Deck offers managed Moltis hosting with on-demand coding agents starting at $7/mo.
Table of Contents
- What is Moltis?
- Moltis vs OpenClaw vs Hermes Agent: Architecture Comparison
- Key Features
- How Moltis Works: Technical Architecture
- Self-Hosting Moltis
- Managed Moltis with Nebula Deck
- Importing from OpenClaw
- FAQ
What is Moltis?
Moltis is a persistent personal agent server. Think of it as an AI assistant that's always running on your infrastructure - you interact with it through a web UI or the messaging platforms you already use, and it remembers everything across sessions.
It's written in Rust, ships as a single compiled binary, and stores all data locally in SQLite. No external database, no Redis, no Kafka, no node_modules. The whole thing starts in milliseconds.
The core idea: instead of an AI assistant that lives inside someone else's SaaS, Moltis gives you one that runs on your own machine (or managed infrastructure like Nebula Deck), stores data in files you control, and talks to LLM providers directly using your own API keys.
Moltis was created by Fabien Penso, a systems engineer with over two decades of experience building high-throughput infrastructure - including the push system behind BBC mobile apps (32M+ devices, 120M+ notifications/day at peak), Facebook OpenGraph integrations used by hundreds of millions, and Constellations, the fastest Rust-based Cosmos blockchain indexer. He built Moltis because he wanted an AI assistant he could trust, run himself, and understand end to end.
"Infrastructure you depend on should be yours. No telemetry, no phone-home, no vendor lock-in. If I disappear tomorrow, your stack still works."
Alpha software disclosure: Moltis is officially alpha software. The project's own documentation advises running it in isolated environments, reviewing enabled tools and providers, keeping secrets scoped and rotated, and avoiding public exposure without strong authentication. This is par for the course with self-hosted AI tooling in 2026.
Key characteristics:
- Language: Rust (stable, safe ownership model, no garbage collector)
- Memory footprint: ~256MB RAM at idle (including web server, chat engine, SQLite, memory system)
- Binary: Single self-contained executable - web UI, LLM providers, tools, and all assets compiled in
- Storage: SQLite + vector embeddings + FTS5 full-text search (all local files)
- Codebase: ~270K Rust lines across 59 workspace crates, 470+ test files (grown from 150K / 27 crates at launch in Feb 2026)
- LLM providers: Bring your own keys - 15+ providers supported, plus local models
- Source: Open-source at github.com/moltis-org/moltis (2.7k+ stars)
- Channels: Telegram, WhatsApp, Discord, Slack, Matrix, Microsoft Teams, Signal, Nostr, phone calls (Twilio/Telnyx/Plivo), and web
Moltis vs OpenClaw vs Hermes Agent: Architecture Comparison
If you're evaluating Moltis, you've probably looked at OpenClaw and Hermes Agent too. All three are open-source personal agent servers, but they take fundamentally different approaches.
| | Moltis | OpenClaw | Hermes Agent |
|---|---|---|---|
| Language | Rust | TypeScript (+ Swift/Kotlin apps) | Python (+ TypeScript TUI/web) |
| Runtime | Single compiled binary | Node.js 22.19+ (24 recommended) + npm/pnpm/bun | Python 3.11 + uv/pip |
| Codebase size | ~270K Rust LoC | ~1M+ TypeScript LoC | ~144K Python LoC |
| Architecture | 59 Rust workspace crates | npm packages, extensions, apps | Python packages, plugins, tools |
| Storage | SQLite (embedded, no external DB) | File-based workspace; Gateway-managed local state | File-based workspace (~/.hermes) |
| Memory system | SQLite + FTS5 + vector embeddings | Builtin, QMD, and Honcho engines; active memory; dreaming; compaction | Agent-curated memory + FTS5 session search + Honcho user modeling |
| Sandbox | Docker/Podman + Apple Container + WASM + Vercel + Daytona + remote | Docker (default), SSH, OpenShell backends | Local, Docker, SSH, Daytona, Singularity, Modal |
| Auth | Password + Passkey + scoped API keys + Vault | Pairing + gateway controls | CLI + messaging gateway |
| Startup time | Milliseconds | Seconds (Node.js boot) | Seconds (Python boot) |
The runtime difference matters in practice. Moltis compiles everything - web UI, provider integrations, tools, assets - into one binary. No Node.js to babysit, no Python virtual environment to manage, no V8 garbage collector introducing latency spikes. Secrets are zeroed on drop rather than waiting for garbage collection.
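To make "zeroed on drop" concrete, here is a minimal std-only sketch. The `SecretBytes` type is hypothetical and is not Moltis code; the article attributes the real behaviour to `secrecy::Secret` wrappers, and a production crate would also handle details this sketch ignores, such as spare capacity and copies made by reallocation.

```rust
/// Hypothetical wrapper that overwrites its buffer before the memory is freed.
pub struct SecretBytes {
    bytes: Vec<u8>,
}

impl SecretBytes {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }

    /// Borrow the secret; callers should hold the borrow as briefly as possible.
    pub fn expose(&self) -> &[u8] {
        &self.bytes
    }

    /// Overwrite every byte with zero. Volatile writes keep the compiler
    /// from removing the wipe as a dead store.
    pub fn wipe(&mut self) {
        for byte in self.bytes.iter_mut() {
            unsafe { std::ptr::write_volatile(byte as *mut u8, 0) };
        }
    }
}

impl Drop for SecretBytes {
    // Runs deterministically when the value goes out of scope,
    // so there is no garbage collector to wait for.
    fn drop(&mut self) {
        self.wipe();
    }
}
```

Because `Drop` runs at a known point, the window in which the key sits in memory is bounded by the value's scope rather than by a collector's schedule.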
All three projects support sandboxed tool execution. Moltis offers the widest variety of isolation backends (Docker/Podman, Apple Container, WASM, Vercel, Daytona, remote SSH), while OpenClaw supports Docker, SSH, and OpenShell, and Hermes Agent covers local, Docker, SSH, Singularity, Modal, and Daytona. The breadth of Moltis sandbox options is why Nebula Deck chose it as its engine - the math works for multi-tenant hosting at scale.
Key Features
Persistent Memory
Moltis remembers conversations across sessions using a two-layer memory system:
- MEMORY.md - core identity facts, loaded into every conversation (kept short on purpose)
- memory/<topic>.md - detailed notes, project context, decisions (loaded on demand via search)
When you tell Moltis to remember something, it writes to the appropriate file. When you ask a question, it searches across all memory files using hybrid search.
Two memory backends are available:
| Feature | Built-in (default) | QMD (optional sidecar) |
|---|---|---|
| Search | Hybrid (vector + FTS5 keyword) | Hybrid (BM25 + vector + LLM reranking) |
| External deps | None - pure Rust | Requires QMD binary (Node.js/Bun) |
| Local embeddings | GGUF models via llama.cpp | GGUF models |
| Remote embeddings | OpenAI, Ollama, custom endpoints | Built-in |
| Embedding cache | SQLite with LRU eviction | Built-in |
| LLM reranking | Optional | Built-in |
Both backends support offline operation with local embedding models. The built-in backend is the default and requires zero external dependencies - everything is embedded in the Moltis binary.
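The article does not say how the built-in backend merges its vector and FTS5 result lists. One common way to combine two rankings is reciprocal rank fusion (RRF), sketched below as an assumption rather than as Moltis's actual scoring. The function name, the file names in the usage note, and the constant `k` are illustrative.

```rust
use std::collections::HashMap;

/// Merge two best-first lists of document ids with reciprocal rank fusion.
/// Each appearance contributes 1 / (k + rank), with ranks starting at 1.
pub fn reciprocal_rank_fusion(keyword: &[&str], vector: &[&str], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in [keyword, vector] {
        for (index, id) in list.iter().enumerate() {
            let rank = index as f64 + 1.0;
            *scores.entry((*id).to_owned()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Calling it with a keyword list of `memory/rust.md, memory/docker.md` and a vector list of `memory/docker.md, memory/travel.md` puts `memory/docker.md` first, because it is the only note both searches agree on.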
Multi-Channel Messaging
Moltis connects to multiple messaging platforms simultaneously. Your assistant is available on all of these at once, backed by the same memory and configuration:
| Channel | Connection Mode | Notable Capabilities |
|---|---|---|
| Telegram | Polling | Streaming, voice ingest, reactions, OTP, location |
| Discord | WebSocket Gateway | Streaming, threads, voice ingest, reactions |
| Matrix | Sync loop | Streaming, encrypted chats, device verification, OTP |
| WhatsApp | WebSocket Gateway | Streaming, voice ingest, OTP, pairing |
| Slack | Socket Mode | Streaming, threads, interactive messages |
| Microsoft Teams | Webhook | Streaming, threads, reactions |
| Signal | signal-cli SSE | OTP, DMs, groups |
| Nostr | Relay subscription | OTP, encrypted DMs (NIP-04) |
| Phone (Twilio/Telnyx/Plivo) | Webhook | Outbound/inbound calls, TTS, speech recognition, DTMF |
Most channels don't require a public URL - only Microsoft Teams and telephony need an inbound webhook endpoint. This makes self-hosting on a home network or behind NAT straightforward for messaging.
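The public-URL rule follows from the connection mode column. The sketch below encodes it; the enum and function names are hypothetical, and the assumption that every non-webhook mode connects outbound from the Moltis host is inferred from the table rather than stated by the project.

```rust
/// Connection modes as listed in the channel table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionMode {
    Polling,
    WebSocketGateway,
    SyncLoop,
    SocketMode,
    SignalCliSse,
    RelaySubscription,
    Webhook,
}

impl ConnectionMode {
    /// Webhook channels wait for the platform to call in, so they need a
    /// reachable endpoint. The other modes are assumed to dial out.
    pub fn needs_public_url(self) -> bool {
        matches!(self, ConnectionMode::Webhook)
    }
}

/// Names of the configured channels that require an inbound endpoint.
pub fn channels_needing_public_url<'a>(channels: &[(&'a str, ConnectionMode)]) -> Vec<&'a str> {
    channels
        .iter()
        .filter(|(_, mode)| mode.needs_public_url())
        .map(|(name, _)| *name)
        .collect()
}
```

A check like this is useful before deploying behind NAT: if the returned list is empty, no port forwarding or tunnel is needed for messaging.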
The web UI also works as an installable PWA with push notifications on mobile, so you get a native-feeling app experience without a separate frontend.
Tools, Skills, and Sandboxed Execution
Moltis can execute tools - shell commands, file operations, web fetching, browser automation, and more. Shell commands run inside isolated containers by default:
- Docker/Podman - full container isolation with filesystem boundaries
- Apple Container - native macOS sandboxing
- WASM - lightweight in-process isolation
- Vercel / Daytona - serverless and remote sandbox targets
- Restricted host - fallback when no container runtime is available
The skills system provides reusable workflows for common tasks. Skills can be bundled (shipped with Moltis), workspace-specific (your own), or auto-generated through the self-improvement feature. Moltis also supports automatic checkpoints before skill and memory mutations - you can roll back without touching git history.
When the LLM requests multiple tool calls in a single turn, they execute in parallel via futures::join_all rather than sequentially. This matters when an agent requests five independent tools in one turn - total wall time is roughly that of the slowest tool, not the sum of all of them.
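The effect of running tool calls concurrently can be shown with the standard library alone. This sketch swaps futures::join_all for `std::thread::scope` so it runs without an async runtime; `ToolCall` and `run_tool` are hypothetical stand-ins, and sleeping simulates the work a real tool would do.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical tool call: a name plus how long the tool takes to run.
pub struct ToolCall {
    pub name: &'static str,
    pub duration: Duration,
}

/// Stand-in for real tool execution: sleep, then return a result string.
fn run_tool(call: &ToolCall) -> String {
    thread::sleep(call.duration);
    format!("{} done", call.name)
}

/// Run every call on its own scoped thread and return the results in call
/// order together with the elapsed wall time.
pub fn run_parallel(calls: &[ToolCall]) -> (Vec<String>, Duration) {
    let start = Instant::now();
    let results: Vec<String> = thread::scope(|scope| {
        // Spawn everything first; joining inside the same iterator chain
        // would serialise the calls because iterators are lazy.
        let handles: Vec<_> = calls
            .iter()
            .map(|call| scope.spawn(move || run_tool(call)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("tool thread panicked"))
            .collect()
    });
    (results, start.elapsed())
}
```

With three calls that take 10, 20, and 30 ms, the elapsed time is close to 30 ms rather than 60 ms, which is the longest-tool behaviour described above.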
Environment variables injected into sandboxes are automatically redacted from output in plain text, base64, and hex forms. Secrets reach the container, but they don't leak back through tool output.
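A rough sketch of redacting a secret in its plain, base64, and hex forms follows. It is an illustration, not the Moltis implementation: the function names and the `[REDACTED]` marker are assumptions, the encoders are written out by hand to avoid external crates, and only exact encodings of the whole secret are matched (a secret embedded inside a longer base64 stream would not be caught).

```rust
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with padding.
pub fn base64_encode(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(3) {
        let b1 = *chunk.get(1).unwrap_or(&0);
        let b2 = *chunk.get(2).unwrap_or(&0);
        let n = (u32::from(chunk[0]) << 16) | (u32::from(b1) << 8) | u32::from(b2);
        out.push(B64[(n >> 18) as usize & 63] as char);
        out.push(B64[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { B64[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Lowercase hexadecimal encoding.
pub fn hex_encode(input: &[u8]) -> String {
    input.iter().map(|b| format!("{b:02x}")).collect()
}

/// Replace each secret, in any of the three forms, with a marker.
pub fn redact(output: &str, secrets: &[&str]) -> String {
    let mut redacted = output.to_owned();
    for secret in secrets.iter().filter(|s| !s.is_empty()) {
        let needles = [
            secret.to_string(),
            base64_encode(secret.as_bytes()),
            hex_encode(secret.as_bytes()),
        ];
        for needle in &needles {
            redacted = redacted.replace(needle.as_str(), "[REDACTED]");
        }
    }
    redacted
}
```

Empty secrets are skipped on purpose: replacing the empty string would insert the marker between every character of the output.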
MCP (Model Context Protocol)
Moltis supports the Model Context Protocol - an open standard for connecting AI models to external tools and data sources. Both stdio and HTTP/SSE transports are supported, meaning you can connect any MCP-compatible server: databases, APIs, internal tools, browser automation, whatever you need.
In the Nebula Deck architecture, MCP is also how the Go backend communicates with Moltis containers - spawning coding agents, managing browser sessions, and handling billing-related operations all flow through MCP endpoints.
Local LLM Support
Moltis can run LLM inference entirely on your machine - no API key, no internet connection required. Two backends are supported:
| Backend | Format | Platform | GPU Acceleration |
|---|---|---|---|
| GGUF (llama.cpp) | .gguf files | macOS, Linux, Windows | Metal (macOS), CUDA (NVIDIA), Vulkan |
| MLX | MLX model repos | macOS (Apple Silicon only) | Metal (Apple Silicon GPU) |
Models are organized by memory tier - from 4GB RAM (1B parameter models) to 32GB+ (14B+ parameter models). You can search and download models directly from HuggingFace within the Moltis web UI.
This means you can run a fully offline AI assistant: Moltis on your hardware, the LLM on your hardware, all data local. Zero external API calls.
Voice I/O
Moltis includes built-in speech-to-text and text-to-speech. Multiple TTS providers are supported, and voice messages from Telegram, Discord, Matrix, and WhatsApp can be automatically transcribed.
Proactive Heartbeat
Moltis runs a periodic heartbeat (every 30 minutes by default, configurable) where the LLM checks if anything needs your attention - inbox, calendar, reminders - and only notifies you when something actually does. Instead of you checking five dashboards, the assistant watches them and pings you on your preferred channel when something matters.
Additional Features
- Browser automation - headless Chromium control for web scraping and interaction
- Cron scheduling - schedule recurring tasks and reminders
- Webhooks - receive events from external services (GitHub, Stripe, Sentry, etc.)
- Hooks - 15 lifecycle events where you can observe, modify, or block actions
- Session branching - fork conversations to explore alternatives
- CalDAV - integrate calendar data
- GraphQL API - programmatic access to all Moltis functionality
- Cross-session recall - search past sessions for relevant context without dumping raw history into every prompt
- Encryption at rest - vault-backed secret storage with secrecy::Secret zeroing
- SSRF protection - DNS-resolved blocking of loopback, private, link-local, and CGNAT ranges
- Tailscale integration - expose the gateway over your tailnet via Tailscale Serve or Funnel, with status monitoring and mode switching from the web UI
- Observability - Prometheus metrics, OpenTelemetry tracing with OTLP export, and structured logging for production deployments
- PWA - installable web app with push notifications on mobile devices
- Onboarding wizard - first-run setup walks you through configuring agent identity (name, emoji, creature, vibe) and your user profile, with TOML config validation and typo detection
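The hooks entry in the list above says an action can be observed, modified, or blocked. A minimal way to model those three outcomes is sketched here; the type and function names are hypothetical and the example hooks are invented for illustration, so this should not be read as the Moltis hook API.

```rust
/// What a hook may decide about a pending action.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HookOutcome {
    /// Observe only and let the action proceed unchanged.
    Continue,
    /// Replace the payload before it reaches the next hook.
    Modify(String),
    /// Stop the action and report why.
    Block(String),
}

pub type Hook = fn(&str) -> HookOutcome;

/// Run hooks in order. Returns the final payload, or the reason given by
/// the first hook that blocked.
pub fn run_hooks(hooks: &[Hook], payload: &str) -> Result<String, String> {
    let mut current = payload.to_owned();
    for hook in hooks {
        match hook(&current) {
            HookOutcome::Continue => {}
            HookOutcome::Modify(next) => current = next,
            HookOutcome::Block(reason) => return Err(reason),
        }
    }
    Ok(current)
}

/// Example hook: drop a leading `sudo ` from a shell command.
pub fn strip_sudo(command: &str) -> HookOutcome {
    match command.strip_prefix("sudo ") {
        Some(rest) => HookOutcome::Modify(rest.to_owned()),
        None => HookOutcome::Continue,
    }
}

/// Example hook: refuse recursive force-deletes.
pub fn deny_rm_rf(command: &str) -> HookOutcome {
    if command.contains("rm -rf") {
        HookOutcome::Block("destructive command".to_owned())
    } else {
        HookOutcome::Continue
    }
}
```

Ordering matters in a chain like this: a later hook sees the payload as rewritten by earlier ones, so `deny_rm_rf` still catches a command after `strip_sudo` has modified it.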
How Moltis Works: Technical Architecture
Under the hood, Moltis is structured as a Rust workspace with 59 crates:
    INGRESS: Web UI (PWA)           INGRESS: Channels
    +-------------------+      +--------------------------+
    |   HTTP Gateway    |      | Telegram  Discord  Slack |
    | (TLS, WebSocket,  |      | WhatsApp  Matrix   Teams |
    |  REST API)        |      | Signal    Nostr    Phone |
    +---------+---------+      +-------------+------------+
              |                              |
              +---------------+--------------+
                              |
                              v
   +-----------------------------------------------------+
   |                     Agent Loop                      |
   |       (orchestrates turns, streams responses        |
   |                back to all channels)                |
   +-+------------+-----------+-----------+------------+-+
     |            |           |           |            |
     v            v           v           v            v
+----------+  +--------+  +--------+  +--------+  +---------+
| Provider |  | Memory |  | Tools  |  | MCP    |  | Cron    |
| Registry |  | (FTS5  |  | &      |  | Bridge |  | Heart-  |
|          |  | + vec) |  | Skills |  |        |  | beat    |
| Anthropic|  +--------+  +---+----+  +--------+  +---------+
| OpenAI   |                  |
| Gemini   |                  v
| + 12 more|         +------------------+
+----------+         |     Sandbox      |
                     | Docker / Podman  |
                     | Apple Container  |
                     | WASM / Remote    |
                     +------------------+
                              |
              +---------------+---------------+
              |        SQLite Database        |
              |  sessions - memory - vectors  |
              |   hooks - cron - embeddings   |
              +-------------------------------+
Provider abstraction: Moltis doesn't lock you into one LLM. It supports 15+ providers through a trait-based architecture - Anthropic, OpenAI, Google Gemini, DeepSeek, Mistral, Groq, xAI, OpenRouter, Cerebras, MiniMax, Moonshot, Venice, Z.AI, plus OAuth providers (OpenAI Codex, GitHub Copilot) and local providers (Ollama, LM Studio, built-in GGUF/MLX). You switch providers by changing a config value. Any OpenAI-compatible endpoint can be added with a custom- prefix. OpenAI Batch API support gives you 50% cost savings on eligible calls.
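A trait-based registry of the kind described can be sketched in a few lines. Everything named here (`Provider`, `OpenAiCompatible`, `Registry`, `register_custom`) is hypothetical, as is the base URL in the usage note; the article does not show the real trait, so this only illustrates how a config value could select an implementation and how a custom- prefix could namespace user-added endpoints.

```rust
use std::collections::HashMap;

/// Hypothetical provider interface.
pub trait Provider {
    fn id(&self) -> &str;
    fn describe(&self) -> String;
}

/// Any endpoint that speaks the OpenAI wire format.
pub struct OpenAiCompatible {
    pub id: String,
    pub base_url: String,
}

impl Provider for OpenAiCompatible {
    fn id(&self) -> &str {
        &self.id
    }

    fn describe(&self) -> String {
        format!("{} via {}", self.id, self.base_url)
    }
}

#[derive(Default)]
pub struct Registry {
    providers: HashMap<String, Box<dyn Provider>>,
}

impl Registry {
    /// Store a provider under its own id.
    pub fn register(&mut self, provider: Box<dyn Provider>) {
        self.providers.insert(provider.id().to_owned(), provider);
    }

    /// Register a user-supplied endpoint under a `custom-` prefixed id.
    pub fn register_custom(&mut self, name: &str, base_url: &str) {
        self.register(Box::new(OpenAiCompatible {
            id: format!("custom-{name}"),
            base_url: base_url.to_owned(),
        }));
    }

    /// Look up the provider named by a config value.
    pub fn get(&self, id: &str) -> Option<&dyn Provider> {
        Some(self.providers.get(id)?.as_ref())
    }
}
```

After `register_custom("lab", "http://localhost:8080/v1")`, the provider is reachable as `custom-lab`, and switching providers amounts to passing a different id to `get`.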
Streaming-first design: Token streaming works on every provider, including when tools are enabled. Tool call arguments stream as deltas as they arrive - you see output immediately rather than waiting for the full response to complete. This applies across all channels, not just the web UI.
Memory architecture: All persistence lives in SQLite. Vector embeddings are stored alongside the database file. Full-text search uses SQLite's FTS5. No external vector database, no Redis, no separate search service. The agent runner itself is ~7.5K lines of Rust, with provider implementations in ~19K more. The remaining ~243K lines cover the web UI, tools, channels, sandbox backends, and supporting infrastructure.
Security model: Moltis generates a self-signed CA on first run for local TLS. Authentication uses password + WebAuthn passkeys + scoped API keys. Secrets go through secrecy::Secret wrappers that zero memory on drop. The SSRF filter resolves DNS and blocks loopback, private, link-local, and CGNAT ranges. Rate limiting is built in with per-IP throttling and strict login protection. There are 15 lifecycle hook events where actions can be inspected, modified, or blocked before execution. Release artifacts are signed with Sigstore/Cosign (keyless signing), Docker images ship with SBOM and provenance attestations, and workspace lints deny unsafe_code, unwrap_used, and expect_used by default.
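The SSRF rule can be approximated with the standard library's address types. This is a sketch of the idea, not the Moltis filter: the function names are hypothetical, the IPv6 handling (unique-local and IPv4-mapped addresses) is an assumption about what "private" should cover, and a real filter must also make sure the connection goes to the same address that was checked.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, ToSocketAddrs};

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    let octets = ip.octets();
    ip.is_loopback()          // 127.0.0.0/8
        || ip.is_private()    // 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
        || ip.is_link_local() // 169.254.0.0/16
        || (octets[0] == 100 && (octets[1] & 0xC0) == 0x40) // CGNAT 100.64.0.0/10
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // Treat ::ffff:a.b.c.d like the IPv4 address it wraps.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()                  // ::1
        || (first & 0xfe00) == 0xfc00 // unique local fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local fe80::/10
}

/// True when the address falls in a range the filter refuses to contact.
pub fn is_blocked(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}

/// Resolve the host and allow it only if every resolved address is allowed.
pub fn host_is_allowed(host: &str, port: u16) -> std::io::Result<bool> {
    let addrs: Vec<_> = (host, port).to_socket_addrs()?.collect();
    Ok(!addrs.is_empty() && addrs.iter().all(|addr| !is_blocked(addr.ip())))
}
```

Checking after resolution is the important part: a hostname that looks public can still resolve to 169.254.169.254 or 127.0.0.1, so filtering on the name alone is not enough.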
Self-Hosting Moltis
Moltis is open-source and designed to be self-hosted. Here's what you need.
Installation
The fastest way to get started on macOS or Linux:
curl -fsSL https://www.moltis.org/install.sh | sh
Or via Homebrew:
brew install moltis-org/tap/moltis
Or via Docker:
docker pull ghcr.io/moltis-org/moltis:latest
Linux packages (.deb, .rpm, .pkg.tar.zst, Snap, AppImage) are also available through the installer script. You can even build from source with Rust 1.91+.
Running with Docker
docker run -d \
  --name moltis \
  -p 13131:13131 \
  -p 13132:13132 \
  -p 1455:1455 \
  -v moltis-config:/home/moltis/.config/moltis \
  -v moltis-data:/home/moltis/.moltis \
  ghcr.io/moltis-org/moltis:latest
Ports:
| Port | Purpose |
|---|---|
| 13131 | Gateway - web UI, API, WebSocket (HTTPS by default) |
| 13132 | CA certificate download for local TLS trust |
| 1455 | OAuth callback (required for OpenAI Codex and similar providers) |
Volume mounts:
| Path | Contents |
|---|---|
| /home/moltis/.config/moltis | Configuration: moltis.toml, credentials, MCP server config |
| /home/moltis/.moltis | Runtime data: databases, sessions, memory files, models, logs |
Then open https://localhost:13131 in your browser and follow the onboarding: configure your LLM provider, optionally add a passkey, and start chatting.
Note: On localhost, no authentication is required. If you access Moltis from a different machine, a setup code is printed to the container logs.
First Run
After installation, simply run:
moltis
You'll see:
Moltis gateway starting...
Open http://localhost:13131 in your browser
Configure a provider (API key, OAuth, or local LLM), and you're chatting. The fastest path is setting an environment variable like ANTHROPIC_API_KEY or OPENROUTER_API_KEY before starting - models appear automatically in the picker.
What you manage yourself
Self-hosting means you handle:
- TLS certificates - for public-facing deployments, use a reverse proxy (Caddy, nginx, Traefik) with Let's Encrypt
- Updates - pull new images or re-run the install script
- Backups - the SQLite database + memory files in the data directory
- Security - network exposure, firewall, container isolation if multi-user
Managed Moltis with Nebula Deck
If self-hosting sounds like work but you still want the benefits of Moltis - your own instance, your own API keys, persistent memory, multi-channel access - Nebula Deck provides managed hosting.
What Nebula Deck handles
- Infrastructure - each user gets their own Moltis container with gVisor isolation
- TLS - automatic via Caddy + Cloudflare DNS-01
- Updates - self-service with rollback (you choose when, not forced reboots)
- Security - gVisor-isolated containers, no Docker socket access for tenants, SSRF-filtered networking
- Backups - planned for post-launch
What you get beyond vanilla Moltis
- Coding agents - on-demand OpenHands instances for autonomous coding tasks (Standard headless: $0.05/hr, Full GUI: $0.15/hr, per-second billing with 60s minimum)
- Browser sessions - isolated Chromium for web automation and scraping ($0.05/hr)
- Web search - shared SearXNG instance providing unlimited search at zero marginal cost (rolling out at launch)
- MCP bridge - agent and browser session orchestration through MCP endpoints
- Polar.sh billing - integrated usage-based billing
- Zitadel OAuth - platform-level authentication
Pricing
| Component | Cost |
|---|---|
| Deck (always-on Moltis, 256MB, 1GB storage) | $7/mo |
| Compute tiers (agents + browser, per-second billing, 60s min) | |
| - Developer (3 concurrent, $10 credit) | $15/mo |
| - Studio (5 concurrent, $35 credit) | $39/mo |
| - Observatory (10 concurrent, $100 credit) | $99/mo |
| Standard container rate (headless agent or browser) | $0.05/hr |
| Full container rate (GUI agent + browser) | $0.15/hr |
| LLM tokens | BYOK - provider rates, no markup |
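The per-second billing with a 60-second minimum works out as follows. The rates come from the table; the function names are hypothetical, and two points are assumptions: that the minimum applies per session, and that tier credit is spent at the listed container rates.

```rust
/// Hourly rates from the pricing table, in US dollars.
pub const STANDARD_RATE: f64 = 0.05;
pub const FULL_RATE: f64 = 0.15;

/// Shortest billable session, in seconds.
const MIN_BILLED_SECONDS: u64 = 60;

/// Cost of one session billed per second with a 60-second floor.
pub fn session_cost(rate_per_hour: f64, seconds: u64) -> f64 {
    let billed = seconds.max(MIN_BILLED_SECONDS);
    rate_per_hour * billed as f64 / 3600.0
}

/// Container hours that a given credit buys at a given hourly rate.
pub fn hours_covered(credit: f64, rate_per_hour: f64) -> f64 {
    credit / rate_per_hour
}
```

Under these assumptions a 10-second browser session costs the same as a 60-second one (about $0.0008), a 30-minute GUI agent run costs $0.075, and the Developer tier's $10 credit covers 200 hours of standard containers.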
Importing from OpenClaw
Moltis has a built-in OpenClaw import system - it's a core feature of Moltis itself, not just Nebula Deck. If you have an existing OpenClaw installation, Moltis automatically detects it and can import:
| Category | What gets imported |
|---|---|
| Identity | Agent name, theme, timezone |
| Providers | API keys (mapped to Moltis equivalents) |
| Skills | All skill directories with SKILL.md |
| Memory | MEMORY.md and all memory/*.md files |
| Channels | Telegram and Discord bot configs |
| Sessions | Full conversation history (JSONL -> Moltis format) |
| MCP Servers | Server configurations |
| Workspace files | SOUL.md, IDENTITY.md, USER.md, TOOLS.md, AGENTS.md, etc. |
The import is strictly read-only - your OpenClaw installation is never modified. You can import via:
- Web UI - during onboarding or from Settings > OpenClaw Import
- CLI - moltis import detect, moltis import all, or moltis import select -c providers,skills,memory
- RPC - programmatic access via openclaw.detect and openclaw.import methods
There's even automatic background syncing: if you continue using OpenClaw after import, Moltis watches for new session files and syncs them incrementally within seconds.
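Incremental syncing that is safe to repeat usually comes down to remembering what has already been imported. The sketch below shows that idea with a set of session identifiers; `SessionStore` and the file names in the usage note are hypothetical, and the article does not describe how Moltis actually tracks imported sessions.

```rust
use std::collections::HashSet;

/// Hypothetical record of which OpenClaw session files were already imported.
#[derive(Default)]
pub struct SessionStore {
    imported: HashSet<String>,
}

impl SessionStore {
    /// Import only sessions not seen before and return how many were new.
    /// Running it again with the same input changes nothing.
    pub fn import(&mut self, session_ids: &[&str]) -> usize {
        session_ids
            .iter()
            .filter(|id| self.imported.insert((**id).to_owned()))
            .count()
    }

    /// Total number of sessions imported so far.
    pub fn len(&self) -> usize {
        self.imported.len()
    }
}
```

Importing `a.jsonl` and `b.jsonl` twice reports two new sessions and then zero; a later pass over `b.jsonl` and `c.jsonl` picks up only `c.jsonl`.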
FAQ
Is Moltis free and open-source?
Yes. Moltis is open-source at github.com/moltis-org/moltis with 2.7k+ stars. You can self-host it for free - you only pay for your server and LLM API usage. If you want managed hosting with extra features like coding agents and browser sessions, Nebula Deck provides that starting at $7/mo.
Is Moltis production-ready?
Moltis is officially alpha software. The project's documentation advises treating it accordingly: run in isolated environments, review your tool/provider configuration, keep secrets rotated, and don't expose it publicly without strong authentication. That said, it's actively developed with 3,700+ commits and a growing community.
Can I import my data from OpenClaw?
Yes - and it's built into Moltis itself, not just the managed offering. Moltis automatically detects an existing OpenClaw installation and can import identity, providers, skills, memory, sessions, channels, MCP servers, and workspace files. The import is read-only and idempotent, so you can safely run it alongside OpenClaw and re-import to pick up new data.
Does Moltis work with local LLMs?
Yes. Moltis has two built-in local inference backends: GGUF (powered by llama.cpp, cross-platform with GPU acceleration) and MLX (Apple Silicon native). It also works with Ollama, LM Studio, or any OpenAI-compatible local server. This gives you a fully offline AI assistant - the agent and the LLM both run on your hardware, with zero external API calls.
How is Moltis different from Hermes Agent?
Both are self-hostable AI assistants with multi-channel messaging, persistent memory, MCP support, and self-improving skills. The key difference is focus: Moltis prioritizes a minimal trusted runtime (Rust binary, ~256MB, fast startup) and multi-tenant density. Hermes Agent is Python-first with a research-oriented learning loop - trajectory generation, RL environments, and user modeling. Both have autonomous skill improvement (Moltis enables it by default). The practical difference: Moltis is easier to deploy on small VPS instances with no runtime to install, while Hermes is more interesting if your priority is the research and reinforcement feedback loop.
How is Moltis different from OpenClaw?
OpenClaw is a TypeScript-based ecosystem with companion apps (macOS, iOS, Android), a broad plugin system, and ~1M+ lines of code. Moltis is a Rust single binary with ~270K lines, designed for minimal runtime overhead. Both support channels, tools, skills, and memory. The practical difference: OpenClaw has a larger feature surface and native mobile apps, while Moltis has a smaller trusted computing base, faster startup, and lower resource requirements - making it better suited for dense multi-tenant hosting.
Want to try Moltis without setting up a server? Nebula Deck provides managed Moltis instances with on-demand coding agents, browser sessions, and SearXNG-powered web search - starting at $7/mo.