A small, focused CLI for managing local LLM servers. Define your models once in
TOML, then start, stop, restart, inspect status, tail logs, or drop
into a live tui dashboard — without remembering which llama-server flags you
used last week.
$ modelctl status
╭─────────────────┬───────────┬─────────────┬─────────┬───────┬───────────────────────╮
│ name            │ runtime   │ location    │ status  │ pid   │ endpoint              │
├─────────────────┼───────────┼─────────────┼─────────┼───────┼───────────────────────┤
│ gemma-4-26b-a4b │ llama_cpp │ local       │ stopped │ -     │ http://127.0.0.1:8001 │
│ gemma-4-31b     │ llama_cpp │ local       │ running │ 86515 │ http://127.0.0.1:8002 │
│ shadow-gpt-oss  │ llama_cpp │ ssh:shadow  │ running │ 33195 │ http://127.0.0.1:8003 │
╰─────────────────┴───────────┴─────────────┴─────────┴───────┴───────────────────────╯
modelctl is a personal launcher, not a replacement for Ollama or LM Studio.
It stays out of the way: one config file, one PID per model, one log file per
model, one binary. Write-once configuration, reproducible launches, and an
opinionated but quiet TUI.
- Runtime-agnostic core, `llama.cpp` adapter first. The `Runtime` trait makes it straightforward to add `mlx-lm`, `vllm`, or any forked `llama-server` (e.g. PrismML's 1-bit Bonsai fork) as a new adapter.
- Detached processes with PID and log tracking. Servers keep running after you close the shell. Stale PID files are detected and cleaned automatically.
- OpenAI-compatible API out of the box. Because the first runtime is `llama-server`, every model you launch speaks `/v1/chat/completions` at the configured host and port.
- Live `tui` dashboard. Ratatui-based, feature-gated, read-only by design. Shows configured models, live status, and a tailed log pane for the selected row. Colors align with macmon: terminal-native `Color::Green`, rounded borders, no forced backgrounds.
- System metrics pane (Apple Silicon). When built with the default `metrics` feature on an Apple Silicon Mac, the TUI shows live sparkline charts for CPU %, GPU %, RAM usage, and power draw via macmon. Useful for watching how a model load affects your machine without switching terminals.
- Shell-expanded config paths. `~/llama.cpp/build/bin/llama-server` works as-is; no need to hardcode absolute paths.
- Optional remote-host mode via ssh. Add `ssh_host = "alias-or-user@host"` to a `[models.<name>]` block and modelctl manages that model on the remote machine (start/stop/restart/status/logs) over ssh, without duplicating config between hosts. Uses key-based auth with `BatchMode=yes`; remote logs land at `/tmp/modelctl-<name>.log` on the target host and are streamed locally via `modelctl logs -f`.
- Single ~2 MB release binary. No Python interpreter, no virtualenv, no runtime dependencies beyond libc.
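Since every launched model exposes a plain OpenAI-compatible HTTP endpoint, you can probe it from your own tooling. A minimal stdlib-only sketch (not modelctl code) that checks whether anything is accepting connections at a configured endpoint:

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Return true if something is accepting TCP connections at `addr`.
/// A running llama-server will accept; a stopped model will not.
fn endpoint_up(addr: &str) -> bool {
    match addr.parse::<SocketAddr>() {
        // Short timeout so a dead endpoint fails fast.
        Ok(sock) => TcpStream::connect_timeout(&sock, Duration::from_millis(300)).is_ok(),
        Err(_) => false,
    }
}

fn main() {
    println!("8002 up: {}", endpoint_up("127.0.0.1:8002"));
}
```

This only confirms the port is open, not that the model has finished loading; `modelctl logs` is the place to watch for the server's ready message.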
git clone https://github.com/wrale/modelctl.git
cd modelctl
cargo install --path .

The release build lands at ~/.cargo/bin/modelctl (~2 MB).
- Rust 1.80+ for the `cargo install` path
- macOS or Linux (tested on macOS; Linux should work: the `dirs` crate resolves state and config paths correctly on both)
- A configured runtime binary: modelctl launches external servers. For the `llama_cpp` adapter, a working `llama-server` must exist at the path you point `binary` to in the config.
# Create a starter config with a placeholder Gemma 4 entry
modelctl config init
# Edit it to match your local model files
$EDITOR "$(modelctl config path)"
# Launch
modelctl start gemma-4-31b
modelctl status
modelctl logs gemma-4-31b -f
# Live dashboard
modelctl tui
# Stop
modelctl stop gemma-4-31b

modelctl reads its config from the platform-native config directory:
| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/modelctl/models.toml |
| Linux | ~/.config/modelctl/models.toml |
Run modelctl config path to print the resolved location.
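The platform branch in the table above can be approximated in a few lines. This is an illustrative stdlib sketch of what the `dirs` crate resolves; the real crate also honors overrides like `$XDG_CONFIG_HOME` on Linux:

```rust
use std::path::PathBuf;

/// Approximate the path `modelctl config path` would print.
/// Sketch only: the real implementation delegates to the `dirs` crate.
fn models_toml(home: &str) -> PathBuf {
    let base = if cfg!(target_os = "macos") {
        format!("{home}/Library/Application Support")
    } else {
        format!("{home}/.config")
    };
    PathBuf::from(base).join("modelctl").join("models.toml")
}

fn main() {
    let home = std::env::var("HOME").unwrap_or_else(|_| "/root".into());
    println!("{}", models_toml(&home).display());
}
```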
[models.gemma-4-26b-a4b]
runtime = "llama_cpp"
binary = "~/llama.cpp/build/bin/llama-server"
model = "~/models/gemma-4-26B-A4B-it-UD-Q4_K_XL.gguf"
host = "127.0.0.1"
port = 8001
extra_args = [
"--temp", "1.0",
"--top-p", "0.95",
"--top-k", "64",
"--reasoning", "on",
"-ngl", "99",
]
[models.gemma-4-31b]
runtime = "llama_cpp"
binary = "~/llama.cpp/build/bin/llama-server"
model = "~/models/gemma-4-31B-it-UD-Q4_K_XL.gguf"
host = "127.0.0.1"
port = 8002
extra_args = [
"--temp", "1.0",
"--top-p", "0.95",
"--top-k", "64",
"--reasoning", "on",
"-ngl", "99",
"--ctx-size", "8192",
"--cache-type-k", "q8_0",
"--cache-type-v", "q8_0",
]

| Field | Required | Description |
|---|---|---|
| `runtime` | yes | Adapter name. Currently `llama_cpp`. |
| `binary` | yes | Path to the server executable. `~` is expanded locally; for `ssh_host` entries the path is on the remote host and not expanded locally. |
| `model` | yes | Path to the model file (e.g. a GGUF). Same local-vs-remote semantics as `binary`. |
| `host` | no | Bind host. Passed as `--host` if set. |
| `port` | no | Bind port. Passed as `--port` if set. |
| `extra_args` | no | Array of additional flags passed through verbatim. |
| `ssh_host` | no | If set, the model is managed on a remote host via ssh (alias from `~/.ssh/config` or user@host). See "Remote hosts" below. |
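The local tilde expansion mentioned above can be sketched as follows. This is illustrative, not modelctl's actual implementation, which may use a helper crate:

```rust
use std::path::PathBuf;

/// Expand a leading `~` against the given home directory, the way
/// modelctl treats local `binary`/`model` paths. Paths belonging to
/// `ssh_host` entries are left untouched.
fn expand_tilde(path: &str, home: &str) -> PathBuf {
    if let Some(rest) = path.strip_prefix("~/") {
        PathBuf::from(home).join(rest)
    } else {
        PathBuf::from(path)
    }
}

fn main() {
    let p = expand_tilde("~/llama.cpp/build/bin/llama-server", "/Users/me");
    println!("{}", p.display());
}
```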
Add an ssh_host field to any [models.<name>] block to have modelctl
manage that model on a remote machine over ssh instead of locally:
[models.remote-gpt-oss-20b]
runtime = "llama_cpp"
ssh_host = "shadowfax" # alias from ~/.ssh/config, or user@host
binary = "/home/josh/src/llama.cpp/build-cuda12/bin/llama-server"
model = "/home/josh/models/gpt-oss-20b-F16.gguf"
host = "127.0.0.1"
port = 8003
extra_args = [
"--jinja",
"-ngl", "99",
"--ctx-size", "32768",
"--cache-type-k", "q8_0",
"--cache-type-v", "q8_0",
"--parallel", "1",
]

Semantics:
- `binary` and `model` are paths on the remote filesystem. They are not expanded or checked locally.
- `start` spawns the server on the remote host via `ssh <host> 'setsid ... & echo $!'`, captures the remote PID, and stores it in the local state dir so subsequent `stop`/`status`/`restart` operations know what to target.
- `stop` runs `ssh <host> 'kill <pid>'`, polls via `kill -0`, and escalates to `kill -9` after ~6 seconds if needed. Same timing as the local path.
- `status` checks liveness with `ssh <host> 'kill -0 <pid>'` under a short connect timeout so unreachable hosts fail fast instead of hanging the table.
- `logs` streams `ssh <host> 'tail -F /tmp/modelctl-<name>.log'` for follow mode, or `cat` for a one-shot read. Remote logs live at a predictable path on the target host.
- `restart` sequences the remote stop and remote start correctly.
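The remote-start step reduces to building an `ssh` argv. A hypothetical sketch based only on the semantics described here; the function name and exact flag layout are illustrative, not modelctl's source:

```rust
/// Build the argv a modelctl-style remote start would hand to `ssh`.
/// Hypothetical sketch: names and flag layout are illustrative.
fn remote_start_argv(host: &str, name: &str, server_cmd: &str) -> Vec<String> {
    let log = format!("/tmp/modelctl-{name}.log");
    vec![
        "ssh".into(),
        "-o".into(),
        "BatchMode=yes".into(), // fail fast instead of prompting for a password
        host.into(),
        // setsid detaches the server from the ssh session;
        // `echo $!` reports the remote PID back for the local state dir.
        format!("setsid {server_cmd} >{log} 2>&1 & echo $!"),
    ]
}

fn main() {
    for arg in remote_start_argv("shadowfax", "remote-gpt-oss-20b", "llama-server -m model.gguf") {
        println!("{arg}");
    }
}
```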
Requirements for the ssh_host path:
- Key-based ssh auth to the target host: modelctl invokes ssh with `BatchMode=yes`, so password prompts will fail fast.
- An ssh alias or user@host that resolves to your target. If you use an alias from `~/.ssh/config`, any `LocalForward`, `ProxyJump`, or `IdentityFile` settings there are honored.
- Writable `/tmp` on the remote host for the log file.
Remote entries show up in `modelctl status` with `ssh:<host>` in the
location column, so it's obvious which machine each model lives on; local
entries show `local` there. The endpoint column shows the service's
listen address, which for a remote entry is on the remote host's loopback
and not directly reachable from your machine. Reach it from the local
side via an ssh `LocalForward`, or use `modelctl logs` / `modelctl status`, which tunnel through ssh themselves.
Only the llama_cpp runtime supports remote mode today. Other runtimes
work locally as always but will error on start if given an ssh_host.
| Command | Behavior |
|---|---|
| `modelctl start <name>` | Spawn the configured server detached. Writes PID + log. |
| `modelctl stop <name>` | SIGTERM, falls back to SIGKILL after ~6s. |
| `modelctl restart <name>` | Stop (if running) then start. |
| `modelctl status` / `ls` | Table of all configured models with live status. |
| `modelctl logs <name> [-f]` | Print or tail the log file. |
| `modelctl config path` | Print the config file path. |
| `modelctl config init` | Write a starter config if none exists. |
| `modelctl tui` | Launch the live dashboard (feature-gated). |
| `modelctl about` | Print license and third-party attribution. |
Start and stop are idempotent: starting a running model errors with its PID; stopping a stale PID clears the file.
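The idempotency rule amounts to a small decision over the recorded PID. A hedged sketch; the enum and names are hypothetical, not modelctl's API:

```rust
/// What `start` does given the recorded PID state.
/// Illustrative names only; modelctl's internals may differ.
#[derive(Debug, PartialEq)]
enum StartAction {
    Spawn,               // no PID file: launch fresh
    AlreadyRunning(u32), // live PID: refuse, reporting the PID
    ClearStaleAndSpawn,  // PID file exists but the process is gone
}

fn plan_start(recorded_pid: Option<u32>, is_alive: impl Fn(u32) -> bool) -> StartAction {
    match recorded_pid {
        None => StartAction::Spawn,
        Some(pid) if is_alive(pid) => StartAction::AlreadyRunning(pid),
        Some(_) => StartAction::ClearStaleAndSpawn,
    }
}

fn main() {
    // Pretend only PID 42 is alive.
    let alive = |p: u32| p == 42;
    println!("{:?}", plan_start(Some(42), alive));
    println!("{:?}", plan_start(Some(7), alive));
}
```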
Feature-gated behind two default cargo features: tui (ratatui + crossterm)
and metrics (macmon, Apple Silicon only). To build without the dashboard
and metrics pane (smaller binary, broader platform support):
cargo install --path . --no-default-features

modelctl v0.1.0                                                     0 running   3 configured
╭ models ──────────────────────────────────────────────────────────────────────────────────╮
│     name            runtime    status   pid    endpoint                                  │
│ › ○ bonsai-8b       llama_cpp  stopped  —      http://127.0.0.1:8005                     │
│   ○ gemma-4-26b-a4b llama_cpp  stopped  —      http://127.0.0.1:8001                     │
│   ○ gemma-4-31b     llama_cpp  stopped  —      http://127.0.0.1:8002                     │
╰──────────────────────────────────────────────────────────────────────────────────────────╯
╭ log bonsai-8b ───────────────────────────────────────────────────────────────────────────╮
│slot update_slots: id 2 | task 432 | n_tokens = 276, memory_seq_rm [276, end) │
│slot init_sampler: id 2 | task 432 | init sampler, took 0.04 ms, tokens: text = 277, │
│total = 277 │
│slot update_slots: id 2 | task 432 | prompt processing done, n_tokens = 277, │
│batch.n_tokens = 4 │
│srv params_from_: Chat format: Hermes 2 Pro │
│slot print_timing: id 1 | task 383 | │
╰──────────────────────────────────────────────────────────────────────────────────────────╯
╭ CPU 3% ───── 41°C ╮╭ GPU 1% ───── 38°C ╮╭ RAM 15.1/32G ── 47% ╮╭ PWR 0.1W ───────────╮
│ ││ ││ ││ │
│ ││ ││ ││ │
│ ││ ││▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂││ │
│ ││ ││█████████████████████││ │
│▁▁▂▂▁▁▂▁▁▂ ▁▁▁▂▃▂▂▂▂▁││ ▁ ▂ ▁▃▁▁ ││█████████████████████││▁▁▁▁▁▁▁▁ ▂ ▁ ▁▁▂▁▃▃▃ │
╰─────────────────────╯╰─────────────────────╯╰─────────────────────╯╰─────────────────────╯
↑↓/jk select   g/G top/bottom   q/Esc quit                            use CLI for start/stop
The metrics pane appears only on Apple Silicon Macs (the metrics feature
uses macmon's IOKit bindings, which require aarch64-apple-darwin). On
Intel Macs and Linux the TUI renders without it.
| Key | Action |
|---|---|
| ↑/↓, j/k | Select model |
| g/Home | Jump to first row |
| G/End | Jump to last row |
| q/Esc | Quit |
Log lines are colorized by heuristic keyword match:
| Keyword | Rendering |
|---|---|
| `error` / `fatal` / `panic` | reversed accent (alarm) |
| `warn` | bold accent |
| `listening` / `loaded` / `ready` | accent |
| `slot` / `init` / `srv` | dimmed |
| everything else | terminal default |
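The heuristic can be sketched as a keyword classifier. Precedence here (alarm first, dimmed last) is an assumption, and the real TUI code may match differently, e.g. on token position or case:

```rust
/// Style buckets from the table above.
#[derive(Debug, PartialEq)]
enum LogStyle { Alarm, Warn, Accent, Dimmed, Default }

/// Heuristic keyword match over a log line. Sketch only; the actual
/// modelctl matching rules may differ.
fn classify(line: &str) -> LogStyle {
    let l = line.to_lowercase();
    if ["error", "fatal", "panic"].iter().any(|k| l.contains(k)) {
        LogStyle::Alarm
    } else if l.contains("warn") {
        LogStyle::Warn
    } else if ["listening", "loaded", "ready"].iter().any(|k| l.contains(k)) {
        LogStyle::Accent
    } else if ["slot", "init", "srv"].iter().any(|k| l.contains(k)) {
        LogStyle::Dimmed
    } else {
        LogStyle::Default
    }
}

fn main() {
    println!("{:?}", classify("srv log: server is listening on 127.0.0.1:8002"));
}
```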
modelctl/
├── Cargo.toml
├── src/
│ ├── main.rs # clap dispatch
│ ├── config.rs # TOML loading, shell expansion, starter template
│ ├── state.rs # PID file, log file, process liveness via nix
│ ├── runtime/
│ │ ├── mod.rs # `trait Runtime { build_command }`
│ │ └── llama_cpp.rs # first adapter
│ ├── cmd/ # start, stop, restart, status, logs, config, about
│ ├── tui.rs # ratatui dashboard (feature "tui")
│ └── metrics.rs # macmon system metrics (feature "metrics")
└── .gitignore
- Create `src/runtime/<name>.rs` implementing `Runtime::build_command`.
- Register it in `runtime::for_name`.
- Reference it as `runtime = "<name>"` in `models.toml`.
That's the entire contract. The top-level start/stop/restart code paths
are runtime-agnostic — they just spawn whatever Command the runtime returns.
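A minimal sketch of that contract. Field and method names here are illustrative; check `src/runtime/mod.rs` for the real signatures:

```rust
use std::process::Command;

/// Illustrative subset of a model's config; the real struct lives in config.rs.
struct ModelConfig {
    binary: String,
    model: String,
    host: Option<String>,
    port: Option<u16>,
    extra_args: Vec<String>,
}

/// The adapter contract: turn a config entry into a launchable Command.
trait Runtime {
    fn build_command(&self, cfg: &ModelConfig) -> Command;
}

struct LlamaCpp;

impl Runtime for LlamaCpp {
    fn build_command(&self, cfg: &ModelConfig) -> Command {
        let mut cmd = Command::new(&cfg.binary);
        cmd.arg("-m").arg(&cfg.model);
        if let Some(h) = &cfg.host {
            cmd.arg("--host").arg(h);
        }
        if let Some(p) = cfg.port {
            cmd.arg("--port").arg(p.to_string());
        }
        cmd.args(&cfg.extra_args); // passed through verbatim
        cmd
    }
}

fn main() {
    let cfg = ModelConfig {
        binary: "llama-server".into(),
        model: "model.gguf".into(),
        host: Some("127.0.0.1".into()),
        port: Some(8002),
        extra_args: vec!["-ngl".into(), "99".into()],
    };
    let cmd = LlamaCpp.build_command(&cfg);
    let args: Vec<String> = cmd.get_args().map(|a| a.to_string_lossy().into_owned()).collect();
    println!("{} {}", cmd.get_program().to_string_lossy(), args.join(" "));
}
```

A new adapter only has to implement `build_command`; the start/stop machinery never needs to know which runtime produced the `Command`.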
# Format + lint + test + build
cargo fmt --all
cargo clippy --all-targets -- -D warnings
cargo build --release
# Refresh third-party license bundle (required before committing any dep changes)
cargo install cargo-about # one-time
cargo about generate about.hbs > THIRD_PARTY_LICENSES.md

THIRD_PARTY_LICENSES.md is generated from Cargo.lock by
cargo-about and bundled into
the binary via include_str! so modelctl about prints everything at
runtime. Regenerate it whenever dependencies change.
Licensed under either of
- Apache License 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license, (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Third-party crates bundled into the release binary are listed in
THIRD_PARTY_LICENSES.md. The same content is
available at runtime with modelctl about.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in modelctl by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Copyright © 2026 Wrale Ltd.