lmcpp is both an executable binary that can be run and a library that can be used in Rust programs.
Installing the lmcpp-server-cli and lmcpp-toolchain-cli executables
Assuming you have Rust/Cargo installed, run this command in a terminal:
cargo install lmcpp
This makes the lmcpp-server-cli and lmcpp-toolchain-cli commands available in your PATH, provided you allowed your PATH to be modified when installing Rust. Run cargo uninstall lmcpp to remove them.
Adding lmcpp library as a dependency
Run this command in a terminal, in your project's directory:
cargo add lmcpp
To add it manually, edit your project's Cargo.toml file and add to the [dependencies] section:
lmcpp = "0.1.1"
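Put together, the relevant section of Cargo.toml looks like this (version taken from the line above):

```toml
[dependencies]
lmcpp = "0.1.1"
```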
The lmcpp library will then be available to use in your project.
Readme
lmcpp – llama.cpp's llama-server for Rust
Fully Managed
Automated Toolchain – Downloads, builds, and manages the llama.cpp toolchain with LmcppToolChain.
Supported Platforms – Linux, macOS, and Windows with CPU, CUDA, and Metal support.
Multiple Versions – Each release tag and backend is cached separately, allowing you to install multiple versions of llama.cpp.
Blazing Fast UDS
UDS IPC – Integrates with llama-server's Unix-domain-socket client on Linux, macOS, and Windows.
Fast! – Is it faster than HTTP? Yes. Is it measurably faster? Maybe.
Fully Typed / Fully Documented
Server Args – All llama-server arguments are implemented by ServerArgs.
Endpoints – Each endpoint has request and response types defined.
Good Docs – Every parameter was researched to improve upon the original llama-server documentation.
lmcpp-toolchain-cli – Manage the llama.cpp toolchain: download, build, cache.
lmcpp-server-cli – Start, stop, and list servers.
Easy Web UI – Use LmcppServerLauncher::webui to start with HTTP and the Web UI enabled.
use lmcpp::*;

fn main() -> LmcppResult<()> {
    let server = LmcppServerLauncher::builder()
        .server_args(
            ServerArgs::builder()
                .hf_repo("bartowski/google_gemma-3-1b-it-qat-GGUF")?
                .build(),
        )
        .load()?;

    let res = server.completion(
        CompletionRequest::builder()
            .prompt("Tell me a joke about Rust.")
            .n_predict(64),
    )?;

    println!("Completion response: {:#?}", res.content);
    Ok(())
}
# With the default model:
cargo run --bin lmcpp-server-cli -- --webui

# Or with a specific model from a URL:
cargo run --bin lmcpp-server-cli -- --webui -u https://huggingface.co/bartowski/google_gemma-3-1b-it-qat-GGUF/blob/main/google_gemma-3-1b-it-qat-Q4_K_M.gguf

# Or with a specific local model:
cargo run --bin lmcpp-server-cli -- --webui -l /path/to/local/model.gguf
How It Works
Your Rust App
│
├─→ LmcppToolChain (downloads / builds / caches)
│ ↓
├─→ LmcppServerLauncher (spawns & monitors)
│ ↓
└─→ LmcppServer (typed handle over UDS*)
│
├─→ completion() → text generation
└─→ other endpoints → stuff
Endpoints ⇄ Typed Helpers
| Platform    | CPU | CUDA | Metal | Binary Sources     |
|-------------|-----|------|-------|--------------------|
| Linux x64   | ✅  | ✅   | –     | Pre-built + Source |
| macOS ARM   | ✅  | –    | ✅    | Pre-built + Source |
| macOS x64   | ✅  | –    | ✅    | Pre-built + Source |
| Windows x64 | ✅  | ✅   | –     | Pre-built + Source |