This repository contains configuration and scripts for running an Ollama LLM server on Apple Silicon Macs in headless mode (tested on a Mac Studio with M1 Ultra).
This configuration is optimized for running a Mac Studio as a dedicated Ollama server, with:
- Headless operation (SSH access recommended)
- Minimal resource usage (GUI and unnecessary services disabled)
- Automatic startup and recovery
- Performance optimizations for Apple Silicon
- [v1.2.0] Added Docker autostart support for container applications (with Colima)
- [v1.1.0] Added GPU Memory Optimization - configure Metal to use more RAM for models
- [v1.0.0] Initial release with system optimizations and Ollama configuration
See the CHANGELOG for detailed version history.
- Automatic startup on boot
- Optimized for Apple Silicon
- System resource optimization through service disabling
- External network access
- Proper logging setup
- SSH-based remote management
- Docker autostart for container applications
- Mac with Apple Silicon
- macOS Sonoma or later
- Ollama installed
- Administrative privileges
- SSH enabled (System Settings → Sharing → Remote Login)
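A quick way to check these prerequisites from a terminal (output will vary by machine):

```
sw_vers -productVersion   # should report 14.x (Sonoma) or later
uname -m                  # should report arm64 on Apple Silicon
ollama --version          # confirms Ollama is installed and on your PATH
```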
For optimal performance, we recommend:
- Primary access method: SSH
```
ssh username@your-mac-studio-ip
```
- (Optional) Screen Sharing is kept available for emergency/maintenance access but is not recommended for regular use, to save resources.
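Once the service is installed, routine management can be done with one-off SSH commands rather than an interactive session; `username` and `your-mac-studio-ip` are placeholders for your own values:

```
# List the models installed on the server
ssh username@your-mac-studio-ip 'ollama list'

# Confirm the Ollama API is reachable from another machine on the network
curl http://your-mac-studio-ip:11434/api/version
```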
- Clone this repository:
```
git clone https://github.com/anurmatov/mac-studio-server.git
cd mac-studio-server
```
- (Optional) Configure the installation:
```
# Default values shown
export OLLAMA_USER=$(whoami) # User to run Ollama as
export OLLAMA_BASE_DIR="/Users/$OLLAMA_USER/mac-studio-server"
# Optional features - only set these if you need them
export OLLAMA_GPU_PERCENT="80" # Optional: Enable GPU memory optimization (percentage of RAM to allocate)
export DOCKER_AUTOSTART="true" # Optional: Enable automatic Docker startup
```
- Run the installation script:
```
chmod +x scripts/install.sh
./scripts/install.sh
```
The Ollama service is configured with the following optimizations (illustrative environment-variable equivalents are sketched after this list):
- External access enabled (0.0.0.0:11434)
- 8 parallel requests (adjustable)
- 30-minute model keep-alive
- Flash attention enabled
- Support for 4 simultaneously loaded models
- Model pruning disabled
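These settings map to Ollama's standard environment variables, which the LaunchDaemon plist passes to the server process. The sketch below is illustrative only; the exact values live in `config/com.ollama.service.plist` and may differ:

```
# Illustrative equivalents of the plist settings (example values, not the shipped defaults)
export OLLAMA_HOST="0.0.0.0:11434"     # listen on all interfaces for external access
export OLLAMA_NUM_PARALLEL="8"         # number of parallel requests
export OLLAMA_KEEP_ALIVE="30m"         # keep models loaded for 30 minutes after last use
export OLLAMA_FLASH_ATTENTION="1"      # enable flash attention
export OLLAMA_MAX_LOADED_MODELS="4"    # allow up to 4 models in memory at once
export OLLAMA_NOPRUNE="1"              # don't prune model blobs on startup
```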
To modify the Ollama service configuration:
- Edit the configuration file:
```
vim config/com.ollama.service.plist
```
- Apply the changes:
```
# Stop the current service
sudo launchctl unload /Library/LaunchDaemons/com.ollama.service.plist
# Copy the updated configuration
sudo cp config/com.ollama.service.plist /Library/LaunchDaemons/
# Set proper permissions
sudo chown root:wheel /Library/LaunchDaemons/com.ollama.service.plist
sudo chmod 644 /Library/LaunchDaemons/com.ollama.service.plist
# Load the updated service
sudo launchctl load -w /Library/LaunchDaemons/com.ollama.service.plist
```
- Check the logs for any issues:
```
tail -f logs/ollama.err logs/ollama.log
```
The installation process:
- Disables unnecessary system services
- Configures power management for server use
- Optimizes for background operation
- Maintains Screen Sharing capability for remote management
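After installation, or after reloading the service, you can confirm that Ollama is loaded and answering requests (assuming the default port 11434):

```
# Confirm the LaunchDaemon is loaded
sudo launchctl list | grep com.ollama

# Confirm the API responds locally
curl http://localhost:11434/api/version
```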
Log files are stored in the logs directory:
- `ollama.log` - Ollama service logs
- `ollama.err` - Ollama error logs
- `install.log` - Installation logs
- `optimization.log` - System optimization logs
This configuration significantly reduces system resource usage:
- Reduces memory usage from about 11GB to 3GB (tested on a Mac Studio M1 Ultra)
- Disables GUI-related services
- Minimizes background processes
- Prevents sleep/hibernation
- Optimizes for headless operation
The dramatic reduction in memory usage (around 8GB) is achieved by:
- Disabling Spotlight indexing
- Turning off unnecessary system services
- Minimizing GUI-related processes
- Optimizing for headless operation
By default, the Metal runtime allows GPU operations to use only about 75% of system RAM. This configuration includes an optional GPU memory optimization that:
- Runs at system startup (when enabled)
- Allocates a configurable percentage of your total RAM to GPU operations
- Logs the changes for monitoring
The GPU memory setting is critical for LLM performance on Apple Silicon, as it determines how much of your unified memory can be used for model operations.
This allows:
- More efficient model loading
- Better performance for large models
- Increased number of concurrent model instances
- Fuller utilization of Apple Silicon's unified memory architecture
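On macOS Sonoma, the GPU wired-memory cap is exposed through the `iogpu.wired_limit_mb` sysctl, where a value of 0 means the platform default of roughly 75% of RAM. You can inspect the current values like this:

```
sysctl iogpu.wired_limit_mb   # 0 = default cap (about 75% of RAM)
sysctl -n hw.memsize          # total physical memory, in bytes
```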
To enable and configure GPU memory optimization, set the environment variable before installation:
```
export OLLAMA_GPU_PERCENT="80" # Allocate 80% of RAM to GPU
./scripts/install.sh
```
Or, to adjust it after installation:
```
# Run with a custom percentage
OLLAMA_GPU_PERCENT=85 sudo ./scripts/set-gpu-memory.sh
```
If you don't set `OLLAMA_GPU_PERCENT`, GPU memory optimization will be skipped.
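As a rough illustration of what such an adjustment involves, here is a simplified sketch (not the repository's `scripts/set-gpu-memory.sh`); it assumes the `iogpu.wired_limit_mb` sysctl described above and must run as root:

```
#!/bin/bash
# Sketch: allocate OLLAMA_GPU_PERCENT of total RAM to the GPU wired-memory limit.
PERCENT="${OLLAMA_GPU_PERCENT:-80}"

# Total physical memory in bytes -> megabytes
TOTAL_MB=$(( $(sysctl -n hw.memsize) / 1024 / 1024 ))

# Desired GPU limit in megabytes
LIMIT_MB=$(( TOTAL_MB * PERCENT / 100 ))

echo "Setting GPU wired limit to ${LIMIT_MB} MB (${PERCENT}% of ${TOTAL_MB} MB)"
sysctl -w iogpu.wired_limit_mb="${LIMIT_MB}"
```

Note that a value set this way does not persist across reboots on its own, which is presumably why the optimization is wired to run at system startup.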
For best performance:
- Use SSH for remote management
- Keep display disconnected when possible
- Avoid running GUI applications
- Consider disabling Screen Sharing if not needed for emergency access
- Adjust GPU memory percentage based on your available memory and workload
These optimizations leave more resources available for Ollama model operations, allowing for better performance when running large language models.
If you need to run Docker containers (e.g., for Open WebUI), you can configure Docker to start automatically using Colima. This feature is completely optional.
Colima is a container runtime for macOS that's designed to work well in headless environments. It provides Docker API compatibility without requiring Docker Desktop, making it ideal for server use.
- Homebrew must be installed (the script will use it to install Colima and Docker CLI)
- No special GUI requirements (works perfectly in headless environments)
To enable Docker autostart, run:
```
export DOCKER_AUTOSTART="true"
./scripts/install.sh
```
This will:
- Install Colima and Docker CLI via Homebrew (if not already installed)
- Create a LaunchDaemon that starts Colima automatically at boot time
- Configure Colima with default settings
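Once Colima is running, the Docker CLI talks to it like any other Docker engine. As an illustration (not part of this repository's scripts), here is one way to run Open WebUI against the local Ollama API; the image name and `OLLAMA_BASE_URL` variable follow the upstream Open WebUI documentation, and you may need to substitute your Mac's LAN IP if `host.docker.internal` does not resolve inside your Colima VM:

```
docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```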
If Docker doesn't start automatically:
- Check the logs:
```
cat ~/mac-studio-server/logs/docker.log
```
- Try starting Colima manually:
```
colima start
```
- Check Colima status:
```
colima status
```
If you don't need Docker containers, you can skip this feature entirely.
This project follows Semantic Versioning:
- MAJOR version for incompatible changes
- MINOR version for new features
- PATCH version for bug fixes
The current version is 1.2.0.
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License