Anh Tu Nguyen

Why Memory Alignment Matters in Go: Making Your Structs Lean and Fast

When writing Go code, it’s easy to forget what’s happening under the hood—especially when it comes to memory layout. But did you know that how you organize fields in a struct can actually bloat memory and even affect performance?

Let’s take a fun but technical dive into how memory alignment works in Go, and why struct layout matters more than you might think.


🧠 What Is Memory Alignment, Exactly?

Memory alignment is a concept rooted in how CPUs access memory. Most modern CPUs are optimized to access memory at aligned addresses, that is, addresses that are multiples of the data’s alignment requirement. For basic types on 64-bit platforms, that requirement equals the data’s size.

For example:

  • An int64 (8 bytes) should ideally start at a memory address that’s a multiple of 8.
  • An int32 (4 bytes) should start at a multiple of 4.

If a variable is misaligned, the CPU might need to perform multiple memory reads just to get the full data. This slows things down. On top of that, if a variable spans across two cache lines, you’ll suffer a performance penalty because the CPU has to load both cache lines.

Here’s a simple analogy: imagine reading a sentence split across two pages in a book. You flip once, then again, just to get the whole message. Alignment keeps your "sentence" on the same page.

TL;DR:

  • Aligned data = fast memory access
  • Misaligned data = slow, possibly multiple reads
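
As a quick check of the rules above, here is a minimal sketch that asks the compiler for each type's alignment with unsafe.Alignof and confirms that a real int64 variable sits at an aligned address. The helper isAligned is a hypothetical name introduced for this illustration, and the 8-byte figure for int64 assumes a 64-bit platform (some 32-bit targets only guarantee 4).

```go
package main

import (
	"fmt"
	"unsafe"
)

// isAligned reports whether addr is a multiple of align.
// (Hypothetical helper for this illustration; align must be non-zero.)
func isAligned(addr, align uintptr) bool {
	return addr%align == 0
}

func main() {
	var b byte
	var i32 int32
	var i64 int64

	// unsafe.Alignof reports the alignment the compiler guarantees for a value.
	fmt.Println("byte  alignment:", unsafe.Alignof(b))
	fmt.Println("int32 alignment:", unsafe.Alignof(i32))
	fmt.Println("int64 alignment:", unsafe.Alignof(i64))

	// The address of a variable is a multiple of its type's alignment.
	addr := uintptr(unsafe.Pointer(&i64))
	fmt.Println("i64 aligned:", isAligned(addr, unsafe.Alignof(i64)))
}
```

Because the compiler and runtime place variables for you, the last line should report true; in ordinary Go code you pay for alignment in padding rather than in misaligned reads.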

🛠️ How Go Handles Memory Alignment

Go takes care of alignment automatically. Each data type has an alignment requirement, and Go inserts padding bytes between struct fields to ensure proper alignment.

Let’s look at this struct:

type PoorlyAligned struct {
    a byte   // 1 byte
    b int64  // 8 bytes
    c byte   // 1 byte
}

Although the fields themselves total only 10 bytes (1 + 8 + 1), the compiler inserts padding to align each field properly. On a 64-bit platform, this results in:

📆 Total size: 24 bytes

Why?

Field       Offset   Size   Notes
a (byte)    0        1
padding     1–7      7      Aligns b (int64) on an 8-byte boundary
b (int64)   8–15     8      Starts at offset 8
c (byte)    16       1      Starts at offset 16
padding     17–23    7      Rounds the struct size up to a multiple of 8
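
The layout in the table can be checked directly. The sketch below uses unsafe.Offsetof and unsafe.Sizeof from the standard library; poorLayout is a hypothetical helper name used only here, and the numbers assume a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

type PoorlyAligned struct {
	a byte  // 1 byte
	b int64 // 8 bytes
	c byte  // 1 byte
}

// poorLayout returns the offset of each field and the total struct size.
// (Hypothetical helper name, used only for this illustration.)
func poorLayout() (offA, offB, offC, size uintptr) {
	var p PoorlyAligned
	return unsafe.Offsetof(p.a), unsafe.Offsetof(p.b), unsafe.Offsetof(p.c), unsafe.Sizeof(p)
}

func main() {
	offA, offB, offC, size := poorLayout()
	fmt.Printf("a at %d, b at %d, c at %d, total %d bytes\n", offA, offB, offC, size)
}
```

On amd64 or arm64 this should print "a at 0, b at 8, c at 16, total 24 bytes", matching the table row by row.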

✅ Well-Aligned Layout = Happy Memory

Now let’s rearrange the fields:

type WellAligned struct {
    b int64  // 8 bytes
    a byte   // 1 byte
    c byte   // 1 byte
}

Result:

📆 Total size: 16 bytes

Field       Offset   Size
b (int64)   0–7      8
a (byte)    8        1
c (byte)    9        1
padding     10–15    6

💡 By simply reordering the fields, we reduced the struct's size from 24 bytes down to 16 bytes—saving 8 bytes for every single instance of this struct.


🚀 Real-World Impact: Memory and Performance

Why does this matter?

  • Less memory per struct = lower overall memory usage
  • Smaller structs fit better into CPU cache lines
  • Better cache usage = fewer cache misses = faster processing
  • Less memory used = less work for the garbage collector
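
To put a number on the first point, here is a small sketch that estimates the backing-array size of a million-element slice for each layout. backingBytes is a hypothetical helper; it ignores the slice header and any rounding the allocator applies, and the figures assume a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

type PoorlyAligned struct {
	a byte
	b int64
	c byte
}

type WellAligned struct {
	b int64
	a byte
	c byte
}

// backingBytes returns the bytes needed for n contiguous elements of
// elemSize bytes each. (Hypothetical helper; allocator overhead is ignored.)
func backingBytes(elemSize uintptr, n int) uintptr {
	return elemSize * uintptr(n)
}

func main() {
	const n = 1_000_000
	poor := backingBytes(unsafe.Sizeof(PoorlyAligned{}), n)
	well := backingBytes(unsafe.Sizeof(WellAligned{}), n)

	fmt.Printf("[]PoorlyAligned: %d bytes\n", poor)
	fmt.Printf("[]WellAligned:   %d bytes\n", well)
	fmt.Printf("saved:           %d bytes\n", poor-well)
}
```

With 24-byte and 16-byte elements, that works out to roughly 24 MB versus 16 MB, so a third of the slice's memory was padding.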

🔬 Benchmarking Time!

Let’s benchmark two slices: one using the poorly-aligned struct, one using the optimized version.

package main

import (
    "testing"
)

type PoorlyAligned struct {
    a byte   // 1 byte
    b int64  // 8 bytes
    c byte   // 1 byte
}

type WellAligned struct {
    b int64  // 8 bytes
    a byte   // 1 byte
    c byte   // 1 byte
}

var poorlySlice = make([]PoorlyAligned, 1_000_000)
var wellSlice = make([]WellAligned, 1_000_000)

// sink keeps the compiler from optimizing the summation loops away.
var sink int64

func BenchmarkPoorlyAligned(b *testing.B) {
    var sum int64
    for n := 0; n < b.N; n++ {
        for i := range poorlySlice {
            // c is a byte, so convert it before adding to the int64 sum.
            sum += int64(poorlySlice[i].c)
        }
    }
    sink = sum
}

func BenchmarkWellAligned(b *testing.B) {
    var sum int64
    for n := 0; n < b.N; n++ {
        for i := range wellSlice {
            sum += int64(wellSlice[i].c)
        }
    }
    sink = sum
}

📊 Typical Results:

goos: darwin
goarch: arm64
pkg: metaleap/pkg/tunatest
cpu: Apple M1
BenchmarkPoorlyAligned-8            3609            323200 ns/op
BenchmarkWellAligned-8              3759            316617 ns/op
PASS

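Result: On my Apple M1 chip, the optimized layout ran about 2% faster (316,617 ns/op versus 323,200 ns/op). A gap this small can fall within run-to-run noise, so repeat the benchmark several times (for example with go test -bench=. -count=10) before drawing conclusions.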
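
For reference, the percentage comes straight from the two ns/op figures in the output. The sketch below shows the arithmetic; speedupPercent is a hypothetical helper name, not part of the testing package.

```go
package main

import "fmt"

// speedupPercent returns how much less time "after" takes than "before",
// expressed as a percentage of "before". (Hypothetical helper.)
func speedupPercent(beforeNs, afterNs float64) float64 {
	return (beforeNs - afterNs) / beforeNs * 100
}

func main() {
	// ns/op values taken from the benchmark output in this article.
	fmt.Printf("%.2f%%\n", speedupPercent(323200, 316617)) // prints 2.04%
}
```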


🛠️ Tools: Check Alignment with fieldalignment

You don’t need to do this manually. The fieldalignment analyzer from golang.org/x/tools can help. It is not one of the default go vet checks, so install it separately and run it on your packages:

go install golang.org/x/tools/go/analysis/passes/fieldalignment/cmd/fieldalignment@latest
fieldalignment ./...

It will suggest better ordering for your structs when applicable, like:

struct of size 24 could be 16

✅ Best Practices for Struct Layout in Go

  • Order fields from largest to smallest alignment.
  • Group fields with the same size together.
  • Consider memory layout when defining high-volume or performance-critical structs.
  • Use the fieldalignment analyzer from golang.org/x/tools for automatic suggestions.
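
The "largest to smallest alignment" rule can be sketched with the reflect package. The code below is an illustration under simplifying assumptions, not how any particular tool works: packedSize and alignUp are hypothetical helpers, the argument must be a struct value, and corner cases such as trailing zero-size fields are ignored.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

type PoorlyAligned struct {
	a byte
	b int64
	c byte
}

// alignUp rounds n up to the next multiple of align (align must be non-zero).
func alignUp(n, align uintptr) uintptr {
	return (n + align - 1) / align * align
}

// packedSize estimates the size the struct value v would have if its fields
// were ordered from largest to smallest alignment. (Hypothetical helper;
// v must be a struct, and trailing zero-size fields are not handled.)
func packedSize(v any) uintptr {
	t := reflect.TypeOf(v)
	type field struct{ size, align uintptr }
	fields := make([]field, 0, t.NumField())
	maxAlign := uintptr(1)
	for i := 0; i < t.NumField(); i++ {
		ft := t.Field(i).Type
		f := field{size: ft.Size(), align: uintptr(ft.FieldAlign())}
		if f.align > maxAlign {
			maxAlign = f.align
		}
		fields = append(fields, f)
	}
	// Largest alignment first, keeping the original order among equals.
	sort.SliceStable(fields, func(i, j int) bool { return fields[i].align > fields[j].align })

	var offset uintptr
	for _, f := range fields {
		offset = alignUp(offset, f.align) // padding before the field, if any
		offset += f.size
	}
	return alignUp(offset, maxAlign) // trailing padding
}

func main() {
	t := reflect.TypeOf(PoorlyAligned{})
	fmt.Printf("%s: current %d bytes, reordered %d bytes\n",
		t.Name(), t.Size(), packedSize(PoorlyAligned{}))
}
```

On a 64-bit platform this should report 24 bytes for the current layout and 16 after reordering, the same saving the WellAligned struct achieves by hand.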

📝 Final Thoughts

Memory alignment is one of those “under-the-hood” details that can have outsized effects in real-world programs—especially those dealing with millions of objects or high-performance data processing.

With just a bit of attention to field ordering, you can:

  • Save memory
  • Speed up your programs
  • Make your data more cache-friendly

Go’s compiler does the heavy lifting to ensure safety and correctness. Your job is to be mindful of layout when performance or memory use matters.


Top comments (3)

Trung Duong

Nice post, thanks Tu 🎉

Oleh Rudak

I think there is a mistake in the 1st example of memory alignment.
There is no need for padding after variable b (int32), since the offset is already 8, which is a multiple of 8 (1 + 3 + 4).
Also, the offset after variable c (int64) is not needed.
This gives us a total size of 1 + 3 + 4 + 8 = 16.

Please correct me if I'm wrong.

Thank you for your article!

Anh Tu Nguyen

You are correct. My initial calculation was flawed. The offset after the int32 field is already 8, which is a valid boundary for the int64 field, so no padding is needed between them. The total size is indeed 16 bytes.
Thanks to your feedback, I have now corrected the example in the article. I really appreciate you helping to keep the content accurate!
