Go Concurrency — Complete Deep Guide

Go Concurrency
Mastered Completely

Every concurrency primitive Go offers — from goroutines and channels to context, sync.Once, and the race detector — with diagrams, code, real-world use cases, and honest trade-offs.

①Concurrency vs Parallelism
②Goroutine
③Channel
④WaitGroup
⑤Mutex
⑥Select
⑦Context
⑧sync.Once
⑨Atomic
⑩Race Detector
Concept 01 · Foundation
Concurrency vs. Parallelism
Foundation · Rob Pike's Core Insight · Must Understand First
"Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once." — Rob Pike (Go co-creator). These are different problems with different solutions. Go is a concurrency language that can achieve parallelism when needed.
Concurrency: Structural · Parallelism: Execution · Go's model: CSP (Communicating Sequential Processes)
🔄 Concurrency
Structuring a program to handle multiple tasks — not necessarily at the same moment
[Diagram: one CPU core with Task A and Task B interleaved. Tasks take turns on one core; it appears simultaneous because the switches are fast. Structure: who runs when.]
⚡ Parallelism
Executing multiple tasks truly simultaneously on multiple CPU cores
[Diagram: three CPU cores, with Task A running on core 1, Task B on core 2, and Task C on core 3. Execution: truly at the same instant.]

🐹 GO'S APPROACH: CONCURRENCY BY DESIGN, PARALLELISM BY RUNTIME

Go's goroutines and channels give you a concurrent structure. The Go runtime's M:N scheduler then maps those goroutines onto OS threads, which the operating system runs on physical CPU cores in parallel. GOMAXPROCS has defaulted to the number of available CPUs since Go 1.5, so a concurrent Go program can run in parallel on a multi-core machine with no extra configuration. You write for concurrency; the runtime delivers parallelism wherever the hardware allows it.

Real-World Analogy

  • Concurrency (Chef): One chef manages boiling pasta, chopping vegetables, and simmering sauce — switching between tasks. Only one hand moves at a time.
  • Parallelism (Kitchen): Three chefs each doing one task simultaneously — pasta chef, vegetable chef, sauce chef — all working at the exact same moment.
  • Go's model: You write the recipe (concurrency). The kitchen staff count (GOMAXPROCS) determines how many chefs run in parallel.

Go's M:N Scheduler

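  • M:N scheduling: many goroutines are multiplexed onto a smaller set of OS threads; GOMAXPROCS caps how many threads execute Go code at the same time (threads blocked in syscalls do not count toward it)
  • Threads are not pinned to cores: the OS schedules them across CPU cores, which is where true parallelism comes from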
  • Goroutines are cooperatively + preemptively scheduled by the Go runtime
  • Network I/O automatically yields — goroutine suspends, thread free for others
  • Work-stealing: idle threads steal goroutines from busy thread queues

When Concurrency ≠ Parallelism

  • Single-core machine: concurrent Go program, no parallelism
  • GOMAXPROCS=1: goroutines interleave but never truly parallel
  • I/O-bound programs: concurrency wins even without multiple cores
  • CPU-bound programs: need parallelism (multiple cores) to go faster
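
To make the CPU-bound case concrete, here is a minimal sketch that splits a summation across goroutines. The names sumRange and parallelSum are illustrative helpers, not library functions, and whether the chunks really run in parallel depends on GOMAXPROCS and the number of cores available.

```go
package main

import (
    "fmt"
    "runtime"
    "sync"
)

// sumRange adds the integers in the half-open range [lo, hi).
func sumRange(lo, hi int) int {
    total := 0
    for i := lo; i < hi; i++ {
        total += i
    }
    return total
}

// parallelSum (hypothetical helper) splits [0, n) into one chunk per worker.
// Each goroutine writes only to its own slot of partial, so no lock is needed.
func parallelSum(n, workers int) int {
    partial := make([]int, workers)
    chunk := (n + workers - 1) / workers // ceiling division
    var wg sync.WaitGroup
    for w := 0; w < workers; w++ {
        lo, hi := w*chunk, (w+1)*chunk
        if hi > n {
            hi = n
        }
        wg.Add(1)
        go func(slot, lo, hi int) {
            defer wg.Done()
            partial[slot] = sumRange(lo, hi)
        }(w, lo, hi)
    }
    wg.Wait()
    total := 0
    for _, p := range partial {
        total += p
    }
    return total
}

func main() {
    workers := runtime.GOMAXPROCS(0) // passing 0 reads the setting without changing it
    fmt.Println("cores:", runtime.NumCPU(), "workers:", workers)
    fmt.Println("sum:", parallelSum(1_000_000, workers))
}
```

With GOMAXPROCS=1 the same program still produces the correct sum; the chunks simply interleave on one thread instead of running side by side.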

CSP vs Shared Memory

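  • CSP (Go's model): goroutines communicate by passing messages through channels, so shared state is avoided by convention. Nothing in the language forbids sharing memory; goroutines still run in one address space.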
  • Shared memory (Java/C++): threads share variables, use locks. Error-prone, deadlock-prone.
  • Go's mantra: "Do not communicate by sharing memory; share memory by communicating."
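
The two styles can be compared side by side. The sketch below counts n increments twice: once with a mutex around a shared variable, and once with a single owner goroutine that receives increments over a channel. Both function names are made up for illustration.

```go
package main

import (
    "fmt"
    "sync"
)

// countWithMutex is the shared-memory style: every goroutine touches total,
// so every access is guarded by the mutex.
func countWithMutex(n int) int {
    var (
        mu    sync.Mutex
        wg    sync.WaitGroup
        total int
    )
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            mu.Lock()
            total++
            mu.Unlock()
        }()
    }
    wg.Wait()
    return total
}

// countWithChannel is the CSP style: only the owner goroutine touches total,
// and everyone else sends it messages.
func countWithChannel(n int) int {
    incs := make(chan int)
    result := make(chan int)
    go func() { // owner of the state
        total := 0
        for d := range incs {
            total += d
        }
        result <- total
    }()
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            incs <- 1
        }()
    }
    wg.Wait()
    close(incs) // no more increments: the owner reports the total
    return <-result
}

func main() {
    fmt.Println(countWithMutex(100), countWithChannel(100))
}
```

For a plain counter the mutex version is shorter; the channel version pays off when the owner goroutine also has to make decisions about the state it owns.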
Concept 02 · Core Primitive
Goroutine
Lightweight Thread · User-space · Go Runtime Managed
A goroutine is a lightweight, concurrently executing function managed by the Go runtime rather than the OS. It is started with the go keyword. The initial stack is ~2KB and grows dynamically up to a limit (1GB by default on 64-bit systems). A single machine can run hundreds of thousands to millions of goroutines, whereas OS threads, at roughly 1MB or more of stack each, typically become impractical in the low tens of thousands.
Initial stack: ~2KB · OS thread: ~1MB · Multiplier: ~500x more goroutines · Keyword: go

What is a Goroutine?

A goroutine is a function executing concurrently with other goroutines in the same address space. Unlike OS threads, goroutines are multiplexed onto a smaller number of OS threads by the Go runtime. When a goroutine blocks on I/O, the runtime moves it off the thread and runs another goroutine instead — maximizing CPU utilization with zero developer effort.

Goroutines are not OS threads. They're an abstraction ON TOP of OS threads. The Go scheduler (part of the runtime) decides which goroutines run on which threads.

How It Works — GMP Model

  • G (Goroutine): The goroutine itself — function + stack + state
  • M (Machine): OS thread managed by the runtime
  • P (Processor): Scheduling context — holds a local run queue of goroutines
  • Each P has a local queue. M runs one G from its P's queue at a time
  • When G blocks on a channel or mutex, it is parked and M picks the next G from P's queue. When G blocks in a syscall, M blocks with it and P is handed to another M
  • Work-stealing: idle P steals half the goroutines from a busy P's queue

Goroutine Lifecycle

1
Created

go myFunc() — goroutine placed in P's local run queue

2
Runnable

Waiting in queue to be scheduled onto an M (OS thread)

3
Running

Executing on an M. Preempted after 10ms time slice or on yield point

4
Blocked

Waiting on channel, mutex, or syscall. M free to run other goroutines

5
Dead

Function returned. Stack reclaimed. goroutine exits cleanly.

Why Use Goroutines?

  • Handle 100,000+ concurrent connections with minimal memory
  • Parallelize CPU-bound work across all cores automatically
  • I/O-bound operations (HTTP, DB, disk) don't block other work
  • Worker pools, pipelines, fan-out patterns all built from goroutines
  • No callback hell, no async/await boilerplate — just go func()
[Diagram: Go runtime GMP scheduler model. A global queue holds overflow goroutines. Each P (processor) has a local run queue of Gs and an M (OS thread) that runs one G at a time on a CPU core. A blocked G (waiting on a channel or syscall) is set aside so other Gs can run. Work stealing: an idle P takes goroutines from a busy P's queue.]
goroutine/examples.go — Creation, closures, goroutine leak
// ── BASIC GOROUTINE ────────────────────────────────────────────
go doWork()              // run doWork() concurrently
go func() { doWork() }() // anonymous goroutine, invoked immediately

// ── GOROUTINE WITH ARGS: PASS LOOP VALUES EXPLICITLY ───────────
for i := 0; i < 5; i++ {
    go func(id int) { // ✓ i is copied into id when the goroutine is created
        fmt.Println(id)
    }(i)
    // ✗ RISKY before Go 1.22: go func() { fmt.Println(i) }()
    //   all closures share one loop variable, so they typically all print 5.
    //   Modules declaring go 1.22 or later get a fresh i per iteration, but
    //   passing it as an argument works on every version.
}

// ── GOROUTINE STACK GROWTH ─────────────────────────────────────
// Starts at ~2KB and grows on demand up to the limit (1GB by default on
// 64-bit systems). The stack is reclaimed when the function returns or
// calls runtime.Goexit().

// ── GOROUTINE COUNT: 1 MILLION IS FEASIBLE ─────────────────────
for i := 0; i < 1_000_000; i++ {
    go func() {
        time.Sleep(10 * time.Second) // sleeping goroutines: ~2-4KB each
    }()
} // 1M × 4KB ≈ 4GB RAM: feasible on a large machine, but not free

// ── ⚠️ GOROUTINE LEAK: never do this ───────────────────────────
func leaky() {
    ch := make(chan int)
    go func() {
        val := <-ch      // blocks forever: nobody sends!
        fmt.Println(val) // never reached; a blocked goroutine is never garbage collected
    }()
} // ← returns without sending to ch → the goroutine lives until the program exits
// FIX: always use context cancellation or done channels to signal goroutine exit

// ── GOMAXPROCS: control parallelism ────────────────────────────
runtime.GOMAXPROCS(runtime.NumCPU()) // already the default since Go 1.5: use all cores
runtime.GOMAXPROCS(1)                // concurrency without parallelism
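
One way to apply the fix named above is to give the blocked goroutine a second way out. This is a minimal sketch, assuming a caller that owns a cancellable context; waitForValue and runOnce are hypothetical names used only for this example.

```go
package main

import (
    "context"
    "fmt"
)

// waitForValue receives one value from ch, but gives up as soon as ctx is
// cancelled, so the goroutine can never block forever.
func waitForValue(ctx context.Context, ch <-chan int, exited chan<- string) {
    select {
    case v := <-ch:
        exited <- fmt.Sprintf("got %d", v)
    case <-ctx.Done():
        exited <- "cancelled: " + ctx.Err().Error()
    }
}

// runOnce starts the goroutine and either sends it a value or cancels it.
// It returns the message the goroutine reported on its way out.
func runOnce(send bool) string {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    ch := make(chan int)
    exited := make(chan string, 1) // buffered so the goroutine can always report
    go waitForValue(ctx, ch, exited)
    if send {
        ch <- 7
    } else {
        cancel() // nobody will send: tell the goroutine to stop
    }
    return <-exited
}

func main() {
    fmt.Println(runOnce(true))
    fmt.Println(runOnce(false))
}
```

In both runs the goroutine terminates, which is the property the leaky version lacks.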

✓ ADVANTAGES

  • 2KB initial stack — 500x cheaper than OS threads (1MB)
  • Run millions concurrently on commodity hardware
  • Go runtime manages scheduling: goroutine switches happen in user space and cost far less than OS thread context switches
  • Blocking I/O doesn't waste OS threads — runtime parks goroutine
  • Preemptive since Go 1.14 — no starvation from CPU-bound goroutines
  • Stack grows/shrinks dynamically: no fixed stack size to tune, though runaway recursion still ends in a fatal stack overflow once the limit is reached

✗ DISADVANTAGES

  • Goroutine leaks are invisible — no crash, just memory growth
  • Cannot cancel a goroutine from outside — must use channels/context
  • No goroutine ID (by design) — makes debugging harder
  • Panic in goroutine not caught by parent — must recover() inside
  • Too many goroutines still cause GC pressure even at 2KB each
  • Loop variable capture bug is a classic trap in code built with Go versions before 1.22 (from Go 1.22, each iteration gets its own variable)
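
Because a panic must be recovered inside the goroutine that raised it, a small wrapper is a common approach. The sketch below is one possible shape, not a standard API: safeGo and runTasks are hypothetical names, and the recovered panic is turned into an error delivered over a channel.

```go
package main

import (
    "fmt"
    "sync"
)

// safeGo runs fn in a new goroutine and converts a panic into an error on errs.
// errs must have room for the error, or a receiver must be waiting.
func safeGo(wg *sync.WaitGroup, errs chan<- error, fn func()) {
    wg.Add(1)
    go func() {
        defer wg.Done() // runs last (deferred calls run in reverse order)
        defer func() {
            if r := recover(); r != nil {
                errs <- fmt.Errorf("recovered: %v", r)
            }
        }()
        fn()
    }()
}

// runTasks launches one panicking task and one healthy task and returns the
// errors that were recovered.
func runTasks() []error {
    var wg sync.WaitGroup
    errs := make(chan error, 2) // one slot per task, so sends never block
    safeGo(&wg, errs, func() { panic("boom") })
    safeGo(&wg, errs, func() {})
    wg.Wait()
    close(errs)
    var out []error
    for err := range errs {
        out = append(out, err)
    }
    return out
}

func main() {
    for _, err := range runTasks() {
        fmt.Println(err)
    }
}
```

Without the deferred recover, the first task's panic would terminate the whole program, not just its own goroutine.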
Concept 03 · Communication
Go Channel
Typed · Goroutine-safe · Communication Primitive · CSP Core
A channel is a typed conduit through which goroutines communicate. Channels provide synchronization and data transfer in one construct. They are Go's answer to shared-memory concurrency — instead of locking a shared variable, goroutines send and receive values through a channel. "Don't communicate by sharing memory; share memory by communicating."
make(chan T): unbuffered · make(chan T, N): buffered · Thread-safe: always · Directional: <-chan / chan<-
UNBUFFERED CHANNEL — make(chan T)
[Diagram: Goroutine A (send) → channel with capacity 0 → Goroutine B (receive). A blocks until B receives.]
Sender blocks until receiver is ready. Perfect synchronization.
BUFFERED CHANNEL — make(chan T, 3)
[Diagram: Goroutine A (send) → channel with cap=3, len=2 holding v1 and v2 → Goroutine B (receive). A blocks only when the buffer is full.]
Send doesn't block until buffer full. Decouples producer from consumer speed.

Channel Operations & Rules

  • Send: ch <- val — blocks if unbuffered (until receiver ready) or buffer full
  • Receive: val := <-ch — blocks until value is available
  • Close: close(ch) — only sender should close; signals no more values
  • Range: for v := range ch — reads until channel closed
  • Check closed: v, ok := <-ch — ok=false means closed
  • Sending on closed channel: panic!
  • Receiving from closed channel: returns any values still buffered, then the zero value immediately (with ok=false)
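
These rules are easy to verify directly. The sketch below closes a buffered channel that still holds two values and records what later receives observe, then shows the send-on-closed panic by recovering from it. The function names are illustrative.

```go
package main

import "fmt"

// drainAfterClose closes a channel that still has buffered values and
// reports what receivers see afterwards.
func drainAfterClose() (values []int, zero int, ok bool) {
    ch := make(chan int, 3)
    ch <- 1
    ch <- 2
    close(ch)
    for v := range ch { // buffered values are still delivered after close
        values = append(values, v)
    }
    zero, ok = <-ch // drained and closed: zero value, ok == false
    return values, zero, ok
}

// sendOnClosed reports whether sending on a closed channel panicked.
func sendOnClosed() (panicked bool) {
    defer func() { panicked = recover() != nil }()
    ch := make(chan int, 1)
    close(ch)
    ch <- 1 // panics: send on closed channel
    return false
}

func main() {
    values, zero, ok := drainAfterClose()
    fmt.Println(values, zero, ok)
    fmt.Println("send on closed panicked:", sendOnClosed())
}
```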

Directional Channels

  • chan T — bidirectional (can send and receive)
  • <-chan T — receive-only (function can only read from it)
  • chan<- T — send-only (function can only write to it)
  • Directional channels enforce correct usage at compile time
  • Producer gets chan<-, consumer gets <-chan — prevents mistakes
func producer(ch chan<- int) { ch <- 42 }
func consumer(ch <-chan int) { fmt.Println(<-ch) }

Nil Channel Behavior

  • Send on nil channel: blocks forever
  • Receive from nil channel: blocks forever
  • Close nil channel: panic!
  • Nil channels in select: that case is never selected
  • Useful trick: disable a select case by setting its channel to nil
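
The nil-channel trick is most useful when merging inputs that finish at different times. In this sketch (merge and filled are hypothetical helpers), each input is set to nil once it is closed, which removes its case from the select without any extra flags.

```go
package main

import "fmt"

// merge reads from a and b until both are closed. A closed input is set to
// nil so its case can never be selected again.
func merge(a, b <-chan int) []int {
    var out []int
    for a != nil || b != nil {
        select {
        case v, ok := <-a:
            if !ok {
                a = nil // disable this case
                continue
            }
            out = append(out, v)
        case v, ok := <-b:
            if !ok {
                b = nil // disable this case
                continue
            }
            out = append(out, v)
        }
    }
    return out
}

// filled returns a closed, buffered channel preloaded with vals.
func filled(vals ...int) <-chan int {
    ch := make(chan int, len(vals))
    for _, v := range vals {
        ch <- v
    }
    close(ch)
    return ch
}

func main() {
    // The interleaving of the two inputs is not deterministic.
    fmt.Println(merge(filled(1, 2, 3), filled(10, 20)))
}
```

Without the nil assignment, a closed input would be ready on every iteration and the loop would spin on zero values.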

Channel as Signal (Done Pattern)

  • done := make(chan struct{}) — zero-size struct signal
  • close(done) broadcasts: every goroutine waiting on the channel wakes up at once
  • A send, by contrast, is delivered to only ONE receiver
  • This is the foundation of Context cancellation internally
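
The broadcast property can be demonstrated with several goroutines parked on the same done channel. This is a small sketch with an illustrative function name; it uses atomic.Int32, which assumes Go 1.19 or later.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// stopAll parks `workers` goroutines on one done channel, closes it once,
// and returns how many goroutines woke up.
func stopAll(workers int) int {
    done := make(chan struct{})
    var stopped atomic.Int32
    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            <-done // every worker blocks here
            stopped.Add(1)
        }()
    }
    close(done) // a single close releases all of them
    wg.Wait()
    return int(stopped.Load())
}

func main() {
    fmt.Println("workers stopped:", stopAll(8))
}
```

Replacing close(done) with a single send would release only one worker and leave the rest blocked.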
channel/patterns.go — Unbuffered, buffered, done, pipeline
// ── UNBUFFERED: synchronization point ──────────────────────────
func unbuffered() {
    ch := make(chan string)             // capacity = 0
    go func() { ch <- "order ready" }() // sender blocks until receiver ready
    msg := <-ch                         // receiver unblocks sender
    fmt.Println(msg)
}

// ── BUFFERED: decouple producer/consumer speed ─────────────────
func buffered() {
    jobs := make(chan int, 10) // buffer 10 jobs: send won't block unless full
    for i := 0; i < 10; i++ {
        jobs <- i // fill buffer without goroutine
    }
    close(jobs)
    for j := range jobs { // range reads until closed
        fmt.Println(j)
    }
}

// ── DONE CHANNEL: broadcast stop signal ────────────────────────
func withDone() {
    done := make(chan struct{}) // zero-size: just a signal
    go func() {
        for {
            select {
            case <-done: // closing done broadcasts to ALL receivers
                fmt.Println("stopping")
                return
            default:
                doWork()
            }
        }
    }()
    time.Sleep(5 * time.Second)
    close(done) // ← broadcasts to ALL goroutines listening on done
}

// ── PIPELINE: chain channels ───────────────────────────────────
func generate(nums ...int) <-chan int {
    out := make(chan int)
    go func() {
        for _, n := range nums {
            out <- n
        }
        close(out)
    }()
    return out
}

func square(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        for n := range in {
            out <- n * n
        }
        close(out)
    }()
    return out
}

// Usage: for v := range square(generate(2, 3, 4)) { fmt.Println(v) } → 4 9 16

✓ ADVANTAGES

  • Goroutine-safe by design — no mutex needed for communication
  • Synchronization and communication in one construct
  • Range loop cleanly drains channel until close
  • Directional types enforce correct usage at compile time
  • close() broadcasts to all receivers — elegant done pattern
  • Composes naturally into pipelines and fan-out/in
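
Fan-out/fan-in is built from the same pieces. The following sketch (squareAll is a made-up name) fans a slice of numbers out to a fixed pool of workers and fans the squared results back into one channel, closing it once every worker has finished.

```go
package main

import (
    "fmt"
    "sync"
)

// squareAll distributes nums across `workers` goroutines and returns the sum
// of the squares.
func squareAll(nums []int, workers int) int {
    jobs := make(chan int)
    results := make(chan int)

    var wg sync.WaitGroup
    for w := 0; w < workers; w++ { // fan-out: all workers read the same channel
        wg.Add(1)
        go func() {
            defer wg.Done()
            for n := range jobs {
                results <- n * n
            }
        }()
    }

    go func() { // producer: the sender owns jobs and closes it
        for _, n := range nums {
            jobs <- n
        }
        close(jobs)
    }()

    go func() { // fan-in: close results only after every worker is done
        wg.Wait()
        close(results)
    }()

    sum := 0
    for r := range results {
        sum += r
    }
    return sum
}

func main() {
    fmt.Println(squareAll([]int{1, 2, 3, 4}, 3))
}
```

Each channel is closed by exactly one goroutine, which keeps the ownership rule from the operations list intact.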

✗ DISADVANTAGES

  • Deadlock if sender and receiver both block — hard to debug
  • Sending on closed channel panics — must coordinate ownership
  • Buffer size tuning is non-trivial — wrong size causes bottlenecks
  • Not suitable for sharing large structs — prefer pointer + mutex
  • Channels have overhead — not always faster than sync primitives
  • Nil channel gotchas can block indefinitely silently
Concept 04 · Synchronization
sync.WaitGroup
Synchronization Primitive · Barrier · Collection Point
A WaitGroup waits for a collection of goroutines to finish. The main goroutine calls Add(n) to set the count, each goroutine calls Done() when finished, and Wait() blocks until the counter reaches zero. It's the standard way to fan-out work and collect completion.
Methods: Add, Done, Wait · Zero value: ready to use · Must not be copied after first use
[Diagram: main calls wg.Add(3), launches three workers with go work(), then blocks in wg.Wait(). Each worker does its work and calls wg.Done(). When the counter reaches 0, wg.Wait() unblocks and main continues.]
waitgroup/patterns.go — Correct usage + common mistakes
// ── CORRECT USAGE ──────────────────────────────────────────────
func processOrders(orders []Order) {
    var wg sync.WaitGroup // zero value ready to use
    for _, order := range orders {
        wg.Add(1) // ← Add BEFORE launching goroutine
        go func(o Order) {
            defer wg.Done() // ← defer ensures Done() always called
            processOrder(o)
        }(order) // ← pass order as arg (avoid loop var capture)
    }
    wg.Wait() // ← blocks until all Done() calls match Add() calls
    fmt.Println("all orders processed")
}

// ── WRONG: Add inside goroutine: RACE CONDITION ────────────────
// for _, order := range orders {
//     go func(o Order) {
//         wg.Add(1) ← WRONG: wg.Wait() may return before Add is called!
//         defer wg.Done()
//         processOrder(o)
//     }(order)
// }

// ── WRONG: Don't copy WaitGroup ────────────────────────────────
// func bad(wg sync.WaitGroup)   { wg.Wait() } ← copy! doesn't work
// func good(wg *sync.WaitGroup) { wg.Wait() } ← pointer ✓

// ── ADVANCED: WaitGroup + error collection ─────────────────────
func processWithErrors(items []Item) []error {
    var (
        wg   sync.WaitGroup
        mu   sync.Mutex
        errs []error
    )
    for _, item := range items {
        wg.Add(1)
        go func(it Item) {
            defer wg.Done()
            if err := process(it); err != nil {
                mu.Lock()
                errs = append(errs, err) // mutex protects shared errs slice
                mu.Unlock()
            }
        }(item)
    }
    wg.Wait()
    return errs
}

✓ ADVANTAGES

  • Simple 3-method API: Add, Done, Wait
  • Zero value is ready to use — no initialization needed
  • defer wg.Done() runs even if the goroutine panics (an unrecovered panic still crashes the program)
  • Works with any number of goroutines — dynamic Add count
  • Reusable once Wait() has returned: the counter is back at zero, so new Add() calls can start another round

✗ DISADVANTAGES

  • Must call Add() before goroutine starts — not inside goroutine
  • Never copy a WaitGroup — must pass by pointer
  • No way to collect return values — need extra channel or slice+mutex
  • No timeout — use context for cancellable waiting
  • Add/Done mismatch (negative counter) causes panic
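
Since Wait() itself has no timeout, a common workaround is to wait in a helper goroutine and race it against a timer. The sketch below shows one way to do that; waitTimeout and the two demo functions are hypothetical names. Note the trade-off stated in the comments: on timeout the helper goroutine stays parked until the WaitGroup eventually completes.

```go
package main

import (
    "fmt"
    "sync"
    "time"
)

// waitTimeout reports true if wg finished within d, false on timeout.
// On timeout the helper goroutine remains blocked in wg.Wait() until the
// counter finally reaches zero.
func waitTimeout(wg *sync.WaitGroup, d time.Duration) bool {
    done := make(chan struct{})
    go func() {
        wg.Wait()
        close(done)
    }()
    timer := time.NewTimer(d)
    defer timer.Stop() // release the timer if the work finishes first
    select {
    case <-done:
        return true
    case <-timer.C:
        return false
    }
}

// finishesInTime uses a worker that completes immediately.
func finishesInTime() bool {
    var wg sync.WaitGroup
    wg.Add(1)
    go func() { defer wg.Done() }()
    return waitTimeout(&wg, 2*time.Second)
}

// timesOut simulates a stuck worker: Add is never matched by Done.
func timesOut() bool {
    var wg sync.WaitGroup
    wg.Add(1)
    return !waitTimeout(&wg, 20*time.Millisecond)
}

func main() {
    fmt.Println("finished in time:", finishesInTime())
    fmt.Println("timed out:", timesOut())
}
```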
Concept 05 · Shared State Protection
sync.Mutex & sync.RWMutex
🔐
Mutex & RWMutex
Mutual Exclusion · Shared Memory Protection · sync package
A Mutex (Mutual Exclusion lock) ensures only one goroutine accesses shared data at a time. Use Mutex when all access is read+write. Use RWMutex (read-write mutex) when reads are frequent and writes are rare — allows multiple concurrent readers OR one exclusive writer.
Mutex: Lock / Unlock · RWMutex: RLock / RUnlock / Lock / Unlock · Rule: always use defer Unlock

Mutex vs RWMutex

  • sync.Mutex: One goroutine at a time — any operation (R or W)
  • sync.RWMutex: Multiple concurrent readers OR one exclusive writer
  • Use Mutex: protecting a counter, appending to a slice, map writes
  • Use RWMutex: cache with frequent reads and rare writes
  • RLock(): multiple goroutines can hold simultaneously
  • Lock(): blocks until ALL RLock holders release

When to Use Mutex vs Channel

  • Use Mutex when: protecting state (counter, cache, shared map). Simple ownership — one goroutine owns data.
  • Use Channel when: passing ownership of data between goroutines, coordinating work, signaling events.
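  • Go proverb (Rob Pike): "Channels orchestrate; mutexes serialize."
  • Example: sync.Map is a built-in concurrent-safe map. Prefer it over map+Mutex mainly for write-once, read-many keys or when goroutines work on disjoint key sets; otherwise a plain map with a Mutex or RWMutex is usually simpler and keeps static typing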
mutex/safe_cache.go — Mutex and RWMutex patterns
// ── sync.Mutex: exclusive access ───────────────────────────────
type SafeCounter struct {
    mu    sync.Mutex
    count int
}

func (c *SafeCounter) Inc() {
    c.mu.Lock()
    defer c.mu.Unlock() // always defer: runs even if a panic occurs
    c.count++
}

// ── sync.RWMutex: many readers, one writer ─────────────────────
type MenuCache struct {
    mu    sync.RWMutex
    items map[string]MenuItem
}

func (c *MenuCache) Get(id string) (MenuItem, bool) {
    c.mu.RLock() // multiple goroutines can RLock simultaneously
    defer c.mu.RUnlock()
    item, ok := c.items[id]
    return item, ok
}

func (c *MenuCache) Set(id string, item MenuItem) {
    c.mu.Lock() // exclusive: blocks all readers and writers
    defer c.mu.Unlock()
    c.items[id] = item
}

// ── ⚠ DEADLOCK: never lock the same mutex twice ────────────────
// mu.Lock()
// mu.Lock() ← deadlock! the goroutine blocks waiting for itself

// ── sync.Map: built-in concurrent map ──────────────────────────
var m sync.Map
m.Store("key", "value")       // goroutine-safe write
v, ok := m.Load("key")        // goroutine-safe read
m.Range(func(k, v any) bool { // goroutine-safe iteration
    fmt.Println(k, v)
    return true
})

✓ ADVANTAGES

  • Simple — Lock/Unlock pattern is universally understood
  • RWMutex dramatically improves read-heavy workload performance
  • defer mu.Unlock() guarantees unlock even on panic
  • Lower overhead than channels for simple shared state
  • sync.Map provides zero-config concurrent map for read-heavy use

✗ DISADVANTAGES

  • Deadlock risk: forgot to Unlock, double Lock, lock ordering bugs
  • Must not copy mutex after first use
  • Priority inversion and lock contention at high concurrency
  • Channels are often more readable for ownership passing
  • Cannot be used recursively — locking twice in same goroutine = deadlock
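
Lock-ordering deadlocks are usually avoided by always acquiring locks in one agreed order. The sketch below orders by an id field; the Account type, transfer, and simulate are hypothetical names invented for this example.

```go
package main

import (
    "fmt"
    "sync"
)

// Account is a hypothetical type with its own lock.
type Account struct {
    id      int
    mu      sync.Mutex
    balance int
}

// transfer always locks the account with the lower id first, so two opposing
// transfers can never hold one lock each while waiting for the other.
func transfer(from, to *Account, amount int) {
    if from.id == to.id {
        return // same account: locking twice would self-deadlock
    }
    first, second := from, to
    if second.id < first.id {
        first, second = second, first
    }
    first.mu.Lock()
    defer first.mu.Unlock()
    second.mu.Lock()
    defer second.mu.Unlock()

    from.balance -= amount
    to.balance += amount
}

// simulate runs opposing transfers concurrently and returns the final balances.
func simulate(rounds int) (int, int) {
    a := &Account{id: 1, balance: 1000}
    b := &Account{id: 2, balance: 1000}
    var wg sync.WaitGroup
    for i := 0; i < rounds; i++ {
        wg.Add(2)
        go func() { defer wg.Done(); transfer(a, b, 1) }()
        go func() { defer wg.Done(); transfer(b, a, 1) }()
    }
    wg.Wait()
    return a.balance, b.balance
}

func main() {
    fmt.Println(simulate(500))
}
```

If transfer locked from before to unconditionally, the two opposing goroutines could each take one lock and wait forever for the other.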
Concept 06 · Multi-Channel Control
Select Statement
Multi-Channel Multiplexer · Non-blocking Operations · Timeout Pattern
Select lets a goroutine wait on multiple channel operations simultaneously. Like a switch statement, but for channels. Blocks until one case is ready, then executes it. If multiple cases are ready simultaneously, one is chosen at random. The default case makes select non-blocking.
Blocks: until one case ready · Multiple ready: random selection · default: non-blocking
select/patterns.go — Timeout, default, nil channel disable
// ── BASIC SELECT: wait on multiple channels ────────────────────
select {
case msg := <-ch1:
    fmt.Println("received from ch1:", msg)
case msg := <-ch2:
    fmt.Println("received from ch2:", msg)
case ch3 <- "data":
    fmt.Println("sent to ch3")
} // blocks until ONE case is ready

// ── TIMEOUT PATTERN: most important real-world use ─────────────
func fetchWithTimeout() (Result, error) {
    resultCh := make(chan Result, 1) // buffer of 1: the sender never blocks, even after a timeout
    go func() { resultCh <- slowAPICall() }()
    select {
    case res := <-resultCh:
        return res, nil
    case <-time.After(3 * time.Second): // ← fires after 3s
        return Result{}, errors.New("timeout")
    }
}

// ── NON-BLOCKING with default ──────────────────────────────────
select {
case v := <-ch:
    fmt.Println("got:", v)
default:
    fmt.Println("channel empty, moving on") // no blocking!
}

// ── DISABLE A CASE using nil channel ───────────────────────────
var ch2 chan int // nil channel: this case is NEVER selected
select {
case v := <-ch1:
    fmt.Println(v)
case v := <-ch2:
    fmt.Println(v) // ← never runs (ch2 is nil)
}

// ── GRACEFUL SHUTDOWN PATTERN ──────────────────────────────────
func worker(jobs <-chan Job, done <-chan struct{}) {
    for {
        select {
        case job, ok := <-jobs:
            if !ok {
                return // jobs closed: without this check the loop would spin on zero values
            }
            processJob(job)
        case <-done:
            return // shutdown signal
        }
    }
}

✓ ADVANTAGES

  • Wait on multiple channels without spinning or goroutines
  • Timeout pattern is trivial with time.After()
  • Non-blocking channel operations via default case
  • Graceful shutdown by combining done channel with work channel
  • Nil channel trick elegantly disables cases dynamically

✗ DISADVANTAGES

  • Pseudo-random case selection can be surprising
  • No priority between cases — all equal chance
  • Complex nested selects become hard to read
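  • Before Go 1.23, a time.After timer is not garbage collected until it fires, even if another case wins first; in hot loops use time.NewTimer and Stop it (Go 1.23+ collects unreferenced timers promptly)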
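
When one case really must win over another, the usual workaround for the lack of priorities is a two-step select: a non-blocking check of the high-priority channel first, then the normal blocking select. The helper name nextJob and the demo functions below are illustrative.

```go
package main

import "fmt"

// nextJob returns the next job, but lets done win whenever it is already
// closed, even if a job is also ready.
func nextJob(jobs <-chan int, done <-chan struct{}) (int, bool) {
    select { // step 1: non-blocking check of the high-priority channel
    case <-done:
        return 0, false
    default:
    }
    select { // step 2: ordinary blocking wait on both channels
    case <-done:
        return 0, false
    case j, ok := <-jobs:
        return j, ok
    }
}

// bothReady has a job queued and done already closed: done must win.
func bothReady() (int, bool) {
    jobs := make(chan int, 1)
    jobs <- 5
    done := make(chan struct{})
    close(done)
    return nextJob(jobs, done)
}

// onlyJob has a job queued and no shutdown signal.
func onlyJob() (int, bool) {
    jobs := make(chan int, 1)
    jobs <- 5
    return nextJob(jobs, make(chan struct{}))
}

func main() {
    fmt.Println(bothReady())
    fmt.Println(onlyJob())
}
```

With a single select, bothReady would return the job about half the time, because ready cases are picked pseudo-randomly.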
Concept 07 · Cancellation & Deadlines
context.Context
Cancellation · Deadlines · Request-scoped Values · Standard Library
Context carries cancellation signals, deadlines, and key-value pairs across API boundaries and goroutines. It is the idiomatic way to cancel long-running operations, propagate request timeouts, and pass request-scoped values (like user ID, trace ID) without function signature pollution.
WithCancel: manual cancel · WithTimeout: auto-cancel after duration · WithDeadline: cancel at absolute time · WithValue: attach request-scoped data
[Diagram: context propagation tree; cancellation flows downward. context.Background() (root, never cancelled) → WithTimeout(ctx, 5s) (auto-cancels after 5s) → WithValue(ctx, k, v) (adds request-scoped data) → a DB query goroutine and an HTTP call goroutine, each watching ctx.Done(). Cancelling a parent cancels ALL of its children automatically.]
context/patterns.go — WithTimeout, WithCancel, propagation
// ── Rule 1: Pass Context as first parameter ────────────────────
// Every function that might be slow takes ctx as its first argument.
// The column lists and struct fields below are illustrative assumptions.
func GetUser(ctx context.Context, userID string) (*User, error) {
    var u User
    err := db.QueryRowContext(ctx,
        "SELECT id, name FROM users WHERE id=$1", userID).Scan(&u.ID, &u.Name)
    if err != nil {
        return nil, err
    }
    return &u, nil
}

// ── WithTimeout: cancel after duration ─────────────────────────
func fetchOrder(orderID string) (*Order, error) {
    ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
    defer cancel() // ← ALWAYS defer cancel: releases the timer as soon as we return
    var o Order
    err := db.QueryRowContext(ctx,
        "SELECT id, status FROM orders WHERE id=$1", orderID).Scan(&o.ID, &o.Status)
    // if 3s pass before the query completes → ctx.Done() closes → query is cancelled
    if err != nil {
        return nil, err
    }
    return &o, nil
}

// ── WithCancel: manual cancellation ────────────────────────────
func longJob(ctx context.Context) error {
    childCtx, cancel := context.WithCancel(ctx)
    defer cancel()
    for {
        select {
        case <-childCtx.Done(): // parent or manual cancel
            return childCtx.Err() // context.Canceled or DeadlineExceeded
        default:
            doUnitOfWork()
        }
    }
}

// ── WithValue: request-scoped data (trace ID, user ID) ─────────
type ctxKey string

const RequestIDKey ctxKey = "requestID"

ctx := context.WithValue(r.Context(), RequestIDKey, "req-abc-123")
reqID, ok := ctx.Value(RequestIDKey).(string) // two-value assertion: ok is false if missing, no panic
// Rule: use a package-defined key type (ideally unexported) to prevent collisions between packages

✓ ADVANTAGES

  • Propagates cancellation automatically to all child operations
  • Deadline/timeout flows across goroutines, DB queries, HTTP calls
  • Prevents goroutine leaks from long-running operations
  • Standard — works with net/http, database/sql, gRPC, all major libraries
  • WithValue carries request-scoped data without polluting function signatures

✗ DISADVANTAGES

  • Adding ctx to every function signature is boilerplate (required, not optional)
  • Forgetting defer cancel() leaks resources (common mistake)
  • WithValue keys and values are untyped (any): a single-value type assertion panics at runtime on a type mismatch or a missing key
  • Context is not a replacement for config or optional parameters
  • Cancelling an in-flight DB query only works if the driver supports context cancellation
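
The type-assertion risk is usually contained by hiding the key behind a pair of typed accessors. This is a minimal sketch of that convention; WithRequestID, RequestID, and lookup are hypothetical names, and the key type is unexported so no other package can collide with it.

```go
package main

import (
    "context"
    "fmt"
)

// requestIDKey is unexported, so only this package can set or read the value.
type requestIDKey struct{}

// WithRequestID returns a child context carrying id.
func WithRequestID(ctx context.Context, id string) context.Context {
    return context.WithValue(ctx, requestIDKey{}, id)
}

// RequestID returns the stored ID. ok is false when nothing was stored, so
// callers never hit a panicking type assertion.
func RequestID(ctx context.Context) (id string, ok bool) {
    id, ok = ctx.Value(requestIDKey{}).(string)
    return id, ok
}

// lookup exercises both the present and the absent case.
func lookup(set bool) (string, bool) {
    ctx := context.Background()
    if set {
        ctx = WithRequestID(ctx, "req-abc-123")
    }
    return RequestID(ctx)
}

func main() {
    fmt.Println(lookup(true))
    fmt.Println(lookup(false))
}
```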
Concept 08 · One-Time Initialization
sync.Once
Lazy Initialization · Singleton · Exactly-Once Execution
sync.Once guarantees a function is executed exactly once, regardless of how many goroutines call it simultaneously. All subsequent calls to Do() after the first are no-ops, and each costs only an atomic load. It is the backbone of goroutine-safe lazy initialization and the Singleton pattern in Go.
Method: Do(f func()) · Guarantee: exactly once · Goroutine-safe: always · After first call: near-zero cost
once/singleton.go — DB pool, lazy init, configuration
// ── GOROUTINE-SAFE SINGLETON WITH sync.Once ────────────────────
type DBPool struct{ db *sql.DB }

var (
    dbInstance *DBPool
    dbErr      error     // Do has no return value, so the error is stored here
    dbOnce     sync.Once // zero value ready to use
)

func GetDB() (*DBPool, error) {
    dbOnce.Do(func() { // first call: executes this function
        db, err := sql.Open("postgres", os.Getenv("DATABASE_URL"))
        if err != nil {
            dbErr = err
            return
        }
        dbInstance = &DBPool{db: db}
    }) // subsequent calls: no-op
    return dbInstance, dbErr // always returns the same instance (or the same error)
}

// ── 1000 goroutines all call GetDB() simultaneously ────────────
// sql.Open() is called EXACTLY ONCE: no double-init, no race condition

// ── LAZY CONFIG LOADING ────────────────────────────────────────
type Config struct {
    Port   int
    Secret string
}

var (
    cfg     Config
    cfgOnce sync.Once
)

func GetConfig() Config {
    cfgOnce.Do(func() {
        cfg = loadFromEnv() // reads .env file exactly once
    })
    return cfg
}

// ── ⚠ GOTCHA: once.Do does NOT re-run if f panics ──────────────
// If f panics, Do still marks it as "done": future calls skip it
// Guard against this by recovering inside f if needed

// ── sync.OnceFunc (Go 1.21): cleaner API ───────────────────────
// initDB := sync.OnceFunc(func() { /* init */ })
// initDB() // subsequent calls no-op

✓ ADVANTAGES

  • Atomic guarantee — exactly one execution even under heavy concurrency
  • Near-zero overhead after first call (one atomic load and a well-predicted branch)
  • Eliminates double-checked locking anti-pattern
  • Works for any initialization — not just singletons
  • Zero value is ready — no constructor needed

✗ DISADVANTAGES

  • If f panics inside Do(), the Once still counts as done: subsequent calls silently do nothing
  • Cannot reset or re-run — truly once per Once instance
  • No error return — must store error in outer variable
  • Deadlock if Do() tries to call Do() on same Once (self-call)
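
For the missing error return, Go 1.21 added sync.OnceValues, which caches a value and an error together. The sketch below assumes Go 1.21 or later; the Config type, the port-parsing logic, and demo are stand-ins for real initialisation work.

```go
package main

import (
    "fmt"
    "strconv"
    "sync"
    "sync/atomic"
)

// Config is a hypothetical result of an initialisation step that can fail.
type Config struct{ Port int }

// demo wraps a fallible loader in sync.OnceValues, calls it from 100
// goroutines, and reports the cached result and how often the loader ran.
func demo(raw string) (port int, failed bool, calls int32) {
    var count atomic.Int32
    load := sync.OnceValues(func() (Config, error) {
        count.Add(1)
        p, err := strconv.Atoi(raw)
        if err != nil {
            return Config{}, fmt.Errorf("bad port %q: %w", raw, err)
        }
        return Config{Port: p}, nil
    })

    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            _, _ = load() // every caller gets the same cached pair
        }()
    }
    wg.Wait()

    cfg, err := load()
    return cfg.Port, err != nil, count.Load()
}

func main() {
    fmt.Println(demo("8080"))
    fmt.Println(demo("not-a-port"))
}
```

The error is cached just like the value, so a failed initialisation is not retried; that matches the "truly once" behaviour described above.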
Concept 09 · Lock-Free Operations
sync/atomic — Atomic Operations
⚛️
sync/atomic
Lock-Free · CPU-level Atomicity · Low-Level Primitive
Atomic operations are indivisible CPU-level operations on primitive types: no goroutine can observe them in a half-completed state. They are typically faster than mutexes for simple counters, flags, and pointers because each operation maps to a hardware instruction (such as an atomic add or CAS, Compare-And-Swap) with no lock to acquire and no goroutine ever parked. They are the building block of lock-free data structures.
No mutex overhead · Types: int32/64, uint32/64, pointer, bool · Go 1.19+: atomic.Int64, atomic.Bool etc.
atomic/counter.go — Modern API (Go 1.19+) and classic
// ── MODERN API: atomic.Int64 (Go 1.19+) ────────────────────────
var reqCount atomic.Int64
// Safe from multiple goroutines, no mutex needed:
reqCount.Add(1)           // increment
n := reqCount.Load()      // read
reqCount.Store(0)         // set
old := reqCount.Swap(100) // swap and return old value

// ── atomic.Bool (Go 1.19+) ─────────────────────────────────────
var isRunning atomic.Bool
isRunning.Store(true)
if isRunning.Load() {
    fmt.Println("server is up")
}

// ── COMPARE AND SWAP (CAS): foundation of lock-free ────────────
var counter int64
atomic.CompareAndSwapInt64(&counter, 0, 1)
// Sets counter=1 only if the current value is 0; returns whether it swapped
// Used to implement lock-free state machines and spinlocks

// ── PERFORMANCE COMPARISON: atomic vs mutex ────────────────────
// Rough, uncontended figures; measure on your own hardware:
// Mutex.Lock/Unlock: ~20ns per op (user-space fast path; contention adds parking cost)
// atomic.AddInt64:   ~3ns per op (single CPU instruction)
// → roughly 6x faster for simple counter increments

// ── WHEN TO USE ATOMIC vs MUTEX ────────────────────────────────
// Atomic: simple int counter, bool flag, single pointer read/write
// Mutex: updating multiple fields, complex data structures, maps

// ── ⚠ WRONG: plain reads and writes of a shared variable ───────
var count int64
// go func() { count++ }()            ← DATA RACE!
// go func() { fmt.Println(count) }() ← DATA RACE!
// Fix: atomic.AddInt64(&count, 1) and atomic.LoadInt64(&count)

✓ ADVANTAGES

  • Often several times faster than a mutex for simple operations (the gap depends on hardware and contention)
  • No goroutine blocking — truly lock-free
  • CPU hardware guarantee — indivisible at instruction level
  • Modern API (atomic.Int64) is ergonomic and misuse-resistant
  • Perfect for high-frequency counters (request count, error count)

✗ DISADVANTAGES

  • Only works on single primitive values — not struct fields together
  • CAS loops can spin indefinitely under high contention
  • Easy to use incorrectly — mixing atomic and non-atomic access = race
  • Only one memory ordering is offered: Go's atomics are always sequentially consistent, with no relaxed or acquire/release variants to tune
  • Cannot atomically update two values — need mutex for that
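
A CAS loop is the standard way to build an update that the atomic API does not provide directly. The sketch below tracks a high-water mark; storeMax and highWaterMark are hypothetical names, and atomic.Int64 assumes Go 1.19 or later.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// storeMax raises peak to v if v is larger. The loop retries only when
// another goroutine changed peak between the Load and the CompareAndSwap.
func storeMax(peak *atomic.Int64, v int64) {
    for {
        cur := peak.Load()
        if v <= cur {
            return // someone already stored a value at least as large
        }
        if peak.CompareAndSwap(cur, v) {
            return // our write won
        }
        // CAS failed: peak changed underneath us, so re-read and try again
    }
}

// highWaterMark feeds every value to storeMax from its own goroutine.
func highWaterMark(values []int64) int64 {
    var peak atomic.Int64
    var wg sync.WaitGroup
    for _, v := range values {
        wg.Add(1)
        go func(v int64) {
            defer wg.Done()
            storeMax(&peak, v)
        }(v)
    }
    wg.Wait()
    return peak.Load()
}

func main() {
    fmt.Println(highWaterMark([]int64{3, 9, 4, 7}))
}
```

Under heavy contention many of these retries can pile up, which is the spinning cost mentioned in the list above.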
Concept 10 · Debug & Safety
Race Detector & Deadlock Detection
Built-in Tooling · Production Safety · go test -race
Go ships with a built-in data race detector based on ThreadSanitizer. Run it with go test -race or go run -race. It detects unsynchronized concurrent reads and writes to shared variables, but only on code paths that actually execute during the run. Deadlock detection is separate: when every goroutine is blocked, the runtime aborts with a fatal error and prints the goroutine stack traces.
Race: go test -race · Overhead: typically 2-20x slowdown (dev/test only) · Deadlock: runtime auto-detects · Memory: 5-10x more RAM

Data Race — What Is It?

A data race occurs when two or more goroutines access the same variable concurrently, and at least one of the accesses is a write, and they are not synchronized by any synchronization primitive (channel, mutex, atomic).

Go makes few guarantees about a program that contains a data race: it can produce corrupted data, crashes, or results that look correct but are wrong. Data races are among the most common and hardest-to-reproduce bugs in concurrent code.

Common Deadlock Causes

  • Goroutine A holds lock 1, waits for lock 2
  • Goroutine B holds lock 2, waits for lock 1 → circular wait
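  • Channel send with no receiver, or receive with no sender: the goroutine waits forever
  • All goroutines blocked → runtime detects → fatal error: all goroutines are asleep - deadlock!
  • WaitGroup counter never reaches zero (fewer Done() calls than Add()), so Wait() blocks forever; the opposite mismatch panics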
race/detection.go — Race example, fix, deadlock example
// ── DATA RACE EXAMPLE ──────────────────────────────────────────
var counter int

func racyIncrement() {
    var wg sync.WaitGroup
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            counter++ // ← DATA RACE: read-modify-write is not atomic
        }()
    }
    wg.Wait()
    // counter might be 800, 950, 1000: non-deterministic! ❌
}

// ── FIX 1: atomic ──────────────────────────────────────────────
var atomicCounter atomic.Int64
go func() { atomicCounter.Add(1) }() // ✓ safe

// ── FIX 2: mutex ───────────────────────────────────────────────
var mu sync.Mutex
go func() {
    mu.Lock()
    defer mu.Unlock()
    counter++ // ✓ safe
}()

// ── DETECT: run with -race flag ────────────────────────────────
// $ go test -race ./...
// $ go run -race main.go
// Output:
// WARNING: DATA RACE
// Write at 0x... by goroutine 7:
//   main.racyIncrement.func1()
//       /app/main.go:12
// Previous read at 0x... by goroutine 6:
// → Exact file and line number reported!

// ── DEADLOCK EXAMPLE ───────────────────────────────────────────
func deadlock() {
    ch := make(chan int)
    ch <- 42 // ← deadlock: no goroutine receives, main goroutine blocks
}
// Runtime output: fatal error: all goroutines are asleep - deadlock!
// → The runtime detects when ALL goroutines are blocked and aborts with a fatal error (not a recoverable panic)
// ← This is the ONLY deadlock Go auto-detects
// ← Partial deadlock (some goroutines still running) is NOT detected

⚠ COMMON CONCURRENCY PITFALLS — AVOID THESE

  • Loop variable capture: go func() { use(i) }() lets every goroutine see the same i in code built with Go versions before 1.22. Fix: pass it as an argument, go func(id int) { use(id) }(i), or declare go 1.22 or later in go.mod.
  • Goroutine leak: goroutine blocked on channel with no sender/receiver. Always use context cancellation or done channels.
  • Forgetting defer cancel(): a WithTimeout/WithCancel context keeps its timer and its link to the parent until the deadline passes or the parent is cancelled, which is a resource leak.
  • Closing a closed channel: panic. Designate one owner (sender). Use sync.Once to close safely.
  • Unsynchronized map access: concurrent map reads+writes without a mutex → unrecoverable runtime error such as "fatal error: concurrent map writes".
  • WaitGroup copied: pass *WaitGroup, never WaitGroup by value.
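The "closing a closed channel" pitfall above can be avoided by routing every close through sync.Once. A minimal sketch, assuming a hypothetical stopSignal type (the type and its method names are illustrative, not part of the standard library):

```go
package main

import (
	"fmt"
	"sync"
)

// stopSignal is a hypothetical helper: a done channel that is safe
// to "close" from any number of goroutines.
type stopSignal struct {
	once sync.Once
	done chan struct{}
}

func newStopSignal() *stopSignal {
	return &stopSignal{done: make(chan struct{})}
}

// Stop closes the done channel exactly once; later calls are no-ops.
func (s *stopSignal) Stop() {
	s.once.Do(func() { close(s.done) })
}

// Done returns a channel that is closed once Stop has been called.
func (s *stopSignal) Done() <-chan struct{} { return s.done }

// stopped reports, without blocking, whether Stop has been called.
func (s *stopSignal) stopped() bool {
	select {
	case <-s.done:
		return true
	default:
		return false
	}
}

func main() {
	s := newStopSignal()

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Stop() // ten concurrent callers, one close, no panic
		}()
	}
	wg.Wait()

	fmt.Println(s.stopped()) // true
}
```

The struct is handed around as a pointer for the same reason as a WaitGroup: copying it would copy the sync.Once.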
Final Summary
Go Concurrency — Complete Reference Summary

Every primitive at a glance — when to reach for each tool.

🧠
Concurrency vs Parallelism
Concurrency = structure (dealing with many). Parallelism = execution (doing many at once). Go gives you concurrency; GOMAXPROCS gives you parallelism.
Fundamental — know this first
🚀
Goroutine
Lightweight thread with a ~2 KB initial stack that grows on demand. go myFunc(). GMP scheduler. Before Go 1.22, pass loop variables as args. Use context for cancellation.
For every concurrent task
📡
Channel
Typed communication pipe. Unbuffered = sync point. Buffered = decouple speeds. close() broadcasts. Use for ownership transfer.
For passing data between goroutines
WaitGroup
Wait for N goroutines to complete. Add before launch. Always defer Done(). Pass by pointer. Cannot collect return values alone.
For fan-out and collection
🔐
Mutex / RWMutex
Protects shared state. Mutex: exclusive. RWMutex: many readers OR one writer. Always defer Unlock. Never copy.
For protecting shared variables
🎛️
Select
Wait on multiple channels. Random if multiple ready. default = non-blocking. nil channel = disabled case. Timeout via time.After().
For multi-channel coordination
⏱️
Context
Cancellation, deadlines, request values. Always first param. Always defer cancel(). Flows from parent to all children automatically.
For cancellation and timeouts
1️⃣
sync.Once
Execute exactly once even with 1M concurrent callers. Zero-value ready. Near-zero overhead after the first call (one atomic load). No reset possible.
For lazy singleton initialization
⚛️
sync/atomic
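Lock-free CPU-level operations. Often several times faster than a mutex for simple values (about 6x in simple benchmarks; the exact factor depends on contention and hardware). Use atomic.Int64, atomic.Bool (Go 1.19+). For counters and flags only.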
For high-freq counters & flags
🕵️
Race Detector
go test -race. Finds data races using ThreadSanitizer. Typically 2-20x slower and 5-10x more memory, so dev/test only. Separately, the runtime auto-detects total deadlocks.
Always run in CI pipeline
Decision Guide: Which Primitive to Use?
| Situation | Use This | Why | Avoid This |
|---|---|---|---|
| Run code concurrently | go func() | Goroutine is the basic unit of concurrency | OS threads (too heavy) |
| Pass data between goroutines | channel | Type-safe, goroutine-safe ownership transfer | Shared variable without sync |
| Wait for N goroutines | WaitGroup | Simple barrier: Add/Done/Wait | time.Sleep (wrong) |
| Protect shared counter/map | Mutex | Simple exclusive access to shared state | Channel (overkill) |
| Frequent reads, rare writes | RWMutex | Multiple concurrent readers allowed | Mutex (blocks all readers) |
| Wait on multiple channels | select | Built-in multi-channel multiplexer | Polling loop (wasteful) |
| Timeout / cancel operation | context | Standard propagation across goroutines and libs | Timer channel alone |
| Non-blocking channel check | select + default | Default case makes select non-blocking | Spinning goroutine |
| Initialize singleton | sync.Once | Exactly once, goroutine safe, near-zero overhead afterwards | init() (always runs) |
| High-freq counter / flag | atomic.Int64 | Lock-free; typically several times faster than a mutex for a single value | Mutex (overkill for single int) |
| Broadcast stop signal | close(done) or cancel() from context.WithCancel | close() fans out to all goroutines at once | Sending N times for N goroutines |
| Find concurrency bugs | go test -race | ThreadSanitizer catches races at runtime | Code review alone (insufficient) |
| Concurrent-safe map | sync.Map or map+RWMutex | sync.Map for write-once/read-many keys or disjoint key sets; map+RWMutex otherwise | Plain map (fatal error on concurrent write) |
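As one illustration of the "Concurrent-safe map" and "Frequent reads, rare writes" rows, here is a minimal sketch of a plain map guarded by a sync.RWMutex. The hitCounter type and its methods are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// hitCounter is a hypothetical type: a plain map guarded by an RWMutex.
type hitCounter struct {
	mu   sync.RWMutex
	hits map[string]int
}

func newHitCounter() *hitCounter {
	return &hitCounter{hits: make(map[string]int)}
}

// Inc takes the exclusive lock because it writes to the map.
func (c *hitCounter) Inc(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.hits[key]++
}

// Get takes the shared lock, so many readers can run at once.
func (c *hitCounter) Get(key string) int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.hits[key]
}

func main() {
	c := newHitCounter()

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc("home")
			_ = c.Get("home") // concurrent reads are fine under RLock
		}()
	}
	wg.Wait()

	fmt.Println(c.Get("home")) // 100
}
```

The methods use pointer receivers so the mutex is never copied. Running this under go run -race should report no races, whereas the same program with the locks removed would be expected to trip the detector or the runtime's concurrent-map check.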

🐹 Go Concurrency — The Big Picture

Start with goroutines + channels: These two primitives solve 80% of concurrency problems idiomatically. Channels handle communication and synchronization together.
WaitGroup for fan-out collection: When you launch N goroutines and need to wait for all of them, WaitGroup is the standard tool. Always Add before launch, defer Done.
Mutex for shared state, not channels: When multiple goroutines need to update the same struct/map/counter, a mutex is cleaner than a channel-based guardian goroutine.
Context is not optional: Every function that does I/O, DB, or network work should accept ctx as its first param. This lets timeouts and cancellation propagate correctly.
Always run -race in CI: A program with a data race has no reliable behavior. Races don't always crash immediately; they cause subtle, non-deterministic bugs that are nearly impossible to debug without the race detector.
Don't leak goroutines: A goroutine blocked forever is a memory leak. Every goroutine must have a clear exit path: context cancellation, a done channel, or function return.
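The points above can be combined in one small worker pool. This is a sketch under stated assumptions rather than a canonical implementation: worker, sumSquares, and sumSquaresWithTimeout are hypothetical names, squaring integers stands in for real work, and the one-second timeout is arbitrary. Every goroutine has an exit path (closed jobs channel or cancelled context), the WaitGroup gates the close of results, and cancel is always deferred.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// worker squares jobs until the jobs channel closes or ctx is cancelled.
func worker(ctx context.Context, jobs <-chan int, results chan<- int) {
	for {
		select {
		case <-ctx.Done():
			return
		case n, ok := <-jobs:
			if !ok {
				return // producer closed jobs: normal exit
			}
			select {
			case results <- n * n:
			case <-ctx.Done():
				return // never block forever on a send
			}
		}
	}
}

// sumSquares fans inputs out to a fixed number of workers and adds up the results.
func sumSquares(ctx context.Context, inputs []int, workers int) (int, error) {
	jobs := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1) // Add before launching the goroutine
		go func() {
			defer wg.Done()
			worker(ctx, jobs, results)
		}()
	}

	go func() { // producer: the only sender on jobs, so it closes jobs
		defer close(jobs)
		for _, n := range inputs {
			select {
			case jobs <- n:
			case <-ctx.Done():
				return
			}
		}
	}()

	go func() { // close results only after every sender has returned
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	return sum, ctx.Err()
}

// sumSquaresWithTimeout bounds the work with a deadline and always releases the context.
func sumSquaresWithTimeout(inputs []int, workers int, timeout time.Duration) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return sumSquares(ctx, inputs, workers)
}

func main() {
	sum, err := sumSquaresWithTimeout([]int{1, 2, 3, 4}, 3, time.Second)
	fmt.Println(sum, err) // 30 <nil>
}
```

If the deadline passes first, the function still returns promptly: the producer and workers all observe ctx.Done(), results is closed, and the caller gets a partial sum together with a non-nil error.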
📜 The 6 Golden Rules of Go Concurrency
RULE 1: Don't communicate by sharing memory; share memory by communicating.
RULE 2: Every goroutine must have a way to exit. Always provide a done channel or context.
RULE 3: Always use defer wg.Done() and defer cancel(), so cleanup still runs on early returns and panics.
RULE 4: Before Go 1.22, pass loop variables as goroutine arguments instead of closing over them. From Go 1.22 each iteration gets its own variable, but passing arguments remains the explicit choice.
RULE 5: Only the sender closes a channel. Never close from the receiver side.
RULE 6: Run go test -race in CI on every commit. No exceptions.