Delay context cancellation in Go
The Go standard library defines context.Context to propagate deadlines, cancellation signals, and request-scoped values. In a server, the main function usually creates a top-level context and cancels it when an interrupt signal arrives. Today, we'll explore how context cancellation interacts with batched workloads, like exporting traces.
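    package main

    import (
    	"context"
    	"os"
    	"os/signal"
    )

    // NewTracer, trace, and runMyServer are application code not shown here.
    func runMain() error {
    	// ctx is canceled when the process receives an interrupt signal.
    	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
    	defer cancel()

    	tracer := NewTracer(ctx)
    	trace.SetDefault(tracer)
    	// Flush runs as runMain returns, typically after ctx has been canceled.
    	defer tracer.Flush()

    	return runMyServer(ctx)
    }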
Once a context is canceled, every request that uses it fails. Batched workloads, like logs and traces, can no longer send their queued data as the server shuts down, which leaves an observability gap during server shutdown.
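To make the failure mode concrete, here is a minimal sketch. The exportBatch function is a hypothetical stand-in for a trace exporter, not part of any real tracing library; it assumes the exporter checks the context before sending, as most context-aware clients do.

```go
package main

import (
	"context"
	"fmt"
)

// exportBatch is a hypothetical stand-in for a trace exporter. Like most
// context-aware clients, it refuses to do work once ctx is done.
func exportBatch(ctx context.Context, spans []string) error {
	if err := ctx.Err(); err != nil {
		return fmt.Errorf("export %d spans: %w", len(spans), err)
	}
	// A real exporter would send the spans over the network here.
	return nil
}

// flushAfterShutdown simulates a shutdown: the context is canceled first,
// then the flush tries to export what is still queued.
func flushAfterShutdown() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulates the interrupt signal arriving
	return exportBatch(ctx, []string{"span-1", "span-2"})
}

func main() {
	fmt.Println(flushAfterShutdown()) // prints "export 2 spans: context canceled"
}
```

The two queued spans are dropped even though the process is still running and could have sent them.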
Upstream issue
The Go issue, context: add WithoutCancel, discusses the nuances and laments that context conflates cancellation with context values. The proposal took just under three years (July 2020 to May 2023) to land in the Go standard library, and it was released in Go 1.21. The main reasons to remove cancellation from a context are:
- Processing rollbacks or other cleanup.
- Observability tasks to run after the context is canceled. This is the use case I needed.
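As a small illustration of what context.WithoutCancel provides (Go 1.21 or later), the sketch below cancels a parent context and inspects a detached child. The requestIDKey type and the "req-42" value are made up for this example.

```go
package main

import (
	"context"
	"fmt"
)

// requestIDKey is a hypothetical context key used only for this illustration.
type requestIDKey struct{}

// detachedState cancels a parent context and reports the state of a child
// created with context.WithoutCancel.
func detachedState() (parentErr, childErr error, requestID any) {
	parent, cancel := context.WithCancel(
		context.WithValue(context.Background(), requestIDKey{}, "req-42"))
	child := context.WithoutCancel(parent)
	cancel()
	// The child keeps the parent's values but ignores its cancellation.
	return parent.Err(), child.Err(), child.Value(requestIDKey{})
}

func main() {
	parentErr, childErr, id := detachedState()
	fmt.Println(parentErr, childErr, id) // prints "context canceled <nil> req-42"
}
```

The child still carries the request-scoped value, so cleanup or observability code can keep its trace and request identifiers while no longer being subject to the parent's cancellation.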
WithDelayedCancel
With the new Go additions, we can create a derived context that is canceled a fixed delay after the parent context is done. For additional cleverness, we'll use the new context.AfterFunc so that no goroutine starts until the parent context is done.
    package contexts

    import (
    	"context"
    	"time"
    )

    // WithDelayedCancel returns a new context that cancels
    // after the parent is done plus a delay. Useful to flush
    // traces without dropping all traces because the context
    // is canceled.
    func WithDelayedCancel(parent context.Context, delay time.Duration) context.Context {
    	child, childCancel := context.WithCancel(context.WithoutCancel(parent))
    	context.AfterFunc(parent, func() {
    		time.Sleep(delay)
    		childCancel()
    	})
    	return child
    }
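One way to convince yourself of the behavior is a quick check like the sketch below. It repeats WithDelayedCancel from the listing above so that the file compiles on its own; the observeDelayedCancel helper and the 100 ms delay are assumptions chosen for the demonstration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// WithDelayedCancel is repeated from the listing above so this file is
// self-contained.
func WithDelayedCancel(parent context.Context, delay time.Duration) context.Context {
	child, childCancel := context.WithCancel(context.WithoutCancel(parent))
	context.AfterFunc(parent, func() {
		time.Sleep(delay)
		childCancel()
	})
	return child
}

// observeDelayedCancel cancels the parent and reports whether the child is
// still alive right afterward, and whether it is done once the delay passes.
func observeDelayedCancel() (aliveAfterParentCancel, doneAfterDelay bool) {
	const delay = 100 * time.Millisecond // arbitrary value for the demo

	parent, cancel := context.WithCancel(context.Background())
	child := WithDelayedCancel(parent, delay)

	cancel()
	aliveAfterParentCancel = child.Err() == nil

	select {
	case <-child.Done():
		doneAfterDelay = true
	case <-time.After(delay + 2*time.Second): // generous upper bound
	}
	return aliveAfterParentCancel, doneAfterDelay
}

func main() {
	alive, done := observeDelayedCancel()
	fmt.Println(alive, done) // prints "true true"
}
```

The child survives the parent's cancellation and is canceled only after the grace period, which is the window a flush needs.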
We use WithDelayedCancel to allow the tracer a grace period to flush traces.
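    package main

    import (
    	"context"
    	"os"
    	"os/signal"
    	"time"
    )

    // NewTracer, trace, and runMyServer are application code not shown here.
    // The contexts package (WithDelayedCancel, above) must also be imported
    // from wherever it lives in your module.
    func runMain() error {
    	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
    	defer cancel()

    	// The tracer's context outlives ctx by the grace period.
    	flushGracePeriod := 5 * time.Second
    	tracer := NewTracer(contexts.WithDelayedCancel(ctx, flushGracePeriod))
    	trace.SetDefault(tracer)
    	defer tracer.Flush()

    	return runMyServer(ctx)
    }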