Fixing a Memory Leak in Go: Understanding time.After
Recently, we investigated why ArangoSync, our application for synchronizing two ArangoDB clusters across data centers, was using a lot of memory: around 2GB in certain cases. The environment contained ~1500 shards with 5000 goroutines. Thanks to tools like pprof (for profiling CPU and memory usage), it was easy to identify the issue. The Go profiler showed that the memory was allocated in the function `time.After()` and that it accumulated to nearly 1GB. The memory was not released, so it was clear that we had a memory leak. Below, we explain how memory leaks can occur with the `time.After()` function through three examples.
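For readers who have not used pprof before, here is a minimal sketch of capturing a heap profile from inside a program with the standard `runtime/pprof` package. The helper name `heapProfile` is made up for this illustration; it is not how our application collects its profiles.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfile is a hypothetical helper: it returns the current heap profile
// in the format understood by `go tool pprof`.
func heapProfile() ([]byte, error) {
	runtime.GC() // collect first so the profile reflects up-to-date statistics
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfile()
	if err != nil {
		fmt.Println("heap profile failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of heap profile\n", len(data))
}
```

Written to a file, these bytes can be opened with `go tool pprof` to see which functions hold the memory.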
Valid usage of the time.After() function
select {
case <-time.After(time.Second):
	// do something after 1 second.
}
Nothing is wrong with the above code, because the `select` statement can finish in only one way: the timer fires. At that point, the timer created internally by `time.After()` has expired and its resources can be freed.
Invalid usage of the time.After() function
It is very tempting to write the following code:
select {
case <-time.After(time.Second):
	// do something after 1 second.
case <-ctx.Done():
	// do something when context is finished.
	// resources created by time.After() will not be garbage collected until the timer fires
}
In the above `select` statement, if the timer created by `time.After()` fires first, everything works as in the first example. But if `ctx.Done()` is closed first, the timer created by `time.After()` is not stopped, and its resources are not released until the timer fires, which causes the memory leak (see the documentation for `time.After`).
Improved usage of the time.After() function
In production code, one should use `time.NewTimer()` instead of `time.After()`, in the following way:
delay := time.NewTimer(time.Second)
select {
case <-delay.C:
	// do something after one second.
case <-ctx.Done():
	// do something when context is finished and stop the timer.
	if !delay.Stop() {
		// Stop returned false: the timer has already fired, so drain the channel.
		<-delay.C
	}
}
Here, one creates a new timer; when it fires, all resources created by `time.NewTimer()` are released. In the other case, when `ctx.Done()` is closed first, the resources are released by calling `delay.Stop()`. It may happen that `ctx.Done()` is closed and the timer expires immediately afterwards. That is why there is an additional condition: if `delay.Stop()` returns false, the timer has already expired, and the pending value is read from `delay.C`.
I hope this finding is useful for others; it at least solved our problem immediately. Feel free to leave comments below or ping me on the ArangoDB Community Slack (@tomasz.arangodb).
2 Comments
Couldn't you also do
```
timeout := time.After(time.Second)
select {
case <-timeout:
	// do something after 1 second.
case <-ctx.Done():
	go func() { <-timeout }() // prevent leak
	// do something when context is finished.
}
```
Yes, the line below:
go func() { <-timeout }() // prevent leak
could also work, but it creates a separate goroutine whose lifetime depends on the timeout value: it only finishes when the timer fires.
With a long timeout, the goroutine would exist for that whole time, which is not good for performance. It is better to stop a timer as soon as we know that it is no longer required.