A Journey to find a memory leak

Written by JoliCode / Original link on Jul. 1, 2020

In this article, I will cover my journey to find and fix a memory leak in a PHP application. The final patch is simple, but only the journey is important, right?

Introduction

In our application, we had a worker that consumed a lot of RAM. After 10 seconds, the consumption reached about 1.5 GB! I usually find and eradicate memory leaks quite quickly, but this one gave me a lot of trouble.

In the past, I used php-meminfo, a very good extension. But it is not compatible with PHP 7.3+ yet. Unfortunately, we run PHP 7.4.

So I fell back on a primitive tool: I added a few calls to memory_get_usage() in my worker. And... surprise, it reported very low memory usage: about 50 MB, whereas my OS reported more than 1 GB. What the hell was going on here?

Then I tried Blackfire. Same result: it could not see what was going on either.

I needed to do my homework, so I re-read an old article written by Julien Pauli about the Zend Memory Manager.

To summarize this article very quickly:

OK, so the issue should be neither in my code nor in the vendor code. It should be in an extension, or in PHP itself! But I may be wrong :)

What is in the memory?

The application has too many lines of code to review by hand, and since memory_get_usage() did not report the real memory usage, I needed another way to find this leak.

I decided to look at what was actually in RAM before making any decision. To do that, I started by checking which part of the process memory was growing. At this point I was pretty sure the issue was in an extension. I ran the following command twice, once into a file named before and once into a file named after:

sudo cat /proc/<PID>/maps > before # or after

And I made a diff on these two files.
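The comparison of the two snapshots can also be scripted. The following is a minimal sketch, not the method used in this article: the helper names regionSizes() and regionGrowth() are hypothetical, and the two snapshots are made-up sample data in the /proc/<PID>/maps layout.

```php
<?php
// Hypothetical helper: total size in bytes of each named region in a maps snapshot.
function regionSizes(string $maps): array
{
    $sizes = [];
    foreach (preg_split('/\R/', trim($maps)) as $line) {
        // Each line looks like: "start-end perms offset dev inode [path]"
        if (!preg_match('/^([0-9a-f]+)-([0-9a-f]+)\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$/', $line, $m)) {
            continue;
        }
        $name = $m[3] !== '' ? $m[3] : '[anonymous]';
        $sizes[$name] = ($sizes[$name] ?? 0) + hexdec($m[2]) - hexdec($m[1]);
    }

    return $sizes;
}

// Hypothetical helper: regions whose total size changed between two snapshots.
function regionGrowth(string $before, string $after): array
{
    $old = regionSizes($before);
    $growth = [];
    foreach (regionSizes($after) as $name => $size) {
        $delta = $size - ($old[$name] ?? 0);
        if ($delta != 0) {
            $growth[$name] = $delta;
        }
    }
    arsort($growth); // biggest growth first

    return $growth;
}

// Made-up sample snapshots, for illustration only.
$before = <<<MAPS
55d0c8a00000-55d0c8e00000 r-xp 00000000 08:01 131 /usr/bin/php
55d0ca000000-55d0ca400000 rw-p 00000000 00:00 0 [heap]
MAPS;
$after = <<<MAPS
55d0c8a00000-55d0c8e00000 r-xp 00000000 08:01 131 /usr/bin/php
55d0ca000000-55d0cc400000 rw-p 00000000 00:00 0 [heap]
MAPS;

foreach (regionGrowth($before, $after) as $name => $bytes) {
    printf("%s: %+.1f MiB\n", $name, $bytes / 1024 / 1024);
}
```

With real data, the two strings would come from file_get_contents('before') and file_get_contents('after'). In the sample above, only the [heap] region changes, growing by 32 MiB.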

Surprise: the heap had grown a lot. Let's dump it with the following commands (I took the memory addresses from the output of the previous command):

$ sudo gdb -p <PID>
dump memory ./memory.dump 0x1234567 0x98765432

Since it's full of binary data, I used my favorite command for such situations:

strings memory.dump > memory.dump.string

And then I opened the file with vim. It was full of HTML. OK, I think I found the culprit.

Make a reproducer

The worker was responsible for the following tasks:

So I made a reproducer to test each part of the code:

$count = 25_000;

// Blank
for ($i=0; $i < $count ; $i++) {
    $content = file_get_contents(__DIR__. "/fixtures/$i.txt");
}

// PCRE
for ($i=0; $i < $count ; $i++) {
    $content = file_get_contents(__DIR__. "/fixtures/$i.txt");
    preg_match('/title/', $content, $m);
}

// DOM
for ($i=0; $i < $count ; $i++) {
    $content = file_get_contents(__DIR__. "/fixtures/$i.txt");
    $d = new Crawler($content);
    $t = $d->filter("title");
}

// JSON
for ($i=0; $i < $count ; $i++) {
    $content = file_get_contents(__DIR__. "/fixtures/$i.txt");
    json_encode($content);
}

// AMQP
$channel = $c->get(Broker::class)->getAmqpChannel();
$exchange = new AMQPExchange($channel);
$exchange->setType('direct');
$exchange->setName('leak');
$exchange->declare();
$queue = new AMQPQueue($channel);
$queue->setName('leak');
$queue->setArgument('x-queue-mode', 'lazy');
$queue->declare();
$queue->bind('leak', 'leak');
for ($i=0; $i < $count ; $i++) {
    $content = file_get_contents(__DIR__. "/fixtures/$i.txt");
    $exchange->publish($content, 'leak', AMQP_NOPARAM, ['delivery_mode' => 2]);
}
for ($i=0; $i < $count ; $i++) {
    $envelope = $queue->get();
    if (!$envelope) {
        break;
    }
    $queue->ack($envelope->getDeliveryTag());
}

And I benched the code. Nothing was wrong here. Bad news! Or good news: PHP does not leak.

Reconsider everything

So I went back to my code and started bypassing parts of it until the application stopped leaking.

The leak was where I had suspected from the beginning: the HTML analysis. Now I was able to create a new reproducer with exactly the part that misbehaves:

use Masterminds\HTML5;

require __DIR__.'/vendor/autoload.php';

$html = file_get_contents('https://www.php.net/');
$html5 = new HTML5();
$dom = $html5->loadHTML($html);
echo "Converting to HTML 5\n";
for ($i=0; $i < 100; $i++) {
    $html5->saveHTML($dom);  // This is the line that leaks in my application
    printf("%.2f\n", memory_get_usage(false) / 1024 / 1024);
}

The results were a bit crazy: the value kept growing.

The fix was pretty obvious and easy.

But wait

At this point I was a bit confused: I had just found a leak with memory_get_usage(), yet I had said earlier that the leak could not be seen with this tool. In fact, I had found a second, different leak.

So I started to dig again, and I managed to create this reproducer:

use Symfony\Component\DomCrawler\Crawler;

require __DIR__.'/vendor/autoload.php';

$content = file_get_contents('https://www.php.net/');

$count = $argv[1] ?? 251;

for ($i = 0; $i < $count; $i++) {
    $crawler = new Crawler($content);
    $nodes = $crawler->filterXPath('descendant-or-self::head/descendant-or-self::*/title');
    $nodes->each(static function ($node): void {
        $node->html();
    });
    if (0 == $i % 10) {
        preg_match('/^VmRSS:\s(.*)/m', file_get_contents('/proc/self/status'), $m);
        printf("%03d - %.2fMb - %s\n", $i, memory_get_usage(true) / 1024 / 1024, trim($m[1]));
    }
}

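This code could be simplified, but it mirrors what I have in the application. As you can see, I used two methods to get the memory usage: memory_get_usage(true), which reports the memory allocated by PHP's own memory manager, and the VmRSS line of /proc/self/status, which reports the resident memory of the process as seen by the OS.

And the results were astonishing: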

i   - PHP    - OS       - Duration
000 - 4.00Mb - 37936 kB - 0.084s
010 - 4.00Mb - 45648 kB - 0.530s
020 - 4.00Mb - 53040 kB - 0.991s
030 - 4.00Mb - 60696 kB - 1.488s
040 - 4.00Mb - 68352 kB - 1.981s
050 - 4.00Mb - 76008 kB - 2.455s
060 - 4.00Mb - 83400 kB - 2.973s
070 - 4.00Mb - 91056 kB - 3.576s
080 - 4.00Mb - 98712 kB - 4.208s
090 - 4.00Mb - 106368 kB - 4.682s
100 - 4.00Mb - 113760 kB - 5.146s
110 - 4.00Mb - 121416 kB - 5.622s
120 - 4.00Mb - 129072 kB - 6.098s
130 - 4.00Mb - 136728 kB - 6.561s
140 - 4.00Mb - 144120 kB - 7.024s
150 - 4.00Mb - 151776 kB - 7.491s

The leak is terrible: in 150 iterations, the resident memory reported by the OS grew from about 37 MB to about 148 MB.

PHP does not see any increase, but my OS does. How could it be?

What is the real cause?

I read the code a bit, and I saw this:

$rules = new OutputRules($stream, $options);
$trav = new Traverser($dom, $stream, $rules, $options);

and in the Traverser constructor:

$this->rules->setTraverser($this);

We have a cyclic reference here, and this is something PHP does not like: it makes freeing memory harder. Each object keeps the other one's reference count above zero, so only the Garbage Collector can free them.

I could let the GC do its job, but this code was on a critical path, where we need extreme performance. Moreover, the GC does not run every time. It is triggered whenever 10000 possible cyclic objects or arrays are currently in memory and one of them falls out of scope.
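To make this behavior concrete, here is a minimal sketch with a hypothetical Node class (it is not taken from the library discussed here). It counts destructor calls to show that two objects referencing each other survive the end of their scope and are only destroyed once the collector runs.

```php
<?php
// Hypothetical class: two instances will point at each other.
final class Node
{
    public static int $destroyed = 0;
    public ?Node $peer = null;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

function makeCycle(): void
{
    $a = new Node();
    $b = new Node();
    $a->peer = $b;
    $b->peer = $a; // $a and $b now keep each other alive
}

gc_enable(); // the default, stated explicitly for the demonstration

makeCycle();
// Both variables are out of scope, but neither refcount dropped to zero.
$afterScope = Node::$destroyed;

// Force a collection instead of waiting for the root buffer to fill up.
gc_collect_cycles();
$afterGc = Node::$destroyed;

printf("destroyed after scope exit: %d, after gc_collect_cycles(): %d\n", $afterScope, $afterGc);
```

With only two possible roots in the buffer, the automatic trigger is far from being reached, so the script should report 0 destroyed objects after the function returns and 2 after the explicit gc_collect_cycles() call.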

But two objects in memory should not be that bad, right? Right, it would not be, until I saw this:

$this->dom = $dom;

OK! Here we have a demoniac combination:

Conclusion

I eventually made another patch to mitigate this leak.

In this patch, I "help" PHP to free memory by breaking the circular reference. The Garbage Collector is not involved anymore, and the memory stays constant and very low.
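The idea of breaking the circular reference by hand can be sketched as follows. This is an illustration under assumptions, not the actual patch: Rules, Walker, setWalker(), unsetWalker() and render() are hypothetical stand-ins for a rules/traverser pair. The collector is disabled to show that it is not needed once the back-reference is cleared.

```php
<?php
// Hypothetical stand-ins for a rules/traverser pair; not the library's real classes.
final class Rules
{
    public static int $destroyed = 0;
    private ?Walker $walker = null;

    public function setWalker(Walker $walker): void
    {
        $this->walker = $walker;
    }

    public function unsetWalker(): void
    {
        $this->walker = null;
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

final class Walker
{
    private Rules $rules;

    public function __construct(Rules $rules)
    {
        $this->rules = $rules;
        $this->rules->setWalker($this); // creates the cycle
    }

    public function walk(): string
    {
        return 'done';
    }
}

function render(): string
{
    $rules = new Rules();
    $walker = new Walker($rules);
    try {
        return $walker->walk();
    } finally {
        $rules->unsetWalker(); // break the cycle before leaving the function
    }
}

gc_disable(); // show that plain reference counting is now enough
$result = render();
printf("Rules destroyed without GC: %d\n", Rules::$destroyed);
```

Clearing the back-reference in a finally block means the cycle is broken even if walk() throws, so both objects are released as soon as render() returns.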

I'm happy.

So how do you prevent such issues:
