Memory Leak - calling itself inside of itself, forever, with a solution #14

@ghost

Description

Hi!
I've learned something and I've found a solution

Crawler.php

Line 202 foreach($crawling as $site)

My PHP error logs went haywire after running the script for an extended period of time: followLinks() shows up 971 times in the stack trace, and at that point the script crashes.

Stack trace:
#2 Crawler.php(494): Crawler->followLinks()
#3 Crawler.php(494): Crawler->followLinks()
#4 Crawler.php(494): Crawler->followLinks()
#5 Crawler.php(494): Crawler->followLinks()
. . .

and it goes on and on, up to 971 occurrences.

The issue is that followLinks() calls itself for every link it finds, which means that what the script has been doing is this:
It gets called for URL1 (and doesn't call getDetails for it?)
It looks at the first a href on URL1
It visits that a href URL (let's call it URL2) and calls getDetails on it
It visits the first a href on URL2 and calls getDetails
It visits the first a href on that page, reached from URL2, reached from URL1
It visits the first a href on the page after that, one level deeper again
And so on. It goes on and on UNTIL one sub-process has no a hrefs, or all of its a hrefs have been processed; only then does it go back to its parent node.
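
To make the traversal order concrete, here is a minimal sketch of the same depth-first pattern on a tiny in-memory link graph. The $links array, the followLinksUnbounded() function, and the URL names are all hypothetical stand-ins, not code from Crawler.php; the point is only that a page's second link is not touched until everything under its first link has finished.

```php
<?php
// Hypothetical in-memory "web": each page maps to the a href URLs found on it.
$links = [
    'url1' => ['url2', 'url5'],
    'url2' => ['url3'],
    'url3' => ['url4'],
    'url4' => [],
    'url5' => [],
];

$visitOrder = [];
$maxDepth = 0;

// Unbounded depth-first traversal: every link is followed to the bottom
// before the next sibling link on the same page is looked at.
function followLinksUnbounded(string $url, array $links, array &$visitOrder, int &$maxDepth, int $depth = 0): void
{
    $visitOrder[] = $url;
    $maxDepth = max($maxDepth, $depth);
    foreach ($links[$url] ?? [] as $site) {
        followLinksUnbounded($site, $links, $visitOrder, $maxDepth, $depth + 1);
    }
}

followLinksUnbounded('url1', $links, $visitOrder, $maxDepth);

echo implode(' -> ', $visitOrder), "\n"; // url1 -> url2 -> url3 -> url4 -> url5
echo "deepest nesting: $maxDepth\n";     // deepest nesting: 3
```

url5 is the second link on url1, but it is visited last, after the whole chain under url2. On a real site the chain under the first link can be hundreds of pages long, and every level keeps its own stack frame alive.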

Meaning

The original array of a href URLs is not fully processed until its sub-processes finish, and for the sub-processes to finish, the sub-sub-processes have to finish. Eventually you get to the point where this happens:
Allowed memory size of 16582912000 bytes exhausted (tried to allocate 20480 bytes)

Solution!

Line 166: change function followLinks($url) to function followLinks($url, $depth = 0)
Line 170: insert if ($depth >= 12) {return;} (replace 12 with how deep you want it to go)
Line 203: erase the line and replace it with if(isset($site)){$this->followLinks($site, $depth + 1);}
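
Put together, the three changes might look like the sketch below. DepthLimitedCrawler, its $links array, and the $visited list are hypothetical stand-ins so the example can run on its own; only the $depth parameter, the guard, and the $depth + 1 call correspond to the edits described above.

```php
<?php
// Hypothetical stand-in for the crawler; only the depth handling mirrors the fix.
class DepthLimitedCrawler
{
    /** @var array<string, string[]> page URL => links found on that page (assumed test data) */
    private array $links;
    private int $maxDepth;
    public array $visited = [];

    public function __construct(array $links, int $maxDepth = 12)
    {
        $this->links = $links;
        $this->maxDepth = $maxDepth;
    }

    public function followLinks($url, $depth = 0)
    {
        // Stop descending once the configured depth is reached.
        if ($depth >= $this->maxDepth) {
            return;
        }

        $this->visited[] = $url;
        $crawling = $this->links[$url] ?? [];

        foreach ($crawling as $site) {
            if (isset($site)) {
                // Pass the incremented depth down to the nested call.
                $this->followLinks($site, $depth + 1);
            }
        }
    }
}

// A chain of 100 pages: page0 -> page1 -> ... -> page99.
$links = [];
for ($i = 0; $i < 99; $i++) {
    $links["page$i"] = ['page' . ($i + 1)];
}
$links['page99'] = [];

$crawler = new DepthLimitedCrawler($links, 12);
$crawler->followLinks('page0');

echo count($crawler->visited), "\n"; // 12 (page0 through page11)
```

With the guard in place the call stack never grows past 12 followLinks() frames, no matter how long the chain of links is.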

Alternative Solution

public function count_method_occurrences($method_name)
{
    $backtrace = debug_backtrace();
    $count = 0;
    foreach ($backtrace as $trace) {
        if (isset($trace['class']) && isset($trace['function']) && $trace['function'] === $method_name) {
            $count++;
        }
    }
    return $count;
}

Call it via

if ($this->count_method_occurrences('followLinks') < 12) { foreach($crawling as $site) { /* existing loop body from line 202 */ } }

*Please note that the occurrence count returned this way is always one higher than $depth, because the outermost followLinks() call is counted as well.
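
The relationship between the backtrace count and $depth can be checked with a small sketch like the one below. BacktraceDemo and its $pairs property are hypothetical and exist only for this illustration; the counting method is the one from this issue, with DEBUG_BACKTRACE_IGNORE_ARGS added as an optional tweak so the backtrace does not copy argument values.

```php
<?php
// Hypothetical demo class: records [depth, backtrace count] at each nesting level.
class BacktraceDemo
{
    public array $pairs = [];

    public function count_method_occurrences($method_name)
    {
        // IGNORE_ARGS is an optional addition here; it skips argument values in the trace.
        $backtrace = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS);
        $count = 0;
        foreach ($backtrace as $trace) {
            if (isset($trace['class']) && isset($trace['function']) && $trace['function'] === $method_name) {
                $count++;
            }
        }
        return $count;
    }

    public function followLinks($depth = 0)
    {
        $this->pairs[] = [$depth, $this->count_method_occurrences('followLinks')];
        if ($depth < 3) {
            $this->followLinks($depth + 1);
        }
    }
}

$demo = new BacktraceDemo();
$demo->followLinks();

foreach ($demo->pairs as [$depth, $count]) {
    echo "depth=$depth occurrences=$count\n";
}
// depth=0 occurrences=1
// depth=1 occurrences=2
// depth=2 occurrences=3
// depth=3 occurrences=4
```

So a limit of < 12 on the occurrence count stops one level earlier than if ($depth >= 12) {return;} does; adjust the number if the two approaches need to match exactly.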
