crawler crashes if there is invalid url #139

Closed
tuanngocnguyen opened this issue May 27, 2020 · 3 comments

tuanngocnguyen commented May 27, 2020

I have come across an issue where tool_crawler crashes when a course contains an invalid URL:

<a href="http://<iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/2YULdjmg3o0&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen></iframe>"><iframe width="560" height="315" src="https://www.youtube.com/embed/2YULdjmg3o0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></a>

Error stack when running "sudo -u www-data php admin/tool/crawler/cli/crawler.php --verbose=2":

...
 - Found link to:                      / http://<iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/2YULdjmg3o0&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen></iframe> => http://<iframe width="560" height="315" src="https://www.youtube.com/embed/2YULdjmg3o0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
Default exception handler: Coding error detected, it must be fixed by a programmer: A lock was created but not released at:
[dirroot]/admin/tool/crawler/classes/robot/crawler.php on line 552

 Code should look like:

 $factory = \core\lock\lock_config::get_lock_factory('type');
 $lock = $factory->get_lock(160583528);
 $lock->release();  // Locks must ALWAYS be released like this.

 Debug: 
Error code: codingerror
* line 117 of /lib/classes/lock/lock.php: coding_exception thrown
* line 567 of /admin/tool/crawler/classes/robot/crawler.php: call to core\lock\lock->__destruct()
* line 81 of /admin/tool/crawler/lib.php: call to tool_crawler\robot\crawler->process_queue()
* line 55 of /admin/tool/crawler/cli/crawler.php: call to tool_crawler_crawl()

!!! Coding error detected, it must be fixed by a programmer: A lock was created but not released at:
[dirroot]/admin/tool/crawler/classes/robot/crawler.php on line 552

tuanngocnguyen changed the title from "crawler crash if there is invalid url" to "crawler crashes if there is invalid url" on May 27, 2020

tuanngocnguyen (author) commented

This merge request may resolve the issue:

#128

tuanngocnguyen (author) commented

Actual error:

 - Found link to:                      / http://<iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/2YULdjmg3o0&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen></iframe> => http://<iframe width="560" height="315" src="https://www.youtube.com/embed/2YULdjmg3o0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
Default exception handler: error/invalidpersistenterror Debug: url: Data submitted is invalid
Error code: invalidpersistenterror
$a contents: Data submitted is invalid
* line 467 of /lib/classes/persistent.php: core\invalid_persistent_exception thrown
* line 392 of /admin/tool/crawler/classes/robot/crawler.php: call to core\persistent->create()
* line 877 of /admin/tool/crawler/classes/robot/crawler.php: call to tool_crawler\robot\crawler->mark_for_crawl()
* line 847 of /admin/tool/crawler/classes/robot/crawler.php: call to tool_crawler\robot\crawler->link_from_node_to_url()
* line 621 of /admin/tool/crawler/classes/robot/crawler.php: call to tool_crawler\robot\crawler->parse_html()
* line 568 of /admin/tool/crawler/classes/robot/crawler.php: call to tool_crawler\robot\crawler->crawl()
* line 81 of /admin/tool/crawler/lib.php: call to tool_crawler\robot\crawler->process_queue()
* line 55 of /admin/tool/crawler/cli/crawler.php: call to tool_crawler_crawl()

!!! error/invalidpersistenterror !!!
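
Read together, the two traces suggest that the invalidpersistenterror thrown while queueing the malformed link unwinds process_queue() before the lock taken there is released, so the unreleased-lock coding error is what gets reported first. The sketch below illustrates one general way to guard against both problems: validate each link before queueing it, and release the lock in a finally block. It is not the plugin's code and not necessarily what #128 does; `demo_lock`, `demo_is_crawlable_url()` and `demo_queue_links()` are hypothetical names used only for illustration, and the check relies on PHP's built-in `filter_var()` rather than any Moodle API.

```php
<?php
/**
 * Hypothetical stand-in for a lock handle (NOT Moodle's \core\lock\lock).
 * It only records whether release() was called.
 */
final class demo_lock {
    private bool $released = false;

    public function release(): void {
        $this->released = true;
    }

    public function is_released(): bool {
        return $this->released;
    }
}

/**
 * Hypothetical pre-check: accept only syntactically valid http(s) URLs.
 */
function demo_is_crawlable_url(string $url): bool {
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true);
}

/**
 * Queues the links found on one page while holding a lock.
 * Returns the links that were skipped as invalid.
 */
function demo_queue_links(demo_lock $lock, array $links, array &$queue): array {
    $skipped = [];
    try {
        foreach ($links as $url) {
            if (demo_is_crawlable_url($url)) {
                $queue[] = $url;
            } else {
                $skipped[] = $url; // Record and move on instead of throwing.
            }
        }
    } finally {
        // Runs on normal return and on any exception, so the lock never leaks.
        $lock->release();
    }
    return $skipped;
}

$lock = new demo_lock();
$queue = [];
$skipped = demo_queue_links($lock, [
    'https://www.youtube.com/embed/2YULdjmg3o0',
    'http://<iframe width="560" height="315"></iframe>',
], $queue);
```

With this shape, a malformed href like the one in this report ends up in `$skipped` rather than aborting the run, and the lock is released whether or not an exception escapes the loop.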

tuanngocnguyen (author) commented

Closed as fixed.
