I had a case where the crawler just seemed to get stuck. It turned out to be a dml_write_exception thrown from within crawl() in classes/robot/crawler.php. In my case, a page contained a link longer than 1033 characters (the size of the url field).
I'm not sure of the best solution to this. At some point you have to have a field size limit, and 1033 seems big enough, but the Moodle Wiki causes problems here: some of its links exceed that length, which triggers the underlying dml_write_exception.
I will submit a patch which at least catches these exceptions, logs the error using mtrace(), and then continues processing the other URLs in the queue. This is good enough for me at this point. Ideally there would also be some way of recording this information so an administrator could pull a report alongside the broken links report, but that's beyond the amount of time I have to work on this right now.
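The shape of the described patch could look roughly like the following. This is a sketch only: the loop structure, method signature, and variable names are illustrative assumptions, not the actual code in classes/robot/crawler.php; mtrace() and dml_write_exception are real Moodle APIs.

```php
<?php
// Hypothetical sketch of the patch: wrap each crawl step so that a
// DB write failure (e.g. an over-long URL) skips that URL instead of
// stalling the whole queue. Names below are illustrative.
foreach ($queue as $node) {
    try {
        $this->crawl($node->url, $verbose);
    } catch (dml_write_exception $e) {
        // Log via mtrace() so the cron output records the skipped URL,
        // then continue with the rest of the queue.
        mtrace("Skipping {$node->url}: " . $e->getMessage());
    }
}
```

The key point is that the catch is scoped per URL, so one bad record cannot wedge the crawler.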
@pennedav we've recently refactored these tables to use a persistent, and importantly we no longer query directly on the url but on a hash of the url, which means we should be able to just swap this from a char to a text field of arbitrary length. Will you still be submitting a patch?
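To illustrate why the hash-based lookup removes the length constraint: once the WHERE clause keys on a fixed-length hash column rather than the url itself, the url column can be TEXT. The table and column names below are assumptions for illustration, not confirmed against the plugin's schema; $DB->get_record() is the standard Moodle DML call.

```php
<?php
// Sketch, assuming hypothetical table/column names: look up a URL record
// by a fixed-length hash instead of the (possibly very long) URL itself.
$urlhash = sha1($url); // always 40 chars, regardless of URL length

// Only the hash appears in the query, so the url column itself can be
// an unindexed TEXT field with no practical length limit.
$record = $DB->get_record('tool_crawler_url', ['urlhash' => $urlhash]);
```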
@brendanheywood FYI I've been laid off from my employer, for whom I was working on this. Feel free to use the patch as-is or reject the pull as you see fit. Sounds like your refactoring will address the issue anyway.
This is an issue I alluded to in issue #127:

> I'm not sure of the best solution to this. At some point you have to have a field size limit, and 1033 seems big enough, but the Moodle Wiki causes problems here: some of its links exceed that length, which triggers the underlying dml_write_exception.
>
> I will submit a patch which at least catches these exceptions, logs the error using mtrace(), and then continues processing the other URLs in the queue. This is good enough for me at this point. Ideally there would also be some way of recording this information so an administrator could pull a report alongside the broken links report, but that's beyond the amount of time I have to work on this right now.