First, a suggestion for everyone: be sure to scrub your logs if you share
them so you don't send out sensitive info like your ComicVine API key.
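For anyone who wants to script that, here is a rough sketch of one way to
do it. It assumes the key shows up in the log as a URL query parameter
named api_key, which may not cover every place the key can appear, so it is
still worth reading the file over before posting. The scrub_log name is
made up for this example.

```python
import re

# Assumption: the key is logged as a query parameter, e.g. "...?api_key=XXXX&...".
# The value runs until the next "&", whitespace, or quote character.
_API_KEY = re.compile(r"(api_key=)[^&\s\"']+", re.IGNORECASE)


def scrub_log(text):
    """Return text with every api_key=... value replaced by a placeholder."""
    return _API_KEY.sub(r"\g<1>REDACTED", text)


if __name__ == "__main__":
    # Made-up log line with a fake key, for illustration only.
    sample = "GET https://comicvine.gamespot.com/api/issue/?api_key=abc123&format=json"
    print(scrub_log(sample))
```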
I think the issue there is that the description for that comic is HUGE. The
max buffer is 262,144 bytes and the description on the issue being scraped
is 319,697 bytes. I found a way to increase that buffer size (I'm going to
set it to a default of 16MB) to see if it fixes the issue. I tested it by
scraping a comic as if it were The Sandman #75 (the one causing the problem
here) and recreated the bug. The fix appears to work correctly.
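To put those numbers side by side, here is a small sketch of the
arithmetic. It is not the scraper's actual code, the names are invented for
the example, and it assumes "16MB" means 16 * 1024 * 1024 bytes.

```python
OLD_LIMIT = 262144             # old max buffer in bytes (256 * 1024)
NEW_LIMIT = 16 * 1024 * 1024   # assumed meaning of "16MB", in bytes
DESCRIPTION_SIZE = 319697      # size of the description in bytes


def overflow_bytes(size, limit):
    """Return how many bytes size exceeds limit by, or 0 if it fits."""
    return max(0, size - limit)


if __name__ == "__main__":
    print("over old limit by:", overflow_bytes(DESCRIPTION_SIZE, OLD_LIMIT))
    print("over new limit by:", overflow_bytes(DESCRIPTION_SIZE, NEW_LIMIT))
```

With the old limit the description is 57,553 bytes too large. With the new
one it fits with plenty of room to spare.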
I also see that I forgot to include the imprint mapping in the refactored
scraping code, so that's getting pulled in as well when this is pushed to a
PR later today.
On Mon, Jul 20, 2020 at 5:54 PM Guy Incognito <dmarc-noreply@xxxxxxxxxxxxx>
wrote:
Only happening with a development build, I was able to scrape the comic in
question in 0.6.
This happens for a few comics – haven’t looked into it too much just yet
to see if there’s a pattern. ComicVine page for the comic can be found
here
<https://comicvine.gamespot.com/hellblazer-in-the-line-of-fire-1-vol-10/4000-479366/>.
A partial CX log is attached – showing from where the failure begins to
where it ends.