[gpodder-devel] util.remove_html_tags optimizations

  • From: me at nikosapi.org (nikosapi)
  • Date: Wed, 19 Mar 2008 23:16:08 -0500

Hello again!

While hunting a bug down today I found some code that was slowing down 
gPodder's loading time :D

The original bug was that in the episode descriptions I was still seeing stuff 
like &#8217;. This is because Python's codepoint2name dict doesn't include 
all of the possible Unicode characters. So I replaced the old code with a 
regex that converts the codepoint numbers directly to Unicode characters. In 
a quick benchmark, the old code accounted for 3.28 sec of load time, whereas 
the new code uses < 0.1 sec :)
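
For readers without the attachment, here is a rough sketch of the regex idea 
described above. It is not the contents of the patch: the names 
NUMERIC_ENTITY and replace_numeric_entities are made up for illustration, and 
it is written for Python 3 (chr), whereas Python 2 code of the time would 
have used unichr.

```python
import re

# Matches decimal numeric character references such as &#8217;
# (hypothetical name, not taken from the gPodder source)
NUMERIC_ENTITY = re.compile(r'&#(\d+);')


def replace_numeric_entities(text):
    """Replace decimal references like &#8217; with the character they encode."""
    def to_char(match):
        codepoint = int(match.group(1))
        try:
            return chr(codepoint)
        except (ValueError, OverflowError):
            # Number is not a valid code point: leave the reference untouched
            return match.group(0)

    # One pass over the text, instead of one replace per known entity
    return NUMERIC_ENTITY.sub(to_char, text)
```

A single regex pass like this avoids looping over every entry of an entity 
table for each description, which is presumably where the time savings come 
from.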

Here are some examples of feeds which include those weird codepoints:
- http://feeds.feedburner.com/doctorow_podcast
- http://feeds.feedburner.com/nlo

Now what's really cool is that when I launch gPodder, it's ready to go in less 
than 2 seconds! (on an Intel E6300)

Let me know what you guys think,

nick
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gpodder-r615-remove_html_optimizations.patch
Type: text/x-diff
Size: 1118 bytes
Desc: not available
URL: <https://lists.berlios.de/pipermail/gpodder-devel/attachments/20080319/a063ea97/attachment.patch>
