[gpodder-devel] Woodchuck patches: woodchuck integration

From: neal at walfield.org (Neal H. Walfield)
Date: Thu, 04 Aug 2011 15:40:27 +0200
Hi Thomas,

First: thanks for the review, it's clear you put a fair amount of time
into it.

At Wed, 3 Aug 2011 12:00:08 +0200,
Thomas Perl wrote:
> On Tue, Aug 02, 2011 at 04:50:22PM +0200, Neal H. Walfield wrote:
> > This patch modifies the gtk frontend to implement the first of these
> > two callbacks.  It turns out this was surprisingly easy.  If this
> > approach is acceptable, I'll modify the other frontends similarly.
> > [...]
> > http://hssl.cs.jhu.edu/~neal/woodchuck/ports/gpodder/commits/36a4a3bf90c3036a99c9bfaa938dd5cd0127024b
> 
> Thanks, some feedback:
> 
>  * The "woodchuck" module should go in src/gpodder/plugins/
>  * The DEFAULT_PLUGINS in src/gpodder/__init__.py should be modified
>    accordingly

Done.

>  * What's the purpose of "mainthread.py"? It's not documented :/
>  * Please remove the shebang line (#!/usr/bin/env python2.5) in
>    mainthread.py (if you want to run it, you can use "python -m
>    gpodder.mainthread") - also, don't hardcode python2.5

I'm not sure what you mean by not documented, the two functions
include a doc string :).

mainthread is a module I wrote in another context to ensure that a
function is executed in the context of the main thread.  This
functionality is needed, because DBus can only be used from a single
thread, most Woodchuck calls use DBus, and download queue threads
result in calls to the Woodchuck module by way of the hooks, such as
on_episode_save callback.

A reason to include mainthread.py would be to make bug fixing easier:
mainthread.py is used by other projects (well, so far just FeedingIt,
but this issue appears to be a common one).  Thus, if a bug in found
in mainthread.py, it is easier to propagate the fix: just copy the new
version.

As discussed on irc, I've implemented a point solution and have
integrated the code directly into the woodchuck module.

>  * Constants (like refresh_interval in woodchuck.py) should be uppercase

Ok.

>  * The "config" object is never used in mywoodchuck - is it really
>    necessary to pass this object through? (if so, why?)

That's a bug, which I've now fixed, thanks.  (The config object was
used in previous versions of the patch in which Woodchuck queued
downloads directly.  I failed to remove it in the latest revision.)

>  * The extracting of a single podcast channel via URL is done multiple
>    times in mywoodchuck's code. Better write a get_podcast(url) function
>    that will take care of the filtering and logging of the warning,
>    returning None when the podcast channel is not found
>  * The same is probably true for get_episode(podcast, url), where the
>    first argument is already a podcast channel object and the second is
>    the URL of the episode (+returning None, etc)

Corrected.

>  * For episodes, the unique identifier should be the GUID, because it
>    might be possible that feeds publish different items at the same URL
>    (yes, this is very crappy, but it has happened before..)

Thanks for pointing this out.  I've changed this accordingly.

>  * The changes support that you implemented in the model is only used
>    for the code in on_podcast_save? If so, what's the problem with
>    unconditionally copying the title to the human_readable_name (or
>    adding a hook for on_podcast_title_changed). The same is for
>    on_podcast_url_changed - there are only a few cases (HTTP redirect
>    and after gpodder.net subscription uploads) where the URL of a
>    podcast really changes in gPodder.

The code is also used in on_episode_saved for detecting changes to the
published, http_last_modified and last_playback attributes.

Copying the title or url unconditionally is possible, but expensive:
this results in a DBus call.

As for implementing specialized hooks a la on_podcast_title_changed:
how do you envision the implementation in model.py?  Adding a property
setting for title et al. that invokes the appropriate hook?

>  * In the any statements, if you don't use list comprehension, but
>    rather a generator expression, it might perform better - any can stop
>    after the first positive result, the list comprehension builds a list
>    of booleans and then any uses this list to determine its result. Just
>    leaving out the square brackets should do the trick :)

Good point.  Changed.

>  * Please adhere to PEP-8[1] where possible. Don't add spaces after
>    function names (NOT: self.auto_download (stream, obj)).

Sorry, that inconsistency of PEP-8 confuses me ('f(x)' but 'if (x)'?).
I've correct this.  I reread PEP-8 and reviewed the code and I didn't
notice any other other violations.

>  * You can probably shrink the length of "check_subscriptions" by using
>    set() and its functions difference() and intersection().

It may be a couple of lines shorter, but I doubt it is faster as I
need both (A - B) and (B - A).

>  * For idle_add, maybe you want to use gpodder.util.idle_add? ;)

I guess you mean for the mainthread.py code, as woodchuck was already
using util.idle_add.  As I mentioned above, that code was kept
intentionally generic.  If I'm going to use gpodder.util.idle_add,
then it needs to be changed to guarantee that the function is executed
in the main loop.  I've changed it now.


I've test the gtkui and gpo.  Unfortunately, python-pyside doesn't
want to installed on my Debian Squeeze box insisting that I install
some hundreds of packages from unstable.  I would appreciate it if you
would test this for me.

As usual, you can pull the woodchuck branch from:

  http://hssl.cs.jhu.edu/~neal/woodchuck/ports/gpodder.git/

Thanks,

Neal
References:
- [gpodder-devel] Woodchuck patches: woodchuck integration
  - From: Neal H. Walfield
- [gpodder-devel] Woodchuck patches: woodchuck integration
  - From: Thomas Perl
[gpodder-devel] Woodchuck patches: woodchuck integration

Other related posts: