[gpodder] Re: gpodder doesn't recognize the date

  • From: Felix Rabinovich <felix@xxxxxxxxxxxxxx>
  • To: gpodder@xxxxxxxxxxxxx
  • Date: Mon, 13 Feb 2012 20:57:26 -0800

Hi Thomas,

First, I want to clarify - I have no idea how it happened, but I accidentally sent my email from my wife's account. Anyway, I am not trying to impersonate her... my name is Felix :)

Now to your points.

On 2/12/2012 2:47 AM, Thomas Perl wrote:
Hi Ilona,

On Sat, Feb 11, 2012 at 10:03:37AM -0800, Ilona Rabinovich wrote:
I use the command-line version (gpo) on my Linux (Fedora 16) server. As
a side note, it seems to be the only podcast client that stores its
metadata in a well-designed database and allows me to read it and
use it on a webpage. Very nicely done - thank you!
I'm glad you find it useful. Is that webpage some private pet project or
is it available somewhere? I'd be interested to have a look at how you use
it - also it might be interesting for the gPodder Web UI.
It is a tiny internal front end to gPodder with 2 users (me and my wife). The source code is here: http://pastebin.com/xZRe16uV . Nothing special - the PHP part took me less than an hour to write. The main thing was to figure out how the gPodder database works.

So, two related problems on one particular feed -
http://radio.nationalreview.com/radioderb/radioderb.xml

First, for some reason, gpo doesn't get the pubDate field. To me it
looks the same as it does on other feeds, where it comes across
fine. For example:

<pubDate>Fri, 10 Feb, 2012 09:00:00 EST</pubDate>
After some trial and error (one can use feedparser._parse_date() to try
parsing arbitrary strings) I found out that the problem is the comma
after the month. Removing it lets feedparser get the correct date.
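
A minimal sketch of the comma workaround described above, using only the Python standard library. This is not the code from the gPodder commit and does not call feedparser; the helper name parse_pubdate and its return convention (0 on failure) are assumptions made for illustration.

```python
import re
from email.utils import mktime_tz, parsedate_tz

# Matches an abbreviated English month name directly followed by a comma.
_MONTH_COMMA = re.compile(r"\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec),")


def parse_pubdate(value):
    """Return a UTC Unix timestamp for an RFC 822-style date, or 0 on failure.

    Hypothetical helper: it drops a stray comma after the month name
    before handing the string to the standard library parser.
    """
    cleaned = _MONTH_COMMA.sub(r"\1", value)
    parsed = parsedate_tz(cleaned)
    if parsed is None:
        return 0
    return int(mktime_tz(parsed))
```

With this helper, "Fri, 10 Feb, 2012 09:00:00 EST" and the same string without the extra comma both give 1328882400 (14:00:00 UTC on that day).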

I've filed a bug report at the feedparser issue tracker, please
subscribe there if you want to be informed about updates:

     http://code.google.com/p/feedparser/issues/detail?id=327

Anyway, I used your mail as an excuse to add a workaround to gPodder's
code and also change the method of getting the pubDate slightly, see:

     http://gpodder.org/commit/ee2a5b7e

The fix will be incorporated into the next gPodder version - if
feedparser fixes this issue and releases a new version, this fix will be
available to you even if you don't upgrade gPodder (it's a feedparser
issue or rather, it's an issue in the RSS feed that feedparser could
work around).
It looks like, thanks to your ("our") uncanny analysis, a feedparser update is coming soon. I'm not sure I understand your point about getting the feedparser fix without upgrading gPodder. Are you including the feedparser library from the web (the way people include jQuery from the Google or Microsoft CDN), so that when the feedparser guys post their update, your code will automagically pull the latest? Not that it matters too much - I don't have a problem with updating gPodder if necessary...
WARNING:gpodder.model:Using download URL as GUID for <title field>
Yes, this is deliberate. We depend on the download URL as GUID, and
while this usually works fine, it could lead to problems for some feeds
(e.g. a feed without GUIDs and with the download URLs being temporary
download links that change all the time).

The fix here is to tell the feed author to add GUIDs to the feed (even
if these GUIDs are just the download URLs themselves - gPodder assumes
the GUID stays the same, and this even works with temporary download
URLs - change the enclosure URL, but keep the GUID the same).
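
The fallback described above can be sketched roughly as follows. The dictionary keys ("guid", "enclosure_url", "title"), the function name and the logger name are assumptions made for this example, not gPodder's internal data model.

```python
import logging

logger = logging.getLogger("example.model")  # hypothetical logger name


def episode_guid(entry):
    """Pick a stable identifier for a feed entry given as a plain dict.

    The "guid", "enclosure_url" and "title" keys are assumptions made
    for this sketch.
    """
    guid = entry.get("guid")
    if guid:
        return guid
    # No GUID in the feed: fall back to the download URL and warn,
    # because a changing URL would make the episode look new again.
    logger.warning("Using download URL as GUID for %s", entry.get("title"))
    return entry["enclosure_url"]
```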

Well, the feed is indeed missing GUID, so the warning is valid (I
don't remember seeing it in v2). The problem is that for whatever
reason, it overwrites the data in the database, including my
manually set published field. I know I was setting the date manually
in v2, and it didn't get wiped out between downloads.
One thing is that you shouldn't be required to update the pubDate in the
database manually. On the other hand, if the pubdate in the database is
valid, and the pubdate from the feed is bogus (e.g. 0), we should indeed
not write the pubdate into the database.

Can you please open a bug report for this? I'm not sure if we can fix
this, as it could require an additional database read before a database
update, but I'll have a look at it.
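
One possible way to avoid the extra database read is to put the check into the UPDATE statement itself. The sketch below is only an illustration under assumptions: the table and column names (episode, guid, title, published) and the function name are hypothetical, and this is not how gPodder is known to handle it.

```python
import sqlite3


def update_episode(conn, guid, title, published):
    """Update one episode row without clobbering a good stored date.

    Table and column names are assumptions for this sketch. A published
    value of 0 or less (or NULL) is treated as bogus and leaves the
    stored value untouched.
    """
    conn.execute(
        """
        UPDATE episode
           SET title = ?,
               published = CASE WHEN ? > 0 THEN ? ELSE published END
         WHERE guid = ?
        """,
        (title, published, published, guid),
    )
    conn.commit()
```

Calling update_episode(conn, guid, title, 0) would then refresh the title but keep whatever published value is already stored, including one set by hand.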

The bugtracker URL to file the bug report: http://bugs.gpodder.org/
https://bugs.gpodder.org/show_bug.cgi?id=1566. And sure - if the feedparser issue is resolved, this overwriting issue will become moot.
With my PHP and C/Java knowledge I tried to troubleshoot - but I
think I hit a brick wall. I can't figure out where the program
gets the date, or what is different about this date.
I've posted some Python code snippets (or interactive shell logs) into
issue 327 in feedparser, this should clue you in on how it works:

     http://code.google.com/p/feedparser/issues/detail?id=327

Thanks for your detailed report!
Thank *you* for a great program!
HTH :)
Thomas

