[gpodder] Re: gpodder doesn't recognize the date

  • From: Thomas Perl <m@xxxxxx>
  • To: gpodder@xxxxxxxxxxxxx
  • Date: Sun, 12 Feb 2012 11:47:05 +0100

Hi Ilona,

On Sat, Feb 11, 2012 at 10:03:37AM -0800, Ilona Rabinovich wrote:
> I use command-line version (gpo) on my linux (Fedora 16) server. As
> a side note, it seems to be the only podcast client that stores the
> metadata in well-designed database and allows me to read this and
> use on a webpage. Very nicely done - thank you!

I'm glad you find it useful. Is that webpage some private pet project or
is it available somewhere? I'd be interested to have a look at how you
use it - it might also be interesting for the gPodder Web UI.

> So, two related problems on one particular feed -
> http://radio.nationalreview.com/radioderb/radioderb.xml
> 
> First, for some reason, gpo doesn't get pubDate field. It looks to
> me the same way, as it does on other feeds, where it comes across
> fine, for example,
> 
> <pubDate>Fri, 10 Feb, 2012 09:00:00 EST</pubDate>

After some trial and error (one can use feedparser._parse_date() to try
parsing arbitrary strings) I found out that the problem is the comma
after the month. Removing it lets feedparser get the correct date.
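
As a rough illustration of that kind of workaround (this is not the code
from the gPodder commit, and it uses only the Python standard library
instead of feedparser), one could strip the stray comma before parsing.
The helper name parse_pubdate and the "return 0 on failure" convention
are assumptions made for this sketch:

```python
import re
from email.utils import parsedate_tz, mktime_tz

# Matches a month name (short or long form) directly followed by a comma,
# e.g. the "Feb," in "Fri, 10 Feb, 2012 09:00:00 EST".
_MONTH_COMMA = re.compile(
    r'\b((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*),')


def parse_pubdate(value):
    """Return a UTC timestamp for a pubDate string, or 0 if unparseable.

    Hypothetical helper for illustration only.
    """
    # Drop the comma after the month, keep the month name itself.
    cleaned = _MONTH_COMMA.sub(r'\1', value)
    parsed = parsedate_tz(cleaned)
    if parsed is None:
        return 0
    return mktime_tz(parsed)


# The date from the feed: 09:00 EST, i.e. 14:00 UTC on 2012-02-10.
timestamp = parse_pubdate('Fri, 10 Feb, 2012 09:00:00 EST')
```

The comma after the weekday is left alone, since that one is part of
the normal date format.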

I've filed a bug report at the feedparser issue tracker, please
subscribe there if you want to be informed about updates:

    http://code.google.com/p/feedparser/issues/detail?id=327

Anyway, I used your mail as an excuse to add a workaround to gPodder's
code and also change the method of getting the pubDate slightly, see:

    http://gpodder.org/commit/ee2a5b7e

The fix will be incorporated into the next gPodder version. If
feedparser fixes this issue and releases a new version, that fix will be
available to you even if you don't upgrade gPodder (strictly speaking,
it's an issue in the RSS feed that feedparser could work around).

> WARNING:gpodder.model:Using download URL as GUID for <title field>

Yes, this is deliberate. When a feed has no GUIDs, we fall back to the
download URL as GUID. While this usually works fine, it could lead to
problems for some feeds (e.g. a feed without GUIDs whose download URLs
are temporary links that change all the time).

The fix here is to ask the feed author to add GUIDs to the feed (even
if these GUIDs are just the download URLs themselves). gPodder assumes
the GUID stays the same, so this works even with temporary download
URLs: the enclosure URL can change as long as the GUID does not.
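
To make the GUID fallback concrete, here is a minimal sketch. It is not
gPodder's actual code: the function name episode_guid and the plain
dicts with 'guid' and 'enclosure_url' keys are assumptions for the
example.

```python
def episode_guid(entry):
    """Pick a stable identifier for a feed entry (a plain dict here).

    Hypothetical helper: prefer the feed's GUID, fall back to the
    download URL when the feed does not provide one.
    """
    guid = entry.get('guid')
    if guid:
        return guid
    # No GUID in the feed: the download URL is the only identifier left.
    return entry.get('enclosure_url')


# With a GUID, a changing (temporary) download URL is harmless.
old = {'guid': 'episode-42', 'enclosure_url': 'http://example.com/tmp/abc.mp3'}
new = {'guid': 'episode-42', 'enclosure_url': 'http://example.com/tmp/xyz.mp3'}
same_with_guid = episode_guid(old) == episode_guid(new)

# Without a GUID, the same episode looks like a new one after the URL changes.
old_no_guid = {'enclosure_url': 'http://example.com/tmp/abc.mp3'}
new_no_guid = {'enclosure_url': 'http://example.com/tmp/xyz.mp3'}
same_without_guid = episode_guid(old_no_guid) == episode_guid(new_no_guid)
```

Here same_with_guid is True and same_without_guid is False, which is
the problem case described above.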

> Well, the feed is indeed missing GUID, so the warning is valid (I
> don't remember seeing it in v2). The problem is that for whatever
> reason, it overwrites the data in the database, including my
> manually set published field. I know I was setting the date manually
> in v2, and it didn't get wiped out between downloads.

One thing is that you shouldn't be required to update the pubDate in the
database manually. On the other hand, if the pubdate in the database is
valid, and the pubdate from the feed is bogus (e.g. 0), we should indeed
not write the pubdate into the database.

Can you please open a bug report for this? I'm not sure if we can fix
this, as it could require an additional database read before a database
update, but I'll have a look at it.

The bugtracker URL to file the bug report: http://bugs.gpodder.org/
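
One possible way to keep a valid stored pubdate without a separate
database read is to let the UPDATE statement itself decide. This is only
a sketch under assumptions: the table name "episode", the columns "guid"
and "published", and the function update_published are made up for the
example and may not match gPodder's real schema or code.

```python
import sqlite3


def update_published(db, guid, feed_published):
    """Write the feed's pubdate only if it is usable (greater than 0).

    Hypothetical helper: a bogus value (0 or None) leaves the stored
    "published" value untouched, all within a single UPDATE.
    """
    value = feed_published or 0
    db.execute(
        "UPDATE episode "
        "SET published = CASE WHEN ? > 0 THEN ? ELSE published END "
        "WHERE guid = ?",
        (value, value, guid))


# Small in-memory demo with a manually set pubdate.
db = sqlite3.connect(':memory:')
db.execute("CREATE TABLE episode (guid TEXT PRIMARY KEY, published INTEGER)")
db.execute("INSERT INTO episode VALUES (?, ?)", ('episode-42', 1328882400))

# The feed delivers a bogus date: the stored value is kept.
update_published(db, 'episode-42', 0)
kept = db.execute(
    "SELECT published FROM episode WHERE guid = ?",
    ('episode-42',)).fetchone()[0]

# The feed delivers a valid date: the stored value is replaced.
update_published(db, 'episode-42', 1328968800)
replaced = db.execute(
    "SELECT published FROM episode WHERE guid = ?",
    ('episode-42',)).fetchone()[0]
```

Whether this fits depends on how the update code is structured; the
point is only that the "keep the old value" decision can live in the
SQL statement.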

> With my PHP and C/Java knowledge I tried to troubleshoot - but I
> think I hit the brick wall. I can't figure out where does the
> program get the date, and what is different in this date.

I've posted some Python code snippets (or interactive shell logs) to
issue 327 in feedparser, which should clue you in on how it works:

    http://code.google.com/p/feedparser/issues/detail?id=327

Thanks for your detailed report!

HTH :)
Thomas

