[gpodder-devel] Results of the file naming test (your OPML files)

  • From: thp at perli.net (Thomas Perl)
  • Date: Thu, 08 May 2008 17:51:07 +0200

Hello!

I've finished the script and let it run through all the OPML files that
have been sent to me. All feeds seem to work great with the current file
naming scheme, except for two:

http://feeds.themerlinshow.com/TheMerlinShow
This feed always has "redirect.mp4" as the file name. The episode URLs
are redirects to the real files, and the chain resolves like this:

> $ HEAD -S http://feeds.themerlinshow.com/~r/TheMerlinShow/~5/125754107/redirect.mp4
> HEAD http://feeds.themerlinshow.com/~r/TheMerlinShow/~5/125754107/redirect.mp4 --> 302 Moved Temporarily
> HEAD http://www.podtrac.com/pts/redirect.mp4?http://media.libsyn.com/media/themerlinshow/tms_020_Jesse_Thorn_02_ipod.mp4 --> 302 Found
> HEAD http://media.libsyn.com/media/themerlinshow/tms_020_Jesse_Thorn_02_ipod.mp4 --> 302 Found
> HEAD http://cdn2.libsyn.com/themerlinshow/tms_020_Jesse_Thorn_02_ipod.mp4 --> 200 OK

This means that for this podcast, where every episode has "redirect.mp4"
as its base name, we can use an HTTP HEAD request to follow the
redirects and determine a valid file name. This is a suboptimal
solution, so I'm thinking about how we can avoid such situations.
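
A minimal sketch of the HEAD approach, assuming Python 3 and only the
standard library. The helper names (head_once, resolve_final_url,
filename_from_url) are made up for illustration and are not gPodder
code. Each hop is a single HEAD request, and the Location header is
followed by hand until a non-redirect status comes back:

```python
import http.client
import posixpath
from urllib.parse import unquote, urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head_once(url, timeout=10):
    """Send one HEAD request; return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def resolve_final_url(url, head=head_once, max_redirects=10):
    """Follow redirects with HEAD requests and return the last URL."""
    for _ in range(max_redirects):
        status, location = head(url)
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
        else:
            return url
    raise RuntimeError("too many redirects")


def filename_from_url(url):
    """Base name of the URL path, ignoring any query string."""
    return posixpath.basename(unquote(urlsplit(url).path))
```

The "head" parameter exists only so the redirect logic can be tried
with a fake lookup instead of real network traffic. Every episode still
costs one extra request per hop, which is part of why this stays a
fallback rather than the default.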


http://americanpublicmedia.publicradio.org/podcasts/xml/prairie_home_companion/news_from_lake_wobegon.xml
This one unfortunately constructs its URLs so that episodes from
different months but with the same day of the month end up with the
same file name, e.g.

http://download.publicradio.org/podcast/phc/2008/01/26_nflw_64.mp3
and
http://download.publicradio.org/podcast/phc/2008/04/26_nflw_64.mp3

I have to think about what we can do in such a situation. I am thinking
about adding a dictionary of filename => url pairs that records which
URL each downloaded file came from. If the dictionary says an existing
file came from the URL in question, we know that the filename
corresponds to that URL. If it came from a different URL, we have to
pick a different file name and (after downloading) save that name in
the dictionary. The only problem is that for every URL whose file
doesn't exist in the default location, we also have to look in the
dictionary for a generated filename. Another problem is that we want to
stay backwards-compatible with previous versions (i.e. downloads made
by previous versions should be detected and used).
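
To make the dictionary idea concrete, here is a rough in-memory sketch,
assuming Python 3. The class name FilenameRegistry and the "_2", "_3"
suffix scheme are hypothetical choices for illustration only, and
persisting the dictionary to disk is left out:

```python
import os
import posixpath
from urllib.parse import unquote, urlsplit


class FilenameRegistry:
    """Hypothetical filename => url map used to resolve name collisions."""

    def __init__(self):
        self.by_filename = {}  # filename -> url it was downloaded from

    def filename_for(self, url):
        # A URL that already has a recorded name keeps that name.
        for name, known_url in self.by_filename.items():
            if known_url == url:
                return name

        # Default name: base name of the URL path.
        base = posixpath.basename(unquote(urlsplit(url).path)) or "download"
        root, ext = os.path.splitext(base)

        # If another URL owns the default name, append a counter
        # (an assumed scheme, not something the post specifies).
        candidate, n = base, 1
        while candidate in self.by_filename:
            n += 1
            candidate = "%s_%d%s" % (root, n, ext)

        self.by_filename[candidate] = url
        return candidate
```

With the two Prairie Home Companion URLs above, the first one asked
gets "26_nflw_64.mp3" and the second gets "26_nflw_64_2.mp3". A file
left behind by an older version under the default name would simply be
claimed by the first URL that maps to it, which is only a guess and may
be wrong for exactly these colliding feeds.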

Any ideas?


Thanks,
Thomas


