Hi,

On Sat, Jun 04, 2011 at 05:42:03PM +0200, Neal H. Walfield wrote:
> At Sat, 4 Jun 2011 15:14:29 +0400,
> Justin Forest wrote:
> > I think it is a good idea to use an external download manager. Not
> > only for mobile devices, but in general. So that gPodder would focus
> > on podcasting and not deal with bandwidth throttling, etc.
>
> Woodchuck isn't so much about performing downloads as it is about
> scheduling them. That is, Woodchuck's primary goal is to determine
> when it is a good time to download data and which data to download.
> (That said, it does provide a simple downloader for objects accessible
> over http and https.)

So, to what extent does it differ in functionality from iphbd / the
"Heartbeat Daemon" on Maemo 5, which is used to make sure that all
network-using applications do network-intensive requests at the same
time to save power?

http://wiki.maemo.org/Documentation/Maemo_5_Developer_Guide/Architecture/System_Software#Heartbeat_Daemon_.28Heartbeatd.29

> > My understanding is that there would need to be an API to queue an URL
> > for downloading,
>
> Right. The basic idea is that gPodder registers a stream for each
> podcast subscription (pywoodchuck.stream_register). At some point,
> Woodchuck makes an upcall telling gPodder that it is a good time to
> update one or more streams (pywoodchuck.stream_update_cb). (If
> gPodder is not running, Woodchuck first starts it using
> org.freedesktop.DBus.StartServiceByName.) gPodder then updates the
> feeds and reports whether the update was successful
> (pywoodchuck.stream_updated). For each new episode, gPodder registers
> a new object (pywoodchuck.object_register).

Why does Woodchuck have to know so much domain-specific information
about a given application?
The "good time to update a stream" call, in my opinion, doesn't need to
know about this - just make sure you have a good interval, do the
online/offline checking, maybe have some minimum and maximum interval,
and try to sync this interval as well as possible between the
registered apps. Also, what does Woodchuck do with the information
about whether or not an update was successful?

For the episodes, I again don't really get the point - why does gPodder
need to register each episode with Woodchuck? Isn't a simpler interface
like "I've got a total of 120MB of data that I want to download, can I
download now or will you tell me when I can?" more suitable and not so
object-specific? What do you gain (apart from API complexity ;) by
having all these objects floating around between an application and
Woodchuck?

> When the user listens to a podcast, gPodder reports this to Woodchuck
> (pywoodchuck.object_used). That way, Woodchuck can learn the user's
> preferences.

Isn't that something that requires knowledge of the data in question?
If you know that object 01abcd has been listened to, how can Woodchuck
determine whether or not the user is interested in object bb0a256?
Also, isn't downloading detached from consuming/listening to files? I
might listen to a given episode for the first time two weeks after it
has been downloaded - of what use would this be to Woodchuck?

Does Woodchuck keep track of all downloads all of the time? What
influence does this have on memory consumption and performance when
looking up a given download in a local database?

Also, does this model take streaming of episodes into account? It's
basically a simultaneous download + consumption, as far as managing
network traffic and usage are concerned.

Will Woodchuck be aware of and take into account things like traffic
limits of 3G connections (e.g. my current plan offers 3GB of traffic
per month, and the operator charges a premium for every MB above
that)?
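
To make the simpler, size-based interface I'm asking about a bit more
concrete, here is a rough Python sketch. This is not Woodchuck's API:
DownloadArbiter, request_download, connectivity_changed and the quota
handling are all names and behaviour I made up for illustration, and
the 3GB figure is just the allowance of my own plan.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

MB = 1024 * 1024


@dataclass
class DownloadArbiter:
    """Hypothetical size-based arbiter; not part of Woodchuck."""
    online: bool = False
    metered: bool = True             # e.g. 3G with a traffic allowance
    quota_left: int = 3 * 1024 * MB  # bytes left this month (assumed 3GB plan)
    waiting: List[Tuple[int, Callable[[], None]]] = field(
        default_factory=list, init=False)

    def _allowed(self, size: int) -> bool:
        if not self.online:
            return False
        # On a metered link, only allow what still fits into the allowance.
        return not self.metered or size <= self.quota_left

    def _account(self, size: int) -> None:
        # Only metered traffic counts against the monthly allowance.
        if self.metered:
            self.quota_left -= size

    def request_download(self, size: int, go_ahead: Callable[[], None]) -> bool:
        """Return True if the app may start now; otherwise remember
        go_ahead and call it once the download becomes acceptable."""
        if self._allowed(size):
            self._account(size)
            return True
        self.waiting.append((size, go_ahead))
        return False

    def connectivity_changed(self, online: bool, metered: bool) -> None:
        """Re-check the queued requests when the network state changes."""
        self.online, self.metered = online, metered
        still_waiting = []
        for size, go_ahead in self.waiting:
            if self._allowed(size):
                self._account(size)
                go_ahead()
            else:
                still_waiting.append((size, go_ahead))
        self.waiting = still_waiting


# Example: 120MB of episodes, but only 100MB of 3G allowance left.
arbiter = DownloadArbiter(online=True, metered=True, quota_left=100 * MB)
started = []
may_start_now = arbiter.request_download(
    120 * MB, lambda: started.append("episodes"))
# Later the device joins WiFi, so the queued request gets its go-ahead.
arbiter.connectivity_changed(online=True, metered=False)
```

The application only ever tells the daemon how much data it wants to
fetch; which episodes those bytes belong to stays inside gPodder.
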
> Nevertheless, gPodder should still report that it was downloaded
> (pywoodchuck.object_downloaded) so that Woodchuck can improve its
> model of the user's preferences.

Again, I'm interested to hear some examples of what kind of preferences
these could be and how they could be used to do something useful and
provide value to the user or application.

I'm interested in this project, and can see some good use cases, but
I'm also not sure whether Woodchuck tries to do more than it's supposed
to do, and whether all that micro-management of data will really be an
advantage in the long run.

I think the interesting problems here are:

 * If and when to download a given file of a certain size
 * Priorities for downloads (the user explicitly wants to download
   something vs. "if I'm on home WiFi + the device is charging + it's
   3 AM, then also try to download/cache other files")
 * Management of traffic limits (i.e. 3G connection with a limited
   traffic allowance)
 * Management of power (i.e. download when on charger, but avoid
   everything - including feed updates - when the battery goes below
   a certain threshold)
 * Awareness of other applications (i.e. don't download if the user
   is using a web browser, to increase the web browser's
   responsiveness)
 * Awareness of other machines (i.e. limit the bandwidth if a user on
   the same network is browsing the web - maybe using
   Bonjour/Zeroconf?)
 * Notification interface for "pause all your downloads" / "resume
   all your downloads" (i.e. pause downloads when leaving home and
   network connectivity changes from WiFi to 3G, etc.)

The great thing about this is that we could then have a system-wide
"Download preferences" configuration where the user can configure the
preferences (e.g. "download even on 3G, because I have a flat rate"),
and these preferences would be implemented by the Woodchuck policy /
daemon and automatically used by all the applications.

Thanks,
Thomas
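
P.S. A rough Python sketch of how a policy covering the first few
points of my list (priorities, traffic limits, power) could decide.
Everything here is made up for illustration and is not taken from
Woodchuck: the names Priority, DeviceState and may_download, the 20%
battery threshold and the 1-5 AM "night" window are all my assumptions.

```python
from dataclasses import dataclass
from enum import Enum

MB = 1024 * 1024
MIN_BATTERY_PERCENT = 20   # assumed threshold, would be user-configurable
NIGHT_HOURS = range(1, 6)  # assumed "it's 3 AM"-style window (1:00-5:59)


class Priority(Enum):
    USER_REQUESTED = 1  # the user explicitly wants this download
    OPPORTUNISTIC = 2   # also download/cache other files if conditions are good


@dataclass
class DeviceState:
    on_wifi: bool
    charging: bool
    battery_percent: int
    hour: int                    # local time, 0-23
    metered_bytes_left: int      # remaining 3G traffic allowance
    allow_metered: bool = False  # "download even on 3G, I have a flat rate"


def may_download(size: int, priority: Priority, state: DeviceState) -> bool:
    """Decide whether a download of `size` bytes may start right now."""
    # Power: avoid everything when the battery is low and not on charger.
    if state.battery_percent < MIN_BATTERY_PERCENT and not state.charging:
        return False
    # Opportunistic caching: only on WiFi, on charger, during the night.
    if priority is Priority.OPPORTUNISTIC:
        return state.on_wifi and state.charging and state.hour in NIGHT_HOURS
    # User-requested: WiFi or a flat rate is always fine ...
    if state.on_wifi or state.allow_metered:
        return True
    # ... otherwise the download has to fit into the remaining allowance.
    return size <= state.metered_bytes_left


at_home_at_night = DeviceState(on_wifi=True, charging=True,
                               battery_percent=80, hour=3,
                               metered_bytes_left=0)
on_the_go = DeviceState(on_wifi=False, charging=False,
                        battery_percent=50, hour=14,
                        metered_bytes_left=10 * MB)
```

With these two states, an opportunistic download is allowed at home at
night but not on the go, and an explicit 50MB request on the go is
refused because only 10MB of allowance is left.
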