[usf-devel] embedded inline pictures
- From: unmei <unmei@xxxxxxxxxxxx>
- To: usf-devel@xxxxxxxxxxxxx
- Date: Sat, 13 Aug 2005 19:35:05 +0200
Embedding lots of pictures into USF may be controversial and a very
suboptimal way to do picture subtitles, agreed.
But there are few other options for picture subtitles at the current
time - VobSub and that's about it.
For embedding lots of pictures into USF, the already possible way,
using an <embedded> top-level element, is not good. When muxing, this
element is separated from the timed subtitle data and stored somewhere
else. *I assume* the former content of <embedded> is fully read into
memory on startup, therefore seeking on the harddrive is not an issue,
but there are two other problems:
1) The place where this embedded data goes is most likely not meant for
storing huge amounts of data that is actually tightly related to timed
events; in this case it is even "use once and forget".
2) Pixifier loads all picture data provided in embedded <b64file> child
elements into an internal "picture server", and there they are held
uncompressed! Another renderer may implement a solution where this data
is not decompressed at startup but only on request, but this is then
very "prerendered"-oriented and inferior for reused embedded pictures
such as logos and animated pictures.
In order to provide a comparatively low-resource way for prerendered
subtitles, the <image> element itself should be extended to allow
picture data directly inside it.
This picture data has to be base64 encoded as well, therefore there is
no gain in file size. But it is a way in which the situation is handled
much more in accordance with the semantics. Prerendered subtitle pictures are
used only within "their" subtitle, only during that subtitle's on-screen
time. There is no reason to keep them globally accessible and accessible
over the entire script time.
Furthermore if you were to look at the "source", the USF in a text
editor, it is much more "logical" to have such picture data where it is
actually used - compared to having <image> elements all having only a
filename attribute with a distinct value and all of the picture data in a
huge <embedded> element.
Dropped ideas
*a child element of <image> analogous or identical to the <b64file>
element in <embedded>. Problem: "the element either has a child element
XOR it has parseable character data" - maybe depending on the presence
of an attribute, or only by detection. Having character data AND a child
in the same <image> element would have to be prohibited - you can't have
a reference plus inline data at the same time. It may be possible, but
IMO it looks fishy. Also, this additional element is additional overhead
(even if only a few bytes, it is still more complex).
*guess whether it is inline depending on the data.
Either by defining a length threshold where every content shorter than
this is interpreted as reference and every content longer is interpreted
as inline data. Problem: this threshold would have to be defined and it
would restrict the freedom. No long filenames and no small inline files.
This may not be too restrictive, considering you seldom need filenames
of 100+ characters, and "reasonable" prerendered text pictures of fewer
than maybe 300 bytes are very rare, even if it is only a 2-bit encoded
short word. But as you can see, the range where the threshold could be
set is too large, and one day there will be that picture that is smaller :)
The other detection possibility is to take the content and try to
interpret it as a reference. Either you find a matching file, then
assume it was a reference, or you don't find one, in which case you try
to un-base64 it and, if successful, try to detect the file type. You
could also do it the other way round - trying to de-base64 it first.
This approach is not good either, IMO, because it is again too much
guessing.
Proposal
The <image> element should be changed in the following way:
*introduce a new attribute "content" (or suggest a better name)
This attribute is optional and defaults to the value "reference".
The possible values of this attribute are "reference"|"inline[/type]".
If the value of this attribute is "reference" (either explicit or
implied), the image element contains a reference (file name) like it
used to be.
But if this attribute is "inline" (possibly followed by a slash and a
MIME-Type or file extension), the content character data of the element
is a base64 encoded file.
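
To make the attribute semantics concrete, here is a minimal Python sketch
of how a reader could interpret the proposed "content" attribute. It is
only an illustration under assumptions: the function name read_image, the
<subtitle> wrapper element and the sample payload (a PNG signature
followed by placeholder bytes, not a real picture) are made up for this
example and are not part of USF or Pixifier.

```python
import base64
import xml.etree.ElementTree as ET


def read_image(elem):
    """Interpret one <image> element according to the proposal.

    Returns (kind, type_hint, payload): payload is the file name (str)
    for a reference, or the decoded file bytes for inline data.
    """
    content = elem.get("content", "reference")  # optional attribute
    text = (elem.text or "").strip()
    if content == "reference":
        return ("reference", None, text)
    kind, _, type_hint = content.partition("/")  # "inline[/type]"
    if kind != "inline":
        raise ValueError(f"unknown content value: {content!r}")
    # b64decode ignores characters outside the base64 alphabet by
    # default, so line breaks inside the element are harmless.
    return ("inline", type_hint or None, base64.b64decode(text))


# Hypothetical sample data: NOT a real picture, just recognisable bytes.
raw = b"\x89PNG\r\n\x1a\nplaceholder"
payload = base64.b64encode(raw).decode("ascii")
doc = ET.fromstring(
    "<subtitle>"
    "<image>logo.png</image>"
    f'<image content="inline/png">{payload}</image>'
    "</subtitle>"
)
results = [read_image(e) for e in doc.iter("image")]
```

The first element comes back as a plain reference, the second as inline
bytes with the type hint "png". A value such as "inline/image/png" would
yield the hint "image/png", since only the first slash is split off.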
This file is not to be assigned to a globally available and named image
variable. Its scope only extends to this very <image> element and it
cannot be referenced from the "outer world". It is decoded when this
<image> element is first to be used and the decoded data is unloaded
when the <image> element disappears from the screen.
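
The intended lifetime (decode on first use, unload when the element
leaves the screen) could look roughly like the following Python sketch.
The class InlineImage and its show/hide methods are hypothetical names
chosen for this example; a real renderer would also decompress the
picture format itself, which is left out here.

```python
import base64


class InlineImage:
    """Inline picture data scoped to a single <image> element."""

    def __init__(self, b64_text):
        self._b64_text = b64_text  # only the encoded text is kept around
        self._decoded = None       # nothing is decoded at load time

    @property
    def is_loaded(self):
        return self._decoded is not None

    def show(self):
        """Subtitle appears: decode on first use, reuse afterwards."""
        if self._decoded is None:
            self._decoded = base64.b64decode(self._b64_text)
        return self._decoded

    def hide(self):
        """Subtitle disappears: drop the decoded data again."""
        self._decoded = None


img = InlineImage(base64.b64encode(b"picture bytes").decode("ascii"))
before = img.is_loaded      # False: not decoded yet
data = img.show()           # decoded now
during = img.is_loaded      # True while on screen
img.hide()
after = img.is_loaded       # False again
```

Because the object is owned by its <image> element and has no name, there
is nothing to register in a global picture table.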
Mentioning the type in the content attribute may be necessary to know
what to do with the encoded file data. Remember that the encoded data
does not include the file name. If you do not know the file type
beforehand, you may end up decompressing the data only to discover that
you do not support that file format. Or you don't even know what format
it is until/unless you have successfully found characteristics of a
certain format in the file data. (Actually this way is considered better
than relying on the file extension, but it is more work.)
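
Recognising a format from the data itself usually means checking the
first few bytes against known signatures. A small Python sketch, assuming
only a handful of common formats; the names SIGNATURES and
sniff_image_type are invented for this example, and OPS is left out
because no signature for it is given here.

```python
# Leading byte signatures of some common picture formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"BM", "image/bmp"),
]


def sniff_image_type(data, declared=None):
    """Return the type found in the data, else the declared type hint."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return declared  # fall back to the hint from the content attribute


png_guess = sniff_image_type(b"\x89PNG\r\n\x1a\n....")
gif_guess = sniff_image_type(b"GIF89a....")
fallback = sniff_image_type(b"unknown data", declared="ops")
```

In this sketch the declared type is only a fallback; a renderer could
just as well use it first to reject unsupported formats before decoding
anything.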
Comment
The proposal is not strictly backward compatible.
But IMO it is about as safe as possible to use.
If a parser that does not know about inline reads an <image> element
with inline data:
*It encounters an attribute it does not know. Possibly it will stop
right here and fall into some sort of error handling.
*If it continues, it reads the content in. It may be unlucky in that it
encounters a (supposed) filename that is several kilobytes long, but it
should have handling for this case. Then it either realises that this
cannot be a filename because some of the characters found are not
valid in a filename, or it tries to find a file of this name. If the
supposed filename is long, it is almost impossible that it finds such a
file. If the supposed filename is very short, there may be a slight
chance it finds a match.
BUT an inline file will almost always be at least a couple of hundred
bytes long. The smallest sub picture I found is the word "Gut." in 2-bit
PNG or OPS - this is 300 bytes (= 400 bytes in base64). How large is the
chance of finding a file in the current path or in embedded whose name
matches all the 100+ characters? It is about (number of files in search
path)/(192^(length)). 192^10 is already ~ 70*10^21, 192^100 is ~ 2*10^228.
The chance that tomorrow will not happen is probably orders of magnitude
bigger than the chance that a useful base64 encoded file matches a
filename.
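
The orders of magnitude are easy to check with a few lines of Python. The
figure of 192 possible characters per filename position is the estimate
used above; the helper names and the 1000 files in the search path are
assumptions made only for this example.

```python
import math


def base64_length(n_bytes):
    """Length of the padded base64 text for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)


def log10_match_chance(files_in_path, name_length, alphabet=192):
    """log10 of (files in search path) / (alphabet ** name_length)."""
    return math.log10(files_in_path) - name_length * math.log10(alphabet)


encoded_size = base64_length(300)           # the 300 byte "Gut." picture
digits_10 = len(str(192 ** 10))             # decimal digits of 192^10
digits_100 = len(str(192 ** 100))           # decimal digits of 192^100
chance = log10_match_chance(1000, 100)      # hypothetical 1000 files
```

Here encoded_size is 400, 192^10 has 23 digits (about 6.8*10^22) and
192^100 has 229 digits (about 2*10^228). With 1000 files and a name of
100 characters the chance is around 10^-225.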
I don't have file format recognition by only looking at the data in
Pixifier yet. I rely on file extensions so far. But I will surely try
to implement this better solution by the time I support inline image
data (for the formats I support at all).
You are encouraged to comment on this proposal or suggest improvements.
I am not fixed on this particular way, but I feel I will have to
implement inline pictures in one way or another sometime soon.
Again, no answer means "go on" ;)
regards, unmei
http://usf.corecodec.org