[haiku-commits] Re: haiku: hrev49508 - src/kits/media

  • From: Julian Harnath <julian.harnath@xxxxxxxxxxxxxx>
  • To: <haiku-commits@xxxxxxxxxxxxx>
  • Date: Tue, 4 Aug 2015 19:56:36 +0200

Hey Dario,

combined reply to several of your mails in here.

On 03.08.2015 20:32, Dario Casalinuovo wrote:
> [about performance]
> I tested it, and makes difference, enough to move the breakage point
> a bit up.

The previous version of the control loop was broken, so it's not much of a surprise that it works better now. However, you reintroduced an old bug by removing the enqueue time (see below).


On 04.08.2015 12:16, Dario Casalinuovo wrote:
> I think this can confuse if someone don't understand at all the code.
> But i changed the variable name anyway.

The point of code readability is to make the code's intent obvious. And to be honest, renaming lateness to "tempLateness", as you did, doesn't make it any better. You're still assigning "waitUntil" to it, a value which is not a lateness, so the variable name makes no sense.


On 04.08.2015 12:27, Dario Casalinuovo wrote:

> For those interested in the topic, just a few considerations on the topic.
> My opinion is that the whole problem is just like taking all from the
> wrong end. I understand the 'blackouts' thing, but i don't understand
> why using the queued time would change anything.

Okay, earlier you wanted a more mathematical answer, so here goes: a definition of when an event is late.
An event is late if and only if the producer enqueues it too late into the event queue of its consumer.
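
To put that definition into code form, here is a minimal, self-contained C++ sketch (hypothetical names and signatures, not our actual media kit API). The producer has a deadline for the enqueue, derived from the performance time and the total latency of everything downstream of it; lateness is how far past that deadline the enqueue actually happened.

// Hypothetical illustration only, not the real BMediaEventLooper code.
#include <cstdint>
#include <cstdio>

typedef int64_t bigtime_t; // microseconds, like in the media kit

// Latest moment at which the producer may enqueue the event so that all
// downstream nodes can still finish in time.
static bigtime_t
LatestAcceptableEnqueueTime(bigtime_t performanceTime,
	bigtime_t downstreamLatency)
{
	return performanceTime - downstreamLatency;
}

// An event is late if and only if it was enqueued after that deadline.
static bigtime_t
Lateness(bigtime_t enqueueTime, bigtime_t performanceTime,
	bigtime_t downstreamLatency)
{
	bigtime_t lateness = enqueueTime
		- LatestAcceptableEnqueueTime(performanceTime, downstreamLatency);
	return lateness > 0 ? lateness : 0;
}

int
main()
{
	// Buffer due at 100ms, downstream latency (B + C in the example below)
	// is 30ms: enqueueing at 70ms is just in time, at 75ms it is 5ms late.
	printf("%lld\n", (long long)Lateness(70000, 100000, 30000)); // prints 0
	printf("%lld\n", (long long)Lateness(75000, 100000, 30000)); // prints 5000
	return 0;
}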

The enqueue of a buffer from a producer A to a consumer B is the exact moment when A hands off the buffer to B. After A has enqueued the buffer in B's queue, for node A it's over with this buffer, finished, nothing to do anymore. The buffer is now B's business. This is important, because it means that the enqueue is the defining moment of passing on a buffer from one node to another. Whatever happens with the event after the enqueue has nothing to do with producer node A anymore.

Now, consider the following example.
* Three media nodes: A, B, and C
* Nodes are chained together A --> B --> C
* Latencies of the nodes are: A 10ms ; B 20ms ; C 10ms
* There is a buffer produced by A which should be shown to the user at performance time 100ms

To perform the event @ 100ms, it would ideally go like this:

Perf-time What happens
--------- ------------
60 ms A begins producing the buffer
70 ms A finished the buffer
A enqueues buffer event in B's event queue
B dequeues the buffer event
B starts processing the buffer
90 ms B finished processing the buffer
B hands it over to C, just like before
C starts processing
100 ms C shows the result to the user
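
In case the numbers above look arbitrary: every start time simply falls out of the latencies of the downstream nodes. A tiny sketch of that arithmetic (made-up variable names, not media kit code):

#include <cstdint>
#include <cstdio>

typedef int64_t bigtime_t; // microseconds

int
main()
{
	const char* names[] = { "A", "B", "C" };
	const bigtime_t latency[] = { 10000, 20000, 10000 }; // A, B, C
	const bigtime_t performanceTime = 100000; // buffer is due at 100ms

	// Each node must start early enough for its own processing plus
	// everything that still has to happen after it.
	bigtime_t start[3];
	bigtime_t deadline = performanceTime;
	for (int i = 2; i >= 0; i--) {
		deadline -= latency[i];
		start[i] = deadline;
	}

	// Prints: A starts at 60 ms, B starts at 70 ms, C starts at 90 ms
	for (int i = 0; i < 3; i++) {
		printf("%s starts at %lld ms\n", names[i],
			(long long)(start[i] / 1000));
	}
	return 0;
}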


It all worked fine! No lateness anywhere. Great. But now, reality is often less ideal, so there might be delays or blackouts or whatever you might want to call it. Let's insert one and see what happens:

Perf-time What happens
--------- ------------
60 ms A begins producing the buffer
70 ms A finished the buffer
A enqueues buffer event in B's event queue
<<< Blackout for 5 ms! >>>
75 ms B dequeues the buffer event
B starts processing the buffer
95 ms B finished processing the buffer
B hands it over to C, just like before
C starts processing
105 ms C shows the result to the user

So now there is obviously a lateness of 5ms. Your version and the version with enqueue_time will both notice that; both arrive at the same value of 5ms. However, the important difference is: which node will increase its latency?

Here's what your version of the control loop would do: when B dequeues the buffer @ 75ms, it will notice during event dispatch that it's 5ms late. So node B will notify node A about the lateness, and A will usually increase its latency by the same amount. Now imagine the same happens again, another blackout in the same place, between enqueue and dequeue. Another disruption, node B will blame node A again, making A increase its latency by another 5ms. And then it happens again, and A gets again more latency, but it doesn't help! And again, and again...
Increasing A's latency does not help. Because node A did nothing wrong! It delivered the event exactly when it should. But node A will now have an ever-increasing latency and the node chain never enters a stable state. And this chain of events is not far-fetched, it is exactly what is documented in bug ticket #7285.
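
To make the runaway effect explicit, here is a heavily simplified sketch (hypothetical, nothing like the real control loop code) of what the dequeue-based lateness does over repeated occurrences of that same 5ms blackout:

#include <cstdint>
#include <cstdio>

typedef int64_t bigtime_t; // microseconds

int
main()
{
	bigtime_t latencyA = 10000;      // node A's current latency (10ms)
	const bigtime_t blackout = 5000; // recurring 5ms delay between enqueue and dequeue

	for (int run = 1; run <= 4; run++) {
		// A enqueued exactly on time (lateness at enqueue would be 0), but
		// B dequeues only after the blackout, so it measures 5ms of lateness.
		bigtime_t measuredLateness = blackout;

		// B blames A; A increases its latency by the reported amount.
		latencyA += measuredLateness;

		printf("run %d: A's latency is now %lld us, but the blackout "
			"still happens\n", run, (long long)latencyA);
	}
	// A's latency grows without bound: 15000, 20000, 25000, 30000, ...
	// and the chain never reaches a stable state (ticket #7285).
	return 0;
}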

So, what can be done about this? Let's consider the same situation, but we calculate lateness with enqueue_time. When B's control loop checks lateness, it looks at when node A *enqueued* the buffer. It sees that A delivered just-in-time, at exactly 70ms. Node B now knows that A did nothing wrong! It will *not* send a lateness-notice to A. Instead, B will deliver the buffer to C, and now *node C* will notice that the buffer is late. Node C will notify node B about the lateness, and node B increases its latency by 5ms. With the increased latency, all is fine now. The chain is stable and any further 5ms-blackouts there cannot disrupt it anymore. There is no run-away latency.
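
And the same simplified sketch adapted to the enqueue_time version: the blame goes to B exactly once, and from then on B's added latency absorbs the blackout:

#include <cstdint>
#include <cstdio>

typedef int64_t bigtime_t; // microseconds

int
main()
{
	const bigtime_t initialLatencyB = 20000; // node B's original latency (20ms)
	bigtime_t latencyB = initialLatencyB;
	const bigtime_t blackout = 5000;         // recurring 5ms disruption

	for (int run = 1; run <= 4; run++) {
		// A enqueued on time, so no lateness notice goes upstream to A.
		// The delay shows up downstream instead: C sees B's buffer arrive
		// late by whatever part of the blackout B's extra latency does not
		// yet absorb.
		bigtime_t slack = latencyB - initialLatencyB;
		bigtime_t latenessSeenByC = blackout - slack;
		if (latenessSeenByC < 0)
			latenessSeenByC = 0;

		// C notifies B; only B increases its latency.
		latencyB += latenessSeenByC;

		printf("run %d: C sees %lld us lateness, B's latency is now %lld us\n",
			run, (long long)latenessSeenByC, (long long)latencyB);
	}
	// Output: 5000 us lateness once, then 0 on every later run. Stable.
	return 0;
}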

With the enqueue_time you know which node caused the problem. Without it, you don't. There's this little time window between enqueue and dequeue in your version which can lead to blaming the wrong node for lateness.

So please re-add the enqueue_time and use it for lateness, otherwise we will again have the bug from ticket #7285.

Best regards,

--
Julian
