[darkice] Re: Problem with trunk

  • From: Rafael Diniz <rafael@xxxxxxxxxx>
  • To: darkice@xxxxxxxxxxxxx
  • Date: Fri, 17 May 2013 23:11:00 -0300

It's still blocking:
(gdb) bt
#0  0x00007ff810e122d4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x000000000040ef0b in MultiThreadedConnector::transfer (this=0x184be80, bytes=0, bufSize=4096, sec=1, usec=0) at MultiThreadedConnector.cpp:299
#2  0x0000000000411323 in DarkIce::encode (this=this@entry=0x184b8e0) at DarkIce.cpp:1281
#3  0x00000000004114fc in DarkIce::run (this=0x184b8e0) at DarkIce.cpp:1302
#4  0x0000000000406bb1 in main (argc=5, argv=<optimized out>) at main.cpp:159

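The backtrace shows the main thread parked in pthread_cond_wait inside
MultiThreadedConnector::transfer. For illustration only, here is a minimal
sketch of how such a wait could be bounded with pthread_cond_timedwait and a
predicate, so that a stalled consumer shows up as a timeout instead of a
permanent hang. DoneGate, gate_init, gate_done and gate_wait are hypothetical
names and not darkice code; what darkice should do after a timeout is left open.

```cpp
#include <cassert>
#include <pthread.h>
#include <time.h>

// Hypothetical stand-in for the producer/consumer "done" handshake.
struct DoneGate {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             pending;  // consumers that have not reported "done" yet
};

void gate_init(DoneGate *g, int consumers) {
    int rc = pthread_mutex_init(&g->mutex, nullptr);
    assert(rc == 0);
    rc = pthread_cond_init(&g->cond, nullptr);
    assert(rc == 0);
    (void) rc;  // avoid an unused-variable warning when NDEBUG is defined
    g->pending = consumers;
}

// Consumer side: report completion while holding the mutex.
void gate_done(DoneGate *g) {
    pthread_mutex_lock(&g->mutex);
    if (g->pending > 0) {
        --g->pending;
    }
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->mutex);
}

// Producer side: wait until every consumer is done or the timeout expires.
// Returns true when all consumers finished, false on timeout or error.
bool gate_wait(DoneGate *g, int timeout_sec) {
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);  // default clock of a condvar
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&g->mutex);
    int rc = 0;
    // Loop on the predicate: wakeups may be spurious, and a signal sent
    // before the wait started is still visible through `pending`.
    while (g->pending > 0 && rc == 0) {
        rc = pthread_cond_timedwait(&g->cond, &g->mutex, &deadline);
    }
    bool all_done = (g->pending == 0);
    pthread_mutex_unlock(&g->mutex);
    return all_done;
}
```

A caller would treat a false return from gate_wait as "a consumer is stuck"
and decide separately whether to log, retry or tear the connection down.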

On Fri, 17 May 2013, at 05:52:44, Edwin van den Oetelaar wrote:
> r529 fixes the hangups of the network connection.
> 
> On Thu, May 16, 2013 at 10:21 PM, Edwin van den Oetelaar
> <oetelaar.automatisering@xxxxxxxxx> wrote:
> > About the reconnection part: it is bad!
> > I have been testing and added network timeouts when sending to the icecast
> > server etc.
> > This works: if you cannot send within 15 seconds it is too late anyway.
> > So far so good.
> > But then the reconnect flag comes into play and the consumer thread
> > wants to reconnect (with the network not connected).
> > It starts doing blocking calls (gethostbyname() etc.) which block
> > forever (even when the network is plugged back in).
> > At that point everything is messed up: the sound buffers must be
> > overflowing, and the producing thread is not handing out work anymore since
> > one of the consumers is still working on it.
> > All in all very messy and not easy to fix.
> > It really needs a redesign.
> > Also the C++ part is almost impossible to follow because of overloaded
> > operators and functions and inheritance, and the lack of a class diagram.
> > FYI I am more a C guy, not a C++ guy.
> >
> > I am open to suggestions.
> > Greetings,
> > Edwin van den Oetelaar
> >
> >
> > On Thu, May 16, 2013 at 10:46 AM, Daniel Eckl <daniel.eckl@xxxxxxxxx> wrote:
> >> Regarding Issue 85: Correct, my bug report stated that Release 1.1 has a
> >> bug that is already fixed in trunk. Sorry for causing some
> >> misunderstanding here. From my point of view, everything is okay again
> >> for this one.
> >>
> >> Regarding the MultiThreadedConnector changes: I built r527 and tested what
> >> happens when I stop and restart my icecast2 server. It does not reconnect
> >> for me.
> >>
> >> Here's the output when the server goes down:
> >>
> >> 16-May-2013 10:37:29 Exception caught in BufferedSink :: write3
> >>
> >> 16-May-2013 10:37:29 MultiThreadedConnector :: sinkThread reconnecting  0
> >> 16-May-2013 10:37:29 couldn't write all from encoder to underlying sink 1101
> >>
> >> When the server is back again, nothing ever happens. So yes, there's still
> >> some work needed. Thanks for your efforts, Edwin!
> >>
> >> Regarding the AAC support: I'm not sure if I understood your points
> >> correctly, but if that means you want to replace the proprietary aac+
> >> library with something open source, I'd really appreciate it. I guess
> >> that's nothing already available in distribution repositories - that would
> >> be best of course. But an OSS alternative would be great!
> >>
> >> Best regards,
> >> Daniel
> >>
> >>
> >> 2013/5/16 Edwin van den Oetelaar <oetelaar.automatisering@xxxxxxxxx>
> >>>
> >>> There is still an issue with MultiThreadedConnector.cpp.
> >>> It has not shown up yet, but there is a possibility of a race between
> >>> thread creation and starting.
> >>> What could happen is this:
> >>> - consumer thread created (but not yet holding a lock on condition)
> >>> - producer sends message (broadcast) to consumers to start working
> >>> - a consumer can miss the first message and thereby never start, which
> >>> results in the producer waiting forever for it to finish
> >>> (this will result in a hang during startup, not during normal running)
> >>> I would like to make sure this cannot happen; the chance is small, but
> >>> still, it should not be possible.
> >>> Furthermore, should the FDK AAC not replace or be an alternative for
> >>> AACPlus-v2?
> >>>
> >>> This library:
> >>> git clone --depth 1 git://github.com/mstorsjo/fdk-aac.git
> >>>
> >>> I am using a modified darkice for personal use which includes 'remote
> >>> control' so sound can be muted without breaking the stream or messing
> >>> with mixers.
> >>> It also allows for restarting and some status reporting; this enables
> >>> the use of real electric switches, real LEDs to light up, and a buzzer
> >>> to beep based on on-off/connection/error status.
> >>> This could be added (in a clean way, not the hack I use privately) by
> >>> using an Arduino board (over serial) with the Firmata protocol...
> >>> Too much for you? :-)
> >>>
> >>> Greetings,
> >>> Edwin van den Oetelaar
> >>>
> >>>
> >>> On Thu, May 16, 2013 at 5:39 AM, Rafael Diniz <rafael@xxxxxxxxxx> wrote:
> >>> > Oops, sorry, my fault Edwin, all fine in r524.
> >>> > Are there any remaining bugs that should be fixed for the 1.2 release?
> >>> >
> >>> > Best regards,
> >>> > Rafael Diniz
> >>> >
> >>> > On Wed, 15 May 2013, at 20:41:47, Edwin van den Oetelaar wrote:
> >>> >> If I read the comments right you (Rafael) actually fixed the problems
> >>> >> with r503.
> >>> >> But you then reverted it.
> >>> >> Daniel seems very handy with this version-control-thing and reported
> >>> >> the problem as issue 85.
> >>> >> As I see it now, you should recommit your changes.
> >>> >> Maybe Daniel sees it another way?
> >>> >>
> >>> >> What do you think Daniel?
> >>> >>
> >>> >> Greetings from Holland,
> >>> >> Edwin
> >>> >>
> >>> >> On Wed, May 15, 2013 at 11:19 PM, Rafael Diniz <rafael@xxxxxxxxxx> wrote:
> >>> >> > Committed again in r523 with Edwin's proposed changes.
> >>> >> > Let's test!
> >>> >> >
> >>> >> > Btw, Daniel, do you already have write access to svn?
> >>> >> > We need help.
> >>> >> > ; )
> >>> >> >
> >>> >> > Edwin, concerning the CastSink.* modifications you made and I reverted,
> >>> >> > do you want to try to fix them?
> >>> >> > http://code.google.com/p/darkice/issues/detail?id=85
> >>> >> >
> >>> >> > Best regards,
> >>> >> > Rafael Diniz
> >>> >> >
> >>> >> > On Wed, 15 May 2013, at 17:59:55, Daniel Eckl wrote:
> >>> >> >> Hi Rafael,
> >>> >> >>
> >>> >> >> Thanks for your note. We are fully aware that we are at r522 and that
> >>> >> >> you reverted the changes of r510 in r514. ;)
> >>> >> >>
> >>> >> >> Because of that revert, Edwin specifically wanted to know how to
> >>> >> >> (easily) re-apply his changes to r522 and then fix the issues
> >>> >> >> caused by them. So my two lines (svn diff -r514:513 and apply the
> >>> >> >> resulting diff) are meant to be applied to an r522 checkout to get
> >>> >> >> his MultiThreadedConnector code in question back in, and then to work
> >>> >> >> on the issues.
> >>> >> >>
> >>> >> >> Just try it: check out r522 and then apply the -r514:r513 diff ;)
> >>> >> >>
> >>> >> >> Best regards,
> >>> >> >> Daniel
> >>> >> >>
> >>> >> >>
> >>> >> >> 2013/5/15 Rafael Diniz <rafael@xxxxxxxxxx>
> >>> >> >>
> >>> >> >> > Daniel and Edwin,
> >>> >> >> > I already reverted this yesterday.
> >>> >> >> >
> >>> >> >> > Btw, we are in r522.
> >>> >> >> > Please take a look at the latest commits to see what I did.
> >>> >> >> > ; )
> >>> >> >> >
> >>> >> >> > Best regards,
> >>> >> >> > Rafael Diniz
> >>> >> >> >
> >>> >> >> > On Wed, 15 May 2013, at 03:25:54, Daniel Eckl wrote:
> >>> >> >> > > Hi Edwin,
> >>> >> >> > >
> >>> >> >> > > I just skimmed through the latest commits and I guess you just
> >>> >> >> > > need to revert commit 514 (which essentially is my patch to revert
> >>> >> >> > > just the MultiThreadedConnector changes) to have your initial code
> >>> >> >> > > back in.
> >>> >> >> > >
> >>> >> >> > > Just like e.g.
> >>> >> >> > >
> >>> >> >> > > svn diff -r514:513 > ../revert-514.patch
> >>> >> >> > > patch -p0 <../revert-514.patch
> >>> >> >> > >
> >>> >> >> > > Then you can add your changes to fix the race condition and
> >>> >> >> > > commit it all back in.
> >>> >> >> > >
> >>> >> >> > > Regards,
> >>> >> >> > > Daniel
> >>> >> >> > >
> >>> >> >> > >
> >>> >> >> > > 2013/5/14 Edwin van den Oetelaar
> >>> >> >> > > <oetelaar.automatisering@xxxxxxxxx>
> >>> >> >> > >
> >>> >> >> > > > I think I found the race condition.
> >>> >> >> > > > MultiThreadedConnector.cpp about line 264
> >>> >> >> > > >
> >>> >> >> > > > Basically, an interlocking problem of locks.
> >>> >> >> > > > It was possible to miss a condition signal from an encoding
> >>> >> >> > > > thread.
> >>> >> >> > > >
> >>> >> >> > > > Original code:
> >>> >> >> > > > // UNLOCK, release the consumers' cond variable, now they can run
> >>> >> >> > > > pthread_mutex_unlock(&mutex_start);
> >>> >> >> > > > // problem here: an encoder thread could have signaled without a
> >>> >> >> > > > // listener active
> >>> >> >> > > > // LOCK a condition 'done' variable change
> >>> >> >> > > > pthread_mutex_lock(&mutex_done);
> >>> >> >> > > >
> >>> >> >> > > > Fixed: it should be interlocked like this.
> >>> >> >> > > > // LOCK early to prevent missing a condition 'done' variable change
> >>> >> >> > > > pthread_mutex_lock(&mutex_done);
> >>> >> >> > > > // UNLOCK, release the consumers' cond variable, now they can run
> >>> >> >> > > > pthread_mutex_unlock(&mutex_start);
> >>> >> >> > > >
> >>> >> >> > > > How do we apply this patch?
> >>> >> >> > > > Since some code was reverted a few moments ago....
> >>> >> >> > > >
> >>> >> >> > > > Please assist.
> >>> >> >> > > > Edwin van den Oetelaar
> >>> >> >> > > >
> >>> >> >> > > > On Tue, May 14, 2013 at 10:13 PM, Edwin van den Oetelaar
> >>> >> >> > > > <oetelaar.automatisering@xxxxxxxxx> wrote:
> >>> >> >> > > > > I will investigate more.
> >>> >> >> > > > > I already started when the first related message appeared,
> >>> >> >> > > > > but at this moment I do not see where the problem really happens.
> >>> >> >> > > > > A thread keeps waiting on a condition that is never signaled,
> >>> >> >> > > > > which is very strange, since this problem happens after hours
> >>> >> >> > > > > and the routine is running 100+ times per second.
> >>> >> >> > > > > If more debug info is available please send it!!
> >>> >> >> > > > > It appears I introduced it so I will fix it.
> >>> >> >> > > > > Greetings,
> >>> >> >> > > > > Edwin van den Oetelaar
> >>> >> >> > > > >
> >>> >> >> > > > >
> >>> >> >> > > > > On Tue, May 14, 2013 at 5:14 PM, Rafael Diniz <rafael@xxxxxxxxxx> wrote:
> >>> >> >> > > > >> Edwin,
> >>> >> >> > > > >> Commit 510 introduced the bug.
> >>> >> >> > > > >> http://code.google.com/p/darkice/source/detail?r=510
> >>> >> >> > > > >>
> >>> >> >> > > > >> There is a patch reverting some changes that fixes the problem
> >>> >> >> > > > >> (I'm testing it right now) here:
> >>> >> >> > > > >> http://code.google.com/p/darkice/issues/detail?id=84
> >>> >> >> > > > >>
> >>> >> >> > > > >> I have a guess about the deadlock but I want to understand your
> >>> >> >> > > > >> modifications better. In order not to break current svn, I
> >>> >> >> > > > >> reverted some changes in r514, as proposed in the ticket.
> >>> >> >> > > > >>
> >>> >> >> > > > >>
> >>> >> >> > > > >> Btw, I committed the fixes to darksnow, thanks!
> >>> >> >> > > > >>
> >>> >> >> > > > >> Best regards,
> >>> >> >> > > > >> Rafael Diniz
> >>> >> >> > > > >>
> >>> >> >> > > > >> On Thu, 09 May 2013, at 19:56:49, Edwin van den Oetelaar wrote:
> >>> >> >> > > > >>> On Fri, May 10, 2013 at 12:47 AM, Rafael Diniz <rafael@xxxxxxxxxx> wrote:
> >>> >> >> > > > >>> > GDB backtrace of darkice when it locks:
> >>> >> >> > > > >>> >
> >>> >> >> > > > >>> > (gdb) bt
> >>> >> >> > > > >>> > #0  0x00007f17e42a62d4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> >>> >> >> > > > >>> > #1  0x000000000040e83b in MultiThreadedConnector::transfer (this=0x11e9a60, bytes=0, bufSize=4096, sec=1, usec=0) at MultiThreadedConnector.cpp:271
> >>> >> >> > > > >>> > #2  0x0000000000410a53 in DarkIce::encode (this=this@entry=0x11e94c0) at DarkIce.cpp:1236
> >>> >> >> > > > >>> > #3  0x0000000000410c2c in DarkIce::run (this=0x11e94c0) at DarkIce.cpp:1257
> >>> >> >> > > > >>> > #4  0x00000000004067b6 in main (argc=5, argv=<optimized out>) at main.cpp:159
> >>> >> >> > > > >>> >
> >>> >> >> > > > >>>
> >>> >> >> > > > >>> Ok, I will take a look at this.
> >>> >> >> > > > >>> Not tonight, it is already 01:00 past midnight :-)
> >>> >> >> > > > >>>
> >>> >> >> > > > >>> Greetings from the Netherlands,
> >>> >> >> > > > >>> Edwin
> >>> >> >> > > > >>>
> >>> >> >> > > > >>>
> >>> >> >> > > > >>
> >>> >> >> > > >
> >>> >> >> > > >
> >>> >> >> > >
> >>> >> >> >
> >>> >> >> >
> >>> >> >>
> >>> >> >
> >>> >>
> >>> >>
> >>>
> >>
> 
> 

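The startup race discussed in this thread (a consumer that is not yet waiting
when the producer broadcasts never wakes up) is the classic lost-wakeup
problem. A minimal sketch of the usual remedy follows, assuming a boolean
predicate that is set under the mutex before the broadcast and checked by the
consumer before it waits. StartGate, start_gate_open, start_gate_consumer and
run_late_consumer_demo are hypothetical names chosen for this example; this is
not the code in MultiThreadedConnector.cpp.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical start handshake between one producer and its consumers.
struct StartGate {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            started;  // predicate, only touched with mutex held
    int             ran;      // consumers that observed the start
};

// Producer side: record the start in shared state, then broadcast.
void start_gate_open(StartGate *g) {
    pthread_mutex_lock(&g->mutex);
    g->started = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->mutex);
}

// Consumer side: check the predicate before waiting, so a broadcast that
// happened earlier is not lost.
void *start_gate_consumer(void *arg) {
    StartGate *g = static_cast<StartGate *>(arg);
    pthread_mutex_lock(&g->mutex);
    while (!g->started) {
        pthread_cond_wait(&g->cond, &g->mutex);
    }
    ++g->ran;
    pthread_mutex_unlock(&g->mutex);
    return nullptr;
}

// Worst-case ordering: the broadcast is sent before the consumer thread
// even exists. Returns the number of consumers that ran, or -1 on error.
int run_late_consumer_demo() {
    StartGate g;
    int rc = pthread_mutex_init(&g.mutex, nullptr);
    assert(rc == 0);
    rc = pthread_cond_init(&g.cond, nullptr);
    assert(rc == 0);
    (void) rc;  // avoid an unused-variable warning when NDEBUG is defined
    g.started = false;
    g.ran = 0;

    start_gate_open(&g);  // nobody is listening yet

    pthread_t thread;
    if (pthread_create(&thread, nullptr, start_gate_consumer, &g) != 0) {
        return -1;
    }
    pthread_join(thread, nullptr);  // returns because the predicate was set

    int ran = g.ran;
    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.mutex);
    return ran;
}
```

With a bare pthread_cond_wait and no predicate, the same ordering would leave
the consumer blocked forever, which matches the startup hang described above.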