Hi everyone,
*Tim Roberts* over at the osr.com forum has been very kind in answering
my audio driver questions over the last few months. Recently, he
referred me to this mailing list where "/all the cool audio driver kids,
including several very helpful members of the Microsoft audio team, hang
out/".
My goal is to play networked voice data (in our app) to a virtual render
endpoint, and have it 'loop-back' to a virtual capture endpoint that
3rd-party chat apps like Skype, Zoom, etc. can consume.
My audio driver prototype is based on SYSVAD/WaveRT. Besides being newer
and recommended in the docs, WaveRT handles more of the copying of data
to/from the DMA buffers. My hope was that I could write less code and
have a more stable driver.
What's needed is just a straight-through audio pipe, so I tried using a
single, common buffer for the render & capture streams, keeping their
respective PlayPositions in sync. For this to have a hope of working,
both render and capture streams would have to be identical in terms of
format (e.g. channels, frames/sec). Alas, when I tested, the two streams
had significantly different PCM formats (channels, frame rate). I'm
still testing to see if I can limit/coerce both endpoints to the same
format: a single channel at some arbitrary 'good enough' frame rate, but
this approach is starting to look uncomfortably fragile.
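To make the shared-buffer idea concrete, here is a minimal user-mode sketch in plain C. It is not driver code: `LoopbackBuffer`, `loopback_write`, `loopback_read` and the 4096-byte size are hypothetical names and values chosen for illustration, and the sketch assumes both sides already agree on one PCM format, which is exactly the assumption that failed in testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOOP_BYTES 4096u  /* hypothetical size of the shared buffer */

typedef struct {
    uint8_t data[LOOP_BYTES];
    size_t  write_pos;  /* advanced by the render side */
    size_t  read_pos;   /* advanced by the capture side */
    size_t  filled;     /* bytes written but not yet read */
} LoopbackBuffer;

static void loopback_init(LoopbackBuffer *lb)
{
    assert(lb != NULL);
    memset(lb, 0, sizeof *lb);
}

/* Render side: copy in as much as fits; returns the bytes accepted. */
static size_t loopback_write(LoopbackBuffer *lb, const uint8_t *src, size_t len)
{
    assert(lb != NULL && src != NULL);
    size_t space = LOOP_BYTES - lb->filled;
    if (len > space)
        len = space;
    for (size_t i = 0; i < len; ++i) {
        lb->data[lb->write_pos] = src[i];
        lb->write_pos = (lb->write_pos + 1) % LOOP_BYTES;
    }
    lb->filled += len;
    return len;
}

/* Capture side: copy out what is available and pad the rest with
 * zeros (silence for signed PCM); returns the bytes of real audio. */
static size_t loopback_read(LoopbackBuffer *lb, uint8_t *dst, size_t len)
{
    assert(lb != NULL && dst != NULL);
    size_t avail = lb->filled < len ? lb->filled : len;
    for (size_t i = 0; i < avail; ++i) {
        dst[i] = lb->data[lb->read_pos];
        lb->read_pos = (lb->read_pos + 1) % LOOP_BYTES;
    }
    memset(dst + avail, 0, len - avail);
    lb->filled -= avail;
    return avail;
}
```

Because the bytes are passed through untouched, the pipe is only correct when the render and capture streams share channel count, sample size and frame rate. A real driver would also need locking between the two stream contexts, which is omitted here.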
My second prototype is based upon MSVAD/WaveCyclic. With WaveCyclic, I
implement the DMA buffer copying logic, so I can do whatever data
conversion is required. I'm aware of a few other 'virtual audio driver'
projects on GitHub with similar goals, all based upon WaveCyclic.
I wanted to see how they dealt with the data format conversion issue,
but when I look at their CopyTo and CopyFrom implementations, *I don't
see them dealing with conversion at all*. There doesn't appear to be any
code (e.g. in DataRangeIntersection) that limits streams to a specific
number of channels or frame-rate.
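For illustration, the kind of format restriction being looked for could be sketched as follows. This is plain C, not WDK code: `PcmFormat`, `kPinnedFormat` and `format_is_accepted` are hypothetical stand-ins for whatever check a driver would perform when a stream format is proposed, and the pinned values (mono, 16-bit, 48000 frames/sec) are an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of a PCM stream format. */
typedef struct {
    uint16_t channels;
    uint16_t bits_per_sample;
    uint32_t frames_per_sec;
} PcmFormat;

/* The single format both endpoints would be limited to (assumed values). */
static const PcmFormat kPinnedFormat = { 1, 16, 48000 };

/* Accept a proposed format only if it matches the pinned one exactly. */
static bool format_is_accepted(const PcmFormat *proposed)
{
    assert(proposed != NULL);
    return proposed->channels        == kPinnedFormat.channels &&
           proposed->bits_per_sample == kPinnedFormat.bits_per_sample &&
           proposed->frames_per_sec  == kPinnedFormat.frames_per_sec;
}
```

If both endpoints rejected everything except one format in this way, a byte-for-byte copy between them would be safe without any conversion step.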
Can anyone shed light on this?
I'm sure my expectations must be off somehow, but it seems to me that
some kind of conversion *must* be required when copying data between
streams with sufficiently different formats...
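As an example of the conversion in question, here is a minimal sketch in plain C, assuming 16-bit signed interleaved PCM. The function names are hypothetical, and the linear-interpolation resampler is only the simplest possible technique: it has no anti-aliasing filter and keeps no state between buffers, so it illustrates the amount of work involved rather than something fit for a shipping driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Average left and right into one channel (stereo -> mono). */
static void downmix_stereo_to_mono(const int16_t *stereo, int16_t *mono,
                                   size_t frames)
{
    assert(stereo != NULL && mono != NULL);
    for (size_t i = 0; i < frames; ++i) {
        int32_t sum = (int32_t)stereo[2 * i] + (int32_t)stereo[2 * i + 1];
        mono[i] = (int16_t)(sum / 2);
    }
}

/* Naive mono resampler using linear interpolation between neighbours.
 * Returns the number of output frames produced (at most out_cap). */
static size_t resample_linear(const int16_t *in, size_t in_frames,
                              uint32_t in_rate, int16_t *out, size_t out_cap,
                              uint32_t out_rate)
{
    assert(in != NULL && out != NULL && in_rate > 0 && out_rate > 0);
    size_t n = 0;
    while (n < out_cap) {
        /* Position of output frame n on the input timeline, as a
         * whole index plus a fraction expressed in units of 1/out_rate. */
        uint64_t pos  = (uint64_t)n * in_rate;
        size_t   idx  = (size_t)(pos / out_rate);
        uint32_t frac = (uint32_t)(pos % out_rate);
        if (idx >= in_frames)
            break;
        int32_t a = in[idx];
        int32_t b = (idx + 1 < in_frames) ? in[idx + 1] : a;
        out[n] = (int16_t)(a + (int64_t)(b - a) * frac / out_rate);
        ++n;
    }
    return n;
}
```

Copying a stereo 44100 frames/sec render stream into a mono 48000 frames/sec capture stream would need both steps, which is why a plain copy between mismatched streams looks suspicious.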