[mirtoolbox] Re: Question about fluctuation patterns

  • From: Olivier Lartillot <olivier.lartillot@xxxxxxxxxxxxx>
  • To: mirtoolbox@xxxxxxxxxxxxx
  • Date: Fri, 31 Jul 2009 18:40:50 +0200

Hi,

On 7.7.2009 at 16.36, Igor Vatolkin wrote:

Hi,

I have a question about the calculation of fluctuation patterns - maybe you can help me.

Back from holidays, replies will be more prompt now.

There are two different questions here, one related to frame decomposition, the other to fluctuation. Let's take them separately:


If I calculate a simple feature like RMS:
--------
a=miraudio('test.wav');
f=mirframe(a,'Length',512,'sp');
res=mirrms(f);
resData=mirgetdata(res);
--------
- I get an array of 231 RMS values for the given file (one value per 512-sample time window). In other words, the song contains 231 time windows of 512 samples.

The number of frames (here, 231) is not determined by the window size (here, 512 samples) alone; it depends mainly on the hop, i.e. the distance between the starts of successive frames.
In your example, you wrote: f=mirframe(a,'Length',512,'sp')
As you did not specify a hop factor, mirframe defaults to half-overlapping frames, so a new frame starts every 512/2 = 256 samples.
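
To make the arithmetic concrete, here is a minimal sketch in plain MATLAB (no MIRtoolbox calls). The signal length N is a hypothetical value, chosen only so that the count matches the 231 frames above, and the rule that only complete frames are kept is an assumption; mirframe may treat the end of the signal differently.

```matlab
% Hypothetical signal length, chosen only so the count matches 231 frames.
N   = 59392;        % signal length in samples (assumption)
len = 512;          % frame length in samples
hop = len * 0.5;    % hop factor 0.5 -> 256 samples between frame starts

% Assumed convention: only complete frames are kept.
nFrames = floor((N - len) / hop) + 1;

% Start sample (1-based) of each frame.
starts = (0:nFrames-1) * hop + 1;

fprintf('%d frames, frame 2 starts at sample %d\n', nFrames, starts(2));
```

With a hop of 256 samples, doubling the frame length to 1024 while keeping the same hop would change the count only slightly, which is why the hop, not the window size, drives the number of frames.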


If I calculate fluctuation patterns:
--------
res=mirfluctuation(f);
resData=mirgetdata(res);
--------
- I get a matrix of 24*1903 values. I guess 24 is the number of bands, but I cannot understand where the number 1903 comes from. In any case, it is not divisible by 231. What I really want to know is which fluctuation pattern feature vector corresponds to which time window (or time period) of the original song. Is that possible?

The mirfluctuation operator computes a spectrum within each of the 24 bands separately. Each FFT here is 1903 samples long. Why? Mainly because, in mirfluctuation, the mirspectrum along bands is computed for the range 0-10 Hz with a minimal frequency resolution of 0.01 Hz. So the 1903 values index fluctuation frequencies, not time windows, which is why that number bears no relation to 231.

Now, to answer your practical question: you can get the frame positions with get(..., 'FramePos') and the X-axis coordinates (for instance the FFT frequencies) with get(..., 'Pos').
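
As a rough illustration of the get(..., 'FramePos') call, applied to the RMS example from earlier in this message: the nested cell indexing and the 2-row layout (start and end time of each frame, in seconds) are assumptions about how the toolbox stores frame positions, so check size(t) on your own data before relying on them.

```matlab
% Same RMS computation as in the question.
a   = miraudio('test.wav');
f   = mirframe(a,'Length',512,'sp');
res = mirrms(f);

% Frame positions; assumed to be a nested cell array (file, then segment).
fp = get(res,'FramePos');
t  = fp{1}{1};      % assumed 2-by-nFrames: row 1 = start, row 2 = end (s)

% Time span of the first frame, under the assumptions above.
fprintf('frame 1: %.4f s to %.4f s\n', t(1,1), t(2,1));
```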


Some more background to this question: if I want to classify the first 5 seconds of a song, I can calculate the mean value and deviation of RMS (it is easy to calculate how many 512-sample time windows there are in these 5 seconds).

You don't even need to know that. Just use:
a = miraudio('czardas','extract',0,5)
r = mirrms(a,'frame')
mirstat(r)

But how can I add such a feature like "mean value of fluctuation pattern x from 5 seconds" to the input vector for my classifier?

fl = mirfluctuation(a,'Summary');
mean(mirgetdata(fl))
would be the direct answer to your question, but I am not sure whether such averaging of the FFT makes much sense. Maybe you could have a look at the peaks of the fluctuation curve?
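
To illustrate the peak idea, here is a minimal sketch in plain MATLAB that picks the strongest local maximum of a curve and turns it into a two-element feature. The curve and its frequency axis are synthetic stand-ins, not real fluctuation data; with real data you would substitute the values from mirgetdata(fl) and the axis from get(fl,'Pos'). MIRtoolbox also has its own peak picker, mirpeaks, which may be the more natural choice.

```matlab
% Synthetic stand-in for a summarized fluctuation curve (NOT real data).
freq  = linspace(0, 10, 1001);    % hypothetical frequency axis in Hz
curve = exp(-(freq-2).^2/0.1) + 0.5*exp(-(freq-6).^2/0.2);

% Local maxima: samples strictly larger than both neighbours.
inner  = curve(2:end-1) > curve(1:end-2) & curve(2:end-1) > curve(3:end);
isPeak = [false, inner, false];
peakFreq = freq(isPeak);
peakVal  = curve(isPeak);

% Keep the strongest peak as a [frequency, magnitude] feature.
[bestVal, k] = max(peakVal);
feature = [peakFreq(k), bestVal];
```

The frequency and magnitude of the dominant peak are usually easier to interpret than a mean taken over all FFT bins, and they fit directly into a classifier input vector.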

Regards,

Olivier
