[mirtoolbox] Re: MFCCs and query by humming

  • From: Olivier Lartillot <olartillot@xxxxxxxxx>
  • To: mirtoolbox@xxxxxxxxxxxxx
  • Date: Fri, 15 Jan 2010 15:04:58 +0200

sarah lam wrote on 13.1.2010 at 19.34:

Oh. Pardon my ignorance, but how is it that some journal papers I read used only MFCCs as their features and were still able to obtain good results for query by humming?

Any thoughts?

And with regard to the MFCC coefficient matrix, I'm not sure whether it is the transposition problem that prevents clustering from being carried out. When I use a range for Rank, e.g. 1:3, I'm able to cluster. However, when I use a single coefficient, I'm not able to do so.

I confirm that the error you got from mircluster with a single rank comes from the bug I mentioned in my previous reply. In the upcoming update of MIRtoolbox, this error will disappear.


On Thu, Jan 14, 2010 at 1:21 AM, Olivier Lartillot <olartillot@xxxxxxxxx > wrote:
OK, so if you used MFCC as the feature, all this means (I hope you will not be too offended by that ;-) is that, despite your singing capabilities, you were not able to reproduce correctly the *timbral* aspect of the original song. I guess this is not surprising, after all, because if the original song contained instrumental sounds, they would sound timbrally quite different from your vocal query.

So, seriously speaking, if you want to compare melodies, then you will need to use something like mirpitch instead of MFCC. But again you will not be able to compare temporal trajectories of pitch with mirdist and mirquery because there is no suitable distance for that in MIRtoolbox (yet).
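To illustrate why pitch is a better starting point than timbre here, below is a minimal sketch, outside MIRtoolbox and in plain Python, of one common way to make a pitch contour independent of the key in which it is hummed. It assumes the pitch track is already available as a list of frequencies in Hz (for instance exported from a pitch extractor); the function names and the example data are hypothetical.

```python
import math
from statistics import median

def hz_to_semitones(f0_hz, ref_hz=440.0):
    """Convert frequencies in Hz to semitones relative to ref_hz.

    Non-positive values (e.g. unvoiced frames marked as 0) are dropped.
    """
    return [12.0 * math.log2(f / ref_hz) for f in f0_hz if f > 0]

def key_invariant_contour(f0_hz):
    """Semitone contour with its median removed, so a transposition cancels out."""
    semis = hz_to_semitones(f0_hz)
    centre = median(semis)
    return [s - centre for s in semis]

# Hypothetical data: the same three-note motif, hummed a fifth higher.
original = [220.0, 246.94, 261.63]
hummed = [f * 1.5 for f in original]

contour_original = key_invariant_contour(original)
contour_hummed = key_invariant_contour(hummed)
```

With this representation the two contours coincide up to rounding error, even though the hummed version is in a different key. It does not address differences in tempo or timing.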

Unless anyone has anything to suggest?

Regards

Olivier


sarah lam wrote on 13.1.2010 at 19.11:

I compute the distance between songs using MFCCs as the feature. Using mirquery, the song that I hummed didn't come out among the top few rankings, which means that the distance between my hummed query and the song I want to retrieve was not the shortest one under MFCCs. So I'm not sure what the reason could be.

On Wed, Jan 13, 2010 at 11:53 PM, Olivier Lartillot <olartillot@xxxxxxxxx > wrote:
Hi Sarah,

On 12.1.2010 at 18.02, sarah lam wrote:

Hi Olivier and the rest of the gang

Question on MFCC
mirmfcc(..., 'Rank', N) computes the coefficients of rank(s) N. The default value is N = 1:13. I'm very puzzled as to why N must be a range and why it can't be a single number. I'm just guessing that perhaps there might not be enough signal information if only a single coefficient is used.

N can be a single value, no problem. However, it is true that in the current release, 1.2.3, there is a bug in the graphical representation and also in mirgetdata when computing mirmfcc with frame decomposition: the vector is erroneously transposed. This will be corrected in the next release.


Question on Query by Humming
When the original wav is used as the query, mirdist() returns the queried song with the shortest distance.

mirdist only returns distances. I guess you meant mirquery(), right?

However when I recorded my humming as the query, mirdist() did not work as well.
And I really do not think I'm singing out of tune.
Is it because there might be too much background music and therefore just humming the melody is not good enough?

Please note that mirdist(x,Y) is nothing more than the computation of the distance between x and Y, where x and Y relate to one given *feature*. So it all depends on the feature you are using. Besides, the distances currently used in mirdist do not take into account, for instance, the distance between the pitch trajectories of the songs and the query. So query by humming using MIRtoolbox would require some further thought.
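As an illustration of what a distance between pitch trajectories could look like, here is a minimal sketch of dynamic time warping (DTW), a technique often used for comparing sequences that differ in timing. It is not part of MIRtoolbox and is not what mirdist computes; it is plain Python, and the contours below are hypothetical values in semitones.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two numeric sequences.

    The local cost is the absolute difference between two samples.
    Returns infinity if exactly one of the sequences is empty.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a sample
                cost[i][j - 1],      # b advances, a repeats a sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Hypothetical pitch contours in semitones.
melody = [0, 2, 4, 5, 7]
slow_hum = [0, 0, 2, 2, 4, 5, 5, 7]  # same notes, some held longer
other = [7, 5, 4, 2, 0]              # a different melody

d_same = dtw_distance(melody, slow_hum)
d_other = dtw_distance(melody, other)
```

In this toy example the slower rendition of the same melody gets a smaller distance than the different melody, because DTW allows samples to be stretched in time before they are compared. The cost grows with the product of the two sequence lengths, so long pitch tracks may need downsampling first.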

Regards,

Olivier



