[mirtoolbox] Re: Accuracy of mirkeystrength estimations for raw-audio vs pitch-contour inputs

From: Olivier Lartillot <olartillot@xxxxxxxxx>
To: mirtoolbox@xxxxxxxxxxxxx
Date: Tue, 8 Jan 2019 18:16:41 +0100

Hi Tudor,

So sorry for the late reply.

About interpreting the frame-decomposed keystrength figure: there is no
single definitive way to read it. One option is to take the mean of the
strength values across frames:
ks = mirkeystrength(…)
m = mirmean(ks)
mirgetdata(m)

Actually, this had stopped working as of MIRtoolbox 1.6, but it works again
in the 1.7.2 update I just released.
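
To illustrate what averaging across frames does, here is a minimal sketch in
plain MATLAB (no MIRtoolbox calls). The key labels and the strength values are
made up for illustration, and the layout (one row per key candidate, one
column per frame) is an assumption, not necessarily what mirgetdata returns.

```matlab
% Toy key-strength matrix: rows = key candidates, columns = frames.
% All values are hypothetical, chosen only to illustrate the averaging.
keys = {'CM', 'FM', 'GM', 'Am'};
ks = [0.70 0.40 0.80 0.75;   % CM: fairly high in most frames
      0.90 0.20 0.30 0.85;   % FM: very high in some frames, low in others
      0.50 0.60 0.40 0.45;   % GM
      0.60 0.65 0.55 0.60];  % Am

% Average each key candidate across frames (along dimension 2).
meanStrength = mean(ks, 2);

% Pick the candidate with the highest mean strength.
[best, i] = max(meanStrength);
fprintf('%s (mean strength %.4f)\n', keys{i}, best);
```

In this toy case the candidate that wins individual frames (FM) is not the one
with the highest mean, which is why looking at a whole row can tell a
different story than looking at single time bins.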

Olivier

On 19 Oct 2018, at 09:36, Tudor Popescu <tudor3@xxxxxxxxx> wrote:

Hi Olivier,

Many thanks for your helpful reply! I had indeed just used
mirkeystrength('soundfile.mp3') previously; but you explain very clearly why
frame decomposition was needed. It seems indeed very much needed in the case
of my tunes, which are vocal outputs and therefore contain no chords, just
individual notes; I will try several l&h combinations to see which one works
best in their case.

What I'm unsure how to interpret however (and couldn't find in the
toolbox's/script's documentation), is the colour-coding in the matrix
produced by mirkeystrength when called with the Frame option. Is it the more
consistent colouring of the CM row in the attached output that tells me that
C major is the highest-evidence candidate, or are the individual time-bin
colours to be interpreted by themselves?

Please also let me know if you had any thoughts as to my other question:
Finally, would my diatonicity estimates likely be any more accurate if
instead of the raw recording, I used a time series representing its simplified
(Prosogram <https://sites.google.com/site/prosogram/home>-stylised) pitch
contour (attached)?

Thanks very much once again!
Tudor

On Fri, 19 Oct 2018 at 09:00, Olivier Lartillot <olartillot@xxxxxxxxx> wrote:
Dear Tudor,

What command did you use exactly? When using
mirkeystrength('mozart sonata Aminor.mp3')
I get the figure mozart.jpg enclosed, where Amin is the second candidate
after Dmin. It gets better with frame decomposition:
mirkeystrength('mozart sonata Aminor.mp3','Frame',5,.5)
cf. mozart_frame.jpg

For the scale, I get scale.jpg, with Fmaj, Dmin and Amin more prominent than
Cmaj. But please note that by computing the keystrength without frame
decomposition, we are considering the spectrum of the whole excerpt, so
basically listening to the 7 notes played all at the same time. The
difference between the tonalities then depends on a single pitch.
The Krumhansl method used in keystrength is not optimal for this kind of
scale, which has no clear chords.
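
The point about a single pitch can be sketched in plain MATLAB. This is only
an illustration of Krumhansl-style profile correlation, not MIRtoolbox's
actual code: the profiles are the published Krumhansl-Kessler values, and the
equal-energy chroma vector is an assumed stand-in for the spectrum of the
whole scale excerpt.

```matlab
% Krumhansl-Kessler probe-tone profiles, tonic first.
major = [6.35 2.23 3.48 2.33 4.38 4.09 2.52 5.19 2.39 3.66 2.29 2.88];
minor = [6.33 2.68 3.52 5.38 2.60 3.53 2.54 4.75 3.98 2.69 3.34 3.17];

% Assumed input: the 7 notes of the C major scale with equal energy,
% as if all were sounding at once (pitch classes C D E F G A B).
chroma = [1 0 1 0 1 1 0 1 0 1 0 1];

% Pearson correlation between two row vectors.
pearson = @(a, b) sum((a - mean(a)) .* (b - mean(b))) / ...
    sqrt(sum((a - mean(a)).^2) * sum((b - mean(b)).^2));

% strength(k,1): major key on pitch class k (1 = C, 2 = C#, ...);
% strength(k,2): minor key on pitch class k.
strength = zeros(12, 2);
for k = 1:12
    % Rotate the profile so that its tonic lands on pitch class k.
    strength(k, 1) = pearson(chroma, circshift(major, [0, k - 1]));
    strength(k, 2) = pearson(chroma, circshift(minor, [0, k - 1]));
end

% C major and F major share six of their seven notes (B vs B-flat).
cMaj = strength(1, 1);
fMaj = strength(6, 1);
aMin = strength(10, 2);
fprintf('C major %.3f, F major %.3f, A minor %.3f\n', cMaj, fMaj, aMin);
```

Since C major and F major differ only in B versus B-flat, their scores are
separated by the energy on that one pitch class, so small differences in how
much energy a real recording puts on each pitch class can change the ranking.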

Olivier

On 18 Oct 2018, at 20:20, Tudor Popescu <tudor3@xxxxxxxxx> wrote:

Dear list (dear Olivier),

I have a bunch of short tunes (MP3), and I'd like to estimate how close each
of them gets to being diatonic, i.e. whether the set of all its employed
notes (almost) spells out one scale or another, as opposed to being flat. As
a proxy of this "diatonicity" measure, I thought I'd look for a particularly
high peak in a file's key strength profile, as estimated by mirkeystrength.

I first did a 'hello world'/sanity check with a short sample from an A-minor
piano sonata, as well as an even simpler case, a recording of the notes of
the C major scale played up and down. I expected a sharp peak for the
estimated key, with much lower other peaks. However, the strongest key
estimates, rather than the expected A minor and C major, were instead D
minor and F major respectively (attached).

Granted, these estimated keys are only one step away on the circle of fifths
from the correct keys (which themselves had quite high peaks in the plot);
but I am wondering how reliable these key estimates are if they are not
sharp even for such simple examples. The MIRToolbox uses well-polished
algorithms, so I'm rather inclined to think I am somehow wrongly
interpreting its outputs?!

Finally, would my diatonicity estimates likely be any more accurate if
instead of the raw recording, I used a time series representing its simplified
(Prosogram <https://sites.google.com/site/prosogram/home>-stylised) pitch
contour (attached)?

Many thanks for your help!

Tudor
<keystrength estimate for Cmajor scale.jpg>
<mozart sonata Aminor.mp3>
<Cmajor scale.mp3>
<keystrength estimate for mozart sonata Aminor.jpg>
<Prosogram-stylised pitch contour.jpg>
<mirkeystrenth (with frames) estimate for Cmajor.jpg>
<scale.jpg>
