[codeface] AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: Preparing time series data - sloccount analysis

  • From: Matthias Gemmer <matthias.gemmer@xxxxxxxxxxxxxxxxxxxx>
  • To: "codeface@xxxxxxxxxxxxx" <codeface@xxxxxxxxxxxxx>
  • Date: Thu, 5 Mar 2015 09:24:30 +0000

>From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on behalf 
>of Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx>
>Sent: Wednesday, 4 March 2015 20:44
>To: codeface@xxxxxxxxxxxxx
>Subject: [codeface] Re: AW: Re: AW: Re: AW: Re: AW: Re: Preparing time series 
>data - sloccount analysis
>
>On 04/03/2015 at 15:58, Matthias Gemmer wrote:
>>> From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on 
>>> behalf of Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx>
>>> Sent: Tuesday, 3 March 2015 20:32
>>> To: codeface@xxxxxxxxxxxxx
>>> Subject: [codeface] Re: AW: Re: AW: Re: AW: Re: Preparing time series data 
>>> - sloccount analysis
>>>
>>> On 03/03/2015 at 10:40, Matthias Gemmer wrote:
>>>>> From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on 
>>>>> behalf of Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx>
>>>>> Sent: Monday, 2 March 2015 18:28
>>>>> To: codeface@xxxxxxxxxxxxx; Mitchell Joblin
>>>>> Subject: [codeface] Re: AW: Re: AW: Re: Preparing time series data - 
>>>>> sloccount analysis
>>>>>
>>>>> On 02/03/2015 at 18:09, Matthias Gemmer wrote:
>>>>>>> From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on 
>>>>>>> behalf of Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx>
>>>>>>> Sent: Monday, 2 March 2015 17:22
>>>>>>> To: codeface@xxxxxxxxxxxxx
>>>>>>> Subject: [codeface] Re: AW: Re: Preparing time series data - sloccount 
>>>>>>> analysis
>>>>>>>
>>>>>>> On 02/03/2015 at 17:20, Wolfgang Mauerer wrote:
>>>>>>>>> Enter an environment number, or 0 to exit  Selection: 9
>>>>>>>>> Browsing in the environment with call:
>>>>>>>>>    add.sloccount.ts(conf, sloccount.plot.id, commit.date, res)
>>>>>>>>> Called from: debugger.look(ind)
>>>>>>>>> Browse[1]> ls()
>>>>>>>>> [1] "commit.date" "conf"        "plot.id"     "values"
>>>>>>>>> Browse[1]> print(commit.date)
>>>>>>>>> [1] "2006-09-05 20:20:16 UTC"
>>>>>>>>> Browse[1]> print(values)
>>>>>>>>> $lang.info
>>>>>>>>>   lang lines  fraction
>>>>>>>>> 1  xml    98 0.5833333
>>>>>>>>> 2 perl    70 0.4166667
>>>>>>>>>
>>>>>>>>> $metrics
>>>>>>>>>   person.months total.cost schedule.months avg.devel
>>>>>>>>> 1          0.37       4426            1.71      0.22
>>>>>>>>
>>>>>>> that looks alright -- is plot.id properly assigned?
>>>>>>
>>>>>> Browse[1]> print(plot.id)
>>>>>> numeric(0)
>>>>> so that's the culprit... There is no valid plot ID for the time
>>>>> series in the database. Can you please check that an appropriate
>>>>> table is available in the database?
>>>>>
>>>>
>>>> There is a table called timeseries with the column plotId.
>>>> mysql> DESCRIBE timeseries;
>>>> +--------------+------------+------+-----+---------+-------+
>>>> | Field        | Type       | Null | Key | Default | Extra |
>>>> +--------------+------------+------+-----+---------+-------+
>>>> | plotId       | bigint(20) | NO   | MUL | NULL    |       |
>>>> | time         | datetime   | NO   |     | NULL    |       |
>>>> | value        | double     | NO   |     | NULL    |       |
>>>> | value_scaled | double     | YES  |     | NULL    |       |
>>>> +--------------+------------+------+-----+---------+-------+
>>>> 4 rows in set (0.02 sec)
>>>>
>>>> The table is also populated: it contains rows for plotId=5,
>>>> plotId=6, plotId=7 and plotId=8.
>>>>
>>>>>
>>>>> Which values do sloccount.plot.id (and understand.plot.id) have
>>>>> in do.complexity.analysis (Frame 3/4)?
>>>>>
>>>>
>>>> The values for sloccount.plot.id and understand.plot.id are obviously
>>>> invalid.
>>>>
>>>> Browse[1]> print(sloccount.plot.id)
>>>> numeric(0)
>>>> Browse[1]> print(understand.plot.id)
>>>> numeric(0)
>>>
>>> it was not so obvious to me; I was trying to ensure that
>>> parallelisation did not introduce any issues here. But your observation
>>> clarified that this is not the case.
>>>
>>> Since the error seems to be deterministically reproducible at your
>>> site, can you debug around the creation of the index (for instance by
>>> printing out what's going on; alternatively, you could also use the
>>> built-in debugger)?
>>>
>>
>> In the file codeface/R/complexity.r:
>>
>> Assignment of sloccount.plot.id and understand.plot.id:
>>   ## Obtain a plot IDs for the sloccount and understand raw time series
>>   ## before parallel processing commences to avoid race conditions
>>   sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount")
>>   understand.plot.id <- get.or.create.plot.id(conf, "understand_raw")
>>       -> sloccount.plot.id and understand.plot.id have the value "x".
>>          Are these values feasible? Or shall I have a closer look at the
>>          function 'get.or.create.plot.id'?
>
>since the SQL specification for the plot ID is
>
>`id` BIGINT NOT NULL AUTO_INCREMENT
>
>the value "x" seems quite impossible. Can you please query your
>database to see what value is stored there?
>

The table is empty.
mysql> select * from plots;
Empty set (0.01 sec)
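
An empty plots table would be consistent with the numeric(0) seen in the
debugger: in R, selecting a column from a result with zero rows gives a
zero-length vector. The following is only a minimal sketch of that behaviour,
not codeface code. The data frame stands in for a query result (an
assumption; the actual query in get.or.create.plot.id is not shown in this
thread), and check.plot.id is a hypothetical guard that would stop early with
a clearer message.

```r
## Hypothetical stand-in for the result of querying an empty plots table:
## a data frame with an id column but no rows
plots <- data.frame(id=numeric(0))

## Selecting the id column from zero rows gives a zero-length vector
plot.id <- plots$id
print(plot.id)

## Hypothetical guard (not part of codeface): stop with a clear message
## unless exactly one non-NA ID is available
check.plot.id <- function(id, name) {
  if (length(id) != 1 || is.na(id)) {
    stop("No unique plot ID for '", name, "' (got ", length(id), " values)")
  }
  id
}

## Capture the error message instead of aborting the script
res <- tryCatch(check.plot.id(plot.id, "sloccount"),
                error=function(e) conditionMessage(e))
print(res)
```

With such a check placed directly after the ID lookup, the failure would show
up where the ID is obtained rather than later in add.sloccount.ts.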

-- Matthias Gemmer
>
>Best regards, Wolfgang Mauerer
>>
>> The part where the sloccount analysis is performed:
>>   if (conf$sloccount == TRUE) {
>>       loginfo(str_c("Performing sloccount analysis for ", commit.hash, "\n"),
>>                       logger="complexity")
>>       res <- do.sloccount.analysis(code.dir)
>>
>>       # This call fails
>>       add.sloccount.ts(conf, sloccount.plot.id, commit.date, res)
>>   }
>>   logdevinfo("Finished analysing sample ", i, "\n", logger="complexity")
>>
>>       -> After 'res <- do.sloccount.analysis(code.dir)' res contains:
>>               "lang.info.lang"
>>               "lang.info.lines"
>>               "lang.info.fraction"
>>               "metrics.person.months"
>>               "metrics.total.cost"
>>               "metrics.schedule.months"
>>               "metrics.avg.devel"
>>               "1"
>>               "xml"
>>               98
>>               0.583333333333333
>>               0.37
>>               4426
>>               1.71
>>               0.22
>>               "2"
>>               "perl"
>>               70
>>               0.416666666666667
>>               0.37
>>               4426
>>               1.71
>>               0.22
>>
>> Best regards, Matthias Gemmer
>>
>>> Best regards, Wolfgang Mauerer
>>
