On 04/03/2015 at 15:58, Matthias Gemmer wrote:
>> From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on
>> behalf of Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx>
>> Sent: Tuesday, 3 March 2015 20:32
>> To: codeface@xxxxxxxxxxxxx
>> Subject: [codeface] Re: AW: Re: AW: Re: AW: Re: Preparing time series data -
>> sloccount analysis
>>
>> On 03/03/2015 at 10:40, Matthias Gemmer wrote:
>>>> From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on
>>>> behalf of Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx>
>>>> Sent: Monday, 2 March 2015 18:28
>>>> To: codeface@xxxxxxxxxxxxx; Mitchell Joblin
>>>> Subject: [codeface] Re: AW: Re: AW: Re: Preparing time series data -
>>>> sloccount analysis
>>>>
>>>> On 02/03/2015 at 18:09, Matthias Gemmer wrote:
>>>>>> From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on
>>>>>> behalf of Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx>
>>>>>> Sent: Monday, 2 March 2015 17:22
>>>>>> To: codeface@xxxxxxxxxxxxx
>>>>>> Subject: [codeface] Re: AW: Re: Preparing time series data - sloccount
>>>>>> analysis
>>>>>>
>>>>>> On 02/03/2015 at 17:20, Wolfgang Mauerer wrote:
>>>>>>>> Enter an environment number, or 0 to exit Selection: 9
>>>>>>>> Browsing in the environment with call:
>>>>>>>> add.sloccount.ts(conf, sloccount.plot.id, commit.date, res)
>>>>>>>> Called from: debugger.look(ind)
>>>>>>>> Browse[1]> ls()
>>>>>>>> [1] "commit.date" "conf" "plot.id" "values"
>>>>>>>> Browse[1]> print(commit.date)
>>>>>>>> [1] "2006-09-05 20:20:16 UTC"
>>>>>>>> Browse[1]> print(values)
>>>>>>>> $lang.info
>>>>>>>>   lang lines  fraction
>>>>>>>> 1  xml    98 0.5833333
>>>>>>>> 2 perl    70 0.4166667
>>>>>>>>
>>>>>>>> $metrics
>>>>>>>>   person.months total.cost schedule.months avg.devel
>>>>>>>> 1          0.37       4426            1.71      0.22
>>>>>>>
>>>>>> that looks alright -- is plot.id properly assigned?
>>>>>
>>>>> Browse[1]> print(plot.id)
>>>>> numeric(0)
>>>> so that's the culprit...
>>>> There is no valid plot ID for the time
>>>> series in the database. Can you please check that an appropriate
>>>> table is available in the database?
>>>>
>>>
>>> There is a table called timeseries with the column plotId.
>>> mysql> DESCRIBE timeseries;
>>> +--------------+------------+------+-----+---------+-------+
>>> | Field        | Type       | Null | Key | Default | Extra |
>>> +--------------+------------+------+-----+---------+-------+
>>> | plotId       | bigint(20) | NO   | MUL | NULL    |       |
>>> | time         | datetime   | NO   |     | NULL    |       |
>>> | value        | double     | NO   |     | NULL    |       |
>>> | value_scaled | double     | YES  |     | NULL    |       |
>>> +--------------+------------+------+-----+---------+-------+
>>> 4 rows in set (0.02 sec)
>>>
>>> The table is also filled with data. The table contains datasets for
>>> plotId=5, plotId=6, plotId=7 and plotId=8.
>>>
>>>>
>>>> Which values do sloccount.plot.id (and understand.plot.id) have
>>>> in do.complexity.analysis (Frame 3/4)?
>>>>
>>>
>>> The values for sloccount.plot.id and understand.plot.id are obviously
>>> invalid.
>>>
>>> Browse[1]> print(sloccount.plot.id)
>>> numeric(0)
>>> Browse[1]> print(understand.plot.id)
>>> numeric(0)
>>
>> it was not so obvious to me; I was trying to ensure that
>> parallelisation did not introduce any issues here. But your observation
>> clarified that this is not the case.
>>
>> Since the error seems to be deterministically reproducible at your
>> site, can you debug around the creation of the index (for instance by
>> printing out what's going on; alternatively, you could also use the
>> built-in debugger)?
>>
>
> In the file codeface/R/complexity.r:
>
> Assignment of sloccount.plot.id and understand.plot.id:
> ## Obtain a plot IDs for the sloccount and understand raw time series before
> ## parallel processing commences to avoid race conditions
> sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount")
> understand.plot.id <- get.or.create.plot.id(conf, "understand_raw")
> -> sloccount.plot.id and understand.plot.id have the value "x".
> Are these values feasible? Or shall I have a closer look at the
> function 'get.or.create.plot.id'?

Since the SQL specification for the plot ID is

`id` BIGINT NOT NULL AUTO_INCREMENT

the value "x" seems quite impossible. Can you please query your
database to see what value is stored there?

Best regards, Wolfgang Mauerer

>
> The part where the sloccount analysis is performed:
> if (conf$sloccount == TRUE) {
>   loginfo(str_c("Performing sloccount analysis for ", commit.hash, "\n"),
>           logger="complexity")
>   res <- do.sloccount.analysis(code.dir)
>
>   # This call fails
>   add.sloccount.ts(conf, sloccount.plot.id, commit.date, res)
> }
> logdevinfo("Finished analysing sample ", i, "\n", logger="complexity")
>
> -> After 'res <- do.sloccount.analysis(code.dir)' res contains:
> "lang.info.lang"
> "lang.info.lines"
> "lang.info.fraction"
> "metrics.person.months"
> "metrics.total.cost"
> "metrics.schedule.months"
> "metrics.avg.devel"
> "1"
> "xml"
> 98
> 0.583333333333333
> 0.37
> 4426
> 1.71
> 0.22
> "2"
> "perl"
> 70
> 0.416666666666667
> 0.37
> 4426
> 1.71
> 0.22
>
> Best regards, Matthias Gemmer
>
>> Best regards, Wolfgang Mauerer
>