[codeface] Re: Preparing time series data - sloccount analysis

  • From: Matthias Gemmer <matthias.gemmer@xxxxxxxxxxxxxxxxxxxx>
  • To: "codeface@xxxxxxxxxxxxx" <codeface@xxxxxxxxxxxxx>
  • Date: Thu, 5 Mar 2015 13:19:03 +0000

>From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on behalf
>of Mitchell Joblin <joblin.m@xxxxxxxxx>
>Sent: Thursday, 5 March 2015 13:14
>To: Wolfgang Mauerer
>Cc: codeface@xxxxxxxxxxxxx
>Subject: [codeface] Re: Preparing time series data - sloccount analysis
>
>On Thu, Mar 5, 2015 at 11:12 AM, Wolfgang Mauerer
><wolfgang.mauerer@xxxxxxxxxxx> wrote:
>> On 05.03.2015 12:04, Matthias Gemmer wrote:
>>>>>>>>>>>
>>>>>>>>>>> Browse[1]> print(plot.id)
>>>>>>>>>>> numeric(0)
>>>>>>>>>>
>>>>>>>>>> so that's the culprit... There is no valid plot ID for the time
>>>>>>>>>> series in the database. Can you please check that an appropriate
>>>>>>>>>> table is available in the database?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> There is a table called timeseries with the column plotId.
>>>>>>>>> mysql> DESCRIBE timeseries;
>>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>>> | Field        | Type       | Null | Key | Default | Extra |
>>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>>> | plotId       | bigint(20) | NO   | MUL | NULL    |       |
>>>>>>>>> | time         | datetime   | NO   |     | NULL    |       |
>>>>>>>>> | value        | double     | NO   |     | NULL    |       |
>>>>>>>>> | value_scaled | double     | YES  |     | NULL    |       |
>>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>>> 4 rows in set (0.02 sec)
>>>>>>>>>
>>>>>>>>> The table is also filled with data. The table contains datasets for
>>>>>>>>> plotId=5, plotId=6, plotId=7 and plotId=8.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Which values do sloccount.plot.id (and understand.plot.id) have
>>>>>>>>>> in do.complexity.analysis (Frame 3/4)?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The values for sloccount.plot.id and understand.plot.id are
>>>>>>>>> obviously invalid.
>>>>>>>>>
>>>>>>>>> Browse[1]> print(sloccount.plot.id)
>>>>>>>>> numeric(0)
>>>>>>>>> Browse[1]> print(understand.plot.id)
>>>>>>>>> numeric(0)
>>>>>>>>
>>>>>>>>
>>>>>>>> It was not so obvious to me; I was trying to ensure that
>>>>>>>> parallelisation did not introduce any issues here. But your
>>>>>>>> observation clarified that this is not the case.
>>>>>>>>
>>>>>>>> Since the error seems to be deterministically reproducible at your
>>>>>>>> site, can you debug around the creation of the index (for instance by
>>>>>>>> printing out what's going on; alternatively, you could also use the
>>>>>>>> built-in debugger)?
>>>>>>>>
>>>>>>>
>>>>>>> In the file codeface/R/complexity.r:
>>>>>>>
>>>>>>> Assignment of sloccount.plot.id and understand.plot.id:
>>>>>>>    ## Obtain plot IDs for the sloccount and understand raw time series
>>>>>>>    ## before parallel processing commences to avoid race conditions
>>>>>>>    sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount")
>>>>>>>    understand.plot.id <- get.or.create.plot.id(conf, "understand_raw")
>>>>>>>        -> sloccount.plot.id and understand.plot.id have the value "x".
>>>>>>>           Are these values feasible? Or shall I have a closer look at
>>>>>>>           the function 'get.or.create.plot.id'?
>>>>>>
>>>>>>
>>>>>> since the SQL specification for the plot ID is
>>>>>>
>>>>>> `id` BIGINT NOT NULL AUTO_INCREMENT
>>>>>>
>>>>>> the value "x" seems quite impossible. Can you please query your
>>>>>> database to see what value is stored there?
>>>>>>
>>>>>
>>>>> The table is empty.
>>>>> mysql> select * from plots;
>>>>> Empty set (0.01 sec)
>>>>
>>>>
>>>> please try to run the other SQL statements produced by the code to see
>>>> why no entry is created. get.or.create.plot.id() inserts a new entry
>>>> into the table if no ID for a desired plot is available.
>>>
>>>
>>> The branch which creates a plot ID is not entered. The condition
>>> 'length(res) < 1' is not satisfied in either case (sloccount.plot.id or
>>> understand.plot.id).
>>>
>>> For sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount"):
>>>    res <- dbGetQuery(con, str_c(query, ";"))
>>>    # str_c(query, ";"): SELECT id FROM plots WHERE name='sloccount'
>>>    #                    AND projectId=2;
>>>    # res: "id"
>>>    # length(res): 1
>>>    if (length(res) < 1) {
>>>      ## Plot ID is not assigned yet, create one
>>>      res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
>>>    } else {
>>>      res <- res$id
>>>    }
>>>    # res: "x"
>>
>>
>> @Mitchell, could you try to reproduce this? I don't see why a result
>> with non-zero length should be returned from the SQL query if the
>> database is empty.
>
>The SQL query probably returns a data frame and length(..) called on a
>data frame does not return the number of rows. To get the number of
>rows of a data frame you should be using nrow(..) instead of
>length(..).
>
>--Mitchell
>

That worked for me.
After replacing 'length' with 'nrow' a new plot ID is created!
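
For reference, the difference between the two functions can be reproduced
without a database. The following is a minimal sketch, assuming that
dbGetQuery() returns a zero-row data frame with a single id column when no
row matches. The status variable is made up for illustration and is not part
of codeface.

```r
## Zero-row data frame with one column, mimicking an empty query result.
## (Assumption: this is the shape dbGetQuery() returned in the case above.)
res <- data.frame(id = numeric(0))

length(res)   # number of columns: 1
nrow(res)     # number of rows: 0
res$id        # numeric(0), the same invalid value seen for the plot IDs

## Row-based emptiness check. With length(res) < 1 the else branch would be
## taken instead, because a data frame with one column has length 1.
if (nrow(res) < 1) {
  status <- "create"   # hypothetical marker: a new plot ID is needed
} else {
  status <- "reuse"    # hypothetical marker: an existing ID can be used
}
print(status)
```

This matches the observation above that length(res) was 1 although the
manual SELECT returned an empty set.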

Now an error appears in add.sloccount.ts:
  dat <- cbind(plotId=plot.id, time=commit.date, values$metrics)
  res <- dbWriteTable(conf$con, "sloccount_ts", dat, append=TRUE,
                      row.names=FALSE)

The cbind works fine, but the subsequent 'dbWriteTable' call fails with
"Unknown column 'person.months' in 'field list'".

While debugging (frame 10 of the traceback), I can see that the data frame
does contain a column 'person.months':
Browse[1]> ls()
[1] "conn"  "name"  "value"
Browse[1]> print(value)
  plotId                time person.months total.cost schedule.months avg.devel
1      1 2006-09-05 20:20:16          0.37       4426            1.71      0.22
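
Since the data frame contains the column, the message presumably refers to
the database table. One possible cause, which is an assumption and not
confirmed anywhere in this thread, is that the sloccount_ts table names its
columns differently, for instance with underscores instead of dots. A sketch
for checking this follows; the table.fields vector is hypothetical and would
be read with dbListFields(conf$con, "sloccount_ts") on a live connection.

```r
## Data frame as shown in the debugger output (one row).
dat <- data.frame(plotId = 1, time = "2006-09-05 20:20:16",
                  person.months = 0.37, total.cost = 4426,
                  schedule.months = 1.71, avg.devel = 0.22)

## HYPOTHETICAL column names of the sloccount_ts table. These are assumed
## for illustration and must be checked against the real schema.
table.fields <- c("plotId", "time", "person_months", "total_cost",
                  "schedule_months", "avg_devel")

## Columns present in the data frame but unknown to the table
missing.cols <- setdiff(names(dat), table.fields)
print(missing.cols)

## If the mismatch is only dot versus underscore, renaming the data frame
## columns before the write would resolve it.
names(dat) <- gsub(".", "_", names(dat), fixed = TRUE)
print(setdiff(names(dat), table.fields))
```

If missing.cols turns out to be empty against the real table, the cause lies
elsewhere and this renaming would not help.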

...
2015-03-05 13:41:11 [codeface.R.complexity] INFO: Performing sloccount analysis 
for acf102237f9e26c1b864bf6e432f65040b477851
2015-03-05 13:41:11 [codeface.R] INFO: Traceback:
  0:
  1: config.script.run({
         conf <- config.from.args(positional.args = list("rep
  2: withCallingHandlers(expr, error = function(e) {
         if (!interactive()) {

  3: do.complexity.analysis(conf)
  4: mclapply.db(conf, 1:nrow(commits.list), function(conf, i) {
         logdevinfo(s
  5: mclapply(X, function(i) {
         conf <- init.db.global(conf)
         res.local <- F
  6: lapply(X = X, FUN = FUN, ...)
  7: FUN(1:18[[1]], ...)
  8: FUN(conf, i)
  9: add.sloccount.ts(conf, sloccount.plot.id, commit.date, res)
 10: dbWriteTable(conf$con, "sloccount_ts", dat, append = TRUE, row.names = FALS
 11: .valueClassTest({
         standardGeneric("dbWriteTable")
     }, "logical", "dbWriteT
 12: is(object, Cl)
 13: is(object, Cl)
 14: .local(conn, name, value, ...)
 15: dbGetQuery(conn, sql)
 16: dbGetQuery(conn, sql)
 17: dbSendQuery(conn, statement, ...)
 18: .valueClassTest(standardGeneric("dbSendQuery"), "DBIResult", "dbSendQuery")
 19: is(object, Cl)
 20: is(object, Cl)
 21: .local(conn, statement, ...)
2015-03-05 13:41:11 [codeface.R] CRITICAL: could not run statement: Unknown 
column 'person.months' in 'field list'
2015-03-05 13:41:11 [codeface.R] INFO: Error dump was written to 
'error.dump.rda'.
2015-03-05 13:41:11 [codeface.R] INFO: To debug, launch R and run 
'load("error.dump.rda"); debugger(error.dump)'
2015-03-05 13:41:11 [codeface.util] MainProcess ERROR: Command 
'/home/codeface/codeface/codeface/R/complexity.r --loglevel info -c 
/home/codeface/codeface/codeface.conf -p /tmp/jqueryHzVMOt -j 1 
/home/codeface/projects/jquery/.git 1' failed with exit code 1.
(stdout: None
stderr: None)
Traceback (most recent call last):
  File "/home/codeface/.local/bin/codeface", line 9, in <module>
    load_entry_point('codeface==0.2.0', 'console_scripts', 'codeface')()
  File "/home/codeface/codeface/codeface/cli.py", line 198, in main
    return run(sys.argv)
  File "/home/codeface/codeface/codeface/cli.py", line 194, in run
    return args.func(args)
  File "/home/codeface/codeface/codeface/cli.py", line 112, in cmd_run
    args.profile_r, args.jobs, args.tagging)
  File "/home/codeface/codeface/codeface/project.py", line 173, in project_analyse
    execute_command(cmd, direct_io=True, cwd=cwd)
  File "/home/codeface/codeface/codeface/util.py", line 278, in execute_command
    raise Exception(msg)
Exception: Command '/home/codeface/codeface/codeface/R/complexity.r --loglevel 
info -c /home/codeface/codeface/codeface.conf -p /tmp/jqueryHzVMOt -j 1 
/home/codeface/projects/jquery/.git 1' failed with exit code 1.
(stdout: None
stderr: None)

Best regards, Matthias Gemmer

>
>>
>> Thanks, Wolfgang Mauerer
>>
>>>
>>> For understand.plot.id <- get.or.create.plot.id(conf, "understand_raw"):
>>>    res <- dbGetQuery(con, str_c(query, ";"))
>>>    # str_c(query, ";"): SELECT id FROM plots WHERE name='understand_raw'
>>>    #                    AND projectId=2;
>>>    # res: "id" (Manually: Empty set (see below))
>>>    # length(res): 1
>>>    if (length(res) < 1) {
>>>      ## Plot ID is not assigned yet, create one
>>>      res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
>>>    } else {
>>>      res <- res$id
>>>    }
>>>    # res: "x"
>>>
>>> SQL Statements run manually:
>>> mysql> SELECT id FROM plots WHERE name='sloccount' AND projectId=2;
>>> Empty set (0.00 sec)
>>>
>>> mysql> SELECT id FROM plots WHERE name='understand_raw' AND projectId=2;
>>> Empty set (0.00 sec)
>>>
>>> Best regards, Matthias Gemmer
>>>
>>>> Best regards, Wolfgang Mauerer
>>>>
>>>
>>

