[codeface] Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: Preparing time series data - sloccount analysis

  • From: Wolfgang Mauerer <wolfgang.mauerer@xxxxxxxxxxx>
  • To: codeface@xxxxxxxxxxxxx, Mitchell Joblin <joblin.m@xxxxxxxxx>
  • Date: Thu, 05 Mar 2015 12:12:47 +0100

On 05.03.2015 12:04, Matthias Gemmer wrote:
Browse[1]> print(plot.id)
numeric(0)
so that's the culprit... There is no valid plot ID for the time
series in the database. Can you please check that an appropriate
table is available in the database?


There is a table called timeseries with the column plotId.
mysql> DESCRIBE timeseries;
+--------------+------------+------+-----+---------+-------+
| Field        | Type       | Null | Key | Default | Extra |
+--------------+------------+------+-----+---------+-------+
| plotId       | bigint(20) | NO   | MUL | NULL    |       |
| time         | datetime   | NO   |     | NULL    |       |
| value        | double     | NO   |     | NULL    |       |
| value_scaled | double     | YES  |     | NULL    |       |
+--------------+------------+------+-----+---------+-------+
4 rows in set (0.02 sec)

The table is also filled with data. The table contains datasets for
plotId=5, plotId=6, plotId=7 and plotId=8.


Which values do sloccount.plot.id (and understand.plot.id) have
in do.complexity.analysis (Frame 3/4)?


The values for sloccount.plot.id and understand.plot.id are obviously
invalid.

Browse[1]> print(sloccount.plot.id)
numeric(0)
Browse[1]> print(understand.plot.id)
numeric(0)

it was not so obvious to me; I was trying to ensure that
parallelisation did not introduce any issues here. But your observation
clarified that this is not the case.

Since the error seems to be deterministically reproducible at your
site, can you debug around the creation of the index (for instance by
printing out what's going on; alternatively, you could also use the
built-in debugger)?


In the file codeface/R/complexity.r:

Assignment of sloccount.plot.id and understand.plot.id:
   ## Obtain plot IDs for the sloccount and understand raw time series before
   ## parallel processing commences to avoid race conditions
   sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount")
   understand.plot.id <- get.or.create.plot.id(conf, "understand_raw")
       -> sloccount.plot.id and understand.plot.id have the value "x".
          Are these values feasible? Or shall I have a closer look at the
          function 'get.or.create.plot.id'?

since the SQL specification for the plot ID is

`id` BIGINT NOT NULL AUTO_INCREMENT

the value "x" seems quite impossible. Can you please query your
database to see what value is stored there?


The table is empty.
mysql> select * from plots;
Empty set (0.01 sec)

please try to run the other SQL statements produced by the code to see
why no entry is created. get.or.create.plot.id() inserts a new entry
into the table if no ID for a desired plot is available.

The branch which creates a plot ID is not entered. The condition
'length(res) < 1' is not satisfied in either case (sloccount.plot.id and
understand.plot.id).

For sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount"):
   res <- dbGetQuery(con, str_c(query, ";"))
   # str_c(query, ";"): SELECT id FROM plots WHERE name='sloccount' AND projectId=2;
   # res: "id"
   # length(res): 1
   if (length(res) < 1) {
     ## Plot ID is not assigned yet, create one
     res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
   } else {
     res <- res$id
   }
   # res: "x"

@Mitchell, could you try to reproduce this? I don't see why a result
with non-zero length should be returned from the SQL query if the
database is empty.

Thanks, Wolfgang Mauerer

For understand.plot.id <- get.or.create.plot.id(conf, "understand_raw"):
   res <- dbGetQuery(con, str_c(query, ";"))
   # str_c(query, ";"): SELECT id FROM plots WHERE name='understand_raw' AND projectId=2;
   # res: "id" (Manually: Empty set (see below))
   # length(res): 1
   if (length(res) < 1) {
     ## Plot ID is not assigned yet, create one
     res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
   } else {
     res <- res$id
   }
   # res: "x"

SQL Statements run manually:
mysql> SELECT id FROM plots WHERE name='sloccount' AND projectId=2;
Empty set (0.00 sec)

mysql> SELECT id FROM plots WHERE name='understand_raw' AND projectId=2;
Empty set (0.00 sec)
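
The observations above fit together if dbGetQuery() hands back a data
frame with one column and zero rows for an empty result set: printing it
shows only the column name "id", length(res) is 1 because length() on a
data frame counts columns rather than rows, and res$id is a zero-length
vector, which matches the numeric(0) seen for the plot IDs. That the
installed database driver behaves this way is an assumption, not something
confirmed in this thread. A minimal sketch in plain R, without a database
connection; has.rows() is a hypothetical helper and not part of codeface:

```r
## Stand-in for an empty query result: one column named "id", zero rows.
## (Assumption: this mimics what dbGetQuery() returned in the session above.)
res <- data.frame(id = numeric(0))

print(length(res))  # number of columns: 1
print(nrow(res))    # number of rows: 0
print(res$id)       # numeric(0)

## Hypothetical helper: TRUE only if the result contains at least one row
has.rows <- function(res) {
  return(is.data.frame(res) && nrow(res) > 0)
}

print(has.rows(res))                 # FALSE for the empty result
print(has.rows(data.frame(id = 5)))  # TRUE once an ID exists
```

If that assumption holds, a check on nrow(res) instead of length(res)
would enter the branch that creates the plot ID.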

Best regards, Matthias Gemmer

Best regards, Wolfgang Mauerer



