[codeface] Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: Preparing time series data - sloccount analysis

  • From: Mitchell Joblin <joblin.m@xxxxxxxxx>
  • To: Wolfgang Mauerer <wolfgang.mauerer@xxxxxxxxxxx>
  • Date: Thu, 5 Mar 2015 12:14:38 +0000

On Thu, Mar 5, 2015 at 11:12 AM, Wolfgang Mauerer
<wolfgang.mauerer@xxxxxxxxxxx> wrote:
> On 05.03.2015 12:04, Matthias Gemmer wrote:
>>>>>>>>>>
>>>>>>>>>> Browse[1]> print(plot.id)
>>>>>>>>>> numeric(0)
>>>>>>>>>
>>>>>>>>> so that's the culprit... There is no valid plot ID for the time
>>>>>>>>> series in the database. Can you please check that an appropriate
>>>>>>>>> table is available in the database?
>>>>>>>>>
>>>>>>>>
>>>>>>>> There is a table called timeseries with the column plotId.
>>>>>>>> mysql> DESCRIBE timeseries;
>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>> | Field        | Type       | Null | Key | Default | Extra |
>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>> | plotId       | bigint(20) | NO   | MUL | NULL    |       |
>>>>>>>> | time         | datetime   | NO   |     | NULL    |       |
>>>>>>>> | value        | double     | NO   |     | NULL    |       |
>>>>>>>> | value_scaled | double     | YES  |     | NULL    |       |
>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>> 4 rows in set (0.02 sec)
>>>>>>>>
>>>>>>>> The table is also filled with data. The table contains datasets for
>>>>>>>> plotId=5, plotId=6, plotId=7 and plotId=8.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Which values do sloccount.plot.id (and understand.plot.id) have
>>>>>>>>> in do.complexity.analysis (Frame 3/4)?
>>>>>>>>>
>>>>>>>>
>>>>>>>> The values for sloccount.plot.id and understand.plot.id are
>>>>>>>> obviously invalid.
>>>>>>>>
>>>>>>>> Browse[1]> print(sloccount.plot.id)
>>>>>>>> numeric(0)
>>>>>>>> Browse[1]> print(understand.plot.id)
>>>>>>>> numeric(0)
>>>>>>>
>>>>>>>
>>>>>>> it was not so obvious to me; I was trying to ensure that
>>>>>>> parallelisation did not introduce any issues here. But your
>>>>>>> observation clarified that this is not the case.
>>>>>>>
>>>>>>> Since the error seems to be deterministically reproducible at your
>>>>>>> site, can you debug around the creation of the index (for instance by
>>>>>>> printing out what's going on; alternatively, you could also use the
>>>>>>> built-in debugger)?
>>>>>>>
>>>>>>
>>>>>> In the file codeface/R/complexity.r:
>>>>>>
>>>>>> Assignment of sloccount.plot.id and understand.plot.id:
>>>>>>    ## Obtain a plot IDs for the sloccount and understand raw time
>>>>>>    ## series before parallel processing commences to avoid race conditions
>>>>>>    sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount")
>>>>>>    understand.plot.id <- get.or.create.plot.id(conf, "understand_raw")
>>>>>>        -> sloccount.plot.id and understand.plot.id have the value "x".
>>>>>>               Are these values feasible? Or shall I have a closer look
>>>>>>               at the function 'get.or.create.plot.id'?
>>>>>
>>>>>
>>>>> since the SQL specification for the plot ID is
>>>>>
>>>>> `id` BIGINT NOT NULL AUTO_INCREMENT
>>>>>
>>>>> the value "x" seems quite impossible. Can you please query your
>>>>> database to see what value is stored there?
>>>>>
>>>>
>>>> The table is empty.
>>>> mysql> select * from plots;
>>>> Empty set (0.01 sec)
>>>
>>>
>>> please try to run the other SQL statements produced by the code to see
>>> why no entry is created. get.or.create.plot.id() inserts a new entry
>>> into the table if no ID for a desired plot is available.
>>
>>
>> The branch which creates a plot ID is not entered. The condition
>> 'length(res) < 1' is not satisfied in either case
>> (sloccount.plot.id and understand.plot.id).
>>
>> For sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount"):
>>    res <- dbGetQuery(con, str_c(query, ";"))
>>    # str_c(query, ";"):
>>    #   SELECT id FROM plots WHERE name='sloccount' AND projectId=2;
>>    # res: "id"
>>    # length(res): 1
>>    if (length(res) < 1) {
>>      ## Plot ID is not assigned yet, create one
>>      res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
>>    } else {
>>      res <- res$id
>>    }
>>    # res: "x"
>
>
> @Mitchell, could you try to reproduce this? I don't see why a result
> with non-zero length should be returned from the SQL query if the
> database is empty.

The SQL query probably returns a data frame, and length(..) called on a
data frame returns the number of columns, not the number of rows. To get
the number of rows of a data frame, use nrow(..) instead of length(..).
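
A minimal sketch of the difference, assuming the empty query result is a
data frame with a single id column and zero rows. The data frame below is
a hand-built stand-in, not actual dbGetQuery output, and need.new.id is
only an illustrative name:

```r
## Hypothetical stand-in for an empty query result: one column, zero rows
res <- data.frame(id = numeric(0))

length(res)  # 1: the number of columns, so 'length(res) < 1' is FALSE
nrow(res)    # 0: the number of rows
res$id       # numeric(0), the value that shows up as the plot ID

## A row-based check would take the "create a new plot ID" branch
need.new.id <- nrow(res) < 1
```

With a check based on nrow(..), the else branch that returns res$id would
only be reached when the query actually found a row.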

--Mitchell

>
> Thanks, Wolfgang Mauerer
>
>>
>> For understand.plot.id <- get.or.create.plot.id(conf, "understand_raw"):
>>    res <- dbGetQuery(con, str_c(query, ";"))
>>    # str_c(query, ";"):
>>    #   SELECT id FROM plots WHERE name='understand_raw' AND projectId=2;
>>    # res: "id" (Manually: Empty set (see below))
>>    # length(res): 1
>>    if (length(res) < 1) {
>>      ## Plot ID is not assigned yet, create one
>>      res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
>>    } else {
>>      res <- res$id
>>    }
>>    # res: "x"
>>
>> SQL Statements run manually:
>> mysql> SELECT id FROM plots WHERE name='sloccount' AND projectId=2;
>> Empty set (0.00 sec)
>>
>> mysql> SELECT id FROM plots WHERE name='understand_raw' AND projectId=2;
>> Empty set (0.00 sec)
>>
>> Best regards, Matthias Gemmer
>>
>>> Best regards, Wolfgang Mauerer
>>>
>>
>
