>Von: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> im Auftrag >von Wolfgang Mauerer <wolfgang.mauerer@xxxxxxxxxxx> >Gesendet: Donnerstag, 5. März 2015 15:20 >An: codeface@xxxxxxxxxxxxx >Betreff: [codeface] Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: >Re: AW: Re: Preparing time series data - sloccount analysis > >On 05.03.2015 15:17, Matthias Gemmer wrote: >>> Von: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> im >>> Auftrag von Mitchell Joblin <joblin.m@xxxxxxxxx> >>> Gesendet: Donnerstag, 5. März 2015 15:05 >>> An: codeface@xxxxxxxxxxxxx >>> Betreff: [codeface] Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: >>> Re: Preparing time series data - sloccount analysis >>> >>> On Thu, Mar 5, 2015 at 1:50 PM, Wolfgang Mauerer <wm@xxxxxxxxxxxxxxxx> >>> wrote: >>>> >>>> >>>> Am 05/03/2015 um 14:19 schrieb Matthias Gemmer: >>>>>> Von: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> im >>>>>> Auftrag von Mitchell Joblin <joblin.m@xxxxxxxxx> >>>>>> Gesendet: Donnerstag, 5. März 2015 13:14 >>>>>> An: Wolfgang Mauerer >>>>>> Cc: codeface@xxxxxxxxxxxxx >>>>>> Betreff: [codeface] Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: >>>>>> Preparing time series data - sloccount analysis >>>>>> >>>>>> On Thu, Mar 5, 2015 at 11:12 AM, Wolfgang Mauerer >>>>>> <wolfgang.mauerer@xxxxxxxxxxx> wrote: >>>>>>> On 05.03.2015 12:04, Matthias Gemmer wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Browse[1]> print(plot.id) >>>>>>>>>>>>>>>> numeric(0) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> so that's the culprit... There is no valid plot ID for the time >>>>>>>>>>>>>>> series in the database. Can you please check that an appropriate >>>>>>>>>>>>>>> table is available in the database? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> There is a table called timeseries with the column plotId. >>>>>>>>>>>>>> mysql> DESCRIBE timeseries; >>>>>>>>>>>>>> +--------------+------------+------+-----+---------+-------+ >>>>>>>>>>>>>> | Field | Type | Null | Key | Default | Extra | >>>>>>>>>>>>>> +--------------+------------+------+-----+---------+-------+ >>>>>>>>>>>>>> | plotId | bigint(20) | NO | MUL | NULL | | >>>>>>>>>>>>>> | time | datetime | NO | | NULL | | >>>>>>>>>>>>>> | value | double | NO | | NULL | | >>>>>>>>>>>>>> | value_scaled | double | YES | | NULL | | >>>>>>>>>>>>>> +--------------+------------+------+-----+---------+-------+ >>>>>>>>>>>>>> 4 rows in set (0.02 sec) >>>>>>>>>>>>>> >>>>>>>>>>>>>> The table is also filled with data. The table contains datasets >>>>>>>>>>>>>> for >>>>>>>>>>>>>> plotId=5, plotId=6, plotId=7 and plotId=8. >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Which values do sloccount.plot.id (and understand.plot.id) have >>>>>>>>>>>>>>> in do.complexity.analysis (Frame 3/4)? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> The values for sloccount.plot.id and understand.plot.id are >>>>>>>>>>>>>> obviously >>>>>>>>>>>>>> invalid. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Browse[1]> print(sloccount.plot.id) >>>>>>>>>>>>>> numeric(0) >>>>>>>>>>>>>> Browse[1]> print(understand.plot.id) >>>>>>>>>>>>>> numeric(0) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> it was not so obvious to me; I was trying to ensure that >>>>>>>>>>>>> parallelisation did not introduce any issues here. But your >>>>>>>>>>>>> observation >>>>>>>>>>>>> clarified that this is not the case. >>>>>>>>>>>>> >>>>>>>>>>>>> Since the error seems to be deterministically reproducible at your >>>>>>>>>>>>> site, can you debug around the creation of the index (for >>>>>>>>>>>>> instance by >>>>>>>>>>>>> printing out what's going on; alternatively, you could also use >>>>>>>>>>>>> the >>>>>>>>>>>>> built-in debugger)? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> In the file codeface/R/complexity.r: >>>>>>>>>>>> >>>>>>>>>>>> Assignment of sloccount.plot.id and understand.plot.id: >>>>>>>>>>>> ## Obtain a plot IDs for the sloccount and understand raw time >>>>>>>>>>>> series before >>>>>>>>>>>> ## parallel processing commences to avoid race conditions >>>>>>>>>>>> sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount") >>>>>>>>>>>> understand.plot.id <- get.or.create.plot.id(conf, >>>>>>>>>>>> "understand_raw") >>>>>>>>>>>> -> sloccount.plot.id and understand.plot.id have the value >>>>>>>>>>>> "x". >>>>>>>>>>>> Are these values feasible? Or Shall I have a closer >>>>>>>>>>>> look >>>>>>>>>>>> at the function 'get.or.create.plot.id'? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> since the SQL specification for the plot ID is >>>>>>>>>>> >>>>>>>>>>> `id` BIGINT NOT NULL AUTO_INCREMENT >>>>>>>>>>> >>>>>>>>>>> the value "x" seems quite impossible. Can you please query your >>>>>>>>>>> database to see what value is stored there? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The table is empty. >>>>>>>>>> mysql> select * from plots; >>>>>>>>>> Empty set (0.01 sec) >>>>>>>>> >>>>>>>>> >>>>>>>>> please try to run the other SQL statements produced by the code to see >>>>>>>>> why no entry is created. get.or.create.plot.id() inserts a new entry >>>>>>>>> into the table is no ID for a desired plot is available. >>>>>>>> >>>>>>>> >>>>>>>> The branch which creates a plot ID is not entered. The condition >>>>>>>> 'length(res) < 1' is >>>>>>>> in both cases (sloccount.plot.id and understand.plot.id) not satisfied. >>>>>>>> >>>>>>>> For sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount"): >>>>>>>> res <- dbGetQuery(con, str_c(query, ";")) >>>>>>>> # str_c(query, ";"): SELECT id FROM plots WHERE name='sloccount' >>>>>>>> AND >>>>>>>> projectId=2; >>>>>>>> # res: "id" >>>>>>>> # length(res): 1 >>>>>>>> if (length(res) < 1) { >>>>>>>> ## Plot ID is not assigned yet, create one >>>>>>>> res <- get.clear.plot.id.con(con, pid, plot.name, range.id) >>>>>>>> } else { >>>>>>>> res <- res$id >>>>>>>> } >>>>>>>> # res: "x" >>>>>>> >>>>>>> >>>>>>> @Mitchell, could you try to reproduce this? I don't see why a result >>>>>>> with non-zero length should be returned from the SQL query if the >>>>>>> database is empty. >>>>>> >>>>>> The SQL query probably returns a data frame and length(..) called on a >>>>>> data frame does not return the number of rows. To get the number of >>>>>> rows of a data frame you should be using nrow(..) instead of >>>>>> length(..). >>>>>> >>>>>> --Mitchell >>>>>> >>>>> >>>>> That worked for me. >>>>> After replacing 'length' with 'nrow' a new plot ID is created! >>>> >>>> The following patch should fix this for good then: >>>> >> >> After patching there is still a problem on my side. >> The call 'dbWriteTable' in the function add.sloccount.ts produces (as >> before) the message >> Unknown column 'person.months' in 'field list'. >the patch was not supposed to address this issue. Did you check that the >database schema is uptodate with the latest version? > >Best regards, Wolfgang Mauerer The SQL-Schema on my side is identical to the schema of the latest commit (79ebe108e2) to the master-branch. The SQL-Statement of 'dbWriteTable' Browse[1]> print(statement) [1] "LOAD DATA LOCAL INFILE '/tmp/RtmpZRB6lj/rsdbi503512c75278' INTO TABLE `sloccount_ts` LINES TERMINATED BY '\n' (`plotId`, `time`, `person.months`, `total.cost`, `schedule.months`, `avg.devel`);" matches the corresponding table. mysql> describe sloccount_ts; +-----------------+------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-----------------+------------+------+-----+---------+-------+ | plotId | bigint(20) | NO | MUL | NULL | | | time | datetime | NO | PRI | NULL | | | person_months | double | NO | | NULL | | | total_cost | double | NO | | NULL | | | schedule_months | double | NO | | NULL | | | avg_devel | double | NO | | NULL | | +-----------------+------------+------+-----+---------+-------+ Best Regards, Matthias Gemmer