>From: codeface-bounce@xxxxxxxxxxxxx <codeface-bounce@xxxxxxxxxxxxx> on behalf of Mitchell Joblin <joblin.m@xxxxxxxxx>
>Sent: Thursday, 5 March 2015 13:14
>To: Wolfgang Mauerer
>Cc: codeface@xxxxxxxxxxxxx
>Subject: [codeface] Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: AW: Re: Preparing time series data - sloccount analysis
>
>On Thu, Mar 5, 2015 at 11:12 AM, Wolfgang Mauerer
><wolfgang.mauerer@xxxxxxxxxxx> wrote:
>> On 05.03.2015 12:04, Matthias Gemmer wrote:
>>>>>>>>>>>
>>>>>>>>>>> Browse[1]> print(plot.id)
>>>>>>>>>>> numeric(0)
>>>>>>>>>>
>>>>>>>>>> so that's the culprit... There is no valid plot ID for the time
>>>>>>>>>> series in the database. Can you please check that an appropriate
>>>>>>>>>> table is available in the database?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> There is a table called timeseries with the column plotId.
>>>>>>>>> mysql> DESCRIBE timeseries;
>>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>>> | Field        | Type       | Null | Key | Default | Extra |
>>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>>> | plotId       | bigint(20) | NO   | MUL | NULL    |       |
>>>>>>>>> | time         | datetime   | NO   |     | NULL    |       |
>>>>>>>>> | value        | double     | NO   |     | NULL    |       |
>>>>>>>>> | value_scaled | double     | YES  |     | NULL    |       |
>>>>>>>>> +--------------+------------+------+-----+---------+-------+
>>>>>>>>> 4 rows in set (0.02 sec)
>>>>>>>>>
>>>>>>>>> The table is also filled with data. The table contains datasets for
>>>>>>>>> plotId=5, plotId=6, plotId=7 and plotId=8.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Which values do sloccount.plot.id (and understand.plot.id) have
>>>>>>>>>> in do.complexity.analysis (Frame 3/4)?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The values for sloccount.plot.id and understand.plot.id are
>>>>>>>>> obviously invalid.
>>>>>>>>>
>>>>>>>>> Browse[1]> print(sloccount.plot.id)
>>>>>>>>> numeric(0)
>>>>>>>>> Browse[1]> print(understand.plot.id)
>>>>>>>>> numeric(0)
>>>>>>>>
>>>>>>>>
>>>>>>>> it was not so obvious to me; I was trying to ensure that
>>>>>>>> parallelisation did not introduce any issues here. But your
>>>>>>>> observation clarified that this is not the case.
>>>>>>>>
>>>>>>>> Since the error seems to be deterministically reproducible at your
>>>>>>>> site, can you debug around the creation of the index (for instance by
>>>>>>>> printing out what's going on; alternatively, you could also use the
>>>>>>>> built-in debugger)?
>>>>>>>>
>>>>>>>
>>>>>>> In the file codeface/R/complexity.r:
>>>>>>>
>>>>>>> Assignment of sloccount.plot.id and understand.plot.id:
>>>>>>> ## Obtain a plot IDs for the sloccount and understand raw time series before
>>>>>>> ## parallel processing commences to avoid race conditions
>>>>>>> sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount")
>>>>>>> understand.plot.id <- get.or.create.plot.id(conf, "understand_raw")
>>>>>>> -> sloccount.plot.id and understand.plot.id have the value "x".
>>>>>>> Are these values feasible? Or shall I have a closer look
>>>>>>> at the function 'get.or.create.plot.id'?
>>>>>>
>>>>>>
>>>>>> since the SQL specification for the plot ID is
>>>>>>
>>>>>> `id` BIGINT NOT NULL AUTO_INCREMENT
>>>>>>
>>>>>> the value "x" seems quite impossible. Can you please query your
>>>>>> database to see what value is stored there?
>>>>>>
>>>>>
>>>>> The table is empty.
>>>>> mysql> select * from plots;
>>>>> Empty set (0.01 sec)
>>>>
>>>>
>>>> please try to run the other SQL statements produced by the code to see
>>>> why no entry is created. get.or.create.plot.id() inserts a new entry
>>>> into the table if no ID for a desired plot is available.
>>>
>>>
>>> The branch which creates a plot ID is not entered.
>>> The condition 'length(res) < 1' is not satisfied in either case
>>> (sloccount.plot.id and understand.plot.id).
>>>
>>> For sloccount.plot.id <- get.or.create.plot.id(conf, "sloccount"):
>>> res <- dbGetQuery(con, str_c(query, ";"))
>>> # str_c(query, ";"): SELECT id FROM plots WHERE name='sloccount' AND projectId=2;
>>> # res: "id"
>>> # length(res): 1
>>> if (length(res) < 1) {
>>>   ## Plot ID is not assigned yet, create one
>>>   res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
>>> } else {
>>>   res <- res$id
>>> }
>>> # res: "x"
>>
>>
>> @Mitchell, could you try to reproduce this? I don't see why a result
>> with non-zero length should be returned from the SQL query if the
>> database is empty.
>
>The SQL query probably returns a data frame and length(..) called on a
>data frame does not return the number of rows. To get the number of
>rows of a data frame you should be using nrow(..) instead of
>length(..).
>
>--Mitchell
>

That worked for me. After replacing 'length' with 'nrow' a new plot ID is created!

Now an error appears in add.sloccount.ts:
dat <- cbind(plotId=plot.id, time=commit.date, values$metrics)
res <- dbWriteTable(conf$con, "sloccount_ts", dat, append=TRUE, row.names=FALSE)

The cbind works fine. But the following 'dbWriteTable' returns
"Unknown column 'person.months' in 'field list'".
While debugging (frame 10 of the traceback), I can find a column 'person.months':
Browse[1]> ls()
[1] "conn" "name" "value"
Browse[1]> print(value)
  plotId                time person.months total.cost schedule.months avg.devel
1      1 2006-09-05 20:20:16          0.37       4426            1.71      0.22
...

2015-03-05 13:41:11 [codeface.R.complexity] INFO: Performing sloccount analysis for acf102237f9e26c1b864bf6e432f65040b477851
2015-03-05 13:41:11 [codeface.R] INFO: Traceback:
0:
1: config.script.run({ conf <- config.from.args(positional.args = list("rep
2: withCallingHandlers(expr, error = function(e) { if (!interactive()) {
3: do.complexity.analysis(conf)
4: mclapply.db(conf, 1:nrow(commits.list), function(conf, i) { logdevinfo(s
5: mclapply(X, function(i) { conf <- init.db.global(conf) res.local <- F
6: lapply(X = X, FUN = FUN, ...)
7: FUN(1:18[[1]], ...)
8: FUN(conf, i)
9: add.sloccount.ts(conf, sloccount.plot.id, commit.date, res)
10: dbWriteTable(conf$con, "sloccount_ts", dat, append = TRUE, row.names = FALS
11: .valueClassTest({ standardGeneric("dbWriteTable") }, "logical", "dbWriteT
12: is(object, Cl)
13: is(object, Cl)
14: .local(conn, name, value, ...)
15: dbGetQuery(conn, sql)
16: dbGetQuery(conn, sql)
17: dbSendQuery(conn, statement, ...)
18: .valueClassTest(standardGeneric("dbSendQuery"), "DBIResult", "dbSendQuery")
19: is(object, Cl)
20: is(object, Cl)
21: .local(conn, statement, ...)
2015-03-05 13:41:11 [codeface.R] CRITICAL: could not run statement: Unknown column 'person.months' in 'field list'
2015-03-05 13:41:11 [codeface.R] INFO: Error dump was written to 'error.dump.rda'.
2015-03-05 13:41:11 [codeface.R] INFO: To debug, launch R and run 'load("error.dump.rda"); debugger(error.dump)'
2015-03-05 13:41:11 [codeface.util] MainProcess ERROR: Command '/home/codeface/codeface/codeface/R/complexity.r --loglevel info -c /home/codeface/codeface/codeface.conf -p /tmp/jqueryHzVMOt -j 1 /home/codeface/projects/jquery/.git 1' failed with exit code 1.
(stdout: None stderr: None)
Traceback (most recent call last):
  File "/home/codeface/.local/bin/codeface", line 9, in <module>
    load_entry_point('codeface==0.2.0', 'console_scripts', 'codeface')()
  File "/home/codeface/codeface/codeface/cli.py", line 198, in main
    return run(sys.argv)
  File "/home/codeface/codeface/codeface/cli.py", line 194, in run
    return args.func(args)
  File "/home/codeface/codeface/codeface/cli.py", line 112, in cmd_run
    args.profile_r, args.jobs, args.tagging)
  File "/home/codeface/codeface/codeface/project.py", line 173, in project_analyse
    execute_command(cmd, direct_io=True, cwd=cwd)
  File "/home/codeface/codeface/codeface/util.py", line 278, in execute_command
    raise Exception(msg)
Exception: Command '/home/codeface/codeface/codeface/R/complexity.r --loglevel info -c /home/codeface/codeface/codeface.conf -p /tmp/jqueryHzVMOt -j 1 /home/codeface/projects/jquery/.git 1' failed with exit code 1.
(stdout: None stderr: None)

Best regards, Matthias Gemmer

>
>>
>> Thanks, Wolfgang Mauerer
>>
>>>
>>> For understand.plot.id <- get.or.create.plot.id(conf, "understand_raw"):
>>> res <- dbGetQuery(con, str_c(query, ";"))
>>> # str_c(query, ";"): SELECT id FROM plots WHERE name='understand_raw' AND projectId=2;
>>> # res: "id" (Manually: Empty set (see below))
>>> # length(res): 1
>>> if (length(res) < 1) {
>>>   ## Plot ID is not assigned yet, create one
>>>   res <- get.clear.plot.id.con(con, pid, plot.name, range.id)
>>> } else {
>>>   res <- res$id
>>> }
>>> # res: "x"
>>>
>>> SQL Statements run manually:
>>> mysql> SELECT id FROM plots WHERE name='sloccount' AND projectId=2;
>>> Empty set (0.00 sec)
>>>
>>> mysql> SELECT id FROM plots WHERE name='understand_raw' AND projectId=2;
>>> Empty set (0.00 sec)
>>>
>>> Best regards, Matthias Gemmer
>>>
>>>> Best regards, Wolfgang Mauerer
>>>>
>>>
>>
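
For reference, the length(..) vs. nrow(..) point made by Mitchell above can be reproduced without a database. This is only a minimal sketch: it assumes the empty query result is a zero-row data frame with a single 'id' column, and 'has.rows' is a hypothetical helper used for illustration, not a Codeface function.

```r
## Assumption: an empty "SELECT id FROM plots ..." result comes back as a
## data frame with one column and zero rows.
res <- data.frame(id = numeric(0))

## length() on a data frame counts columns, so this is 1 even though
## the result set is empty.
n.cols <- length(res)

## nrow() counts rows, which is what the emptiness check needs.
n.rows <- nrow(res)

## Extracting the column from the empty frame yields numeric(0), which
## matches the plot.id value seen in the debugger earlier in the thread.
plot.id <- res$id

## Hypothetical helper (not part of Codeface): TRUE if the result has rows.
has.rows <- function(df) {
  nrow(df) > 0
}

print(n.cols)
print(n.rows)
print(has.rows(res))
```

With a check along the lines of 'nrow(res) < 1' the create branch is taken for an empty result, which is consistent with the new plot ID being created after the change.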