[codeface] [PATCH 2/2] Add functions to compute developer classification

  • From: Mitchell Joblin <mitchell.joblin.ext@xxxxxxxxxxx>
  • To: codeface@xxxxxxxxxxxxx
  • Date: Wed, 7 Oct 2015 16:30:37 +0200

- This implements the standard notion of core and peripheral
developers based on participation in the version control system.
Ref: Terceiro A, Rios LR, Chavez C (2010) An empirical study on
the structural complexity introduced by core and peripheral
developers in free software projects.

Signed-off-by: Mitchell Joblin <mitchell.joblin.ext@xxxxxxxxxxx>
---
codeface/R/developer_classification.R | 26 ++++++++++++++++++++++++++
codeface/R/test_developer_classification.R | 17 +++++++++++++++++
2 files changed, 43 insertions(+)
create mode 100644 codeface/R/developer_classification.R
create mode 100644 codeface/R/test_developer_classification.R

diff --git a/codeface/R/developer_classification.R
b/codeface/R/developer_classification.R
new file mode 100644
index 0000000..251099c
--- /dev/null
+++ b/codeface/R/developer_classification.R
@@ -0,0 +1,26 @@
+source("db.r")
+source("query.r")
+
+## Classify a set of developers based on the number of commits made within a
+## time range using the standard participation-based notion
+get.developer.class.con <- function(con, project.id, start.date, end.date) {
+ commit.df <- get.commits.by.date.con(con, project.id, start.date, end.date)
+ developer.class <- get.developer.class(commit.df)
+
+ return(developer.class)
+}
+
+## Low-level function to compute classification
+get.developer.class <- function(commit.df, threshold=0.8) {
+ author.commit.count <- count(commit.df, "author")
+ author.commit.count <- author.commit.count[order(-author.commit.count$freq),]
+ num.commits <- nrow(commit.df)
+ commit.threshold <- round(threshold * num.commits)
+ core.test <- cumsum(author.commit.count$freq) < commit.threshold
+ core.developers <- author.commit.count[core.test,]
+ peripheral.developers <- author.commit.count[!core.test,]
+ res <- list(core=core.developers, peripheral=peripheral.developers)
+
+ return(res)
+}
+
diff --git a/codeface/R/test_developer_classification.R
b/codeface/R/test_developer_classification.R
new file mode 100644
index 0000000..1fbc6cd
--- /dev/null
+++ b/codeface/R/test_developer_classification.R
@@ -0,0 +1,17 @@
+library(testthat)
+source("developer_classification.R")
+
+get.developer.class.test <- function() {
+ threshold <- 0.8
+ sample.size <- 1000
+
+ commit.df <- data.frame(author=sample(1:50, size=sample.size, replace=T))
+ developer.class <- get.developer.class(commit.df, threshold)
+
+ res <- sum(developer.class$core$freq) < threshold*sample.size
+ return(res)
+}
+
+test_that("get.developer.class returns expected values", {
+ expect_true(get.developer.class.test())
+ })
\ No newline at end of file
--
2.1.4
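
The threshold rule in get.developer.class can be traced by hand on a small
example. The sketch below is not part of the patch. classify.by.commits is a
hypothetical helper that mirrors the same steps with base R only (table()
instead of count()), and the commit data is made up for illustration.

```r
## Hypothetical helper mirroring get.developer.class() from the patch,
## written with base R only so it runs without db.r, query.r or plyr.
classify.by.commits <- function(authors, threshold=0.8) {
  ## Commits per author, largest contributor first
  freq <- sort(table(authors), decreasing=TRUE)
  counts <- as.vector(freq)
  commit.threshold <- round(threshold * length(authors))
  ## Same strict comparison as in the patch: an author is core while the
  ## running total of commits stays below the threshold
  core.test <- cumsum(counts) < commit.threshold
  list(core=names(freq)[core.test], peripheral=names(freq)[!core.test])
}

## Made-up data: 100 commits spread over four authors (50, 30, 15, 5)
authors <- rep(c("a", "b", "c", "d"), times=c(50, 30, 15, 5))
res <- classify.by.commits(authors)
print(res$core)        # "a"
print(res$peripheral)  # "b" "c" "d"
```

Here the cumulative counts are 50, 80, 95, 100 and the threshold is 80
commits. Because the comparison is strict, author "b", whose commits bring
the running total to exactly 80, ends up in the peripheral group. Whether
that boundary case should count as core is a design choice worth checking
against the cited definition.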

