[codeface] Re: [PATCH 4/5] Add query to generate edgelist based on email communication

  • From: Mitchell Joblin <joblin.m@xxxxxxxxx>
  • To: codeface@xxxxxxxxxxxxx
  • Date: Wed, 04 Nov 2015 23:12:49 +0000

On Thu, 5 Nov 2015 00:09 Wolfgang Mauerer <wolfgang.mauerer@xxxxxxxxxxxxxxxxx> wrote:

On 29/10/15 at 11:57, Mitchell Joblin wrote:

- Two people are joined together when they have contributed
to a common email thread

Signed-off-by: Mitchell Joblin <mitchell.joblin.ext@xxxxxxxxxxx>
---
codeface/R/query.r | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/codeface/R/query.r b/codeface/R/query.r
index 720e9b9..a7e2af9 100644
--- a/codeface/R/query.r
+++ b/codeface/R/query.r
@@ -544,13 +544,20 @@ query.top.contributors.changes <- function(con, range.id, limit=20) {

## Compute edgelist for mailing list communication
query.mail.edgelist <- function(con, pid, start.date, end.date) {
- query <- str_c("SELECT who AS `from`, createdBy AS `to`, COUNT(*) AS `weight`",
- "FROM mail_thread, thread_responses",
- "WHERE mail_thread.mailThreadId=thread_responses.mailThreadId",
- "AND projectId=", pid,
- "AND mailDate >=", sq(start.date),
- "AND mailDate <", sq(end.date),
- "GROUP BY mail_thread.createdBy, thread_responses.who", sep=" ")
+ query <- str_c("SELECT mail1.author as `from`, mail2.author as `to`,
+ COUNT(*) as `weight`",
+ "FROM mail mail1, mail mail2",

The query might be easier to read if you use "mail_from" instead of
mail1 and "mail_to" instead of mail2, although clarity is naturally
subjective.

Yes, it's a good suggestion. I'll change it. Thanks!


+ "WHERE mail1.projectId=", pid,
+ "AND mail1.projectId=mail2.projectId",
+ "AND mail1.threadId=mail2.threadId",
+ "AND mail1.mlId=mail2.mlId",
+ "AND mail1.author!=mail2.author",
+ "AND mail1.creationDate > mail2.creationDate",
+ "AND mail1.creationDate >=", sq(start.date),
+ "AND mail1.creationDate <", sq(end.date),
+ "AND mail2.creationDate >=", sq(start.date),
+ "AND mail2.creationDate <", sq(end.date),
+ "GROUP BY mail1.author, mail2.author", sep=" ")
dat <- dbGetQuery(con, query)

So that's now on top of your previous attempt at the function,
right? I assume this series is not against master, but is based
on one of your custom branches. Can you please squash the previous
version with this one so that we end up with just the correct version
in master?

Yes, I will do that. It's nice to get early feedback on the initial
attempts, but in the end I won't pollute master with all the trial versions :)

--Mitchell

Thanks, Wolfgang


return(dat)
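
For readers following the thread, here is a minimal standalone sketch of the self-join used in the new query. It is not the Codeface code: it is written in Python against an in-memory SQLite database rather than R and MySQL, and the table layout (only the columns the patch names), the sample rows, and the helper names build_demo_db and mail_edgelist are assumptions made for illustration. It uses the mail_from/mail_to aliases suggested above, binds parameters instead of concatenating strings, and adds an ORDER BY purely to make the output deterministic.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical, minimal `mail` table with a few sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mail (projectId INTEGER, mlId INTEGER, threadId INTEGER,"
        " author INTEGER, creationDate TEXT)"
    )
    rows = [
        (1, 1, 10, 1, "2015-10-01 09:00:00"),  # author 1 starts thread 10
        (1, 1, 10, 2, "2015-10-01 10:00:00"),  # author 2 replies
        (1, 1, 10, 3, "2015-10-01 11:00:00"),  # author 3 replies
        (1, 1, 10, 2, "2015-10-01 12:00:00"),  # author 2 posts again
        (1, 1, 11, 1, "2015-10-02 09:00:00"),  # single-mail thread: no edges
        (2, 1, 10, 4, "2015-10-01 09:30:00"),  # different project: ignored
    ]
    con.executemany("INSERT INTO mail VALUES (?, ?, ?, ?, ?)", rows)
    return con


def mail_edgelist(con, pid, start_date, end_date):
    """Return (from, to, weight) rows for one project and date window.

    Each pair of mails by different authors in the same thread yields one
    edge, directed from the author of the later mail to the author of the
    earlier one; COUNT(*) sums such pairs into the edge weight.
    """
    query = """
        SELECT mail_from.author AS "from", mail_to.author AS "to",
               COUNT(*) AS weight
        FROM mail mail_from, mail mail_to
        WHERE mail_from.projectId = ?
          AND mail_from.projectId = mail_to.projectId
          AND mail_from.threadId = mail_to.threadId
          AND mail_from.mlId = mail_to.mlId
          AND mail_from.author != mail_to.author
          AND mail_from.creationDate > mail_to.creationDate
          AND mail_from.creationDate >= ? AND mail_from.creationDate < ?
          AND mail_to.creationDate >= ? AND mail_to.creationDate < ?
        GROUP BY mail_from.author, mail_to.author
        ORDER BY mail_from.author, mail_to.author
    """
    params = (pid, start_date, end_date, start_date, end_date)
    return con.execute(query, params).fetchall()


con = build_demo_db()
for edge in mail_edgelist(con, 1, "2015-10-01", "2015-11-01"):
    print(edge)
```

With these sample rows, author 2 ends up with an edge of weight 2 towards author 1 (two later mails in a thread that author 1 started), while the single-mail thread and the mail from the other project contribute nothing. Note that every later mail is paired with every earlier mail in the thread by a different author, not only with the mail it directly replies to.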
