[edm-discuss] Re: Clustering with more DB tables

  • From: "Joseph E. Beck" <joseph.beck@xxxxxxxxx>
  • To: edm-discuss@xxxxxxxxxxxxx
  • Date: Wed, 24 Jun 2009 15:19:00 -0400

On Wed, Jun 24, 2009 at 3:10 PM, Alvaro Ortigosa <alvaro.ortigosa@xxxxxx> wrote:

> I use both Weka and SPSS. Both of them support extracting data from a SQL
> database through a SQL query. So you can use a join to get the data from
> different tables (memory problems can arise).

I tried using the SPSS interface, but gave up in frustration.  I'd advise
creating a new table/view* in a standard SQL client for a couple of reasons:
1.  Less likely to encounter weird problems.  As I said, SPSS was glitchy
for me, and Weka does not manage memory well.
2.  You've created a reusable artifact.
2a.  You don't have to pay the computation cost each time you do a
slightly different analysis; you just reuse the table.
2b.  Other members of the project can inspect and possibly use the table
you've built.  For larger projects, not having to reinvent the wheel is a
good thing.  Having the data as a table makes it easy to inspect for errors,
and to manipulate if the form isn't quite right.


* I'm assuming views would work but am not sure.  MySQL did not support them
when I was a heavy user, so we just created tables.
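
To make the "build the table once, then reuse it" idea concrete, here is a
minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for
MySQL so that it runs anywhere. The students and attempts tables, their
columns, and the clustering_input name are all hypothetical, made up purely
for illustration. The CREATE TABLE ... AS SELECT form shown here is also
available in MySQL, but the details of your schema will differ.

```python
import sqlite3


def build_analysis_table(conn):
    """Join the (hypothetical) source tables once and store the result.

    Analysis tools can then read the single clustering_input table
    instead of re-running the join for every analysis.
    """
    conn.execute("DROP TABLE IF EXISTS clustering_input")
    conn.execute(
        """
        CREATE TABLE clustering_input AS
        SELECT s.student_id,
               s.grade_level,
               COUNT(a.attempt_id) AS n_attempts,
               AVG(a.correct)      AS pct_correct
        FROM students AS s
        JOIN attempts AS a ON a.student_id = s.student_id
        GROUP BY s.student_id, s.grade_level
        """
    )
    conn.commit()


# Hypothetical toy data, only so the sketch is runnable end to end.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, grade_level INTEGER);
    CREATE TABLE attempts (attempt_id INTEGER PRIMARY KEY,
                           student_id INTEGER,
                           correct INTEGER);
    INSERT INTO students VALUES (1, 4), (2, 5);
    INSERT INTO attempts VALUES (1, 1, 1), (2, 1, 0), (3, 2, 1);
    """
)

build_analysis_table(conn)

# One row per student: the flat, single-table shape most tools expect.
rows = conn.execute(
    "SELECT student_id, grade_level, n_attempts, pct_correct "
    "FROM clustering_input ORDER BY student_id"
).fetchall()
for row in rows:
    print(row)
```

Once the table exists, Weka or SPSS only needs a plain
"SELECT * FROM clustering_input", and anyone else on the project can open
the same table to check it for errors.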

> Optionally, you can define a view on your database, if your DBMS supports
> them.
> Best,
> A:
> 2009/6/24 qazmlp q <qazmlp1209@xxxxxxxxxxxxxx>
> On Wed, 24 Jun 2009 21:24:32 +0530 wrote
>> >Most of the tools do work that way.
>> Which way? Supporting only a single table, or multiple tables?
>> >  As a simple point you can form a
>> >temporary table containing all of the results for the purposes of a
>> > single process.  What Database or wrapper are you using?
>> I use MySQL. But I assume that the problem is a common one.
>> Isn't this a very common case? Having only a single table is unrealistic
>> for me.
>> Regards,
>> Michael

Joseph E. Beck
Research Scientist
Computer Science Department, Fuller Labs 138
Worcester Polytechnic Institute

