Database design assessment

  • From: Stéphane Faroult <sfaroult@xxxxxxxxxxxx>
  • To: "Oracle-L (E-mail)" <Oracle-L@xxxxxxxxxxxxx>
  • Date: Wed, 26 Jul 2017 15:27:49 +0800

Hello listers,

I'm writing a tool to extract tables, columns, FKs and indexes (in other words, I also know PKs and unique constraints) from a .mwb MySQL Workbench file (for those who are interested, it's a zip file that mostly contains ... an XML file).

<rant>Not the <table><row> ... </row> ... </table> type of XML, no, deeply hierarchical XML that is to XML what EAV is to database modelling, with mostly two tags, <value> and <link>, and what really tells you what you have between tags is hidden in attributes. Isn't it interesting to have data from a tool about relational modeling stored in a way that doesn't express a profound knowledge of the subject?</rant>

Anyway, I'm loading everything into an in-memory SQLite file where I can process data with civilized means. I teach a database class this fall, I'd like to give some design projects. This being China, I expect a large number of students, which means that I need quick and fair ways of assessing a model. I plan to compare student submission to some kind of "reference model" (knowing that they'll probably give to tables and columns names that may be different from my own or differently spelt, so I'll have to guess what is what), but I also want to be able to assess a design "on its own merits" when everything else fails.

So far I have queries that return:

- Isolate tables - tables that aren't involved into any FK relationship. I can imagine reasons to have this kind of table (parameter table), but not in the kind of exercise given to students

- "multiple legs" - tables that aren't referenced by any other, but reference MORE than two tables (the classic many-to-many relationship). Once again, I can imagine some reasons in real life for having this, but not in a student exercise.

- Tables where every column is nullable except perhaps an autoincremented one.

- Tables without a PK (of course)

- Tables that have a PK that is an autoincremented column and no other unique constraint (which means that you can happily insert duplicate rows)

- Single-column tables

- Ultra-wide tables (I define ultra-wide as more than three times the average number of columns when you exclude two-column label tables)

- (this is more debatable, because tools encourage bad habits) tables that aren't referenced by any other table and yet have a useless autoincrement numerical id.


If you have other ideas of things I could check in this context, please email me off-list.

Thanks,

Stéphane Faroult




--
//www.freelists.org/webpage/oracle-l


Other related posts:

  • » Database design assessment - Stéphane Faroult