[dba-SQLServer] Opinions invited: question about normalization

Gustav Brock gustav at cactus.dk
Mon Dec 29 10:26:00 CST 2014


Hi Arthur

I think you are mixing two very different aspects here: User interface and filtering method.

Having multiple "and" or multiple "or" is not so difficult, but mixing and/or is. I don't think there's an universally perfect method, it depends on your application.

As for your search techniques, maybe some of the new BI systems can be of help if you can define the volunteers' properties as dimensions.
For a project, I found that copying the normalized data to (nearly) flat search tables was quite effective while keeping the "true" data in the original structure. This also allows for an easy (though not instantaneous) complete rebuild of the search tables from scratch. It was done in SQL Server, no fancy tools.

/gustav

-----Oprindelig meddelelse-----
Fra: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] På vegne af Arthur Fuller
Sendt: 27. december 2014 19:14
Til: Peter Brawley
Emne: [dba-SQLServer] Opinions invited: question about normalization

I have in development an app that is intended for use by agencies that use a lot of volunteers. My initial design was fully normalized (by which I mean I went to 5NF not just petty 3NF), but this posed significant problems on the design of the FE, which is web-based. This means that attached to the Volunteers table are several child tables, each of which connects to a lookup table, and each of which child could have several rows. For example, consider Languages-Spoken, easy enough in itself, point to a lookup table listing languages, present a combo-box and that's that. But now we add Skills (Driver, Cook, Visits, etc.), and now just with these two lists, it's already complex: I need a Russian-speaking volunteer who can also serve as a Driver or Cook, say.

That leads to complex queries that eventually involve nested Having predicates, and that is tough to implement in a UI that recognizes that most of its users are, not to cast aspersions, but let's face it, a lot of users in this category are going to have problems with the intricacies of ANDs and ORs.

So I was thinking that there's another design strategy in which all the lookup tables are folded into a single table, with an identifier that describes the domain (i.e. languages spoken, skills offered, fields of interest, etc.), the point being that a successive number of predicates could be assembled from a single table, thus avoiding the complexities of multi-table queries.

I know that this design violates virtually everything Dr. Codd has written, but he was after all a mere demi-god, and as problems have progressed and products such as NoSQL have emerged, I've begun to doubt propositions I formerly held absolute. Disk space is cheap, speed is abundant, and (well not in my particular case) databases are expanding to billions of rows.

Any thoughts, my friends?

--
Arthur




More information about the dba-SQLServer mailing list