Anupam, Think about the purpose of a primary key constraint. The question you need to answer is: why do you think you need the constraint? All relational tables must be uniquely identifiable, so that some column or combination of columns (even if it is all the columns) represents a unique identifier. I don't wish to discuss why that is true with anyone; go back to Codd and Date for that. However, the presence of a unique identifier is one thing. Implementing the mechanism of a primary key (or unique key or foreign key) constraint is quite another. What the reasons for doing so?
So, with the fact table in a data warehouse, do *any* of these reasons truly apply? I'm sure that the first item (i.e. protecting against duplicate entries) seems the most important, but riddle me this: is a unique index (which is the enforcement mechanism used by primary key and unique key constraints in Oracle) truly the only way to protect against duplicate entries? Many people say yes, forgetting that there really are other ways, very low-tech and easy to overlook, but they exist. The logic of ETL processes can ensure that duplicate entries don't occur. Validation jobs running during and after data loading can detect whether any errors occurred during data loading, alerting data warehouse administrators of the problem. The fact is, the inherent latency involved with the ETL or ELT process does not mandate the mechanism of a unique index to enforce uniqueness. Even if someone were to argue that a unique index is indeed the very best mechanism for enforcing uniqueness, where is the need? In a dimensional data model, what type of query is it that needs to uniquely identify a row in a fact table? The answer is: a mistake. If you have a query that absolutely needs a unique identifier in a fact table, then it means that your data model has a relationship between two (or more) fact tables, and that is an inherent mistake in a dimensional data model. When that occurs, it is time to go back to the design floor and rethink assumptions and redraw the model. So, think it through. The unique identifier for a fact table should be the combination of all the dimensional key columns. There is no need for a surrogate key on the fact table, because all of the dimensional key columns are themselves surrogate keys, and to understand the reasoning for that, go back and remember the reasons surrogate keys are used in data warehouses. The presence of a unique identifier on the fact table does not mean that it needs to be enforced by a primary key. Uniqueness can be enforced in other ways, and any query logic that may seem to necessitate a unique identifier is either itself in error or has exposed an error in the data model. Sorry for the brevity, but don't be afraid to question authority on this matter. You've discovered only the first of many disadvantages of the use of unique indexes in DW environments on Oracle. Oracle can do a magnificent job with data warehouse databases, but sometimes you have to remember that the mechanisms provided by Oracle (and other RDBMS's) aren't always the best implementation choice. Keep an open mind, and keep in mind the purpose of the application. Hope this helps... Tim Gorman consultant -> Evergreen Database Technologies, Inc. postal => P.O. Box 630791, Highlands Ranch CO 80163-0791 website => http://www.EvDBT.com/ email => Tim@xxxxxxxxx mobile => +1-303-885-4526 fax => +1-303-484-3608 Lost Data? => http://www.ora600.be/ for info about DUDE... On 8/3/2010 10:53 PM, Anupam Pandey wrote: Hi Listers,-- //www.freelists.org/webpage/oracle-l |