Hi, this is Dan Tow, the author of SQL Tuning, wading in with a few remarks: One of the list members (no, not my editor) let me know of this thread, so I thought I might clear up a few things with a response.

There's been some speculation regarding the origin of the method I describe in the book. I came up with it on my own about ten years ago, without a literature search, but later learned that there was some fairly similar work in the academic literature that long predated my entry into the field, so I make no particular claim that the method is hugely original. I do believe, however, that the book is at least original as a *non-academic*, readable, detailed description of the approach.

I'd also like to add a few remarks regarding Mike Thomas's well-thought-out review of the book, and I'd like to begin by thanking Mike for it. Where he liked the book, I heartily agree;-), and where he listed minuses, I find his remarks well-considered. Even where I mildly disagree, he points up choices I made that I could have explained better in the book. Since late is better than never, I'm taking this opportunity to explain those choices here; each of the concerns Mike expressed maps to a conscious choice I made in writing the book. My responses follow his remarks, below:

Thanks,

Dan Tow
dantow@xxxxxxxxxxxxxx
650-858-1557
www.singingsql.com

> --------------------------------------------------------------------
>
> I read it cover-to-cover, and it's great. It covers
> many database types, not just Oracle.
>
> Replaces 'trial-and-error' approach with ... well, let's
> call it a trial (e.g. hard) ... a manual approach to
> tuning SQL.

Yes, it *is* hard at first. I encourage my readers to keep at it, though. Most new tools just slow you down until you master them, and the tools this book teaches are no exception. I learned pottery long ago, and the potter's wheel just got in the way at first, but few potters would do without one!
> The method it teaches is a very manual
> procedure for analyzing and tuning SQL. It's nothing
> like any other book I've read.
>
> I'm undecided how to recommend it. So, here are some
> plus and minus points in my opinion:
>
> PLUS:
> ----------
> 1) Cool tricks I've never seen before to drive SQL
> paths. I'm talking 'way-beyond' concat null or +0.
>
> 2) Manual techniques to analyze and derive an
> 'optimal' and 'robust' SQL access plan. After you
> learn these techniques you should know if the CBO is
> helping or not.
>
> 3) Diagramming techniques for SQL that are wonderful.
> And, mathematical techniques which go deep into
> selecting a robust SQL plan.
>
> 4) Discussion supports almost any SQL database, and
> is especially good with general-purpose databases.
>
> 5) Good descriptions that eliminate misunderstanding
> of SQL syntax and its side effects on tuning SQL.
>
> MINUS:
> ----------
> 1) Does not emphasize how to work with Oracle's CBO.
> Following the 80/20 rule, if you accept the CBO and
> statistics as good on 80% of SQL, then what do you do
> with the 20% problem cases?
> Options might include a) figuring out how to tell/hint
> the CBO to succeed, b) trying to manually force the SQL
> plan, or c) trying something else, e.g. partitions,
> parallel, analytical fns, stored outlines, 10g new
> stuff, etc. Hmmmm?

Having friends on Oracle's CBO team, I'd have to say that *more* than 80% of all SQL is fine without manual tuning. However, it is astonishing how much of the *problem* SQL can be improved, and how many times faster it can usually be made. Problem SQL is *almost invariably* long-running because it has something wrong with it. If you use the right methodologies to find the SQL to tune, you will almost never find that you simply have to live with how it already performs. (However, you will sometimes require outside-the-box solutions such as I describe in Chapter 10.) Given all that, I find that the approach to take is:

0) Find the right SQL to tune.
If you do this well, you won't need to tune much, and quality of result matters more than how long it takes to find the result.

1) Decide, using the method in the book, what the right execution plan is, without paying any attention to what the CBO is already doing. (Thinking about the current plan just prejudices you to repeat its mistakes. This is similar to why it is hard to debug your own code: you already know what it is *supposed* to do, which just gets in the way of seeing what it *is* doing.) Where even the best plan is too slow, figure out how to modify the application using the outside-the-box approaches of Chapter 10.

2) Modify the SQL, or the application, as needed. Once you have step 1 (the subject of 80% of the book) behind you, this step is straightforward, and covered reasonably well, I think, for all three databases, in the 20% of the book that remains. There are certainly details, as Mike describes, such as stored outlines, that I considered outside the scope of the book, but I think these details are never the *only* way to solve the generally easy step 2.

> 2) Heavy focus on nested loops with few and
> complicated exceptions for anything else. I think
> (guess) if you are doing a data warehouse the
> techniques might not help.

It is true that I have spent little of my career on data-warehouse applications, but almost all the applications I've worked on had major long-running batch components, which call for the same tuning techniques that apply to data warehouses. In retrospect, I could have explained my emphasis on nested loops better: even for the biggest, ugliest data-warehouse query, you should *first* find the best (or very nearly best) possible all-nested-loops plan. It turns out to require two long chapters to thoroughly describe how to do that, not even counting the chapter that describes how to define the problem, mathematically, in the form of a query diagram.
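To give a feel for the flavor of the idea, and only the flavor: the sketch below greedily picks a nested-loops join order from a simplified stand-in for a query diagram. This is NOT the book's actual rules. The tables, rowcounts, filter selectivities, and join ratios are all invented for illustration, and the one-line greedy heuristic ("keep the running rowcount lowest") is a deliberate simplification of the full method.

```python
# Toy sketch (not the book's real heuristics): greedily choose a
# nested-loops join order from a simplified "query diagram".
# All names and numbers below are invented for illustration.

TABLES = {  # table -> (total rows, fraction of rows passing its filter)
    "Orders":    (1_000_000, 0.001),  # strongest filter: 0.1% of rows
    "Customers": (   50_000, 1.0),
    "Items":     (5_000_000, 1.0),
}
JOINS = {   # (a, b) edge -> average rows of b matched per row of a
    ("Orders", "Customers"): 1.0,  # each order has one customer
    ("Orders", "Items"):     5.0,  # each order has ~5 line items
}

def neighbors(table, joined):
    """Yield (next_table, join_ratio) for joins reachable from `table`."""
    for (a, b), ratio in JOINS.items():
        if a == table and b not in joined:
            yield b, ratio

def greedy_plan():
    # 1) Drive from the table with the fewest rows after its filter.
    driver = min(TABLES, key=lambda t: TABLES[t][0] * TABLES[t][1])
    order, joined = [driver], {driver}
    running = TABLES[driver][0] * TABLES[driver][1]  # running rowcount
    # 2) Repeatedly join whichever reachable table keeps the running
    #    rowcount lowest.
    while len(joined) < len(TABLES):
        candidates = []
        for t in joined:
            for nxt, ratio in neighbors(t, joined):
                sel = TABLES[nxt][1]
                candidates.append((running * ratio * sel, nxt))
        running, nxt = min(candidates)
        order.append(nxt)
        joined.add(nxt)
    return order, running

# With these invented numbers the driver is Orders (1,000 filtered rows),
# then Customers (ratio 1.0 keeps the rowcount flat), then Items.
```
With these made-up statistics the sketch drives from the best-filtered table and defers the fan-out join to the big detail table for last, which is the general shape of the argument above: reach the largest tables with as few rows as possible.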
*After* you've found the best nested-loops plan, you will very often find that it is so good that it is not worth the bother to tune further! Even where you ought to tune further, you need the join order and running rowcounts of that best nested-loops plan to perform the relatively simple calculations I describe at the end of Chapter 6 to correctly choose where you need hash joins. If, instead, you try to decide on join methods before you know the best nested-loops join order, you will frequently choose hash joins to tables reached at the wrong point in the join order (with far too many joining rows), where a nested-loops join at the *right* point would be cheaper. Much of the impression of the usefulness of hash joins, I believe, comes about where a *bad* join order was made much less bad with one or more hash joins, when a *better* solution would have been a corrected join order. Correct join orders rarely benefit *much* from hash joins, except where poorly designed applications read more rows than they should. (Hash joins to small tables are frequently a *little* better, and I don't suggest overriding the CBO's join-method choice in these cases, but the improvement is most often very minor when the join order is correct.)

> 3) Does not discuss Oracle's (nor the others', MS SQL
> Server and DB2/2) analytical functions. Personally,
> we've used these to improve our report SQL a lot.

Generally, the analytical functions still require that you read the raw rows. I focus on getting to those rows fast; if you do that, the rest is generally trouble-free.

> 4) Technique is based on row counts, and I'm told it
> would be better with block counts. I have not tested
> yet, but I'll let you know after OP101.

The technique *really* emphasizes row counts to the *largest* tables. (The heuristics are set up to reach these tables with as few rows as possible, even if that means reading more rows in the smaller tables.
Roughly, this in turn is a good way to minimize physical I/Os and runtime, since you'll generally reach one big-table block per big-table row, and the big tables are going to be by far the most poorly cached objects in the queries (including the better-cached, more compact big-table indexes that you use to reach the big tables).) That said, the technique *also* does quite a good job of minimizing rowcounts to all the tables, which in turn does quite a good job of minimizing block counts. I've always been acutely aware of the true goals and the true cost function we need to minimize, but it is astonishing, really, how rarely you need to adjust the answer you get by simply minimizing rowcounts, with an emphasis on the largest-table rowcounts. I probably could have added another 50 pages or so of complexity to handle the extremely rare cases where the distinction matters enough to bother, but, remember, you (and everyone else) *already* find the method hard, especially at first. I'd rather get the main points across well, and leave it to the top-top experts (and the automated optimizers) to hyper-optimize the very rare corner cases.

> The reason I think the book is great is because I want
> to know how close I'm getting on 'problem' SQL, and
> when I need to try something else. Some of Tow's
> techniques are not discussed anywhere else. The
> diagramming and mathematical techniques are gold. The
> analysis and tuning methods work on MySQL databases.
>
> In light of recent thread discussions, Tow's book is
> not an arsenal of weapons to fight the Oracle SQL
> tuning war. Rather, it's more like GPS and a sniper
> rifle (please don't put this on your cube wall).
>
> Regards,
>
> Mike Thomas

Dan Tow
dantow@xxxxxxxxxxxxxx
650-858-1557
www.singingsql.com

We make SQL sing!