Re: Remote table performance on complex queries/views

  • From: Lothar Flatz <l.flatz@xxxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Tue, 13 Sep 2016 11:36:16 +0200

That is an interesting case. Jonathan is correct as always. I like to add my 2 cent.
This seems to be a major design disaster. Instead of reading the column from one table you need two tables to provide the content of a column.
Usually the idea behind it to be flexible. You achieve it that way, but the cost of efficiency is often to high to pay.
In my eyes this is over ambitious database design.
It would be helpful to understand the meaning of table erptable575 and erptable620.
E.g. sometimes such design is supporting different languages.
It would also be helpful if the values of the foreign key columns (e.g. ltab.col25, ltab.col18..) are repeating.
If yes, some sort of caching can come into play.
Generally a join would be preferable to an inline select and a local select would be preferable to one containing remote access.
However, even an local join is still suboptimal, as Jonathan already stated.
There might be some limited options to rewrite the query without changing the code, like the sql translate in 12c or advanced rewrite or applying hints that enforce the optimizer to rewrite. Each of that would be cumbersome ,difficult and maybe impossible depending on the variation of those statements.
I just wanted to mention the option.
If a rewrite is not possible make sure each inline select is covered by a single index only access or an access to an IOT.

Regards

Lothar


On 13.09.2016 10:34, Jonathan Lewis wrote:



That type of code is bad within a single instance and guaranteed to be a disaster if the scalar subqueries go remote. Distributed queries are bad enough because they typically give you a choice of nested loop: large numbers of very small messages flooding the network (which is the effect you get from the subquery approach) or hash joins which pull excessive volumes of data across the network.

Mladen's suggested bodge has some merit - if you create read-only snapshots of the remote tables, with the minimum set of columns, any row restrictions you can invent, and only the minimum set of indexes you could then do an atomic fast refresh every few minutes.

For each MV you can create a view which reuses the original table name (create or replace view TABLEX as select * from MV_TABLEX) so that your local code doesn't have to change.

Oracle 12c can unnest inline scalar suqbueries and turn them into outer joins - but that won't necessariy help much, even if it did it right every time.


Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
------------------------------------------------------------------------
*From:* oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on behalf of Rich J [rjoralist3@xxxxxxxxxxxxxxxxxxxxx]
*Sent:* 12 September 2016 21:28
*To:* Oracle L
*Subject:* Remote table performance on complex queries/views

Hey all,

We're testing a vendor's app. Since the app generates dynamic SQL that mostly don't use binds, I've put their schema in a separate database and used views of our ERP tables over DB links for them to query. Performance was terrible, timing out interactive queries after five minutes. The vendor recommended that instead of those views of our ERP tables that we reference the tables more directly using the "ERP.ERPTABLE@ERPDB" syntax. The app isn't timing out any more, but performance is still suboptimal for our users.

The app queries in question appear to be simple, but it turns out that the simplicity is masked by the several layers of nested views that contain dozens of correlated subqueries, each to one of a handful of tables in the remote ERP DB. The views are something like:

           CREATE SYNONYM erpt1 FOR erpschema.erptable575@ERPDB;
           CREATE SYNONYM erpt2 FOR erpschema.erptable620@ERPDB;

           SELECT
( SELECT col6 FROM erpt1 WHERE erpt1pk = ltab.col25) description,
( SELECT col14 FROM erpt1 WHERE erpt1pk = ltab.col18) otherdesc,
( SELECT col7 FROM erpt1 WHERE erpt1pk = ltab.col7) randomstuff,
( SELECT col40 FROM erpt2 WHERE erpt2pk = ltab.col33) morestuff,
( SELECT newstuff FROM otherlocalview olv WHERE ltab.col99 = olv.col21 )
           ...
           FROM local_table ltab;

There's dozens of the inline SELECTs to the remote ERP tables, via the "erpt1" and other similar synonyms. And the "otherlocalview" contains similar subqueries and references to yet more views, etc. One particular explain plan takes almost a minute to come back and is over 320 steps.

So, I'm tasked with making this work. My kneejerk is to replace the subqueries with an outer joined table, but I really can't rewrite the vendor's code. The vendor claims that creating their schema in the ERP DB directly will take care of the performance issue. I have created a test schema in the test ERP DB and it does perform better for the end user, but the lack of binds is a deal breaker, having gone through that particular hell before. So I haven't done a more scientific comparison between the app schema being local to the ERP DB and being remote. I've also ruled out Data Guard and MVs for other non-related reasons...

Is there something from the optimizer point-of-view that I may have missed? Any other tricks of tweaking to get the last 10% improvement (instead of the 95% improvement by reworking the vendor app)?

Any pointers are appreciated!  TIA!
Rich



--




Other related posts: