That type of code is bad within a single instance and guaranteed to be a
disaster if the scalar subqueries go remote. Distributed queries are bad enough
because they typically give you a choice between nested loop joins - large
numbers of very small messages flooding the network (which is the effect you
get from the subquery approach) - or hash joins, which pull excessive volumes
of data across the network.
Mladen's suggested bodge has some merit - if you create read-only snapshots of
the remote tables, with the minimum set of columns, any row restrictions you
can invent, and only the minimum set of indexes, you could then do an atomic
fast refresh every few minutes.
For each MV you can create a view which reuses the original table name (create
or replace view TABLEX as select * from MV_TABLEX) so that your local code
doesn't have to change.
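A minimal sketch of that approach (all object names, columns, and the refresh
interval here are invented for illustration; note also that fast refresh across
a database link requires a materialized view log on the remote master table):

```sql
-- On the remote ERP database: a materialized view log is needed
-- before a fast refresh is possible (hypothetical table name):
--   CREATE MATERIALIZED VIEW LOG ON erpschema.erptable575;

-- On the local database: minimum columns, any row restriction you
-- can invent, refreshed every five minutes.
CREATE MATERIALIZED VIEW mv_tablex
  REFRESH FAST
  START WITH SYSDATE NEXT SYSDATE + 5/(24*60)
AS
SELECT erpt1pk, col6, col14
FROM   erpschema.erptable575@ERPDB;

-- Only the minimum set of indexes.
CREATE INDEX mv_tablex_pk_ix ON mv_tablex (erpt1pk);

-- Reuse the original table name so local code doesn't change.
CREATE OR REPLACE VIEW tablex AS SELECT * FROM mv_tablex;
```

If you need several such MVs refreshed as a single consistent set, you could
call DBMS_MVIEW.REFRESH with a comma-separated list of MV names and
atomic_refresh => TRUE.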
Oracle 12c can unnest inline scalar subqueries and turn them into outer joins -
but that won't necessarily help much, even if it did it right every time.
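For illustration, the transformation looks like this (table and column names
follow the style of the example quoted later in the thread, and the rewrite is
only valid if erpt1pk is unique):

```sql
-- Scalar subquery in the select list:
SELECT (SELECT t.col6 FROM erpt1 t WHERE t.erpt1pk = ltab.col25) AS description
FROM   local_table ltab;

-- Logically equivalent outer join, assuming erpt1pk is unique:
SELECT t.col6 AS description
FROM   local_table ltab
LEFT OUTER JOIN erpt1 t ON t.erpt1pk = ltab.col25;
```

The join form gives the optimizer the chance to pick a single remote hash join
instead of one remote round trip per row.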
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on behalf
of Rich J [rjoralist3@xxxxxxxxxxxxxxxxxxxxx]
Sent: 12 September 2016 21:28
To: Oracle L
Subject: Remote table performance on complex queries/views
Hey all,
We're testing a vendor's app. Since the app generates dynamic SQL that mostly
doesn't use binds, I've put their schema in a separate database and created
views of our ERP tables over DB links for it to query. Performance was terrible,
timing out interactive queries after five minutes. The vendor recommended that,
instead of those views of our ERP tables, we reference the tables more
directly using the "ERP.ERPTABLE@ERPDB" syntax. The app isn't timing out any
more, but performance is still suboptimal for our users.
The app queries in question appear to be simple, but it turns out that the
simplicity is masked by the several layers of nested views that contain dozens
of correlated subqueries, each to one of a handful of tables in the remote ERP
DB. The views are something like:
CREATE SYNONYM erpt1 FOR erpschema.erptable575@ERPDB;
CREATE SYNONYM erpt2 FOR erpschema.erptable620@ERPDB;
SELECT
( SELECT col6 FROM erpt1 WHERE erpt1pk = ltab.col25) description,
( SELECT col14 FROM erpt1 WHERE erpt1pk = ltab.col18) otherdesc,
( SELECT col7 FROM erpt1 WHERE erpt1pk = ltab.col7) randomstuff,
( SELECT col40 FROM erpt2 WHERE erpt2pk = ltab.col33) morestuff,
( SELECT newstuff FROM otherlocalview olv WHERE ltab.col99 = olv.col21 )
...
FROM local_table ltab;
There are dozens of these inline SELECTs against the remote ERP tables, via the
"erpt1" and other similar synonyms. And the "otherlocalview" contains similar
subqueries and references to yet more views, etc. One particular explain plan
takes almost a minute to come back and is over 320 steps.
So, I'm tasked with making this work. My kneejerk is to replace the subqueries
with an outer joined table, but I really can't rewrite the vendor's code. The
vendor claims that creating their schema in the ERP DB directly will take care
of the performance issue. I have created a test schema in the test ERP DB and
it does perform better for the end user, but the lack of binds is a deal
breaker, having gone through that particular hell before. So I haven't done a
more scientific comparison between the app schema being local to the ERP DB and
being remote. I've also ruled out Data Guard and MVs for other, unrelated
reasons...
Is there something from the optimizer's point of view that I may have missed?
Any other tricks or tweaks to get the last 10% of improvement (instead of the
95% improvement that would come from reworking the vendor app)?
Any pointers are appreciated! TIA!
Rich