Last week I had a talk at Oracle’s OpenWorld 2011 titled Modern PL/SQL Code Checking and Dependency Analysis.
The problem I described in chapter 4 was to find all view columns using the column UNIT_COST of the table COSTS in the SH schema. Other usages of this column (e.g. in where or order by clauses) have to be ignored. To solve this problem within the Oracle Database Server 11.2 a parser is necessary (at least I’m not aware of another solution). Even a DBA_DEPENDENCY_COLUMNS view as described in Rob van Wijk’s post is not enough to solve this problem.
However, in this particular case, no custom or 3rd party parser is necessary. Oracle provides a procedure named PARSEQUERY in the PL/SQL package UTL_XML which is in fact well-suited to solve this problem as I will show later. First, I’d like explain which columns should be found by a dependency analysis procedure based on some sample views.
Oracle’s sales history demo schema SH provides a view named PROFITS, which is defined as follows:
CREATE OR REPLACE VIEW PROFITS AS
SELECT s.channel_id,
s.cust_id,
s.prod_id,
s.promo_id,
s.time_id,
c.unit_cost,
c.unit_price,
s.amount_sold,
s.quantity_sold,
c.unit_cost * s.quantity_sold TOTAL_COST
FROM costs c, sales s
WHERE c.prod_id = s.prod_id
AND c.time_id = s.time_id
AND c.channel_id = s.channel_id
AND c.promo_id = s.promo_id;
The columns using COSTS.UNIT_COSTS are highlighted.
The following view uses the column TOTAL_COSTS in GROSS_MARGIN (line 14) and GROSS_MARGIN_PERCENT (lines 14 and 15). The usage is not evident at the first glance since it is based on the column GROSS_MARGIN (line 4) of the named query GM and the column COST (line 8) in GM’s subquery. This kind of dependencies need to be identified.
CREATE OR REPLACE VIEW GROSS_MARGINS AS
WITH
gm AS (
SELECT time_id, revenue, revenue - cost AS gross_margin
FROM (
SELECT time_id,
unit_price * quantity_sold AS revenue,
total_cost AS cost
FROM profits
)
)
SELECT t.fiscal_year,
SUM(revenue) AS revenue,
SUM(gross_margin) AS gross_margin,
round(100 * SUM(gross_margin) / SUM(revenue), 2)
AS gross_margin_percent
FROM gm
INNER JOIN times t ON t.time_id = gm.time_id
GROUP BY t.fiscal_year
ORDER BY t.fiscal_year;
The next view does not present the data of COSTS.UNIT_COST as a column, even if the view depends on the table COSTS
CREATE OR REPLACE VIEW REVENUES AS
SELECT fiscal_year, revenue
FROM gross_margins;
The last view uses COSTS.UNIT_COST but not as part of a column expression and therefore has not to be reported. The usage in the order by clause is considered safe.
CREATE OR REPLACE VIEW SALES_ORDERED_BY_GM AS
SELECT channel_id,
cust_id,
prod_id,
promo_id,
time_id,
amount_sold,
quantity_sold
FROM profits
ORDER BY (unit_price - unit_cost) DESC;
So, the following result of the dependency analysis is expected:
SCHEMA | VIEW | COLUMN |
---|---|---|
SH | PROFITS | UNIT_COST |
SH | PROFITS | TOTAL_COST |
SH | GROSS_MARGINS | GROSS_MARGIN |
SH | GROSS_MARGINS | GROSS_MARGIN_PERCENT |
Exactly this result is created by the following query
SELECT *
FROM TABLE(coldep_pkg.get_dep('sh', 'costs', 'unit_cost'));
Now I just list all the code snippets I’ve written to create this result. Please note, that this is considered just a proof-of-concept code to show how UTL_XML.PARSEQUERY could be used for SQL dependency analysis in conjunction with Oracle dictionary views. This means that this is not a complete implementation. For example, wild cards (*) are not handled which may lead to missing dependencies. Additionally, table/view sources are not checked which may lead to false positives (in case a column is used in multiple view/table sources). – Please feel free to complete the code. However, an update is highly appreciated ;-)
GRANT EXECUTE ON SYS.UTL_XML TO SH;
CREATE OR REPLACE TYPE "SH"."COLDEP_TYP" AS
OBJECT (schema_name VARCHAR2(30),
view_name varchar2(30),
column_name VARCHAR2(30))
/
CREATE OR REPLACE TYPE "SH"."COLDEP_L" IS TABLE OF coldep_typ
/
CREATE OR REPLACE PACKAGE "SH"."COLDEP_PKG" IS
FUNCTION parse_query(p_query IN VARCHAR2) RETURN xmltype;
FUNCTION get_dep(p_schema_name IN VARCHAR2,
p_object_name IN VARCHAR2,
p_column_name IN VARCHAR2) RETURN coldep_l
PIPELINED;
FUNCTION process_view(p_schema_name IN VARCHAR2,
p_view_name IN VARCHAR2,
p_column_name IN VARCHAR2,
p_query IN CLOB) RETURN coldep_l;
END coldep_pkg;
/
CREATE OR REPLACE PACKAGE BODY "SH"."COLDEP_PKG" IS
FUNCTION parse_query(p_query IN VARCHAR2) RETURN xmltype IS
v_clob CLOB;
v_xml xmltype;
BEGIN
dbms_lob.createtemporary(v_clob, TRUE);
-- parse query and get XML as CLOB
sys.utl_xml.parsequery(USER, p_query, v_clob);
-- create XMLTYPE from CLOB
v_xml := xmltype.createxml(v_clob);
dbms_lob.freetemporary(v_clob);
RETURN v_xml;
END parse_query;
FUNCTION get_dep(p_schema_name IN VARCHAR2,
p_object_name IN VARCHAR2,
p_column_name IN VARCHAR2) RETURN coldep_l
PIPELINED IS
BEGIN
-- query dictionary dependencies
FOR v_dep IN (SELECT d.owner AS schema_name,
d.name AS view_name,
v.text AS query_text
FROM all_dependencies d
INNER JOIN all_views v
ON v.owner = d.owner
AND v.view_name = d.name
WHERE d.referenced_owner = upper(p_schema_name)
AND d.referenced_name = upper(p_object_name)
AND d.type = 'VIEW')
LOOP
-- process every fetched view
FOR v_views IN (
SELECT VALUE(pv) coldep
FROM TABLE(process_view(v_dep.schema_name,
v_dep.view_name,
p_column_name,
v_dep.query_text)) pv)
LOOP
-- return column usages in v_dep.view_name
PIPE ROW(v_views.coldep);
-- get column usages of views using v_dep.view_name (recursive calls)
FOR v_recursive IN (
SELECT VALUE(dep) coldep
FROM TABLE(get_dep(v_views.coldep.schema_name,
v_views.coldep.view_name,
v_views.coldep.column_name)) dep)
LOOP
-- return column usages of recursive call
PIPE ROW(v_recursive.coldep);
END LOOP;
END LOOP;
END LOOP;
END get_dep;
FUNCTION process_view(p_schema_name IN VARCHAR2,
p_view_name IN VARCHAR2,
p_column_name IN VARCHAR2,
p_query IN CLOB) RETURN coldep_l IS
v_search_l coldep_l := coldep_l(coldep_typ(NULL,
NULL,
p_column_name));
v_xml xmltype;
v_previous_count INTEGER := 0;
v_coldep_l coldep_l := coldep_l();
BEGIN
-- parse view query
v_xml := parse_query(p_query);
-- get inline dependencies from secondary select lists
-- TODO: handle table/view source and wildcard properly
WHILE v_previous_count < v_search_l.count
LOOP
v_previous_count := v_search_l.count;
FOR v_secondary IN (
SELECT nvl(x.alias_name, x.column_reference) AS alias_name
FROM (SELECT t.select_list_item,
t.alias_name,
extractvalue(VALUE(c), 'COLUMN') AS column_reference
FROM xmltable('//SELECT_LIST_ITEM[ancestor::FROM or ancestor::WITH]'
passing v_xml
columns select_list_item xmltype path '//SELECT_LIST_ITEM',
alias_name VARCHAR2(30) path '//COLUMN_ALIAS') t,
TABLE(xmlsequence(extract(select_list_item, '//COLUMN'))) c) x
WHERE upper(x.column_reference) IN (SELECT upper(column_name)
FROM TABLE(v_search_l))
AND upper(alias_name) NOT IN (SELECT upper(column_name)
FROM TABLE(v_search_l)))
LOOP
-- add internal column usage
v_search_l.extend;
v_search_l(v_search_l.count) := coldep_typ(NULL,
NULL,
v_secondary.alias_name);
END LOOP;
END LOOP;
-- analyze primary select list
-- TODO: handle table/view source and wildcard properly
FOR v_primary IN (
SELECT x.column_id, atc.column_name
FROM (SELECT t.select_list_item,
t.column_id,
extractvalue(VALUE(c), 'COLUMN') AS column_reference
FROM xmltable('//SELECT_LIST_ITEM[not (ancestor::FROM) and not (ancestor::WITH)]'
passing v_xml
columns column_id FOR ordinality,
select_list_item xmltype path '//SELECT_LIST_ITEM') t,
TABLE(xmlsequence(extract(select_list_item, '//COLUMN'))) c) x
INNER JOIN all_tab_columns atc
ON atc.owner = p_schema_name
AND atc.table_name = p_view_name
AND atc.column_id = x.column_id
WHERE upper(x.column_reference) IN (SELECT upper(column_name)
FROM TABLE(v_search_l))
ORDER BY x.column_id)
LOOP
-- add external column usage
v_coldep_l.extend;
v_coldep_l(v_coldep_l.count) := coldep_typ(p_schema_name,
p_view_name,
v_primary.column_name);
END LOOP;
-- return column dependencies
RETURN v_coldep_l;
END process_view;
END coldep_pkg;
/
Below you find the XML parser output of the query defined in the view GROSS_MARGINS. The model becomes quite clear, even if I could not find a schema description.
<QUERY>
<WITH>
<WITH_ITEM>
<QUERY_ALIAS>GM</QUERY_ALIAS>
<QUERY>
<SELECT>
<SELECT_LIST>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<SUB>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
<COLUMN_REF>
<COLUMN>COST</COLUMN>
</COLUMN_REF>
</SUB>
<COLUMN_ALIAS>GROSS_MARGIN</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
</SELECT_LIST>
</SELECT>
<FROM>
<FROM_ITEM>
<QUERY>
<SELECT>
<SELECT_LIST>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<MUL>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>UNIT_PRICE</COLUMN>
</COLUMN_REF>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>QUANTITY_SOLD</COLUMN>
</COLUMN_REF>
</MUL>
<COLUMN_ALIAS>REVENUE</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>TOTAL_COST</COLUMN>
</COLUMN_REF>
<COLUMN_ALIAS>COST</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
</SELECT_LIST>
</SELECT>
<FROM>
<FROM_ITEM>
<TABLE>PROFITS</TABLE>
</FROM_ITEM>
</FROM>
</QUERY>
</FROM_ITEM>
</FROM>
</QUERY>
</WITH_ITEM>
</WITH>
<SELECT>
<SELECT_LIST>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>FISCAL_YEAR</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<SUM>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
</SUM>
<COLUMN_ALIAS>REVENUE</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<SUM>
<COLUMN_REF>
<COLUMN>GROSS_MARGIN</COLUMN>
</COLUMN_REF>
</SUM>
<COLUMN_ALIAS>GROSS_MARGIN</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<ROUND>
<DIV>
<MUL>
<LITERAL>100</LITERAL>
<SUM>
<COLUMN_REF>
<COLUMN>GROSS_MARGIN</COLUMN>
</COLUMN_REF>
</SUM>
</MUL>
<SUM>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
</SUM>
</DIV>
<LITERAL>2</LITERAL>
</ROUND>
<COLUMN_ALIAS>GROSS_MARGIN_PERCENT</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
</SELECT_LIST>
</SELECT>
<FROM>
<FROM_ITEM>
<JOIN>
<INNER/>
<JOIN_TABLE_1>
<QUERY_ALIAS>GM</QUERY_ALIAS>
</JOIN_TABLE_1>
<JOIN_TABLE_2>
<TABLE>TIMES</TABLE>
<TABLE_ALIAS>T</TABLE_ALIAS>
</JOIN_TABLE_2>
<ON>
<EQ>
<COLUMN_REF>
<TABLE>TIMES</TABLE>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
<COLUMN_REF>
<TABLE_ALIAS>GM</TABLE_ALIAS>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
</EQ>
</ON>
</JOIN>
</FROM_ITEM>
</FROM>
<GROUP_BY>
<EXPRESSION_LIST>
<EXPRESSION_LIST_ITEM>
<COLUMN_REF>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>FISCAL_YEAR</COLUMN>
</COLUMN_REF>
</EXPRESSION_LIST_ITEM>
</EXPRESSION_LIST>
</GROUP_BY>
<ORDER_BY>
<ORDER_BY_LIST>
<ORDER_BY_LIST_ITEM>
<COLUMN_REF>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>FISCAL_YEAR</COLUMN>
</COLUMN_REF>
</ORDER_BY_LIST_ITEM>
</ORDER_BY_LIST>
</ORDER_BY>
</QUERY>
Please note that UTL_XML.PARSEQUERY is suited for extended query dependency analysis only. DML may be parsed, but the resulting model is incomplete with 11.2.0.2 (e.g. clauses missing in the select statement are not included in the model, like the SET clause in an update statement). If you need to analyze PL/SQL beyond PL/Scope you still may need a 3rd party parser.
9 Comments
Hi Phillip, thanks for the explanation its been really helplful….
Is there a way i could a SQL analysis on 2 different queries, I had this post on the oracle forum as well , please see this link : link
It would be a great help if you could guide me on how i should proceed about the same.
1)What are the initial development steps i need to do?
2)What would be the modules i would have to plan out so as to accomplish the same
It would be great, if you could help out………..
Thanks!!
Hi Nayan,
Yes, you may analyze or compare two SQL statements. The statements and their results. Based on your comparison requirements this can become a quite complex and time consuming task. Honestly I do not understand your requirements good enough to recommend how to proceed exactly to solve your problem. But I recommend to define some simple test cases with expected results (SQL Statement 1, SQL statement 2, exact comparison result) to allow others to provide an appropriate solution approach.
Brilliant stuff.
One way of solving the wildcard problem might be running the view sources through DBMS_UTILITY.EXPAND_SQL_TEXT() first – this will expand the wildcard columns and also recursively expand the underlying views. (The latter might not be desirable, though.)
EXPAND_SQL_TEXT() is a 12c feature, but is available on 11g too as DBMS_SQL2.EXPAND_SQL_TEXT() – undocumented, yet seems to work just fine.
Unfortunately, what coldep_pkg.get_dep() does is impact, while I discovered this blog post looking for a solution for lineage, which is the opposite problem: given the SQL query, find all table columns it depends on. Here, I’d definitely use EXPAND_SQL_TEXT() as the first step, and this post provides the techniques to do the rest – still not easy, but frankly until today I thought it was virtually impossible.
Thanks Branko.
Please note that in Oracle Database 18c, the utl_xml.parsequery function ist not accessible anymore. See https://www.salvis.com/blog/2018/04/30/white-listed-pl-sql-programs-in-oracle-database-18c/ for details.
Thanks for the heads up! I don’t have an 18c installation on hand, but isn’t parsequery() available in 18cR3 (https://www.morganslibrary.org/reference/pkgs/utl_xml.html#uxml10)?
It’s available, but it is protected by
ACCESSIBLE BY (PACKAGE SYS.DBMS_METADATA)
. I’ve used this workaround in the plscope-utils project.All is not lost then, thanks!
Can you help me. I can see UTL_XML_LIB, however the File Spec is NULL. What is the name of the file from ORA_HOME that should be ther?
FILE_SPEC
inDBA_LIBRARIES
is empty, becauseUTL_XML_LIB
is part of a static library. Why do you need this information?