Finding Wrong Hints

2022-01-14

Introduction

I have used the Oracle Database for many years. And I use hints. For experiments, but also in production code. There are cases when you know more than the Oracle Database. For example about the cardinality of a data source or the number of result rows to process or the number of expected executions of a statement. Hints are a way to provide additional information, limit the solution space and enable the database to do a better job. That’s a good thing.

Hints Are Instructions

Hints are passed as special comments at a certain position in SQL statements. They are comments, but they are also instructions. They have to be followed. However, there are cases when hints are not applicable. For example when you request the optimizer to use an index when there is no index defined for the underlying table. In such a case the Oracle Database has basically two options. Either throw an error or ignore the invalid instruction and find another solution. The Oracle Database does the latter.

Hint Report

Starting with version 19c you can produce a hint report that reveals unused hints. Here’s an example:

Unused hint

create table t (c1 integer, c2 varchar2(20));
insert into t values (1, 'one');
insert into t values (2, 'two');
select /*+ index(t) */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /*+ index(t) */ * from t where c1 > 0
 
Plan hash value: 1601196873
 
------------------------------------------
| Id  | Operation                 | Name |
------------------------------------------
|   0 | SELECT STATEMENT          |      |
|   1 |  TABLE ACCESS STORAGE FULL| T    |
------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
 
   1 -  SEL$1 / "T"@"SEL$1"
         U -  index(t)

The hint index(t) defined on line 4 is valid, but it’s reported as unused on line 25. No wonder. There is no index defined on table t.

Let’s create an index and rerun the query.

Used hint

create unique index t_c1_i on t(c1);
select /*+ index(t) */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /*+ index(t) */ * from t where c1 > 0
 
Plan hash value: 2704710798
 
------------------------------------------------------
| Id  | Operation                           | Name   |
------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T      |
|   2 |   INDEX RANGE SCAN                  | T_C1_I |
------------------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
 
   1 -  SEL$1 / "T"@"SEL$1"
           -  index(t)

Now the hint index(t) defined on line 2 is reported as used on line 24.

Mixing Hints and Comments

What happens if we mix hints and comments? It depends on where you place the comment. Let’s look at the next example.

Hint followed by unknown tokens

select /*+ index(t) forcing unnecessary index access */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /*+ index(t) forcing unnecessary index access */ * from t where 
c1 > 0
 
Plan hash value: 2704710798
 
------------------------------------------------------
| Id  | Operation                           | Name   |
------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T      |
|   2 |   INDEX RANGE SCAN                  | T_C1_I |
------------------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 4 (E - Syntax error (3))
---------------------------------------------------------------------------
 
   1 -  SEL$1
         E -  forcing
         E -  index 
         E -  unnecessary
 
   1 -  SEL$1 / "T"@"SEL$1"
           -  index(t)

The comment forcing unnecessary index access on line 1 is interpreted as a series of hints and reported as errors on lines 24 to 26. The token access was not reported. However, the hint index(t) was reported as used on line 29.

What happens if we move the comment to the beginning?

Unknown tokens followed by hint

select /*+ forcing unnecessary index access index(t) */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /*+ forcing unnecessary index access index(t) */ * from t where 
c1 > 0
 
Plan hash value: 2704710798
 
------------------------------------------------------
| Id  | Operation                           | Name   |
------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T      |
|   2 |   INDEX RANGE SCAN                  | T_C1_I |
------------------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3 (E - Syntax error (3))
---------------------------------------------------------------------------
 
   1 -  SEL$1
         E -  forcing
         E -  index 
         E -  unnecessary

The same invalid hints are reported as before on lines 24 to 26. However, the hint index(t) was used but not reported as such. This seems to be a limitation of the current hint report in the Oracle Database 21c.

Anyways, it clearly shows that you should not mix comments and hints. Instead, you should write it like this:

Distinguish between comments and hints

select /* forcing unnecessary index access */ /*+ index(t) */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /* forcing unnecessary index access */ /*+ index(t) */ * from t 
where c1 > 0
 
Plan hash value: 2704710798
 
------------------------------------------------------
| Id  | Operation                           | Name   |
------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T      |
|   2 |   INDEX RANGE SCAN                  | T_C1_I |
------------------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
 
   1 -  SEL$1 / "T"@"SEL$1"
           -  index(t)

Now the hint index(t) is reported as used. All good, right?

The Problem

I like statically-type languages. Mainly because errors are reported at compile time whenever possible. However, to check hints I need to produce an explain plan. This is possible for a single statement only. This is cumbersome especially when you write code in PL/SQL. As far as I know, there is no option to produce a compile error for invalid hints.

I recently reviewed a system and found a lot of invalid hints. Here are some real-life hints copied from a production code base:

/*+ parallel 4 */
/*+ no_xml_query_rewrite +materialize */
/*+ materialized */
/*+ first rows cardinality (a,10) */
/*+ append nologging */
/*+ le ading(g) u se_nl(g) u se_hash(p, b) */

The last example is a kind of commented-out hint series. In this case, it’s clearly commented-out code. But if you see just a single hint like /*+ le ading(g) */ in the code, you do not know if the space after le was entered intentionally or by accident.

So, how can we identify invalid hints in our code?

Step 1 – Distinguish Between Comments and Hints

We can configure Oracle’s SQL Developer to show hints in a different colour than comments. Here’s the screenshot of an example I showed above:

Go to this GitHub repository and follow the instructions to configure your SQL Developer installation accordingly. See also this blog post for more information about the Arbori code that makes such code highlighting possible.

This step make hints stand out in your code. However, it does not reveal invalid hints.

Step 2 – Install db* CODECOP for SQL Developer

To reveal invalid hints we need a linter. A tool that does some static code analysis. db* CODECOP is such a tool suite. The SQL Developer extension is available for free. It checks the editor content for violations of the Trivadis PL/SQL & SQL Coding Guidelines. Furthermore, db* CODECOP allows you to implement custom guideline checks. The example GitHub repository provides the following four guideline checks regarding hints:

G-9600: Never define more than one comment with hints.
G-9601: Never use unknown hints.
G-9602: Always use the alias name instead of the table name.
G-9603: Never reference an unknown table/alias.

To install db* CODECOP and these additional custom guideline checks follow the instructions in this GitHub repository.

Finding Wrong Hints With db* CODECOP

I asked my followers on Twitter if this hint is valid:

The result is not really representative. However, 25% thought that /*+ +materialize */ is a valid hint.

Checking the code with db* CODECOP reveals that the hint is invalid and the majority of the poll participants were right.

Verify Result

But is the result of db* CODECOP correct? The following explain plan shows that the hint /*+ +materialize */ is not reported at all. It’s treated as a comment. Another example is where the hint report is incomplete.

Hint report for +materialize

with e as (
   select /*+ +materialize */ *
     from emp
    where deptno = 10
)
select *
  from e;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
with e as (    select /*+ +materialize */ *      from emp     where 
deptno = 10 ) select *   from e
 
Plan hash value: 3956160932
 
------------------------------------------
| Id  | Operation                 | Name |
------------------------------------------
|   0 | SELECT STATEMENT          |      |
|   1 |  TABLE ACCESS STORAGE FULL| EMP  |
------------------------------------------

Let’s run the same query after removing the extra + in the hint:

Hint report for materialize

with e as (
   select /*+ materialize */ *
     from emp
    where deptno = 10
)
select *
  from e;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
with e as (    select /*+ materialize */ *      from emp     where 
deptno = 10 ) select *   from e
 
Plan hash value: 3494145522
 
--------------------------------------------------------------------------------
| Id  | Operation                                | Name                        |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                             |
|   1 |  TEMP TABLE TRANSFORMATION               |                             |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_DFD9DB186_8AAEBD74 |
|   3 |    TABLE ACCESS STORAGE FULL             | EMP                         |
|   4 |   VIEW                                   |                             |
|   5 |    TABLE ACCESS STORAGE FULL             | SYS_TEMP_DFD9DB186_8AAEBD74 |
--------------------------------------------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
 
   2 -  SEL$1
           -  materialize

Now, the materialize hint has an effect on the execution plan and the hint is reported as used on line 33.

Conclusion

I believe that hints are required for certain use cases. You may have a different opinion. However, if you are using hints in your code you should ensure that they are valid. db* CODECOP can help you to do that. The SQL Developer extension is free. Just use it.

8 Comments

Jonathan Lewis says:

2022-01-17 at 12:41

Hair-splitting here, but I think the “+materialize” is not reported in the hint report because the + is tokenized, recognised as an illegal token, and the hint/comment text is ignored from that point onwards – so “materialize” isn’t even seen.

You’d get the same effect with e.g. /*+ no materialize */
This leads to “random” failtures due to writing style when people mix hints and comments in their hints, e.g.

explain plan for select /*+ need garbage reporting index(t1) */ * from t1
explain plan for select /*+ need to report garbage index(t1) */ * from t1

The first one reports 3 errors and one hint
The second one reports 1 error (need) and no hints because “to” was tokenized and caused the rest to be discarded

Regards
Jonathan Lewis

Philipp Salvisberg says:

2022-01-17 at 13:26

Thank you Jonathan for commenting. I adapt your second example for completeness:

create table t (c1 integer, c2 varchar2(20));
insert into t values (1, 'one');
insert into t values (2, 'two');
create unique index t_c1_i on t(c1);

select /*+ need to report garbage index(t) */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /*+ need to report garbage index(t) */ * from t where c1 > 0
 
Plan hash value: 2704710798
 
------------------------------------------------------
| Id  | Operation                           | Name   |
------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T      |
|   2 |   INDEX RANGE SCAN                  | T_C1_I |
------------------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
 
   1 -  SEL$1
         E -  need

Yes, some tokens like “+” and “to” obviously break the evaluation of a hint series. ~~However, not reported hints might still be used by the optimizer. Therefore I have to assume that hints are differently processed for the hint report. In fact, the hint report is misleading.~~

I was wrong, because the index plan is used by default. The next example should make clear that the hint is not considered anymore and the hint report shows what really was considered. It just suppresses some errors.

select /*+ need to report garbage full(t) */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /*+ need to report garbage full(t) */ * from t where c1 > 0
 
Plan hash value: 2704710798
 
------------------------------------------------------
| Id  | Operation                           | Name   |
------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T      |
|   2 |   INDEX RANGE SCAN                  | T_C1_I |
------------------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
 
   1 -  SEL$1
         E -  need

and here the plan for the statement without a conflicting comment:

select /*+ need garbage reporting full(t) */ * from t where c1 > 0;
select * from dbms_xplan.display_cursor(format => 'basic +hint_report');

EXPLAINED SQL STATEMENT:
------------------------
select /*+ need garbage reporting full(t) */ * from t where c1 > 0
 
Plan hash value: 1601196873
 
------------------------------------------
| Id  | Operation                 | Name |
------------------------------------------
|   0 | SELECT STATEMENT          |      |
|   1 |  TABLE ACCESS STORAGE FULL| T    |
------------------------------------------
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 4 (E - Syntax error (3))
---------------------------------------------------------------------------
 
   1 -  SEL$1
         E -  garbage
         E -  need
         E -  reporting
 
   1 -  SEL$1 / "T"@"SEL$1"
           -  full(t)

Jonathan Lewis says:

2022-01-17 at 13:56

Philipp,

The optimizer hasn’t actually “used the index hint”, it’s just followed the path it would have taken if the hint had not been there. If it had (noticed) used the hint it would have been listed in the hint_report. You could add a test where the hint is full(t) to confirm this.

Regards
Jonathan Lewis

Reply
- Philipp Salvisberg says:
  
  2022-01-17 at 14:01
  
  Jonathan,
  
  You’re right.
  
  After I saved my comment I thought I should check the default plan. As a result I I changed the comment. You probably answered based on my first version of the comment. I’m sorry for the confusion.
  
  Regards,
  Philipp
  
  Reply
Jonathan Lewis says:

2022-01-17 at 15:12

A little detail that’s not commonly known – there are some hints that require you to use the table name (even the schema.table) rather than the alias.

One I know of (thanks to an article by Roger MacNicol: http://orasql.org/2019/04/16/correct-syntax-for-the-table_stats-hint/ ) is the table_stats() hint. Two more, I assume, but haven’t tested in detail yet, would be the index_stats() and column_stats() hints.

It makes sense for these hints because if a table appears several times in a query you’d expect the optimizer to insist that the statistics on that table (and its indexes and columns) were consistent for the whole query.

(You can also see in v$sql_hints that they’re statement-level hints, so you couldn’t make them vary for different query blocks – and in fact the hints would be ignored if you included the @query_block part that can be used in so many other hints.)

Regards
Jonathan Lewis

Reply
- Philipp Salvisberg says:
  
  2022-01-17 at 16:36
  
  An important remark. Thank you.
  
  The checks regarding “G-9602: Always use the alias name instead of the table name” are applied only for chosen hints (e.g. leading). table_stats, index_stats and column_stats are not among them. For these three hints only the name is checked. However, based on your comment I create two issues on the custom validator GitHub repo:
  - #46: Hints: Add test cases for table_stats, index_stats and column_status
  - #47: Add guideline checks for TABLE_STATS hint
  BTW: @query_block references are currently not checked. This is a known and documented limitation and might lead to some false negatives.
  
  Reply
Pantea Tourang says:

2023-11-20 at 08:58

Very informative article. I learned new stuff. Thank you

Reply
- Philipp Salvisberg says:
  
  2023-11-20 at 11:47
  
  Thank you, Pantea.
  
  Reply