Avoid Implicit Type Conversion in JSON Access

Avoid Implicit Type Conversion in JSON Access

Introduction

Before comparing two values, the Oracle Database automatically ensures that both values have the same data type. It converts one of the values to match the data type of the other value. The SQL Language Reference manual describes when and how implicit data conversions happen. However, Oracle recommends that you convert data types explicitly. This ensures consistent results and better performance.

I am using JSON relational duality views in my current project. During development, I have stumbled across some implicit data conversions that I was unaware of and that are causing poor performance. I will use a simplified example to show you what I mean.

  1. Setup
  2. JSON-relational Duality View
  3. JSON Data Types
  4. Explicit Conversion
  5. JSON Collection View
  6. Conclusion

1. Setup

I’ve tested this example with an Oracle Database 23.6 and 23.7.

In a schema of your choice you can run the following setup script:

1) Setup table dept and emp
drop table if exists emp;
drop table if exists dept;

create table dept (
   deptno number(2, 0)      not null constraint dept_pk primary key,
   dname  varchar2(14 char) not null,
   loc    varchar2(13 char) not null
);

create table emp (
   empno    number(4, 0)      not null  constraint emp_pk primary key,
   ename    varchar2(10 char) not null,
   job      varchar2(9 char)  not null,
   mgr      number(4, 0)                constraint emp_mgr_fk references emp,
   hiredate date              not null,
   sal      number(7, 2)      not null,
   comm     number(7, 2),
   deptno   number(2, 0)      not null  constraint emp_deptno_fk references dept
);

insert into dept (deptno, dname, loc)
values (10, 'ACCOUNTING', 'NEW YORK'),
       (20, 'RESEARCH',   'DALLAS'),
       (30, 'SALES',      'CHICAGO'),
       (40, 'OPERATIONS', 'BOSTON');

insert into emp (empno, ename, job, mgr, hiredate, sal, comm, deptno)
values (7566, 'JONES',  'MANAGER',   7839, date '1981-04-02', 2975, null, 20),
       (7698, 'BLAKE',  'MANAGER',   7839, date '1981-05-01', 2850, null, 30),
       (7782, 'CLARK',  'MANAGER',   7839, date '1981-06-09', 2450, null, 10),
       (7788, 'SCOTT',  'ANALYST',   7566, date '1987-04-19', 3000, null, 20),
       (7902, 'FORD',   'ANALYST',   7566, date '1981-12-03', 3000, null, 20),
       (7499, 'ALLEN',  'SALESMAN',  7698, date '1981-02-20', 1600,  300, 30),
       (7521, 'WARD',   'SALESMAN',  7698, date '1981-02-22', 1250,  500, 30),
       (7654, 'MARTIN', 'SALESMAN',  7698, date '1981-09-28', 1250, 1400, 30),
       (7844, 'TURNER', 'SALESMAN',  7698, date '1981-09-08', 1500,    0, 30),
       (7900, 'JAMES',  'CLERK',     7698, date '1981-12-03',  950, null, 30),
       (7934, 'MILLER', 'CLERK',     7782, date '1982-01-23', 1300, null, 10),
       (7369, 'SMITH',  'CLERK',     7902, date '1980-12-17',  800, null, 20),
       (7839, 'KING',   'PRESIDENT', null, date '1981-11-17', 5000, null, 10),
       (7876, 'ADAMS',  'CLERK',     7788, date '1987-05-23', 1100, null, 20);

begin
   dbms_stats.gather_table_stats(user, 'dept');
   dbms_stats.gather_table_stats(user, 'emp');
end;
/

alter session set nls_date_format = 'YYYY-MM-DD';
column deptno format a6
column dname format a10
column loc format a9
column empid format a8
column empno format a5
column ename format a6
column mgr format a4
column sal format a4
column comm format a4

select * from dept;
select * from emp;
Table EMP dropped.


Table DEPT dropped.


Table DEPT created.


Table EMP created.


4 rows inserted.


14 rows inserted.


PL/SQL procedure successfully completed.


Session altered.


DEPTNO DNAME      LOC      
------ ---------- ---------
    10 ACCOUNTING NEW YORK 
    20 RESEARCH   DALLAS   
    30 SALES      CHICAGO  
    40 OPERATIONS BOSTON   


EMPNO ENAME  JOB        MGR HIREDATE    SAL COMM DEPTNO
----- ------ --------- ---- ---------- ---- ---- ------
 7566 JONES  MANAGER   7839 1981-04-02 2975          20
 7698 BLAKE  MANAGER   7839 1981-05-01 2850          30
 7782 CLARK  MANAGER   7839 1981-06-09 2450          10
 7788 SCOTT  ANALYST   7566 1987-04-19 3000          20
 7902 FORD   ANALYST   7566 1981-12-03 3000          20
 7499 ALLEN  SALESMAN  7698 1981-02-20 1600  300     30
 7521 WARD   SALESMAN  7698 1981-02-22 1250  500     30
 7654 MARTIN SALESMAN  7698 1981-09-28 1250 1400     30
 7844 TURNER SALESMAN  7698 1981-09-08 1500    0     30
 7900 JAMES  CLERK     7698 1981-12-03  950          30
 7934 MILLER CLERK     7782 1982-01-23 1300          10
 7369 SMITH  CLERK     7902 1980-12-17  800          20
 7839 KING   PRESIDENT      1981-11-17 5000          10
 7876 ADAMS  CLERK     7788 1987-05-23 1100          20

14 rows selected. 

The model has primary and foreign keys which are required for the duality views. However, the constraints do not need to be enabled.

2. JSON-relational Duality View

Now let’s create an updateable duality view for the model we created earlier.

2) Create duality view dept_dv
create or replace json duality view dept_dv as
dept @insert @update @delete
{
   _id: deptno
   dname
   loc
   emps: emp @insert @update @delete
      {
         empno
         ename
         job
         emp @unnest @link(from: [mgr])
            {
               mgr    : empno @nocheck
               mgrname: ename @nocheck
            }
         hiredate
         sal
         comm
      }
};
Json duality view DEPT_DV created.

We can query the department 10 as follows:

3) Query deptno 10 in dept_dv
column json_data format a50
alter session disable parallel query;
set pagesize 1000

select json_serialize(dv.data returning varchar2 pretty) as json_data
  from dept_dv dv
 where dv.data."_id" = 10;
Session altered.


JSON_DATA
--------------------------------------------------
{
  "_id" : 10,
  "_metadata" :
  {
    "etag" : "E0146035FE26EE16D4968A21E6350D81",
    "asof" : "00002604AFC8A4BC"
  },
  "dname" : "ACCOUNTING",
  "loc" : "NEW YORK",
  "emps" :
  [
    {
      "empno" : 7782,
      "ename" : "CLARK",
      "job" : "MANAGER",
      "mgr" : 7839,
      "mgrname" : "KING",
      "hiredate" : "1981-06-09T00:00:00",
      "sal" : 2450,
      "comm" : null
    },
    {
      "empno" : 7839,
      "ename" : "KING",
      "job" : "PRESIDENT",
      "hiredate" : "1981-11-17T00:00:00",
      "sal" : 5000,
      "comm" : null
    },
    {
      "empno" : 7934,
      "ename" : "MILLER",
      "job" : "CLERK",
      "mgr" : 7782,
      "mgrname" : "CLARK",
      "hiredate" : "1982-01-23T00:00:00",
      "sal" : 1300,
      "comm" : null
    }
  ]
}

The where clause in line 7 contains an implicit conversion. It’s not that obvious and the execution plan does not help to spot it either.

4) Execution plan
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  8nns7xgu0v6vu, child number 4
-------------------------------------
select json_serialize(dv.data returning varchar2 pretty) as json_data   
from dept_dv dv  where dv.data."_id" = 10
 
Plan hash value: 2166484641
 
---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP     |     1 |    10 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMP_PK  |     1 |       |     0   (0)|          |
|   3 |  SORT GROUP BY              |         |     1 |    38 |            |          |
|*  4 |   TABLE ACCESS FULL         | EMP     |     5 |   190 |     3   (0)| 00:00:01 |
|   5 |  TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    20 |     1   (0)| 00:00:01 |
|*  6 |   INDEX UNIQUE SCAN         | DEPT_PK |     1 |       |     0   (0)|          |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("OUTER_ALIAS2"."EMPNO"=:B1)
   4 - filter("OUTER_ALIAS1"."DEPTNO"=:B1)
   6 - access("DEPTNO"=10)

The highlighted lines show that a unique index scan has been performed for deptno 10. This looks good. So where does the implicit conversion take place?

3. JSON Data Types

The condition dv.data."_id" = 10 in the where clause compares a JSON data type with a numeric data type. As a result, the Oracle Database needs to convert one of the values. It decided to convert the JSON to a numeric value. That’s what we see in the execution plan.

Really? Can we prove this claim? Yes, with the following statement.

5) Determine JSON data type
select is_json_condition, value
  from (
          select deptno is json as is_json,
                 deptno is json(value) as is_json_value,
                 deptno is json(array) as is_json_array,
                 deptno is json(object) as is_json_object,
                 deptno is json(scalar) as is_json_scalar,
                 deptno is json(scalar number) as is_json_scalar_number,
                 deptno is json(scalar string) as is_json_scalar_string,
                 deptno is json(scalar binary_double) as is_json_scalar_binary_double,
                 deptno is json(scalar binary_float) as is_json_scalar_binary_float,
                 deptno is json(scalar date) as is_json_scalar_date,
                 deptno is json(scalar timestamp) as is_json_scalar_timestamp,
                 deptno is json(scalar timestamp with time zone) as is_json_scalar_timestamp_with_time_zone,
                 deptno is json(scalar null) as is_json_scalar_null,
                 deptno is json(scalar boolean) as is_json_scalar_boolean,
                 deptno is json(scalar binary) as is_json_scalar_binary,
                 deptno is json(scalar interval year to month) as is_json_scalar_interval_year_to_month,
                 deptno is json(scalar interval day to second) as is_json_scalar_interval_day_to_second
            from (select dv.data."_id" as deptno from dept_dv dv where rownum = 1)
       ) src unpivot (
          value for is_json_condition in (
             is_json,
             is_json_value,
             is_json_array,
             is_json_object,
             is_json_scalar,
             is_json_scalar_number,
             is_json_scalar_string,
             is_json_scalar_binary_double,
             is_json_scalar_binary_float,
             is_json_scalar_date,
             is_json_scalar_timestamp,
             is_json_scalar_timestamp_with_time_zone,
             is_json_scalar_null,
             is_json_scalar_boolean,
             is_json_scalar_binary,
             is_json_scalar_interval_year_to_month,
             is_json_scalar_interval_day_to_second
          )
       );
IS_JSON_CONDITION                       VALUE
--------------------------------------- -----
IS_JSON                                 true 
IS_JSON_VALUE                           true 
IS_JSON_ARRAY                           false
IS_JSON_OBJECT                          false
IS_JSON_SCALAR                          true 
IS_JSON_SCALAR_NUMBER                   true 
IS_JSON_SCALAR_STRING                   false
IS_JSON_SCALAR_BINARY_DOUBLE            false
IS_JSON_SCALAR_BINARY_FLOAT             false
IS_JSON_SCALAR_DATE                     false
IS_JSON_SCALAR_TIMESTAMP                false
IS_JSON_SCALAR_TIMESTAMP_WITH_TIME_ZONE false
IS_JSON_SCALAR_NULL                     false
IS_JSON_SCALAR_BOOLEAN                  false
IS_JSON_SCALAR_BINARY                   false
IS_JSON_SCALAR_INTERVAL_YEAR_TO_MONTH   false
IS_JSON_SCALAR_INTERVAL_DAY_TO_SECOND   false

17 rows selected. 

In line 20 we query dv.data."_id" and use the IS JSON condition to determine the data type. The most granular type is JSON scalar number.

4. Explicit Conversion

So, we know now, that we need to convert a JSON scalar number to a number. And how do we do that? – By using a SQL/JSON path expression method, as in the next example.

7) Query deptno 10 in dept_dv with explicit type conversion
select json_serialize(dv.data returning varchar2 pretty) as json_data
  from dept_dv dv
 where dv.data."_id".number() = 10;

We used the .number() method in this example. The execution plan for this query looks now like this:

8) Execution plan querying dept_dv with explicit type conversion
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  4rjg8g02bph16, child number 2
-------------------------------------
select json_serialize(dv.data returning varchar2 pretty) as json_data   
from dept_dv dv  where dv.data."_id".number() = 10
 
Plan hash value: 2166484641
 
---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP     |     1 |    10 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMP_PK  |     1 |       |     0   (0)|          |
|   3 |  SORT GROUP BY              |         |     1 |    38 |            |          |
|*  4 |   TABLE ACCESS FULL         | EMP     |     5 |   190 |     3   (0)| 00:00:01 |
|   5 |  TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    20 |     1   (0)| 00:00:01 |
|*  6 |   INDEX UNIQUE SCAN         | DEPT_PK |     1 |       |     0   (0)|          |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("OUTER_ALIAS2"."EMPNO"=:B1)
   4 - filter("OUTER_ALIAS1"."DEPTNO"=:B1)
   6 - access("DEPTNO"=10)

The plan looks the same as without calling the .number() method, but in this case, we did not rely on implicit type conversion.

We’ve proven, that a .number() call does not make things worse. But are there cases where such an explicit type conversion results in a better execution plan?

5. JSON Collection View

Let’s say we need a view that filters the data in the duality view. For example, for security purposes. One way to achieve this is to create a JSON collection view. From a consumer point of view, such a view behaves 100% the same as a duality view. The only difference is that it is read-only and allows the full SQL grammar to define such a view.

Here’s an example.

9) Create JSON collection view dept_cv
create or replace json collection view dept_cv as
select dv.data
  from dept_dv dv
 where dv.data."_id".number() = 10 or dv.data.loc.string() = 'DALLAS';
Json collection view DEPT_CV created.

Now, let’s run a query with implicit type conversion.

10) Query deptno 10 in dept_cv
column json_data format a50
alter session disable parallel query;
set pagesize 1000

select json_serialize(cv.data returning varchar2 pretty) as json_data
  from dept_cv cv
 where cv.data."_id" = 10;
Session altered.


JSON_DATA
--------------------------------------------------
{
  "_id" : 10,
  "_metadata" :
  {
    "etag" : "E0146035FE26EE16D4968A21E6350D81",
    "asof" : "00002604B0C08AAE"
  },
  "dname" : "ACCOUNTING",
  "loc" : "NEW YORK",
  "emps" :
  [
    {
      "empno" : 7782,
      "ename" : "CLARK",
      "job" : "MANAGER",
      "mgr" : 7839,
      "mgrname" : "KING",
      "hiredate" : "1981-06-09T00:00:00",
      "sal" : 2450,
      "comm" : null
    },
    {
      "empno" : 7839,
      "ename" : "KING",
      "job" : "PRESIDENT",
      "hiredate" : "1981-11-17T00:00:00",
      "sal" : 5000,
      "comm" : null
    },
    {
      "empno" : 7934,
      "ename" : "MILLER",
      "job" : "CLERK",
      "mgr" : 7782,
      "mgrname" : "CLARK",
      "hiredate" : "1982-01-23T00:00:00",
      "sal" : 1300,
      "comm" : null
    }
  ]
}

The execution plan of the query above looks like this:

11) Execution plan querying dept_cv with implicit type conversion
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID  gkhradvxjrwfq, child number 4
-------------------------------------
select json_serialize(cv.data returning varchar2 pretty) as json_data   
from dept_cv cv  where cv.data."_id" = 10
 
Plan hash value: 1318549807
 
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |     9 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP    |     1 |    10 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMP_PK |     1 |       |     0   (0)|          |
|   3 |  SORT GROUP BY              |        |     1 |    38 |            |          |
|*  4 |   TABLE ACCESS FULL         | EMP    |     5 |   190 |     3   (0)| 00:00:01 |
|*  5 |  TABLE ACCESS FULL          | DEPT   |     1 |    20 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("OUTER_ALIAS2"."EMPNO"=:B1)
   4 - filter("OUTER_ALIAS1"."DEPTNO"=:B1)
   5 - filter((("DEPTNO"=10 OR "LOC"='DALLAS') AND 
              JSON_VALUE(JSON_SCALAR("DEPTNO" JSON NULL ON NULL ) FORMAT OSON , '$' 
              RETURNING NUMBER NULL ON ERROR TYPE(STRICT) )=10))
 
SQL Analysis Report (identified by operation id/Query Block Name/Object Alias):
-------------------------------------------------------------------------------
 
   5 -  SEL$95C0DFA4 / "OUTER_ALIAS0"@"SEL$3"
           -  The following columns have predicates which preclude their 
              use as keys in index range scan. Consider rewriting the 
              predicates.
                "DEPTNO"

The SQL analysis report at the bottom is interesting. It clearly states that no index range scan was used and that we should rewrite the query to avoid implicit type conversion.

Let’s do that.

12) Query deptno 10 in dept_cv with explicit type conversion
select json_serialize(cv.data returning varchar2 pretty) as json_data
  from dept_cv cv
 where cv.data."_id".number() = 10;

And now the execution plan has changed for the better.

13) Execution plan querying dept_cv with explicit type conversion
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  cca5xrgndnmkd, child number 2
-------------------------------------
select json_serialize(cv.data returning varchar2 pretty) as json_data   
from dept_cv cv  where cv.data."_id".number() = 10
 
Plan hash value: 2166484641
 
---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP     |     1 |    10 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMP_PK  |     1 |       |     0   (0)|          |
|   3 |  SORT GROUP BY              |         |     1 |    38 |            |          |
|*  4 |   TABLE ACCESS FULL         | EMP     |     5 |   190 |     3   (0)| 00:00:01 |
|   5 |  TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    20 |     1   (0)| 00:00:01 |
|*  6 |   INDEX UNIQUE SCAN         | DEPT_PK |     1 |       |     0   (0)|          |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("OUTER_ALIAS2"."EMPNO"=:B1)
   4 - filter("OUTER_ALIAS1"."DEPTNO"=:B1)
   6 - access("DEPTNO"=10)

Making an index unique scan possible.

6. Conclusion

Always use a SQL/JSON path expression method (binary(), boolean(), date(), dateWithTime(), number(), string(), …) when comparing a JSON value to a non-JSON value in SQL. This way you avoid implicit type conversions, improve the readability of your code, and give the Oracle Database everything it needs to create an optimal execution plan.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.