Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
200 views
in Technique[技术] by (71.8m points)

sql - Fastest way to update column with rownum

I have a table contains data for 3 months , every day contains about 100,000,000 records. I want to update the table with rownum in a column (as a serial) for each day. I tried the two simplest ways, db sequence ( it took long time), and a simple update statement :

update /*+ append parallel */ my_table
set my_row_num = rownum
WHERE  my_date = DATE'2019-11-11';

and it took about 112 minutes.

I want the fastest way to update the table.

p.s: the table is portioned by date column but it is not the same one that I'm using to update there is an index on the date column that I'm using to update.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Ensure that your update statement is using both parallel reads and parallel writes - replace the append hint with enable_parallel_dml.

Sample table and data:

--drop table my_table;
create table my_table(my_row_num number, my_date date);
insert into my_table select 1, date '2019-11-11' from dual connect by level <= 100000;
begin
    dbms_stats.gather_table_stats(user, 'my_table');
end;
/
commit;

The append hint is only intended for inserts. And while the parallel hint is meant to apply to the whole statement, it does not necessarily apply to the DML part. Unless you have already ran alter session enable parallel dml;, your update will not be fully parallelized. In the below execution plan, not that the first PX operation does not happen until after both the UPDATE operations.

explain plan for
update /*+ append parallel */ my_table
set my_row_num = rownum
WHERE  my_date = DATE'2019-11-11';


select * from table(dbms_xplan.display);


Plan hash value: 1791278450
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT       |          |   100K|  1074K|    39   (3)| 00:00:01 |        |      |            |
|   1 |  UPDATE                | MY_TABLE |       |       |            |          |        |      |            |
|   2 |   COUNT                |          |       |       |            |          |        |      |            |
|   3 |    PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   4 |     PX SEND QC (RANDOM)| :TQ10000 |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   5 |      PX BLOCK ITERATOR |          |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| MY_TABLE |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   6 - filter("MY_DATE"=TO_DATE(' 2019-11-11 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
 
   0 -  STATEMENT
         U -  parallel
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 2
   - PDML is disabled in current session

But with the new 12c hint enable_parallel_dml, the second UPDATE is a child of a PX operation. This implies that the writing will now also be done in parallel.

explain plan for
update /*+ enable_parallel_dml parallel */ my_table
set my_row_num = rownum
WHERE  my_date = DATE'2019-11-11';

select * from table(dbms_xplan.display);


Plan hash value: 3185872929
 
--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                 |          |   100K|  1074K|    39   (3)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR                  |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)            | :TQ10002 |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,02 | P->S | QC (RAND)  |
|   3 |    UPDATE                        | MY_TABLE |       |       |            |          |  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE                   |          |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,02 | PCWP |            |
|   5 |      PX SEND HASH (BLOCK ADDRESS)| :TQ10001 |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,01 | S->P | HASH (BLOCK|
|   6 |       BUFFER SORT                |          |   100K|  1074K|            |          |  Q1,01 | SCWP |            |
|   7 |        COUNT                     |          |       |       |            |          |  Q1,01 | SCWP |            |
|   8 |         PX RECEIVE               |          |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,01 | SCWP |            |
|   9 |          PX SEND 1 SLAVE         | :TQ10000 |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,00 | P->S | 1 SLAVE    |
|  10 |           PX BLOCK ITERATOR      |          |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,00 | PCWC |            |
|* 11 |            TABLE ACCESS FULL     | MY_TABLE |   100K|  1074K|    39   (3)| 00:00:01 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
  11 - filter("MY_DATE"=TO_DATE(' 2019-11-11 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
 
   0 -  STATEMENT
         U -  parallel
 
Note
-----
   - automatic DOP: Computed Degree of Parallelism is 2

But fully tuning parallel statements is tough. You'll probably want to use select dbms_sqltune.report_sql_monitor(sql_id => 'your SQL_ID') from dual to ensure that your statement is requesting enough parallel sessions (is the parallel hint and configuration good enough?), the system is giving it enough parallel sessions (are there enough parallel sessions available?), and the statement is able to use the parallel sessions (is there a serialization point, perhaps with the rownum function that will prevent this statement from fully using parallelism?).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
...