HelloDBA [中文]
Search Internet Search HelloDBA
  Oracle Technology Site. email: fuyuncat@gmail.com  MSN: fuyuncat@hotmail.com Add to circle  acoug  acoug 

All about oracle sorting

[中文]

Author:  fuyuncat

Source:  www.HelloDBA.com

Date:  2010-03-18 03:17:15

Share to  Twitter Facebook GMail Blogger Orkut Google Bookmarks Sina Weibo QQ Renren Tieba Kaixin Douban Taobao

sort arithmetic   

    During sorting, data will first be input into a private memory area named sort area, and be sorted in memory. If the data to be sorted is too large to be handle once, the sorted data will be directly written (no caching) into temporary tablespace as a data set. Once all of data be written into disk, oracle will merge those data sets recursively, till to all data be merged in one data set.
 

Initial Runs

    The process that sort data in-memory is called initial run. Up to 80~90% of the sort_area_size may be used for read buffers during an initial run, and the rest is used for write buffers. If we know how many data (considering output records, average length and block size, it could be estimated) need be sorted, we can estimate the number of initial runs by below formula.
    Initial Runs = CEIL(SORT_DATA_BLOKCS/ROUND(0.8*SORT_AREA_SIZE/BLOCK_SIZE))

Merges

    2 or more data sets could be merged once. The number data sets be merged once is called merge width. Merging is also be processed in sort area. To merge the data set, need first directly read them from disk. Just like MBRC, mutliple blocks could be read in one io, which is controlled by SMRC (sort_multiblock_read_count, who become a hidden parameter from 9i). However, the value of this parameter is not the actual SMRC, it actually is to limit the MAX SMRC. To merge data sets, at least 2 data set must be selected. And correspond to asynchronous io (disk_asynch_io=TRUE), 2 read buffers required. So, the actual SMRC could be calculated as below.
    SMRC = MIN(ROUND(SORT_AREA_SIZE/(2*2*BLOCK_SIZE))-1, _sort_multiblock_read_count)

    Then, the merge width could be calculated base on the SMRC,
    merge width = ROUND(SORT_AREA_SIZE/(2*SMRC*BLOCK_SIZE))-1

    It will keep merging, untill all data are merged in one data set. The process will be recursive. For the last data sets, if the number is less than merge width, they will be put into next round. We have below function to calculate the number of merges.
 

    Note: All of the merges except the last one are named Intermediate runs.
 

SQL代码
  1. HELLODBA.COM>create or replace function sort_merges(init_runs number, width number)   
  2.   2  return number   
  3.   3  as  
  4.   4   i number:=init_runs;   
  5.   5   n number:=0;   
  6.   6  begin  
  7.   7    while i>width loop   
  8.   8      n := n+floor(i/width);   
  9.   9      i := floor(i/width)+ mod(i,width);   
  10.  10    end loop;   
  11.  11    if i > 0 then  
  12.  12      n := n+1;   
  13.  13    end if;   
  14.  14    return n;   
  15.  15  end;   
  16.  16  /   
  17.   
  18. Function created.   

    Look into an example. Here we have 65536b sort area, 721 blocks estimated to be sorted with 8k block size. We can get the sorting statistics data by below query.
 

SQL代码
  1. HELLODBA.COM>select init_runs,SMRC,merge_width,sort_merges(init_runs,merge_width) merges from  
  2.   2  (select init_runs,   
  3.   3         SMRC,   
  4.   4         round(sort_area_size / (2 * SMRC * block_size), 0) - 1 as merge_width   
  5.   5    from (select sort_blocks, sort_area_size, block_size, ceil(sort_blocks / round(0.8 * sort_area_size / block_size, 0)) init_runs,   
  6.   6                 round(sort_area_size / (2 * 2 * block_size), 0) - 1 SMRC   
  7.   7            from (select 721 as sort_blocks, 65536 as sort_area_size, 8192 as block_size from dual)));   
  8.   
  9.  INIT_RUNS       SMRC MERGE_WIDTH     MERGES   
  10. ---------- ---------- ----------- ----------   
  11.        121          1           3         60   
  12.   
  13. 1 row selected.   

new sort algorithm

    From 10gR2, new sort algorithm is introduced, which controled by hidden parameter "_new_sort", default is true. Here is its improvement.
 

  *In-memory sort.
    Less momory required for those small data set sorting in memory.
  *Merge Sort
    Will save keys for those initial data sets, and will choose the data sets to be merged by comparing the keys. And you will see a new statistics entry from the trace file:
       Comparisons while searching for key in-memory 120
    The temporary blocks will be more possible to be re-used when mergeing, save disk blocks.
 

Trace

    10031, 10032, 10033 event traces will help us to understand the sorting. Below is the trace contents of previous example.
 

SQL代码
  1. Recording run at 401ca9 for 6 blocks   
  2. Recording run at 401cb0 for 6 blocks   
  3. Recording run at 401cb6 for 6 blocks   
  4. Recording run at 401d3c for 6 blocks   
  5. Recording run at 401d42 for 6 blocks   
  6. Recording run at 401d48 for 6 blocks   
  7. ...   
  8. ---- Sort Parameters ------------------------------   
  9. sort_area_size                    65536   
  10. sort_area_retained_size           65536   
  11. sort_multiblock_read_count        1   
  12. max intermediate merge width      3   
  13. Merging run at 401d1a for 1 blocks   
  14. Merging run at 401c3a for 6 blocks   
  15. Total number of blocks to read: 7 blocks   
  16. Recording run at 401d1b for 7 blocks   
  17. Merging run at 401c40 for 6 blocks   
  18. Merging run at 401c46 for 6 blocks   
  19. Merging run at 401c5c for 6 blocks   
  20. Total number of blocks to read: 18 blocks   
  21. Recording run at 401d22 for 17 blocks   
  22. ...   
  23. ---- Sort Statistics ------------------------------   
  24. Initial runs                              121   
  25. Intermediate runs                         59   
  26. Number of merges                          60   
  27. Input records                             47582   
  28. Output records                            47582   
  29. Disk blocks 1st pass                      721   
  30. Total disk blocks used                    896   
  31. Total number of comparisons performed     651813   
  32.   Comparisons performed by in-memory sort 320660   
  33.   Comparisons performed during merge      331033   
  34.   Comparisons while searching for key in-memory 120   
  35. Temp segments allocated                   1   
  36. Extents allocated                         56   
  37. Uses version 2 sort   
  38. Uses asynchronous IO   
  39.     ---- Run Directory Statistics ----   
  40. Run directory block reads (buffer cache)  240   
  41. Block pins (for run directory)            1   
  42. Block repins (for run directory)          239   
  43.     ---- Direct Write Statistics -----   
  44. Write slot size                           8192   
  45. Write slots used during in-memory sort    2   
  46. Number of direct writes                   2884   
  47. Num blocks written (with direct write)    2884   
  48. Block pins (for sort records)             2884   
  49. Waits for async writes                    1731   
  50.     ---- Direct Read Statistics ------   
  51. Size of read slots for merge phase        8192   
  52. Number of read slots for merge phase      6   
  53. Size of read slots for output             8192   
  54. Number of read slots for output           8   
  55. Number of direct sync reads               1498   
  56. Number of blocks read synchronously       1498   
  57. Number of direct async reads              1386   
  58. Number of blocks read asynchronously      1386   
  59. Waits for async reads                     347   
  60. ---- End of Sort Statistics -----------------------   

   --- Fuyuncat ---

Top

Copyright ©2005, HelloDBA.Com All reseverd.

Declaration
by fuyuncat