Your first and best strategy is to avoid sorts when possible, because they consume substantial CPU time, memory, and disk. When this cannot be done, which is most often the case, sorting needs to be tuned.
Sorting can take place fully in memory, and that is the desired case. More likely, however, the sort spills over to disk, especially with large tables, and disk sorting can be extremely time-consuming even with the best physical database design.
The second-best strategy is to sort in memory as much as possible and sort on disk only when absolutely necessary. Of course, this implies allocating sufficient temporary disk space (in effect, the TEMP tablespace) and separating this space physically from the rest of the Oracle datafiles, rollback segments, and redo logs.
Who generates sorts? The CREATE INDEX statement obviously requires a sort operation on the index key to build the B*tree structure. ALTER INDEX ... REBUILD requires the same sort. However, you can choose to sort the data at the OS level and then create the index with the NOSORT option. This doesn't usually buy you anything, unless you happen to already have a sorted copy of the data, because you are only trading RDBMS sorting for OS sorting, which isn't much of a trade. Other options include using a fast, third-party sorting utility, such as SyncSort, or using Oracle's Parallel Query Option (PQO) with SQL*Loader to load the data in parallel, unsorted.
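As a sketch of the NOSORT option, assuming the rows of a hypothetical EMP table were already loaded in EMPNO order (for example, pre-sorted at the OS level before a SQL*Loader load):

```sql
-- NOSORT tells Oracle the data is already in key order, so no sort is
-- performed; the statement fails if the rows are found to be out of order.
-- Table, column, and index names here are illustrative.
CREATE INDEX emp_empno_idx ON emp (empno) NOSORT;
```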
ORDER BY and GROUP BY usually require sorts. However, an ORDER BY on an indexed column uses the already sorted index in most circumstances. To verify this, use the EXPLAIN PLAN command.
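A minimal sketch of such a check, assuming a hypothetical EMP table with an index on EMPNO and a standard PLAN_TABLE (created by the utlxplan.sql script):

```sql
-- Generate the plan for an ORDER BY on an indexed column.
EXPLAIN PLAN SET STATEMENT_ID = 'ord1' FOR
  SELECT ename FROM emp ORDER BY empno;

-- Inspect the plan: the absence of a SORT (ORDER BY) step indicates
-- the index order satisfied the ORDER BY without a sort.
SELECT operation, options, object_name
FROM   plan_table
WHERE  statement_id = 'ord1';
```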
The DISTINCT qualifier must use a sorting technique to eliminate duplicate column values (again, unless it is used on a column that already has a unique index). Likewise, a UNION must eliminate duplicate rows. (A UNION ALL, by definition, allows duplicate rows; because it doesn't eliminate duplicates, it doesn't require sorting. If primary keys are enforced on the two UNIONable tables, there won't be any duplicates to start with, so UNION ALL is a recommended substitute for the UNION operation.) INTERSECT and MINUS require some duplicate elimination, though nowhere near the burden of a UNION operation. Similarly, IN and NOT IN can require sorting, especially when they support nested subqueries. A join operation requires sorts of whatever tables do not already have existing indexes on the join key. The more usual situation, though, is for tables to be joined on primary keys (which already have unique indexes), thereby negating the need to sort any of the tables. The following list sums up the SQL commands and operators that can trigger sorts:
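To illustrate the UNION ALL substitution described above, assuming two hypothetical tables DEPT_A and DEPT_B whose enforced primary keys guarantee no overlapping rows:

```sql
-- Same result set as UNION in this situation, but without the
-- duplicate-eliminating sort that UNION would perform.
SELECT empno, ename FROM dept_a
UNION ALL
SELECT empno, ename FROM dept_b;
```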
The primary parameters affecting sort operations are:
If a sort operation requires more memory than SORT_AREA_RETAINED_SIZE allows for an in-memory sort, it attempts to perform the sort within SORT_AREA_SIZE as an external disk sort, allocating a temporary segment in the process. If the sort operation requires still more memory, it splits the sort burden into multiple sort runs and allocates multiple temporary segments for that purpose. The server process sorts one segment at a time and returns the merger of the sorted segments as the result. These memory allocations are not stored in the SGA shared pool; instead, they are part of the UGA. If you are using MTS, they are part of the SGA shared pool because the UGA is relocated there anyway.
Using EXPLAIN PLAN, you can see that many SQL statements can require multiple sorts within their execution plans. The sort that is currently executing is known as the active sort. A join sort is a sort in support of a join operation. Any active sort requires SORT_AREA_SIZE. Any join sort requires SORT_AREA_RETAINED_SIZE. These settings hold true only for the dedicated server. For PQO, each parallel query server requires SORT_AREA_SIZE, and the two sets of parallel servers can be working at once. So, for PQO, set the following values:
For PQO, the optimal value is 1MB; higher values haven't yielded better performance. In general, SORT_AREA_SIZE = SORT_AREA_RETAINED_SIZE, except for MTS, which requires some special considerations. For MTS, set SORT_AREA_RETAINED_SIZE much smaller than SORT_AREA_SIZE. As a guideline, you can set:
SORT_AREA_RETAINED_SIZE = SORT_AREA_SIZE / (number of expected concurrent sorts), but not less than SORT_AREA_SIZE / 10
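As a worked sketch of this guideline, assuming a 1MB SORT_AREA_SIZE and enough concurrent sorts to drive the retained size down to the 1/10 floor (both parameters can also be set per session, as shown; the figures are assumptions, not recommendations):

```sql
ALTER SESSION SET SORT_AREA_SIZE = 1048576;           -- 1MB
ALTER SESSION SET SORT_AREA_RETAINED_SIZE = 104858;   -- ~SORT_AREA_SIZE / 10
```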
Temporary (sort) segments must be created when a sort cannot take place fully in memory. That is, as discussed, when a sort operation's memory requirements exceed SORT_AREA_RETAINED_SIZE, it requires the allocation of a temporary segment and attempts to work within SORT_AREA_SIZE. A temporary tablespace cannot contain any permanent objects and consists solely of a single sort segment. Temporary tablespaces are created with the CREATE or ALTER TABLESPACE ... TEMPORARY syntax. Again, these temporary tablespaces are made up of one segment, created on the fly by the first sort that requires it. This segment grows in extents as sort concurrency and operation size increase.
A useful guide in setting the extent parameters is INITIAL = NEXT = (max size as prescribed by datafile or disk) / (number of expected concurrent sorts). The number of expected concurrent sorts can be estimated roughly as twice the number of concurrent queries. This is a case where you don't want one large extent sized just below the datafile size, which would normally be a good recommendation for general-use (permanent) tablespaces. Also, set INITIAL = NEXT = some multiple of SORT_AREA_SIZE, plus at least one block for the overhead of the extent header, because you don't want any single sort to require more than one extent. At the same time, you can afford to have a few sorts stored in the same extents, due to the random nature of concurrent access. Set PCTINCREASE to 0 because you don't want any surprises, such as increasingly large NEXT extents. Besides, concurrency again plays a factor here: having equal-sized extents is a fair approach, barring actual sizing techniques, and it works well with random size requirements (no single sort's need is too far from the average).
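A hypothetical example putting these guidelines together; the file name, datafile size, and extent sizes are assumptions for illustration, not recommendations:

```sql
-- Dedicated temporary tablespace: equal-sized extents (a small multiple
-- of an assumed 1MB SORT_AREA_SIZE) and PCTINCREASE 0, on a datafile
-- physically separated from the permanent datafiles, rollback, and redo.
CREATE TABLESPACE temp
  DATAFILE '/u05/oradata/prod/temp01.dbf' SIZE 200M
  DEFAULT STORAGE (INITIAL 2M NEXT 2M PCTINCREASE 0)
  TEMPORARY;
```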
In the SGA, a memory structure known as the Sort Extent Pool (SEP) tracks the extents that make up the single sort segments belonging to the temporary tablespaces. When sort space is requested by a process, this pool offers free extents (those that were allocated and used by an earlier running process and are now free but not deallocated) for reuse, much like the reuse of buffers in the database buffer cache. Furthermore, the V$SORT_SEGMENT view contains information such as the number of users and blocks using the temporary sort segments. You can use it to determine efficiency (hits) and to help size your extents properly.
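A sketch of such a query against V$SORT_SEGMENT (the column list here follows the documented view; verify against your release):

```sql
-- A high ratio of used to free extents, or MAX_SORT_SIZE approaching
-- the extent size, suggests the extents may be undersized.
SELECT tablespace_name, current_users, total_extents,
       used_extents, free_extents, max_sort_size
FROM   v$sort_segment;
```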
Oracle offers the capability of having sorts bypass the database buffer cache; this is called sort direct writes. Of course, you still need SORT_AREA_SIZE bytes, but each sort operation can have its own memory buffers and write them directly to disk. The size of the buffers is set by the init.ora parameter SORT_WRITE_BUFFER_SIZE (32-64KB), and the number of buffers is set by SORT_WRITE_BUFFERS (2-8). Each regular (serial) sort operation requires a Sort Direct Writes Buffer of:
(SORT_WRITE_BUFFERS x SORT_WRITE_BUFFER_SIZE ) + SORT_AREA_SIZE
For PQO, each (parallel) sort requires:
((SORT_WRITE_BUFFERS x SORT_WRITE_BUFFER_SIZE) + SORT_AREA_SIZE) x 2 x (degree of parallelism)
The init.ora parameter SORT_DIRECT_WRITES determines whether sorting uses the database buffer cache. If set to AUTO, the default, and SORT_AREA_SIZE >= 10 x the Sort Direct Writes Buffer, the Sort Direct Writes Buffer is used. If set to FALSE, sort writes are buffered in the database buffer cache before being written back out to disk; these are normal sort buffer writes. If set to TRUE, sort writes are always sort direct writes. VLDBs, DSSs, and data warehouses should normally have this set to TRUE (or at least left at AUTO).
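A worked instance of the serial formula above, with assumed values of SORT_WRITE_BUFFERS = 4, SORT_WRITE_BUFFER_SIZE = 65536 (64KB), and SORT_AREA_SIZE = 1048576 (1MB):

```sql
-- (4 x 65536) + 1048576 = 262144 + 1048576 = 1310720 bytes (~1.25MB)
-- per serial sort operation.
SELECT (4 * 65536) + 1048576 AS serial_sort_bytes FROM dual;
```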
In general, always use the cost-based optimization approach. The rule-based approach is available for the benefit of existing applications, but new optimizer functionality uses the cost-based approach. The following features are available only with cost-based optimization.
The cost-based approach generally chooses an execution plan that is as good as or better than the plan chosen by the rule-based approach. This is especially true for large queries with multiple joins or multiple indexes. To maintain the effectiveness of the cost-based optimizer, you must keep statistics current.
To use cost-based optimization for a statement, collect statistics for the tables accessed by the statement and enable cost-based optimization using one of these methods:
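One such method, sketched here with an assumed EMP table, is to gather statistics and set the optimizer mode at the session level:

```sql
-- Collect statistics so the cost-based optimizer has input to work with.
ANALYZE TABLE emp COMPUTE STATISTICS;

-- CHOOSE uses cost-based optimization whenever statistics are available.
ALTER SESSION SET OPTIMIZER_MODE = CHOOSE;
```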
Oracle still supports rule-based optimization, but you should design new applications to use cost-based optimization. You should also use cost-based optimization for data warehousing applications, because the cost-based optimizer supports new and enhanced features for DSS. You should eventually migrate your existing applications to the cost-based approach, because the rule-based approach will eventually no longer be available in the Oracle Server.
An outline consists primarily of a set of hints that is equivalent to the optimizer's results for the execution plan generation of a particular SQL statement. When Oracle creates an outline, Plan Stability examines the optimization results using the same data used to generate the execution plan. That is, Oracle uses the input to the execution plan to generate an outline, not the execution plan itself.
Oracle uses one of two scenarios when compiling SQL statements and matching them with outlines. In the first scenario, if you disable outline use by setting the system or session parameter USE_STORED_OUTLINES to FALSE, Oracle does not attempt to match SQL text to outlines. The second scenario involves the following two matching steps.
First, if you specify that Oracle must use a particular outline category, only outlines in that category are candidates for matching. Second, if the SQL text of the incoming statement exactly matches the SQL text in an outline in that category, Oracle considers both texts identical and uses the outline. Oracle considers any difference a mismatch.
Differences include spacing changes, carriage return variations, embedded hints, or even differences in comment text. These rules are identical to the rules for cursor matching.
How Oracle Stores Outlines:
Oracle stores outline data in the OL$ table and hint data in the OL$HINTS table. Unless you remove them, Oracle retains outlines indefinitely. Oracle retains execution plans in cache and re-creates them only if they become invalid or if the cache is not large enough to hold all of them. The only effect outlines have on caching execution plans is that the outline's category name is used in addition to the SQL text to identify whether the plan is in cache. This ensures that Oracle does not use an execution plan compiled under one category to execute a SQL statement that it should compile under a different category.
Oracle can automatically create outlines for all SQL statements, or you can create them for specific SQL statements. In either case, the outlines derive their input from the rule-based or cost-based optimizers. Oracle creates stored outlines automatically when you set the parameter CREATE_STORED_OUTLINES to TRUE. When activated, Oracle creates outlines for all executed SQL statements. You can also create stored outlines for specific statements using the CREATE OUTLINE statement.
To use stored outlines when Oracle compiles a SQL statement, set the system parameter USE_STORED_OUTLINES to TRUE or to a category name. If you set USE_STORED_OUTLINES to TRUE, Oracle uses outlines in the DEFAULT category. If you specify a category, Oracle uses outlines in that category until you reset USE_STORED_OUTLINES to FALSE. If you specify a category name and Oracle does not find an outline in that category that matches the SQL statement, Oracle searches for an outline in the DEFAULT category. You can access information about outlines and the related hint data that Oracle stores in the USER_OUTLINES and USER_OUTLINE_HINTS views.
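A sketch of creating and activating an outline for a single statement; the category name DEV, the outline name, and the query are illustrative:

```sql
-- Store an outline for one statement under the DEV category.
CREATE OUTLINE emp_by_dept FOR CATEGORY dev ON
  SELECT ename FROM emp WHERE deptno = 10;

-- Have this session match incoming SQL against the DEV category
-- (falling back to DEFAULT when no match is found).
ALTER SESSION SET USE_STORED_OUTLINES = dev;

-- Confirm the outline exists and whether it has been used.
SELECT name, category, used FROM user_outlines;
```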
OLTP applications are high-throughput, insert/update-intensive systems. These systems are characterized by growing volumes of data that several hundred users access concurrently. Typical OLTP applications are airline reservation systems and banking applications. The key goals of OLTP systems are availability (7x24), speed (throughput), concurrency, and recoverability.
In these types of databases, you must avoid excessive use of indexes and clusters because these structures slow down insert and update activity.
The following elements are crucial for OLTP systems: rollback segments; indexes, clusters, and hashing; discrete transactions; data block size; dynamic allocation of space to tables and rollback segments; transaction processing monitors and the multi-threaded server; the shared pool; well-tuned SQL statements; integrity constraints; client/server architecture; dynamically changeable initialization parameters; and procedures, packages, and functions.
Decision support system (DSS) or data warehousing applications typically convert large amounts of information into user-defined reports. They perform queries on the large amounts of data gathered from OLTP applications. Decision makers use these applications to determine what strategies the organization should take. An example of a decision support system is a marketing tool that determines the buying patterns of consumers based on information gathered from demographic studies. The key goals of a data warehousing (DSS) system are response time, accuracy, and availability.
The key to performance in a DSS is properly tuned queries and the proper use of indexes, clusters, and hashing.
The following issues are crucial in implementing and tuning a DSS: materialized views, indexes (B*tree and bitmap), clusters, hashing, data block size, parallel execution, star queries, the optimizer, using hints in queries, and PL/SQL functions in SQL statements.
One way to improve response time in a DSS is to use parallel execution. This feature enables multiple processes to work simultaneously on a single SQL statement. By spreading the processing over many processes, Oracle can execute complex statements more quickly than if only a single server process handled them. Symmetric multiprocessing (SMP), clustered, and massively parallel systems gain the largest performance benefits from parallel execution, because operations can be effectively spread among many CPUs on a single system.
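A minimal sketch of requesting parallel execution for one query, assuming a hypothetical SALES table and a degree of parallelism of 4:

```sql
-- The PARALLEL hint asks the optimizer to scan SALES with 4
-- parallel query servers; s is the table alias named in the hint.
SELECT /*+ PARALLEL(s, 4) */ COUNT(*)
FROM   sales s;
```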
In an MTS environment, just as there is an SGA, there is also a User Global Area (UGA), which contains user session information, sort areas, and private SQL areas. Normally, the default RDBMS configuration (also known as dedicated server) results in a one-to-one mapping of user processes to server processes.
With MTS, the UGA is moved up into the shared pool. The remaining process-specific memory is retained in the Process Global Area (PGA) and holds information that cannot be shared. In this way, the total amount of memory required with MTS is not really more than with the dedicated server; it is just redistributed. However, you have to increase SHARED_POOL_SIZE. To help size the UGA that is relocated to the SGA, use:
SELECT SUM(value)
FROM   V$SESSTAT SE, V$STATNAME SN
WHERE  SN.name = 'max session memory'
AND    SE.statistic# = SN.statistic#;
This yields the maximum amount of UGA session memory used since instance startup. You may wish to sample this result often and then increase SHARED_POOL_SIZE by that amount. From Oracle8 onward, you can use the statistic
'session uga memory max'
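With that Oracle8 statistic name, the sizing query above becomes:

```sql
-- Sum of the peak UGA memory across sessions since instance startup.
SELECT SUM(se.value)
FROM   v$sesstat se, v$statname sn
WHERE  sn.name = 'session uga memory max'
AND    se.statistic# = sn.statistic#;
```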
With MTS, you have some control over the distribution of server versus user memory. Two init.ora parameters that affect user memory are:
The parameter SESSION_CACHED_CURSORS can be set to the expected maximum number of session cursors to be cached in the user's memory area, as long as reparsing is kept low. This helps offload server memory requirements at the expense of increasing user memory. Optionally, set this parameter when statements are frequently reused.
The default setting for CLOSE_CACHED_OPEN_CURSORS is FALSE, meaning that cursors are not closed on COMMIT. Optionally, set this to TRUE if SQL statements are rarely reused. Make sure these two parameters do not conflict; both should reflect the same assumption about whether statements are reused.
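For a session whose statements are frequently reused, a sketch of the cursor-caching side of this trade-off (the value 50 is an assumption, not a recommendation):

```sql
-- Cache up to 50 session cursors in user memory to avoid reparsing.
ALTER SESSION SET SESSION_CACHED_CURSORS = 50;
```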