Back around the era of Oracle 6 or 7, I had an intuition that a certain type of query could be written within the limits of SQL*Plus syntax, but wasn’t quite clear as to how. I was doing some ad-hoc DBA work for a development database once a week or so, and wished to get a handle on who was doing what how frequently in terms of object creation. I envisioned a tabular matrix tallying kinds of objects by user with the aim of watching the changes over time. After doing some legwork exploring the sys.obj$ table and hitting upon using DECODE to trap specific object/user pairs into desired columns, I came up with this:
w h i t e s p a c e
SELECT username,
COUNT( DECODE(O.type#, 2, O.obj#, ”) ) Tbl,
COUNT( DECODE(O.type#, 1, O.obj#, ”) ) Idx,
COUNT( DECODE(O.type#, 5, O.obj#, ”) ) Syn,
COUNT( DECODE(O.type#, 4, O.obj#, ”) ) Vue,
COUNT( DECODE(O.type#, 6, O.obj#, ”) ) Seq,
COUNT( DECODE(O.type#, 7, O.obj#, ”) ) Proc,
COUNT( DECODE(O.type#, 8, O.obj#, ”) ) Func,
COUNT( DECODE(O.type#, 9, O.obj#, ”) ) Pkg,
COUNT( DECODE(O.type#,12, O.obj#, ”) ) Trig
FROM
sys.obj$ O, sys.dba_users U
WHERE
U.user_id = O.owner# (+)
GROUP BY username
ORDER BY 2 DESC
/
w h i t e s p a c e
This yields the following result (about two decades later!) within an Oracle 11g2 test database:
w h i t e s p a c e
USR TBL IDX SYN VUE SEQ PROC FUNC PKG TRIG
================ ====== ====== ====== ====== ====== ====== ====== ====== ======
SYS 951 1144 9 3747 114 89 92 582 4
APEX_040000 426 1362 54 175 3 19 12 234 439
SYSTEM 154 232 8 12 20 1 0 1 2
XDB 79 450 0 5 5 4 6 34 27
MDSYS 76 237 0 65 6 1 106 28 36
CTXSYS 49 61 0 76 3 2 2 74 0
HR 7 19 0 1 3 2 0 0 2
ROB 5 1 0 0 0 0 0 0 0
OUTLN 3 5 0 0 0 1 0 0 0
FLOWS_FILES 1 5 5 0 0 0 0 0 1
APEX_PUBLIC_USER 0 0 0 0 0 0 0 0 0
================ ====== ====== ====== ====== ====== ====== ====== ====== ======
Grand Totals: 1751 3516 76 4081 154 119 218 953 511
11 rows selected.
w h i t e s p a c e
view Formatting SQL
Although this SQL snippet seemed quite a clever achievement at the time, it actually represents a kind of kludge. Using multiple DECODEs is not very legible, but they were necessary. This style of query forces a switching of roles between column and row values – the reason they’ve been christened pivot queries in Oracle 11g. The values for type#, codes indicating table, view, or index, and so on, would occupy row values within a typical query. Instead they have been made to serve as column headings. (The other element which makes the trick work is using an aggregate function, COUNT in this case, to tally up the resulting sparse matrix via GROUP BY.)
Now, have a look at the equivalent syntax in Oracle 11g utilizing the new PIVOT keyword. It produces the same query results as above:
w h i t e s p a c e
SELECT
Username, Tbl, Idx, Syn, Vue, Seq, Proc, Func, Pkg, Trig
FROM
(SELECT username, O.type#, O.obj#
FROM sys.obj$ O, sys.dba_users U
WHERE U.user_id = O.owner# (+) )
PIVOT
(COUNT(obj#) FOR type# IN
(2 AS Tbl,
1 AS Idx,
5 AS Syn,
4 AS Vue,
6 AS Seq,
7 AS Proc,
8 AS Func,
9 AS Pkg,
12 AS Trig )
)
ORDER BY 2 DESC
/
w h i t e s p a c e
I must admit that at first my old-school proclivities preferred the earlier syntax. I found it more intuitive because the GROUP BY is made explicit, and it didn’t rely upon a nested subquery. But I’ve settled into the pivot idea now; it has become more natural. The key, as always with SQL, is to conceptualize what is going on for each verb, set-theoretically where possible. PIVOT is a more complex verb, since, as Tom Kyte has put it, it operates a bit like a blending of WHERE and GROUP BY. A comparison of the explain plans for these two formulations validates that they are essentially syntactic variants of the same query. In both cases the plan decomposes as follows, the only difference being that the HASH JOIN operation in the plan uses GROUP BY and GROUP BY PIVOT, respectively:
SELECT – SORT – HASH – HASH JOIN – Access Predicates – Table Accesses(FULL)
Deconstructing the pivot query in stages will enable you to see what is going on. Isolate on the subquery above (in red) for a moment. If you executed it solo the result would be three long columns of data: username, a numeric code for object type, and a unique numeric identifier for a specific object. Try it!
view subquery output
What the PIVOT clause does is count occurrences of a chosen subset of possible values for the object type and then display these counts as separate columns. The ORDER BY clause was arbitrary in this query. Generally, you can use this template to implement any kind of aggregate value distribution question against a large table, using SUM, AVG, MAX, COUNT, etc. – making it a natural tool for data warehousing and preliminary data-mining surveying expeditions.
Odds & Ends
As a dataist’s aside, here’s a little SQL utility I’ve often used to gain an overview about how specific column values are distributed within an unfamiliar table, perhaps in an unfamiliar database.
w h i t e s p a c e
SELECT COUNT(*), &&which_grouping_column
FROM &which_table
GROUP BY &&which_grouping_column
ORDER BY 1 DESC
/
UNDEF which_grouping_column;
UNDEF which_table;
w h i t e s p a c e
The run example shows it applied against the sys.obj$ table we’ve already mentioned, but without the decoded translations used previously. Interesting to note that nowadays Oracle apparently has more than 100 different types of objects it deals with!
w h i t e s p a c e
Enter value for which_grouping_column: type#
Enter value for which_table: sys.obj$
COUNT(*) TYPE#
========== ========
4081 4
4051 5
3516 1
1753 2
1602 13
. . .
3 74
2 101
1 59
1 57
1 0
46 rows selected.
w h i t e s p a c e
Finally, a point of interest I gleaned from Oracle’s Technet articles: PIVOT has an XML option for output. Notice that in my original query I was selectively concerned with just a few different types of database objects. And due to this, I had to expressly enumerate them (either in the DECODE statements or within the IN list) – meaning: I had to already know these specific values. If you don’t know all the values, or simply prefer XML formatting with large amounts of data, then the XML variant with the keyword ANY is for you.
w h i t e s p a c e
SELECT * FROM
(SELECT username, O.type#, O.obj#
FROM sys.obj$ O, sys.dba_users U
WHERE U.user_id = O.owner# (+)
)
PIVOT XML
(COUNT(obj#)
FOR type# IN (ANY)
)
/
w h i t e s p a c e
Note: The XML string is a CLOB datatype, so make sure you set long high within SQL*PLus. The result of the query looks like this:
USERNAME TYPE#_XML
================ ===================================================================
APEX_040000 <PivotSet><item><column name = “TYPE#”>1</column><column name . . .
APEX_PUBLIC_USER <PivotSet><item><column name = “TYPE#”></column><column name =. . .
CTXSYS <PivotSet><item><column name = “TYPE#”>1</column><column name . . .
. . .
w h i t e s p a c e
More Info
Oracle Developer-Net has a nice thorough look at PIVOT syntax, including a discussion of the UNPIVOT keyword. And for examples of pivoting within a Data Warehousing context, see this link. You can also read the follow-up post to this pivot query introduction, to see examples of multiple column pivoting in Oracle.
~RS