<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Structured Data</title>
	
	<link>http://structureddata.org</link>
	<description>Data, Databases, Performance &amp; Scalability</description>
	<lastBuildDate>Mon, 02 Apr 2012 05:30:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/StructuredData" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="structureddata" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Visualizing Active Session History (ASH) Data With R</title>
		<link>http://structureddata.org/2011/12/20/visualizing-active-session-history-ash-data-with-r/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=visualizing-active-session-history-ash-data-with-r</link>
		<comments>http://structureddata.org/2011/12/20/visualizing-active-session-history-ash-data-with-r/#comments</comments>
		<pubDate>Tue, 20 Dec 2011 18:30:41 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[#rstats]]></category>
		<category><![CDATA[ASH]]></category>
		<category><![CDATA[AWR]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1684</guid>
		<description><![CDATA[One of the easiest ways to understand something is to see a visualization. Looking at Active Session History (ASH) data is no exception and I&#8217;ll dive into how to do so with R and how I used R plots to visually present a problem and confirm a hypothesis. But first some background&#8230; Background Frequently DBAs use the Automatic Workload Repository (AWR) as an entry point for troubleshooting performance problems and in this case the adventure started the same way. In the AWR report Top 5 Timed Foreground Events, the log file sync event was showing up as the #3 event. This needed deeper investigation as often times the cause for longer log file sync times is related to longer log file parallel write times. Top 5 Timed Foreground Events ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Avg wait % DB Event Waits Time(s) (ms) time Wait Class ------------------------------ ------------ ----------- ------ ------ ---------- log file sync 3,155,253 9,197 3 6.4 Commit Drilling into this a bit deeper the two log file events reported the following in the Foreground Wait Events and Background Wait Events sections of the AWR report: Foreground Wait Events -> s - second, ms - millisecond - 1000th of a second -> Only [...]]]></description>
			<content:encoded><![CDATA[<p>One of the easiest ways to understand something is to see a visualization.  Looking at Active Session History (ASH) data is no exception and I&#8217;ll dive into how to do so with <a href="http://www.r-project.org/" target="_blank">R</a> and how I used R plots to visually present a problem and confirm a hypothesis.  But first some background&#8230;</p>
<h3>Background</h3>
<p>Frequently DBAs use the Automatic Workload Repository (AWR) as an entry point for troubleshooting performance problems and in this case the adventure started the same way.  In the AWR report <strong>Top 5 Timed Foreground Events</strong>, the <strong>log file sync</strong> event was showing up as the #3 event.  This needed deeper investigation as often times the cause for longer <strong>log file sync</strong> times is related to longer <strong>log file parallel write</strong> times.</p>
<pre>
Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
log file sync                     3,155,253       9,197      3    6.4 Commit
</pre>
<p>Drilling into this a bit deeper the two log file events reported the following in the <strong>Foreground Wait Events</strong> and <strong>Background Wait Events</strong> sections of the AWR report:</p>
<pre>
Foreground Wait Events
-> s  - second, ms - millisecond -    1000th of a second
-> Only events with Total Wait Time (s) >= .001 are shown
-> ordered by wait time desc, waits desc (idle events last)
-> %Timeouts: value of 0 indicates value was < .5%.  Value of null is truly 0

                                                             Avg
                                        %Time Total Wait    wait    Waits   % DB
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
log file sync                 3,155,253     0      9,197       3      1.0    6.4

Background Wait Events
-> ordered by wait time desc, waits desc (idle events last)
-> Only events with Total Wait Time (s) >= .001 are shown
-> %Timeouts: value of 0 indicates value was < .5%.  Value of null is truly 0

                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
log file parallel write         659,157     0        526       1      0.2   18.2
</pre>
<p>It is generally always worth looking at the breakdown of these times, as they are <em>averages</em>.  We really want to understand the entire histogram of these wait events.  For that we can look at the <strong>Wait Event Histogram</strong> section of the AWR report as below.</p>
<pre>
Wait Event Histogram
-> Units for Total Waits column: K is 1000, M is 1000000, G is 1000000000
-> % of Waits: value of .0 indicates value was <.05%; value of null is truly 0
-> % of Waits: column heading of <=1s is truly <1024ms, >1s is truly >=1024ms
-> Ordered by Event (idle events last)

                                                    % of Waits
                                 -----------------------------------------------
                           Total
Event                      Waits   <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
-------------------------- ------ ----- ----- ----- ----- ----- ----- ----- -----
log file parallel write    661.6K  84.7   9.7   4.7    .4    .1    .3    .0
log file sync              3138.K  14.0  42.8  30.4   7.9   2.3   2.6    .1

Wait Event Histogram Detail (64 msec to 2 sec)
-> Units for Total Waits column: K is 1000, M is 1000000, G is 1000000000
-> Units for % of Total Waits:
   ms is milliseconds
   s is 1024 milliseconds (approximately 1 second)
-> % of Total Waits: total waits for all wait classes, including Idle
-> % of Total Waits: value of .0 indicates value was <.05%;
   value of null is truly 0
-> Ordered by Event (only non-idle events are displayed)

                                                 % of Total Waits
                                 -----------------------------------------------
                           Waits
                           64ms
Event                      to 2s <32ms <64ms <1/8s <1/4s <1/2s   <1s   <2s  >=2s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
log file parallel write       52 100.0    .0    .0
log file sync               3535  99.9    .1    .0
</pre>
<p>One thing that you should notice here is there are two sections of <strong>Wait Event Histogram</strong>; the buckets less than 32ms and buckets greater than 32ms.  It is also important to note that only the absence of a value means no timings fell into that bucket -- so even though the report shows .0 there are still events in that bucket (read the section description as it mentions this).</p>
<p>We can see from the second histogram section that there were 52 times that <strong>log file parallel write</strong> was over 64ms as well as 3535 times <strong>log file sync</strong> was over 64ms.  At this point a hypothesis is formed that the two events are correlated -- that is, the belief is the long <strong>log file parallel write</strong> events may be causing the long <strong>log file sync</strong>.  To find data that supports the hypothesis (or not) we can look at the Active Session History (ASH) data to get a more granular view of the wait events.</p>
<h3>Diving Into The ASHes With R</h3>
<p>If the hypothesis is correct, the ASH data should show times where we observe long <strong>log file parallel write</strong> <em>and</em> long <strong>log file sync</strong> waits.  One could write SQL against V$ACTIVE_SESSION_HISTORY to collect all the samples that may demonstrate the hypothesis -- e.g. collect a list of sample ids for each event and examine them for overlap, but a visualization is worth a thousand words.</p>
<p>One way to get quick and easy visualizations from ASH data is to use <a href="http://www.r-project.org/" target="_blank">R</a>.  R has become a very popular tool for those doing statistical analysis and it has some quite useful graphing and plotting packages built in.  R can connect to Oracle via a JDBC package which makes importing data trivial.</p>
<p>Here is a plot that I put together using R for ~300 sample ids (~5 minutes) from ASH (recall that TIME_WAITED is in microseconds):</p>
<p><img style="display:block; margin-left:auto; margin-right:auto;" src="http://structureddata.org/wp-content/uploads/2011/12/Rplot01.png" alt="Rplot01" title="Rplot01.png" border="0" width="800" height="600" /></p>
<p>As you can see from the plots, nearly every time there is a long <strong>log file parallel write</strong> there are also numerous long <strong>log file sync</strong> events.  This data supports the hypothesis.</p>
<h3>Averages Suck</h3>
<p>Most any statistician (or anyone keen on numbers and stats) will tell you averages suck.  This does not mean averages are not useful, but one has to keep in mind averages can hide infrequent outliers (which may make them a tad bit evil).  For instance, AWR is reporting an average of 1ms for <strong>log file parallel write</strong> but we can see that there are several that are falling in the 20ms range in the 5 minute capture.  If we zoom in on the graph (set the y axis at a max of 3ms), we can understand why:<br />
<img style="display:block; margin-left:auto; margin-right:auto;" src="http://structureddata.org/wp-content/uploads/2011/12/Rplot02.png" alt="Rplot02" title="Rplot02.png" border="0" width="800" height="600" /></p>
<p>Most of the <strong>log file parallel write</strong> events are coming in around 0.5ms so even with some 20ms outliers it still yields an average of 1ms, thus hiding the magnitude of impact for the outliers.  This is why drilling down into the ASH data was important for us to understand the scope of the issue.</p>
<h3>With More Data Comes More Questions</h3>
<p>At this point, the visualization of ASH data shows a strong correlation between <strong>log file sync</strong> and <strong>log file parallel write</strong> outliers, but the root cause has not yet been identified.  Perhaps there is more insight lurking in the ASH data?</p>
<p>Given this is an OLTP workload and <strong>log file parallel write</strong> is an IO operation, perhaps it's worth looking at another IO operation like, say, <strong>db file sequential read</strong> - single block IO times.</p>
<p>In the below plot, I've taken a 60 second window of ASH data and plotted all the <strong>db file sequential read</strong> events.  </p>
<p><img style="display:block; margin-left:auto; margin-right:auto;" src="http://structureddata.org/wp-content/uploads/2011/12/Rplot11.png" alt="Rplot11" title="Rplot11.png" border="0" width="800" height="600" /><br />
Interestingly, there are several samples where the IOs are taking significantly longer than most -- the majority are below 10ms but we see a few samples where there are groups in the 20ms or more range.  Let's add the <strong>log file sync</strong> events to the plot.<br />
<img style="display:block; margin-left:auto; margin-right:auto;" src="http://structureddata.org/wp-content/uploads/2011/12/Rplot12.png" alt="Rplot12" title="Rplot12.png" border="0" width="800" height="600" /><br />
Hmmm... even more interesting.  The data is showing a correlation between <strong>log file sync</strong> and <strong>db file sequential read</strong>.  Any bets on what the plot looks like if <strong>log file parallel write</strong> is added?</p>
<p><img style="display:block; margin-left:auto; margin-right:auto;" src="http://structureddata.org/wp-content/uploads/2011/12/Rplot13.png" alt="Rplot13" title="Rplot13.png" border="0" width="800" height="600" /></p>
<p>Very interesting.  The data is showing us that all 3 events are correlated strongly.  Clearly we are on to something here...and using R to plot the ASH data was an easy way to present and understand it visually.</p>
<h3>Summary</h3>
<p>While this blog post did not go into specific root cause, it was a healthy journey in debugging and data driven analysis.  Keep in mind that just because AWR averages look good, don't overlook the fact that the event histograms should also be reviewed to see if there are outliers.  R allows an easy way to put scatter plots of event times together to get a nice visual of what is going on.</p>
<h3>Source Code</h3>
<p>Here is the code I used to do the analysis.  It's pretty straight forward and well commented (I think).  Enjoy!<br />
I'll also mention that I use the <a href="http://rstudio.org/" target="_blank">RStudio</a> IDE for R.</p>
<p><script src="https://gist.github.com/1502596.js"> </script></p>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/SoWsEkuO4Mw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/12/20/visualizing-active-session-history-ash-data-with-r/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>UKOUG – A Deep Dive into the SQL Monitoring Report</title>
		<link>http://structureddata.org/2011/11/30/ukoug-a-deep-dive-into-the-sql-monitoring-report/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=ukoug-a-deep-dive-into-the-sql-monitoring-report</link>
		<comments>http://structureddata.org/2011/11/30/ukoug-a-deep-dive-into-the-sql-monitoring-report/#comments</comments>
		<pubDate>Wed, 30 Nov 2011 17:00:21 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[#ukoug2011]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1645</guid>
		<description><![CDATA[The&#160;UK Oracle User Group Conference 2011 is just around the corner and I just realized that I haven&#8217;t yet highlighted this, other than the &#8220;I&#8217;m speaking&#8221; banner on my blog. &#160;I&#8217;ll be speaking on one of my favorite and most used reports &#8212; the SQL Monitor Report. &#160;Below are the session details [direct link]. &#160;Hope to see you there! Presentation&#160;Title: A Deep Dive into the SQL Monitoring Report Speaker: Mr Greg Rahn Company: Oracle Presentation&#160;abstract: The SQL Monitoring Report was introduced in Oracle Database 11g and has become the single most used diagnostic report by the Oracle Real-World Performance Group and Oracle database development for SQL statement performance issues. This session will start with a technical overview of the SQL Monitoring Report and what metrics and information it provides. From there we&#8217;ll deep dive into numerous examples from the field explaining how this report was used to diagnose and validate performance issues. There is a wealth of information in the SQL Monitoring Report and this session will provide the necessary knowledge to best leverage it. Presentation&#160;begins: 05/12/2011 11:05 Presentation&#160;duration: 60 Presentation&#160;content&#160;level: 2 (1 = indepth, 5 = strategic overview) Audience&#160;experience: All experience levels Audience&#160;function: Product Expert, Product User Speaker&#160;biography: Greg [...]]]></description>
			<content:encoded><![CDATA[<p>The&nbsp;<a href="http://techandebs.ukoug.org/">UK Oracle User Group Conference 2011</a> is just around the corner and I just realized that I haven&#8217;t yet highlighted this, other than the &#8220;I&#8217;m speaking&#8221; banner on my blog. &nbsp;I&#8217;ll be speaking on one of my favorite and most used reports &#8212; the SQL Monitor Report. &nbsp;Below are the session details [<a href="http://2011.ukoug.org/default.asp?p=8850&amp;dlgact=shwprslist&amp;prs_prsid=6660">direct link</a>]. &nbsp;Hope to see you there!</p>
<table style="color: #000000; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 10px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: #ffffff;" border="0" cellspacing="0" cellpadding="2">
<tbody>
<tr>
<td valign="top"><strong>Presentation&nbsp;Title:</strong></td>
<td valign="top">A Deep Dive into the SQL Monitoring Report</td>
</tr>
<tr>
<td valign="top"><strong>Speaker:</strong></td>
<td valign="top">Mr Greg Rahn</td>
</tr>
<tr>
<td valign="top"><strong>Company:</strong></td>
<td valign="top">Oracle</td>
</tr>
<tr>
<td valign="top"><strong>Presentation&nbsp;abstract:</strong></td>
<td valign="top">The SQL Monitoring Report was introduced in Oracle Database 11g and has become the single most used diagnostic report by the Oracle Real-World Performance Group and Oracle database development for SQL statement performance issues. This session will start with a technical overview of the SQL Monitoring Report and what metrics and information it provides. From there we&#8217;ll deep dive into numerous examples from the field explaining how this report was used to diagnose and validate performance issues. There is a wealth of information in the SQL Monitoring Report and this session will provide the necessary knowledge to best leverage it.</td>
</tr>
<tr>
<td valign="top"><strong>Presentation&nbsp;begins:</strong></td>
<td valign="top">05/12/2011 11:05</td>
</tr>
<tr>
<td valign="top"><strong>Presentation&nbsp;duration:</strong></td>
<td valign="top">60</td>
</tr>
<tr>
<td valign="top"><strong>Presentation&nbsp;content&nbsp;level:</strong></td>
<td valign="top">2 (1 = indepth, 5 = strategic overview)</td>
</tr>
<tr>
<td valign="top"><strong>Audience&nbsp;experience:</strong></td>
<td valign="top">All experience levels</td>
</tr>
<tr>
<td valign="top"><strong>Audience&nbsp;function:</strong></td>
<td valign="top">Product Expert, Product User</td>
</tr>
<tr>
<td valign="top"><strong>Speaker&nbsp;biography:</strong></td>
<td valign="top">Greg Rahn is a database performance engineer in the Real-World Performance Group at Oracle Corporation. He joined the Real-World Performance Group in 2004 and has been working with Oracle databases since 1997. His primary focus is performance and scalability for Oracle data warehouses, specializing in Parallel Execution and Query Optimizer subject areas.</td>
</tr>
<tr>
<td valign="top"><strong>Related&nbsp;topics:</strong></td>
<td valign="top">Oracle RDBMS: Features and Options, Oracle RDBMS: Performance Managment</td>
</tr>
<tr>
<td valign="top"><strong>Hall:</strong></td>
<td valign="top">Hall 5</p>
</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/luInsxO5U-Q" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/11/30/ukoug-a-deep-dive-into-the-sql-monitoring-report/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Counting Triangles Faster</title>
		<link>http://structureddata.org/2011/10/17/counting-triangles-faster/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=counting-triangles-faster</link>
		<comments>http://structureddata.org/2011/10/17/counting-triangles-faster/#comments</comments>
		<pubDate>Mon, 17 Oct 2011 21:05:49 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Parallel Execution]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[graph analysis]]></category>
		<category><![CDATA[Vertica]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1559</guid>
		<description><![CDATA[A few weeks back one of the Vertica developers put up a blog post on counting triangles in an undirected graph with reciprocal edges. The author was comparing the size of the data and the elapsed times to run this calculation on Hadoop and Vertica and put up the work on github and encouraged others: &#8220;do try this at home.&#8221; So I did. Compression Vertica draws attention to the fact that their compression brought the size of the 86,220,856 tuples down to 560MB in size, from a flat file size of 1,263,234,543 bytes resulting in around a 2.25X compression ratio. My first task was to load the data and see how Oracle&#8217;s Hybrid Columnar Compression would compare. Below is a graph of the sizes. As you can see, Oracle&#8217;s default HCC query compression (query high) compresses the data over 2X more than Vertica and even HCC query low compression beats out Vertica&#8217;s compression number. Query Elapsed Times The closest gear I had to Vertica&#8217;s hardware was an Exadata X2-2 system &#8212; both use 2 socket, 12 core Westmere-EP nodes. While one may try to argue that Exadata may somehow influence the execution times, I&#8217;ll point out that I was using [...]]]></description>
			<content:encoded><![CDATA[<p>A few weeks back one of the Vertica developers put up a <a href="http://www.vertica.com/2011/09/21/counting-triangles/" target="_blank">blog post on counting triangles</a> in an undirected graph with reciprocal edges.  The author was comparing the size of the data and the elapsed times to run this calculation on Hadoop and Vertica and put up the work on github and encouraged others: &#8220;do try this at home.&#8221;  So I did.</p>
<h3>Compression</h3>
<p>Vertica draws attention to the fact that their compression brought the size of the 86,220,856 tuples down to 560MB in size, from a flat file size of 1,263,234,543 bytes resulting in around a 2.25X compression ratio.  My first task was to load the data and see how Oracle&#8217;s Hybrid Columnar Compression would compare.  Below is a graph of the sizes.</p>
<p><a href="http://structureddata.org/wp-content/uploads/2011/10/compression.png"><img src="http://structureddata.org/wp-content/uploads/2011/10/compression.png" alt="" title="compression" width="1408" height="958" class="aligncenter size-full wp-image-1596" /></a></p>
<p>As you can see, Oracle&#8217;s default HCC query compression (query high) compresses the data over 2X more than Vertica and even HCC query low compression beats out Vertica&#8217;s compression number.  </p>
<h3>Query Elapsed Times</h3>
<p>The closest gear I had to Vertica&#8217;s hardware was an <a href="http://www.oracle.com/technetwork/database/exadata/dbmachine-x2-2-datasheet-175280.pdf" target="_blank">Exadata X2-2</a> system &#8212; both use 2 socket, 12 core Westmere-EP nodes.  While one may try to argue that Exadata may somehow influence the execution times, I&#8217;ll point out that I was using <a href="http://download.oracle.com/docs/cd/E11882_01/server.112/e25554/px.htm#BCECBIDF">In-Memory Parallel Execution</a> so no table data was even read from spinning disk or Exadata Flash Cache &#8212; it&#8217;s all memory resident in the database nodes&#8217; buffer cache.  This seems to be inline with how Vertica executed their tests though not explicitly stated (it&#8217;s a reasonable assertion).  </p>
<p>After I loaded the data and gathered table stats, I fired off the exact same SQL query that Vertica used to count triangles to see how Oracle would compare.  I ran the query on 1, 2 and 4 nodes just like Vertica.  Below is a graph of the results.</p>
<p><a href="http://structureddata.org/wp-content/uploads/2011/10/elapsed1.png"><img src="http://structureddata.org/wp-content/uploads/2011/10/elapsed1.png" alt="" title="elapsed1" width="1408" height="958" class="aligncenter size-full wp-image-1567" /></a></p>
<p>As you can see, the elapsed times are reasonably close but overall in the favor of Oracle winning 2 of the 3 scale points as well as having a lower sum of the three executions:  Vertica 519 seconds, Oracle 487 seconds &#8212; advantage Oracle of 32 seconds.</p>
<h3>It Should Go Faster!</h3>
<p>As a database performance engineer I was thinking to myself, &#8220;it really should go faster!&#8221;  I took a few minutes to look over things to see what could make this perform better.  You might think I was looking at parameters or something like that, but you would be wrong.  After a few minutes of looking at the query and the execution plan it became obvious to me &#8212; it could go faster!  I made a rather subtle change to the SQL query and reran my experiments.  With the modified SQL query Oracle was now executing twice as fast on 1 node than Vertica was on 4 nodes.  Also, on 4 nodes, the elapsed time came in at just 14 seconds, compared to the 97 seconds Vertica reported &#8212; a difference of almost 7X!  Below are the combined results.</p>
<p><a href="http://structureddata.org/wp-content/uploads/2011/10/elapsed2.png"><img src="http://structureddata.org/wp-content/uploads/2011/10/elapsed2.png" alt="" title="elapsed2" width="1408" height="958" class="aligncenter size-full wp-image-1568" /></a></p>
<h3>What&#8217;s The Go Fast Trick?</h3>
<p>I was thinking a bit more about the problem at hand &#8212; we need to count vertices but not count them twice since they are reciprocal.  Given that for any edge, it exists in both directions, the query can be structured like Vertica wrote it &#8212; doing the filtering with a join predicate like <strong>e1.source < e2.source</strong> to eliminate the duplicates or we can simply use a single table filter predicate like <strong>source < dest</strong> <em>before</em> the join takes place.  One of the first things they taught me in query planning and optimization class was to filter early!  That notation pays off big here because the early filter cuts the rows going into the first join as well as the output of the first join by a factor of 2 &#8212; 1.8 billion rows output vs. 3.6 billion.  That&#8217;s a huge savings not only in the first join, but also in the second join as well.</p>
<p>Here is what my revised query looks like:</p>
<pre class="brush: sql; title: ; notranslate">
with
  e1 as (select * from edges where source &lt; dest),
  e2 as (select * from edges where source &lt; dest),
  e3 as (select * from edges where source &gt; dest)
select count(*)
from e1
join e2 on (e1.dest = e2.source)
join e3 on (e2.dest = e3.source)
where e3.dest = e1.source
</pre>
<h3>Summary</h3>
<p>First, I&#8217;d like to thank the Vertica team for throwing the challenge out there and being kind enough to provide the data, code and their elapsed times.  I always enjoy a challenge &#8212; especially one that I can improve upon.  Now, I&#8217;m not going to throw any product marketing nonsense out there as that is certainly not my style (and there certainly is more than enough of that already), but rather I&#8217;ll just let the numbers do the talking.  I&#8217;d also like to point out that this experiment was done without any structure other than the table.  And in full disclosure, all of my SQL commands are available as well.</p>
<p>The other comment that I would make is that the new and improved execution times really make a mockery of the exercise when comparing to Hadoop MapReduce or Pig, but I would also mention that this test case is extremely favorable for parallel pipelined databases that can perform all in-memory operations and given the data set is so small, this is the obviously the case.  Overall, in my opinion, a poor problem choice to compare the three technologies as it obviously (over) highlights the right tool for the job cliche.</p>
<p>Experiments performed on Oracle Database 11.2.0.2.</p>
<p><script src="https://gist.github.com/1289188.js"> </script></p>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/fzhZryk7yLg" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/10/17/counting-triangles-faster/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Exadata Smart Flash Logging Explained</title>
		<link>http://structureddata.org/2011/10/12/exadata-smart-flash-logging-explained/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=exadata-smart-flash-logging-explained</link>
		<comments>http://structureddata.org/2011/10/12/exadata-smart-flash-logging-explained/#comments</comments>
		<pubDate>Wed, 12 Oct 2011 22:05:36 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[11gR2]]></category>
		<category><![CDATA[Exadata]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Exadata Smart Flash Logging]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1550</guid>
		<description><![CDATA[I&#8217;ve seen some posts on the blogosphere where people attempt to explain (or should I say guess) how Exadata Smart Flash Logging works and most of them are wrong. Hopefully this post will help clear up some the misconceptions out there. The following is an excerpt from the paper entitled &#8220;Exadata Smart Flash Cache Features and the Oracle Exadata Database Machine&#8221; that goes into technical detail on the Exadata Smart Flash Logging feature. Smart Flash Logging works as follows. When receiving a redo log write request, Exadata will do parallel writes to the on-disk redo logs as well as a small amount of space reserved in the flash hardware. When either of these writes has successfully completed the database will be immediately notified of completion. If the disk drives hosting the logs experience slow response times, then the Exadata Smart Flash Cache will provide a faster log write response time. Conversely, if the Exadata Smart Flash Cache is temporarily experiencing slow response times (e.g., due to wear leveling algorithms), then the disk drive will provide a faster response time. Given the speed advantage the Exadata flash hardware has over disk drives, log writes should be written to Exadata Smart Flash [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve seen some posts on the blogosphere where people attempt to explain (or should I say guess) how Exadata Smart Flash Logging works and most of them are wrong.  Hopefully this post will help clear up some the misconceptions out there.</p>
<p>The following is an excerpt from the paper entitled &#8220;<a href="http://www.oracle.com/technetwork/database/exadata/exadata-smart-flash-cache-366203.pdf" target="_blank">Exadata Smart Flash Cache Features and the Oracle Exadata Database Machine</a>&#8221; that goes into technical detail on the Exadata Smart Flash Logging feature.</p>
<blockquote><p>
Smart Flash Logging works as follows. When receiving a redo log write request, Exadata will do<br />
parallel writes to the on-disk redo logs as well as a small amount of space reserved in the flash<br />
hardware. When either of these writes has successfully completed the database will be<br />
immediately notified of completion. If the disk drives hosting the logs experience slow response<br />
times, then the Exadata Smart Flash Cache will provide a faster log write response time.<br />
Conversely, if the Exadata Smart Flash Cache is temporarily experiencing slow response times<br />
(e.g., due to wear leveling algorithms), then the disk drive will provide a faster response time.<br />
Given the speed advantage the Exadata flash hardware has over disk drives, log writes should be<br />
written to Exadata Smart Flash Cache, almost all of the time, resulting in very fast redo write<br />
performance. This algorithm will significantly smooth out redo write response times and provide<br />
overall better database performance. </p>
<p>The Exadata Smart Flash Cache is not used as a permanent store for redo data – it is just a<br />
temporary store for the purpose of providing fast redo write response time. The Exadata Smart<br />
Flash Cache is a cache for storing redo data until this data is safely written to disk. The Exadata<br />
Storage Server comes with a substantial amount of flash storage. A small amount is allocated for<br />
database logging and the remainder will be used for caching user data. The best practices and<br />
configuration of redo log sizing, duplexing and mirroring do not change when using Exadata<br />
Smart Flash Logging. Smart Flash Logging handles all crash and recovery scenarios without<br />
requiring any additional or special administrator intervention beyond what would normally be<br />
needed for recovery of the database from redo logs. From an end user perspective, the system<br />
behaves in a completely transparent manner and the user need not be aware that flash is being<br />
used as a temporary store for redo. The only behavioral difference will be consistently low<br />
latencies for redo log writes. </p>
<p>By default, 512 MB of the Exadata flash is allocated to Smart Flash Logging. Relative to the 384<br />
GB of flash in each Exadata cell this is an insignificant investment for a huge performance<br />
benefit. This default allocation will be sufficient for most situations. Statistics are maintained to<br />
indicate the number and frequency of redo writes serviced by flash and those that could not be<br />
serviced, due to, for example, insufficient flash space being allocated for Smart Flash Logging.<br />
For a database with a high redo generation rate, or when many databases are consolidated on to<br />
one Exadata Database Machine, the size of the flash allocated to Smart Flash Logging may need<br />
to be enlarged. In addition, for consolidated deployments, the Exadata I/O Resource Manager<br />
(IORM) has been enhanced to enable or disable Smart Flash Logging for the different databases<br />
running on the Database Machine, reserving flash for the most performance critical databases.
</p></blockquote>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/Y5YYfhp6xr0" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/10/12/exadata-smart-flash-logging-explained/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Critical Skills for Performance Work</title>
		<link>http://structureddata.org/2011/09/25/critical-skills-for-performance-work/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=critical-skills-for-performance-work</link>
		<comments>http://structureddata.org/2011/09/25/critical-skills-for-performance-work/#comments</comments>
		<pubDate>Mon, 26 Sep 2011 02:16:41 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Philosophy]]></category>
		<category><![CDATA[data science]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1533</guid>
		<description><![CDATA[I was just watching John Rauser&#8217;s keynote &#8220;What is a Career in Big Data?&#8221; from last weeks Strata Conference New York and I have to say it&#8217;s an amazing talk. I would highly recommended it to anyone who does any type of data analysis, including any type of performance analysis. I found many of the &#8220;critical skill&#8221; points John made to have a strong correlation to performance analysis work as well. Some quotations that really stand out to me: On writing: &#8220;[writing]&#8230;it&#8217;s the first major difference between mediocrity and greatness.&#8221; [10:39] &#8220;If it isn&#8217;t written down, it never happened&#8230;if your writing is so opaque that people can not understand your work, then you may as well never have never done it.&#8221; [10:50] On skepticism: &#8220;If you take a skeptical attitude toward your analysis you&#8217;ll look just as hard for data that refutes your hypothesis as you will for data that confirms it. A skeptic attacks the same question from many different angles and dramatically increases their confidence in the results.&#8221; [11:40] The main reason that I wanted to highlight the critical skills of writing and skepticism is I see (read) way too many blog posts from Oracle users that fail [...]]]></description>
			<content:encoded><![CDATA[<p>I was just watching <a href="http://strataconf.com/stratany2011/public/schedule/speaker/10070">John Rauser&#8217;s</a> keynote &#8220;<em>What is a Career in Big Data?</em>&#8221; from last weeks <a href="http://strataconf.com/stratany2011">Strata Conference New York</a> and I have to say it&#8217;s an amazing talk.  I would highly recommended it to anyone who does any type of data analysis, including any type of performance analysis.</p>
<p>I found many of the &#8220;critical skill&#8221; points John made to have a strong correlation to performance analysis work as well.  Some quotations that really stand out to me:</p>
<p><strong>On writing:</strong></p>
<blockquote><p>&#8220;[writing]&#8230;it&#8217;s the first major difference between mediocrity and greatness.&#8221; [10:39]</p></blockquote>
<blockquote><p>&#8220;If it isn&#8217;t written down, it never happened&#8230;if your writing is so opaque that people can not understand your work, then you may as well never have never done it.&#8221; [10:50]</p></blockquote>
<p><strong>On skepticism:</strong></p>
<blockquote><p>&#8220;If you take a skeptical attitude toward your analysis you&#8217;ll look just as hard for data that refutes your hypothesis as you will for data that confirms it.  A skeptic attacks the same question from many different angles and dramatically increases their confidence in the results.&#8221; [11:40]</p></blockquote>
<p>The main reason that I wanted to highlight the critical skills of writing and skepticism is I see (read) way too many blog posts from Oracle users that fail on both of those skills.  The writing simply fails to clearly communicate the issue at hand, what analysis was done and what data was used to draw the conclusion.  Many blog posts also are written without any level of skepticism&#8211;they simply &#8220;report&#8221; their findings and fail to question their own work for accuracy.  I hope this talk inspires you to raise your bar when it comes to performance work.  Enjoy!</p>
<p><iframe width="767" height="420" src="http://www.youtube.com/embed/0tuEEnL61HM?rel=0&amp;hd=1" frameborder="0" allowfullscreen></iframe></p>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/0thblfpLSx4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/09/25/critical-skills-for-performance-work/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>11.2.0.3 Patch Set For Oracle Database Server</title>
		<link>http://structureddata.org/2011/09/23/11-2-0-3-patch-set-for-oracle-database-server/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=11-2-0-3-patch-set-for-oracle-database-server</link>
		<comments>http://structureddata.org/2011/09/23/11-2-0-3-patch-set-for-oracle-database-server/#comments</comments>
		<pubDate>Fri, 23 Sep 2011 15:00:25 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[11gR2]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[11.2.0.3]]></category>
		<category><![CDATA[patch]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1529</guid>
		<description><![CDATA[Just a quick post that the 11.2.0.3 patch set for Oracle Database Server has been released for x86 and x86-64 platforms (other ports will soon follow). The patchset number is 10404530 and is available for download from My Oracle Support. Also be sure to check out the 11.2.0.3 New Features Guide.]]></description>
			<content:encoded><![CDATA[<p>Just a quick post that the 11.2.0.3 patch set for Oracle Database Server has been released for x86 and x86-64 platforms (other ports will soon follow). The patchset number is <a href="https://updates.oracle.com/Orion/PatchDetails/process_form?patch_num=10404530">10404530</a> and is available for download from <a href="https://support.oracle.com/">My Oracle Support</a>.</p>
<p>Also be sure to check out the <a href="http://download.oracle.com/docs/cd/E11882_01/server.112/e22487/chapter1_11203.htm">11.2.0.3 New Features Guide</a>.</p>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/XzeZndpZhI4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/09/23/11-2-0-3-patch-set-for-oracle-database-server/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Right Practice (via James Morle’s Blog)</title>
		<link>http://structureddata.org/2011/09/16/right-practice-via-james-morles-blog/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=right-practice-via-james-morles-blog</link>
		<comments>http://structureddata.org/2011/09/16/right-practice-via-james-morles-blog/#comments</comments>
		<pubDate>Fri, 16 Sep 2011 13:47:15 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[best practice]]></category>
		<category><![CDATA[right practice]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1526</guid>
		<description><![CDATA[Great post from James Morle. Consider it required reading (and thinking). Wow, it&#8217;s been a while since I wrote a post, sorry about that! I thought that I would take a brief break from the technical postings and espouse some opinion on something that has been bothering me for a while &#8211; &#8216;Best Practices.&#8217; Best Practices have been around a long time, and started with very good intentions. In fact, one could easily claim that they are still produced with good intentions: To communicate methods that the hardware and software … Read More via James Morle&#8217;s Blog]]></description>
			<content:encoded><![CDATA[<p>Great post from James Morle.  Consider it required reading (and thinking).</p>
<blockquote style="overflow: hidden;" cite="http://jamesmorle.wordpress.com/?p=162"><p>Wow, it&#8217;s been a while since I wrote a post, sorry about that! I thought that I would take a brief break from the technical postings and espouse some opinion on something that has been bothering me for a while &#8211; &#8216;Best Practices.&#8217; Best Practices have been around a long time, and started with very good intentions. In fact, one could easily claim that they are still produced with good intentions: To communicate methods that the hardware and software … <a title="James Morle's Blog" href="http://jamesmorle.wordpress.com/?p=162">Read More</a></p></blockquote>
<p><small>via <a title="James Morle's Blog" href="http://jamesmorle.wordpress.com/?p=162">James Morle&#8217;s Blog</a></small></p>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/X2ikd85gWug" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/09/16/right-practice-via-james-morles-blog/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Reading Active SQL Monitor Reports Offline</title>
		<link>http://structureddata.org/2011/08/28/reading-active-sql-monitor-reports-offline/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=reading-active-sql-monitor-reports-offline</link>
		<comments>http://structureddata.org/2011/08/28/reading-active-sql-monitor-reports-offline/#comments</comments>
		<pubDate>Sun, 28 Aug 2011 17:25:44 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[11gR1]]></category>
		<category><![CDATA[11gR2]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[sql monitor]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1514</guid>
		<description><![CDATA[Active SQL Monitor Reports require some files from the Internet to render the report in the browser. That&#8217;s no big deal if you have an Internet connection, but what if you do not? Generally if you load an Active SQL Monitor Report without an Internet connection, you will just see an empty page in your browser. There is a little trick I use to work around this issue &#8212; it&#8217;s to have a copy of the required swf and javascript files locally. Here is how I do that on my Mac assuming a couple of things: You know how to turn on the web server/Web Sharing in System Preferences > Sharing You know how to get wget (or use curl [part of OS X] to mimic the below commands) Now edit /etc/hosts and add Now when you load an Active SQL Monitor Report it will access those files from your local web server. Don&#8217;t forget to undo the /etc/hosts entry once you are back on the Internet. Also, keep in mind that these files may change so re-download them from time to time. Option 2 &#8211; Firefox If Firefox is your browser of choice, then there is another option. Having [...]]]></description>
			<content:encoded><![CDATA[<p>Active SQL Monitor Reports require some files from the Internet to render the report in the browser.  That&#8217;s no big deal if you have an Internet connection, but what if you do not?  Generally if you load an Active SQL Monitor Report without an Internet connection, you will just see an empty page in your browser.  There is a little trick I use to work around this issue &#8212; it&#8217;s to have a copy of the required swf and javascript files locally.  Here is how I do that on my Mac assuming a couple of things:</p>
<ol>
<li>You know how to turn on the web server/Web Sharing in System Preferences > Sharing</li>
<li>You know how to get <a href="http://www.gnu.org/s/wget/">wget</a> (or use <a href="http://curl.haxx.se/">curl</a> [part of OS X] to mimic the below commands)</li>
</ol>
<pre class="brush: plain; title: ; notranslate">
# assuming you already have the web server running and have wget
cd /Library/WebServer/Documents
wget --mirror http://download.oracle.com/otn_software/emviewers/scripts/flashver.js
wget --mirror http://download.oracle.com/otn_software/emviewers/scripts/loadswf.js
wget --mirror http://download.oracle.com/otn_software/emviewers/scripts/document.js
wget --mirror http://download.oracle.com/otn_software/emviewers/sqlmonitor/11/sqlmonitor.swf
ln -s download.oracle.com/otn_software otn_software
</pre>
<p>Now edit /etc/hosts and add</p>
<pre class="brush: plain; title: ; notranslate">
127.0.0.1 download.oracle.com
</pre>
<p>Now when you load an Active SQL Monitor Report it will access those files from your local web server. Don&#8217;t forget to undo the /etc/hosts entry once you are back on the Internet.  Also, keep in mind that these files may change so re-download them from time to time.</p>
<h3>Option 2 &#8211; Firefox</h3>
<p>If <a href="http://www.mozilla.com/firefox/">Firefox</a> is your browser of choice, then there is another option.  Having successfully rendered an Active SQL Monitor Report while online, you can work offline by using File > Work Offline.  This should allow the use of a cached version of the swf and javascript files.</p>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/exuS7C5hcVs" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/08/28/reading-active-sql-monitor-reports-offline/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Creating Optimizer Trace Files</title>
		<link>http://structureddata.org/2011/08/18/creating-optimizer-trace-files/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=creating-optimizer-trace-files</link>
		<comments>http://structureddata.org/2011/08/18/creating-optimizer-trace-files/#comments</comments>
		<pubDate>Fri, 19 Aug 2011 04:17:09 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[11gR1]]></category>
		<category><![CDATA[11gR2]]></category>
		<category><![CDATA[Execution Plans]]></category>
		<category><![CDATA[Optimizer]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[10053]]></category>
		<category><![CDATA[trace]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1501</guid>
		<description><![CDATA[Many Oracle DBA&#8217;s are probably familiar with what Optimizer trace files are and likely know how to create them. When I say &#8220;Optimizer trace&#8221; more than likely you think of event 10053, right? SQL code like this probably is familiar then: In 11g, a new diagnostic events infrastructure was implemented and there are various levels of debug output that you can control for sql compilation. ORADEBUG shows us the hierarchy. My personal preference for Optimizer tracing is to stick with the most detailed level, in this case SQL_Compiler vs. just SQL_Optimizer. Given that, we can do the following in 11g: One of the big drawbacks of using the 10053 event or the SQL_Compiler event are that two things need to happen: 1) you have to have the SQL text and 2) a hard parse needs to take place (so there is actually sql compilation). What if you want to get an Optimizer trace file for a statement already executed in your database and is in the cursor cache? Chances are you know how to do #1 &#038; #2 but it&#8217;s kind of a pain, right? Even more of a pain if the query is pages of SQL or you don&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>Many Oracle DBA&#8217;s are probably familiar with what Optimizer trace files are and likely know how to create them.  When I say &#8220;Optimizer trace&#8221; more than likely you think of event 10053, right?  SQL code like this probably is familiar then:</p>
<pre class="brush: sql; title: ; notranslate">
alter session set tracefile_identifier='MY_10053';
alter session set events '10053 trace name context forever';
select /* hard parse comment */ * from emp where ename = 'SCOTT';
alter session set events '10053 trace name context off';
</pre>
<p>In 11g, a new diagnostic events infrastructure was implemented and there are various levels of debug output that you can control for sql compilation. ORADEBUG shows us the hierarchy.</p>
<pre class="brush: sql; title: ; notranslate">
SQL&gt; oradebug doc component SQL_Compiler

  SQL_Compiler                    SQL Compiler
    SQL_Parser                    SQL Parser (qcs)
    SQL_Semantic                  SQL Semantic Analysis (kkm)
    SQL_Optimizer                 SQL Optimizer
      SQL_Transform               SQL Transformation (kkq, vop, nso)
        SQL_MVRW                  SQL Materialized View Rewrite
        SQL_VMerge                SQL View Merging (kkqvm)
        SQL_Virtual               SQL Virtual Column (qksvc, kkfi)
      SQL_APA                     SQL Access Path Analysis (apa)
      SQL_Costing                 SQL Cost-based Analysis (kko, kke)
        SQL_Parallel_Optimization SQL Parallel Optimization (kkopq)
    SQL_Code_Generator            SQL Code Generator (qka, qkn, qke, kkfd, qkx)
      SQL_Parallel_Compilation    SQL Parallel Compilation (kkfd)
      SQL_Expression_Analysis     SQL Expression Analysis (qke)
      SQL_Plan_Management         SQL Plan Managment (kkopm)
    MPGE                          MPGE (qksctx)
</pre>
<p>My personal preference for Optimizer tracing is to stick with the most detailed level, in this case SQL_Compiler vs. just SQL_Optimizer.</p>
<p>Given that, we can do the following in 11g:</p>
<pre class="brush: sql; title: ; notranslate">
alter session set tracefile_identifier='MY_SQL_Compiler_TRACE';
alter session set events 'trace [SQL_Compiler.*]';
select /* hard parse comment */ * from emp where ename = 'SCOTT';
alter session set events 'trace [SQL_Compiler.*] off';
</pre>
<p>One of the big drawbacks of using the 10053 event or the SQL_Compiler event are that two things need to happen: 1) you have to have the SQL text and 2) a hard parse needs to take place (so there is actually sql compilation).  What if you want to get an Optimizer trace file for a statement already executed in your database and is in the cursor cache?  Chances are you know how to do #1 &#038; #2 but it&#8217;s kind of a pain, right?  Even more of a pain if the query is pages of SQL or you don&#8217;t have the application schema password, etc.</p>
<p>In 11gR2 (11.2) there was a procedure added to DBMS_SQLDIAG called DUMP_TRACE.  The DUMP_TRACE procedure didn&#8217;t make the <a href="http://download.oracle.com/docs/cd/E11882_01/appdev.112/e16760/d_sqldiag.htm" target="_blank">DBMS_SQLDIAG documentation</a> but here is the declaration:</p>
<pre class="brush: sql; title: ; notranslate">
-- $ORACLE_HOME/rdbms/admin/dbmsdiag.sql
-------------------------------- dump_trace ---------------------------------
-- NAME:
--     dump_trace - Dump Optimizer Trace
--
-- DESCRIPTION:
--     This procedure dumps the optimizer or compiler trace for a give SQL
--     statement identified by a SQL ID and an optional child number.
--
-- PARAMETERS:
--     p_sql_id          (IN)  -  identifier of the statement in the cursor
--                                cache
--     p_child_number    (IN)  -  child number
--     p_component       (IN)  -  component name
--                                Valid values are Optimizer and Compiler
--                                The default is Optimizer
--     p_file_id         (IN)  -  file identifier
------------------------------------------------------------------------------
PROCEDURE dump_trace(
              p_sql_id         IN varchar2,
              p_child_number   IN number   DEFAULT 0,
              p_component      IN varchar2 DEFAULT 'Optimizer',
              p_file_id        IN varchar2 DEFAULT null);
</pre>
<p>As you can see, you can specify either Optimizer or Compiler as the component name which is the equivalent of the SQL_Compiler or SQL_Optimizer events.  Conveniently you can use P_FILE_ID to add a trace file identifier to your trace file.  The four commands used above can be simplified to just a single call.  For example:</p>
<pre class="brush: sql; title: ; notranslate">
SQL&gt; begin
  2    dbms_sqldiag.dump_trace(p_sql_id=&gt;'6yf5xywktqsa7',
  3                            p_child_number=&gt;0,
  4                            p_component=&gt;'Compiler',
  5                            p_file_id=&gt;'MY_TRACE_DUMP');
  6  end;
  7  /

PL/SQL procedure successfully completed.
</pre>
<p>If we look at the trace file we can see that DBMS_SQLDIAG.DUMP_TRACE added in a comment /* SQL Analyze(1443,0) */ and did the hard parse for us (Thanks!).  </p>
<pre class="brush: plain; title: ; notranslate">
Enabling tracing for cur#=9 sqlid=as9bkjstppk0a recursive
Parsing cur#=9 sqlid=as9bkjstppk0a len=91
sql=/* SQL Analyze(1443,0) */ select /* hard parse comment */ * from emp where ename = 'SCOTT'
End parsing of cur#=9 sqlid=as9bkjstppk0a
Semantic Analysis cur#=9 sqlid=as9bkjstppk0a
OPTIMIZER INFORMATION

******************************************
----- Current SQL Statement for this session (sql_id=as9bkjstppk0a) -----
/* SQL Analyze(1443,0) */ select /* hard parse comment */ * from emp where ename = 'SCOTT'
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x16fd3a368       145  package body SYS.DBMS_SQLTUNE_INTERNAL
0x16fd3a368     12085  package body SYS.DBMS_SQLTUNE_INTERNAL
0x18e7fead8      1229  package body SYS.DBMS_SQLDIAG
0x16fdbddd0         1  anonymous block
*******************************************
</pre>
<p>Hopefully you don&#8217;t find yourself having to get too many Optimizer Trace Dumps, but if you do and you&#8217;re on 11.2, the hard work has been done for you.</p>
<p><strong>Footnote:</strong><br />
Due to a bug in DBMS_ASSERT, you will need to specify a value for P_COMPONENT.  If you leave it NULL, it will error like such:</p>
<pre class="brush: sql; title: ; notranslate">
SQL&gt; begin
  2    dbms_sqldiag.dump_trace(p_sql_id=&gt;'6yf5xywktqsa7',
  3                            p_child_number=&gt;0,
  4                            p_component=&gt;NULL,
  5                            p_file_id=&gt;'MY_TRACE_DUMP');
  6  end;
  7  /
begin
*
ERROR at line 1:
ORA-44003: invalid SQL name
ORA-06512: at 'SYS.DBMS_ASSERT', line 160
ORA-06512: at 'SYS.DBMS_SQLDIAG', line 1182
ORA-06512: at line 2
</pre>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/3vU0-xepJA8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/08/18/creating-optimizer-trace-files/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Free Online Classes from Stanford University</title>
		<link>http://structureddata.org/2011/08/17/free-online-classes-from-stanford-university/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=free-online-classes-from-stanford-university</link>
		<comments>http://structureddata.org/2011/08/17/free-online-classes-from-stanford-university/#comments</comments>
		<pubDate>Wed, 17 Aug 2011 17:49:24 +0000</pubDate>
		<dc:creator>Greg Rahn</dc:creator>
				<category><![CDATA[Oracle]]></category>

		<guid isPermaLink="false">http://structureddata.org/?p=1498</guid>
		<description><![CDATA[If you are looking to know more about the insides of databases (relational algebra, relational design theory, etc.) or machine learning or AI you may want to check out these free online classes from Stanford University. Class begins October 10! Introduction to Databases Introduction to Artificial Intelligence Introduction to Machine Learning]]></description>
			<content:encoded><![CDATA[<p>If you are looking to know more about the insides of databases (relational algebra, relational design theory, etc.) or machine learning or AI you may want to check out these free online classes from Stanford University.  Class begins October 10!  </p>
<ul>
<li><a href="http://www.db-class.org/">Introduction to Databases</a></li>
<li><a href="http://www.ai-class.com/">Introduction to Artificial Intelligence</a></li>
<li><a href="http://www.ml-class.com/">Introduction to Machine Learning</a></li>
</ul>
<img src="http://feeds.feedburner.com/~r/StructuredData/~4/rVLO_ibc8Xo" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://structureddata.org/2011/08/17/free-online-classes-from-stanford-university/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss><!-- Performance optimized by W3 Total Cache. Learn more: http://www.w3-edge.com/wordpress-plugins/

Page Caching using disk: enhanced
Database Caching 21/73 queries in 1.236 seconds using disk: basic
Object Caching 988/1145 objects using disk: basic

Served from: structureddata.org @ 2012-04-30 00:41:59 -->

