ubTools Support http://jira.ubtools.com/jira/secure/IssueNavigator.jspa?reset=true&type=11&pid=10042&sorter/field=issuekey&sorter/order=DESC An XML representation of a search request en-us RE: [QA-57] ORA-04030 returned by "__libc_sbrk(0x0000000001010020) Err#12 ENOMEM" http://jira.ubtools.com/jira/browse/QA-57?focusedCommentId=31971#action_31971 Tue, 28 Feb 2017 09:01:46 +0000 ubTools Support There was no response from the system admin. However, the cause was a resource limit that prevented the Oracle user from allocating memory. <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-57">QA-57</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=31971#action_31971 RE: [QA-60] "PRVF-5507 : NTP daemon or service is not running on any node ..." even if NTP is running. http://jira.ubtools.com/jira/browse/QA-60?focusedCommentId=31194#action_31194 Sat, 5 Mar 2016 15:22:48 +0000 ubTools Support <b>Solution</b> <p>No PID file ("/var/run/ntpd.pid") was defined in "/etc/sysconfig/ntpd". The problem was solved by setting OPTIONS as below:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>#OPTIONS="-g" OPTIONS="-x -g -p /var/run/ntpd.pid" </pre> </div></div> <p>Additional note:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>NTP has been replaced by Chrony(new feature) in Oracle Linux 7. 
</pre> </div></div> <p><em>Ref: Oracle Note: Unable to Configure NTP after Oracle Linux 7 Installation (Doc ID 1995703.1)</em></p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-60">QA-60</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=31194#action_31194 RE: [QA-60] "PRVF-5507 : NTP daemon or service is not running on any node ..." even if NTP is running. http://jira.ubtools.com/jira/browse/QA-60?focusedCommentId=31193#action_31193 Sat, 5 Mar 2016 15:17:19 +0000 ubTools Support <b>CVU Trace:</b> <p><ins>Generating Trace:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>$ export CV_TRACELOC=/tmp $ export SRVM_TRACE=true $ ./runcluvfy.sh stage -pre crsinst -n sygnx01,sygnx02 -verbose ..... </pre> </div></div> <p><ins>Excerpt from the trace:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>[20421@***.***.com] [Worker 0] [ 2016-03-05 16:20:46.657 EET ] [RuntimeExec.runCommand:77] /tmp/CVU_11.2.0.4.0_grid/exectask.sh -chkfile /var/run/ntpd.pid [20421@***.***.com] [Worker 0] [ 2016-03-05 16:20:46.659 EET ] [RuntimeExec.runCommand:142] runCommand: Waiting for the process [20421@***.***.com] [Thread-216] [ 2016-03-05 16:20:46.659 EET ] [StreamReader.run:61] In StreamReader.run [20421@***.***.com] [Thread-217] [ 2016-03-05 16:20:46.659 EET ] [StreamReader.run:61] In StreamReader.run [20421@***.***.com] [Thread-216] [ 2016-03-05 16:20:46.668 EET ] [StreamReader.run:65] OUTPUT&gt;&lt;CV_VRES&gt;1&lt;/CV_VRES&gt;&lt;CV_LOG&gt;Exectask: file check failed&lt;/CV_LOG&gt;&lt;CV_ERES&gt;0&lt;/CV_ERES&gt; ..... 
[20421@sygnx01.sankomenkul.com] [main] [ 2016-03-05 16:20:46.669 EET ] [TaskDaemonLiveliness.displayDaemonLivelinessOutput:283] Daemon 'ntpd' is not running on node: 'sygnx01' </pre> </div></div> <p>"/var/run/ntpd.pid" doesn't exist.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-60">QA-60</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=31193#action_31193 RE: [QA-60] "PRVF-5507 : NTP daemon or service is not running on any node ..." even if NTP is running. http://jira.ubtools.com/jira/browse/QA-60?focusedCommentId=31192#action_31192 Sat, 5 Mar 2016 15:12:26 +0000 ubTools Support <b>NTP status:</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>[root@sygnx01 ~]# systemctl status ntpd ntpd.service - Network Time Service Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled) Active: active (running) since Sat 2016-03-05 14:32:46 EET; 1h 29min ago Process: 1074 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS) Main PID: 1081 (ntpd) CGroup: /system.slice/ntpd.service 1081 /usr/sbin/ntpd -u ntp:ntp -x -g </pre> </div></div> <p>NTP is running.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-60">QA-60</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=31192#action_31192 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. 
http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30385#action_30385 Fri, 2 Oct 2015 13:45:05 +0000 ubTools Support The focus here is to show how the CPU scaling governor affects Oracle service and wait times, not how to tune Oracle wait events such as "row cache lock" above. <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30385#action_30385 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30367#action_30367 Wed, 30 Sep 2015 09:24:23 +0000 ubTools Support <b>CPU TIME and LOGICAL READS:</b> <p><b>Data:</b> </p> <table class='confluenceTable'><tbody> <tr> <th class='confluenceTh'>&nbsp;</th> <th class='confluenceTh'>ondemand</th> <th class='confluenceTh'>performance</th> <th class='confluenceTh'>Difference(%)</th> </tr> <tr> <td class='confluenceTd'>CPU time per second</td> <td class='confluenceTd'>7.8s</td> <td class='confluenceTd'>8.3s</td> <td class='confluenceTd'>6.4</td> </tr> <tr> <td class='confluenceTd'>Logical reads per second</td> <td class='confluenceTd'>535,236.2</td> <td class='confluenceTd'>622,625.2</td> <td class='confluenceTd'>16.3</td> </tr> <tr> <td class='confluenceTd'>CPU time per logical read</td> <td class='confluenceTd'>14.6us</td> <td class='confluenceTd'>13.3us</td> <td class='confluenceTd'>8.9</td> </tr> </tbody></table> <p><b>Analysis:</b> </p> <p>An 8.9% improvement in CPU time per logical read resulted in a 31.4% improvement in DB time. 
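</p> <p>The Difference(%) column in the table above can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python and is not part of the original test: the function and variable names are illustrative assumptions, and the inputs are the figures copied from the table.</p>

```python
# Figures copied from the table above: (ondemand, performance).
CPU_TIME_PER_SEC = (7.8, 8.3)                   # CPU seconds per second
LOGICAL_READS_PER_SEC = (535_236.2, 622_625.2)  # buffer gets per second
CPU_US_PER_READ = (14.6, 13.3)                  # microseconds, as rounded in the table


def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


def cpu_us_per_read(cpu_seconds: float, reads: float) -> float:
    """CPU microseconds spent per logical read."""
    return cpu_seconds / reads * 1_000_000


if __name__ == "__main__":
    # Derive the third table row from the first two.
    for label, cpu, reads in (
        ("ondemand", CPU_TIME_PER_SEC[0], LOGICAL_READS_PER_SEC[0]),
        ("performance", CPU_TIME_PER_SEC[1], LOGICAL_READS_PER_SEC[1]),
    ):
        print(f"{label}: {cpu_us_per_read(cpu, reads):.1f} us per logical read")

    # Percentage differences between the two governors.
    print(f"CPU time per second:       {pct_change(*CPU_TIME_PER_SEC):+.1f}%")
    print(f"Logical reads per second:  {pct_change(*LOGICAL_READS_PER_SEC):+.1f}%")
    print(f"CPU time per logical read: {pct_change(*CPU_US_PER_READ):+.1f}%")
```

<p>Run as a script, this gives +6.4%, +16.3% and -8.9% for the three rows. The table reports the last value as a positive 8.9 because it counts the drop in CPU time per logical read as an improvement.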
</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30367#action_30367 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30365#action_30365 Tue, 29 Sep 2015 15:34:34 +0000 ubTools Support <b>SUMMARY:</b> <p><ins>Analysis:</ins></p> <ul class="alternate" type="square"> <li>Changing the CPU scaling governor from "ondemand" to "performance" improved performance.</li> <li>The performance improvement is noticeable when: <ul class="alternate" type="square"> <li>The difference between the minimum and maximum CPU frequencies is large.</li> <li>CPU usage is not heavy, i.e. it stays below up_threshold (95%).</li> <li>There are sessions waiting for other sessions that are on CPU.</li> </ul> </li> </ul> <p><ins>Recommendations:</ins></p> <ul class="alternate" type="square"> <li>If performance is more important than heat, set the CPU scaling governor to "performance".</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30365#action_30365 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. 
http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30364#action_30364 Tue, 29 Sep 2015 15:19:13 +0000 ubTools Support <b>COMPARISON:</b> <p>Results of the 30-minute load tests:</p> <p>1st: When the CPU scaling governor is ondemand.<br/> 2nd: When the CPU scaling governor is performance.</p> <p><b>Data:</b></p> <p><ins>Top Activity:</ins></p> <p><img src="http://www.ubTools.com/jira/secure/attachment/13743/13743_EMTopActivity.png" align="absmiddle" border="0" /></p> <p><ins>AWR:</ins></p> <p><img src="http://www.ubTools.com/jira/secure/attachment/13744/13744_AWR.png" align="absmiddle" border="0" /></p> <p><b>Analysis:</b></p> <ul class="alternate" type="square"> <li>"row cache lock" wait time decreased because the holders finished their work faster and therefore held the resources for a shorter time.</li> <li>DB time decreased by 31.4%, mostly due to the decrease in "row cache lock" waits.</li> <li>Logical reads increased by 16.3% because more buffer gets could be done at the higher CPU frequency.</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30364#action_30364 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. 
http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30363#action_30363 Tue, 29 Sep 2015 13:22:53 +0000 ubTools Support <b>TEST2:</b> <p>CPU scaling governor is performance.</p> <p><b>An atop snapshot:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ATOP - avsprddbflx05 2015/09/29 16:16:27 --------- 10s elapsed PRC | sys 7.06s | user 86.40s | #proc 1313 | #tslpi 1756 | #tslpu 0 | #zombie 0 | no procacct | CPU | sys 57% | user 864% | irq 14% | idle 623% | wait 43% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 80% | irq 7% | idle 10% | cpu004 w 1% | avgf 2.00GHz | avgscal 100% | cpu | sys 5% | user 81% | irq 2% | idle 11% | cpu000 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 82% | irq 1% | idle 14% | cpu001 w 1% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 74% | irq 0% | idle 21% | cpu002 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 73% | irq 0% | idle 22% | cpu003 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 62% | irq 0% | idle 32% | cpu005 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 6% | user 58% | irq 1% | idle 26% | cpu008 w 10% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 51% | irq 0% | idle 43% | cpu006 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 47% | irq 0% | idle 47% | cpu010 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 46% | irq 0% | idle 49% | cpu013 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 2% | user 44% | irq 1% | idle 50% | cpu007 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 6% | user 40% | irq 1% | idle 48% | cpu009 w 6% | avgf 2.00GHz | avgscal 100% | cpu | sys 6% | user 33% | irq 1% | idle 57% | cpu011 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 2% | user 34% | irq 0% | idle 60% | cpu012 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 4% | user 28% | irq 0% | idle 66% | cpu014 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 2% | user 28% | irq 0% | idle 69% | cpu015 w 1% | avgf 
2.00GHz | avgscal 100% | CPL | avg1 5.98 | avg5 6.41 | avg15 4.75 | csw 382133 | intr 340254 | | numcpu 16 | MEM | tot 126.1G | free 36.6G | cache 5.3G | dirty 28.9M | buff 193.0M | slab 836.2M | | SWP | tot 17.1G | free 17.1G | | | | vmcom 13.0G | vmlim 42.6G | NET | transport | tcpi 10272 | tcpo 10302 | udpi 75222 | udpo 75458 | tcpao 30 | tcppo 2 | NET | network | ipi 111030 | ipo 85760 | ipfrw 0 | deliv 85494 | icmpi 0 | icmpo 0 | PID TID SYSCPU USRCPU VGROW RGROW RUID EUID THR ST EXC S CPU CMD 1/62 15847 - 0.17s 4.59s 0K 896K grid oracle 1 -- - S 48% oracle 14867 - 0.14s 3.79s 0K 2220K grid oracle 1 -- - R 40% oracle 15835 - 0.15s 3.76s 8192K 384K grid oracle 1 -- - R 39% oracle 14871 - 0.24s 3.59s 0K 0K grid oracle 1 -- - R 39% oracle 15849 - 0.14s 3.55s 0K -1216K grid oracle 1 -- - R 37% oracle </pre> </div></div> <p><b>Analysis:</b></p> <ul class="alternate" type="square"> <li>The maximum CPU frequency is 2.0GHz, and all CPUs ran at 100% of full CPU speed.</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30363#action_30363 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. 
http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30362#action_30362 Tue, 29 Sep 2015 12:57:02 +0000 ubTools Support <b>TEST1:</b> <p>CPU scaling governor is ondemand.</p> <p><b>An atop snapshot:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ATOP - avsprddbflx05 2015/09/29 15:40:21 --------- 10s elapsed PRC | sys 7.86s | user 80.03s | #proc 1347 | #tslpi 1791 | #tslpu 0 | #zombie 0 | no procacct | CPU | sys 66% | user 800% | irq 13% | idle 626% | wait 95% | avgf 1.63GHz | avgscal 81% | cpu | sys 4% | user 86% | irq 1% | idle 6% | cpu000 w 3% | avgf 1.94GHz | avgscal 96% | cpu | sys 4% | user 77% | irq 5% | idle 11% | cpu004 w 3% | avgf 1.90GHz | avgscal 94% | cpu | sys 4% | user 72% | irq 0% | idle 17% | cpu001 w 7% | avgf 1.83GHz | avgscal 91% | cpu | sys 3% | user 67% | irq 0% | idle 20% | cpu002 w 9% | avgf 1.80GHz | avgscal 90% | cpu | sys 3% | user 61% | irq 0% | idle 28% | cpu003 w 8% | avgf 1.73GHz | avgscal 86% | cpu | sys 6% | user 54% | irq 1% | idle 34% | cpu009 w 6% | avgf 1.62GHz | avgscal 80% | cpu | sys 3% | user 52% | irq 1% | idle 36% | cpu005 w 8% | avgf 1.68GHz | avgscal 83% | cpu | sys 7% | user 47% | irq 1% | idle 29% | cpu008 w 16% | avgf 1.67GHz | avgscal 83% | cpu | sys 6% | user 45% | irq 0% | idle 48% | cpu013 w 1% | avgf 1.52GHz | avgscal 76% | cpu | sys 7% | user 39% | irq 1% | idle 53% | cpu015 w 1% | avgf 1.49GHz | avgscal 74% | cpu | sys 3% | user 41% | irq 0% | idle 49% | cpu006 w 7% | avgf 1.57GHz | avgscal 78% | cpu | sys 5% | user 34% | irq 0% | idle 52% | cpu010 w 9% | avgf 1.51GHz | avgscal 75% | cpu | sys 2% | user 35% | irq 1% | idle 55% | cpu007 w 7% | avgf 1.55GHz | avgscal 77% | cpu | sys 4% | user 32% | irq 0% | idle 63% | cpu014 w 1% | avgf 1.43GHz | avgscal 71% | cpu | sys 4% | user 31% | irq 0% | idle 59% | cpu011 w 6% | avgf 1.46GHz | avgscal 72% | cpu | sys 2% | user 26% | irq 0% | idle 68% | cpu012 w 3% | avgf 1.43GHz | avgscal 71% | 
CPL | avg1 5.33 | avg5 5.46 | avg15 4.57 | csw 377039 | intr 323539 | | numcpu 16 | MEM | tot 126.1G | free 38.8G | cache 3.6G | dirty 4.0M | buff 146.3M | slab 577.8M | | SWP | tot 17.1G | free 17.1G | | | | vmcom 12.9G | vmlim 42.6G | NET | transport | tcpi 20869 | tcpo 21016 | udpi 70875 | udpo 71067 | tcpao 33 | tcppo 1 | NET | network | ipi 128214 | ipo 92084 | ipfrw 0 | deliv 91742 | icmpi 0 | icmpo 0 | PID TID SYSCPU USRCPU VGROW RGROW RUID EUID THR ST EXC S CPU CMD 1/64 13661 - 0.15s 4.50s -24.0M -14.3M grid oracle 1 -- - R 47% oracle 15747 - 0.16s 3.80s 0K 684K grid oracle 1 -- - S 40% oracle 13733 - 0.32s 3.57s 32768K 31360K grid oracle 1 -- - R 39% oracle 27274 - 0.61s 3.21s 24576K 11976K grid oracle 1 -- - R 39% oracle 14869 - 0.17s 3.29s 0K -1880K grid oracle 1 -- - S 35% oracle </pre> </div></div> <p>The "CPU" line shows overall statistics for all CPUs.<br/> Each "cpu" line shows statistics for a single CPU.</p> <p><b>Analysis:</b></p> <ul class="alternate" type="square"> <li>Although the maximum CPU frequency is 2.0GHz, the server could not run at full speed. It averaged 1.63GHz, which is 81% of full CPU speed.</li> <li>With CPU usage at 91% (sys:4+user:86+irq:1), cpu000 averaged 1.94GHz, which is 96% of full CPU speed.</li> <li>With CPU usage at 51% (sys:6+user:45+irq:0), cpu013 averaged 1.52GHz, which is 76% of full CPU speed.</li> <li>With CPU usage at 28% (sys:2+user:26+irq:0), cpu012 averaged 1.43GHz, which is 71% of full CPU speed.</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30362#action_30362 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. 
http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30361#action_30361 Tue, 29 Sep 2015 12:29:30 +0000 ubTools Support <b>METHOD:</b> <ul class="alternate" type="square"> <li>The tests will be run first with the CPU scaling governor set to ondemand and then with it set to performance.</li> <li>The same workload will be generated with HP's LoadRunner tool.</li> <li>The results will be compared.</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30361#action_30361 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30360#action_30360 Tue, 29 Sep 2015 12:24:10 +0000 ubTools Support The atop (<span class="nobr"><a href="http://www.atoptool.nl/">http://www.atoptool.nl/<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span>) tool will be used to monitor CPU frequencies. <p>From the man page of atop:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>In case that the kernel module 'cpufreq_stats' is active (after issueing 'modprobe cpufreq_stats'), the average frequency ('avgf') and the average scaling percentage ('avgscal') is shown. Otherwise the current frequency ('curf') and the current scaling percentage ('curscal') is shown at the moment that the sample is taken. </pre> </div></div> <p>In order to compare CPU usage to the frequencies, the "cpufreq_stats" kernel module should be loaded. 
Otherwise, atop will show the current frequencies, not the average during monitoring samples.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30360#action_30360 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30359#action_30359 Tue, 29 Sep 2015 12:21:55 +0000 ubTools Support <b>ENVIRONMENT:</b> <p><b>Data:</b><br/> <em>for the CPU0(similar for the others):</em></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>[perftest1]/sys/devices/system/cpu/cpu0/cpufreq $ more * :::::::::::::: affected_cpus :::::::::::::: 0 cpuinfo_cur_freq: Permission denied :::::::::::::: cpuinfo_max_freq :::::::::::::: 2000000 :::::::::::::: cpuinfo_min_freq :::::::::::::: 1200000 :::::::::::::: cpuinfo_transition_latency :::::::::::::: 10000 *** ondemand: directory *** :::::::::::::: related_cpus :::::::::::::: 0 :::::::::::::: scaling_available_frequencies :::::::::::::: 2000000 1900000 1800000 1700000 1600000 1500000 1400000 1300000 1200000 :::::::::::::: scaling_available_governors :::::::::::::: ondemand userspace performance :::::::::::::: scaling_cur_freq :::::::::::::: 2000000 :::::::::::::: scaling_driver :::::::::::::: acpi-cpufreq :::::::::::::: scaling_governor :::::::::::::: ondemand :::::::::::::: scaling_max_freq :::::::::::::: 2000000 :::::::::::::: scaling_min_freq :::::::::::::: 1200000 :::::::::::::: scaling_setspeed :::::::::::::: &lt;unsupported&gt; *** stats: directory *** [perftest1]/sys/devices/system/cpu/cpu0/cpufreq $ cd ondemand [perftest1]/sys/devices/system/cpu/cpu0/cpufreq/ondemand $ ls -ltr total 0 
-r--r--r-- 1 root root 4096 Sep 29 15:17 sampling_rate_min -r--r--r-- 1 root root 4096 Sep 29 15:17 sampling_rate_max -rw-r--r-- 1 root root 4096 Sep 29 15:17 up_threshold -rw-r--r-- 1 root root 4096 Sep 29 15:17 sampling_rate -rw-r--r-- 1 root root 4096 Sep 29 15:17 powersave_bias -rw-r--r-- 1 root root 4096 Sep 29 15:17 ignore_nice_load [perftest1]/sys/devices/system/cpu/cpu0/cpufreq/ondemand $ more * :::::::::::::: ignore_nice_load :::::::::::::: 0 :::::::::::::: powersave_bias :::::::::::::: 0 :::::::::::::: sampling_rate :::::::::::::: 10000 :::::::::::::: sampling_rate_max :::::::::::::: 4294967295 :::::::::::::: sampling_rate_min :::::::::::::: 10000 :::::::::::::: up_threshold :::::::::::::: 95 [perftest1]/sys/devices/system/cpu/cpu0/cpufreq/ondemand $ </pre> </div></div> <p><b>View:</b></p> <ul class="alternate" type="square"> <li>scaling_governor: The CPU scaling governor is ondemand.</li> <li>cpuinfo_min_freq: The minimum CPU frequency is 1200000kHz (1.2GHz).</li> <li>cpuinfo_max_freq: The maximum CPU frequency is 2000000kHz (2.0GHz).</li> <li>sampling_rate: The kernel samples CPU usage every 10000us (10ms) to decide on the CPU frequency.</li> <li>up_threshold: The kernel raises the CPU frequency if the average CPU usage over a sampling interval (10ms) is higher than 95%.</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30359#action_30359 RE: [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. 
http://jira.ubtools.com/jira/browse/QA-59?focusedCommentId=30358#action_30358 Tue, 29 Sep 2015 12:16:05 +0000 ubTools Support See the following notes for the basic definitions of CPUfreq Governors: <ul class="alternate" type="square"> <li><span class="nobr"><a href="https://www.kernel.org/doc/Documentation/cpu-freq/user-guide.txt">https://www.kernel.org/doc/Documentation/cpu-freq/user-guide.txt<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span></li> <li><span class="nobr"><a href="https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt">https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span></li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-59">QA-59</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30358#action_30358 RE: [QA-57] ORA-04030 returned by "__libc_sbrk(0x0000000001010020) Err#12 ENOMEM" http://jira.ubtools.com/jira/browse/QA-57?focusedCommentId=23857#action_23857 Mon, 2 Dec 2013 15:10:15 +0000 ubTools Support The system admin will work on this problem. The solution will be added here. 
<br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-57">QA-57</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=23857#action_23857 RE: [QA-57] ORA-04030 returned by "__libc_sbrk(0x0000000001010020) Err#12 ENOMEM" http://jira.ubtools.com/jira/browse/QA-57?focusedCommentId=23856#action_23856 Mon, 2 Dec 2013 15:02:29 +0000 ubTools Support <b>ANALYSIS 2:</b> <p><ins>System Calls:</ins></p> <p><em>truss -fae -o &lt;outputFile&gt; -p &lt;V$PROCESS.SPID&gt;</em> excerpt:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>14483680: 43122723: __libc_sbrk(0x0000000001010020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FE0020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001004020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FE0020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001001020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FE0020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001000420) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FDF420) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001000120) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FDF420) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001000060) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FDF420) Err#12 ENOMEM 14483680: 43122723: statx("/oracle/admin/ATSD/udump", 0x0FFFFFFFFFFF41A8, 176, 0) = 0 14483680: 43122723: close(5) = 0 14483680: 43122723: statx("/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc", 0x0FFFFFFFFFFF44C0, 176, 01) Err#2 ENOENT 14483680: 43122723: statx("/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc", 0x0FFFFFFFFFFF44C0, 176, 0) Err#2 ENOENT 14483680: 43122723: 
kopen("/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) = 5 14483680: 43122723: kwrite(5, 0x0000000104A1C468, 0) = 0 14483680: 43122723: kwrite(5, " / o r a c l e / a d m i".., 47) = 47 </pre> </div></div> <p>When the ORA-4030 error occurred, the trace file <em>/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc</em> was created. So the problem occurred before the trace file was generated, at <em>__libc_sbrk</em>, which returned <em>ENOMEM</em>. The operating system could not provide memory to the Oracle process.</p> <p><ins>User resource limits:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>oracle@atlasdb2:/home/oracle/dunal &gt;ulimit -a time(seconds) unlimited file(blocks) unlimited data(kbytes) unlimited stack(kbytes) unlimited memory(kbytes) unlimited coredump(blocks) unlimited nofiles(descriptors) unlimited threads(per process) unlimited processes(per user) unlimited oracle@atlasdb2:/home/oracle/dunal &gt; </pre> </div></div> <p>No limit was found for the oracle user.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-57">QA-57</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=23856#action_23856 RE: [QA-57] ORA-04030 returned by "__libc_sbrk(0x0000000001010020) Err#12 ENOMEM" http://jira.ubtools.com/jira/browse/QA-57?focusedCommentId=23855#action_23855 Mon, 2 Dec 2013 14:52:42 +0000 ubTools Support <b>ANALYSIS 1:</b> <p><ins>PGASTAT:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select * from v$pgastat order by value; NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ maximum PGA used 
for manual workareas 0 bytes over allocation count 0 total PGA used for manual workareas 0 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ cache hit percentage 98.53 percent process count 126 max processes count 135 NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ recompute count (total) 132370 total PGA used for auto workareas 4399104 bytes total freeable PGA memory 106823680 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ maximum PGA used for auto workareas 153909248 bytes global memory bound 214743040 bytes total PGA inuse 747691008 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ total PGA allocated 1180690432 bytes aggregate PGA auto target 1265577984 bytes maximum PGA allocated 1299183616 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ aggregate PGA target parameter 2147483648 bytes extra bytes read/written 1.2622E+10 bytes PGA memory freed back to OS 6.0510E+10 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ bytes processed 8.5171E+11 bytes 19 rows selected. 
SQL&gt; </pre> </div></div> <p>The <em>pga_aggregate_target</em> parameter is not exceeded.</p> <p><ins>HEAPDUMP:</ins></p> <p><ins>Set Up:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>To setup tracing to trap the ORA-4030, on the server use the following in SQL*Plus: SQL&gt; ALTER SYSTEM SET EVENTS '4030 trace name heapdump level 536870917;name errorstack level 3'; Once the error reoccurs with the event set, you can turn off tracing using the following command in SQL*Plus: ALTER SYSTEM SET EVENTS '4030 trace name context off; name context off'; </pre> </div></div> <p><em>Ref: Oracle note: Master Note for Diagnosing OS Memory Problems and ORA-4030 (Doc ID 1088267.1)</em></p> <p><ins>TRACE:</ins></p> <p>Heap:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>HEAP DUMP heap name="session heap" desc=11044a830 extent sz=0xff80 alt=32767 het=32767 rec=0 flg=2 opc=2 parent=1101981f0 owner=70000033f6789e8 nex=0 xsz=0x0 ..... Total heap size =108241256 </pre> </div></div> <p>Internal Parameters:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> _pga_max_size = 419420 KB ..... _smm_max_size = 209710 KB _smm_px_max_size = 1048576 KB </pre> </div></div> <p>No PGA limits are exceeded.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-57">QA-57</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=23855#action_23855 RE: [QA-54] Unable to close database by srvctl and racgimon takes 100% of CPU. 
http://jira.ubtools.com/jira/browse/QA-54?focusedCommentId=20127#action_20127 Tue, 26 Mar 2013 15:21:48 +0000 ubTools Support <b>WORKAROUND:</b> <p>Set an explicit privileged value for <em>process.max-file-descriptor</em>, as in the example below:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># projmod -s -K "process.max-file-descriptor=(basic,4096,deny),(privileged,65536,deny)" 'user.oracle' </pre> </div></div> <p>After setting it, check the result as below:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>$ prctl -n process.max-file-descriptor -i process $$ process: 708: -sh NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT process.max-file-descriptor basic 4.10K - deny 708 privileged 65.5K - deny - system 2.15G max deny - $ </pre> </div></div> <p>A similar problem for lower Oracle versions is described in Oracle note <em>srvctl Slow or Fails to Start/Stop Database Instance and crsd.bin/racgmain/racgimon High CPU Usage [ID 1457387.1]</em>.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-54">QA-54</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-54?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=20127#action_20127 RE: [QA-54] Unable to close database by srvctl and racgimon takes 100% of CPU. 
http://jira.ubtools.com/jira/browse/QA-54?focusedCommentId=20126#action_20126 Tue, 26 Mar 2013 15:09:09 +0000 ubTools Support <b>ANALYSIS 2:</b> <p><ins><em>prctl</em> output of <em>racgimon</em>:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># prctl 8286 process: 8286: /u01/app/oracle/product/10.2/bin/racgimon startd ESIBASE NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT process.max-port-events privileged 65.5K - deny - system 2.15G max deny - process.max-msg-messages privileged 8.19K - deny - system 4.29G max deny - process.max-msg-qbytes privileged 64.0KB - deny - system 16.0EB max deny - process.max-sem-ops privileged 512 - deny - system 2.15G max deny - process.max-sem-nsems privileged 512 - deny - system 32.8K max deny - process.max-address-space privileged 16.0EB max deny - system 16.0EB max deny - process.max-file-descriptor privileged 2.15G max deny - system 2.15G max deny - process.max-core-size basic 0B - deny 8286 system 8.00EB max deny - process.max-stack-size basic 10.0MB - deny 8286 privileged 125TB - deny - system 125TB max deny - process.max-data-size privileged 16.0EB max deny - system 16.0EB max deny - process.max-file-size privileged 8.00EB max deny,signal=XFSZ - system 8.00EB max deny - process.max-cpu-time privileged 18.4Es inf signal=XCPU - system 18.4Es inf none - task.max-cpu-time system 18.4Es inf none - task.max-lwps system 2.15G max deny - project.max-contracts privileged 10.0K - deny - system 2.15G max deny - project.max-device-locked-memory privileged 2.19GB - deny - system 16.0EB max deny - project.max-locked-memory system 16.0EB max deny - project.max-port-ids privileged 8.19K - deny - system 65.5K max deny - project.max-shm-memory privileged 24.0GB - deny - system 16.0EB max deny - project.max-shm-ids privileged 128 - deny - system 16.8M max deny - project.max-msg-ids privileged 128 - deny - system 16.8M max deny - project.max-sem-ids privileged 128 - deny - 
system 16.8M max deny - project.max-crypto-memory privileged 8.77GB - deny - system 16.0EB max deny - project.max-tasks system 2.15G max deny - project.max-lwps system 2.15G max deny - project.cpu-cap system 4.29G inf deny - project.cpu-shares privileged 1 - none - system 65.5K max none - zone.max-swap system 16.0EB max deny - zone.max-locked-memory system 16.0EB max deny - zone.max-shm-memory system 16.0EB max deny - zone.max-shm-ids system 16.8M max deny - zone.max-sem-ids system 16.8M max deny - zone.max-msg-ids system 16.8M max deny - zone.max-lwps system 2.15G max deny - zone.cpu-cap system 4.29G inf deny - zone.cpu-shares privileged 1 - none - system 65.5K max none - $ prctl -n process.max-file-descriptor -i process $$ process: 7615: -sh NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT process.max-file-descriptor basic 4.10K - deny 7615 system 2.15G max deny - $ </pre> </div></div> <p><ins>Comment:</ins></p> <p>For <em>racgimon</em>, the privileged value of <em>process.max-file-descriptor</em> had reached 2.15G descriptors, although no privileged value had been set for it: the shell output shows only the basic and system values.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-54">QA-54</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-54?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=20126#action_20126 RE: [QA-54] Unable to close database by srvctl and racgimon takes 100% of CPU. 
http://jira.ubtools.com/jira/browse/QA-54?focusedCommentId=20125#action_20125 Tue, 26 Mar 2013 15:01:19 +0000 ubTools Support <b>ANALYSIS 1:</b> <p><ins><em>truss</em> output of one of the <em>racgimon</em> processes:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># truss -fae -p 8286 8286: close(346745079) Err#9 EBADF 8286: close(346745080) Err#9 EBADF 8286: close(346745081) Err#9 EBADF 8286: close(346745082) Err#9 EBADF 8286: close(346745083) Err#9 EBADF 8286: close(346745084) Err#9 EBADF 8286: close(346745085) Err#9 EBADF 8286: close(346745086) Err#9 EBADF # truss -faec -p 8286 psargs: /u01/app/oracle/product/10.2/bin/racgimon startd ESIBASE ^C syscall seconds calls errors close 2.374 1857265 1857265 -------- ------ ---- sys totals: 2.374 1857265 1857265 usr time: 1.079 elapsed: 23.090 # </pre> </div></div> <p><ins>Comment:</ins></p> <p><em>racgimon</em> could not close file descriptors. It repeatedly calls <em>close()</em> on file descriptor numbers that increase by 1 with each subsequent system call, and every call fails.</p> <p>The <em>close()</em> system calls return <em>EBADF</em>, which means <em>The fildes argument is not a valid file descriptor.</em><br/> Ref: <span class="nobr"><a href="http://docs.oracle.com/cd/E23823_01/html/816-5167/close-2.html#REFMAN2close-2">http://docs.oracle.com/cd/E23823_01/html/816-5167/close-2.html#REFMAN2close-2<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span></p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-54">QA-54</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-54?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=20125#action_20125 RE: [QA-50] PRVF-5410 : Check of common NTP Time Server 
failed, PRVF-5416 : Query of NTP daemon failed on all nodes http://jira.ubtools.com/jira/browse/QA-50?focusedCommentId=14519#action_14519 Fri, 13 May 2011 18:07:15 +0000 ubTools Support <b>Solution:</b> <p>The Network Administrator configured an IP address as the <em>refid</em> for NTP.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-50">QA-50</a>)</td> </tr> <tr> <td>Edited by:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a></td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519#action_14519 RE: [QA-50] PRVF-5410 : Check of common NTP Time Server failed, PRVF-5416 : Query of NTP daemon failed on all nodes http://jira.ubtools.com/jira/browse/QA-50?focusedCommentId=14518#action_14518 Fri, 13 May 2011 18:01:25 +0000 ubTools Support <ins>Action:</ins><br/> The Network Administrator configured an IP address as the <em>refid</em> for NTP. <p><ins>NTP:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># ntpq -p remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; # </pre> </div></div> <p><ins>CVU Log:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... 
[967@detrac1] [main] [ 2011-05-13 19:21:19.426 EEST ] [TaskNTP.getTimeServerInfo:838] Output from NTP query command on node detrac1 is = remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.433 EEST ] [TimeServerNode.addDataToNode:66] TimeServerNode:addDataToNode():Parsing line: *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.434 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[0]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.434 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[1]=72.14.188.52 [967@detrac1] [main] [ 2011-05-13 19:21:19.434 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[2]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.435 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[3]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.435 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[4]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.436 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[5]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.436 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[6]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.437 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[7]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.437 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[8]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.438 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[9]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.438 EEST ] [TaskNTP.getTimeServerInfo:838] Output from NTP query command on node detrac2 is = remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; 72.14.188.52 
&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.439 EEST ] [TimeServerNode.addDataToNode:66] TimeServerNode:addDataToNode():Parsing line: *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.440 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[0]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.440 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[1]=72.14.188.52 [967@detrac1] [main] [ 2011-05-13 19:21:19.441 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[2]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.441 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[3]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.441 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[4]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.442 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[5]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.442 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[6]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.443 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[7]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.443 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[8]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.444 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[9]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.444 EEST ] [TaskNTP.doTimeServerCheck:736] tsId=72.14.188.52; tServer ..... </pre> </div></div> <p>CVU could parse <em>ntpq</em> output.</p> <p><ins>CVU Output:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... NTP common Time Server Check started... NTP Time Server "72.14.188.52" is common to all nodes on which the NTP daemon is running Check of common NTP Time Server passed Clock time offset check from NTP Time Server started... Checking on nodes "[detrac1, detrac2]"... 
Check: Clock time offset from NTP Time Server Time Server: 72.14.188.52 Time Offset Limit: 1000.0 msecs Node Name Time Offset Status ------------ ------------------------ ------------------------ detrac1 -2.332 passed detrac2 -2.842 passed Time Server "72.14.188.52" has time offsets that are within permissible limits for nodes "[detrac1, detrac2]". Clock time offset check passed Result: Clock synchronization check using Network Time Protocol(NTP) passed ..... </pre> </div></div> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-50">QA-50</a>)</td> </tr> <tr> <td>Edited by:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a></td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518#action_14518 RE: [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47?focusedCommentId=11208#action_11208 Fri, 10 Apr 2009 13:13:02 +0000 ubTools Support The operating system was reinstalled by the vendor. The problem has not occurred since then. <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-47">QA-47</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11208#action_11208 RE: [QA-48] Unable to start VIP because of invalid RX packets numbers. http://jira.ubtools.com/jira/browse/QA-48?focusedCommentId=11182#action_11182 Thu, 19 Mar 2009 13:45:12 +0000 ubTools Support This looks like an Oracle inconsistency on AIX 6.1. 
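<p>The effect of the shifting column can be illustrated with a small Python sketch. It is not part of <em>racgvip</em> and only shows one possible way to stay independent of the <em>ZoneID</em> column: the position of <em>Ipkts</em> is counted from the right-hand end of the header. The function name <em>rx_packets</em> is hypothetical, the sample rows are taken from the netstat outputs recorded in this issue, and the sketch assumes that <em>Ipkts</em> and every column to its right are always populated.</p>

```python
def rx_packets(netstat_output: str, interface: str) -> int:
    """Return the Ipkts value of the first row that belongs to `interface`.

    The column is located relative to the END of the header line, so a
    ZoneID column that is blank in one output and '-' in another does not
    shift the value that is picked up.
    """
    lines = netstat_output.splitlines()
    header = lines[0].split()
    # Distance of Ipkts from the right-hand end of the header.
    from_end = len(header) - header.index("Ipkts")
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == interface:
            return int(fields[-from_end])
    raise ValueError(f"interface {interface!r} not found")


# Sample outputs copied from the failed and successful nodes.
WITH_ZONEID = """\
Name Mtu Network Address ZoneID Ipkts Ierrs Opkts Oerrs Coll
en1 1500 link#3 0.21.5e.34.55.bc - 35645 0 16801 3 0
en1 1500 10.46.180 10.46.180.52 - 35645 0 16801 3 0
"""

WITHOUT_ZONEID = """\
Name Mtu Network Address ZoneID Ipkts Ierrs Opkts Oerrs Coll
en1 1500 link#3 0.21.5e.34.57.fe 29743 0 10762 3 0
en1 1500 10.46.180 10.46.180.51 29743 0 10762 3 0
"""

failed_node = rx_packets(WITH_ZONEID, "en1")
successful_node = rx_packets(WITHOUT_ZONEID, "en1")
```

<p>With a fixed column number, as in the original script, the same two outputs yield '-' on one node and the packet count on the other.</p>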
<p><b><ins>Workaround:</ins></b></p> <p>The netstat column number that the script captures must be changed from 5 to 6.</p> <p><b>Original lines for _O1:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... tmpIP=`$LSATTR -El ${_IF} -a netaddr | $AWK '{print $2}'` # get RX packets numbers _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` x=$CHECK_TIMES while [ $x -gt 0 ] ... </pre> </div></div> <p><b>Modified line for _O1:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... tmpIP=`$LSATTR -El ${_IF} -a netaddr | $AWK '{print $2}'` # get RX packets numbers _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$6; exit}}"` x=$CHECK_TIMES while [ $x -gt 0 ] ... </pre> </div></div> <p><b>Original lines for _O2:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... fi _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` if [ "$_O1" != "$_O2" ] then # RX packets numbers changed ... </pre> </div></div> <p><b>Modified line for _O2:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... fi _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$6; exit}}"` if [ "$_O1" != "$_O2" ] then # RX packets numbers changed ... 
</pre> </div></div> <p>Then, VIP could be started on the correct nodes:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>./crs_stat -t Name Type Target State Host ------------------------------------------------------------ ora....ap1.gsd application ONLINE ONLINE akyorap1 ora....ap1.ons application ONLINE ONLINE akyorap1 ora....ap1.vip application ONLINE ONLINE akyorap1 ora....ap2.gsd application ONLINE ONLINE akyorap2 ora....ap2.ons application ONLINE ONLINE akyorap2 ora....ap2.vip application ONLINE ONLINE akyorap2 </pre> </div></div> <p><em>Note: Don't edit Oracle scripts unless you know what you're doing.</em></p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-48">QA-48</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11182#action_11182 RE: [QA-48] Unable to start VIP because of invalid RX packets numbers. http://jira.ubtools.com/jira/browse/QA-48?focusedCommentId=11181#action_11181 Thu, 19 Mar 2009 13:12:19 +0000 ubTools Support No solution found from Metalink. <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-48">QA-48</a>)</td> </tr> <tr> <td>Edited by:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a></td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11181#action_11181 RE: [QA-48] Unable to start VIP because of invalid RX packets numbers. 
http://jira.ubtools.com/jira/browse/QA-48?focusedCommentId=11180#action_11180 Thu, 19 Mar 2009 12:54:53 +0000 ubTools Support The Network Administrator said it was an AIX bug: <ul class="alternate" type="square"> <li><span class="nobr"><a href="http://www-01.ibm.com/support/docview.wss?uid=isg1IZ41358">IZ41358: ZONEID NEEDS TO PRINT "-" RATHER THAN A BLANK FOR NO VALUE. APPLIES TO AIX 6100-02<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span></li> </ul> <p>But this fix changes ZoneID from a blank value to '-'. After this fix, no VIP could be started.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-48">QA-48</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11180#action_11180 RE: [QA-48] Unable to start VIP because of invalid RX packets numbers. http://jira.ubtools.com/jira/browse/QA-48?focusedCommentId=11179#action_11179 Wed, 18 Mar 2009 20:44:22 +0000 ubTools Support <p><b>netstat Output on Failed Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>/usr/bin/netstat -f inet -n -I en1 | /usr/bin/awk "{ if (/^en1/) {print $5; exit}}" en1 1500 link#3 0.21.5e.34.55.bc - 34601 0 16269 3 0 </pre> </div></div> <p>Column 5 is '-'. This is wrong and caused the problem.</p> <p><b>netstat Output on Successful Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>en1 1500 link#3 0.21.5e.34.57.fe 29223 0 10609 3 0 </pre> </div></div> <p>Column 5 is <em>29223</em>. 
This is the expected number.</p> <p><b>Headers of netstat on Failed Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>#/usr/bin/netstat -f inet -n -I en1 Name Mtu Network Address ZoneID Ipkts Ierrs Opkts Oerrs Coll en1 1500 link#3 0.21.5e.34.55.bc - 35645 0 16801 3 0 en1 1500 10.46.180 10.46.180.52 - 35645 0 16801 3 0 </pre> </div></div> <p><b>Headers of netstat on Successful Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>#/usr/bin/netstat -f inet -n -I en1 Name Mtu Network Address ZoneID Ipkts Ierrs Opkts Oerrs Coll en1 1500 link#3 0.21.5e.34.57.fe 29743 0 10762 3 0 en1 1500 10.46.180 10.46.180.51 29743 0 10762 3 0 en1 1500 10.46.180 10.46.180.53 29743 0 10762 3 0 en1 1500 10.46.180 10.46.180.54 29743 0 10762 3 0 </pre> </div></div> <p><font color="red">The difference is the <em>ZoneID</em> column.</font> </p> <p>This looks like a network configuration problem. The issue will stay open pending an update from the Network Administrators.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-48">QA-48</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11179#action_11179 RE: [QA-48] Unable to start VIP because of invalid RX packets numbers. http://jira.ubtools.com/jira/browse/QA-48?focusedCommentId=11178#action_11178 Wed, 18 Mar 2009 20:28:31 +0000 ubTools Support racgvip was modified as below to dump the values of <em>_O1</em> and <em>_O2</em>: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... 
# get RX packets numbers _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` logx "--------------&gt; by dunal: _O1: $_O1" x=$CHECK_TIMES while [ $x -gt 0 ] do if [ -n "$tmpIP" ] then logx "About to execute command: $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW " $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1 else logx "About to execute command: $PING $PING_TIMEOUT $DEFAULTGW" $PING $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1 fi _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` logx "--------------&gt; by dunal: _O2: $_O2" ... </pre> </div></div> <p>As seen above, <em>logx "--------------&gt; by dunal: ..."</em> lines are added to the script. <font color="red"> Don't do that if you're not sure about what you do.</font> </p> <p>After restarting the VIP, the values of _<em>O1</em> and _<em>O2</em> are dumped in the logs.</p> <p><b>Failed Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] --------------&gt; by dunal: _O1: - 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] About to execute command: /usr/sbin/ping -S 10.46.180.52 -c 1 -w 1 10.46.180.1 Wed Mar 18 20:58:50 GMT+02:00 2009 [ 413770 ] --------------&gt; by dunal: _O2: - 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:51 GMT+02:00 2009 [ 413770 ] About to execute command: /usr/sbin/ping -S 10.46.180.52 -c 1 -w 1 10.46.180.1 Wed Mar 18 20:58:51 GMT+02:00 2009 [ 413770 ] --------------&gt; by dunal: _O2: - 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:52 GMT+02:00 2009 [ 413770 ] IsIfAlive: RX packets checked if=en1 failed Wed Mar 18 20:58:52 GMT+02:00 2009 [ 413770 ] Interface en1 checked failed (host =akyorap2) ... </pre> </div></div> <p>As seen above, the values are '-'. It's wrong. 
But they are the same, so the script concludes that the RX packet number has not changed.</p> <p><b>Successful Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] --------------&gt; by dunal: _O1: 17297 2009-03-18 20:58:55.793: [ RACG][1] [397546][1][ora.akyorap2.vip]: Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] About to execute command: /usr/sbin/ping -S 10.46.180.51 -c 1 -w 1 10.46.180.1 Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] --------------&gt; by dunal: _O2: 17298 2009-03-18 20:58:55.793: [ RACG][1] [397546][1][ora.akyorap2.vip]: Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] IsIfAlive: RX packets checked if=en1 OK </pre> </div></div> <p><em>_O1</em> and <em>_O2</em> are different. That means the RX packet number changed and the interface is up.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-48">QA-48</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11178#action_11178 RE: [QA-48] Unable to start VIP because of invalid RX packets numbers. http://jira.ubtools.com/jira/browse/QA-48?focusedCommentId=11177#action_11177 Wed, 18 Mar 2009 20:10:55 +0000 ubTools Support The problem arises in <em>IsIfAlive()</em> of $ORA_CRS_HOME/racgvip. 
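<p>In outline, the check compares the RX packet counter before and after pinging the default gateway. The following Python sketch is only an illustration of that logic, not Oracle's code. The names <em>is_if_alive</em>, <em>read_rx</em> and <em>ping</em> are hypothetical stand-ins for the netstat/awk pipeline and the ping command, and they are passed in as callables so that the sketch runs without touching a real interface.</p>

```python
import time
from typing import Callable


def is_if_alive(
    read_rx: Callable[[], str],
    ping: Callable[[], None],
    check_times: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the RX counter changes after pinging the gateway."""
    first = read_rx()                 # plays the role of _O1
    for _ in range(check_times):
        ping()                        # generate traffic on the interface
        if read_rx() != first:        # plays the role of _O2
            return True               # counter moved: interface is up
        sleep(1)
    return False                      # counter never changed


# A reader that always yields '-' (as logged on the failed node) never
# reports a change, no matter how many pings are sent.
stuck = is_if_alive(lambda: "-", lambda: None, 3, sleep=lambda s: None)

# A reader that yields increasing counters reports the interface as alive.
counters = iter(["17297", "17298"])
alive = is_if_alive(lambda: next(counters), lambda: None, 3, sleep=lambda s: None)
```

<p>The comparison is a plain string comparison, so a non-numeric value such as '-' is accepted silently and simply never differs from itself.</p>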
<p>Here is the related excerpt from racgvip:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> # Check the status of the interface thro' pinging gateway if [ -n "$DEFAULTGW" ] then _RET=1 # get base IP address of the interface tmpIP=`$LSATTR -El ${_IF} -a netaddr | $AWK '{print $2}'` # get RX packets numbers _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` x=$CHECK_TIMES while [ $x -gt 0 ] do if [ -n "$tmpIP" ] then logx "About to execute command: $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW " $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1 else logx "About to execute command: $PING $PING_TIMEOUT $DEFAULTGW" $PING $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1 fi _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` if [ "$_O1" != "$_O2" ] then # RX packets numbers changed _RET=0 break fi $SLEEP 1 x=`$EXPR $x - 1` done if [ $_RET -ne 0 ] then logx "IsIfAlive: RX packets checked if=$_IF failed" else logx "IsIfAlive: RX packets checked if=$_IF OK" fi .... 
</pre> </div></div> <p>According to the code above, the function does the following:</p> <ul class="alternate" type="square"> <li>Assigns the current RX packet number to the _O1 variable as the first RX packet number.</li> <li>Loops $CHECK_TIMES times: <ul class="alternate" type="square"> <li>Pings the default gateway.</li> <li>Assigns the current RX packet number to the _O2 variable as the next RX packet number.</li> <li>If the RX packet number changed (_O1!=_O2), breaks the loop.</li> <li>Sleeps 1 second.</li> </ul> </li> <li>If the RX packet number has NOT changed (_O1==_O2), raises the error; otherwise it's OK.</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-48">QA-48</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11177#action_11177 RE: [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47?focusedCommentId=11176#action_11176 Tue, 10 Mar 2009 10:36:07 +0000 ubTools Support This looks like a configuration issue or a bug on the OS/storage side. <p>This issue covers redo corruption only. But the database also encounters corruption in UNDO, INDEX, TABLE and CONTROL FILES. 
But the root cause is the same:<br/> <font color="red">The on-disk image of the block and its in-memory image are not the same.</font></p> <p>Similar to <a href="http://jira.ubtools.com/jira/browse/QA-37" title="&quot;ORA-01187: cannot read from file&quot; in one of the RAC Node."><del>QA-37</del></a>.</p> <p>This issue will be updated when a comment is sent by the OS vendor.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-47">QA-47</a>)</td> </tr> <tr> <td>Edited by:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a></td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11176#action_11176 RE: [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47?focusedCommentId=11175#action_11175 Tue, 10 Mar 2009 10:19:59 +0000 ubTools Support <b>Checking for missing IO of LGWR in the truss output:</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ grep Err lgwr.truss.log|grep pwrite bash-3.00$ grep Err lgwr.truss.log|grep pread bash-3.00$ </pre> </div></div> <p>No missing IO.</p> <p><b>Checking IO buffers of LGWR</b>:</p> <p>fd#260 is /u01/oradata/oravol2 for LGWR.<br/> Offset: 0xEDD420000.</p> <p>The last write to the block:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>25925: pwrite(260, 0x380D78400, 76288, 0xEDD420000) = 76288 25925: 01 "\0\0\0C8\0\01B\0\0\0 \80 H -\00505 4 1 4 5 0\v 6 6 6 6 6 6 4 &lt;blockNo&gt; 25925: 1 4 5 00F 2 1 2 . 1 5 6 . 2 3 0 . 2 1 807 x l\n07\f %1F01 0 ,\0 25925: 0505 3 5 6 0 705 3 8 0 3 50E 8 8 . 2 4 1 . 1 3 6 . 
2 2 007 x l\n </pre> </div></div> <p>As seen above, the contents of the redo buffer are corrupted. The block number is 0xC800.</p> <p>But this LGWR had generated a correct archivelog:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/app/oracle/product/10.2.0/dbs/arch/1_25_681074311.dbf bs=512 skip=256 count=1|od -x 1+0 records in 1+0 records out 0000000 2201 0000 0100 0000 0019 0000 8000 d162 &lt;blockNo&gt; 0000020 3534 332e 2e33 3032 0733 6b78 0904 3c0c 0000040 0114 2c30 0500 3205 3031 3631 6905 6e69 </pre> </div></div> <p>0x0100 = 256, which is the correct block number.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-47">QA-47</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11175#action_11175 RE: [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47?focusedCommentId=11174#action_11174 Tue, 10 Mar 2009 07:45:42 +0000 ubTools Support <b><ins>Finding the Other Corrupted Block</ins></b>: <p><b>dd Outputs on pread() of ARCH</b>:</p> <ul class="alternate" type="square"> <li>26085: pread(261, 0xFFFFFD7FFBEADE00, 512, 0xEDD400000) = 512 <ul class="alternate" type="square"> <li>Offset: 0xEDD400000 = 63841501184</li> <li>Offset in 512 byte block: 63841501184/512=124690432 <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690432 count=1|od -x 0000000 2201 0000 c000 0000 001b 0000 8000 621d &lt;blockNo&gt; ... 
</pre> </div></div></li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 <ul class="alternate" type="square"> <li>First Block Offset: 0xEDD400200 = 63841501696</li> <li>First Block Offset in 512 byte block: 63841501696/512=124690433 (next block of previous block) <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690433 count=1|od -x 0000000 2201 0000 c001 0000 001b 0000 8124 5172 &lt;blockNo&gt; .. </pre> </div></div></li> <li>Last Block Offset: 0xEDD400200 + 130560-512= 63841631744</li> <li>Last Block Offset in 512 byte block: 63841631744/512=124690687 <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690687 count=1|od -x 0000000 2201 0000 c0ff 0000 001b 0000 8018 4635 &lt;blockNo&gt; .. </pre> </div></div></li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 <ul class="alternate" type="square"> <li>Offset: 0xEDD420000 = 63841632256</li> <li>Offset in 512 byte block: 63841632256/512 = 124690688 <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690688 count=1|od -x 0000000 2201 0000 c800 0000 001b 0000 805c 2d48 &lt;blockNo&gt; .. </pre> </div></div></li> </ul> </li> </ul> <p>As seen above, the block numbers increase from 0xC000 to 0xC0FF. But in the last call, the block number jumped to 0xC800.</p> <p><b>truss Output of ARCH for block# 0xC800</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 26085: 01 "\0\0\0C8\0\01B\0\0\0 \80 H -\00505 4 1 4 5 0\v 6 6 6 6 6 6 4 &lt;blockNo&gt; 26085: 1 4 5 00F 2 1 2 . 1 5 6 . 2 3 0 . 
2 1 807 x l\n07\f %1F01 0 ,\0 26085: 0505 3 5 6 0 705 3 8 0 3 50E 8 8 . 2 4 1 . 1 3 6 . 2 2 007 x l\n 26085: 07\f %1F01 0 ,\00505 6 2 0 5 1\b a d a m k a c i0E 1 9 5 . 2 4 4 26085: . 6 2 . 1 4 507 x l\n07\f % "01 0 ,\00505 6 2 0 5 1\b a d a m k 26085: a c i\f 7 8 . 1 9 0 . 6 8 . 1 707 x l\n07\f % #01 0 ,\00502 - 1 26085: 05 K A Y A 20E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n07\f % .02 - 2 26085: ,\00502 - 105 K A Y A 20E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n07 26085: \f &amp;0102 - 2 ,\00505 6 1 1 4 105 1 9 5 5 60E 1 9 5 . 2 4 4 . 6 2 26085: . 1 4 707 x l\n07\f &amp;\r01 0 ,\00505 6 1 1 4 105 1 9 5 5 6\f 8 8 26085: . 2 3 4 . 5 . 2 3 107 x l\n07\f &amp;0F01 0 ,\00502 - 105 K A Y A 2 26085: 0E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n07\f &amp;1002 - 2 ,\00506 1 1 26085: 1 0 1 605 O K A Y A\f 8 5 . 1 0 8 . 8 7 . 5 007 x l\n07\f &amp; !01 26085: 0 ,\00502 - 105 K A Y A 20E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n 26085: 07\f &amp; "02 - 2 ,\00505 4 1 9 3 806 6 4 3 2 5 5\r 8 8 . 2 2 5 . 1 26085: 2 0 . 5 307 x l\n07\f &amp; +01 0 ,\00505 5 3 0 5 506 0 9 1 2 1 90E </pre> </div></div> <p>Then, the following messages were written to the trace file:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>26085: write(2, " * * * 2 0 0 9 - 0 3 -".., 27) = 27 26085: write(2, "\n", 1) = 1 26085: write(2, " ", 1) = 1 26085: write(2, "\n", 1) = 1 26085: write(2, " C o r r u p t r e d o".., 51) = 51 26085: write(2, "\n", 1) = 1 26085: write(2, " F l a g : 0 x 3 0 F".., 80) = 80 26085: write(2, "\n", 1) = 1 26085: write(2, " - - - - - D u m p o".., 39) = 39 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 4 6 3 8 3 0 2 0 3 0".., 64) = 64 &lt;blockNoPiece0&gt; 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 0 4 3 3 c 5 c 3 0".., 64) = 64 &lt;blockNoPiece1&gt; 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 5 0 3 0 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 3 0 2 0 3 0".., 64) = 64 26085: 
write(2, "\n", 1) = 1 26085: write(2, " 3 2 3 9 3 5 3 2 2 0 0 9".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 1 3 0 3 0 2 0 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 1 3 0 3 0 5 c 3 2".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 4 3 0 3 0 5 c 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 2 0 3 0 5 c 3 2 3 0 3 9".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 3 0 2 0 3 6".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 0 4 3 3 c 2 0 3 7".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 a 3 5 3 2 3 9 3 0 2 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 3 0 3 0 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 4 2 5 4 2 0 4 4 0 a 4 9".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 2 0 3 8 2 0 3 0 2 0 3 5".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 5 3 0 3 6 5 c 3 8".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " R e r e a d i n g l o".., 78) = 78 26085: write(2, "\n", 1) = 1 </pre> </div></div> <p>Rereading the block fails like this.</p> <p>There are 2 problems:</p> <ul class="alternate" type="square"> <li>Redo block# jumped to 0xC800 from 0xC0FF. 
So, the on-disk image is corrupted.</li> <li>The in-memory image of the block differs from the on-disk image.</li> </ul> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-47">QA-47</a>)</td> </tr> <tr> <td>Edited by:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a></td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11174#action_11174 RE: [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47?focusedCommentId=11173#action_11173 Tue, 10 Mar 2009 06:40:25 +0000 ubTools Support <b><ins>Interpreting the truss Output of ARCH:</ins></b> <p>For ARCH, file descriptor 261 is /u01/oradata/oravol2.</p> <p><b>Offsets Read by ARCH:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ grep "pread(261" arc0.truss.log 26085: pread(261, 0xFFFFFD7FFC32DE00, 131072, 0xEDE600000) = 131072 26085: pread(261, 0xFFFFFD7FFC21CE00, 131072, 0xEDE620000) = 131072 26085: pread(261, 0xFFFFFD7FFC10BE00, 131072, 0xEDE640000) = 131072 26085: pread(261, 0xFFFFFD7FFBE2DE00, 131072, 0xEDE660000) = 131072 26085: pread(261, 0xFFFFFD7FFBA2DE00, 131072, 0xEDE680000) = 131072 26085: pread(261, 0xFFFFFD7FFB42DE00, 131072, 0xEDE6A0000) = 131072 26085: pread(261, 0xFFFFFD7FFB53DE00, 131072, 0xEDE6C0000) = 131072 26085: pread(261, 0xFFFFFD7FFB64DE00, 131072, 0xEDE6E0000) = 131072 26085: pread(261, 0xFFFFFD7FFADCDE00, 131072, 0xEDE700000) = 131072 26085: pread(261, 0xFFFFFD7FFAE6DE00, 131072, 0xEDE800000) = 131072 26085: pread(261, 0xFFFFFD7FFAEDDE00, 131072, 0xEDE720000) = 131072 26085: pread(261, 0xFFFFFD7FFAF7DE00, 131072, 0xEDE820000) = 131072 26085: pread(261, 0xFFFFFD7FFC2CDE00, 131072, 0xEDE740000) = 131072 
26085: pread(261, 0xFFFFFD7FFC36DE00, 131072, 0xEDE840000) = 131072 26085: pread(261, 0xFFFFFD7FFC1BCE00, 131072, 0xEDE760000) = 131072 26085: pread(261, 0xFFFFFD7FFC25CE00, 131072, 0xEDE860000) = 131072 26085: pread(261, 0xFFFFFD7FFC0ABE00, 131072, 0xEDE780000) = 131072 26085: pread(261, 0xFFFFFD7FFC14BE00, 131072, 0xEDE880000) = 131072 26085: pread(261, 0xFFFFFD7FFBDCDE00, 131072, 0xEDE7A0000) = 131072 26085: pread(261, 0xFFFFFD7FFBE6DE00, 131072, 0xEDE8A0000) = 131072 26085: pread(261, 0xFFFFFD7FFB9CDE00, 131072, 0xEDE7C0000) = 131072 26085: pread(261, 0xFFFFFD7FFBA6DE00, 131072, 0xEDE8C0000) = 131072 26085: pread(261, 0xFFFFFD7FFB3CDE00, 131072, 0xEDE7E0000) = 131072 26085: pread(261, 0xFFFFFD7FFB46DE00, 131072, 0xEDE8E0000) = 131072 26085: pread(261, 0xFFFFFD7FFB51DE00, 131072, 0xEDE900000) = 131072 26085: pread(261, 0xFFFFFD7FFB62DE00, 131072, 0xEDE920000) = 131072 26085: pread(261, 0xFFFFFD7FFAE0DE00, 131072, 0xEDE940000) = 131072 26085: pread(261, 0xFFFFFD7FFAF1DE00, 131072, 0xEDE960000) = 131072 26085: pread(261, 0xFFFFFD7FFC30DE00, 131072, 0xEDE980000) = 131072 26085: pread(261, 0xFFFFFD7FFC1FCE00, 131072, 0xEDE9A0000) = 131072 26085: pread(261, 0xFFFFFD7FFC0EBE00, 131072, 0xEDE9C0000) = 131072 26085: pread(261, 0xFFFFFD7FFBE0DE00, 131072, 0xEDE9E0000) = 131072 26085: pread(261, 0xFFFFFD7FFBEADE00, 512, 0xEDD400000) = 512 26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 26085: pread(261, 0xFFFFFD7FFBA4DE00, 131072, 0xEDD500000) = 131072 26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 26085: pread(261, 0xFFFFFD7FFBA4DE00, 131072, 0xEDD500000) = 131072 26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 26085: pread(261, 0xFFFFFD7FFC53BE00, 16384, 0xEDDED4000) = 16384 bash-3.00$ </pre> </div></div> <p>As seen above, offsets starting with 0xEDE and 0xEDD5 are greater than our corrupted offset of 0xEDD4DFA00. 
So, they are out of scope.</p> <p>The following reads should be examined:</p> <ul class="alternate" type="square"> <li>26085: pread(261, 0xFFFFFD7FFBEADE00, 512, 0xEDD400000) = 512 <ul class="alternate" type="square"> <li>This is the ASM Extent Offset. In other words, it's the base offset. (0xEDD400000+512)&lt;0xEDD4DFA00. So, it doesn't read the corrupted block.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 <ul class="alternate" type="square"> <li>(0xEDD400200+130560)&lt;0xEDD4DFA00. It doesn't read the corrupted block.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 <ul class="alternate" type="square"> <li>(0xEDD420000+512)&lt;0xEDD4DFA00. It doesn't read the corrupted block.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 <ul class="alternate" type="square"> <li>Same as before.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 <ul class="alternate" type="square"> <li>Same as before.</li> </ul> </li> </ul> <p>ARCH did not read the corrupted block #50941, yet it reported an error.</p> <p><b>dd Output of the Corrupted Block:</b></p> <p>ASM Corrupted Block Offset in 512-byte blocks: 63842417152/512=124692221</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124692221 count=1|od -x 0000000 2201 0000 f0fd 0000 001b 0000 80d8 2304 &lt;blockNo&gt; 0000020 3838 322e 3731 312e 3431 7807 0a6c 111e 0000040 2230 3001 002c 0605 3131 3730 3130 3306 </pre> </div></div> <p>0x0000f0fd (61693) is not 50941 (0x0000c6fd). 
So, it's corrupted.</p> <p>The reason why ARCH did not read this block is hidden in the error messages:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00353: log corruption near block 50941 change 9160702125 time 03/09/2009 1 </pre> </div></div> <p>It says <em>near</em>.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-47">QA-47</a>)</td> </tr> <tr> <td>Edited by:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a></td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11173#action_11173 RE: [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47?focusedCommentId=11172#action_11172 Tue, 10 Mar 2009 03:59:25 +0000 ubTools Support <b>Computing the Offset of Corrupted ASM Block:</b> <p>SQL&gt; select GROUP_NUMBER,NAME,ALLOCATION_UNIT_SIZE from v$asm_diskgroup;</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>GROUP_NUMBER NAME ALLOCATION_UNIT_SIZE ------------ ------------------------- -------------------- 1 DATA 1048576 SQL&gt; select GROUP_NUMBER, DISK_NUMBER, name, path from v$asm_disk; GROUP_NUMBER DISK_NUMBER NAME PATH ------------ ----------- ------------------------- -------------------- 1 0 DATA_0000 /u01/oradata/oravol1 1 1 DATA_0001 /u01/oradata/oravol2 1 2 DATA_0002 /u01/oradata/oravol3 1 3 DATA_0003 /u01/oradata/oravol4 1 4 DATA_0004 /u01/oradata/oravol5 </pre> </div></div> <ul class="alternate" type="square"> <li>ASM File Name: +DATA/orcl/onlinelog/group_1.516.680795507</li> <li>ASM File#.........: 516</li> <li>Corrupted Block#...: 50941</li> <li>File Block Size:</li> </ul> <div 
class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select BLOCK_SIZE from v$asm_file where FILE_NUMBER=516; BLOCK_SIZE ---------- 512 </pre> </div></div> <ul class="alternate" type="square"> <li>Blocks per ASM Extent: 1048576/512=2048</li> <li>ASM Extent#......: 50941/2048 = 24 (rounded down)</li> <li>Block# in ASM Extent...: 50941 - 24*2048 = 1789</li> <li>Disk# and ASM Extent Offset:</li> </ul> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select DISK_KFFXP, AU_KFFXP from x$kffxp where XNUM_KFFXP=24 and group_kffxp=1 and NUMBER_KFFXP=516; DISK_KFFXP AU_KFFXP ---------- ---------- 1 60884 </pre> </div></div> <p>Disk#1 : /u01/oradata/oravol2<br/> ASM Extent Offset...: 60884*1048576 = 63841501184 --&gt; 0xEDD400000<br/> ASM Corrupted Block Offset.....: 63841501184+1789*512 = 63842417152 --&gt; 0xEDD4DFA00</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-47">QA-47</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11172#action_11172 RE: [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47?focusedCommentId=11171#action_11171 Tue, 10 Mar 2009 03:49:36 +0000 ubTools Support <b>Last Successful Log Switch:</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Beginning log switch checkpoint up to RBA [0x19.2.10], SCN: 9160700232 Mon Mar 9 19:38:21 2009 Thread 1 advanced to log sequence 25 (LGWR switch) Current log# 1 seq# 25 mem# 0: +DATA/orcl/onlinelog/group_1.516.680795507 Thread 1 cannot allocate new log, sequence 26 Checkpoint not complete Current log# 1 seq# 25 mem# 0: 
+DATA/orcl/onlinelog/group_1.516.680795507 Mon Mar 9 19:38:28 2009 Completed checkpoint up to RBA [0x19.2.10], SCN: 9160700232 </pre> </div></div> <p>As seen above, the last successful sequence before the corruption is 25.</p> <p><b>Header of Archive Log</b>:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(root@gdksun1:bin)$ dd if=/u01/app/oracle/product/10.2.0/dbs/arch/1_25_681074311.dbf bs=512 skip=50941 count=1|od -x 0000000 2201 0000 c6fd 0000 0019 0000 81d8 54c6 &lt;blockNo&gt; 0000020 2e32 3134 362e 0736 6b78 1207 2f0f 0212 0000040 332d 002c 0505 3831 3834 0532 7567 6469 0000060 0c65 3838 322e 3433 382e 2e38 3138 7807 0000100 076b 0f12 172f 3001 002c 0505 3032 3834 0000120 0739 7362 7361 6369 0e69 3538 312e 3530 0000140 312e 3535 322e 3233 7807 076b 0f12 172f </pre> </div></div> <p>The block number is 0x0000c6fd (the two 16-bit words printed by od -x appear swapped because the platform is little-endian). Since 50941=0x0000c6fd, the block number in the archived log is correct. That means LGWR had successfully written the correct redo before the log switch.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-47">QA-47</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11171#action_11171 RE: [QA-45] 'direct path read temp' hangs on read() system call when ASMLIB in use. http://jira.ubtools.com/jira/browse/QA-45?focusedCommentId=11161#action_11161 Tue, 3 Feb 2009 00:03:05 +0000 ubTools Support The problem is caused by __read_nocancel () from /lib64/libpthread.so.0. 
<p>The OS vendor's driver appears to be incompatible with Oracle ASMLIB.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-45">QA-45</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11161#action_11161 RE: [QA-12] Do read()/write() system calls block users in physical IO ? http://jira.ubtools.com/jira/browse/QA-12?focusedCommentId=10074#action_10074 Sun, 15 Jul 2007 13:35:36 +0000 ubTools Support There is a common misconception that read()/write() system calls block the caller until the physical IO to disk is completed. <p>read()/write() system calls do not block the caller during physical IO unless the file is opened with the O_DIRECT or O_SYNC flag (or, for read(), the requested data is not already in the kernel's cache). The caller is blocked only while buffers are copied between user address space and kernel address space. So, although read()/write() calls look synchronous from the user's perspective, the physical IO itself is not performed synchronously.</p> <p>In asynchronous IO calls (e.g. aio_read()/aio_write()), the caller is blocked only while the IO request is enqueued, not while buffers are copied between user and kernel address space and not during the physical IO.</p> <br/> <br/> <table> <tr> <td>Author:</td> <td><a href="http://jira.ubtools.com/jira/secure/ViewProfile.jspa?name=support">ubTools Support</a> (<a href="http://jira.ubtools.com/jira/browse/QA-12">QA-12</a>)</td> </tr> </table> http://jira.ubtools.com/jira/browse/QA-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10074#action_10074
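The QA-12 point about write() can be tried out with a small experiment. The following is only a minimal sketch, not part of the original comment: the helper name timed_write, the file names and the 1 MiB payload are all hypothetical, and it assumes a POSIX-like platform where Python exposes os.O_SYNC. It issues one write() call on a file opened normally and one on a file opened with O_SYNC, and reports how long each call took. No particular timings are claimed, since they depend on the filesystem, the device and the kernel.

```python
import os
import tempfile
import time


def timed_write(path, data, extra_flags=0):
    """Issue one write() call and return (bytes_written, seconds_in_call).

    Hypothetical helper for illustration only.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        written = os.write(fd, data)  # a single write() system call
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return written, elapsed


if __name__ == "__main__":
    payload = b"x" * (1 << 20)  # 1 MiB, an arbitrary size chosen for illustration
    # Assumption: O_SYNC exists on this platform; fall back to no extra flag otherwise.
    sync_flag = getattr(os, "O_SYNC", 0)
    with tempfile.TemporaryDirectory() as tmp:
        _, buffered = timed_write(os.path.join(tmp, "buffered.dat"), payload)
        _, synced = timed_write(os.path.join(tmp, "synced.dat"), payload, sync_flag)
    print(f"buffered write(): {buffered:.6f} s")
    print(f"O_SYNC   write(): {synced:.6f} s")
```

If the temporary directory lives on a memory-backed filesystem, the two figures may be nearly identical, so the experiment is more telling when the files are placed on a real disk.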