ubTools Support http://jira.ubtools.com/jira/secure/IssueNavigator.jspa?reset=true&pid=10042&status=6&sorter/field=issuekey&sorter/order=DESC An XML representation of a search request en-us [QA-65] ubGuard 4.0.0-4.0.1 Prerelease Commands http://jira.ubtools.com/jira/browse/QA-65 <br/> This document explains ubGuard 4.0.0-4.0.1 prerelease commands. QA-65 ubGuard 4.0.0-4.0.1 Prerelease Commands ubTools - ubGuard Major Closed Answered ubTools Support ubTools Support Wed, 19 Jan 2022 20:11:02 +0000 (UTC) Wed, 19 Jan 2022 22:09:32 +0000 (UTC) 0 <b>COMMANDS:</b> <p>The ubGuard executable is &lt;UBGUARD_HOME&gt;/bin/ubguard.sh (ubguard.bat for Windows).</p> <p><ins>Prerequisite for all commands:</ins></p> <ul class="alternate" type="square"> <li>Oracle listeners must be running on primary and standby servers.</li> </ul> <p><b>Setup:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>&lt;UBGUARD_HOME&gt;/conf/setup.properties must be filled in.</li> <li>Primary databases must be in OPEN state.</li> <li>Standby databases must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh setup </pre> </div></div> <p><b>Start:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>Primary database must be in OPEN state.</li> <li>Standby database must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh start guard -d &lt;ubguard_database_alias&gt; </pre> </div></div> <p><b>Stop:</b></p> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -d &lt;ubguard_database_alias&gt; </pre> </div></div> <p><b>Status:</b></p> <p><ins>Prerequisites:</ins></p> <ul 
class="alternate" type="square"> <li>Primary database must be in OPEN state.</li> <li>Standby database must be in MOUNT or OPEN state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh status guard -d &lt;ubguard_database_alias&gt; </pre> </div></div> <p><b>Failover:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>Standby database must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh failover to &lt;ubguard_database_alias&gt; [-f] </pre> </div></div> <p><ins>Default Failover:</ins></p> <p>It is a failover without "-f" option. It applies archivelogs to standby database, activates standby database as primary database. It causes less data loss, but longer failover time.</p> <p>The alternative manual method by RMAN on standby:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -d &lt;ubguard_database_alias&gt; CMD&gt; SET ORACLE_SID=&lt;SID&gt; CMD&gt; rman target / RMAN&gt; RECOVER DATABASE; RMAN&gt; SQL 'ALTER DATABASE ACTIVATE STANDBY DATABASE'; RMAN&gt; ALTER DATABASE OPEN; RMAN&gt; EXIT; </pre> </div></div> <p>If the manual method is used, ubGuard setup must be run again to update ubGuard's catalog.</p> <p><ins>Forced Failover:</ins></p> <p>It is a failover with "-f" option. It doesn't apply archivelogs to standby database. It activates standby database as primary database. 
It causes more data loss, but less failover time.</p> <p>The alternative manual method by RMAN on standby:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -d &lt;ubguard_database_alias&gt; CMD&gt; SET ORACLE_SID=&lt;SID&gt; CMD&gt; rman target / RMAN&gt; SQL 'ALTER DATABASE ACTIVATE STANDBY DATABASE'; RMAN&gt; ALTER DATABASE OPEN; RMAN&gt; EXIT; </pre> </div></div> <p>If the manual method is used, ubGuard setup must be run again to update ubGuard's catalog.</p> <p><b>Switchover:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>Primary database must be in MOUNT or OPEN state.</li> <li>Standby database must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh switchover to &lt;ubguard_database_alias&gt; </pre> </div></div> [QA-64] ubGuard 4.0.0-1.0.0 Prerelease Commands http://jira.ubtools.com/jira/browse/QA-64 This document explains ubGuard 4.0.0-1.0.0 prerelease commands. 
QA-64 ubGuard 4.0.0-1.0.0 Prerelease Commands ubTools - ubGuard Major Closed Answered ubTools Support ubTools Support Tue, 19 Mar 2019 13:42:33 +0000 (UTC) Wed, 19 Jan 2022 22:09:54 +0000 (UTC) 0 <b>COMMANDS:</b> <p>The ubGuard executable is &lt;UBGUARD_HOME&gt;/bin/ubguard.sh (ubguard.bat for Windows).</p> <p><ins>Prerequisite for all commands:</ins></p> <ul class="alternate" type="square"> <li>Oracle listeners must be running on primary and standby servers.</li> </ul> <p><b>Setup:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>&lt;UBGUARD_HOME&gt;/conf/setup.properties must be filled in.</li> <li>Primary databases must be open.</li> <li>Standby databases must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh setup </pre> </div></div> <p><b>Start:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>Primary database must be open.</li> <li>Standby database must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh start guard -d &lt;ubguard_database_alias&gt; </pre> </div></div> <p><b>Stop:</b></p> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -d &lt;ubguard_database_alias&gt; </pre> </div></div> <p><b>Status:</b></p> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh status guard -d &lt;ubguard_database_alias&gt; </pre> </div></div> <p><b>Failover:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>Standby database must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 
1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh failover to &lt;ubguard_database_alias&gt; [-f] </pre> </div></div> <p><ins>Default Failover:</ins></p> <p>It is a failover without "-f" option. It gets all missing archivelogs from primary server, applies archivelogs to standby database, activates standby database as primary database. It causes less data loss, but longer failover time.</p> <p>The alternative manual method by RMAN on standby:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -d &lt;ubguard_database_alias&gt; --&gt; Copy all missing archivelogs from primary server to standby server CMD&gt; SET ORACLE_SID=&lt;SID&gt; CMD&gt; rman target / RMAN&gt; SHUTDOWN IMMEDIATE; RMAN&gt; STARTUP MOUNT; RMAN&gt; RECOVER DATABASE; RMAN&gt; SQL 'ALTER DATABASE ACTIVATE STANDBY DATABASE'; RMAN&gt; ALTER DATABASE OPEN; RMAN&gt; EXIT; </pre> </div></div> <p>If the manual method is used, ubGuard setup must be run again to update ubGuard's catalog.</p> <p><ins>Forced Failover:</ins></p> <p>It is a failover with "-f" option. It doesn't get archivelogs from primary server and it doesn't apply archivelogs to standby database. It activates standby database as primary database. 
It causes more data loss, but less failover time.</p> <p>The alternative manual method by RMAN on standby:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -d &lt;ubguard_database_alias&gt; CMD&gt; SET ORACLE_SID=&lt;SID&gt; CMD&gt; rman target / RMAN&gt; SHUTDOWN IMMEDIATE; RMAN&gt; STARTUP MOUNT; RMAN&gt; SQL 'ALTER DATABASE ACTIVATE STANDBY DATABASE'; RMAN&gt; ALTER DATABASE OPEN; RMAN&gt; EXIT; </pre> </div></div> <p>If the manual method is used, ubGuard setup must be run again to update ubGuard's catalog.</p> [QA-63] ORA-600 [3020] on the standby after adding a datafile on primary http://jira.ubtools.com/jira/browse/QA-63 <b>Problem:</b> <p>The customer has added a datafile on the primary database. After the datafile was created on the standby database, ORA-600 <span class="error">&#91;3020&#93;</span> was encountered while applying an archivelog to this datafile on standby.</p> <p><b>ORA-600 <span class="error">&#91;3020&#93;</span>:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>This is called a 'STUCK RECOVERY'. There is an inconsistency between the information stored in the redo and the information stored in a database block being recovered. 
</pre> </div></div> <p><em>Ref: Doc ID 30866.1</em></p> QA-63 ORA-600 [3020] on the standby after adding a datafile on primary Oracle - Internals Major Closed Answered ubTools Support ubTools Support Fri, 16 Feb 2018 10:31:38 +0000 (UTC) Fri, 16 Feb 2018 12:47:14 +0000 (UTC) 0 <b>PROBLEM OCCURRENCE:</b> <p><b>Adding datafile on the Primary:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Tue Feb 13 07:44:52 2018 ALTER TABLESPACE MENKUL2018_DATA ADD DATAFILE '/orassd/orcl/datafile/menkul2018_data02.dbf' SIZE 5G AUTOEXTEND ON NEXT 100M MAXSIZE UNLIMITED Completed: ALTER TABLESPACE MENKUL2018_DATA ADD DATAFILE '/orassd/orcl/datafile/menkul2018_data02.dbf' SIZE 5G AUTOEXTEND ON NEXT 100M MAXSIZE UNLIMITED Tue Feb 13 07:45:38 2018 </pre> </div></div> <p><b>Applying Archivelogs on the Standby:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Tue Feb 13 07:46:55 2018 ALTER DATABASE RECOVER AUTOMATIC STANDBY DATABASE UNTIL CHANGE 69358633799 Media Recovery Start started logmerger process Tue Feb 13 07:46:55 2018 Managed Standby Recovery not using Real Time Apply Parallel Media Recovery started with 24 slaves Media Recovery Log /u01/ORCL/archive/52b86b9a_1_91417_922239972.arc Tue Feb 13 07:47:07 2018 Successfully added datafile 232 to media recovery Datafile #232: '/u01/oracle/app/oradata/ORCL/datafile/ORCL_STBY/datafile/o1_mf_menkul20_f84vg0lt_.dbf' Incomplete Recovery applied until change 69358633799 time 02/13/2018 07:45:38 Tue Feb 13 07:47:08 2018 Media Recovery Complete (ORCL) Completed: ALTER DATABASE RECOVER AUTOMATIC STANDBY DATABASE UNTIL CHANGE 69358633799 ..... 
Tue Feb 13 08:26:55 2018 ALTER DATABASE RECOVER AUTOMATIC STANDBY DATABASE UNTIL CHANGE 69358697423 Media Recovery Start started logmerger process Tue Feb 13 08:26:55 2018 Managed Standby Recovery not using Real Time Apply Parallel Media Recovery started with 24 slaves Media Recovery Log /u01/ORCL/archive/52b86b9a_1_91425_922239972.arc Tue Feb 13 08:26:57 2018 Errors in file /u01/oracle/app/diag/rdbms/orcl_stby/ORCL/trace/ORCL_pr0i_21945.trc (incident=131897): ORA-00600: internal error code, arguments: [3020], [232], [3], [973078531], [], [], [], [], [], [], [], [] ORA-10567: Redo is inconsistent with data block (file# 232, block# 3, file offset is 24576 bytes) ORA-10564: tablespace MENKUL2018_DATA ORA-01110: data file 232: '/u01/oracle/app/oradata/ORCL/datafile/ORCL_STBY/datafile/o1_mf_menkul20_f84vg0lt_.dbf' ORA-10560: block type '0' Incident details in: /u01/oracle/app/diag/rdbms/orcl_stby/ORCL/incident/incdir_131897/ORCL_pr0i_21945_i131897.trc Tue Feb 13 08:26:59 2018 </pre> </div></div> <b>ANALYSIS of the RESULT:</b><br/> <em>Ref: ORCL_pr0i_21945.trc</em> <p><b>Data:</b></p> <p><ins>REDO Dump:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK *** 2018-02-13 08:26:56.826 RECOVERY STUCK AT BLOCK 3 OF FILE 232 Redo record scn: 0x0010.2619b07d CHANGE #1 TYP:0 CLS:12 AFN:232 DBA:0x3a000003 OBJ:4294967295 SCN:0x0010.2618c0fb SEQ:2 OP:22.5 ENC:0 RBL:0 Buffer read during recovery: ..... 
</pre> </div></div> <p>The stuck recovery happened at file#232 block#3 with KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK information.</p> <p><ins>Block Dump</ins>:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>buffer tsn: 141 rdba: 0x3a000003 (232/3) scn: 0x0000.00000000 seq: 0x01 flg: 0x05 tail: 0x00000001 frmt: 0x02 chkval: 0x9d03 type: 0x00=unknown on-disk scn: 0x0.0 </pre> </div></div> <p>SCN is 0, type is unknown. flg is 0x05:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Where flg: 0x05 contains flag 0x1 (unused,unformatted block). </pre> </div></div> <p><em>Ref: Oracle Doc ID 17896895.8</em></p> <p><b>Comment:</b></p> <p>The change vector was expecting SCN:0x0010.2618c0fb on the block. But, the SCN on the block was 0x0000.00000000.</p> <p>Oracle was trying to apply archivelog to an unformatted block. REDO in archivelog is beyond block in datafile. This inconsistency causes stuck recovery.</p> <b>ANALYSIS of the ROOT CAUSE:</b> <p><b>Data:</b></p> <p> The datafile has been created at sequence#91417 and the problem happened at sequence#91425 at file#232 block#3.</p> <p><ins>REDO Dump Commands:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91417_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91418_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91419_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91420_922239972.arc' dba min 232 3 dba max 232 3; System altered. 
SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91421_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91422_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91423_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91424_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; ALTER SYSTEM dump logfile '/u01/ORCL/archive/52b86b9a_1_91425_922239972.arc' dba min 232 3 dba max 232 3; System altered. SQL&gt; </pre> </div></div> <p><ins>REDO Dumps:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>DUMP OF REDO FROM FILE '/u01/ORCL/archive/52b86b9a_1_91417_922239972.arc' ..... REDO RECORD - Thread:1 RBA: 0x016519.00000765.01e8 LEN: 0x0058 VLD: 0x01 SCN: 0x0010.2618c0fb SUBSCN: 1 02/13/2018 07:44:59 (LWN RBA: 0x016519.00000763.0010 LEN: 0004 NST: 0002 SCN: 0x0010.2618c0f9) CHANGE #1 TYP:1 CLS:12 AFN:232 DBA:0x3a000003 OBJ:4294967295 SCN:0x0010.2618c0fb SEQ:1 OP:22.4 ENC:0 RBL:0 ktfbbfo - File BitMap Block Format: BitMap Control: RelFno: 232, BeginBlock: 128, Flag: 0, First: 0, Free: 63488 REDO RECORD - Thread:1 RBA: 0x016519.00000763.0010 LEN: 0x0244 VLD: 0x05 SCN: 0x0010.2618c0fb SUBSCN: 1 02/13/2018 07:44:59 CHANGE #1 TYP:0 CLS:69 AFN:3 DBA:0x00c018c0 OBJ:4294967295 SCN:0x0010.2618c0ee SEQ:1 OP:5.4 ENC:0 RBL:0 ktucm redo: slt: 0x0007 sqn: 0x000060e7 srt: 0 sta: 9 flg: 0x2 ktucf redo: uba: 0x3000319f.07d5.06 ext: 2 spc: 7334 fbi: 0 CHANGE #2 MEDIA RECOVERY MARKER SCN:0x0000.00000000 SEQ:0 OP:17.30 ENC:0 Add datafiles to tablespace #141 file #232 relative file #232. '/orassd/orcl/datafile/menkul2018_data02.dbf' flags(reuse): 0x0 Checkpointed at scn: 0x0010.2618c0f0 02/13/2018 07:44:56 ..... 
DUMP OF REDO FROM FILE '/u01/ORCL/archive/52b86b9a_1_91425_922239972.arc' ..... REDO RECORD - Thread:1 RBA: 0x016521.0001a022.0034 LEN: 0x0040 VLD: 0x01 SCN: 0x0010.2619b07d SUBSCN: 14 02/13/2018 08:23:51 (LWN RBA: 0x016521.00019dca.0010 LEN: 1012 NST: 0002 SCN: 0x0010.2619b069) CHANGE #1 TYP:0 CLS:12 AFN:232 DBA:0x3a000003 OBJ:4294967295 SCN:0x0010.2618c0fb SEQ:2 OP:22.5 ENC:0 RBL:0 ktfbbredo - File BitMap Block Redo: Use Bits: </pre> </div></div> <p><b>Comment:</b></p> <p>The datafile was created at sequence#91417 by REDO OP code 17.30, which means:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>the OP:17.30 redo which adds the &lt;file#&gt; datafile </pre> </div></div> <p><em>Ref: Oracle Doc ID 27229389.8</em></p> <p>Two other change vectors, OP codes 22.4 and 5.4, appear in the redo before the OP code 17.30 change that adds the datafile.</p> <p><ins>Change Vector for OP Code 22.4:</ins></p> <p>It changes absolute file#232 <em>(AFN:232 DBA:0x3a000003)</em>. This is the problem: Oracle tries to apply a change vector to a file that has not been created yet.</p> <p><ins>Change Vector for OP Code 5.4:</ins></p> <p>It changes absolute file#3 <em>(AFN:3 DBA:0x00c018c0)</em>. This is a different file, so it is out of scope.</p> <b>SOLUTION:</b> <p><b>Problem:</b></p> <p>Oracle tries to apply an archivelog to a file that has not been created on the standby yet.</p> <p><b>Fix:</b></p> <p>This is Oracle bug 27229389.</p> <p><b>Workaround:</b></p> <p>Copy a version of the datafile from the primary to the standby that does not require the corrupted archivelogs for recovery.</p> Operating System Product Version 11.2.0.4 Database Name . Host Name . [QA-62] ubGuard 3.0.0 Commands http://jira.ubtools.com/jira/browse/QA-62 This document explains ubGuard 3.0.0 commands. 
QA-62 ubGuard 3.0.0 Commands ubTools - ubGuard Major Closed Answered ubTools Support ubTools Support Tue, 26 Dec 2017 13:52:47 +0000 (UTC) Tue, 26 Dec 2017 14:31:00 +0000 (UTC) 0 <b>COMMANDS:</b> <p>The ubGuard executable is &lt;UBGUARD_HOME&gt;/bin/ubguard.sh (ubguard.bat for Windows).</p> <p><ins>Prerequisite for all commands:</ins></p> <ul class="alternate" type="square"> <li>The relevant Oracle listeners must be running.</li> </ul> <p><b>Setup:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>&lt;UBGUARD_HOME&gt;/conf/setup.properties must be filled in.</li> <li>Primary databases must be open.</li> <li>Standby databases must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh setup </pre> </div></div> <p><b>Start:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>Primary database must be open.</li> <li>Standby database must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh start guard -i &lt;ubguard_instance_alias&gt; </pre> </div></div> <p><b>Stop:</b></p> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -i &lt;ubguard_instance_alias&gt; </pre> </div></div> <p><b>Status:</b></p> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh status guard -i &lt;ubguard_instance_alias&gt; </pre> </div></div> <p><b>Failover:</b></p> <p><ins>Prerequisites:</ins></p> <ul class="alternate" type="square"> <li>Standby database must be in MOUNT state.</li> </ul> <p><ins>Usage:</ins></p> <div class="preformatted panel" style="border-width: 
1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh failover to &lt;ubguard_instance_alias&gt; [-f] </pre> </div></div> <p><ins>Default Failover:</ins></p> <p>It is a failover without "-f" option. It gets all missing archivelogs from primary server, applies archivelogs to standby database, activates standby database as primary database. It causes less data loss, but longer failover time.</p> <p>The alternative manual method by RMAN on standby:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -i &lt;ubguard_instance_alias&gt; --&gt; Copy all missing archivelogs from primary server to standby server CMD&gt; SET ORACLE_SID=&lt;SID&gt; CMD&gt; rman target / RMAN&gt; SHUTDOWN IMMEDIATE; RMAN&gt; STARTUP MOUNT; RMAN&gt; RECOVER DATABASE; RMAN&gt; SQL 'ALTER DATABASE ACTIVATE STANDBY DATABASE'; RMAN&gt; ALTER DATABASE OPEN; RMAN&gt; EXIT; </pre> </div></div> <p>If the manual method is used, ubGuard setup must be run again to update ubGuard's catalog.</p> <p><ins>Forced Failover:</ins></p> <p>It is a failover with "-f" option. It doesn't get archivelogs from primary server and it doesn't apply archivelogs to standby database. It activates standby database as primary database. It causes more data loss, but less failover time.</p> <p>The alternative manual method by RMAN on standby:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CMD&gt; ubguard.sh stop guard -i &lt;ubguard_instance_alias&gt; CMD&gt; SET ORACLE_SID=&lt;SID&gt; CMD&gt; rman target / RMAN&gt; SHUTDOWN IMMEDIATE; RMAN&gt; STARTUP MOUNT; RMAN&gt; SQL 'ALTER DATABASE ACTIVATE STANDBY DATABASE'; RMAN&gt; ALTER DATABASE OPEN; RMAN&gt; EXIT; </pre> </div></div> <p>If the manual method is used, ubGuard setup must be run again to update ubGuard's catalog.</p> [QA-60] "PRVF-5507 : NTP daemon or service is not running on any node ..." 
even if NTP is running. http://jira.ubtools.com/jira/browse/QA-60 CVU gives the following error: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>$./runcluvfy.sh stage -pre crsinst -n sygnx01,sygnx02 -verbose ..... No NTP Daemons or Services were found to be running PRVF-5507 : NTP daemon or service is not running on any node but NTP configuration file exists on the following node(s): sygnx02,sygnx01 Result: Clock synchronization check using Network Time Protocol(NTP) failed </pre> </div></div> QA-60 "PRVF-5507 : NTP daemon or service is not running on any node ..." even if NTP is running. Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Sat, 5 Mar 2016 15:12:03 +0000 (UTC) Sat, 5 Mar 2016 15:27:19 +0000 (UTC) 0 <b>NTP status:</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>[root@sygnx01 ~]# systemctl status ntpd ntpd.service - Network Time Service Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled) Active: active (running) since Sat 2016-03-05 14:32:46 EET; 1h 29min ago Process: 1074 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS) Main PID: 1081 (ntpd) CGroup: /system.slice/ntpd.service 1081 /usr/sbin/ntpd -u ntp:ntp -x -g </pre> </div></div> <p>NTP is running.</p> <b>CVU Trace:</b> <p><ins>Generating Trace:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>$ export CV_TRACELOC=/tmp $ export SRVM_TRACE=true $ ./runcluvfy.sh stage -pre crsinst -n sygnx01,sygnx02 -verbose ..... 
</pre> </div></div> <p><ins>Excerpt from the trace:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>[20421@***.***.com] [Worker 0] [ 2016-03-05 16:20:46.657 EET ] [RuntimeExec.runCommand:77] /tmp/CVU_11.2.0.4.0_grid/exectask.sh -chkfile /var/run/ntpd.pid [20421@***.***.com] [Worker 0] [ 2016-03-05 16:20:46.659 EET ] [RuntimeExec.runCommand:142] runCommand: Waiting for the process [20421@***.***.com] [Thread-216] [ 2016-03-05 16:20:46.659 EET ] [StreamReader.run:61] In StreamReader.run [20421@***.***.com] [Thread-217] [ 2016-03-05 16:20:46.659 EET ] [StreamReader.run:61] In StreamReader.run [20421@***.***.com] [Thread-216] [ 2016-03-05 16:20:46.668 EET ] [StreamReader.run:65] OUTPUT&gt;&lt;CV_VRES&gt;1&lt;/CV_VRES&gt;&lt;CV_LOG&gt;Exectask: file check failed&lt;/CV_LOG&gt;&lt;CV_ERES&gt;0&lt;/CV_ERES&gt; ..... [20421@sygnx01.sankomenkul.com] [main] [ 2016-03-05 16:20:46.669 EET ] [TaskDaemonLiveliness.displayDaemonLivelinessOutput:283] Daemon 'ntpd' is not running on node: 'sygnx01' </pre> </div></div> <p>"/var/run/ntpd.pid" doesn't exist.</p> <b>Solution</b> <p>There was no "/var/run/ntpd.pid" file defined in "/etc/sysconfig/ntpd". The problem has been solved after setting as below:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>#OPTIONS="-g" OPTIONS="-x -g -p /var/run/ntpd.pid" </pre> </div></div> <p>Additional note:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>NTP has been replaced by Chrony(new feature) in Oracle Linux 7. </pre> </div></div> <p><em>Ref: Oracle Note: Unable to Configure NTP after Oracle Linux 7 Installation (Doc ID 1995703.1)</em></p> Operating System Operating System Version Oracle Linux 7.2 Product Version 11.2.0.4 RAC Database Name . Host Name . [QA-59] Unable to use the full CPU speed when CPUfreq Governor is ondemand. 
http://jira.ubtools.com/jira/browse/QA-59 The customer is unable to use the full CPU speed. The CPUfreq governor is ondemand. QA-59 Unable to use the full CPU speed when CPUfreq Governor is ondemand. Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Tue, 29 Sep 2015 12:15:07 +0000 (UTC) Fri, 2 Oct 2015 13:45:05 +0000 (UTC) 0 See the following notes for the basic definitions of CPUfreq governors: <ul class="alternate" type="square"> <li><span class="nobr"><a href="https://www.kernel.org/doc/Documentation/cpu-freq/user-guide.txt">https://www.kernel.org/doc/Documentation/cpu-freq/user-guide.txt</a></span></li> <li><span class="nobr"><a href="https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt">https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt</a></span></li> </ul> <b>ENVIRONMENT:</b> <p><b>Data:</b><br/> <em>For CPU0 (similar for the others):</em></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>[perftest1]/sys/devices/system/cpu/cpu0/cpufreq $ more * :::::::::::::: affected_cpus :::::::::::::: 0 cpuinfo_cur_freq: Permission denied :::::::::::::: cpuinfo_max_freq :::::::::::::: 2000000 :::::::::::::: cpuinfo_min_freq :::::::::::::: 1200000 :::::::::::::: cpuinfo_transition_latency :::::::::::::: 10000 *** ondemand: directory *** :::::::::::::: related_cpus :::::::::::::: 0 :::::::::::::: scaling_available_frequencies :::::::::::::: 2000000 1900000 1800000 1700000 1600000 1500000 1400000 1300000 1200000 :::::::::::::: scaling_available_governors :::::::::::::: ondemand userspace performance :::::::::::::: scaling_cur_freq :::::::::::::: 2000000 
:::::::::::::: scaling_driver :::::::::::::: acpi-cpufreq :::::::::::::: scaling_governor :::::::::::::: ondemand :::::::::::::: scaling_max_freq :::::::::::::: 2000000 :::::::::::::: scaling_min_freq :::::::::::::: 1200000 :::::::::::::: scaling_setspeed :::::::::::::: &lt;unsupported&gt; *** stats: directory *** [perftest1]/sys/devices/system/cpu/cpu0/cpufreq $ cd ondemand [perftest1]/sys/devices/system/cpu/cpu0/cpufreq/ondemand $ ls -ltr total 0 -r--r--r-- 1 root root 4096 Sep 29 15:17 sampling_rate_min -r--r--r-- 1 root root 4096 Sep 29 15:17 sampling_rate_max -rw-r--r-- 1 root root 4096 Sep 29 15:17 up_threshold -rw-r--r-- 1 root root 4096 Sep 29 15:17 sampling_rate -rw-r--r-- 1 root root 4096 Sep 29 15:17 powersave_bias -rw-r--r-- 1 root root 4096 Sep 29 15:17 ignore_nice_load [perftest1]/sys/devices/system/cpu/cpu0/cpufreq/ondemand $ more * :::::::::::::: ignore_nice_load :::::::::::::: 0 :::::::::::::: powersave_bias :::::::::::::: 0 :::::::::::::: sampling_rate :::::::::::::: 10000 :::::::::::::: sampling_rate_max :::::::::::::: 4294967295 :::::::::::::: sampling_rate_min :::::::::::::: 10000 :::::::::::::: up_threshold :::::::::::::: 95 [perftest1]/sys/devices/system/cpu/cpu0/cpufreq/ondemand $ </pre> </div></div> <p><b>View:</b></p> <ul class="alternate" type="square"> <li>scaling_governor: The CPU scaling governor is ondemand.</li> <li>cpuinfo_min_freq: The minimum CPU frequency is 1200000 kHz (1.2 GHz).</li> <li>cpuinfo_max_freq: The maximum CPU frequency is 2000000 kHz (2.0 GHz).</li> <li>sampling_rate: The kernel samples CPU usage every 10000 us (10 ms) to make decisions about CPU frequency.</li> <li>up_threshold: The kernel increases the CPU frequency if the average CPU usage between samples (every 10 ms) is higher than 95%.</li> </ul> atop(<span class="nobr"><a href="http://www.atoptool.nl/">http://www.atoptool.nl/<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" 
border="0"/></sup></a></span>) tool will be used to monitor CPU frequencies. <p>From the man page of atop:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>In case that the kernel module 'cpufreq_stats' is active (after issueing 'modprobe cpufreq_stats'), the average frequency ('avgf') and the average scaling percentage ('avgscal') is shown. Otherwise the current frequency ('curf') and the current scaling percentage ('curscal') is shown at the moment that the sample is taken. </pre> </div></div> <p>To compare CPU usage with the frequencies, the "cpufreq_stats" kernel module should be enabled. Otherwise, atop shows the current frequency at the moment each sample is taken, not the average over the sampling interval.</p> <b>METHOD:</b> <ul class="alternate" type="square"> <li>The tests will be run with the CPU scaling governor set to ondemand and then to performance.</li> <li>The same workload will be generated with HP's LoadRunner tool.</li> <li>The results will be compared.</li> </ul> <b>TEST1:</b> <p>CPU scaling governor is ondemand.</p> <p><b>An atop snapshot:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ATOP - avsprddbflx05 2015/09/29 15:40:21 --------- 10s elapsed PRC | sys 7.86s | user 80.03s | #proc 1347 | #tslpi 1791 | #tslpu 0 | #zombie 0 | no procacct | CPU | sys 66% | user 800% | irq 13% | idle 626% | wait 95% | avgf 1.63GHz | avgscal 81% | cpu | sys 4% | user 86% | irq 1% | idle 6% | cpu000 w 3% | avgf 1.94GHz | avgscal 96% | cpu | sys 4% | user 77% | irq 5% | idle 11% | cpu004 w 3% | avgf 1.90GHz | avgscal 94% | cpu | sys 4% | user 72% | irq 0% | idle 17% | cpu001 w 7% | avgf 1.83GHz | avgscal 91% | cpu | sys 3% | user 67% | irq 0% | idle 20% | cpu002 w 9% | avgf 1.80GHz | avgscal 90% | cpu | sys 3% | user 61% | irq 0% | idle 28% | cpu003 w 8% | avgf 1.73GHz | avgscal 86% | cpu | sys 6% | user 54% | irq 1% | idle 34% | cpu009 w 6% | avgf 1.62GHz | avgscal 
80% | cpu | sys 3% | user 52% | irq 1% | idle 36% | cpu005 w 8% | avgf 1.68GHz | avgscal 83% | cpu | sys 7% | user 47% | irq 1% | idle 29% | cpu008 w 16% | avgf 1.67GHz | avgscal 83% | cpu | sys 6% | user 45% | irq 0% | idle 48% | cpu013 w 1% | avgf 1.52GHz | avgscal 76% | cpu | sys 7% | user 39% | irq 1% | idle 53% | cpu015 w 1% | avgf 1.49GHz | avgscal 74% | cpu | sys 3% | user 41% | irq 0% | idle 49% | cpu006 w 7% | avgf 1.57GHz | avgscal 78% | cpu | sys 5% | user 34% | irq 0% | idle 52% | cpu010 w 9% | avgf 1.51GHz | avgscal 75% | cpu | sys 2% | user 35% | irq 1% | idle 55% | cpu007 w 7% | avgf 1.55GHz | avgscal 77% | cpu | sys 4% | user 32% | irq 0% | idle 63% | cpu014 w 1% | avgf 1.43GHz | avgscal 71% | cpu | sys 4% | user 31% | irq 0% | idle 59% | cpu011 w 6% | avgf 1.46GHz | avgscal 72% | cpu | sys 2% | user 26% | irq 0% | idle 68% | cpu012 w 3% | avgf 1.43GHz | avgscal 71% | CPL | avg1 5.33 | avg5 5.46 | avg15 4.57 | csw 377039 | intr 323539 | | numcpu 16 | MEM | tot 126.1G | free 38.8G | cache 3.6G | dirty 4.0M | buff 146.3M | slab 577.8M | | SWP | tot 17.1G | free 17.1G | | | | vmcom 12.9G | vmlim 42.6G | NET | transport | tcpi 20869 | tcpo 21016 | udpi 70875 | udpo 71067 | tcpao 33 | tcppo 1 | NET | network | ipi 128214 | ipo 92084 | ipfrw 0 | deliv 91742 | icmpi 0 | icmpo 0 | PID TID SYSCPU USRCPU VGROW RGROW RUID EUID THR ST EXC S CPU CMD 1/64 13661 - 0.15s 4.50s -24.0M -14.3M grid oracle 1 -- - R 47% oracle 15747 - 0.16s 3.80s 0K 684K grid oracle 1 -- - S 40% oracle 13733 - 0.32s 3.57s 32768K 31360K grid oracle 1 -- - R 39% oracle 27274 - 0.61s 3.21s 24576K 11976K grid oracle 1 -- - R 39% oracle 14869 - 0.17s 3.29s 0K -1880K grid oracle 1 -- - S 35% oracle </pre> </div></div> <p>The "CPU" line shows overall statistics for all CPUs.<br/> Each "cpu" line shows statistics for a single CPU.</p> <p><b>Analysis:</b></p> <ul class="alternate" type="square"> <li>Although the maximum CPU frequency is 2.0 GHz, the server could not use its full speed. 
It used average 1.63Ghz, which is 81% of full CPU speed.</li> <li>When CPU usage is 91%(sys:4+user:86+irq:1) at cpu000, it used average 1.94Ghz, which is 96% of full CPU speed.</li> <li>When CPU usage is 51%(sys:6+user:45+irq:0) at cpu013, it used average 1.52Ghz, which is 76% of full CPU speed.</li> <li>When CPU usage is 28%(sys:2+user:26+irq:0) at cpu012, it used average 1.43Ghz, which is 71% of full CPU speed.</li> </ul> <b>TEST2:</b> <p>CPU scaling governor is performance.</p> <p><b>An atop snapshot:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ATOP - avsprddbflx05 2015/09/29 16:16:27 --------- 10s elapsed PRC | sys 7.06s | user 86.40s | #proc 1313 | #tslpi 1756 | #tslpu 0 | #zombie 0 | no procacct | CPU | sys 57% | user 864% | irq 14% | idle 623% | wait 43% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 80% | irq 7% | idle 10% | cpu004 w 1% | avgf 2.00GHz | avgscal 100% | cpu | sys 5% | user 81% | irq 2% | idle 11% | cpu000 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 82% | irq 1% | idle 14% | cpu001 w 1% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 74% | irq 0% | idle 21% | cpu002 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 73% | irq 0% | idle 22% | cpu003 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 62% | irq 0% | idle 32% | cpu005 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 6% | user 58% | irq 1% | idle 26% | cpu008 w 10% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 51% | irq 0% | idle 43% | cpu006 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 47% | irq 0% | idle 47% | cpu010 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 3% | user 46% | irq 0% | idle 49% | cpu013 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 2% | user 44% | irq 1% | idle 50% | cpu007 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 6% | user 40% | irq 1% | idle 48% | cpu009 w 6% | avgf 2.00GHz | avgscal 100% | cpu | sys 6% | user 33% | irq 1% | idle 
57% | cpu011 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 2% | user 34% | irq 0% | idle 60% | cpu012 w 3% | avgf 2.00GHz | avgscal 100% | cpu | sys 4% | user 28% | irq 0% | idle 66% | cpu014 w 2% | avgf 2.00GHz | avgscal 100% | cpu | sys 2% | user 28% | irq 0% | idle 69% | cpu015 w 1% | avgf 2.00GHz | avgscal 100% | CPL | avg1 5.98 | avg5 6.41 | avg15 4.75 | csw 382133 | intr 340254 | | numcpu 16 | MEM | tot 126.1G | free 36.6G | cache 5.3G | dirty 28.9M | buff 193.0M | slab 836.2M | | SWP | tot 17.1G | free 17.1G | | | | vmcom 13.0G | vmlim 42.6G | NET | transport | tcpi 10272 | tcpo 10302 | udpi 75222 | udpo 75458 | tcpao 30 | tcppo 2 | NET | network | ipi 111030 | ipo 85760 | ipfrw 0 | deliv 85494 | icmpi 0 | icmpo 0 | PID TID SYSCPU USRCPU VGROW RGROW RUID EUID THR ST EXC S CPU CMD 1/62 15847 - 0.17s 4.59s 0K 896K grid oracle 1 -- - S 48% oracle 14867 - 0.14s 3.79s 0K 2220K grid oracle 1 -- - R 40% oracle 15835 - 0.15s 3.76s 8192K 384K grid oracle 1 -- - R 39% oracle 14871 - 0.24s 3.59s 0K 0K grid oracle 1 -- - R 39% oracle 15849 - 0.14s 3.55s 0K -1216K grid oracle 1 -- - R 37% oracle </pre> </div></div> <p><b>Analysis:</b></p> <ul class="alternate" type="square"> <li>The maximum CPU frequency is 2.0GHz and all CPUs could use 100% of full CPU speed.</li> </ul> <b>COMPARISON:</b> <p>Results of the 30-minute load tests:</p> <p>1st: When the CPU scaling governor is ondemand.<br/> 2nd: When the CPU scaling governor is performance.</p> <p><b>Data:</b></p> <p><ins>Top Activity:</ins></p> <p><img src="http://www.ubTools.com/jira/secure/attachment/13743/13743_EMTopActivity.png" align="absmiddle" border="0" /></p> <p><ins>AWR:</ins></p> <p><img src="http://www.ubTools.com/jira/secure/attachment/13744/13744_AWR.png" align="absmiddle" border="0" /></p> <p><b>Analysis:</b></p> <ul class="alternate" type="square"> <li>"row cache lock" wait time decreased because the lock holders did their work faster and therefore held the resources for a shorter time.</li> <li>DB time decreased 31.4%, mostly from the 
decrease in "row cache lock".</li> <li>Logical reads increased 16.3% since more buffer gets could be done at the higher CPU frequency.</li> </ul> <b>SUMMARY:</b> <p><ins>Analysis:</ins></p> <ul class="alternate" type="square"> <li>Changing the CPU scaling governor from "ondemand" to "performance" increased the performance.</li> <li>The performance improvement is noticeable when: <ul class="alternate" type="square"> <li>The difference between the minimum and maximum CPU frequencies is high.</li> <li>CPU usage is not heavy (up_threshold: 95%).</li> <li>There are sessions waiting for other sessions on CPU.</li> </ul> </li> </ul> <p><ins>Recommendations:</ins></p> <ul class="alternate" type="square"> <li>If performance is more important than heating, set the CPU scaling governor to "performance".</li> </ul> <b>CPU TIME and LOGICAL READS:</b> <p><b>Data:</b> </p> <table class='confluenceTable'><tbody> <tr> <th class='confluenceTh'>&nbsp;</th> <th class='confluenceTh'>ondemand</th> <th class='confluenceTh'>performance</th> <th class='confluenceTh'>Difference(%)</th> </tr> <tr> <td class='confluenceTd'>CPU time per second</td> <td class='confluenceTd'>7.8s</td> <td class='confluenceTd'>8.3s</td> <td class='confluenceTd'>6.4</td> </tr> <tr> <td class='confluenceTd'>Logical reads per second</td> <td class='confluenceTd'>535,236.2</td> <td class='confluenceTd'>622,625.2</td> <td class='confluenceTd'>16.3</td> </tr> <tr> <td class='confluenceTd'>CPU time per logical read</td> <td class='confluenceTd'>14.6us</td> <td class='confluenceTd'>13.3us</td> <td class='confluenceTd'>8.9</td> </tr> </tbody></table> <p><b>Analysis:</b> </p> <p>An 8.9% improvement in CPU time per logical read caused a 31.4% improvement in DB time. </p> The focus here is to show how the CPU scaling governor affects Oracle service and wait times, not to show how to tune Oracle events such as "row cache lock" above. Operating System Operating System Version 2.6.32-504.23.4.el6.x86_64 Product Version 11.2.0.3 Database Name . Host Name . 
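<p>The percentages in the CPU TIME and LOGICAL READS table can be re-derived from its raw values. The Python sketch below is illustrative only: the variable and function names are assumptions, and the input numbers are copied from the table.</p>

```python
# Raw values copied from the CPU TIME and LOGICAL READS table.
# All names here are illustrative; they are not part of atop or Oracle tooling.
CPU_SECONDS_PER_SECOND = {"ondemand": 7.8, "performance": 8.3}
LOGICAL_READS_PER_SECOND = {"ondemand": 535236.2, "performance": 622625.2}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def cpu_us_per_logical_read(governor):
    """CPU microseconds consumed per logical read under the given governor."""
    return (CPU_SECONDS_PER_SECOND[governor]
            / LOGICAL_READS_PER_SECOND[governor] * 1e6)


cpu_time_diff = pct_change(CPU_SECONDS_PER_SECOND["ondemand"],
                           CPU_SECONDS_PER_SECOND["performance"])
reads_diff = pct_change(LOGICAL_READS_PER_SECOND["ondemand"],
                        LOGICAL_READS_PER_SECOND["performance"])
us_ondemand = cpu_us_per_logical_read("ondemand")
us_performance = cpu_us_per_logical_read("performance")

# Improvement computed from the one-decimal figures, as shown in the table.
rounded_improvement = -pct_change(round(us_ondemand, 1), round(us_performance, 1))
# Improvement computed from the unrounded per-read values.
unrounded_improvement = -pct_change(us_ondemand, us_performance)

print(f"CPU time per second:       +{cpu_time_diff:.1f}%")
print(f"Logical reads per second:  +{reads_diff:.1f}%")
print(f"CPU us per logical read:   {us_ondemand:.1f} -> {us_performance:.1f}")
print(f"Improvement (rounded in):  {rounded_improvement:.1f}%")
print(f"Improvement (unrounded):   {unrounded_improvement:.1f}%")
```

<p>The 8.9 in the table follows from the rounded 14.6us and 13.3us figures; with the unrounded per-read values the improvement comes out at roughly 8.5%.</p>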
[QA-57] ORA-04030 returned by "__libc_sbrk(0x0000000001010020) Err#12 ENOMEM" http://jira.ubtools.com/jira/browse/QA-57 The customer encountered the following problem: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-04030: out of process memory when trying to allocate 2093096 bytes (QERHJ hash-joi,QERHJ list array) </pre> </div></div> QA-57 ORA-04030 returned by "__libc_sbrk(0x0000000001010020) Err#12 ENOMEM" Oracle - Operating System Major Closed Third-party Problem ubTools Support ubTools Support Mon, 2 Dec 2013 14:21:20 +0000 (UTC) Tue, 28 Feb 2017 09:01:55 +0000 (UTC) 0 <b>ANALYSIS 1:</b> <p><ins>PGASTAT:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select * from v$pgastat order by value; NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ maximum PGA used for manual workareas 0 bytes over allocation count 0 total PGA used for manual workareas 0 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ cache hit percentage 98.53 percent process count 126 max processes count 135 NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ recompute count (total) 132370 total PGA used for auto workareas 4399104 bytes total freeable PGA memory 106823680 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ maximum PGA used for auto workareas 153909248 bytes global memory bound 214743040 bytes total PGA inuse 747691008 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ total PGA allocated 1180690432 bytes aggregate PGA auto target 1265577984 bytes maximum PGA allocated 1299183616 bytes NAME VALUE 
---------------------------------------------------------------- ---------- UNIT ------------ aggregate PGA target parameter 2147483648 bytes extra bytes read/written 1.2622E+10 bytes PGA memory freed back to OS 6.0510E+10 bytes NAME VALUE ---------------------------------------------------------------- ---------- UNIT ------------ bytes processed 8.5171E+11 bytes 19 rows selected. SQL&gt; </pre> </div></div> <p>The <em>pga_aggregate_target</em> parameter is not exceeded.</p> <p><ins>HEAPDUMP:</ins></p> <p><ins>Set Up:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>To setup tracing to trap the ORA-4030, on the server use the following in SQL*Plus: SQL&gt; ALTER SYSTEM SET EVENTS '4030 trace name heapdump level 536870917;name errorstack level 3'; Once the error reoccurs with the event set, you can turn off tracing using the following command in SQL*Plus: ALTER SYSTEM SET EVENTS '4030 trace name context off; name context off'; </pre> </div></div> <p><em>Ref: Oracle note: Master Note for Diagnosing OS Memory Problems and ORA-4030 (Doc ID 1088267.1)</em></p> <p><ins>TRACE:</ins></p> <p>Heap:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>HEAP DUMP heap name="session heap" desc=11044a830 extent sz=0xff80 alt=32767 het=32767 rec=0 flg=2 opc=2 parent=1101981f0 owner=70000033f6789e8 nex=0 xsz=0x0 ..... Total heap size =108241256 </pre> </div></div> <p>Internal Parameters:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> _pga_max_size = 419420 KB ..... 
_smm_max_size = 209710 KB _smm_px_max_size = 1048576 KB </pre> </div></div> <p>No PGA limits are exceeded.</p> <b>ANALYSIS 2:</b> <p><ins>System Calls:</ins></p> <p><em>truss -fae -o &lt;outputFile&gt; -p &lt;V$PROCESS.SPID&gt;</em> excerpt:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>14483680: 43122723: __libc_sbrk(0x0000000001010020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FE0020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001004020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FE0020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001001020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FE0020) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001000420) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FDF420) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001000120) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FDF420) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000001000060) Err#12 ENOMEM 14483680: 43122723: __libc_sbrk(0x0000000000FDF420) Err#12 ENOMEM 14483680: 43122723: statx("/oracle/admin/ATSD/udump", 0x0FFFFFFFFFFF41A8, 176, 0) = 0 14483680: 43122723: close(5) = 0 14483680: 43122723: statx("/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc", 0x0FFFFFFFFFFF44C0, 176, 01) Err#2 ENOENT 14483680: 43122723: statx("/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc", 0x0FFFFFFFFFFF44C0, 176, 0) Err#2 ENOENT 14483680: 43122723: kopen("/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP) = 5 14483680: 43122723: kwrite(5, 0x0000000104A1C468, 0) = 0 14483680: 43122723: kwrite(5, " / o r a c l e / a d m i".., 47) = 47 </pre> </div></div> <p>When the ORA-4030 error occurred, the trace file <em>/oracle/admin/ATSD/udump/atsd2_ora_14483680.trc</em> was created. 
So, the problem occurred before the trace file was generated, at <em>__libc_sbrk</em>, which returned <em>ENOMEM</em>. The operating system could not give memory to the Oracle process.</p> <p><ins>User resource limits:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>oracle@atlasdb2:/home/oracle/dunal &gt;ulimit -a time(seconds) unlimited file(blocks) unlimited data(kbytes) unlimited stack(kbytes) unlimited memory(kbytes) unlimited coredump(blocks) unlimited nofiles(descriptors) unlimited threads(per process) unlimited processes(per user) unlimited oracle@atlasdb2:/home/oracle/dunal &gt; </pre> </div></div> <p>No limit was found for the oracle user.</p> The system admin will work on this problem. The solution will be added here. There was no response from the system admin. However, the problem was a resource limit that prevented the oracle user from allocating memory. Operating System Operating System Version 6.1 Product Version 10.2.0.4 Database Name . Host Name . [QA-55] deinstall tool drops database http://jira.ubtools.com/jira/browse/QA-55 <em>Oracle® Database Upgrade Guide 11g Release 2 (11.2) Part Number E23633-07</em> writes: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Known Issue with the Deinstallation Tool for This Release Cause: After upgrading from 11.2.0.1 or 11.2.0.2 to 11.2.0.3, deinstallation of the Oracle home in the earlier release of Oracle Database may result in the deletion of the old Oracle base that was associated with it. This may also result in the deletion of data files, audit files, etc., which are stored under the old Oracle base. Action: Before deinstalling the Oracle home in the earlier release, edit the orabase_cleanup.lst file found in the $Oracle_Home/utl directory and remove the "oradata" and "admin" entries. Then, deinstall the Oracle home using the 11.2.0.3 deinstallation tool. 
</pre> </div></div> <p>Ref: <span class="nobr"><a href="http://docs.oracle.com/cd/E11882_01/server.112/e23633/intro.htm#BHCEECDJ">http://docs.oracle.com/cd/E11882_01/server.112/e23633/intro.htm#BHCEECDJ<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span></p> <p>In our case:</p> <ul class="alternate" type="square"> <li>The oradata and admin entries were already absent from $ORACLE_HOME/utl/orabase_cleanup.lst.</li> <li>There were no database files in $ORACLE_BASE/oradata, only a soft link to the ASM disk that holds the database.</li> </ul> <p>Nevertheless, the <em>deinstall</em> tool dropped the database.</p> QA-55 deinstall tool drops database Oracle - Administration Major Closed Answered ubTools Support ubTools Support Tue, 26 Mar 2013 15:49:51 +0000 (UTC) Tue, 26 Mar 2013 15:52:54 +0000 (UTC) 0 Be careful when using deinstall. If you want to keep your database, don't use it until this problem is fixed. Operating System Product Version 11.2.0.3 Database Name . Host Name . [QA-54] Unable to close database by srvctl and racgimon takes 100% of CPU. http://jira.ubtools.com/jira/browse/QA-54 Unable to close the database: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>$ srvctl stop database -d ESIBASE PRKP-1002 : Error stopping instance ESIBASE1 on node ersteracsrv1 CRS-0216: Could not stop resource 'ora.ESIBASE.ESIBASE1.inst'. 
$ </pre> </div></div> <p>2 <em>racgimon</em> processes take 100% of CPU in <em>prstat</em> output:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP 8286 oracle 64 36 0.0 0.0 0.0 0.0 0.0 0.0 0 37 .15 0 racgimon/1 7903 oracle 65 35 0.0 0.0 0.0 0.0 0.0 0.0 0 35 .15 0 racgimon/1 10015 root 0.0 0.8 0.0 0.0 0.0 0.0 99 0.0 21 1 398 0 prstat/1 7818 oracle 0.2 0.0 0.0 0.0 0.0 0.0 100 0.0 62 1 7K 0 oracle/2 10055 root 0.0 0.1 0.0 0.0 0.0 0.0 100 0.0 7 0 318 0 sleep/1 816 root 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 30 1 275 0 init.cssd/1 1916 oracle 0.1 0.0 0.0 0.0 0.0 0.0 100 0.0 60 0 719 59 oracle/1 1878 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 170 0 694 1 oracle/1 1874 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 170 0 691 1 oracle/1 1872 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 81 0 401 31 oracle/1 1621 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 53 0 343 1 oracle/1 1894 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 21 0 54 0 oracle/2 1625 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 64 0 201 1 oracle/1 1623 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 64 0 201 1 oracle/1 1870 oracle 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 59 0 347 1 oracle/2 NPROC USERNAME SWAP RSS MEMORY TIME CPU 72 oracle 21G 21G 57% 0:07:37 26% 54 root 117M 180M 0.5% 0:00:06 0.2% 1 noaccess 136M 207M 0.6% 0:00:12 0.0% 6 daemon 6408K 7496K 0.0% 0:00:00 0.0% 1 smmsp 1136K 7244K 0.0% 0:00:00 0.0% Total: 134 processes, 439 lwps, load averages: 2.26, 1.82, 1.13 # </pre> </div></div> QA-54 Unable to close database by srvctl and racgimon takes 100% of CPU. 
Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Tue, 26 Mar 2013 14:47:28 +0000 (UTC) Tue, 26 Mar 2013 15:22:11 +0000 (UTC) 0 <b>ANALYSIS 1:</b> <p><ins><em>truss</em> output of one of the <em>racgimon</em> processes:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># truss -fae -p 8286 8286: close(346745079) Err#9 EBADF 8286: close(346745080) Err#9 EBADF 8286: close(346745081) Err#9 EBADF 8286: close(346745082) Err#9 EBADF 8286: close(346745083) Err#9 EBADF 8286: close(346745084) Err#9 EBADF 8286: close(346745085) Err#9 EBADF 8286: close(346745086) Err#9 EBADF # truss -faec -p 8286 psargs: /u01/app/oracle/product/10.2/bin/racgimon startd ESIBASE ^C syscall seconds calls errors close 2.374 1857265 1857265 -------- ------ ---- sys totals: 2.374 1857265 1857265 usr time: 1.079 elapsed: 23.090 # </pre> </div></div> <p><ins>Comment:</ins></p> <p><em>racgimon</em> could not close file descriptors. It repeatedly calls <em>close()</em> on file descriptor numbers that increase by 1 with each subsequent call.</p> <p>The <em>close()</em> system calls return <em>EBADF</em>, which means <em>The fildes argument is not a valid file descriptor.</em><br/> Ref: <span class="nobr"><a href="http://docs.oracle.com/cd/E23823_01/html/816-5167/close-2.html#REFMAN2close-2">http://docs.oracle.com/cd/E23823_01/html/816-5167/close-2.html#REFMAN2close-2<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span></p> <b>ANALYSIS 2:</b> <p><ins><em>prctl</em> output of <em>racgimon</em>:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># prctl 8286 process: 8286: /u01/app/oracle/product/10.2/bin/racgimon startd ESIBASE NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT process.max-port-events privileged 65.5K - deny - system 2.15G max deny - 
process.max-msg-messages privileged 8.19K - deny - system 4.29G max deny - process.max-msg-qbytes privileged 64.0KB - deny - system 16.0EB max deny - process.max-sem-ops privileged 512 - deny - system 2.15G max deny - process.max-sem-nsems privileged 512 - deny - system 32.8K max deny - process.max-address-space privileged 16.0EB max deny - system 16.0EB max deny - process.max-file-descriptor privileged 2.15G max deny - system 2.15G max deny - process.max-core-size basic 0B - deny 8286 system 8.00EB max deny - process.max-stack-size basic 10.0MB - deny 8286 privileged 125TB - deny - system 125TB max deny - process.max-data-size privileged 16.0EB max deny - system 16.0EB max deny - process.max-file-size privileged 8.00EB max deny,signal=XFSZ - system 8.00EB max deny - process.max-cpu-time privileged 18.4Es inf signal=XCPU - system 18.4Es inf none - task.max-cpu-time system 18.4Es inf none - task.max-lwps system 2.15G max deny - project.max-contracts privileged 10.0K - deny - system 2.15G max deny - project.max-device-locked-memory privileged 2.19GB - deny - system 16.0EB max deny - project.max-locked-memory system 16.0EB max deny - project.max-port-ids privileged 8.19K - deny - system 65.5K max deny - project.max-shm-memory privileged 24.0GB - deny - system 16.0EB max deny - project.max-shm-ids privileged 128 - deny - system 16.8M max deny - project.max-msg-ids privileged 128 - deny - system 16.8M max deny - project.max-sem-ids privileged 128 - deny - system 16.8M max deny - project.max-crypto-memory privileged 8.77GB - deny - system 16.0EB max deny - project.max-tasks system 2.15G max deny - project.max-lwps system 2.15G max deny - project.cpu-cap system 4.29G inf deny - project.cpu-shares privileged 1 - none - system 65.5K max none - zone.max-swap system 16.0EB max deny - zone.max-locked-memory system 16.0EB max deny - zone.max-shm-memory system 16.0EB max deny - zone.max-shm-ids system 16.8M max deny - zone.max-sem-ids system 16.8M max deny - zone.max-msg-ids 
system 16.8M max deny - zone.max-lwps system 2.15G max deny - zone.cpu-cap system 4.29G inf deny - zone.cpu-shares privileged 1 - none - system 65.5K max none - $ prctl -n process.max-file-descriptor -i process $$ process: 7615: -sh NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT process.max-file-descriptor basic 4.10K - deny 7615 system 2.15G max deny - $ </pre> </div></div> <p><ins>Comment:</ins></p> <p>For <em>racgimon</em>, the privileged value of <em>process.max-file-descriptor</em> was 2.15G descriptors, equal to the system maximum, and no explicit privileged value had been set for it.</p> <b>WORKAROUND:</b> <p>Set the privileged value explicitly, as in the example below:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># projmod -s -K "process.max-file-descriptor=(basic,4096,deny),(privileged,65536,deny)" 'user.oracle' </pre> </div></div> <p>After setting it, check as below:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>$ prctl -n process.max-file-descriptor -i process $$ process: 708: -sh NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT process.max-file-descriptor basic 4.10K - deny 708 privileged 65.5K - deny - system 2.15G max deny - $ </pre> </div></div> <p>For a similar problem in lower Oracle versions, see Oracle note <em>srvctl Slow or Fails to Start/Stop Database Instance and crsd.bin/racgmain/racgimon High CPU Usage &#91;ID 1457387.1&#93;</em>.</p> Operating System Product Version 11.2.0.3 Database Name . Host Name . [QA-53] Starting Listener Hangs with "TNS-12531: TNS:cannot allocate memory" in Listener Log http://jira.ubtools.com/jira/browse/QA-53 Starting the LISTENER hangs. 
The following errors appear as an infinite loop in the listener.log: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>02-JUL-2012 15:54:06 * 12531 TNS-12531: TNS:cannot allocate memory 02-JUL-2012 15:54:06 * 12531 TNS-12531: TNS:cannot allocate memory 02-JUL-2012 15:54:06 * 12531 TNS-12531: TNS:cannot allocate memory 02-JUL-2012 15:54:06 * 12531 TNS-12531: TNS:cannot allocate memory 02-JUL-2012 15:54:06 * 12531 TNS-12531: TNS:cannot allocate memory </pre> </div></div> QA-53 Starting Listener Hangs with "TNS-12531: TNS:cannot allocate memory" in Listener Log Oracle - SQL*Net Major Closed Answered ubTools Support ubTools Support Mon, 2 Jul 2012 13:52:09 +0000 (UTC) Thu, 12 Jul 2012 16:01:10 +0000 (UTC) 0 LISTENER trace enabled in listener.ora as below: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>TRACE_LEVEL_LISTENER = 16 TRACE_FILE_LISTENER = listener.trc TRACE_UNIQUE_LISTENER = TRUE TRACE_TIMESTAMP_LISTENER = TRUE </pre> </div></div> <p><em>listener.trc</em> was generated in $ORACLE_BASE/diag/tnslsnr/linux1/listener/trace/ as below:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>2012-07-02 15:55:03.847203 : snlinGetAddrInfo:entry 2012-07-02 15:55:03.847276 : snlinGetAddrInfo:getaddrinfo() failed with error -3 2012-07-02 15:55:03.847295 : snlinGetAddrInfo:exit 2012-07-02 15:55:03.847307 : nserror:entry 2012-07-02 15:55:03.847319 : nserror:nsres: id=0, op=65, ns=12531, ns2=0; nt[0]=0, nt[1]=0, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0 2012-07-02 15:55:03.847331 : nsmfr:entry 2012-07-02 15:55:03.847342 : nsmfr:1528 bytes at 0xa193a0 2012-07-02 15:55:03.847352 : nsmfr:normal exit 2012-07-02 15:55:03.847363 : nsopenmplx:error exit 2012-07-02 15:55:03.847373 : nsopen:unable to allocate context area 2012-07-02 15:55:03.847384 : nsopen:error exit 2012-07-02 15:55:03.847395 : 
nsanswer:error exit 2012-07-02 15:55:03.847411 : nsglhc:nsanswer error 12531 </pre> </div></div> <p>The problem appeared in the getaddrinfo() call.</p> An IPv4 address for the hostname was defined in <em>/etc/hosts</em>, but there was no IPv6 entry. <p>Even though only the IPv4 address was used in listener.ora, the problem occurred again.</p> <p>The problem disappeared after adding an IPv6 entry for the same hostname to <em>/etc/hosts</em>.</p> Operating System Product Version 11.2.0.3 Database Name . Host Name . [QA-52] "Transaction recovery: lock conflict caught and ignored" messages in ALERT LOG. http://jira.ubtools.com/jira/browse/QA-52 The customer encounters the following messages: <p>ALERT LOG:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored Transaction recovery: lock conflict caught and ignored ..... </pre> </div></div> <p>SMON TRACE:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... 
*** 2011-12-26 14:42:46.401 Serial Transaction recovery caught exception 30319 Serial Transaction recovery caught exception 601 *** 2011-12-26 14:46:25.455 Serial Transaction recovery caught exception 601 Serial Transaction recovery caught exception 601 Serial Transaction recovery caught exception 601 Serial Transaction recovery caught exception 601 ..... </pre> </div></div> <p>The customer said the messages started after SUPPLEMENTAL LOGGING was enabled. However, they did not disappear after it was disabled.</p> QA-52 "Transaction recovery: lock conflict caught and ignored" messages in ALERT LOG. Oracle - Administration Major Closed Answered ubTools Support ubTools Support Fri, 30 Dec 2011 12:56:45 +0000 (UTC) Mon, 16 Jan 2012 15:28:49 +0000 (UTC) 0 <b>DEAD TRANSACTIONS:</b> <p><ins>SQL:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>select b.name useg, b.inst# instid, b.status$ status, a.ktuxeusn xid_usn, a.ktuxeslt xid_slot, a.ktuxesqn xid_seq, a.ktuxesiz undoblocks, a.ktuxesta txstatus from x$ktuxe a, undo$ b where a.ktuxecfl like '%DEAD%' and a.ktuxeusn = b.us#; </pre> </div></div> <p><ins>Data:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>USEG INSTID STATUS XID_USN XID_SLOT XID_SEQ UNDOBLOCKS TXSTATUS _SYSSMU1209_1270276489$ 1 3 1209 3 1100382 3033 ACTIVE _SYSSMU1482_3325964579$ 2 2 1482 16 496322 0 INACTIVE _SYSSMU1681_4095893383$ 2 2 1681 5 472365 0 INACTIVE _SYSSMU2072_3213080551$ 2 2 2072 2 120912 0 INACTIVE </pre> </div></div> <p><ins>Definition:</ins></p> <ul class="alternate" type="square"> <li>Transaction id: <em>XID_USN.XID_SLOT.XID_SEQ</em></li> </ul> <p><ins>Comment:</ins></p> <ul class="alternate" type="square"> <li>There is an active dead transaction in the _SYSSMU1209_1270276489$ undo segment.</li> <li>The dead transaction id is 1209.3.1100382, which is 0x04B9.003.0010CA5E in hexadecimal.</li> </ul> 
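<p>The decimal-to-hexadecimal conversion of the dead transaction id can be reproduced with a short Python sketch. The function names are illustrative assumptions, and the field widths (4, 3 and 8 hex digits) simply mirror the 0x04B9.003.0010CA5E notation used in the comment above.</p>

```python
def xid_to_hex(usn, slot, seq):
    """Format a decimal XID_USN.XID_SLOT.XID_SEQ triple in hex notation.

    The widths are an assumption taken from the notation in this ticket.
    """
    return "0x{:04x}.{:03x}.{:08x}".format(usn, slot, seq)


def xid_from_hex(text):
    """Parse a hex transaction id such as 0x04B9.003.0010CA5E into decimals."""
    body = text[2:] if text.lower().startswith("0x") else text
    usn, slot, seq = body.split(".")
    return int(usn, 16), int(slot, 16), int(seq, 16)


# Values from the DEAD TRANSACTIONS query output for _SYSSMU1209_1270276489$.
dead_xid_hex = xid_to_hex(1209, 3, 1100382)
print(dead_xid_hex)
print(xid_from_hex("0x04B9.003.0010CA5E"))
```

<p>Having the hexadecimal form at hand makes it easier to search for the same transaction in undo header and undo block dumps, where the id is printed in hex.</p>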
<b>UNDO HEADER:</b> <p><b>Reading Transaction Table in the UNDO header:</b></p> <p><ins>SQL:</ins></p> <ul class="alternate" type="square"> <li>SQL&gt; ALTER SYSTEM DUMP UNDO HEADER '_SYSSMU1209_1270276489$';</li> </ul> <p><ins>Data:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... TRN TBL:: index state cflags wrap# uel scn dba parent-xid nub stmt_num cmt ------------------------------------------------------------------------------------------------ 0x00 9 0x00 0x10ca61 0x001a 0x001d.ac01b6ea 0x04c30103 0x0000.000.00000000 0x00000001 0x00000000 1324239683 0x01 9 0x00 0x10c8f0 0x000e 0x001d.abff4525 0x04c30043 0x0000.000.00000000 0x00000001 0x00000000 1324239603 0x02 9 0x00 0x10ca1f 0x0005 0x001d.abfad814 0x00c2e328 0x0000.000.00000000 0x00000003 0x00000000 1324239446 0x03 10 0x90 0x10ca5e 0x0002 0x001d.ab7efd87 0x00c0f5ea 0x0000.000.00000000 0x00000bd9 0x04c1f938 0 0x04 9 0x00 0x10c46d 0x000a 0x001d.abda8e80 0x04c28e5f 0x0000.000.00000000 0x00000001 0x00000000 1324238461 0x05 9 0x00 0x10c91c 0x0015 0x001d.abfadb90 0x00c2e32d 0x0000.000.00000000 0x00000001 0x00000000 1324239447 0x06 9 0x00 0x10cdbb 0x001d 0x001d.abd50f70 0x04c28e80 0x0000.000.00000000 0x00000001 0x00000000 1324238283 0x07 9 0x00 0x10c77a 0x0004 0x001d.abd90c5b 0x00c29cd8 0x0000.000.00000000 0x00000001 0x00000000 1324238409 0x08 9 0x00 0x10c229 0x0020 0x001d.abe1de1e 0x00c2a8d2 0x0000.000.00000000 0x00000001 0x00000000 1324238704 0x09 9 0x00 0x10ca28 0x0006 0x001d.abd4dfb6 0x04c28e5f 0x0000.000.00000000 0x00000001 0x00000000 1324238278 0x0a 9 0x00 0x10c6b7 0x0008 0x001d.abe1c7f3 0x00c2a8c2 0x0000.000.00000000 0x00000001 0x00000000 1324238701 0x0b 9 0x00 0x10c9e6 0x0017 0x001d.abfdbd74 0x04c30007 0x0000.000.00000000 0x00000001 0x00000000 1324239554 0x0c 9 0x00 0x10cb45 0x0011 0x001d.abfc5eea 0x04c2ff9d 0x0000.000.00000000 0x00000001 0x00000000 1324239502 0x0d 9 0x00 0x10c444 0x001c 0x001d.abca9d1f 0x00c22bc1 
0x0000.000.00000000 0x00000001 0x00000000 1324237948 0x0e 9 0x00 0x10c7e3 0x0000 0x001d.abffbb9a 0x04c3005f 0x0000.000.00000000 0x00000001 0x00000000 1324239618 0x0f 9 0x00 0x10ca72 0x0007 0x001d.abd82320 0x00c29c21 0x0000.000.00000000 0x00000001 0x00000000 1324238375 0x10 9 0x00 0x10c501 0x001f 0x001d.abf33edd 0x00c2e03f 0x0000.000.00000000 0x00000001 0x00000000 1324239208 0x11 9 0x00 0x10ca90 0x000b 0x001d.abfdbc34 0x04c30004 0x0000.000.00000000 0x00000001 0x00000000 1324239554 0x12 9 0x00 0x10c2ef 0x0018 0x001d.ac09d85c 0x04c3036f 0x0000.000.00000000 0x00000001 0x00000000 1324239959 0x13 9 0x00 0x10c8ae 0x0010 0x001d.abe83852 0x04c2ea99 0x0000.000.00000000 0x00000001 0x00000000 1324238911 0x14 9 0x00 0x10c5ad 0x0016 0x001d.abd3e99a 0x04c28de0 0x0000.000.00000000 0x00000001 0x00000000 1324238242 0x15 9 0x00 0x10c62c 0x000c 0x001d.abfb4d1a 0x04c2ff8d 0x0000.000.00000000 0x00000001 0x00000000 1324239464 0x16 9 0x00 0x10c72b 0x001b 0x001d.abd4d238 0x04c28e4e 0x0000.000.00000000 0x00000001 0x00000000 1324238274 0x17 9 0x00 0x10c2da 0x0001 0x001d.abff0f85 0x04c30029 0x0000.000.00000000 0x00000001 0x00000000 1324239598 0x18 9 0x00 0x10c589 0xffff 0x001d.ad480910 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1324254620 0x19 9 0x00 0x10c628 0x000d 0x001d.abca6bcb 0x00c22ba5 0x0000.000.00000000 0x00000001 0x00000000 1324237944 0x1a 9 0x00 0x10c4a7 0x0012 0x001d.ac04e46b 0x04c30232 0x0000.000.00000000 0x00000001 0x00000000 1324239791 0x1b 9 0x00 0x10c2e6 0x0009 0x001d.abd4df89 0x04c28e5c 0x0000.000.00000000 0x00000001 0x00000000 1324238277 0x1c 9 0x00 0x10c755 0x0014 0x001d.abcb5957 0x04c28b14 0x0000.000.00000000 0x00000001 0x00000000 1324237971 0x1d 9 0x00 0x10cd54 0x0021 0x001d.abd6f01b 0x00c29b7b 0x0000.000.00000000 0x00000001 0x00000000 1324238343 0x1e 9 0x00 0x10c5e3 0x0019 0x001d.abca546a 0x00c22b90 0x0000.000.00000000 0x00000001 0x00000000 1324237940 0x1f 9 0x00 0x10c232 0x0002 0x001d.abf7fc92 0x00c2e1be 0x0000.000.00000000 0x00000001 0x00000000 1324239355 
0x20 9 0x00 0x10c391 0x0013 0x001d.abe5ff89 0x04c2e999 0x0000.000.00000000 0x00000001 0x00000000 1324238832 0x21 9 0x00 0x10cc70 0x000f 0x001d.abd77e3c 0x00c29bd6 0x0000.000.00000000 0x00000001 0x00000000 1324238361 EXT TRN CTL:: usn: 1209 ..... </pre> </div></div> <p><ins>Definitions:</ins></p> <ul class="alternate" type="square"> <li>A <em>state</em> of 10 means an active transaction.</li> <li><em>dba</em> points to the address of the starting UNDO block.</li> <li><em>usn</em> is the undo segment number.</li> <li><em>usn.index.wrap#</em> gives the transaction ID.</li> </ul> <p><ins>Comment:</ins></p> <p>Slot 0x03 holds an active transaction, 0x04b9.003.0010ca5e (usn 1209 is 0x04b9 in hexadecimal). Its dba is 0x00c0f5ea, which is 12645866 in decimal.</p> <b>UNDO BLOCK:</b> <p><b>Reading UNDO Block:</b></p> <p><ins>SQL:</ins></p> <ul class="alternate" type="square"> <li>fileID: select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(12645866) from x$dual;</li> <li>blockID: select DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(12645866) from x$dual;</li> <li>alter system dump datafile &lt;fileID&gt; block &lt;blockID&gt;;</li> </ul> <p><ins>Data:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... 
UNDO BLK:
xid: 0x04b9.003.0010ca5e  seq: 0x1447  cnt: 0x2e  irb: 0x2c  icl: 0x0  flg: 0x0000
 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x1f8c     0x02 0x1dac     0x03 0x1d3c     0x04 0x1ccc     0x05 0x1c64
0x06 0x1c0c     0x07 0x1b7c     0x08 0x1b0c     0x09 0x1a9c     0x0a 0x1a24
0x0b 0x19cc     0x0c 0x183c     0x0d 0x17cc     0x0e 0x175c     0x0f 0x16e4
0x10 0x168c     0x11 0x15fc     0x12 0x158c     0x13 0x151c     0x14 0x14b4
0x15 0x145c     0x16 0x12f4     0x17 0x1284     0x18 0x1214     0x19 0x11ac
0x1a 0x1154     0x1b 0x0f9c     0x1c 0x0f2c     0x1d 0x0ebc     0x1e 0x0e44
0x1f 0x0dec     0x20 0x0c3c     0x21 0x0bcc     0x22 0x0b5c     0x23 0x0af4
0x24 0x0a9c     0x25 0x08c4     0x26 0x0854     0x27 0x07e4     0x28 0x076c
0x29 0x0714     0x2a 0x0604     0x2b 0x022c     0x2c 0x01c4     0x2d 0x0154
0x2e 0x00e4
..... </pre> </div></div> <p><ins>Definitions:</ins></p> <ul class="alternate" type="square"> <li><em>irb</em> points to the last UNDO RECORD in the UNDO block.</li> <li><em>rci</em> points to the previous UNDO RECORD. If <em>rci</em> is 0, the record is the first UNDO RECORD.</li> <li>Recovery starts from <em>irb</em> and follows the <em>rci</em> chain until <em>rci</em> is zero.</li> </ul> <p><ins>Comment:</ins></p> <ul class="alternate" type="square"> <li>Recovery of the transaction 0x04b9.003.0010ca5e starts from UNDO RECORD 0x2c, the <em>irb</em> value.</li> </ul> <b>UNDO RECORDS:</b> <p><b>Reading UNDO Records:</b></p> <p><ins>Data:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... *----------------------------- * Rec #0x2c slt: 0x03 objn: 939468(0x000e55cc) objd: 941274 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x2b ..... *----------------------------- * Rec #0x2b slt: 0x03 objn: 939468(0x000e55cc) objd: 941274 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x2a ..... *----------------------------- * Rec #0x2a slt: 0x03 objn: 939468(0x000e55cc) objd: 941274 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x29 ..... 
*----------------------------- * Rec #0x29 slt: 0x03 objn: 1126679(0x00113117) objd: 1126679 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x28 ..... *----------------------------- * Rec #0x28 slt: 0x03 objn: 1123018(0x001122ca) objd: 1123018 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x27 ..... *----------------------------- * Rec #0x27 slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x26 ..... *----------------------------- * Rec #0x26 slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x25 ..... *----------------------------- * Rec #0x25 slt: 0x03 objn: 939450(0x000e55ba) objd: 939450 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x24 ..... *----------------------------- * Rec #0x24 slt: 0x03 objn: 1126696(0x00113128) objd: 1126696 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x23 ..... *----------------------------- * Rec #0x23 slt: 0x03 objn: 1123035(0x001122db) objd: 1123035 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x22 ..... *----------------------------- * Rec #0x22 slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x21 ..... *----------------------------- * Rec #0x21 slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x20 ..... *----------------------------- * Rec #0x20 slt: 0x03 objn: 939408(0x000e5590) objd: 941229 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x1f ..... *----------------------------- * Rec #0x1f slt: 0x03 objn: 1126655(0x001130ff) objd: 1126655 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x1e ..... *----------------------------- * Rec #0x1e slt: 0x03 objn: 1122994(0x001122b2) objd: 1122994 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x1d ..... 
*----------------------------- * Rec #0x1d slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x1c ..... *----------------------------- * Rec #0x1c slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x1b ..... *----------------------------- * Rec #0x1b slt: 0x03 objn: 939429(0x000e55a5) objd: 941242 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x1a ..... *----------------------------- * Rec #0x1a slt: 0x03 objn: 1126678(0x00113116) objd: 1126678 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x19 ..... *----------------------------- * Rec #0x19 slt: 0x03 objn: 1123017(0x001122c9) objd: 1123017 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x18 ..... *----------------------------- * Rec #0x18 slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x17 ..... *----------------------------- * Rec #0x17 slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x16 ..... *----------------------------- * Rec #0x16 slt: 0x03 objn: 939466(0x000e55ca) objd: 941272 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x15 ..... *----------------------------- * Rec #0x15 slt: 0x03 objn: 1126681(0x00113119) objd: 1126681 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x14 ..... *----------------------------- * Rec #0x14 slt: 0x03 objn: 1123020(0x001122cc) objd: 1123020 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x13 ..... *----------------------------- * Rec #0x13 slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x12 ..... *----------------------------- * Rec #0x12 slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x11 ..... 
*----------------------------- * Rec #0x11 slt: 0x03 objn: 939420(0x000e559c) objd: 941236 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x10 ..... *----------------------------- * Rec #0x10 slt: 0x03 objn: 1126647(0x001130f7) objd: 1126647 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x0f ..... *----------------------------- * Rec #0xf slt: 0x03 objn: 1122986(0x001122aa) objd: 1122986 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x0e ..... *----------------------------- * Rec #0xe slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x0d ..... *----------------------------- * Rec #0xd slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x0c ..... *----------------------------- * Rec #0xc slt: 0x03 objn: 939418(0x000e559a) objd: 941235 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x0b ..... *----------------------------- * Rec #0xb slt: 0x03 objn: 1126653(0x001130fd) objd: 1126653 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x0a ..... *----------------------------- * Rec #0xa slt: 0x03 objn: 1122992(0x001122b0) objd: 1122992 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x09 ..... *----------------------------- * Rec #0x9 slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x08 ..... *----------------------------- * Rec #0x8 slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x07 ..... *----------------------------- * Rec #0x7 slt: 0x03 objn: 939438(0x000e55ae) objd: 941251 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x06 ..... *----------------------------- * Rec #0x6 slt: 0x03 objn: 1126696(0x00113128) objd: 1126696 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x05 ..... 
*----------------------------- * Rec #0x5 slt: 0x03 objn: 1123035(0x001122db) objd: 1123035 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x04 ..... *----------------------------- * Rec #0x4 slt: 0x03 objn: 1162285(0x0011bc2d) objd: 1162285 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x03 ..... *----------------------------- * Rec #0x3 slt: 0x03 objn: 1162273(0x0011bc21) objd: 1162273 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x02 ..... *----------------------------- * Rec #0x2 slt: 0x03 objn: 939448(0x000e55b8) objd: 939448 tblspc: 9(0x00000009) * Layer: 11 (Row) opc: 1 rci 0x01 ..... *----------------------------- * Rec #0x1 slt: 0x03 objn: 1126675(0x00113113) objd: 1126675 tblspc: 9(0x00000009) * Layer: 10 (Index) opc: 22 rci 0x00 ..... KDO Op code: LMN row dependencies Disabled ..... </pre> </div></div> <p><ins>Definitions:</ins></p> <ul class="alternate" type="square"> <li><em>objn</em> is the object ID.</li> </ul> <p><ins>Comment:</ins></p> <ul class="alternate" type="square"> <li>The objects that need recovery: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>select * from dba_objects where object_id in (939468,1126679,1123018,1162285,1162273,939450,1126696,1123035,939408, 1126655,1122994,939429,1126678,1123017,939466,1126681, 1123020,939420,1126647,1122986,939418,1126653,1122992,939438,939448,1126675); </pre> </div></div></li> <li>The first UNDO record (Rec #0x1) contains an <em>LMN</em> op code, which matches the following bug description:<br/> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>When running RAC and compatible 11.1 or higher, SMON could fail to recover transactions which had undo records for supplemental logging. (1) SMON is spinning (2) Must be RAC and compatible 11.1 or higher (3) Supplemental logging must have been enabled. If so, dump the undo for the transaction mentioned. If the records show LMN entries, it is this bug. 
</pre> </div></div><br/> <em>Ref: Bug 9489626 ORA-600 <span class="error">&#91;4464&#93;</span> in RAC and SMON spins on cpu for a table with supplemental logging</em></li> </ul> <b>ACTIONS:</b> <p><ins>Bug:</ins></p> <p>This problem is Oracle Bug:9857702:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... Affects: Product (Component) Oracle Server (Rdbms) Range of versions believed to be affected Versions &gt;= 11.1 but BELOW 12.1 Versions confirmed as being affected •11.2.0.1 •11.1.0.7 Platforms affected Generic (all / most platforms affected) Fixed: This issue is fixed in •12.1 (Future Release) •11.2.0.2 (Server Patch Set) •11.1.0.7.8 Patch Set Update •11.1.0.7 Patch 40 on Windows Platforms ..... </pre> </div></div> <p><em>Ref: Bug 9857702 ORA-600 <span class="error">&#91;4464&#93;</span> / ORA-600 <span class="error">&#91;4139&#93;</span> by ROLLBACK for a table with supplemental logging enabled</em></p> <p><ins>Workaround:</ins></p> <ul class="alternate" type="square"> <li>Recreate objects that need recovery.</li> </ul> Waiting for the customer action. The customer dropped the identified objects, and the problem disappeared. Operating System Operating System Version B.11.31 Product Version 11.2.0.1.0 (RAC) Database Name . Host Name . [QA-50] PRVF-5410 : Check of common NTP Time Server failed, PRVF-5416 : Query of NTP daemon failed on all nodes http://jira.ubtools.com/jira/browse/QA-50 <ins>Errors:</ins> <p>The customer encountered the following errors in CVU:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>./cluvfy stage -pre crsinst -n detrac1,detrac2 -verbose ..... NTP common Time Server Check started... PRVF-5410 : Check of common NTP Time Server failed PRVF-5416 : Query of NTP daemon failed on all nodes Result: Clock synchronization check using Network Time Protocol(NTP) failed ..... 
</pre> </div></div> <p><ins>NTP:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># ntpq -p remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; LOCAL(0) &lt;REMOVED&gt; # </pre> </div></div> <p>Same on both nodes.</p> <p><ins>CVU log:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... [978@detrac1] [main] [ 2011-05-13 17:09:51.490 EEST ] [TaskNTP.getTimeServerInfo:838] Output from NTP query command on node detrac1 is = remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; LOCAL(0) &lt;REMOVED&gt; [978@detrac1] [main] [ 2011-05-13 17:09:51.492 EEST ] [TaskNTP.getTimeServerInfo:864] Parsing of NTP query output line FAILED. Line= *&lt;REMOVED&gt; LOCAL(0) &lt;REMOVED&gt; [978@detrac1] [main] [ 2011-05-13 17:09:51.492 EEST ] [TaskNTP.getTimeServerInfo:880] NTP query on node detrac1 did NOT produce valid output. [978@detrac1] [main] [ 2011-05-13 17:09:51.492 EEST ] [TaskNTP.getTimeServerInfo:838] Output from NTP query command on node detrac2 is = remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; LOCAL(0) &lt;REMOVED&gt; [978@detrac1] [main] [ 2011-05-13 17:09:51.493 EEST ] [TaskNTP.getTimeServerInfo:864] Parsing of NTP query output line FAILED. Line= *&lt;REMOVED&gt; LOCAL(0) &lt;REMOVED&gt; [978@detrac1] [main] [ 2011-05-13 17:09:51.494 EEST ] [TaskNTP.getTimeServerInfo:880] NTP query on node detrac2 did NOT produce valid output. ..... 
</pre> </div></div> <p>Ref: $CVU_HOME/cv/log/cvutrace.log.0</p> QA-50 PRVF-5410 : Check of common NTP Time Server failed, PRVF-5416 : Query of NTP daemon failed on all nodes Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Fri, 13 May 2011 17:27:21 +0000 (UTC) Fri, 13 May 2011 18:07:23 +0000 (UTC) 0 <ins>Action:</ins><br/> The Network Administrator set an IP to <em>refid</em> for NTP. <p><ins>NTP:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># ntpq -p remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; # </pre> </div></div> <p><ins>CVU Log:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... [967@detrac1] [main] [ 2011-05-13 19:21:19.426 EEST ] [TaskNTP.getTimeServerInfo:838] Output from NTP query command on node detrac1 is = remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.433 EEST ] [TimeServerNode.addDataToNode:66] TimeServerNode:addDataToNode():Parsing line: *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.434 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[0]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.434 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[1]=72.14.188.52 [967@detrac1] [main] [ 2011-05-13 19:21:19.434 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[2]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.435 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[3]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.435 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[4]=&lt;REMOVED&gt; 
[967@detrac1] [main] [ 2011-05-13 19:21:19.436 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[5]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.436 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[6]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.437 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[7]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.437 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[8]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.438 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[9]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.438 EEST ] [TaskNTP.getTimeServerInfo:838] Output from NTP query command on node detrac2 is = remote refid st t when poll reach delay offset disp ============================================================================== *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.439 EEST ] [TimeServerNode.addDataToNode:66] TimeServerNode:addDataToNode():Parsing line: *&lt;REMOVED&gt; 72.14.188.52 &lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.440 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[0]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.440 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[1]=72.14.188.52 [967@detrac1] [main] [ 2011-05-13 19:21:19.441 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[2]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.441 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[3]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.441 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[4]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.442 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[5]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.442 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[6]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.443 EEST ] 
[TimeServerNode.addDataToNode:79] Parsed Value[7]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.443 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[8]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.444 EEST ] [TimeServerNode.addDataToNode:79] Parsed Value[9]=&lt;REMOVED&gt; [967@detrac1] [main] [ 2011-05-13 19:21:19.444 EEST ] [TaskNTP.doTimeServerCheck:736] tsId=72.14.188.52; tServer ..... </pre> </div></div> <p>CVU could now parse the <em>ntpq</em> output.</p> <p><ins>CVU Output:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>.....
NTP common Time Server Check started...
NTP Time Server "72.14.188.52" is common to all nodes on which the NTP daemon is running
Check of common NTP Time Server passed
Clock time offset check from NTP Time Server started...
Checking on nodes "[detrac1, detrac2]"...
Check: Clock time offset from NTP Time Server
Time Server: 72.14.188.52
Time Offset Limit: 1000.0 msecs
  Node Name     Time Offset               Status
  ------------  ------------------------  ------------------------
  detrac1       -2.332                    passed
  detrac2       -2.842                    passed
Time Server "72.14.188.52" has time offsets that are within permissible limits for nodes "[detrac1, detrac2]".
Clock time offset check passed
Result: Clock synchronization check using Network Time Protocol(NTP) passed
.....
</pre> </div></div> <b>Solution:</b> <p>The network administrator changed the NTP configuration so that <em>refid</em> is an IP address (72.14.188.52) instead of LOCAL(0).</p> Operating System Operating System Version 10 Product Version CVU 11g Database Name . Host Name . [QA-49] ORA-4031: High Allocation for "Oracle Text Commit new id" in Shared Pool. http://jira.ubtools.com/jira/browse/QA-49 The customer encountered ORA-4031, and a trace file was generated. The SGA is managed by ASMM. The application uses Oracle Text. QA-49 ORA-4031: High Allocation for "Oracle Text Commit new id" in Shared Pool. 
Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Fri, 5 Nov 2010 22:01:28 +0000 (UTC) Fri, 5 Nov 2010 22:47:51 +0000 (UTC) 0 <b>Analysis of the Trace:</b> <p><ins>The Requested SUBPOOL:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... ================================= Begin 4031 Diagnostic Information ================================= ..... HEAP DUMP heap name="sga heap(3,0)" desc=380043660 extent sz=0xfe0 alt=216 het=32767 rec=9 flg=-126 opc=0 parent=0 owner=0 nex=0 xsz=0x1000000 latch set 3 of 4 durations enabled for this heap reserved granules for root 0 (granule size 16777216) ..... </pre> </div></div> <p>The allocation was requested from <em>sga heap(3,0)</em>, which is <em>(SUBPOOL:3,DURATION:0)</em>.</p> <p><ins>All SUBPOOLS and Their DURATION Memories:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... 
HEAP DUMP heap name="sga heap(1,0)" desc=380030610 Total heap size =218102664 Total free space = 1066928 Total reserved free space = 8439520 Unpinned space = 38812528 rcr=11971 trn=17906 Permanent space =208595160 HEAP DUMP heap name="sga heap(1,1)" desc=380031e68 Total heap size = 67108512 Total free space = 2912528 Total reserved free space = 1382816 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(1,2)" desc=3800336c0 Total heap size =167771280 Total free space = 92743480 Total reserved free space = 3852856 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(1,3)" desc=380034f18 Total heap size =268434048 Total free space = 74547592 Total reserved free space = 13497472 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(2,0)" desc=380039e38 Total heap size =201325536 Total free space = 17200 Total reserved free space = 8435920 Unpinned space = 26474112 rcr=7934 trn=8094 Permanent space =192871456 HEAP DUMP heap name="sga heap(2,1)" desc=38003b690 Total heap size = 83885640 Total free space = 48723768 Total reserved free space = 1035792 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(2,2)" desc=38003cee8 Total heap size =369096816 Total free space =258674312 Total reserved free space = 16982464 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(2,3)" desc=38003e740 Total heap size =218102664 Total free space = 17202608 Total reserved free space = 10966696 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(3,0)" desc=380043660 Total heap size =184548408 Total free space = 13008 Total reserved free space = 5061928 Unpinned space = 26943408 rcr=4930 trn=9425 Permanent space =179472608 HEAP DUMP heap name="sga heap(3,1)" desc=380044eb8 Total heap size = 67108512 Total free space = 27568352 Total reserved free space = 4744 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap 
name="sga heap(3,2)" desc=380046710 Total heap size =352319688 Total free space =233302736 Total reserved free space = 15981216 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(3,3)" desc=380047f68 Total heap size =385873944 Total free space =143746536 Total reserved free space = 19402616 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(4,0)" desc=38004ce88 Total heap size =184548408 Total free space = 8616 Total reserved free space = 7592328 Unpinned space = 28725496 rcr=8459 trn=9864 Permanent space =176946600 HEAP DUMP heap name="sga heap(4,1)" desc=38004e6e0 Total heap size = 83885640 Total free space = 33356784 Total reserved free space = 1189120 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(4,2)" desc=38004ff38 Total heap size =335542560 Total free space =238988592 Total reserved free space = 16293768 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 HEAP DUMP heap name="sga heap(4,3)" desc=380051790 Total heap size =721416504 Total free space =445595432 Total reserved free space = 33743680 Unpinned space = 0 rcr=0 trn=0 Permanent space = 0 ..... </pre> </div></div> <p>All PERMANENT SPACE was allocated in DURATION 0. Although the other DURATIONS of SUBPOOL 3, <em>(3,1),(3,2),(3,3)</em>, have plenty of free space, DURATION 0 cannot allocate from them.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... duration memory (duration 0) cannot take free memory from other durations within the same subpool. It can only get more memory by being given a new complete EXTENT (granule) from the granule management code. ..... 
</pre> </div></div> <p><em>Ref: Oracle Bug 9911213: ORA-04031 AFTER APPLYING 10.2.0.4 PATCSHET</em></p> <p>Because the DB_CACHE_SIZE parameter set a lower limit for the BUFFER CACHE, the SHARED POOL could not grow by allocating a new EXTENT, and ORA-4031 was raised.</p> <p><ins>SUBPOOL Allocations:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>.....
==============================
Memory Utilization of Subpool 1
================================
Allocation Name Size
_________________________ __________
"free memory " 215299680
.....
"sql area " 151923248
.....
"Oracle Text Commit new id" 399237696
.....
"library cache " 30711448
.....
==============================
Memory Utilization of Subpool 2
================================
Allocation Name Size
_________________________ __________
"free memory " 367295736
.....
"sql area " 160984248
.....
"Oracle Text Commit new id" 392833064
.....
"library cache " 35069800
.....
==============================
Memory Utilization of Subpool 3
================================
Allocation Name Size
_________________________ __________
"free memory " 450731968
.....
"sql area " 182415376
.....
"Oracle Text Commit new id" 417149240
.....
"library cache " 39156336
.....
==============================
Memory Utilization of Subpool 4
================================
Allocation Name Size
_________________________ __________
"free memory " 781766288
.....
"sql area " 156513808
.....
"Oracle Text Commit new id" 410783408
.....
"library cache " 31300664
</pre> </div></div> <p>The total size of <em>Oracle Text Commit new id</em> across the four subpools is about 1.5GB <em>(399237696+392833064+417149240+410783408 = 1620003408 bytes)</em>. 
This is high: in each subpool it is more than twice the size of <em>sql area</em>.</p> <b><em>Oracle Text Commit new id</em> Allocation Trend:</b> <p><ins>An Excerpt from SGA Stat:</ins></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select a.instance_number,begin_interval_time, bytes from dba_hist_sgastat a, dba_hist_snapshot b
  2  where pool='shared pool' and
  3  a.snap_id=b.snap_id and
  4  a.instance_number=b.instance_number and
  5  name='Oracle Text Commit new id'
  6  order by begin_interval_time;
.....
1 06/10/2010 01:00:07,750 352864368
1 06/10/2010 02:00:55,107 353711568
.....
1 12/10/2010 11:00:12,212 448444792
1 12/10/2010 12:00:27,412 449299672
1 12/10/2010 13:00:12,435 450157752
1 12/10/2010 14:00:19,294 450179512
.....
1 04/11/2010 14:31:10,604 1622639416
1 04/11/2010 14:40:18,341 1623339552
1 04/11/2010 14:50:28,971 1623879936
1 04/11/2010 15:00:40,721 1623880712
722 rows selected.
SQL&gt; </pre> </div></div> <p>The <em>Oracle Text Commit new id</em> allocation grew steadily in small increments, from about 353 MB on 06/10/2010 to about 1.6 GB on 04/11/2010.</p> <b>Summary:</b> <p><ins>Root Cause:</ins></p> <p>This problem is Oracle Bug 8593562, which occurs in Oracle Text environments.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>..... It is incremented as the space is allocated, but not decremented as it is freed. It will reset when the instance is restarted. ..... The bug is currently in work by Development and expected to be resolved in a future release. ..... </pre> </div></div> <p><em>Ref: Growth of "Oracle Text Commit new id" memory with Sync on Commit Index <span class="error">&#91;ID 872413.1&#93;</span></em></p> <p><ins>Workaround:</ins></p> <ul class="alternate" type="square"> <li>Restart the INSTANCE.</li> </ul> Operating System Product Version 10.2.0.4 Database Name . Host Name . [QA-48] Unable to start VIP because of invalid RX packets numbers. http://jira.ubtools.com/jira/browse/QA-48 When the VIP is started on its own node, the start fails and the VIP is started on the other node instead. 
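<p>The "RX packets numbers" in the title are readings of the interface's received-packet counter, taken before and after a ping to the default gateway. As background for the analysis that follows, here is a minimal sketch of such a comparison that reports a non-numeric reading separately from an unchanged counter. The helper names <em>is_number</em> and <em>rx_counter_changed</em> are hypothetical; they are not part of racgvip, which only compares the two readings as strings.</p>

```shell
#!/bin/sh
# Illustrative sketch only: is_number and rx_counter_changed are hypothetical
# helpers, NOT part of racgvip or any Oracle-supplied script.

# Succeeds only if the argument is a non-empty string of digits.
is_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Compares two RX packet counter readings (before and after a ping).
# Prints "invalid" if either reading is not a number,
# "changed" if the counter moved, "unchanged" otherwise.
rx_counter_changed() {
    if ! is_number "$1" || ! is_number "$2"; then
        echo "invalid"
    elif [ "$1" != "$2" ]; then
        echo "changed"
    else
        echo "unchanged"
    fi
}

# Usage with the kinds of readings that appear in the logs of this issue:
rx_counter_changed 17297 17298   # prints "changed"
rx_counter_changed - -           # prints "invalid"
```

<p>A plain string comparison would treat two identical non-numeric readings as "unchanged", which is why the sketch checks for digits first.</p>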
<p><b>Starting the VIP:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre># ./crs_start ora.akyorap2.vip Attempting to start `ora.akyorap2.vip` on member `akyorap2` Start of `ora.akyorap2.vip` on member `akyorap2` failed. Attempting to start `ora.akyorap2.vip` on member `akyorap1` Start of `ora.akyorap2.vip` on member `akyorap1` succeeded. # </pre> </div></div> <p>The log level increased to get more detailed diagnostic data.</p> <p><b>Setting Log Level:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>#./crsctl debug log res "ora.akyorap2.vip:1" Set Resource Debug Module: ora.akyorap2.vip Level: 1 # </pre> </div></div> <p><b>Errors from the Log:</b><br/> <em>(&lt;ORA_CRS_HOME&gt;/log/&lt;nodeName&gt;/racg/ora.akyorap2.vip.log)</em></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] checkIf: start for if=en1 Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] IsIfAlive: start for if=en1 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] defaultgw: started Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] defaultgw: completed with 10.46.1 80.1 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] About to execute command: /usr/sbin/ping -S 10.46.180.52 -c 1 -w 1 10.46.180.1 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:51 GMT+02:00 2009 [ 413770 ] About to execute command: /usr/sbin/ping -S 10.46.180.52 -c 1 -w 1 10.46.180.1 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:52 GMT+02:00 2009 [ 413770 ] IsIfAlive: RX packets checked if=en1 failed Wed Mar 18 20:58:52 GMT+02:00 2009 [ 413770 ] Interface en1 checked failed (host =akyorap2) Wed Mar 
18 20:58:52 GMT+02:00 2009 [ 413770 ] IsIfAlive: end for if=en1 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:52 GMT+02:00 2009 [ 413770 ] checkIf: end for if=en1 Invalid parameters, or failed to bring up VIP (host=akyorap2) </pre> </div></div> QA-48 Unable to start VIP because of invalid RX packets numbers. Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Wed, 18 Mar 2009 19:44:38 +0000 (UTC) Thu, 19 Mar 2009 13:45:12 +0000 (UTC) 0 The problem arose in the <em>IsIfAlive()</em> function of $ORA_CRS_HOME/racgvip. <p>Here is the relevant excerpt from racgvip:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
# Check the status of the interface thro' pinging gateway
if [ -n "$DEFAULTGW" ]
then
  _RET=1
  # get base IP address of the interface
  tmpIP=`$LSATTR -El ${_IF} -a netaddr | $AWK '{print $2}'`
  # get RX packets numbers
  _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"`
  x=$CHECK_TIMES
  while [ $x -gt 0 ]
  do
    if [ -n "$tmpIP" ]
    then
      logx "About to execute command: $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW "
      $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1
    else
      logx "About to execute command: $PING $PING_TIMEOUT $DEFAULTGW"
      $PING $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1
    fi
    _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"`
    if [ "$_O1" != "$_O2" ]
    then
      # RX packets numbers changed
      _RET=0
      break
    fi
    $SLEEP 1
    x=`$EXPR $x - 1`
  done
  if [ $_RET -ne 0 ]
  then
    logx "IsIfAlive: RX packets checked if=$_IF failed"
  else
    logx "IsIfAlive: RX packets checked if=$_IF OK"
  fi
....
</pre> </div></div> <p>According to the code above, the function does the following:</p> <ul class="alternate" type="square"> <li>Assigns the current RX packet number to the _O1 variable as the first RX packet number.</li> <li>Loops $CHECK_TIMES times: <ul class="alternate" type="square"> <li>Pings the default gateway.</li> <li>Assigns the current RX packet number to the _O2 variable as the next RX packet number.</li> <li>If the RX packet number changed (_O1 != _O2), breaks the loop.</li> <li>Sleeps 1 second.</li> </ul> </li> <li>If the RX packet number has NOT changed (_O1 == _O2), logs the failure; otherwise the check passes.</li> </ul> racgvip was modified as below to dump the values of _<em>O1</em> and _<em>O2</em>: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... # get RX packets numbers _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` logx "--------------&gt; by dunal: _O1: $_O1" x=$CHECK_TIMES while [ $x -gt 0 ] do if [ -n "$tmpIP" ] then logx "About to execute command: $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW " $PING -S $tmpIP $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1 else logx "About to execute command: $PING $PING_TIMEOUT $DEFAULTGW" $PING $PING_TIMEOUT $DEFAULTGW &gt; /dev/null 2&gt;&amp;1 fi _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` logx "--------------&gt; by dunal: _O2: $_O2" ... </pre> </div></div> <p>As seen above, <em>logx "--------------&gt; by dunal: ..."</em> lines were added to the script. <font color="red"> Don't do this unless you are sure of what you are doing.</font> </p> <p>After restarting the VIP, the values of _<em>O1</em> and _<em>O2</em> are dumped in the logs.</p> <p><b>Failed Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... 
Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] --------------&gt; by dunal: _O1: - 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:49 GMT+02:00 2009 [ 413770 ] About to execute command: /usr/sbin/ping -S 10.46.180.52 -c 1 -w 1 10.46.180.1 Wed Mar 18 20:58:50 GMT+02:00 2009 [ 413770 ] --------------&gt; by dunal: _O2: - 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:51 GMT+02:00 2009 [ 413770 ] About to execute command: /usr/sbin/ping -S 10.46.180.52 -c 1 -w 1 10.46.180.1 Wed Mar 18 20:58:51 GMT+02:00 2009 [ 413770 ] --------------&gt; by dunal: _O2: - 2009-03-18 20:58:52.212: [ RACG][1] [360462][1][ora.akyorap2.vip]: Wed Mar 18 20:58:52 GMT+02:00 2009 [ 413770 ] IsIfAlive: RX packets checked if=en1 failed Wed Mar 18 20:58:52 GMT+02:00 2009 [ 413770 ] Interface en1 checked failed (host =akyorap2) ... </pre> </div></div> <p>As seen above, both values are '-'. That is wrong, but because the two values are identical, the script concludes that the RX packet number did not change.</p> <p><b>Successful Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] --------------&gt; by dunal: _O1: 17297 2009-03-18 20:58:55.793: [ RACG][1] [397546][1][ora.akyorap2.vip]: Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] About to execute command: /usr/sbin/ping -S 10.46.180.51 -c 1 -w 1 10.46.180.1 Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] --------------&gt; by dunal: _O2: 17298 2009-03-18 20:58:55.793: [ RACG][1] [397546][1][ora.akyorap2.vip]: Wed Mar 18 20:58:55 GMT+02:00 2009 [ 405728 ] IsIfAlive: RX packets checked if=en1 OK </pre> </div></div> <p>_<em>O1</em> and _<em>O2</em> are different. 
That means the RX packet number changed and the interface is up.</p> <p><b>netstat Output on Failed Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>/usr/bin/netstat -f inet -n -I en1 | /usr/bin/awk "{ if (/^en1/) {print $5; exit}}" en1 1500 link#3 0.21.5e.34.55.bc - 34601 0 16269 3 0 </pre> </div></div> <p>Column 5 is '-'. This is wrong and is what caused the problem.</p> <p><b>netstat Output on Successful Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>en1 1500 link#3 0.21.5e.34.57.fe 29223 0 10609 3 0 </pre> </div></div> <p>Column 5 is <em>29223</em>. This is the expected number.</p> <p><b>Headers of netstat on Failed Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>#/usr/bin/netstat -f inet -n -I en1 Name Mtu Network Address ZoneID Ipkts Ierrs Opkts Oerrs Coll en1 1500 link#3 0.21.5e.34.55.bc - 35645 0 16801 3 0 en1 1500 10.46.180 10.46.180.52 - 35645 0 16801 3 0 </pre> </div></div> <p><b>Headers of netstat on Successful Node:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>#/usr/bin/netstat -f inet -n -I en1 Name Mtu Network Address ZoneID Ipkts Ierrs Opkts Oerrs Coll en1 1500 link#3 0.21.5e.34.57.fe 29743 0 10762 3 0 en1 1500 10.46.180 10.46.180.51 29743 0 10762 3 0 en1 1500 10.46.180 10.46.180.53 29743 0 10762 3 0 en1 1500 10.46.180 10.46.180.54 29743 0 10762 3 0 </pre> </div></div> <p><font color="red">The difference is the <em>ZoneID</em> column.</font> </p> <p>This looks like a network configuration problem. 
This issue will remain open pending an update from the network administrators.</p> The network administrator reported that it was an AIX bug: <ul class="alternate" type="square"> <li><span class="nobr"><a href="http://www-01.ibm.com/support/docview.wss?uid=isg1IZ41358">IZ41358: ZONEID NEEDS TO PRINT "-" RATHER THAN A BLANK FOR NO VALUE. APPLIES TO AIX 6100-02</a></span></li> </ul> <p>However, this fix changes the ZoneID from a blank value to '-'. After this fix, no VIP could be started.</p> No solution was found on Metalink. This looks like an inconsistency in Oracle on AIX 6.1. <p><b><ins>Workaround:</ins></b></p> <p>The netstat column number captured by the script must be changed from 5 to 6.</p> <p><b>Original lines for _O1:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... tmpIP=`$LSATTR -El ${_IF} -a netaddr | $AWK '{print $2}'` # get RX packets numbers _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` x=$CHECK_TIMES while [ $x -gt 0 ] ... </pre> </div></div> <p><b>Modified line for _O1:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... tmpIP=`$LSATTR -El ${_IF} -a netaddr | $AWK '{print $2}'` # get RX packets numbers _O1=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$6; exit}}"` x=$CHECK_TIMES while [ $x -gt 0 ] ... </pre> </div></div> <p><b>Original lines for _O2:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... fi _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$5; exit}}"` if [ "$_O1" != "$_O2" ] then # RX packets numbers changed ... </pre> </div></div> <p><b>Modified line for _O2:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... 
fi _O2=`$NETSTAT -n -I $_IF | $AWK "{ if (/^$_IF/) {print \\$6; exit}}"` if [ "$_O1" != "$_O2" ] then # RX packets numbers changed ... </pre> </div></div> <p>After that, the VIPs could be started on the correct nodes:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>./crs_stat -t Name Type Target State Host ------------------------------------------------------------ ora....ap1.gsd application ONLINE ONLINE akyorap1 ora....ap1.ons application ONLINE ONLINE akyorap1 ora....ap1.vip application ONLINE ONLINE akyorap1 ora....ap2.gsd application ONLINE ONLINE akyorap2 ora....ap2.ons application ONLINE ONLINE akyorap2 ora....ap2.vip application ONLINE ONLINE akyorap2 </pre> </div></div> <p><em>Note: Don't edit Oracle scripts unless you know what you're doing.</em></p> Operating System Operating System Version 6.1 Product Version Oracle 10.2.0.4, RAC [QA-47] ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption http://jira.ubtools.com/jira/browse/QA-47 <b><ins>Problem:</ins></b> <p>The import causes the instance to hang. Only one instance is open during the import.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>imp system/manager file=../yedek/gedik_full.dmp log=../yedek/gedik_full_imp3.log full=y FEEDBACK=1000000 buffer=10000000 RESUMABLE=y RESUMABLE_TIMEOUT=72000 </pre> </div></div> <p><b><ins>Diagnostic Data for Oracle:</ins></b></p> <p><b>Alert Log:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Mon Mar 9 19:38:45 2009 ARC0: Log corruption near block 50941 change 9160702125 time ? 
Mon Mar 9 19:38:45 2009 Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_arc0_26085.trc: ORA-00354: corrupt redo log block header ORA-00353: log corruption near block 50941 change 9160702125 time 03/09/2009 1 9:38:35 ORA-00312: online log 1 thread 1: '+DATA/orcl/onlinelog/group_1.516.680795507' ARC0: All Archive destinations made inactive due to error 354 Mon Mar 9 19:38:45 2009 ARC0: Closing local archive destination LOG_ARCHIVE_DEST_1: '/u01/app/oracle/p roduct/10.2.0/dbs/arch/1_27_681074311.dbf' (error 354) (ORCL1) Committing creation of archivelog '/u01/app/oracle/product/10.2.0/dbs/arch/1_2 7_681074311.dbf' (error 354) ARCH: Archival stopped, error occurred. Will continue retrying </pre> </div></div> <p><b>Archive Log Trace:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Corrupt redo block 50941 detected: bad block number Flag: 0x30 Format: 0x38 Block: 0x20302030 Seq: 0x5c305c79 Beg: 0x3030 Cks:0x5 c31 ----- Dump of Corrupt Redo Buffer ----- 5c463830203020305c305c795c3130305c3230305c305c305c305c3020665c30 3030433c5c305c345c305c305c305c305c305c3035320a303a35323920202009 5c305c5030305c303033203120305c32303920383022203520305c315c353034 5c305c3020305c3020725c305c3820373035317231305c315c305c330a305c30 3239353220093a35203843203030433f20372034203530395c3230225c393830 30313030203020315c3630795c3431305c3430305c463830203020305c305c79 5c3130305c32303035320a303a3532395c2020095c305c305c305c30433c2066 5c3430305c305c305c305c305c305c3020305c305c305c5030305c3030292031 20305c32303920383022203520305c310a3320363239353220093a355c305c20 5c305c30203620303038203331725c455c625c365c3331305c305c3020665c30 3030433c20372034203530395c3230225c3938305c3530302030203035320a79 3a3532393020200931305c365c305c3038305c6230305c4641305c3431305c35 5c305c3030305c3020305c3542305c3f5c305c3031305c3030305c3131305c31 425420440a4920353239353220093a35314345205c305c395c3130305c323062 
20382030203530395c313022203530305c305c345c305c302035303020372034 303530365c3831315c3230304646463035320a463a3532394320200938412032 Rereading log member '+DATA/orcl/onlinelog/group_1.516.680795507' (corruption ) ... Corrupt redo block 50941 detected: bad block number Flag: 0x0 Format: 0x0 Block: 0x00000000 Seq: 0x00000000 Beg: 0x0 Cks:0x0 ----- Dump of Corrupt Redo Buffer ----- 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 Rereading log member '+DATA/orcl/onlinelog/group_1.516.680795507' (corruption ) ... 
Corrupt redo block 50941 detected: bad block number Flag: 0x30 Format: 0x38 Block: 0x20302030 Seq: 0x5c305c79 Beg: 0x3030 Cks:0x5 c31 ----- Dump of Corrupt Redo Buffer ----- 5c463830203020305c305c795c3130305c3230305c305c305c305c3020665c30 3030433c5c305c345c305c305c305c305c305c3035320a303a35323920202009 5c305c5030305c303033203120305c32303920383022203520305c315c353034 5c305c3020305c3020725c305c3820373035317231305c315c305c330a305c30 3239353220093a35203843203030433f20372034203530395c3230225c393830 30313030203020315c3630795c3431305c3430305c463830203020305c305c79 5c3130305c32303035320a303a3532395c2020095c305c305c305c30433c2066 5c3430305c305c305c305c305c305c3020305c305c305c5030305c3030292031 20305c32303920383022203520305c310a3320363239353220093a355c305c20 5c305c30203620303038203331725c455c625c365c3331305c305c3020665c30 3030433c20372034203530395c3230225c3938305c3530302030203035320a79 3a3532393020200931305c365c305c3038305c6230305c4641305c3431305c35 5c305c3030305c3020305c3542305c3f5c305c3031305c3030305c3131305c31 425420440a4920353239353220093a35314345205c305c395c3130305c323062 20382030203530395c313022203530305c305c345c305c302035303020372034 303530365c3831315c3230304646463035320a463a3532394320200938412032 *** 2009-03-10 03:55:10.757 62692 kcrr.c </pre> </div></div> <p>As seen above, even if the database hangs, the contents of redo buffer dump change.</p> <p><b><ins>Diagnostic Data for Solaris:</ins></b></p> <p><b>Soft Link Mapping to Raw Devices:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>oravol1: disk@g600a0b80005a81660000074949959b42:b,raw oravol2: disk@g600a0b80005a816600000742499595ea:b,raw oravol3: disk@g600a0b80005a8166000007444995971e:b,raw oravol4: disk@g600a0b80005a8c9f000004f049959717:b,raw oravol5: disk@g600a0b80005a8c9f000004f249959991:b,raw </pre> </div></div> <p><b>Open File Descriptors of ARCH process:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div 
class="preformattedContent panelContent"> <pre>bash-3.00$ ps -ef|grep arc0 oracle 19941 14227 0 04:10:31 pts/12 0:00 grep arc0 oracle 26085 1 0 19:25:05 ? 0:29 ora_arc0_ORCL1 bash-3.00$ ls -ltr /proc/26085/path ... lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:25 261 -&gt; /devices/scsi_vhci/disk@g600a0b80005a816600000742499595ea:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:25 260 -&gt; /devices/scsi_vhci/disk@g600a0b80005a8c9f000004f049959717:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:25 259 -&gt; /devices/scsi_vhci/disk@g600a0b80005a81660000074949959b42:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:25 257 -&gt; /devices/scsi_vhci/disk@g600a0b80005a8166000007444995971e:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:25 256 -&gt; /devices/scsi_vhci/disk@g600a0b80005a8c9f000004f249959991:b,raw ... bash-3.00$ </pre> </div></div> <p><b>Gathering truss output for ARCH:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>truss -fae -w 261,260,259,257,256 -r 261,260,259,257,256 -o arc0.truss.log -p 26085 </pre> </div></div> <p>The command above will trace system calls with pread()/pwrite() IO buffer dumping for fd of 261,260,259,257,256.</p> <p><b>Open File Descriptors of LGWR process:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ ps -ef|grep lgwr oracle 28447 1 0 Mar 04 ? 0:17 asm_lgwr_+ASM1 oracle 25925 1 0 19:24:49 ? 0:38 ora_lgwr_ORCL1 oracle 26468 14227 0 04:21:02 pts/12 0:00 grep lgwr bash-3.00$ ls -ltr /proc/25925/path ... 
lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:24 260 -&gt; /devices/scsi_vhci/disk@g600a0b80005a816600000742499595ea:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:24 259 -&gt; /devices/scsi_vhci/disk@g600a0b80005a81660000074949959b42:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:24 258 -&gt; /devices/scsi_vhci/disk@g600a0b80005a8c9f000004f049959717:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:24 257 -&gt; /devices/scsi_vhci/disk@g600a0b80005a8166000007444995971e:b,raw lrwxrwxrwx 1 oracle oinstall 0 Mar 9 19:24 256 -&gt; /devices/scsi_vhci/disk@g600a0b80005a8c9f000004f249959991:b,raw ... </pre> </div></div> <p><b>Gathering truss output for LGWR:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ truss -fae -w 260,259,258,257,256 -r 260,259,258,257,256 -o lgwr.truss.log -p 25925 &amp; </pre> </div></div> <p>The command above traces system calls and dumps the pread()/pwrite() IO buffers for file descriptors 260, 259, 258, 257 and 256.</p> QA-47 ORA-00354 ORA-00353 ORA-00312: Redolog Block Corruption Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Tue, 10 Mar 2009 02:33:18 +0000 (UTC) Fri, 10 Apr 2009 13:13:02 +0000 (UTC) 0 <b>Last Successful Log Switch:</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Beginning log switch checkpoint up to RBA [0x19.2.10], SCN: 9160700232 Mon Mar 9 19:38:21 2009 Thread 1 advanced to log sequence 25 (LGWR switch) Current log# 1 seq# 25 mem# 0: +DATA/orcl/onlinelog/group_1.516.680795507 Thread 1 cannot allocate new log, sequence 26 Checkpoint not complete Current log# 1 seq# 25 mem# 0: +DATA/orcl/onlinelog/group_1.516.680795507 Mon Mar 9 19:38:28 2009 Completed checkpoint up to RBA [0x19.2.10], SCN: 9160700232 </pre> </div></div> <p>As seen above, the last successful sequence before the corruption is 25.</p> <p><b>Header of Archive Log</b>:</p> <div class="preformatted panel" 
style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(root@gdksun1:bin)$ dd if=/u01/app/oracle/product/10.2.0/dbs/arch/1_25_681074311.dbf bs=512 skip=50941 count=1|od -x 0000000 2201 0000 c6fd 0000 0019 0000 81d8 54c6 &lt;blockNo&gt; 0000020 2e32 3134 362e 0736 6b78 1207 2f0f 0212 0000040 332d 002c 0505 3831 3834 0532 7567 6469 0000060 0c65 3838 322e 3433 382e 2e38 3138 7807 0000100 076b 0f12 172f 3001 002c 0505 3032 3834 0000120 0739 7362 7361 6369 0e69 3538 312e 3530 0000140 312e 3535 322e 3233 7807 076b 0f12 172f </pre> </div></div> <p>The block number is 0x0000c6fd (the bytes are swapped because the platform is little endian). Since 50941=0x0000c6fd, the block number in the archive log is correct. That means LGWR had successfully written the correct redo before the log switch.</p> <b>Computing the Offset of Corrupted ASM Block:</b> <p>SQL&gt; select GROUP_NUMBER,NAME,ALLOCATION_UNIT_SIZE from v$asm_diskgroup;</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>GROUP_NUMBER NAME ALLOCATION_UNIT_SIZE ------------ ------------------------- -------------------- 1 DATA 1048576 SQL&gt; select GROUP_NUMBER, DISK_NUMBER, name, path from v$asm_disk; GROUP_NUMBER DISK_NUMBER NAME PATH ------------ ----------- ------------------------- -------------------- 1 0 DATA_0000 /u01/oradata/oravol1 1 1 DATA_0001 /u01/oradata/oravol2 1 2 DATA_0002 /u01/oradata/oravol3 1 3 DATA_0003 /u01/oradata/oravol4 1 4 DATA_0004 /u01/oradata/oravol5 </pre> </div></div> <ul class="alternate" type="square"> <li>ASM File Name: +DATA/orcl/onlinelog/group_1.516.680795507</li> <li>ASM File#.........: 516</li> <li>Corrupted Block#...: 50941</li> <li>File Block Size:</li> </ul> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select BLOCK_SIZE from v$asm_file where FILE_NUMBER=516; BLOCK_SIZE ---------- 512 </pre> </div></div> <ul class="alternate" type="square"> 
<li>Blocks per ASM Extent: 1048576/512=2048</li> <li>ASM Extent#......: 50941/2048 = 24 (rounded down)</li> <li>Block# in ASM Extent...: 50941 - 24*2048 = 1789</li> <li>Disk# and ASM Extent Offset:</li> </ul> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select DISK_KFFXP, AU_KFFXP from x$kffxp where XNUM_KFFXP=24 and group_kffxp=1 and NUMBER_KFFXP=516; DISK_KFFXP AU_KFFXP ---------- ---------- 1 60884 </pre> </div></div> <p>Disk#1 : /u01/oradata/oravol2<br/> ASM Extent Offset...: 60884*1048576 = 63841501184 --&gt; 0xEDD400000<br/> ASM Corrupted Block Offset.....: 63841501184+1789*512 = 63842417152 --&gt; 0xEDD4DFA00</p> <b><ins>Interpreting the truss Output of ARCH:</ins></b> <p>fd#261 is /u01/oradata/oravol2 for ARCH.</p> <p><b>Reading Offsets by ARCH:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ grep "pread(261" arc0.truss.log 26085: pread(261, 0xFFFFFD7FFC32DE00, 131072, 0xEDE600000) = 131072 26085: pread(261, 0xFFFFFD7FFC21CE00, 131072, 0xEDE620000) = 131072 26085: pread(261, 0xFFFFFD7FFC10BE00, 131072, 0xEDE640000) = 131072 26085: pread(261, 0xFFFFFD7FFBE2DE00, 131072, 0xEDE660000) = 131072 26085: pread(261, 0xFFFFFD7FFBA2DE00, 131072, 0xEDE680000) = 131072 26085: pread(261, 0xFFFFFD7FFB42DE00, 131072, 0xEDE6A0000) = 131072 26085: pread(261, 0xFFFFFD7FFB53DE00, 131072, 0xEDE6C0000) = 131072 26085: pread(261, 0xFFFFFD7FFB64DE00, 131072, 0xEDE6E0000) = 131072 26085: pread(261, 0xFFFFFD7FFADCDE00, 131072, 0xEDE700000) = 131072 26085: pread(261, 0xFFFFFD7FFAE6DE00, 131072, 0xEDE800000) = 131072 26085: pread(261, 0xFFFFFD7FFAEDDE00, 131072, 0xEDE720000) = 131072 26085: pread(261, 0xFFFFFD7FFAF7DE00, 131072, 0xEDE820000) = 131072 26085: pread(261, 0xFFFFFD7FFC2CDE00, 131072, 0xEDE740000) = 131072 26085: pread(261, 0xFFFFFD7FFC36DE00, 131072, 0xEDE840000) = 131072 26085: pread(261, 0xFFFFFD7FFC1BCE00, 131072, 
0xEDE760000) = 131072 26085: pread(261, 0xFFFFFD7FFC25CE00, 131072, 0xEDE860000) = 131072 26085: pread(261, 0xFFFFFD7FFC0ABE00, 131072, 0xEDE780000) = 131072 26085: pread(261, 0xFFFFFD7FFC14BE00, 131072, 0xEDE880000) = 131072 26085: pread(261, 0xFFFFFD7FFBDCDE00, 131072, 0xEDE7A0000) = 131072 26085: pread(261, 0xFFFFFD7FFBE6DE00, 131072, 0xEDE8A0000) = 131072 26085: pread(261, 0xFFFFFD7FFB9CDE00, 131072, 0xEDE7C0000) = 131072 26085: pread(261, 0xFFFFFD7FFBA6DE00, 131072, 0xEDE8C0000) = 131072 26085: pread(261, 0xFFFFFD7FFB3CDE00, 131072, 0xEDE7E0000) = 131072 26085: pread(261, 0xFFFFFD7FFB46DE00, 131072, 0xEDE8E0000) = 131072 26085: pread(261, 0xFFFFFD7FFB51DE00, 131072, 0xEDE900000) = 131072 26085: pread(261, 0xFFFFFD7FFB62DE00, 131072, 0xEDE920000) = 131072 26085: pread(261, 0xFFFFFD7FFAE0DE00, 131072, 0xEDE940000) = 131072 26085: pread(261, 0xFFFFFD7FFAF1DE00, 131072, 0xEDE960000) = 131072 26085: pread(261, 0xFFFFFD7FFC30DE00, 131072, 0xEDE980000) = 131072 26085: pread(261, 0xFFFFFD7FFC1FCE00, 131072, 0xEDE9A0000) = 131072 26085: pread(261, 0xFFFFFD7FFC0EBE00, 131072, 0xEDE9C0000) = 131072 26085: pread(261, 0xFFFFFD7FFBE0DE00, 131072, 0xEDE9E0000) = 131072 26085: pread(261, 0xFFFFFD7FFBEADE00, 512, 0xEDD400000) = 512 26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 26085: pread(261, 0xFFFFFD7FFBA4DE00, 131072, 0xEDD500000) = 131072 26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 26085: pread(261, 0xFFFFFD7FFBA4DE00, 131072, 0xEDD500000) = 131072 26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 26085: pread(261, 0xFFFFFD7FFC53BE00, 16384, 0xEDDED4000) = 16384 bash-3.00$ </pre> </div></div> <p>As seen above, offsets starting with 0xEDE and 0xEDD5 are greater than our corrupted offset of 0xEDD4DFA00. 
So they are out of scope.</p> <p>The following calls should be examined:</p> <ul class="alternate" type="square"> <li>26085: pread(261, 0xFFFFFD7FFBEADE00, 512, 0xEDD400000) = 512 <ul class="alternate" type="square"> <li>This is the ASM Extent Offset. In other words, it's the base offset. (0xEDD400000+512)&lt;0xEDD4DFA00. So, it doesn't read the corrupted block.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 <ul class="alternate" type="square"> <li>(0xEDD400200+130560)&lt;0xEDD4DFA00. It doesn't read the corrupted block.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 <ul class="alternate" type="square"> <li>(0xEDD420000+512)&lt;0xEDD4DFA00. It doesn't read the corrupted block.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 <ul class="alternate" type="square"> <li>Same as before.</li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 <ul class="alternate" type="square"> <li>Same as before.</li> </ul> </li> </ul> <p>ARCH did not read the corrupted block #50941, yet it reported an error.</p> <p><b>dd Output of the Corrupted Block:</b></p> <p>ASM Corrupted Block Offset in 512 byte block: 63842417152/512=124692221</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124692221 count=1|od -x 0000000 2201 0000 f0fd 0000 001b 0000 80d8 2304 &lt;blockNo&gt; 0000020 3838 322e 3731 312e 3431 7807 0a6c 111e 0000040 2230 3001 002c 0605 3131 3730 3130 3306 </pre> </div></div> <p>0x0000f0fd is not 50941. 
So, it's corrupted.</p> <p>The reason why ARCH did not read this block is hidden in the error messages:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00353: log corruption near block 50941 change 9160702125 time 03/09/2009 1 </pre> </div></div> <p>It says <em>near</em>.</p> <b><ins>Finding the Other Corrupted Block</ins></b>: <p><b>dd Outputs on pread() of ARCH</b>:</p> <ul class="alternate" type="square"> <li>26085: pread(261, 0xFFFFFD7FFBEADE00, 512, 0xEDD400000) = 512 <ul class="alternate" type="square"> <li>Offset: 0xEDD400000 = 63841501184</li> <li>Offset in 512 byte block: 63841501184/512=124690432 <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690432 count=1|od -x 0000000 2201 0000 c000 0000 001b 0000 8000 621d &lt;blockNo&gt; ... </pre> </div></div></li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFB9AE000, 130560, 0xEDD400200) = 130560 <ul class="alternate" type="square"> <li>First Block Offset: 0xEDD400200 = 63841501696</li> <li>First Block Offset in 512 byte block: 63841501696/512=124690433 (the block immediately after the previous one) <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690433 count=1|od -x 0000000 2201 0000 c001 0000 001b 0000 8124 5172 &lt;blockNo&gt; .. </pre> </div></div></li> <li>Last Block Offset: 0xEDD400200 + 130560-512= 63841631744</li> <li>Last Block Offset in 512 byte block: 63841631744/512=124690687 <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690687 count=1|od -x 0000000 2201 0000 c0ff 0000 001b 0000 8018 4635 &lt;blockNo&gt; .. 
</pre> </div></div></li> </ul> </li> <li>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 <ul class="alternate" type="square"> <li>Offset: 0xEDD420000 = 63841632256</li> <li>Offset in 512 byte block: 63841632256/512 = 124690688 <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/oradata/oravol2 bs=512 iseek=124690688 count=1|od -x 0000000 2201 0000 c800 0000 001b 0000 805c 2d48 &lt;blockNo&gt; .. </pre> </div></div></li> </ul> </li> </ul> <p>As seen above, the block numbers increase from 0xC000 to 0xC0FF. But, in the last call, it jumped to 0xC800.</p> <p><b>truss Output of ARCH for block# 0xC800</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>26085: pread(261, 0xFFFFFD7FFBAADE00, 512, 0xEDD420000) = 512 26085: 01 "\0\0\0C8\0\01B\0\0\0 \80 H -\00505 4 1 4 5 0\v 6 6 6 6 6 6 4 &lt;blockNo&gt; 26085: 1 4 5 00F 2 1 2 . 1 5 6 . 2 3 0 . 2 1 807 x l\n07\f %1F01 0 ,\0 26085: 0505 3 5 6 0 705 3 8 0 3 50E 8 8 . 2 4 1 . 1 3 6 . 2 2 007 x l\n 26085: 07\f %1F01 0 ,\00505 6 2 0 5 1\b a d a m k a c i0E 1 9 5 . 2 4 4 26085: . 6 2 . 1 4 507 x l\n07\f % "01 0 ,\00505 6 2 0 5 1\b a d a m k 26085: a c i\f 7 8 . 1 9 0 . 6 8 . 1 707 x l\n07\f % #01 0 ,\00502 - 1 26085: 05 K A Y A 20E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n07\f % .02 - 2 26085: ,\00502 - 105 K A Y A 20E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n07 26085: \f &amp;0102 - 2 ,\00505 6 1 1 4 105 1 9 5 5 60E 1 9 5 . 2 4 4 . 6 2 26085: . 1 4 707 x l\n07\f &amp;\r01 0 ,\00505 6 1 1 4 105 1 9 5 5 6\f 8 8 26085: . 2 3 4 . 5 . 2 3 107 x l\n07\f &amp;0F01 0 ,\00502 - 105 K A Y A 2 26085: 0E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n07\f &amp;1002 - 2 ,\00506 1 1 26085: 1 0 1 605 O K A Y A\f 8 5 . 1 0 8 . 8 7 . 5 007 x l\n07\f &amp; !01 26085: 0 ,\00502 - 105 K A Y A 20E 1 9 5 . 2 4 4 . 6 2 . 1 4 507 x l\n 26085: 07\f &amp; "02 - 2 ,\00505 4 1 9 3 806 6 4 3 2 5 5\r 8 8 . 2 2 5 . 
1 26085: 2 0 . 5 307 x l\n07\f &amp; +01 0 ,\00505 5 3 0 5 506 0 9 1 2 1 90E </pre> </div></div> <p>Then, the following messages were written to the trace file:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>26085: write(2, " * * * 2 0 0 9 - 0 3 -".., 27) = 27 26085: write(2, "\n", 1) = 1 26085: write(2, " ", 1) = 1 26085: write(2, "\n", 1) = 1 26085: write(2, " C o r r u p t r e d o".., 51) = 51 26085: write(2, "\n", 1) = 1 26085: write(2, " F l a g : 0 x 3 0 F".., 80) = 80 26085: write(2, "\n", 1) = 1 26085: write(2, " - - - - - D u m p o".., 39) = 39 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 4 6 3 8 3 0 2 0 3 0".., 64) = 64 &lt;blockNoPiece0&gt; 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 0 4 3 3 c 5 c 3 0".., 64) = 64 &lt;blockNoPiece1&gt; 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 5 0 3 0 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 3 0 2 0 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 2 3 9 3 5 3 2 2 0 0 9".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 1 3 0 3 0 2 0 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 1 3 0 3 0 5 c 3 2".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 4 3 0 3 0 5 c 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 2 0 3 0 5 c 3 2 3 0 3 9".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 3 0 2 0 3 6".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 0 4 3 3 c 2 0 3 7".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 a 3 5 3 2 3 9 3 0 2 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 5 c 3 0 5 c 3 0 3 0 3 0".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 4 2 5 4 2 0 4 4 0 a 4 9".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 2 0 3 8 2 0 3 0 2 0 3 5".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " 3 0 3 5 3 0 3 6 5 c 
3 8".., 64) = 64 26085: write(2, "\n", 1) = 1 26085: write(2, " R e r e a d i n g l o".., 78) = 78 26085: write(2, "\n", 1) = 1 </pre> </div></div> <p>Rereading the block fails in the same way.</p> <p>There are 2 problems:</p> <ul class="alternate" type="square"> <li>The redo block# jumped from 0xC0FF to 0xC800. So, the On-Disk image is corrupted.</li> <li>The On-Memory image of the block differs from the On-Disk image.</li> </ul> <b>Checking for Missing IO of LGWR in the truss Output:</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ grep Err lgwr.truss.log|grep pwrite bash-3.00$ grep Err lgwr.truss.log|grep pread bash-3.00$ </pre> </div></div> <p>No missing IO.</p> <p><b>Checking IO buffers of LGWR</b>:</p> <p>fd#260 is /u01/oradata/oravol2 for LGWR.<br/> Offset: 0xEDD420000.</p> <p>The last write to the block:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>25925: pwrite(260, 0x380D78400, 76288, 0xEDD420000) = 76288 25925: 01 "\0\0\0C8\0\01B\0\0\0 \80 H -\00505 4 1 4 5 0\v 6 6 6 6 6 6 4 &lt;blockNo&gt; 25925: 1 4 5 00F 2 1 2 . 1 5 6 . 2 3 0 . 2 1 807 x l\n07\f %1F01 0 ,\0 25925: 0505 3 5 6 0 705 3 8 0 3 50E 8 8 . 2 4 1 . 1 3 6 . 2 2 007 x l\n </pre> </div></div> <p>As seen above, the contents of the redo buffer are corrupted. The block number is 0xC800.</p> <p>However, this LGWR had generated a correct archivelog:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>bash-3.00$ dd if=/u01/app/oracle/product/10.2.0/dbs/arch/1_25_681074311.dbf bs=512 skip=256 count=1|od -x 1+0 records in 1+0 records out 0000000 2201 0000 0100 0000 0019 0000 8000 d162 &lt;blockNo&gt; 0000020 3534 332e 2e33 3032 0733 6b78 0904 3c0c 0000040 0114 2c30 0500 3205 3031 3631 6905 6e69 </pre> </div></div> <p>0x0100 = 256, which is the correct block number.</p> This looks like a configuration issue or a bug on the OS/storage side. 
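<p>The offset arithmetic and the block-number checks used in this analysis can be reproduced with a short script. The following is a minimal Python sketch and was not part of the original investigation. The function names (<em>extent_and_block</em>, <em>asm_block_offset</em>, <em>block_no_from_od</em>) are hypothetical, and the sketch assumes, as the <em>od -x</em> dumps above suggest, that the redo block number is the 32-bit value formed by the third and fourth 16-bit words of the block header on a little-endian host.</p>

```python
AU_SIZE = 1048576   # ALLOCATION_UNIT_SIZE reported by v$asm_diskgroup
BLOCK_SIZE = 512    # BLOCK_SIZE reported by v$asm_file for file# 516


def extent_and_block(file_block, au_size=AU_SIZE, block_size=BLOCK_SIZE):
    """Return (ASM extent#, block# inside that extent) for a file block#."""
    blocks_per_extent = au_size // block_size
    return divmod(file_block, blocks_per_extent)


def asm_block_offset(au_number, block_in_extent,
                     au_size=AU_SIZE, block_size=BLOCK_SIZE):
    """Byte offset on the ASM disk for a block inside allocation unit au_number."""
    return au_number * au_size + block_in_extent * block_size


def block_no_from_od(od_line):
    """Extract the block number from the first line of an `od -x` header dump.

    Assumption: field 0 is the od offset, followed by 16-bit words printed in
    host (little-endian) order; words 2 and 3 hold the low and high halves.
    """
    words = [int(w, 16) for w in od_line.split()[1:5]]
    return words[2] | (words[3] << 16)


if __name__ == "__main__":
    extent, block_in_extent = extent_and_block(50941)
    offset = asm_block_offset(60884, block_in_extent)  # AU_KFFXP from x$kffxp
    print(extent, block_in_extent, hex(offset), offset // BLOCK_SIZE)
    print(block_no_from_od("0000000 2201 0000 c6fd 0000 0019 0000 81d8 54c6"))
    print(block_no_from_od("0000000 2201 0000 f0fd 0000 001b 0000 80d8 2304"))
```

<p>Run against the values in this issue, the first line of output gives the extent number, the block number inside the extent, the byte offset and the <em>iseek</em> value passed to dd; the last two lines decode the block numbers of the archived block and of the block found on disk at that offset.</p>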
<p>This issue covers redo corruption only. However, the database also encounters corruption in UNDO, INDEX, TABLE, and CONTROL FILES. The root cause is the same:<br/> <font color="red">The On-Disk image of the block and its On-Memory image are not the same.</font></p> <p>Similar to <a href="http://jira.ubtools.com/jira/browse/QA-37" title="&quot;ORA-01187: cannot read from file&quot; in one of the RAC Node."><del>QA-37</del></a>.</p> <p>This issue will be updated when a comment is sent by the OS vendor.</p> The operating system was reinstalled by the vendor. The problem has not occurred since. Operating System Operating System Version 10 Product Version Oracle 10.2.0.4 SE,RAC [QA-46] ORA-12545: Connect failed in RAC environment because of an implicit redirect to another node. http://jira.ubtools.com/jira/browse/QA-46 <b><ins>Description:</ins></b> <p>The clients cannot connect to the database and get an <em>ORA-12545</em> error, even though they can ping the database server.</p> <p><b><ins>Diagnostic Data for Oracle:</ins></b></p> <p><b>Remote and Local Listeners for Both Nodes:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; show parameter listener NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ local_listener string remote_listener string LISTENERS_ORCL SQL&gt; </pre> </div></div> <p><b>Remote Listener Configuration for Both Nodes:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>LISTENERS_ORCL = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = gdksun1-vip)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = gdksun2-vip)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = gdksun1-pubext-vip)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = gdksun2-pubext-vip)(PORT = 1521)) ) </pre> </div></div> <p><b>tns alias</b></p> <div class="preformatted panel" style="border-width: 1px;"><div 
class="preformattedContent panelContent"> <pre>SUNGDK = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = &lt;IP0&gt;)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = &lt;IP1&gt;)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = ORCL) (FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY = 5) ) ) ) </pre> </div></div> <p><b>sqlnet trace parameters</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>TRACE_LEVEL_CLIENT = 16 TRACE_FILE_CLIENT = sqlnet.trc TRACE_DIRECTORY_CLIENT = &lt;dizinAdı&gt; TRACE_UNIQUE_CLIENT = ON TRACE_TIMESTAMP_CLIENT = ON </pre> </div></div> <p><b>sqlnet trace</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(5996) [27-ŞUB-2009 20:57:50:875] nttgetport: port resolved to 1521 (5996) [27-ŞUB-2009 20:57:50:875] nttgetport: exit (5996) [27-ŞUB-2009 20:57:50:875] nttbnd2addr: using host IP address: &lt;IP1&gt; (5996) [27-ŞUB-2009 20:57:50:875] nttbnd2addr: exit (5996) [27-ŞUB-2009 20:57:50:875] nsc2addr: normal exit </pre> </div></div> <p>The host IP and port are resolved to &lt;IP1&gt; and 1521, respectively.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(5996) [27-ŞUB-2009 20:57:50:937] nscon: sending NSPTCN packet (5996) [27-ŞUB-2009 20:57:50:937] nspsend: entry (5996) [27-ŞUB-2009 20:57:50:937] nspsend: plen=58, type=1 (5996) [27-ŞUB-2009 20:57:50:937] nttwr: entry (5996) [27-ŞUB-2009 20:57:50:937] nttwr: socket 420 had bytes written=58 (5996) [27-ŞUB-2009 20:57:50:937] nttwr: exit (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 58 bytes to transport (5996) [27-ŞUB-2009 20:57:50:937] nspsend: packet dump (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 00 3A 00 00 01 00 00 00 |.:......| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 01 38 01 2C 00 00 08 00 |.8.,....| (5996) [27-ŞUB-2009 
20:57:50:937] nspsend: 7F FF 86 0E 00 00 01 00 |........| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 01 3E 00 3A 00 00 02 00 |.&gt;.:....| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 21 21 00 00 00 00 00 00 |!!......| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 00 00 00 00 0A C0 00 00 |........| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 00 0A 00 00 00 00 00 00 |........| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 00 00 |.. | (5996) [27-ŞUB-2009 20:57:50:937] nspsend: normal exit </pre> </div></div> <p>A connect packet <em>(NSPTCN)</em> sent to &lt;IP1&gt;.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(5996) [27-ŞUB-2009 20:57:50:937] nsdofls: sending NSPTDA packet (5996) [27-ŞUB-2009 20:57:50:937] nspsend: entry (5996) [27-ŞUB-2009 20:57:50:937] nspsend: plen=328, type=6 (5996) [27-ŞUB-2009 20:57:50:937] nttwr: entry (5996) [27-ŞUB-2009 20:57:50:937] nttwr: socket 420 had bytes written=328 (5996) [27-ŞUB-2009 20:57:50:937] nttwr: exit (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 328 bytes to transport (5996) [27-ŞUB-2009 20:57:50:937] nspsend: packet dump (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 01 48 00 00 06 00 00 00 |.H......| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 00 00 28 44 45 53 43 52 |..(DESCR| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 49 50 54 49 4F 4E 3D 28 |IPTION=(| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 41 44 44 52 45 53 53 3D |ADDRESS=| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 28 50 52 4F 54 4F 43 4F |(PROTOCO| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 4C 3D 54 43 50 29 28 48 |L=TCP)(H| ... 
(5996) [27-ŞUB-2009 20:57:50:937] nspsend: 75 72 61 64 3F 54 75 6C |urad?Tul| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: 75 6E 61 79 29 29 29 29 |unay))))| (5996) [27-ŞUB-2009 20:57:50:937] nspsend: normal exit </pre> </div></div> <p>A data packet <em>(NSPTDA)</em> sent to &lt;IP1&gt;.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(5996) [27-ŞUB-2009 20:57:50:937] nscon: recving a packet (5996) [27-ŞUB-2009 20:57:50:937] nsprecv: entry (5996) [27-ŞUB-2009 20:57:50:937] nsbal: entry (5996) [27-ŞUB-2009 20:57:50:937] nsbgetfl: entry (5996) [27-ŞUB-2009 20:57:50:937] nsbgetfl: normal exit (5996) [27-ŞUB-2009 20:57:50:937] nsmal: entry (5996) [27-ŞUB-2009 20:57:50:937] nsmal: 48 bytes at 0x15bcf60 (5996) [27-ŞUB-2009 20:57:50:937] nsmal: normal exit (5996) [27-ŞUB-2009 20:57:50:937] nsbal: normal exit (5996) [27-ŞUB-2009 20:57:50:937] nsprecv: reading from transport... (5996) [27-ŞUB-2009 20:57:50:937] nttrd: entry (5996) [27-ŞUB-2009 20:57:50:968] nttrd: socket 420 had bytes read=10 (5996) [27-ŞUB-2009 20:57:50:968] nttrd: exit (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 10 bytes from transport (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: tlen=10, plen=10, type=5 (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: packet dump (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 00 0A 00 00 05 02 00 00 |........| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 01 85 |.. | (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: normal exit (5996) [27-ŞUB-2009 20:57:50:968] nscon: got NSPTRD packet </pre> </div></div> <p>Got a redirect packet <em>(NSPTRD)</em> from &lt;IP1&gt;.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(5996) [27-ŞUB-2009 20:57:50:968] nsrdr: recving a packet (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: entry (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: reading from transport... 
(5996) [27-ŞUB-2009 20:57:50:968] nttrd: entry (5996) [27-ŞUB-2009 20:57:50:968] nttrd: socket 420 had bytes read=399 (5996) [27-ŞUB-2009 20:57:50:968] nttrd: exit (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 399 bytes from transport (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: tlen=399, plen=399, type=6 (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: packet dump (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 01 8F 00 00 06 00 00 00 |........| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 00 40 28 41 44 44 52 45 |.@(ADDRE| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 53 53 3D 28 50 52 4F 54 |SS=(PROT| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 4F 43 4F 4C 3D 54 43 50 |OCOL=TCP| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 29 28 48 4F 53 54 3D 67 |)(HOST=g| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 64 6B 73 75 6E 32 29 28 |dksun2)(| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 50 4F 52 54 3D 31 35 32 |PORT=152| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 31 29 29 00 28 44 45 53 |1)).(DES| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 43 52 49 50 54 49 4F 4E |CRIPTION| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 3D 28 41 44 44 52 45 53 |=(ADDRES| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 53 3D 28 50 52 4F 54 4F |S=(PROTO| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 43 4F 4C 3D 54 43 50 29 |COL=TCP)| ... 
(5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 3D 4D 75 72 61 64 3F 54 |=Murad?T| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 75 6C 75 6E 61 79 29 29 |ulunay))| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 28 49 4E 53 54 41 4E 43 |(INSTANC| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 45 5F 4E 41 4D 45 3D 6F |E_NAME=o| (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: 72 63 6C 32 29 29 29 |rcl2))) | (5996) [27-ŞUB-2009 20:57:50:968] nsprecv: normal exit (5996) [27-ŞUB-2009 20:57:50:968] nsrdr: got NSPTDA packet </pre> </div></div> <p>Got a data packet <em>(NSPTDA)</em> from &lt;IP1&gt;.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(5996) [27-ŞUB-2009 20:57:50:984] nttgetport: port resolved to 1521 (5996) [27-ŞUB-2009 20:57:50:984] nttgetport: exit (5996) [27-ŞUB-2009 20:57:50:984] nttbnd2addr: looking up IP addr for host: gdksun2 (5996) [27-ŞUB-2009 20:57:53:640] nttbnd2addr: *** hostname lookup failure! *** (5996) [27-ŞUB-2009 20:57:53:640] nttbnd2addr: exit </pre> </div></div> <p><font color="red"> <br/> As seen above, even if the initial request was sent to &lt;IP1&gt;, now it's redirected to an host named <em>gdksun2</em>.</font></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>(5996) [27-ŞUB-2009 20:57:53:640] nserror: entry (5996) [27-ŞUB-2009 20:57:53:640] nserror: nsres: id=0, op=77, ns=12545, ns2=12560; nt[0]=515, nt[1]=1001, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0 (5996) [27-ŞUB-2009 20:57:53:640] snsbitts_ts: entry (5996) [27-ŞUB-2009 20:57:53:640] snsbitts_ts: acquired the bit (5996) [27-ŞUB-2009 20:57:53:640] snsbitts_ts: normal exit (5996) [27-ŞUB-2009 20:57:53:640] snsbitcl_ts: entry (5996) [27-ŞUB-2009 20:57:53:640] snsbitcl_ts: normal exit (5996) [27-ŞUB-2009 20:57:53:640] nsc2addr: error exit (5996) [27-ŞUB-2009 20:57:53:640] nsmfr: entry (5996) [27-ŞUB-2009 20:57:53:640] nsmfr: 318 bytes at 0x15bce20 (5996) [27-ŞUB-2009 
20:57:53:640] nsmfr: normal exit (5996) [27-ŞUB-2009 20:57:53:640] nsmfr: entry (5996) [27-ŞUB-2009 20:57:53:640] nsmfr: 164 bytes at 0x15b9920 (5996) [27-ŞUB-2009 20:57:53:640] nsmfr: normal exit (5996) [27-ŞUB-2009 20:57:53:640] nladtrm: entry (5996) [27-ŞUB-2009 20:57:53:640] nladtrm: exit (5996) [27-ŞUB-2009 20:57:53:640] nscall: error exit (5996) [27-ŞUB-2009 20:57:53:640] nioqper: error from nscall (5996) [27-ŞUB-2009 20:57:53:640] nioqper: nr err code: 0 (5996) [27-ŞUB-2009 20:57:53:640] nioqper: ns main err code: 12545 (5996) [27-ŞUB-2009 20:57:53:640] nioqper: ns (2) err code: 12560 (5996) [27-ŞUB-2009 20:57:53:640] nioqper: nt main err code: 515 (5996) [27-ŞUB-2009 20:57:53:640] nioqper: nt (2) err code: 1001 (5996) [27-ŞUB-2009 20:57:53:640] nioqper: nt OS err code: 0 (5996) [27-ŞUB-2009 20:57:53:640] niomapnserror: entry (5996) [27-ŞUB-2009 20:57:53:640] niqme: entry (5996) [27-ŞUB-2009 20:57:53:640] niqme: reporting NS-12545 error as ORA-12545 (5996) [27-ŞUB-2009 20:57:53:640] niqme: exit (5996) [27-ŞUB-2009 20:57:53:640] niomapnserror: returning error 12545 (5996) [27-ŞUB-2009 20:57:53:640] niomapnserror: exit (5996) [27-ŞUB-2009 20:57:53:640] niotns: Couldn't connect, returning 12545 </pre> </div></div> <p>Then, the client got ORA-12545 error.</p> QA-46 ORA-12545: Connect failed in RAC environment because of an implicit redirect to another node. Oracle - SQL*Net Major Closed Not a Problem ubTools Support ubTools Support Fri, 27 Feb 2009 21:05:57 +0000 (UTC) Fri, 27 Feb 2009 22:16:09 +0000 (UTC) 0 In this issue, the client was redirected to the less loaded other node, which is not reachable by the remote client. 
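<p>The failure in the trace comes down to the client being unable to resolve the host named in the redirect. The sketch below is an illustration only, not part of SQL*Net: it extracts HOST and PORT from an address string like the one carried in the redirect data packet and asks a resolver whether the client can resolve that host. The helper names <em>parse_redirect_address</em> and <em>can_resolve</em> are hypothetical.</p>

```python
import re
import socket

# Matches "(HOST=name)...(PORT=1521)" with or without spaces around "=".
ADDRESS_RE = re.compile(
    r"\(HOST\s*=\s*([^)\s]+)\s*\).*?\(PORT\s*=\s*(\d+)\s*\)", re.IGNORECASE
)


def parse_redirect_address(address):
    """Extract (host, port) from a TNS address string."""
    match = ADDRESS_RE.search(address)
    if match is None:
        raise ValueError("no HOST/PORT pair in %r" % address)
    return match.group(1), int(match.group(2))


def can_resolve(host, port, resolver=socket.getaddrinfo):
    """Return True if this machine can resolve the redirected host name."""
    try:
        resolver(host, port)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


# Address as it appears in the redirect data packet of the trace above.
redirect = "(ADDRESS=(PROTOCOL=TCP)(HOST=gdksun2)(PORT=1521))"
host, port = parse_redirect_address(redirect)
print(host, port)
```

<p>Calling <em>can_resolve(host, port)</em> on the affected client would be expected to return False for gdksun2, which corresponds to the "hostname lookup failure" line in the trace.</p>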
<p>This is expected behavior, for the following reasons:</p> <ul class="alternate" type="square"> <li>Depending on the listener.ora configuration, the listener sends an IP address or a hostname back to the client.</li> </ul> <ul class="alternate" type="square"> <li>When load balancing is in use in a RAC environment, a request sent to a listener may be redirected to the other node if that node is less loaded.</li> </ul> <p>In both cases, if the listener sends an unreachable IP address or hostname, the client encounters an error.</p> <p><b><ins>Solutions:</ins></b></p> <ul class="alternate" type="square"> <li>Change the hostname to an IP address in listener.ora, or make the hostname resolvable on the client side through DNS or an "/etc/hosts"-like configuration file.</li> <li>If the database server has multiple IP addresses and each is reachable by only some group of clients, define a separate listener for each group and allow only 1 listener in load balancing.</li> </ul> Operating System Operating System Version 10 Product Version Oracle 10.2.0.4 Standard Edition, RAC [QA-45] 'direct path read temp' hangs on read() system call when ASMLIB in use. 
http://jira.ubtools.com/jira/browse/QA-45 <ins><b>Environment:</b></ins> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Database......: Oracle 10.2.0.4 Standard Edition, RAC ASMLIB........: oracleasm-2.6.16.46-0.12-smp-2.0.3-1.x86_64.rpm oracleasmlib-2.0.2-1.x86_64.rpm oracleasm-support-2.1.2-1.SLE10.x86_64.rpm </pre> </div></div> <p><ins><b>Description:</b></ins></p> <p><em>direct path read temp</em> hangs on read() system call when ASMLIB in use.</p> <p><ins><b>Diagnostic Data for Oracle:</b></ins></p> <p><b>Wait Event:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select SEQ#,EVENT,P1,P2,P3,WAIT_TIME,SECONDS_IN_WAIT from v$session_wait where sid=512 and state='WAITING'; SEQ# EVENT ---------- ---------------------------------------------------------------- P1 P2 P3 WAIT_TIME SECONDS_IN_WAIT ---------- ---------- ---------- ---------- --------------- 46619 direct path read temp 202 285578 7 0 5611 ... SQL&gt; select SEQ#,EVENT,P1,P2,P3,WAIT_TIME,SECONDS_IN_WAIT from v$session_wait where sid=512 and state='WAITING'; SEQ# EVENT ---------- ---------------------------------------------------------------- P1 P2 P3 WAIT_TIME SECONDS_IN_WAIT ---------- ---------- ---------- ---------- --------------- 46619 direct path read temp 202 285578 7 0 5824 SQL&gt; </pre> </div></div> <p>The session is waiting for the completion of <em>direct path read temp</em> for 5824 seconds. The SEQ# column is not changing. 
It's TOO long to read just 7 blocks from the disk.</p> <p><b>Stack Trace:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select spid from v$session s,v$process p where s.paddr=p.addr and s.sid=512; SPID ------------ 2359 SQL&gt; oradebug SETOSPID 2359 Oracle pid: 38, Unix process pid: 2359, image: oracle@gdksun1 SQL&gt; oradebug dump errorstack 3 Statement processed. SQL&gt; oradebug TRACEFILE_NAME /u03/app/oracle/admin/ORCL/udump/orcl1_ora_2359.trc SQL&gt; &lt;from the trace file&gt; Current SQL statement for this session: CREATE INDEX "ACD2" ON "ACCOUNT_DETAIL" ... ----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ksedst()+31 call ksedst1() 000000000 ? 000000001 ? 7FFF177ECC40 ? 7FFF177ECCA0 ? 7FFF177ECBE0 ? 000000000 ? ksedmp()+610 call ksedst() 000000000 ? 000000001 ? 7FFF177ECC40 ? 7FFF177ECCA0 ? 7FFF177ECBE0 ? 000000000 ? ksdxfdmp()+1118 call ksedmp() 000000003 ? 000000001 ? 7FFF177ECC40 ? 7FFF177ECCA0 ? 7FFF177ECBE0 ? 000000000 ? ksdxcb()+1547 call ksdxfdmp() 7FFF177EDD90 ? 000000011 ? 000000003 ? 7FFF177EDED0 ? 7FFF177EDE30 ? 000000000 ? sspuser()+111 call ksdxcb() 000000001 ? 000000011 ? 000000001 ? 000000001 ? 7FFF177EDE30 ? 000000000 ? __funlockfile()+80 call sspuser() 000000001 ? 000000011 ? 000000001 ? 000000001 ? 7FFF177EDE30 ? 000000000 ? __read_nocancel()+7 signal __funlockfile() 00000000D ? 7FFF177EE970 ? 000000050 ? FFFFFFFFFFFFFFFF ? 000000000 ? 2B4E95CCE000 ? call_instance_read( call __read_nocancel() 00000000D ? 7FFF177EE970 ? )+12 000000050 ? FFFFFFFFFFFFFFFF ? 000000000 ? 2B4E95CCE000 ? asm_io_v2()+185 call call_instance_read( 00000000D ? 7FFF177EE970 ? ) 000000050 ? FFFFFFFFFFFFFFFF ? 000000000 ? 2B4E95CCE000 ? kfkOsmIO()+1205 call asm_io_v2() 00000000D ? 7FFF177EE970 ? 000000246 ? FFFFFFFFFFFFFFFF ? 000000000 ? 
2B4E95CCE000 ? kfkReapIO()+497 call kfkOsmIO() 2B4E95830588 ? 2B4E95AAE000 ? 000000000 ? 000000000 ? 000000000 ? 2B4E95B2E000 ? kfkIOPriv()+770 call kfkReapIO() 000000000 ? 006110320 ? 2B4E95830588 ? 006110320 ? 006110320 ? 2B4E95B2E000 ? kfdIOPriv()+95 call kfkIOPriv() 000000000 ? 000000000 ? 000000024 ? 000000000 ? 2B4E95B66040 ? 000000001 ? kfioReapIO()+476 call kfdIOPriv() 000000000 ? 000000000 ? 000000000 ? 000000000 ? 2B4E95B66040 ? 000000001 ? kfioRequest()+197 call kfioReapIO() 7FFF177EED68 ? 000000001 ? 0FFFFFFFF ? 000000000 ? 2B4E95B66040 ? 000000001 ? ksfd_osmwat()+874 call kfioRequest() 000000000 ? 000000000 ? 000000000 ? 000000000 ? 7FFF177EED68 ? 2B4E00000001 ? ksfdwtio()+693 call ksfd_osmwat() 000000001 ? 000000000 ? 07FFFFFFF ? 000000000 ? 7FFF177EED68 ? 2B4E00000001 ? ksfdwat1()+220 call ksfdwtio() 000000001 ? 000000030 ? 07FFFFFFF ? 000000000 ? 7FFF177EED68 ? 2B4E00000001 ? ksfdrwat0()+1269 call ksfdwat1() 000000001 ? 000000030 ? 07FFFFFFF ? 000000000 ? 7FFF177EED68 ? 2B4E00000001 ? ksfdblock()+156 call ksfdrwat0() 000000001 ? 000000030 ? 07FFFFFFF ? 000000000 ? 2B4E7FFFFFFF ? 2B4E00000001 ? kcflwi()+48 call ksfdblock() 7FFF177F11C0 ? 000000001 ? 000000010 ? 000000000 ? 2B4E7FFFFFFF ? 2B4E00000001 ? kcflci()+689 call kcflwi() 2B4E95FF3F28 ? 000000001 ? 000000010 ? 000000000 ? 2B4E7FFFFFFF ? 2B4E00000001 ? kcblci()+197 call kcflci() 2B4E95FF3F28 ? 000000000 ? 0000000CA ? 000045B8A ? 7FFF177F1270 ? 000000000 ? kcblcio()+280 call kcblci() 2B4E95D168F0 ? 2B4E95FF3E70 ? 000000001 ? 000045B8A ? 7FFF177F1270 ? 000000000 ? kcblsltck()+50 call kcblcio() 2B4E95D168F0 ? 2B4E95FF3E70 ? 000000001 ? 000045B8A ? 7FFF177F1270 ? 000000000 ? stsCheckIO()+194 call kcblsltck() 2B4E95D168F0 ? 2B4E95FF3E70 ? 000000001 ? 000045B8A ? 7FFF177F1270 ? 000000000 ? srsnext()+746 call stsCheckIO() 2B4E95D16FE0 ? 2B4E958F9108 ? 000000000 ? 000000001 ? 7FFF177F1270 ? 000000000 ? srsget()+138 call srsnext() 2B4E9602FE14 ? 000000000 ? 2B4E95D16FE0 ? 2B4E958F8F10 ? 2B4E00000000 ? 
000000000 ? sorgetqbf()+297 call srsget() 2B4E95D16F28 ? 000000000 ? 000000000 ? 000000000 ? 2B4E95D372B0 ? 2B4E95D37468 ? qersoFetch()+176 call sorgetqbf() 2B4E95D16F28 ? 2B4E95D37468 ? 2B4E95D372B0 ? 7FFF177F1584 ? 2B4E95D372B0 ? 2B4E95D37468 ? qerliFetch()+304 call qersoFetch() 5EEF015A0 ? 002D292E4 ? 7FFF177F1748 ? 000000001 ? 5EEF01648 ? 2B4E95D37468 ? kdicrws()+8744 call qerliFetch() 5EEF01358 ? 00143CCD6 ? 2B4E9581C140 ? 000000001 ? 5EEF01648 ? 5D2E60688 ? kdicdrv()+335 call kdicrws() 5D2E60688 ? 5D2E60B60 ? 000000000 ? 000000001 ? 2B4E95D367B0 ? 5D2E60610 ? opiexe()+12879 call kdicdrv() 5D2E60B60 ? 5D2E60380 ? 000000002 ? 000000001 ? 2B4E95D367B0 ? 000000000 ? opiosq0()+3316 call opiexe() 000000004 ? 000000000 ? 7FFF177F3748 ? 000000004 ? 2B4E95D367B0 ? 000000000 ? opiosq()+11 call opiosq0() 000000003 ? 00000000F ? 7FFF177F65E0 ? 000000000 ? 2B4E95D367B0 ? 000000000 ? opiodr()+984 call opiosq() 000000003 ? 00000000F ? 7FFF177F65E0 ? 000000000 ? 2B4E95D367B0 ? 000000000 ? ttcpip()+1012 call opiodr() 00000004A ? 00000000F ? 7FFF177F65E0 ? 000000004 ? 0053E3F30 ? 000000000 ? opitsk()+1322 call ttcpip() 0060AB150 ? 7FFF177F4540 ? 7FFF177F65E0 ? 000000000 ? 7FFF177F60D8 ? 7FFF177F6748 ? opiino()+1026 call opitsk() 000000003 ? 000000000 ? 7FFF177F65E0 ? 000000001 ? 000000000 ? 612CA0900000000 ? opiodr()+984 call opiino() 00000003C ? 000000004 ? 7FFF177F77A8 ? 000000000 ? 000000000 ? 612CA0900000000 ? opidrv()+547 call opiodr() 00000003C ? 000000004 ? 7FFF177F77A8 ? 000000000 ? 0053E3D00 ? 612CA0900000000 ? sou2o()+114 call opidrv() 00000003C ? 000000004 ? 7FFF177F77A8 ? 000000000 ? 0053E3D00 ? 612CA0900000000 ? opimai_real()+163 call sou2o() 7FFF177F7780 ? 00000003C ? 000000004 ? 7FFF177F77A8 ? 0053E3D00 ? 612CA0900000000 ? main()+116 call opimai_real() 000000002 ? 7FFF177F7810 ? 000000004 ? 7FFF177F77A8 ? 0053E3D00 ? 612CA0900000000 ? __libc_start_main() call main() 000000002 ? 7FFF177F7810 ? +244 000000004 ? 7FFF177F77A8 ? 0053E3D00 ? 612CA0900000000 ? 
_start()+41 call __libc_start_main() 0006D23A8 ? 000000002 ? 7FFF177F7968 ? 000000000 ? 0053E3D00 ? 000000002 ? </pre> </div></div> <p>This looks like a hang in __read_nocancel().</p> <p><ins><b>Diagnostic Data for Linux:</b></ins></p> <p><b>strace Output:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>oracle@gdksun1:~&gt; strace -fp 2359 Process 2359 attached - interrupt to quit read(13, </pre> </div></div> <p>The process is sleeping in the read() system call on file descriptor (fd) 13.</p> <p><b>lsof Output:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>oracle@gdksun1:~&gt; lsof -p 2359|grep 13 oracle 2359 oracle DEL REG 0,12 131074 /2 oracle 2359 oracle mem REG 8,2 133423 16839 /lib64/ld-2.4.so oracle 2359 oracle mem REG 8,17 681761 52138 /u03/app/oracle/product/10.2.0/db_1/lib/libocr10.so oracle 2359 oracle mem REG 8,17 691049 52139 /u03/app/oracle/product/10.2.0/db_1/lib/libocrb10.so oracle 2359 oracle mem REG 8,17 11385162 44025 /u03/app/oracle/product/10.2.0/db_1/lib/libjox10.so oracle 2359 oracle 6w REG 8,17 1494136 90989 /u03/app/oracle/admin/ORCL/bdump/alert_ORCL1.log oracle 2359 oracle 13u REG 0,19 0 18446604444591769000 /dev/oracleasm/iid/0000000000000002 oracle@gdksun1:~&gt; </pre> </div></div> <p>fd#13 is an ASM device.</p> <p><b>gdb output:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>oracle@gdksun1:~&gt; gdb $ORACLE_HOME/bin/oracle 2359 GNU gdb 6.6 Copyright (C) 2006 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "x86_64-suse-linux"... 
Using host libthread_db library "/lib64/libthread_db.so.1". Attaching to program: /u03/app/oracle/product/10.2.0/db_1/bin/oracle, process 2359 Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libskgxp10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libskgxp10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libhasgen10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libhasgen10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libskgxn2.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libskgxn2.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libocr10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libocr10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libocrb10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libocrb10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libocrutl10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libocrutl10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libjox10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libjox10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libclsra10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libclsra10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libdbcfg10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libdbcfg10.so Reading symbols from /u03/app/oracle/product/10.2.0/db_1/lib/libnnz10.so...done. Loaded symbols for /u03/app/oracle/product/10.2.0/db_1/lib/libnnz10.so Reading symbols from /usr/lib64/libaio.so.1...done. Loaded symbols for /usr/lib64/libaio.so.1 Reading symbols from /lib64/libdl.so.2...done. Loaded symbols for /lib64/libdl.so.2 Reading symbols from /lib64/libm.so.6...done. 
Loaded symbols for /lib64/libm.so.6 Reading symbols from /lib64/libpthread.so.0...done. [Thread debugging using libthread_db enabled] [New Thread 47616513025792 (LWP 2359)] Loaded symbols for /lib64/libpthread.so.0 Reading symbols from /lib64/libnsl.so.1...done. Loaded symbols for /lib64/libnsl.so.1 Reading symbols from /lib64/libc.so.6...done. Loaded symbols for /lib64/libc.so.6 Reading symbols from /lib64/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /usr/lib64/libnuma.so...done. Loaded symbols for /usr/lib64/libnuma.so Reading symbols from /opt/oracle/extapi/64/asm/orcl/1/libasm.so...done. Loaded symbols for /opt/oracle/extapi/64/asm/orcl/1/libasm.so 0x00002b4e9512f910 in __read_nocancel () from /lib64/libpthread.so.0 (gdb) (gdb) (gdb) (gdb) (gdb) (gdb) (gdb) (gdb) (gdb) (gdb) (gdb) (gdb) backtrace #0 0x00002b4e9512f910 in __read_nocancel () from /lib64/libpthread.so.0 #1 0x00002b4e95d6772c in call_instance_read (priv=&lt;value optimized out&gt;, buf=0x7fff177ee970, size=80) at asmlib_v2.c:540 #2 0x00002b4e95d67869 in asm_io_v2 (ctx=0xd, requests=&lt;value optimized out&gt;, reqlen=582, waitreqs=0xffffffffffffffff, waitlen=0, completions=0x2b4e95cce000, complen=1, timeout=4294967295, statusp=0x7fff177eea14) at asmlib_v2.c:705 #3 0x0000000000b1804d in kfkOsmIO () #4 0x0000000000b113e9 in kfkReapIO () #5 0x0000000000b0b342 in kfkIOPriv () #6 0x0000000000a8939f in kfdIOPriv () #7 0x0000000000b06eec in kfioReapIO () #8 0x0000000000b04de5 in kfioRequest () #9 0x00000000008d669a in ksfd_osmwat () #10 0x00000000008be64d in ksfdwtio () #11 0x00000000008bb3a4 in ksfdwat1 () #12 0x00000000008bb1f5 in ksfdrwat0 () #13 0x00000000008bb464 in ksfdblock () #14 0x00000000026c0e98 in kcflwi () #15 0x00000000026c0e31 in kcflci () #16 0x0000000001011435 in kcblci () #17 0x0000000001010e20 in kcblcio () #18 0x0000000001010ca2 in kcblsltck () #19 0x00000000020588f0 in stsCheckIO () #20 0x0000000002063908 in srsnext () #21 
0x0000000002062eba in srsget () #22 0x000000000205d089 in sorgetqbf () #23 0x0000000002d64222 in qersoFetch () #24 0x0000000002d2e412 in qerliFetch () #25 0x00000000014327d2 in kdicrws () #26 0x000000000142fa47 in kdicdrv () #27 0x0000000002f2cf07 in opiexe () #28 0x00000000034c55d4 in opiosq0 () #29 0x00000000034c48db in opiosq () #30 0x00000000012e88f4 in opiodr () #31 0x0000000003a4b900 in ttcpip () #32 0x00000000012e3fc4 in opitsk () #33 0x00000000012e6ee4 in opiino () #34 0x00000000012e88f4 in opiodr () #35 0x00000000012da313 in opidrv () #36 0x0000000001e62466 in sou2o () #37 0x00000000006d24cb in opimai_real () #38 0x00000000006d241c in main () (gdb) </pre> </div></div> <p>This looks like a hang in __read_nocancel(), the same as in the Oracle stack trace.</p> <p><b>An Excerpt from /var/log/messages:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Feb 2 22:46:51 gdksun1 kernel: 509 [RAIDarray.mpp]mppLnx_do_queuecommand: mppLnx_scsi_execute_async failed. </pre> </div></div> <p>At the same time, <em>mppLnx_do_queuecommand: mppLnx_scsi_execute_async failed</em> appeared in <em>/var/log/messages</em>.</p> QA-45 'direct path read temp' hangs on read() system call when ASMLIB in use. Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Mon, 2 Feb 2009 23:13:16 +0000 (UTC) Tue, 3 Feb 2009 00:03:05 +0000 (UTC) 0 The problem is caused by a hang in __read_nocancel() from /lib64/libpthread.so.0. <p>The OS vendor's driver looks incompatible with Oracle ASMLIB.</p> Operating System Operating System Version SLES 10 SP1 (x86-64) Product Version Oracle 10.2.0.4 Standard Edition, RAC [QA-44] TNS connection lost for big SQL*Net packets, and slow performance for small SQL*Net packets. 
http://jira.ubtools.com/jira/browse/QA-44 The customer has the following errors in the SQL*Net SERVER trace: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>[17-KAS-2008 06:02:33:866] nspsend: transport write error [17-KAS-2008 06:02:33:867] nserror: nsres: id=0, op=67, ns=12547, ns2=12560; nt[0]=517, nt[1]=32, nt[2]=0; ora[ 0]=0, ora[1]=0, ora[2]=0 [17-KAS-2008 06:02:33:868] nsdo: nsctxrnk=0 [17-KAS-2008 06:02:33:869] nioqsn: send failed: bl = 47, nicbl = 59 [17-KAS-2008 06:02:33:870] nioqper: error from nioqsn [17-KAS-2008 06:02:33:871] nioqper: nr err code: 0 [17-KAS-2008 06:02:33:872] nioqper: ns main err code: 12547 [17-KAS-2008 06:02:33:873] nioqper: ns (2) err code: 12560 [17-KAS-2008 06:02:33:875] nioqper: nt main err code: 517 [17-KAS-2008 06:02:33:876] nioqper: nt (2) err code: 32 [17-KAS-2008 06:02:33:877] nioqper: nt OS err code: 0 [17-KAS-2008 06:02:33:878] nioqer: entry [17-KAS-2008 06:02:33:879] nioqce: entry [17-KAS-2008 06:02:33:880] nioqce: exit [17-KAS-2008 06:02:33:881] nioqer: exit [17-KAS-2008 06:02:33:882] nioqsn: returning error: 3113 </pre> </div></div> <p>nt&#91;1&#93;=32 is an Operating System Dependent (OSD) error code.</p> <p>An excerpt from the truss output of the SERVER process:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>531245: lseek(7, 0, SEEK_CUR) = 501528 531245: write(7, " [ 1 7 - K A S - 2 0 0 8".., 34) = 34 531245: lseek(7, 0, SEEK_CUR) = 501562 531245: write(7, " e n t r y\n", 6) = 6 531245: write(12, "07DB\0\006\0\0\0\0\002C2".., 2011) Err#32 Broken pipe 531245: Received signal #13, SIGPIPE [ignored] 531245: siginfo: SIGPIPE 531245: lseek(8, 69632, SEEK_SET) = 69632 531245: read(8, "12\0DB0F\0\0 t\0DC0F\0\0".., 512) = 512 531245: lseek(8, 13312, SEEK_SET) = 13312 531245: read(8, "19\0A203\0\09E\0A303\0\0".., 512) = 512 531245: gettimeofday(0x000000011FFF7A90, 0x00000000) = 0 531245: 
lseek(7, 0, SEEK_CUR) = 501568 </pre> </div></div> <p>The OSD error is Err#32 Broken pipe. This error is also defined in errno.h:</p> <ul class="alternate" type="square"> <li>#define EPIPE 32 /* Broken pipe */</li> </ul> <p>The client-side SQL*Net trace shows that the client is waiting for a response from the server in the nttrd() call.</p> <p>Since the server process has lost its connection, it cannot send a message to the client. Since the client gets no response, its window stays in the "Not Responding" state in Windows.</p> QA-44 TNS connection lost for big SQL*Net packets, and slow performance for small SQL*Net packets. Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Wed, 26 Nov 2008 12:28:44 +0000 (UTC) Wed, 26 Nov 2008 13:38:00 +0000 (UTC) 0 The customer uses a Cisco ASA 5520 series firewall, version 8.0.4. It has an <em>inspect sqlnet</em> option. After this option was disabled, the problem was solved. Operating System Product Version Oracle 9i [QA-43] Slow performance while navigating on the forms items. http://jira.ubtools.com/jira/browse/QA-43 The customer encountered slow performance while navigating between the items on their forms. The problem occurred sporadically. When it occurs, navigation takes 3-4 seconds, which is not acceptable. 
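<p>One way to locate a delay of this size in an EVENT 10046 trace is to compare consecutive tim= values. The sketch below is a hedged illustration: it assumes tim is reported in microseconds, as the elapsed-time arithmetic in this issue does, and the function name <em>largest_tim_gap</em> is hypothetical. The sample lines are the two WAIT lines from this issue's trace.</p>

```python
import re

TIM_RE = re.compile(r"\btim=(\d+)")


def largest_tim_gap(trace_lines):
    """Return (gap_in_microseconds, line_before, line_after) for the biggest
    jump between consecutive tim= values, or None if fewer than two exist."""
    best = None
    previous = None  # (tim value, line text) of the last line carrying tim=
    for line in trace_lines:
        match = TIM_RE.search(line)
        if match is None:
            continue  # timestamp banners and statement text carry no tim=
        current = (int(match.group(1)), line)
        if previous is not None:
            gap = current[0] - previous[0]
            if best is None or gap > best[0]:
                best = (gap, previous[1], current[1])
        previous = current
    return best


sample = [
    "WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1420774146",
    "*** 2008-11-24 21:27:24.765",
    "WAIT #0: nam='SQL*Net message from client' ela= 6 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1424766291",
]
gap, before, after = largest_tim_gap(sample)
print(gap / 1_000_000, "seconds")
```

<p>The two lines returned around the largest gap show where the time was lost, which is more reliable than the ela= values when the delay happens between the recorded waits.</p>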
<p>An excerpt from EVENT 10046 trace file:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>*** [ Windows thread id: 5580 ] *** 2008-11-24 21:27:20.390 RPC CALL:...(stament removed); ===================== PARSING IN CURSOR #5 len=56 dep=1 uid=78 oct=3 lid=78 tim=1420397928 hv=3822139714 ad='57c1a3bc' SELECT ...(stament removed) END OF STMT PARSE #5:c=0,e=32,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1420397924 EXEC #5:c=0,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1420398028 FETCH #5:c=0,e=14,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=1,tim=1420398156 RPC EXEC:c=0,e=0 WAIT #0: nam='SQL*Net message to client' ela= 4 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1420398646 *** 2008-11-24 21:27:20.765 WAIT #0: nam='SQL*Net message from client' ela= 3 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1420772565 RPC CALL:...(stament removed); EXEC #5:c=0,e=38,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1420773632 FETCH #5:c=0,e=14,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=1,tim=1420773671 RPC EXEC:c=0,e=0 WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1420774146 *** [ Windows thread id: 4416 ] *** 2008-11-24 21:27:24.765 WAIT #0: nam='SQL*Net message from client' ela= 6 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1424766291 RPC CALL:...(stament removed); ===================== PARSING IN CURSOR #7 len=70 dep=1 uid=78 oct=3 lid=78 tim=1424767161 hv=2516830579 ad='5af8ae9c' SELECT ...(stament removed) END OF STMT PARSE #7:c=0,e=78,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424767156 EXEC #7:c=0,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424767987 FETCH #7:c=0,e=94,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=1,tim=1424768238 RPC EXEC:c=0,e=0 WAIT #0: nam='SQL*Net message to client' ela= 7 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1424768583 PUT: vc (659FD948), msg (5CF1D5B0), size (90), flgs (3) *** [ Windows thread id: 5580 ] *** 2008-11-24 21:27:24.765 WAIT #0: nam='SQL*Net message from 
client' ela= 4 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1424771121
RPC CALL:...(stament removed);
=====================
PARSING IN CURSOR #8 len=90 dep=1 uid=78 oct=3 lid=78 tim=1424772412 hv=1036237733 ad='5d1575f0'
SELECT ...(stament removed)
END OF STMT
PARSE #8:c=0,e=56,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424772407
EXEC #8:c=0,e=55,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424772567
FETCH #8:c=0,e=14,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424772608
RPC EXEC:c=0,e=0
WAIT #0: nam='SQL*Net message to client' ela= 6 driver id=1297371904 #bytes=4 p3=0 obj#=-1 tim=1424773339
</pre> </div></div>
QA-43 Slow performance while navigating on the forms items. Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Wed, 26 Nov 2008 11:56:57 +0000 (UTC) Wed, 26 Nov 2008 12:10:10 +0000 (UTC) 0
The time was spent between the following two operations:
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1420774146
*** [ Windows thread id: 4416 ]
*** 2008-11-24 21:27:24.765
WAIT #0: nam='SQL*Net message from client' ela= 6 driver id=1297371904 #bytes=1 p3=0 obj#=-1 tim=1424766291
</pre> </div></div>
<p>As the excerpt above shows, the Windows thread ID switched to 4416 between the two waits.</p>
<p>Elapsed time: (tim=1424766291) - (tim=1420774146) = 3992145 microseconds = 3.992145 seconds.</p>
<p>The customer uses a SHARED SERVER configuration, but the SHARED_SERVERS parameter was set to 1. Unfortunately, we did not get an opportunity to debug the SHARED SERVER operations. Increasing the SHARED_SERVERS parameter solved this thread switch problem, since shared servers are now pre-created.</p>
Operating System Operating System Version 2003 Product Version Oracle 10g 10.2.0.4 [QA-42] ORA-27040 ORA-19504 OSD-04002: While backing up by RMAN to shared disk on Windows.
http://jira.ubtools.com/jira/browse/QA-42 (The solution is simple, but it is documented here because it saves setup time.)
<p><em>Note:145843.1 How to Configure RMAN to Write to Shared Drives on Windows NT/2000</em> has been implemented. Although the script works on the RMAN command line, it fails when it is defined as a job in Enterprise Manager (EM).</p>
<p>An excerpt from the script:</p>
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>run {
allocate channel ch0 device type disk format '\\host\RMAN\RMAN_%U';
...
}
</pre> </div></div>
<p>An excerpt from the output log of EM:</p>
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>RMAN&gt; run
2&gt; {
3&gt; allocate channel ch0 device type disk format '\host\RMAN\RMAN_%U';
...
10&gt; }
...
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup plus archivelog command at 09/18/2008 17:37:36
ORA-19504: failed to create file "C:\host\RMAN\RMAN_1KJQTRAU_1_1"
ORA-27040: file create error, unable to create file
OSD-04002: unable to open file
O/S-Error: (OS 3) The system cannot find the path specified.
</pre> </div></div>
QA-42 ORA-27040 ORA-19504 OSD-04002: While backing up by RMAN to shared disk on Windows. Oracle - Administration Major Closed Answered ubTools Support ubTools Support Thu, 18 Sep 2008 14:35:59 +0000 (UTC) Thu, 18 Sep 2008 15:04:19 +0000 (UTC) 0
As seen above, the file name in the script is '\\host\RMAN\RMAN_%U', but EM converts it to:
<ul class="alternate" type="square"> <li>'\host\RMAN\RMAN_%U'</li> <li>and then to "C:\host\RMAN\RMAN_1KJQTRAU_1_1"</li> </ul>
The '\' character is the escape character in Java and C string literals, so a doubled '\' is read as a single one.
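<p>The observed conversion is consistent with one pass of backslash unescaping somewhere between the EM job definition and RMAN. The Python sketch below models such a pass under an assumption that the source does not confirm: each pair of backslashes collapses to one, and an unpaired backslash is left unchanged. The function name <em>collapse_backslash_pairs</em> is hypothetical and is not an EM or RMAN API.</p>

```python
def collapse_backslash_pairs(text: str) -> str:
    # Hypothetical model of one unescaping pass: scanning left to right,
    # every pair of backslashes becomes a single backslash; a backslash
    # that has no partner is kept as it is.
    return text.replace("\\\\", "\\")


# Raw strings keep the backslashes exactly as they are typed.
two_leading = r"\\host\RMAN\RMAN_%U"     # as written in the RMAN script
three_leading = r"\\\host\RMAN\RMAN_%U"  # with one extra leading backslash

print(collapse_backslash_pairs(two_leading))    # one leading backslash is left
print(collapse_backslash_pairs(three_leading))  # two leading backslashes are left
```

<p>Under this model, a name that starts with two backslashes loses its UNC prefix, while a name that starts with three still reaches RMAN as a UNC path.</p>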
When the script is defined as an EM job, the file name should be written as:
<ul class="alternate" type="square"> <li>'\\\host\RMAN\RMAN_%U'</li> </ul>
<p>Three '\' characters, not two, should be used before the host name.</p>
Operating System Operating System Version Windows 2000 Product Version 10.2.0.4 [QA-41] Startup database fails with ORA-600 [4000], ORA-600 [4137]. http://jira.ubtools.com/jira/browse/QA-41
After a hardware problem, the database crashed.
<p>Since an ARCHIVELOG was missing and restoring the previous backup was not acceptable, the customer wanted to open the database in an inconsistent state.</p>
QA-41 Startup database fails with ORA-600 [4000], ORA-600 [4137]. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Mon, 16 Jun 2008 23:31:50 +0000 (UTC) Tue, 17 Jun 2008 09:28:35 +0000 (UTC) 0
Steps to open the database:
<ul class="alternate" type="square"> <li>Set <em>_ALLOW_RESETLOGS_CORRUPTION=TRUE</em> in init&lt;SID&gt;.ora.</li> <li>startup mount;</li> <li>recover database until cancel;<br/> &lt;--cancel</li> <li>alter database open resetlogs;</li> </ul>
<p>However, this failed with the following error:</p>
<blockquote><p>ORA-00600: internal error code, arguments: [4000], [9], [], [], [], [], [], []</p></blockquote>
<p>Oracle Note:47456.1:</p>
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>DESCRIPTION:
This has the potential to be a very serious error. It means that Oracle has tried to find an undo segment number in the dictionary cache and failed.
ARGUMENTS:
Arg [a] Undo segment number
FUNCTIONALITY:
KERNEL TRANSACTION UNDO
IMPACT:
INSTANCE FAILURE - Instance will not restart
STATEMENT FAILURE
</pre> </div></div>
An excerpt from the trace file:
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [4000], [9], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
...
Block header dump: 0x0080003e
Object id on Block? Y
seg/obj: 0x12 csc: 0x570.b8368d16 itc: 1 flg: - typ: 1 - DATA
fsl: 0 fnx: 0x0 ver: 0x01
...
Itl Xid Uba Flag Lck Scn/Fsc
0x01 xid: 0x0009.019.000dc23f uba: 0x58c13ddb.0523.46 --U- 1 fsc 0x0000.b8368d17
</pre> </div></div>
<p>The problem appears to concern obj$ and its undo: the xid in the ITL entry (0x0009.019.000dc23f) refers to undo segment 9, the segment number reported in the ORA-600 arguments. If the need for this undo can be bypassed, the missing undo segment no longer matters. To achieve that, the SCN has to be bumped further ahead.</p>
<p><em>csc</em> shows the SCN of the last block cleanout. We <em>guessed</em> that it could be used as the target SCN for the bump, as below:</p>
<ul class="alternate" type="square"> <li>0x570.b8368d16 =&gt; 0x570b8368d16 =&gt; Decimal: 5981685058838 =&gt; divided by 1024*1024*1024 = approximately 5570.88, rounded up to 5571</li> </ul>
<p>Bump the SCN as below and restart:</p>
<ul class="alternate" type="square"> <li>Set <em>_MINIMUM_GIGA_SCN</em> = 5571 in init&lt;SID&gt;.ora</li> <li>startup mount;</li> <li>recover database until cancel;<br/> &lt;--cancel</li> <li>alter database open resetlogs;</li> </ul>
<p>ORA-600 [4000] disappeared, but the following error appeared instead:</p>
<blockquote><p>ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []</p></blockquote>
<p>Oracle Note:47456.1: </p>
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>DESCRIPTION:
While backing out an undo record (i.e.
at the time of rollback) we found a transaction id mis-match indicating either a corruption in the rollback segment or corruption in an object which the rollback segment is trying to apply undo records on. This would indicate a corrupted rollback segment.
FUNCTIONALITY:
Kernel Transaction Undo Recovery
IMPACT:
POSSIBLE PHYSICAL CORRUPTION in Rollback segments
</pre> </div></div>
Restart the database as below:
<ul class="alternate" type="square"> <li>Set _CORRUPTED_ROLLBACK_SEGMENTS in init&lt;SID&gt;.ora</li> <li>startup mount;</li> <li>recover database until cancel;<br/> &lt;--cancel</li> <li>alter database open resetlogs;</li> </ul>
<p>The database opened.</p>
<p>Since the database was opened in an inconsistent state, a full export followed by an import into a new database is required to get rid of the inconsistency in the Oracle dictionary. Even then, the customer data will not be consistent after the import, so it should be reviewed by the customer.</p>
The database was opened in an inconsistent state. It will be recreated with a full export/import.
Operating System Product Version Oracle 8.1.7.3.0 [QA-40] "Oracle Database Server" status is INVALID after applying 10.2.0.4 PatchSet. http://jira.ubtools.com/jira/browse/QA-40
After applying the 10.2.0.4.0 PatchSet to 10.2.0.3.0, the catupgrd.sql log shows the following:
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>...
SQL&gt; CREATE OR REPLACE PACKAGE BODY dbms_sqlpa wrapped
2 a000000
3 1
4 abcd
5 abcd
6 abcd
...
Warning: Package Body created with compilation errors.
SQL&gt; show errors;
Errors for PACKAGE BODY DBMS_SQLPA:
LINE/COL ERROR
-------- -----------------------------------------------------------------
113/5 PL/SQL: SQL Statement ignored
118/44 PL/SQL: ORA-00904: "OTHER_XML": invalid identifier
SQL&gt;
...
Component Status Version HH:MM:SS
Oracle Database Server INVALID 10.2.0.4.0 00:09:22
JServer JAVA Virtual Machine VALID 10.2.0.4.0 00:02:43
Oracle XDK VALID 10.2.0.4.0 00:00:29
Oracle Database Java Packages VALID 10.2.0.4.0 00:00:14
Oracle Text VALID 10.2.0.4.0 00:00:21
Oracle XML Database VALID 10.2.0.4.0 00:02:02
Oracle Workspace Manager VALID 10.2.0.4.3 00:00:43
Oracle Data Mining VALID 10.2.0.4.0 00:00:20
OLAP Analytic Workspace VALID 10.2.0.4.0 00:00:16
OLAP Catalog VALID 10.2.0.4.0 00:00:55
Oracle OLAP API VALID 10.2.0.4.0 00:00:43
Oracle interMedia VALID 10.2.0.4.0 00:02:24
Spatial VALID 10.2.0.4.0 00:01:34
Oracle Ultra Search VALID 10.2.0.4.0 00:00:22
Oracle Expression Filter VALID 10.2.0.4.0 00:00:09
Oracle Enterprise Manager VALID 10.2.0.4.0 00:01:36
Oracle Rule Manager VALID 10.2.0.4.0 00:00:08
.
</pre> </div></div>
QA-40 "Oracle Database Server" status is INVALID after applying 10.2.0.4 PatchSet. Oracle - Administration Major Closed Answered ubTools Support ubTools Support Sun, 15 Jun 2008 18:06:17 +0000 (UTC) Sun, 15 Jun 2008 18:34:09 +0000 (UTC) 0
The compilation of DBMS_SQLPA causes the problem. An ERRORSTACK trace for ORA-904 would be useful for finding the object that is expected to have the <em>OTHER_XML</em> column. However, since <em>OTHER_XML</em> is a known column of PLAN_TABLE, the trace was not required to diagnose the problem.
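<p>The component summary at the end of the catupgrd.sql log can also be checked mechanically for entries that are not VALID. The Python sketch below is an illustration only: it assumes that each summary row has the form "component name, status, version, HH:MM:SS" with the status as a single upper-case word, and the function name <em>non_valid_components</em> is hypothetical.</p>

```python
import re

# One summary row: component name, status word, dotted version, elapsed time.
ROW_RE = re.compile(
    r"^(?P<name>.+?)\s+(?P<status>[A-Z]+)\s+"
    r"(?P<version>\d+(?:\.\d+)+)\s+(?P<elapsed>\d{2}:\d{2}:\d{2})$"
)


def non_valid_components(summary: str):
    """Return (component, status) pairs for rows whose status is not VALID."""
    problems = []
    for line in summary.splitlines():
        match = ROW_RE.match(line.strip())
        # The header row and separator lines do not match and are skipped.
        if match and match.group("status") != "VALID":
            problems.append((match.group("name"), match.group("status")))
    return problems


# A few rows copied from the summary above.
sample = """\
Component Status Version HH:MM:SS
Oracle Database Server INVALID 10.2.0.4.0 00:09:22
JServer JAVA Virtual Machine VALID 10.2.0.4.0 00:02:43
Oracle Workspace Manager VALID 10.2.0.4.3 00:00:43
"""
print(non_valid_components(sample))
```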
<p>The database contained both a SYS.PLAN_TABLE table and a PUBLIC.PLAN_TABLE public synonym:</p>
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select owner,object_name,object_type from dba_objects where owner in ('SYS','PUBLIC') and upper(object_name) like 'PLAN_TABLE%';

OWNER
------------------------------
OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE
-------------------
PUBLIC
PLAN_TABLE
SYNONYM

SYS
PLAN_TABLE
TABLE

OWNER
------------------------------
OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE
-------------------
SYS
PLAN_TABLE$
TABLE

SQL&gt; select TABLE_OWNER,TABLE_NAME from dba_synonyms where OWNER='PUBLIC' and SYNONYM_NAME='PLAN_TABLE';

TABLE_OWNER                    TABLE_NAME
------------------------------ ------------------------------
SYS                            PLAN_TABLE$

SQL&gt;
</pre> </div></div>
<p>However, the SYS.PLAN_TABLE table and the table behind the PUBLIC.PLAN_TABLE synonym (SYS.PLAN_TABLE$) do not have the same columns:</p>
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; desc sys.plan_table
Name Null? Type
----------------------------------------- -------- ----------------------------
STATEMENT_ID VARCHAR2(30)
TIMESTAMP DATE
REMARKS VARCHAR2(80)
OPERATION VARCHAR2(30)
OPTIONS VARCHAR2(255)
OBJECT_NODE VARCHAR2(128)
OBJECT_OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(30)
OBJECT_INSTANCE NUMBER(38)
OBJECT_TYPE VARCHAR2(30)
OPTIMIZER VARCHAR2(255)
SEARCH_COLUMNS NUMBER
ID NUMBER(38)
PARENT_ID NUMBER(38)
POSITION NUMBER(38)
COST NUMBER(38)
CARDINALITY NUMBER(38)
BYTES NUMBER(38)
OTHER_TAG VARCHAR2(255)
PARTITION_START VARCHAR2(255)
PARTITION_STOP VARCHAR2(255)
PARTITION_ID NUMBER(38)
OTHER LONG

SQL&gt;
SQL&gt; desc sys.plan_table$
Name Null?
Type
----------------------------------------- -------- ----------------------------
STATEMENT_ID VARCHAR2(30)
PLAN_ID NUMBER
TIMESTAMP DATE
REMARKS VARCHAR2(4000)
OPERATION VARCHAR2(30)
OPTIONS VARCHAR2(255)
OBJECT_NODE VARCHAR2(128)
OBJECT_OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(30)
OBJECT_ALIAS VARCHAR2(65)
OBJECT_INSTANCE NUMBER(38)
OBJECT_TYPE VARCHAR2(30)
OPTIMIZER VARCHAR2(255)
SEARCH_COLUMNS NUMBER
ID NUMBER(38)
PARENT_ID NUMBER(38)
DEPTH NUMBER(38)
POSITION NUMBER(38)
COST NUMBER(38)
CARDINALITY NUMBER(38)
BYTES NUMBER(38)
OTHER_TAG VARCHAR2(255)
PARTITION_START VARCHAR2(255)
PARTITION_STOP VARCHAR2(255)
PARTITION_ID NUMBER(38)
OTHER LONG
OTHER_XML CLOB
DISTRIBUTION VARCHAR2(30)
CPU_COST NUMBER(38)
IO_COST NUMBER(38)
TEMP_SPACE NUMBER(38)
ACCESS_PREDICATES VARCHAR2(4000)
FILTER_PREDICATES VARCHAR2(4000)
PROJECTION VARCHAR2(4000)
TIME NUMBER(38)
QBLOCK_NAME VARCHAR2(30)

SQL&gt;
</pre> </div></div>
<p>DBMS_SQLPA is owned by SYS, and an object in the referencing schema takes precedence over a public synonym during name resolution, so the SYS.PLAN_TABLE table was used instead of the synonym. This table does not have a column named <em>OTHER_XML</em>, which caused the problem.</p>
<p>After the SYS.PLAN_TABLE table was dropped, the name PLAN_TABLE resolved to the PUBLIC.PLAN_TABLE synonym:</p>
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; drop table sys.plan_table;

Table dropped.

SQL&gt;
SQL&gt; desc plan_table
Name Null?
Type ----------------------------------------- -------- ---------------------------- STATEMENT_ID VARCHAR2(30) PLAN_ID NUMBER TIMESTAMP DATE REMARKS VARCHAR2(4000) OPERATION VARCHAR2(30) OPTIONS VARCHAR2(255) OBJECT_NODE VARCHAR2(128) OBJECT_OWNER VARCHAR2(30) OBJECT_NAME VARCHAR2(30) OBJECT_ALIAS VARCHAR2(65) OBJECT_INSTANCE NUMBER(38) OBJECT_TYPE VARCHAR2(30) OPTIMIZER VARCHAR2(255) SEARCH_COLUMNS NUMBER ID NUMBER(38) PARENT_ID NUMBER(38) DEPTH NUMBER(38) POSITION NUMBER(38) COST NUMBER(38) CARDINALITY NUMBER(38) BYTES NUMBER(38) OTHER_TAG VARCHAR2(255) PARTITION_START VARCHAR2(255) PARTITION_STOP VARCHAR2(255) PARTITION_ID NUMBER(38) OTHER LONG OTHER_XML CLOB DISTRIBUTION VARCHAR2(30) CPU_COST NUMBER(38) IO_COST NUMBER(38) TEMP_SPACE NUMBER(38) ACCESS_PREDICATES VARCHAR2(4000) FILTER_PREDICATES VARCHAR2(4000) PROJECTION VARCHAR2(4000) TIME NUMBER(38) QBLOCK_NAME VARCHAR2(30) SQL&gt; </pre> </div></div> <p>Applying PatchSet did not give INVALID status:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Component Status Version HH:MM:SS Oracle Database Server VALID 10.2.0.4.0 00:09:20 JServer JAVA Virtual Machine VALID 10.2.0.4.0 00:02:56 Oracle XDK VALID 10.2.0.4.0 00:00:28 Oracle Database Java Packages VALID 10.2.0.4.0 00:00:14 Oracle Text VALID 10.2.0.4.0 00:00:22 Oracle XML Database VALID 10.2.0.4.0 00:02:05 Oracle Workspace Manager VALID 10.2.0.4.3 00:00:45 Oracle Data Mining VALID 10.2.0.4.0 00:00:21 OLAP Analytic Workspace VALID 10.2.0.4.0 00:00:16 OLAP Catalog VALID 10.2.0.4.0 00:00:55 Oracle OLAP API VALID 10.2.0.4.0 00:00:41 Oracle interMedia VALID 10.2.0.4.0 00:02:24 Spatial VALID 10.2.0.4.0 00:01:37 Oracle Ultra Search VALID 10.2.0.4.0 00:00:22 Oracle Expression Filter VALID 10.2.0.4.0 00:00:09 Oracle Enterprise Manager VALID 10.2.0.4.0 00:01:37 Oracle Rule Manager VALID 10.2.0.4.0 00:00:08 . 
</pre> </div></div> <ul class="alternate" type="square"> <li>Drop SYS.PLAN_TABLE table.</li> <li>Install PatchSet.</li> </ul> Operating System Product Version Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production [QA-39] Database hangs on "cursor: pin S wait on X" wait events. http://jira.ubtools.com/jira/browse/QA-39 The Database hangs. <p>The ASH report shows the activity on <em>cursor: pin S wait on X</em> wait event.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Top User Events Event Event Class % Activity Avg Active Sessions cursor: pin S wait on X Concurrency 98.90 13.74 </pre> </div></div> <p>An excerpt from SYSTEMSTATE (level 10) dump</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>PROCESS 23: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=42033 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f3271 ... ---------------------------------------- KGX Atomic Operation Log 7000000d44f4280 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4939 efd 0 whr 5 slp 44251 opr=2 pso=7000000cefa5df0 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... PROCESS 33: ... 
---------------------------------------- SO: 7000000f78e0770, type: 4, owner: 7000000fa8bbcf8, flag: INIT/-/-/0x00 (session) sid: 4976 trans: 7000000f10ce318, creator: 7000000fa8bbcf8, flag: (8100041) USR/- BSY/-/-/-/-/- DID: 0001-0021-000193D1, short-term DID: 0000-0000-00000000 txn branch: 0 oct: 3, prv: 0, sql: 7000000fc179fa8, psql: 7000000fc68f018, user: 0/SYS O/S info: user: orapaky0, term: , ospid: 5099938, machine: akmenkulp2 program: sqlplus@akmenkulp2 (TNS V1-V3) application name: sqlplus@akmenkulp2 (TNS V1-V3), hash value=0 waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=27479 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f33be ... ---------------------------------------- KGX Atomic Operation Log 7000000fc64adb8 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4976 efd 0 whr 5 slp 15527 opr=2 pso=7000000d86f7d88 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... ---------------------------------------- KGX Atomic Operation Log 7000000fc64ad80 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper EXCL Cursor Pin uid 4976 efd 0 whr 1 slp 0 opr=3 pso=7000000cee2b2c0 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... PROCESS 34: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=63882 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f348e ... 
---------------------------------------- KGX Atomic Operation Log 7000000fdd75c30 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4693 efd 0 whr 5 slp 62048 opr=2 pso=7000000e7548610 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... PROCESS 39: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=12512 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f36cf ... ---------------------------------------- KGX Atomic Operation Log 7000000fd857ad8 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4970 efd 0 whr 5 slp 12395 opr=2 pso=7000000cea1c250 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... PROCESS 47: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=48746 wait_time=0 seconds since wait started=0 idn=16a1ebe6, value=132e00000000, where|sleeps=5002494fe ... ---------------------------------------- KGX Atomic Operation Log 7000000c5ca4690 Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD Cursor Pin uid 4896 efd 0 whr 5 slp 48634 opr=2 pso=7000000e24f7e30 flg=0 pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0 ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0 hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788 ---------------------------------------- ... PROCESS 53: ... 
waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=38686 wait_time=0 seconds since wait started=0 idn=16a1ebe6, value=132e00000000, where|sleeps=500249647 ... ---------------------------------------- KGX Atomic Operation Log 7000000db1dee98 Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD Cursor Pin uid 4828 efd 0 whr 5 slp 38647 opr=2 pso=7000000e7f58940 flg=0 pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0 ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0 hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788 ---------------------------------------- ... PROCESS 54: ... (session) sid: 4910 trans: 0, creator: 7000000d81123c0, flag: (e1) USR/- BSY/-/-/-/-/- DID: 0001-002F-00044CCF, short-term DID: 0000-0000-00000000 txn branch: 0 oct: 3, prv: 0, sql: 7000000dbd12ce0, psql: 7000000dbb98da0, user: 55/SYSMAN O/S info: user: orapaky0, term: unknown, ospid: 1234, machine: akmenkulp2 program: OMS client info: akmenkulp2_Management_Service application name: OEM.DefaultPool, hash value=3997945242 action name: /database/instance/sitemap, hash value=105676648 waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=54571 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f3abc ... ---------------------------------------- KGX Atomic Operation Log 7000000dcd36f50 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4910 efd 0 whr 5 slp 53641 opr=2 pso=7000000e7c7f338 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... 
---------------------------------------- KGX Atomic Operation Log 7000000dcd36e38 Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper EXCL Cursor Pin uid 4910 efd 0 whr 1 slp 0 opr=3 pso=7000000e79b9030 flg=0 pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0 ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0 hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788 ---------------------------------------- ... PROCESS 55: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=39147 wait_time=0 seconds since wait started=0 idn=16a1ebe6, value=132e00000000, where|sleeps=5002496e9 ... ---------------------------------------- KGX Atomic Operation Log 7000000dbb57a10 Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD Cursor Pin uid 4973 efd 0 whr 5 slp 39118 opr=2 pso=7000000ce4e6fe8 flg=0 pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0 ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0 hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788 ---------------------------------------- ... PROCESS 62: ... (session) sid: 4805 trans: 0, creator: 7000000eab6d018, flag: (e1) USR/- BSY/-/-/-/-/- DID: 0001-003E-00010563, short-term DID: 0000-0000-00000000 txn branch: 0 oct: 3, prv: 0, sql: 7000000dce39e08, psql: 7000000dc0fc240, user: 72/MENKUL2008 O/S info: user: akbank, term: L1058, ospid: 3468:3428, machine: AA\L1058 program: toad.exe application name: TOAD 9.1.0.62, hash value=3156025525 waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=41562 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f3dc4 ... 
---------------------------------------- KGX Atomic Operation Log 7000000db0f6b20 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4805 efd 0 whr 5 slp 41523 opr=2 pso=7000000ce1b0070 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... ---------------------------------------- KGX Atomic Operation Log 7000000db0f6ae8 Mutex 7000000e51cde60(4805, 0) idn 7a99f649 oper EXCL Cursor Pin uid 4805 efd 0 whr 1 slp 0 opr=3 pso=7000000d8b887e8 flg=0 pcs=7000000e51cde60 nxt=0 flg=35 cld=0 hd=7000000dcd2b108 par=7000000d9991e20 ct=0 hsh=0 unp=0 unn=0 hvl=d99920f8 nhv=1 ses=7000000f88942e0 hep=7000000e51cdee0 flg=80 ld=1 ob=7000000e59bf060 ptr=7000000882ed3d8 fex=7000000882ec6e8 ---------------------------------------- ... PROCESS 65: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=35648 wait_time=0 seconds since wait started=0 idn=7a99f649, value=12c500000000, where|sleeps=500028b22 ... ---------------------------------------- KGX Atomic Operation Log 7000000fd34bbf8 Mutex 7000000e51cde60(4805, 0) idn 7a99f649 oper GET_SHRD Cursor Pin uid 4570 efd 0 whr 5 slp 35618 opr=2 pso=7000000d898d0c0 flg=0 pcs=7000000e51cde60 nxt=0 flg=35 cld=0 hd=7000000dcd2b108 par=7000000d9991e20 ct=0 hsh=0 unp=0 unn=0 hvl=d99920f8 nhv=1 ses=7000000f88942e0 hep=7000000e51cdee0 flg=80 ld=1 ob=7000000e59bf060 ptr=7000000882ed3d8 fex=7000000882ec6e8 ---------------------------------------- ... PROCESS 68: ... 
(session) sid: 4698 trans: 0, creator: 7000000f98a7320, flag: (41) USR/- BSY/-/-/-/-/- DID: 0001-0044-00000344, short-term DID: 0000-0000-00000000 txn branch: 0 oct: 3, prv: 0, sql: 7000000c5281eb8, psql: 7000000d4d7e910, user: 72/MENKUL2008 O/S info: user: geneks, term: AKYGM011, ospid: 3168:3148, machine: AKYATIRIM\AKYGM011 program: waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=44607 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f3dfe ... ---------------------------------------- KGX Atomic Operation Log 7000000dc155c48 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4698 efd 0 whr 5 slp 8722 opr=2 pso=7000000cea6fce8 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... ---------------------------------------- KGX Atomic Operation Log 7000000dc155c10 Mutex 7000000e5f9cd90(4698, 0) idn 651e7adb oper EXCL Cursor Pin uid 4698 efd 0 whr 1 slp 0 opr=3 pso=7000000cef62b68 flg=0 pcs=7000000e5f9cd90 nxt=0 flg=34 cld=1 hd=7000000dcea52a8 par=7000000e5f9d708 ct=1 hsh=0 unp=0 unn=0 hvl=e5f9d098 nhv=1 ses=7000000f782cbe0 hep=7000000e5f9ce10 flg=80 ld=1 ob=7000000e53207c8 ptr=700000099b3ef88 fex=700000099b3e298 ---------------------------------------- ... PROCESS 70: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=34939 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f3e0f ... 
---------------------------------------- KGX Atomic Operation Log 7000000fc176d20 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4784 efd 0 whr 5 slp 33075 opr=2 pso=7000000ceb24778 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... PROCESS 73: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=24155 wait_time=0 seconds since wait started=0 idn=651e7adb, value=125a00000000, where|sleeps=5000121e2 ... ---------------------------------------- KGX Atomic Operation Log 7000000dc4eba08 Mutex 7000000e5f9cd90(4698, 0) idn 651e7adb oper GET_SHRD Cursor Pin uid 4637 efd 0 whr 5 slp 8674 opr=2 pso=7000000cef8dcb0 flg=0 pcs=7000000e5f9cd90 nxt=0 flg=34 cld=1 hd=7000000dcea52a8 par=7000000e5f9d708 ct=1 hsh=0 unp=0 unn=0 hvl=e5f9d098 nhv=1 ses=7000000f782cbe0 hep=7000000e5f9ce10 flg=80 ld=1 ob=7000000e53207c8 ptr=700000099b3ef88 fex=700000099b3e298 ---------------------------------------- ... PROCESS 76: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=35688 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f3ed4 ... ---------------------------------------- KGX Atomic Operation Log 7000000fc655c80 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4806 efd 0 whr 5 slp 35673 opr=2 pso=7000000ce702c90 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... PROCESS 142: ... 
waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=48159 wait_time=0 seconds since wait started=0 idn=16a1ebe6, value=132e00000000, where|sleeps=5002499ac ... ---------------------------------------- KGX Atomic Operation Log 7000000db235510 Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD Cursor Pin uid 4592 efd 0 whr 5 slp 48111 opr=2 pso=7000000e2252830 flg=0 pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0 ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0 hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788 ---------------------------------------- ... </pre> </div></div> QA-39 Database hangs on "cursor: pin S wait on X" wait events. Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Sat, 7 Jun 2008 17:35:22 +0000 (UTC) Sat, 7 Jun 2008 19:05:44 +0000 (UTC) 0 Mutex identifier value helps to locate address of mutex. For example: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>PROCESS 23: ... waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=42033 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f3271 ... 
---------------------------------------- KGX Atomic Operation Log 7000000d44f4280 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4939 efd 0 whr 5 slp 44251 opr=2 pso=7000000cefa5df0 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- </pre> </div></div> <p>The mutex identifier <em>0xec048ba</em> at the address <em>0x7000000d92bce40</em> is requested in shared mode (<em>GET_SHRD</em>).</p> According to the SYSTEMSTATE dump, the following table shows which process is waiting for which process: <table class='confluenceTable'><tbody> <tr> <th class='confluenceTh'>Waiter Process#</th> <th class='confluenceTh'>Holder Process#</th> </tr> <tr> <td class='confluenceTd'>23</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>33</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>34</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>39</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>47</td> <td class='confluenceTd'>54</td> </tr> <tr> <td class='confluenceTd'>53</td> <td class='confluenceTd'>54</td> </tr> <tr> <td class='confluenceTd'>54</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>55</td> <td class='confluenceTd'>54</td> </tr> <tr> <td class='confluenceTd'>62</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>65</td> <td class='confluenceTd'>62</td> </tr> <tr> <td class='confluenceTd'>68</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>70</td> <td class='confluenceTd'>33</td> </tr> <tr> <td class='confluenceTd'>73</td> <td class='confluenceTd'>68</td> </tr> <tr> <td class='confluenceTd'>76</td> <td class='confluenceTd'>33</td> </tr> <tr> <td 
class='confluenceTd'>142</td> <td class='confluenceTd'>54</td> </tr> </tbody></table> <p>According to the table above, there is a deadlock and the root holder is process#33. This process waits for itself, which means it is a self-deadlock.</p> Process#33 state: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>PROCESS 33: ... ---------------------------------------- SO: 7000000f78e0770, type: 4, owner: 7000000fa8bbcf8, flag: INIT/-/-/0x00 (session) sid: 4976 trans: 7000000f10ce318, creator: 7000000fa8bbcf8, flag: (8100041) USR/- BSY/-/-/-/-/- DID: 0001-0021-000193D1, short-term DID: 0000-0000-00000000 txn branch: 0 oct: 3, prv: 0, sql: 7000000fc179fa8, psql: 7000000fc68f018, user: 0/SYS O/S info: user: orapaky0, term: , ospid: 5099938, machine: akmenkulp2 program: sqlplus@akmenkulp2 (TNS V1-V3) application name: sqlplus@akmenkulp2 (TNS V1-V3), hash value=0 waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=27479 wait_time=0 seconds since wait started=0 idn=ec048ba, value=137000000000, where|sleeps=5007f33be ... ---------------------------------------- KGX Atomic Operation Log 7000000fc64adb8 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD Cursor Pin uid 4976 efd 0 whr 5 slp 15527 opr=2 pso=7000000d86f7d88 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... 
---------------------------------------- KGX Atomic Operation Log 7000000fc64ad80 Mutex 7000000d92bce40(4976, 0) idn ec048ba oper EXCL Cursor Pin uid 4976 efd 0 whr 1 slp 0 opr=3 pso=7000000cee2b2c0 flg=0 pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0 ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770 hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418 ---------------------------------------- ... </pre> </div></div> <p>Process#33 holds mutex identifier <em>0xec048ba</em> in exclusive mode, yet it requests the same identifier in shared mode. This is a self-deadlock bug.</p> Unfortunately, there is no stack trace for process#33 to show which kernel function it was running. However, the following SQL run by process#33 helps to narrow down the candidate Oracle bugs: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> LIBRARY OBJECT HANDLE: handle=7000000fc179fa8 mtx=7000000fc17a0d8(1) cdp=1 name= select i.obj#, i.rowcnt, i.leafcnt, i.distkey, i.lblkkey, i.dblkkey,i.clufac, i.blevel , i.analyzetime, i.samplesize, decode(i.pctthres$,null,null,mod(trunc(i.pctthres$/256),256)), i.flags, ist.cachedblk, ist.cachehit, ist.logicalread from ind$ i, ind_stats$ ist where i.obj# = ist.obj#(+) and i.bo#=:1 order by i.obj# hash=25d75620e6d3487e18921ac30ec048ba timestamp=06-06-2008 23:01:28 ... </pre> </div></div> <p>It looks like a statistics-collection SQL statement.</p> Oracle Note:5907779.8: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>This problem is introduced in 10.2.0.3 A process may hang with a self deadlock typically when executing DBMS_STATS. The hung process shows itself waiting on a "cursor: pin S wait on X" waitevent waiting for an object that it has pinned itself. 
</pre> </div></div> <p>According to the note, this problem has been fixed in Oracle 10.2.0.4.</p> Operating System Product Version Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production [QA-38] DBMS_XMLPARSER.FREEPARSER doesn't release UGA memory. http://jira.ubtools.com/jira/browse/QA-38 DBMS_XMLPARSER.FREEPARSER doesn't release UGA memory. <p><b>Session memory statistics before operation:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select name,value from v$sesstat a, v$statname b 2 where a.statistic#=b.statistic# 3 and b.name like '%memory%' 4 and sid = 58 5 order by value desc; NAME VALUE ---------------------------------------------------------------- ---------- session pga memory 424336 session pga memory max 424336 session uga memory 209872 session uga memory max 209872 sorts (memory) 16 workarea memory allocated 14 6 rows selected. </pre> </div></div> <p><b>Operation:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... dbms_xmlparser.parseclob (v_parser, data_for_table); ... dbms_xmlparser.freeParser(v_parser); ... </pre> </div></div> <p><b>Session memory statistics after operation:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select name,value from v$sesstat a, v$statname b 2 where a.statistic#=b.statistic# 3 and b.name like '%memory%' 4 and sid = 58 5 order by value desc; NAME VALUE ---------------------------------------------------------------- ---------- session pga memory 52396928 session pga memory max 52396928 session uga memory 51816784 session uga memory max 51816784 sorts (memory) 19 workarea memory allocated 14 6 rows selected. 
</pre> </div></div> <p><b>An excerpt from HEAPDUMP LEVEL 4 (UGA) dump:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... EXTENT 788 addr=ffffffff7ce90080 Chunk ffffffff7ce90090 sz= 392 free " " Chunk ffffffff7ce90218 sz= 184 freeable "kgiobdtb " Chunk ffffffff7ce902d0 sz= 1112 recreate "koh-kghu sessi " latch=0 ds ffffffff7ce9db50 sz= 1112 ct= 1 Chunk ffffffff7ce90728 sz= 2136 freeable "PLS non-lib hp " ds=ffffffff7cf6abd8 Chunk ffffffff7ce90f80 sz= 4288 freeable "qmxdpls_subhea " ds=ffffffff7ce96b78 Chunk ffffffff7ce92040 sz= 4288 freeable "qmxdpls_subhea " ds=ffffffff7ce96b78 Chunk ffffffff7ce93100 sz= 4288 freeable "qmxdpls_subhea " ds=ffffffff7ce96b78 Chunk ffffffff7ce941c0 sz= 4288 freeable "qmxdpls_subhea " ds=ffffffff7ce96b78 Chunk ffffffff7ce95280 sz= 4328 freeable "qmxdpls_subhea " ds=ffffffff7ce96b78 Chunk ffffffff7ce96368 sz= 48 freeable "allocator state" Chunk ffffffff7ce96398 sz= 72 freeable "persistant defi" Chunk ffffffff7ce963e0 sz= 48 freeable "kgbt " Chunk ffffffff7ce96410 sz= 48 freeable "frame segment " Chunk ffffffff7ce96440 sz= 64 freeable "qmxdpls_init_ug" Chunk ffffffff7ce96480 sz= 48 freeable "frame segment " Chunk ffffffff7ce964b0 sz= 72 freeable "frame segment " Chunk ffffffff7ce964f8 sz= 72 freeable "kxsxsi: frame " Chunk ffffffff7ce96540 sz= 1568 recreate "qmxdpls_subhea " latch=0 ds ffffffff7ce96b78 sz= 50681480 ct= 11820 ffffffff779d6940 sz= 4288 ffffffff779d7a00 sz= 4288 ffffffff779d8ac0 sz= 4288 ffffffff779d9b80 sz= 4288 ffffffff779dac40 sz= 4288 ffffffff779dbd00 sz= 4288 ffffffff779dcdc0 sz= 4288 ffffffff779dde80 sz= 4288 ffffffff779def40 sz= 4288 ffffffff779c04c0 sz= 4288 ffffffff779c1580 sz= 4288 ffffffff779c2640 sz= 4288 ffffffff779c3700 sz= 4288 ffffffff779c47c0 sz= 4288 ffffffff779c5880 sz= 4288 ffffffff779c6940 sz= 4288 ffffffff779c7a00 sz= 4288 ... 
ffffffff7ce93100 sz= 4288 ffffffff7ce941c0 sz= 4288 ffffffff7ce95280 sz= 4328 Chunk ffffffff7ce96b60 sz= 160 freeable "qmxdpls_heapptr" Chunk ffffffff7ce96c00 sz= 232 freeable "lob ctl struct " Chunk ffffffff7ce96ce8 sz= 80 freeable "frame " Chunk ffffffff7ce96d38 sz= 40 freeable "private oac inf" Chunk ffffffff7ce96d60 sz= 128 freeable "bnrdef and uac " Chunk ffffffff7ce96de0 sz= 600 recreate "bind var heap " latch=0 ds ffffffff7ce971f0 sz= 600 ct= 1 Chunk ffffffff7ce97038 sz= 928 freeable "kgiob " Chunk ffffffff7ce973d8 sz= 4160 freeable "koh-kghu sessi " ds=ffffffff7cf65710 Chunk ffffffff7ce98418 sz= 8192 freeable "kdit " Chunk ffffffff7ce9a418 sz= 40 free " " Chunk ffffffff7ce9a440 sz= 8192 freeable "kdit " Chunk ffffffff7ce9c440 sz= 48 freeable "ktatt " Chunk ffffffff7ce9c470 sz= 48 freeable "kdit " Chunk ffffffff7ce9c4a0 sz= 80 freeable "kgicu " Chunk ffffffff7ce9c4f0 sz= 5672 free " " Chunk ffffffff7ce9db18 sz= 2520 freeable "koh-kghu sessio" Chunk ffffffff7ce9e4f0 sz= 48 freeable "frame segment " Chunk ffffffff7ce9e520 sz= 40 freeable "frame segment " Chunk ffffffff7ce9e548 sz= 72 freeable "kxsxsi: frame " Chunk ffffffff7ce9e590 sz= 2464 perm "perm " alo=432 Chunk ffffffff7ce9ef30 sz= 48 freeable "allocator state" Chunk ffffffff7ce9ef60 sz= 80 freeable "frame " Chunk ffffffff7ce9efb0 sz= 128 freeable "bnrdef and uac " Chunk ffffffff7ce9f030 sz= 600 recreate "bind var heap " latch=0 ds ffffffff7ce9f440 sz= 600 ct= 1 Chunk ffffffff7ce9f288 sz= 928 freeable "kgiob " Chunk ffffffff7ce9f628 sz= 2520 freeable "koh-kghu sessio" EXTENT 789 addr=ffffffff7ce30080 Chunk ffffffff7ce30090 sz= 2016 perm "perm " alo=2016 ... Total heap size = 51790440 FREE LISTS: Bucket 0 size=56 ... 
Bucket 16 size=524312 Bucket 17 size=2097176 Total free space = 870336 UNPINNED RECREATABLE CHUNKS (lru first): PERMANENT CHUNKS: Chunk ffffffff7ce9e590 sz= 2464 perm "perm " alo=432 Chunk ffffffff7ce30090 sz= 2016 perm "perm " alo=2016 Chunk ffffffff7cf70090 sz= 288 perm "perm " alo=288 Chunk ffffffff7cf600a8 sz= 20320 perm "perm " alo=20320 Permanent space = 25088 ****************************************************** </pre> </div></div> <p>DBMS_SESSION.FREE_UNUSED_USER_MEMORY did not help.</p> QA-38 DBMS_XMLPARSER.FREEPARSER doesn't release UGA memory. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Fri, 6 Jun 2008 12:58:12 +0000 (UTC) Fri, 6 Jun 2008 13:35:03 +0000 (UTC) 0 The UGA (within the PGA) had been filled by a large recreatable chunk, "qmxdpls_subhea". This chunk accounts for 50681480 bytes. (See <a href="http://jira.ubtools.com/jira/browse/QA-8" title="Heapdump Interpretation"><del>QA-8</del></a> for basic HEAPDUMP definitions.) <p>Oracle Note:3518909.8:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Calling Dbms_xmlparser.freeParser / dbms_xmldom.freeDocument in the procudure do not appear to free the memory. The leaked memory shows in heapdumps as "qmxdpls_subheap" </pre> </div></div> <p>Although the mentioned bug was fixed in Oracle 9.2.0.6, the customer encounters the same problem in Oracle 9.2.0.8.</p> <p>However, on the next use of DBMS_XMLPARSER.PARSECLOB after a previous DBMS_XMLPARSER.FREEPARSER within the same session, the UGA did not grow further. This is acceptable to the customer.</p> Operating System Product Version Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production [QA-37] "ORA-01187: cannot read from file" in one of the RAC Node. 
http://jira.ubtools.com/jira/browse/QA-37 One of the RAC nodes encounters the following errors, while no problem occurs on the other node: <p>From ALERT LOG:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Generic Alert Log Error May 2, 2008 11:21:30 PM ORA-12012: error on auto execute of job 8913 ORA-01187: cannot read from file ORA-01187: cannot read from file 96 because it failed verification tests ORA-01110: data file 96: '/u64/oradata/DMSDB/LBPRD_IDX_SKU_005DMSDB.dbf' ORA-06512: at "SYS.PRVT_ADVISOR", line 1624 ORA-06512: at "SYS.DBMS_ADVISOR", line 186 ORA-06512: at "SYS.DBMS_SPACE", line 1347 ORA-06512: at "SYS.DBMS_SPACE", line 1566 because it failed verification tests Trace File: /u00/app/oracle/oracle/admin/DMSDB/bdump/dmsdb2_j000_21660.trc </pre> </div></div> <p>From dmsdb2_j000_21660.trc:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>*** 2008-04-07 23:03:34.750 GATHER_STATS_JOB: GATHER_TABLE_STATS('"LOADOWNER"','"MARKEDPRODUCT"','"MARKEDPRODUCT_20080406"', ...) ORA-01187: cannot read from file 88 because it failed verification tests ORA-01110: data file 88: '/u64/oradata/DMSDB/MRK_IDX_004DMSDB.dbf' </pre> </div></div> QA-37 "ORA-01187: cannot read from file" in one of the RAC Node. Oracle - Administration Major Closed Answered ubTools Support ubTools Support Thu, 8 May 2008 12:27:12 +0000 (UTC) Mon, 12 May 2008 12:37:41 +0000 (UTC) 0 According to the trace file, the problem can be reproduced with DBMS_STATS.GATHER_TABLE_STATS(). 
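<p>The analysis of this issue hinges on whether each data file is opened with the O_DIRECT flag, so it can help to scan the strace log mechanically instead of reading it by eye. The following is only a minimal sketch: it assumes the usual strace line shape <em>pid open("path", FLAGS) = fd</em> with each call on a single unwrapped line, and the function name <em>opened_without_o_direct</em> is hypothetical, not part of any Oracle or strace tooling.</p>

```python
import re

# Assumed strace line shape (one call per line, not wrapped):
#   16822 open("/path/file.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 14
# An optional mode argument after the flags is tolerated by [^)]*.
OPEN_RE = re.compile(r'open\("([^"]+)",\s*([A-Z_|]+)[^)]*\)\s*=\s*(-?\d+)')


def opened_without_o_direct(lines, suffix=".dbf"):
    """Return paths ending in `suffix` that were opened successfully
    without O_DIRECT, in first-seen order and without duplicates."""
    missing = []
    for line in lines:
        match = OPEN_RE.search(line)
        if match is None:
            continue  # not an open() call
        path = match.group(1)
        flags = match.group(2).split("|")
        fd = int(match.group(3))
        if fd == -1:
            continue  # the open() call itself failed
        if not path.endswith(suffix):
            continue  # only data files are of interest here
        if "O_DIRECT" not in flags and path not in missing:
            missing.append(path)
    return missing
```

<p>Called as <em>opened_without_o_direct(open("strace.log"))</em>, it returns the data files that bypassed O_DIRECT. A file that appears both with and without the flag is still reported, because a single buffered open is enough to make it suspect.</p>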
<p>The following strace command captures the system calls:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>strace -f -o strace.log sqlplus / as sysdba &lt;&lt;EOF exec DBMS_STATS.GATHER_TABLE_STATS('&lt;owner&gt;','&lt;tableName&gt;','&lt;partitionName&gt;',1,DEGREE=&gt;2); exit; EOF </pre> </div></div> An excerpt from strace.log showing db files opened with the O_DIRECT flag: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>16822 open("/u51/oradata/DMSDB/system001DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 14 ... 16822 open("/u31/oradata/DMSDB/ctl1DMSDB.ctl", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 15 16822 open("/u32/oradata/DMSDB/ctl2DMSDB.ctl", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 16 16822 open("/u33/oradata/DMSDB/ctl3DMSDB.ctl", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 17 16822 open("/u52/oradata/DMSDB/undotbs002DMSDB2.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 18 ... 16822 open("/u70/oradata/DMSDB/TEMPDMSDB_002.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 20 16822 open("/u70/oradata/DMSDB/TEMPDMSDB_002.dbf", O_RDWR|O_DIRECT|O_LARGEFILE) = 21 ... 16822 open("/u52/oradata/DMSDB/sysaux001DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 28 16822 open("/u51/oradata/DMSDB/system002DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 29 16822 open("/u55/oradata/DMSDB/undotbs001DMSDB1.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 30 16822 open("/u61/oradata/DMSDB/MRK_IDX_001DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFILE) = 31 ... 
</pre> </div></div> <p>A second excerpt from strace.log showing db files opened without the O_DIRECT flag:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>16822 open("/u65/oradata/DMSDB/dor_data_200805_001_DMSDB.dbf", O_RDWR|O_SYNC|O_LARGEFILE) = 19 16822 open("/u65/oradata/DMSDB/dor_data_200805_002_DMSDB.dbf", O_RDWR|O_SYNC|O_LARGEFILE) = 27 ... 16822 open("/u64/oradata/DMSDB/MRK_IDX_004DMSDB.dbf", O_RDWR|O_SYNC|O_LARGEFILE) = 32 </pre> </div></div> The customer uses OCFS2. <p>The O_DIRECT flag of the open() system call bypasses the file system (FS) cache, so disk I/O occurs directly between the user address space and the disk.</p> <p>OCFS opens db files with the O_DIRECT flag to eliminate inconsistency among the FS caches of the nodes. Since RAC provides consistency among SGAs and no db buffers are kept in the FS cache, no consistency problem occurs.</p> <p>From Ref: Oracle Note:391771.1:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>48. Any special flags to run Oracle RAC? OCFS2 volumes containing the Voting diskfile (CRS), Cluster registry (OCR), Data files, Redo logs, Archive logs and Control files must be mounted with the datavolume and nointr mount options. The datavolume option ensures that the Oracle processes opens these files with the o_direct flag. The nointr option ensures that the ios are not interrupted by signals. # mount -o datavolume,nointr -t ocfs2 /dev/sda1 /u01/db </pre> </div></div> <p>The customer was not using the <em>datavolume,nointr</em> mount options. After mounting with <em>datavolume,nointr</em>, the problem was solved.</p> Operating System Operating System Version RHEL 4 Product Version Oracle 10g 10.2.0.3 [QA-36] Who is the inventor of Response Time Analysis(RTA) in Oracle ? 
http://jira.ubtools.com/jira/browse/QA-36 This issue moved to <span class="nobr"><a href="http://www.ubtools.com/web/public/resources/logs/rta_inventor">http://www.ubtools.com/web/public/resources/logs/rta_inventor<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span>. QA-36 Who is the inventor of Response Time Analysis(RTA) in Oracle ? Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Tue, 8 Apr 2008 09:14:55 +0000 (UTC) Thu, 1 May 2008 06:43:55 +0000 (UTC) 0 Operating System Product Version X [QA-35] ORA-00600 [kturrur11], [65535], [0]: Instance crashed. http://jira.ubtools.com/jira/browse/QA-35 The instance crashes with the following error code: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [kturrur11], [65535], [0], [], [], [], [], [] </pre> </div></div> <p><b>Stack trace:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ksedst+001c bl ksedst1 FFFFFFFFFFFA3B0 ? 000000000 ? ksedmp+0290 bl ksedst 1047C9C10 ? ksfdmp+0018 bl 03F53584 kgerinv+00dc bl _ptrgl kgeasnmierr+0040 bl kgerinv 0FFFFFFFF ? 0000000BA ? FFFFFFFFFFFAD18 ? FFFFFFFFFFFAD20 ? FFFFFFFFFFFAC08 ? kturgmbu+02b8 bl kgeasnmierr 000085188 ? 000000000 ? 000550021 ? 200000002 ? 000000000 ? 00000FFFF ? 000000000 ? 000000000 ? kturrur+01c8 bl kturgmbu 1001AEFA8 ? 70000020F745AC0 ? 0000F4240 ? 104BF4B00 ? 000085188 ? FFFFFFFFFFFA950 ? 70000020A3BC630 ? 1102B0D58 ? ktundo+016c bl kturrur 1102B0D58 ? 000000000 ? 100000000 ? FFFFFFFFFFFA9D0 ? FFFFFF00000003 ? 15453015F ? 000000000 ? 000000000 ? 
ktubko+0794 bl ktundo 1FFFFBC80 ? 5B16B1B30F745928 ? 1001DD940 ? 70000020A3C8EA0 ? 000000528 ? FFFFFFFFFFFBE08 ? 400000000 ? FFFFFFFFFFFBF08 ? kturrt+15fc bl ktubko 70000020A3C52F0 ? 600000000 ? 000000000 ? 0E8E65D60 ? 3B008CEC89 ? 55002100000000 ? kturec+0dcc bl kturrt FFFFFFFFFFFC528 ? 21000000000000 ? 1FFFFC5E0 ? 000000000 ? 0000009A0 ? 40A288838 ? 110021A88 ? kturax+0300 bl kturec 5522880400 ? 000000000 ? 19E370001 ? 000000001 ? FFFFFFFFFFFFCAC0 ? 11FFFFFEFF ? 400000000 ? ktprbeg+02b0 bl kturax 10FDB1B2B0 ? 004AD8530 ? ktmmon+0ebc bl ktprbeg 080000000 ? ktmSmonMain+0030 bl ktmmon 000000000 ? ksbrdp+03e0 bl _ptrgl opirip+03fc bl 01FC66A0 opidrv+0448 bl opirip 1103BD070 ? 4103BE990 ? FFFFFFFFFFFF860 ? sou2o+0090 bl opidrv 32023373FC ? 400000020 ? FFFFFFFFFFFF860 ? opimai_real+0150 bl 01FC0DF4 main+0098 bl opimai_real 000000000 ? 000000000 ? __start+0090 bl main 000000000 ? 000000000 ? </pre> </div></div> QA-35 ORA-00600 [kturrur11], [65535], [0]: Instance crashed. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Thu, 8 Nov 2007 08:36:27 +0000 (UTC) Thu, 8 Nov 2007 12:29:43 +0000 (UTC) 0 <b>From Oracle Note:4940513.8:</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Bug 4940513 OERI[kturrur11] can occur with multi block undo This note gives a brief overview of bug 4940513. 
Affects: Product (Component) Oracle Server (Rdbms) Range of versions believed to be affected Versions &lt; 11 Versions confirmed as being affected * 9.2.0.6 * 9.2.0.7 * 10.1.0.4 * 10.1.0.5 * 10.2.0.2 Platforms affected Generic (all / most platforms affected) Fixed: This issue is fixed in * 9.2.0.8 (Server Patch Set) * 10.2.0.3 (Server Patch Set) * 11g (Future version) Symptoms: Related To: * Internal Error May Occur (ORA-600) * ORA-600 [kturrur11] * (None Specified) Description In rare situations the server could raise ORA-600 [kturrur11][65535][0] Workaround: Avoid the multi block undo code path by making sure that the block size in the undo tablespace is large enough to accomodate the largest column that is changed by any SQL statement. If the block size of the data tablespaces is larger than the block size of the undo tablespace, increase the blocksize of the undo tablespace to that of the data tablespace. </pre> </div></div> <b>Workaround:</b> <p>Drop segments which need recovery.</p> <p><b>Finding the UNDO segment from ALERT LOG:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Errors in file /product/10g/admin/DTWP/bdump/dtwp1_smon_7024648.trc: ORA-00600: internal error code, arguments: [kturrur11], [65535], [0], [], [], [], [], [] replication_dependency_tracking turned off (no async multimaster replication found) Sat Nov 3 13:19:38 2007 ORACLE Instance DTWP1 (pid = 15) - Error 600 encountered while recovering transaction (85, 33). 
Sat Nov 3 13:19:38 2007 Errors in file /product/10g/admin/DTWP/bdump/dtwp1_smon_7024648.trc: ORA-00600: internal error code, arguments: [kturrur11], [65535], [0], [], [], [], [], [] </pre> </div></div> <p>SMON is trying to roll back a transaction in (UNDOSEGMENT#85, UNDOSLOT#33).</p> <p><b>Identifying the UNDO segment:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>select segment_name,owner,tablespace_name from dba_rollback_segs where segment_id=85; SEGMENT_NAME OWNER TABLESPACE_NAME ------------------------------ ------ ------------------------------ _SYSSMU85$ PUBLIC UNDOTBS1 </pre> </div></div> <p><b>Undo block in the SMON trace:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>******************************************************************************** UNDO BLK: xid: 0x0055.021.00085188 seq: 0xffff cnt: 0x1 irb: 0x1 icl: 0x0 flg: 0x0000 Rec Offset Rec Offset Rec Offset Rec Offset Rec Offset --------------------------------------------------------------------------- 0x01 0x0018 *----------------------------- * Rec #0x1 slt: 0x21 objn: 125213(0x0001e91d) objd: 564547 tblspc: 10(0x0000000a) * Layer: 5 (Transaction Undo) opc: 1 rci 0x00 Undo type: Multi-block undo Mid-piece Last buffer split: Yes Temp Object: No Tablespace Undo: No rdba: 0x5b16b1af *----------------------------- </pre> </div></div> <p>Transaction ID: xid: 0x0055.021.00085188</p> <p>Hexadecimal 55 = Decimal 85<br/> Hexadecimal 21 = Decimal 33</p> <p>That means this is the UNDO block that SMON reads to roll back the transaction.</p> <p><b>irb</b> points to the last UNDO RECORD in the UNDO block. <b>rci</b> points to the previous UNDO RECORD. If rci=0, it is the first UNDO RECORD. The recovery operation starts from irb and follows the rci chain until rci is zero.</p> <p>In this case, the UNDO block includes just one UNDO RECORD. 
This UNDO RECORD includes UNDO DATA for object#125213.</p> <p><b>Object needs recovery:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>select owner,object_name,object_type from dba_objects where object_id=125213 Owner : OWBRUN Object_name : sm_post_ind2 Object_type INDEX </pre> </div></div> <p>The index was dropped, but the problem did not disappear. It was then decided to drop this UNDO segment after identifying all objects in it.</p> <p><b>Reading Transaction Table in the UNDO header:</b></p> <p>ALTER SYSTEM DUMP UNDO HEADER '_SYSSMU85$';</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... ******************************************************************************** Undo Segment: _SYSSMU85$ (85) ******************************************************************************** ... TRN TBL:: index state cflags wrap# uel scn dba parent-xid nub stmt_num cmt ------------------------------------------------------------------------------------------------ 0x00 9 0x00 0x85435 0xffff 0x0847.4024f907 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x01 9 0x00 0x84b3c 0x0004 0x0847.4024f89b 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x02 9 0x00 0x85237 0x0006 0x0847.4024f895 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x03 9 0x00 0x85406 0x0011 0x0847.4024f877 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x04 9 0x00 0x851d9 0x000a 0x0847.4024f89e 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x05 9 0x00 0x85234 0x002f 0x0847.4024f881 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x06 9 0x00 0x8543f 0x002b 0x0847.4024f897 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x07 9 0x00 0x850ce 0x002e 0x0847.4024f8ac 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x08 9 0x00 0x853f3 0x001a 0x0847.4024f88a 0x00000000 
0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x09 9 0x00 0x85188 0x001f 0x0847.4024f87d 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x0a 9 0x00 0x84f75 0x0014 0x0847.4024f8a0 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x0b 9 0x00 0x832f2 0x0007 0x0847.4024f8aa 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x0c 9 0x00 0x85313 0x001e 0x0847.4024f8d2 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x0d 9 0x00 0x85320 0x000c 0x0847.4024f8d0 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x0e 9 0x00 0x849fb 0x0012 0x0847.4024f890 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x0f 9 0x00 0x8530c 0x000e 0x0847.4024f88e 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x10 9 0x00 0x84ac9 0x0015 0x0847.4024f8de 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x11 9 0x00 0x854f4 0x0009 0x0847.4024f87a 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x12 9 0x00 0x84ce9 0x0002 0x0847.4024f892 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x13 9 0x00 0x85220 0x001b 0x0847.4024f8c4 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x14 9 0x00 0x85119 0x001d 0x0847.4024f8a2 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x15 9 0x00 0x8540c 0x0025 0x0847.4024f8e0 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x16 9 0x00 0x85177 0x0017 0x0847.4024f8ea 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x17 9 0x00 0x84f02 0x002a 0x0847.4024f8ec 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x18 9 0x00 0x84e2d 0x0027 0x0847.4024f8f7 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x19 9 0x00 0x8537a 0x0020 0x0847.4024f8a6 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x1a 9 0x00 0x8530b 0x000f 0x0847.4024f88c 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x1b 9 0x00 0x841bc 
0x0029 0x0847.4024f8c6 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x1c 9 0x00 0x852a9 0x002d 0x0847.4024f8fb 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x1d 9 0x00 0x84d24 0x0019 0x0847.4024f8a4 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x1e 9 0x00 0x85419 0x0010 0x0847.4024f8d3 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x1f 9 0x00 0x84ea2 0x0005 0x0847.4024f87f 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x20 9 0x00 0x853a5 0x000b 0x0847.4024f8a8 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x21 10 0x10 0x85188 0x0919 0x0847.4024f8fd 0x5b16b1b3 0x0000.000.00000000 0x00000002 0x00000000 0 0x22 9 0x00 0x85279 0x002c 0x0847.4024f886 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x23 9 0x00 0x847b0 0x0028 0x0847.4024f8bb 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x24 9 0x00 0x851cf 0x0023 0x0847.4024f8b9 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x25 9 0x00 0x84a9c 0x0016 0x0847.4024f8e1 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x26 10 0x90 0x7539b 0x0003 0x0847.3fcdcb21 0x4f81d3d2 0x0000.000.00000000 0x0000dbd9 0x00000000 0 0x27 9 0x00 0x850ac 0x001c 0x0847.4024f8f9 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x28 9 0x00 0x8531b 0x0013 0x0847.4024f8bc 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x29 9 0x00 0x854f0 0x000d 0x0847.4024f8c7 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x2a 9 0x00 0x85301 0x0018 0x0847.4024f8ed 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x2b 9 0x00 0x83c38 0x0001 0x0847.4024f899 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x2c 9 0x00 0x85051 0x0008 0x0847.4024f888 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x2d 9 0x00 0x84a3c 0x0000 0x0847.4024f904 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 
0x2e 9 0x00 0x84f35 0x0024 0x0847.4024f8ae 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 0x2f 9 0x00 0x85100 0x0022 0x0847.4024f884 0x00000000 0x0000.000.00000000 0x00000000 0x00000000 1194066440 </pre> </div></div> <p><b>State#10</b> means an active transaction. <b>dba</b> points to the starting UNDO block address.</p> <p>There are 2 active transactions. One of them is in slot 0x21, the same slot seen in the SMON trace that raises this ORA-600 <span class="error">&#91;kturrur11&#93;</span> error. The other active transaction is in slot 0x26, which has a dba of 0x4f81d3d2.</p> <p>The object in slot 0x21 was found above, but the object in slot 0x26 is not known yet.</p> <p><b>Object needs recovery:</b></p> <blockquote> <p>Hexadecimal 4f81d3d2 = Decimal 1333908434</p> <p>select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(1333908434) from x$dual;<br/> 318</p> <p>select DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(1333908434) from x$dual;<br/> 119762</p> <p>alter system dump datafile 318 block 119762;</p></blockquote> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>... *** SESSION ID:(489.37) 2007-11-03 16:23:56.878 Start dump data blocks tsn: 1 file#: 318 minblk 119762 maxblk 119762 buffer tsn: 1 rdba: 0x4f81d3d2 (318/119762) ... 
UNDO BLK: xid: 0x0055.026.0007539b seq: 0xf72d cnt: 0x5b irb: 0x1 icl: 0x0 flg: 0x0000 Rec Offset Rec Offset Rec Offset Rec Offset Rec Offset --------------------------------------------------------------------------- 0x01 0x1f8c 0x02 0x1f30 0x03 0x1ed4 0x04 0x1e78 0x05 0x1e1c 0x06 0x1dc0 0x07 0x1d64 0x08 0x1d08 0x09 0x1cac 0x0a 0x1c50 0x0b 0x1bf4 0x0c 0x1b9c 0x0d 0x1b48 0x0e 0x1af0 0x0f 0x1a98 0x10 0x1a40 0x11 0x19e8 0x12 0x1990 0x13 0x193c 0x14 0x18e4 0x15 0x1890 0x16 0x1838 0x17 0x17e0 0x18 0x178c 0x19 0x1738 0x1a 0x16e0 0x1b 0x1688 0x1c 0x1634 0x1d 0x15dc 0x1e 0x1584 0x1f 0x152c 0x20 0x14d4 0x21 0x147c 0x22 0x1424 0x23 0x13cc 0x24 0x1374 0x25 0x131c 0x26 0x12c4 0x27 0x126c 0x28 0x1214 0x29 0x11bc 0x2a 0x1168 0x2b 0x1110 0x2c 0x10b8 0x2d 0x1064 0x2e 0x1010 0x2f 0x0fbc 0x30 0x0f64 0x31 0x0f0c 0x32 0x0eb8 0x33 0x0e60 0x34 0x0e0c 0x35 0x0db8 0x36 0x0d64 0x37 0x0d0c 0x38 0x0cb4 0x39 0x0c60 0x3a 0x0c08 0x3b 0x0bb0 0x3c 0x0b58 0x3d 0x0b04 0x3e 0x0aac 0x3f 0x0a58 0x40 0x0a04 0x41 0x09ac 0x42 0x0954 0x43 0x08fc 0x44 0x08a4 0x45 0x0850 0x46 0x07f8 0x47 0x07a0 0x48 0x074c 0x49 0x06f4 0x4a 0x06a0 0x4b 0x0648 0x4c 0x05f0 0x4d 0x0598 0x4e 0x0540 0x4f 0x04e8 0x50 0x0490 0x51 0x0438 0x52 0x03e0 0x53 0x0388 0x54 0x0334 0x55 0x02dc 0x56 0x0284 0x57 0x022c 0x58 0x01d4 0x59 0x0180 0x5a 0x012c 0x5b 0x00d4 *----------------------------- </pre> </div></div> <p>irb points to the UNDO RECORD of 0x1.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>*----------------------------- * Rec #0x1 slt: 0x26 objn: 125212(0x0001e91c) objd: 564548 tblspc: 10(0x0000000a) * Layer: 10 (Index) opc: 22 rci 0x00 Undo type: Regular undo User Undo Applied Last buffer split: No Temp Object: No Tablespace Undo: No rdba: 0x4f81d3d1 *----------------------------- ... </pre> </div></div> <p>rci of UNDO RECORD of 0x1 is 0x00. 
That means this is the first and last UNDO RECORD.</p> <p>Object ID in this UNDO RECORD is 125212.</p> <p>SQL&gt; select owner,object_name,object_type from dba_objects where object_id in (125213,125212);</p> <p>OWNER<br/> ------------------------------<br/> OBJECT_NAME<br/> -------------------------------------------------------------------------------- <br/> OBJECT_TYPE<br/> -------------------<br/> OWBRUN<br/> SM_POST_IND1<br/> INDEX</p> <p>Luckily, this object is also an INDEX. The index was dropped. After making sure that there were no new active transactions in this UNDO segment, the following steps were performed:</p> <ul class="alternate" type="square"> <li>Shut down the database</li> <li>Set the following parameters in the PFILE/SPFILE:</li> </ul> <p> _smu_debug_mode=4<br/> _offline_rollback_segments=(_SYSSMU85$)</p> <ul class="alternate" type="square"> <li>Start up the database</li> <li>drop rollback segment "_SYSSMU85$";</li> </ul> <p>After the UNDO segment is successfully dropped, the internal parameters above should be removed. In our case, however, although the original internal error (ORA-600 <span class="error">&#91;kturrur11&#93;</span>) disappeared while dropping the UNDO segment, another internal error (ORA-600 <span class="error">&#91;kddummy_blkchk&#93;</span>) was encountered. It is tracked as a separate issue, <a href="http://jira.ubtools.com/jira/browse/QA-34" title="ORA-00600 [kddummy_blkchk] while dropping UNDO segment."><del>QA-34</del></a>.</p> <p>Since all objects needing recovery in the UNDO segment were dropped, there is no need to re-create the database after using the <em>_offline_rollback_segments</em> parameter.</p> Operating System Product Version Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production [QA-34] ORA-00600 [kddummy_blkchk] while dropping UNDO segment. 
http://jira.ubtools.com/jira/browse/QA-34 While dropping an offlined UNDO segment <em>(by _offline_rollback_segments)</em>, the following error appeared: <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; drop rollback segment "_SYSSMU85$"; &gt; &gt; drop rollback segment "_SYSSMU85$" &gt; * &gt; ERROR at line 1: &gt; ORA-00607: Internal error occurred while making a change to a data block &gt; ORA-00600: internal error code, arguments: [kddummy_blkchk], [2], [846985], &gt; [38508], [], [], [], [] </pre> </div></div> <p>Then, the instance crashed. After re-starting the instance, it crashed again.</p> <p><b>Stack trace:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [kddummy_blkchk], [2], [846985], [38508], [], [], [], [] ----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ksedst+001c bl ksedst1 FFFFFFFFFFF9D10 ? 000000000 ? ksedmp+0290 bl ksedst 1047C9C10 ? ksfdmp+0018 bl 03F53584 kgerinv+00dc bl _ptrgl kseinpre+0040 bl kgerinv 110040AA0 ? 000000000 ? 1048470A0 ? 07FFFFFFF ? 000000000 ? ksesin+0048 bl kseinpre 1048470A0 ? 07FFFFFFF ? 000000000 ? kco_blkchk+0778 bl ksesin 10484752C ? 300000003 ? 000000000 ? 000000002 ? 000000000 ? 0000CEC89 ? 000000000 ? 00000966C ? kcoapl+0d24 bl kco_blkchk FFF00FFFFFFA310 ? 284422800B4E4358 ? 102FD4FDC ? 7000001F5151F50 ? 000000080 ? kcbapl+0178 bl kcoapl FFFFFFFFFFFC218 ? 7000001E815A000 ? 100000001 ? 7FFFFFFF000000F7 ? 200000000000 ? 20BD260C8 ? 000000000 ? kcrfw_redo_gen+2964 bl kcbapl 000000000 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ? kcbchg1_main+25e0 bl kcrfw_redo_gen 102DA5AC7 ? 2D30491F0A2889F8 ? FFFFFFFFFFFAB20 ? 700000010008000 ? 1000024A4 ? 000000001 ? 400000000000001 ? 000000000 ? 
kcbchg1+038c bl kcbchg1_main 000000000 ? 0000001F4 ? 000000000 ? 110366678 ? 0000023A0 ? 70000020B29AFD8 ? ktbchgro+0380 bl kcbchg1 00A288AD0 ? 30A2889FA ? FFFFFFFFFFFB620 ? FFFFFFFFFFFB658 ? 000000000 ? 000000000 ? ktfbapp+0044 bl ktbchgro 000000000 ? 300000003 ? FFFFFFFFFFFCB48 ? FFFFFFFFFFFC218 ? FFFFFFFFFFFBFD8 ? FFFFFFFFFFFC0B0 ? FFFFFFFFFFFC4D8 ? FFFFFFFFFFFC5B0 ? kteopgen+00ec bl ktfbapp 000000000 ? FFFFFFFFFFFC218 ? 044244040 ? 0FFFFFFFF ? FFFFFFFFFFFBFD8 ? kteopdelete+1468 bl kteopgen FFFFFFFFFFFCB48 ? 000000000 ? FFFFFFFFFFFBFD8 ? FFFFFFFFFFFC140 ? FFFFFFFFFFFC218 ? FFFFFFFFFFFC0B0 ? 000000000 ? 1101EBDCC ? ktsxfastdele+0118 bl kteopdelete 700000209E9B238 ? 100000001 ? 100B5A770 ? 000000000 ? FFFFFFFFFFFC270 ? 000000000 ? 000000000 ? kteopshrink+0308 bl 01FC21A0 ktssdrbm_segment+0a bl kteopshrink 100000001 ? FFFFFFFFFFFCAD8 ? f8 000000001 ? 000000001 ? 0000001A0 ? 700000200C016A0 ? 000000000 ? ktssdro_segment+06c bl ktssdrbm_segment FFFFFFFFFFFD498 ? 8 FFFFFFFFFFFD560 ? 100008043 ? 1FFFFFFFF ? ktssdt_segs+0350 bl ktssdro_segment 70000020A3C52F0 ? 600007530 ? 0001DCE78 ? ktmmon+1048 bl ktssdt_segs 000000000 ? 7FFFFFFF7FFFFFFF ? 7FFFFFFF7FFFFFFF ? 000000000 ? 000000000 ? 000000000 ? 7FFFFFFC7FFFFFFC ? 0472CD5BE ? ktmSmonMain+0030 bl ktmmon 000000000 ? ksbrdp+03e0 bl _ptrgl opirip+03fc bl 01FC66A0 opidrv+0448 bl opirip 1103BD070 ? 4103BE990 ? FFFFFFFFFFFF860 ? sou2o+0090 bl opidrv 32023373FC ? 400000020 ? FFFFFFFFFFFF860 ? opimai_real+0150 bl 01FC0DF4 main+0098 bl opimai_real 000000000 ? 000000000 ? __start+0090 bl main 000000000 ? 000000000 ? 
</pre> </div></div> <p><b>UNDO segment status:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SQL&gt; select status$ from undo$ where us#=85; STATUS$ ---------- 1 </pre> </div></div> <p><b>UNDO$ structure from $ORACLE_HOME/rdbms/admin/sql.bsq:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>create table undo$ /* undo segment table */ ( us# number not null, /* undo segment number */ name varchar2("M_IDEN") not null, /* name of this undo segment */ user# number not null, /* owner: 0 = SYS(PRIVATE), 1 = PUBLIC */ file# number not null, /* segment header file number */ block# number not null, /* segment header block number */ scnbas number, /* highest commit time in rollback segment */ scnwrp number, /* scnbas - scn base, scnwrp - scn wrap */ xactsqn number, /* highest transaction sequence number */ undosqn number, /* highest undo block sequence number */ inst# number, /* parallel server instance that owns the segment */ status$ number not null, /* segment status (see KTS.H): */ /* 1 = INVALID, 2 = AVAILABLE, 3 = IN USE, 4 = OFFLINE, 5 = NEED RECOVERY, * 6 = PARTLY AVAILABLE (contains in-doubt txs) */ ts# number, /* tablespace number */ ugrp# number, /* The undo group it belongs to */ keep number, optimal number, flags number, spare1 number, spare2 number, spare3 number, spare4 varchar2(1000), spare5 varchar2(1000), spare6 date ) </pre> </div></div> <p>status$=1 means INVALID (does not exist), so the UNDO segment no longer exists.</p> QA-34 ORA-00600 [kddummy_blkchk] while dropping UNDO segment. Oracle - Internals Blocker Closed Answered ubTools Support ubTools Support Thu, 8 Nov 2007 07:57:20 +0000 (UTC) Thu, 8 Nov 2007 12:39:27 +0000 (UTC) 0 Since the UNDO segment no longer exists, its segment type was most <b>probably</b> converted to TEMP. After setting the following event in the SPFILE/PFILE, the problem disappeared. 
<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>event="10061 trace name context forever, level 10" </pre> </div></div> <p>This event prevents SMON from cleaning up temporary segments.</p> The current UNDO tablespace was then dropped and a new one was created. After that, event 10061 was removed. Operating System Product Version Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production [QA-31] How did Oracle compute the selectivity on index ? http://jira.ubtools.com/jira/browse/QA-31 The customer wanted to know how Oracle computes the selectivity on index IC_TRAN_PNDI1. They are not sure whether the Oracle optimizer computes it correctly. QA-31 How did Oracle compute the selectivity on index ? Oracle - SQL Tuning Major Closed Answered ubTools Support ubTools Support Sat, 15 Sep 2007 11:09:55 +0000 (UTC) Wed, 30 Sep 2015 14:34:51 +0000 (UTC) 0 Event 10053 trace file. SQLTXPLAIN report. SQLTXPLAIN report. <b>BASE STATISTICAL INFORMATION</b> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> *********************** Table stats Table: IC_TRAN_PND Alias: T (Using composite stats) TOTAL :: CDN: 34357548 NBLKS: 737250 AVG_ROW_LEN: 143 -- Index stats INDEX NAME: IC_TRAN_PNDI1 COL#: 2 7 6 8 TOTAL :: LVLS: 3 #LB: 341316 #DK: 113700 LB/K: 3 DB/K: 248 CLUF: 28283677 ... 
*********************** </pre> </div></div> <p><b>Definition of BASE STATISTICAL INFORMATION</b></p> <blockquote> <p>CDN: Cardinality, number of rows.<br/> NBLKS: Number of blocks.<br/> AVG_ROW_LEN: Average row length.</p> <p>COL#: Column numbers in order.<br/> LVLS: Index depth.<br/> #LB: Number of leaf blocks.<br/> #DK: Number of distinct keys.<br/> LB/K: Leaf blocks per key.<br/> DB/K: Data blocks per key.<br/> CLUF: Clustering factor.</p></blockquote> <p><b>SINGLE TABLE ACCESS PATH</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Column: DELETE_MAR Col#: 28 Table: IC_TRAN_PND Alias: T NDV: 3 NULLS: 0 DENS: 3.3333e-01 LO: 0 HI: 2 NO HISTOGRAM: #BKT: 1 #VAL: 2 Column: COMPLETED_ Col#: 24 Table: IC_TRAN_PND Alias: T NDV: 2 NULLS: 0 DENS: 5.0000e-01 LO: 0 HI: 1 NO HISTOGRAM: #BKT: 1 #VAL: 2 Column: TRANS_QTY Col#: 16 Table: IC_TRAN_PND Alias: T NDV: 62267 NULLS: 0 DENS: 1.6060e-05 LO: -3116050 HI: 150907016871 NO HISTOGRAM: #BKT: 1 #VAL: 2 Column: ITEM_ID Col#: 2 Table: IC_TRAN_PND Alias: T NDV: 4527 NULLS: 0 DENS: 2.2090e-04 LO: 3 HI: 9816 NO HISTOGRAM: #BKT: 1 #VAL: 2 Column: WHSE_CODE Col#: 6 Table: IC_TRAN_PND Alias: T NDV: 105 NULLS: 0 DENS: 9.5238e-03 NO HISTOGRAM: #BKT: 1 #VAL: 2 Column: LOT_ID Col#: 7 Table: IC_TRAN_PND Alias: T NDV: 13443 NULLS: 0 DENS: 7.4388e-05 LO: 0 HI: 46635 NO HISTOGRAM: #BKT: 1 #VAL: 2 Column: LOCATION Col#: 8 Table: IC_TRAN_PND Alias: T NDV: 31 NULLS: 0 DENS: 3.2258e-02 NO HISTOGRAM: #BKT: 1 #VAL: 2 TABLE: IC_TRAN_PND ORIG CDN: 34357548 ROUNDED CDN: 1 CMPTD CDN: 0 ... 
</pre> </div></div> <p><b>Definition of SINGLE TABLE ACCESS PATH</b></p> <blockquote> <p>NDV: Number of distinct values.<br/> NULLS: Number of NULLs.<br/> DENS: Density.<br/> LO: Lowest value for numeric columns.<br/> HI: Highest value for numeric columns.<br/> ...</p></blockquote> According to the execution plan, these are the predicates: <p><ins>Access Predicates</ins>:</p> <blockquote> <p>T.ITEM_ID=5125<br/> AND T.LOT_ID=L.LOT_ID<br/> AND T.WHSE_CODE='350'</p></blockquote> <p><ins>Filter Predicates</ins>:</p> <blockquote> <p>B.ITEM_ID=T.ITEM_ID<br/> AND T.LOT_ID&gt;0<br/> AND T.LOCATION&lt;&gt;'NONE'</p></blockquote> <p>Column order of IC_TRAN_PNDI1:</p> <ul class="alternate" type="square"> <li>ITEM_ID</li> <li>LOT_ID</li> <li>WHSE_CODE</li> <li>LOCATION</li> </ul> According to the execution plan, T.LOT_ID is joined with L.LOT_ID, so T.LOT_ID receives its values from the join. The index is therefore accessed on the following columns: <ul class="alternate" type="square"> <li>ITEM_ID</li> <li>LOT_ID</li> <li>WHSE_CODE</li> </ul> <p>That is why the access predicates consist of these columns. T.LOCATION&lt;&gt;'NONE' is not included in the access predicates, because &lt;&gt; cannot be used to access an index.</p> <p>After the index is accessed by the access predicates, the filter predicates are applied to eliminate rows in the index without going to the table. In particular, T.LOCATION&lt;&gt;'NONE' is used as a filter predicate to filter keys within the index.</p> <font color="red">Note:</font> Since the IC_TRAN_PNDI1 index columns have no NULLs and no histograms, and all predicates are ANDed, other situations for selectivity computation are not covered here. 
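<p>As a quick cross-check of the DENS values in the trace above, the sketch below recomputes 1/NDV for the four IC_TRAN_PNDI1 columns. It is only an illustration in Python, not part of the original analysis: the names <em>ndv</em>, <em>trace_dens</em> and <em>density</em> are made up for this example, and the 1/NDV rule is assumed to apply only because these columns have no NULLs and no histograms.</p>

```python
# NDV values copied from the SINGLE TABLE ACCESS PATH section of the trace.
ndv = {"ITEM_ID": 4527, "LOT_ID": 13443, "WHSE_CODE": 105, "LOCATION": 31}

# DENS values printed in the same trace, kept here only for comparison.
trace_dens = {
    "ITEM_ID": 2.2090e-04,
    "LOT_ID": 7.4388e-05,
    "WHSE_CODE": 9.5238e-03,
    "LOCATION": 3.2258e-02,
}


def density(distinct_values):
    """Density of a column without NULLs or histograms: 1/NDV."""
    return 1.0 / distinct_values


for column, n in ndv.items():
    # Print the recomputed density next to the value from the trace.
    print(f"{column:<10} 1/NDV = {density(n):.4e}  trace DENS = {trace_dens[column]:.4e}")
```

<p>If the two columns of the output agree, the DENS figures can be used directly as the equality selectivities.</p>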
<p><ins>Selectivity of access predicates</ins>:</p> <table class='confluenceTable'><tbody> <tr> <th class='confluenceTh'>Column</th> <th class='confluenceTh'>Operation</th> <th class='confluenceTh'>Formula</th> <th class='confluenceTh'>Value</th> </tr> <tr> <td class='confluenceTd'>ITEM_ID</td> <td class='confluenceTd'>=</td> <td class='confluenceTd'>1/NDV=DENS</td> <td class='confluenceTd'>2.2090e-04</td> </tr> <tr> <td class='confluenceTd'>LOT_ID</td> <td class='confluenceTd'>=</td> <td class='confluenceTd'>1/NDV=DENS</td> <td class='confluenceTd'>7.4388e-05</td> </tr> <tr> <td class='confluenceTd'>WHSE_CODE</td> <td class='confluenceTd'>=</td> <td class='confluenceTd'>1/NDV=DENS</td> <td class='confluenceTd'>9.5238e-03</td> </tr> </tbody></table> <p>Since the columns are ANDed, combined selectivity means:</p> <p>= Sel(ITEM_ID)*Sel(LOT_ID)*Sel(WHSE_CODE)<br/> = 2.2090e-04*7.4388e-05*9.5238e-03<br/> = <font color="red">1.5649e-10</font></p> <p><ins>Selectivity of filter predicates</ins>:</p> <p>After accessing the index, filter operation will start. In our case, access predicates will also be used in filter operation to eliminate rows in the index. Because, their values are known, and can be used in filter operation.</p> <p>But, their selectivity will not be re-computed, since they are already computed to access the index. 
So, T.LOT_ID&gt;0 in the filter predicates does not contribute to the selectivity, even though its operator is not an equality as in the access predicates.</p> <table class='confluenceTable'><tbody> <tr> <th class='confluenceTh'>Column</th> <th class='confluenceTh'>Operation</th> <th class='confluenceTh'>Formula</th> <th class='confluenceTh'>Value</th> </tr> <tr> <td class='confluenceTd'>LOCATION</td> <td class='confluenceTd'>&lt;&gt;</td> <td class='confluenceTd'>1-(1/NDV=DENS)</td> <td class='confluenceTd'>1-3.2258e-02=0.967742</td> </tr> </tbody></table> <p>Since the columns are ANDed, combined selectivity means:</p> <p>= Sel(ITEM_ID)*Sel(LOT_ID)*Sel(WHSE_CODE)*Sel(LOCATION)<br/> = 2.2090e-04*7.4388e-05*9.5238e-03*0.967742<br/> = <font color="red">1.5144e-10</font></p> The Event 10053 trace file must be interpreted to see whether the optimizer's computation matches ours. <p><b>Final cost at the bottom:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Final - All Rows Plan: JOIN ORDER: 1 CST: 29 CDN: 2 RSC: 28 RSP: 28 BYTES: 308 IO-RSC: 28 IO-RSP: 28 CPU-RSC: 0 CPU-RSP: 0 </pre> </div></div> <p><font color="red">The final cost is 29.</font> </p> <p><b>Working backward through the trace to break down the final cost of 29:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> BASE STATISTICAL INFORMATION *********************** Table stats Table: IC_ITEM_INV_V_SIL Alias: X TOTAL :: (NOT ANALYZED) CDN: 0 NBLKS: 0 AVG_ROW_LEN: 0 _OPTIMIZER_PERCENT_PARALLEL = 0 BEST_CST: 13.00 PATH: 2 Degree: 1 </pre> </div></div> <p><font color="red">The cost of IC_ITEM_INV_V_SIL is 13.</font></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> GENERAL PLANS *********************** Join order[1]: IC_ITEM_INV_V_SIL[X]#0 GROUP BY sort GROUP BY cardinality: 1, TABLE cardinality: 2 HAVING selectivity: 5.0000e-02 -&gt; GROUPS: 1 SORT resource 
Sort statistics Sort width: 299 Area size: 1048576 Max Area size: 104857600 Degree: 1 Blocks to Sort: 1 Row size: 180 Rows: 2 Initial runs: 1 Merge passes: 1 IO Cost / pass: 30 Total IO sort cost: 16 Total CPU sort cost: 0 Total Temp space used: 0 Best so far: TABLE#: 0 CST: 29 CDN: 2 BYTES: 308 SORT resource Sort statistics Sort width: 299 Area size: 1048576 Max Area size: 104857600 Degree: 1 Blocks to Sort: 1 Row size: 180 Rows: 2 Initial runs: 1 Merge passes: 1 IO Cost / pass: 30 Total IO sort cost: 16 Total CPU sort cost: 0 Total Temp space used: 0 .. </pre> </div></div> <p><font color="red">The cost of sorting IC_ITEM_INV_V_SIL is 16.</font> </p> <p>Total cost (29) = Accessing IC_ITEM_INV_V_SIL (13) + Sorting IC_ITEM_INV_V_SIL (16)</p> <p><b>Going backward lines to break down the cost of 13:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Join result: cost: 7 cdn: 1 rcz: 98 Best so far: TABLE#: 0 CST: 1 CDN: 1 BYTES: 4 Best so far: TABLE#: 1 CST: 1 CDN: 1 BYTES: 11 Best so far: TABLE#: 3 CST: 3 CDN: 1 BYTES: 65 Best so far: TABLE#: 2 CST: 7 CDN: 1 BYTES: 98 Final - All Rows Plan: JOIN ORDER: 2 CST: 7 CDN: 1 RSC: 7 RSP: 7 BYTES: 98 IO-RSC: 7 IO-RSP: 7 CPU-RSC: 0 CPU-RSP: 0 </pre> </div></div> <p><font color="red">JOIN ORDER: 2 is selected with the cost of 7.</font></p> <p><b>Going backward lines to break down the cost of 7:</b></p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Join order[2]: IC_ITEM_MST_B[B]#0 IC_ITEM_MST_TL[T]#1 IC_LOTS_MST[L]#3 IC_TRAN_PND[T]#2 ... Now joining: IC_TRAN_PND[T]#2 ******* NL Join Outer table: cost: 3 cdn: 1 rcz: 65 resp: 3 Access path: index (scan) Index: IC_TRAN_PNDI1 TABLE: IC_TRAN_PND RSC_CPU: 0 RSC_IO: 4 IX_SEL: 1.5650e-10 TB_SEL: 1.5144e-10 Join: resc: 7 resp: 7 Best NL cost: 7 resp: 7 </pre> </div></div> <p>Our index IC_TRAN_PNDI1 appears here. 
So, here is the stop point for our case.</p> <p>Here is the selectivity comparison table which includes Oracle-computed selectivity values and manually computed selectivity values.</p> <table class='confluenceTable'><tbody> <tr> <th class='confluenceTh'>&nbsp;</th> <th class='confluenceTh'>Oracle</th> <th class='confluenceTh'>Manual</th> </tr> <tr> <td class='confluenceTd'>IX_SEL(Access predicates)</td> <td class='confluenceTd'> 1.5650e-10</td> <td class='confluenceTd'>1.5649e-10</td> </tr> <tr> <td class='confluenceTd'>TB_SEL(Access+Filter Predicates)</td> <td class='confluenceTd'>1.5144e-10</td> <td class='confluenceTd'>1.5144e-10</td> </tr> </tbody></table> <p>No computation errors found.</p> <span class="nobr"><a href="http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html">"Cost Based Oracle: Fundamentals"<sup><img class="rendericon" src="http://www.ubTools.com/jira/images/icons/linkext7.gif" height="7" width="7" align="absmiddle" alt="" border="0"/></sup></a></span> book of Jonathan Lewis was used in the calculations above. 
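<p>The manual computation compared above can be replayed with a short script. This is a minimal sketch in Python under the same assumptions as the analysis (no NULLs, no histograms, all predicates ANDed, equality selectivity = 1/NDV, inequality selectivity = 1 - 1/NDV). The helper names <em>eq_sel</em>, <em>ne_sel</em> and <em>rel_diff</em> are hypothetical, and the two ORACLE_* constants are simply the IX_SEL and TB_SEL values copied from the trace.</p>

```python
import math

# NDV values copied from the Event 10053 trace for the index columns.
ndv = {"ITEM_ID": 4527, "LOT_ID": 13443, "WHSE_CODE": 105, "LOCATION": 31}

# Values reported by the optimizer in the NL Join section of the trace.
ORACLE_IX_SEL = 1.5650e-10
ORACLE_TB_SEL = 1.5144e-10


def eq_sel(distinct_values):
    """Selectivity of an equality predicate: 1/NDV."""
    return 1.0 / distinct_values


def ne_sel(distinct_values):
    """Selectivity of a <> predicate: 1 - 1/NDV."""
    return 1.0 - 1.0 / distinct_values


def rel_diff(ours, oracle):
    """Relative difference between our value and the trace value."""
    return abs(ours - oracle) / oracle


# Access predicates: ITEM_ID, LOT_ID and WHSE_CODE are ANDed equalities.
ix_sel = math.prod(eq_sel(ndv[c]) for c in ("ITEM_ID", "LOT_ID", "WHSE_CODE"))

# Access + filter predicates: LOCATION <> 'NONE' is applied on top.
tb_sel = ix_sel * ne_sel(ndv["LOCATION"])

print(f"IX_SEL manual = {ix_sel:.4e}  relative diff = {rel_diff(ix_sel, ORACLE_IX_SEL):.1e}")
print(f"TB_SEL manual = {tb_sel:.4e}  relative diff = {rel_diff(tb_sel, ORACLE_TB_SEL):.1e}")
```

<p>Small differences in the last printed digit are expected, because the trace shows rounded densities while the script keeps full precision.</p>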
Operating System SQL_TEXT CREATE OR REPLACE VIEW IC_ITEM_INV_V<br/> (ITEM_ID, LOT_NO, SUBLOT_NO, LOT_ID, LOT_STATUS, <br/> &nbsp;LOT_CREATED, EXPIRE_DATE, QC_GRADE, WHSE_CODE, LOCATION, <br/> &nbsp;LOCT_ONHAND, LOCT_ONHAND2, COMMIT_QTY, COMMIT_QTY2)<br/> AS <br/> SELECT l.item_id, l.lot_no, l.sublot_no, l.lot_id, s.lot_status,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;l.lot_created, l.expire_date, l.qc_grade, b.whse_code, b.LOCATION,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;b.loct_onhand, b.loct_onhand2, 0, 0<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;FROM ic_lots_mst l, ic_loct_inv b, ic_lots_sts s<br/> &nbsp;&nbsp;&nbsp;&nbsp;WHERE l.item_id = b.item_id<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND l.inactive_ind = 0<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND l.lot_id = b.lot_id<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND b.lot_status = s.lot_status(+)<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND NVL (s.order_proc_ind, 1) = 1<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND NVL (s.rejected_ind, 0) = 0<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND b.loct_onhand &gt; 0<br/> &nbsp;&nbsp;&nbsp;UNION ALL<br/> &nbsp;&nbsp;&nbsp;SELECT /*+ INDEX(t IC_TRAN_PNDI1) */ t.item_id, l.lot_no, l.sublot_no, t.lot_id, t.lot_status,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;l.lot_created, l.expire_date, l.qc_grade, t.whse_code, t.LOCATION,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0, 0, t.trans_qty commit_qty, t.trans_qty2 commit_qty2<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;FROM ic_lots_mst l, ic_tran_pnd t, ic_item_mst i<br/> &nbsp;&nbsp;&nbsp;&nbsp;WHERE i.item_id = l.item_id<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND i.item_id = t.item_id<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND l.inactive_ind = 0<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND t.lot_id = l.lot_id<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND t.delete_mark = 0<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND t.completed_ind = 0<br/> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND t.trans_qty &lt; 0<br/> /<br/> <br/> <br/> <br/> SELECT SUM (loct_onhand), SUM (loct_onhand2), SUM (commit_qty),<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SUM (commit_qty2), SUM (loct_onhand) + SUM (commit_qty), lot_no,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sublot_no, lot_id, lot_status, lot_created, LOCATION, expire_date,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;qc_grade<br/> &nbsp;&nbsp;&nbsp;&nbsp;FROM xtdba.ic_item_inv_v_sil x<br/> &nbsp;&nbsp;&nbsp;WHERE item_id = 5125<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND whse_code = '350'<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND loct_onhand &gt;= 0<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND expire_date &gt;<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TO_DATE ('06-SEP-2007, 11:59:59', 'DD-MON-YYYY, HH:MI:SS')<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND lot_id &gt; 0<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AND LOCATION &lt;&gt; 'NONE'<br/> GROUP BY lot_no,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sublot_no,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lot_id,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lot_status,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lot_created,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;LOCATION,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;expire_date,<br/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;qc_grade<br/> &nbsp;&nbsp;HAVING SUM (loct_onhand) + SUM (commit_qty) &gt; 0<br/> ORDER BY lot_created Operating System Version B.11.11 Product Version 9.2.0.7.0 [QA-30] Memory leak on MMNL background process. 
http://jira.ubtools.com/jira/browse/QA-30 <h4><a name="Problem%3A"></a>Problem:</h4> <p> The size of MMNL background process is growing, then the server is crashed.</p> <h4><a name="Analysis%3A"></a>Analysis:</h4> <blockquote> <p> bash-3.00$ ps -ef|grep mmnl<br/> oracle 2250 1 0 Jun 28 ? 12:03 ora_mmnl_bgw<br/> oracle 21397 20996 0 13:31:42 pts/5 0:00 grep mmnl</p> <p> SQL&gt; select s.sid, n.name,s.value<br/> from v$sesstat s , v$statname n<br/> where s.statistic# = n.statistic#<br/> and n.name like '%memory%'<br/> and s.sid in<br/> (select se.sid from v$session se, v$process pr<br/> where se.paddr=pr.addr and pr.spid=2250)<br/> order by value desc;</p> <p> SID NAME VALUE<br/> ---------- ---------------------------------------------------------------- ----------<br/> <font color="red">1646 session pga memory 463496</font> <br/> 1646 session pga memory max 463496<br/> 1646 session uga memory 88640<br/> 1646 session uga memory max 88640<br/> 1646 workarea memory allocated 0<br/> 1646 sorts (memory) 0</p> <p> 6 rows selected.</p> <p> SQL&gt;</p> <p> bash-3.00$ pmap -x 2250<br/> 2250: ora_mmnl_bgw<br/> Address Kbytes RSS Anon Locked Mode Mapped File<br/> 0000000100000000 81016 78904 - - r-x-- oracle<br/> 000000010501C000 856 592 112 - rwx-- oracle<br/> 00000001050F2000 3128 1352 64 - rwx-- [ heap ]<br/> <font color="red">0000000105400000 4190208 1255504 4096 - rwx-- [ heap ]<br/> 0000000205000000 3731456 2145424 1138688 - rwx-- [ heap ]</font> <br/> <font color="blue">0000000380000000 253952 253952 - 253952 rwxsR [ ism shmi d=0xc ]<br/> 0000040000000000 290816 290816 - 290816 rwxsR [ ism shmi d=0xd ]<br/> 0000040040000000 290816 290816 - 290816 rwxsR [ ism shmi d=0xe ]<br/> 0000040080000000 16 16 - 16 rwxsR [ ism shmi d=0xf ]</font> <br/> FFFFFFFF7B500000 64 24 - - rwx-- [ anon ]<br/> FFFFFFFF7B530000 128 16 - - rw--- [ anon ]<br/> FFFFFFFF7B600000 8 - - - rw-s- dev:291,0 in o:240652<br/> FFFFFFFF7B750000 64 24 16 - rw--- [ anon ]<br/> FFFFFFFF7B760000 64 24 24 - 
rw--- [ anon ]<br/> FFFFFFFF7B770000 64 56 48 - rw--- [ anon ]<br/> FFFFFFFF7B800000 16 16 - - r-x-- liblgrp.so.1<br/> FFFFFFFF7B904000 8 8 - - rwx-- liblgrp.so.1<br/> FFFFFFFF7BA78000 8 8 - - rwxs- [ anon ]<br/> FFFFFFFF7BB00000 8 8 - - r-x-- libc_psr.so. 1<br/> FFFFFFFF7BC00000 8 8 8 - rwx-- [ anon ]<br/> FFFFFFFF7BD00000 8 8 - - r-x-- libmd5.so.1<br/> FFFFFFFF7BE02000 8 8 - - rwx-- libmd5.so.1<br/> FFFFFFFF7BF00000 8 8 8 - rwx-- [ anon ]<br/> FFFFFFFF7C000000 640 168 - - r-x-- libm.so.2<br/> FFFFFFFF7C19E000 40 24 8 - rwx-- libm.so.2<br/> FFFFFFFF7C200000 8 8 - - r-x-- libkstat.so. 1<br/> FFFFFFFF7C302000 8 8 8 - rwx-- libkstat.so. 1<br/> FFFFFFFF7C400000 32 24 - - r-x-- librt.so.1<br/> FFFFFFFF7C508000 8 8 - - rwx-- librt.so.1<br/> FFFFFFFF7C600000 32 32 - - r-x-- libaio.so.1<br/> FFFFFFFF7C708000 8 8 - - rwx-- libaio.so.1<br/> FFFFFFFF7C800000 8 8 8 - rwx-- [ anon ]<br/> FFFFFFFF7C900000 912 656 - - r-x-- libc.so.1<br/> FFFFFFFF7CAE4000 64 64 64 - rwx-- libc.so.1<br/> FFFFFFFF7CAF4000 8 - - - rwx-- libc.so.1<br/> FFFFFFFF7CB00000 24 16 16 - rwx-- [ anon ]<br/> FFFFFFFF7CC00000 32 16 - - r-x-- libgen.so.1<br/> FFFFFFFF7CD08000 8 8 - - rwx-- libgen.so.1<br/> FFFFFFFF7CE00000 56 32 - - r-x-- libsocket.so .1<br/> FFFFFFFF7CF0E000 16 16 - - rwx-- libsocket.so .1<br/> FFFFFFFF7D000000 688 248 - - r-x-- libnsl.so.1<br/> FFFFFFFF7D1AC000 64 64 - - rwx-- libnsl.so.1<br/> FFFFFFFF7D1BC000 32 8 - - rwx-- libnsl.so.1<br/> FFFFFFFF7D200000 1912 320 - - r-x-- libnnz10.so<br/> FFFFFFFF7D4DC000 632 232 - - rwx-- libnnz10.so<br/> FFFFFFFF7D57A000 8 - - - rwx-- libnnz10.so<br/> FFFFFFFF7D600000 40 16 - - r-x-- libdbcfg10.s o<br/> FFFFFFFF7D708000 8 8 - - rwx-- libdbcfg10.s o<br/> FFFFFFFF7D800000 8488 8200 - - r-x-- libjox10.so<br/> FFFFFFFF7E148000 536 480 - - rwx-- libjox10.so<br/> FFFFFFFF7E200000 8 8 8 - rwx-- [ anon ]<br/> FFFFFFFF7E300000 16 16 - - r-x-- libocrutl10. so<br/> FFFFFFFF7E402000 16 16 - - rwx-- libocrutl10. 
so<br/> FFFFFFFF7E500000 8 8 8 - rwx-- [ anon ]<br/> FFFFFFFF7E600000 144 40 - - r-x-- libocrb10.so<br/> FFFFFFFF7E722000 8 8 - - rwx-- libocrb10.so<br/> FFFFFFFF7E800000 200 72 - - r-x-- libocr10.so<br/> FFFFFFFF7E930000 16 16 - - rwx-- libocr10.so<br/> FFFFFFFF7EA00000 8 8 - - r-x-- libskgxn2.so<br/> FFFFFFFF7EB00000 8 8 - - rwx-- libskgxn2.so<br/> FFFFFFFF7EC00000 1480 352 - - r-x-- libhasgen10. so<br/> FFFFFFFF7EE70000 72 56 - - rwx-- libhasgen10. so<br/> FFFFFFFF7EE82000 8 - - - rwx-- libhasgen10. so<br/> FFFFFFFF7EF00000 8 8 8 - rwx-- [ anon ]<br/> FFFFFFFF7F000000 8 8 - - r-x-- libskgxp10.s o<br/> FFFFFFFF7F100000 8 8 - - rwx-- libskgxp10.s o<br/> FFFFFFFF7F200000 8 8 - - r-x-- libodmd10.so<br/> FFFFFFFF7F300000 8 8 - - rwx-- libodmd10.so<br/> FFFFFFFF7F400000 8 8 - - r-x-- libdl.so.1<br/> FFFFFFFF7F500000 8 8 8 - rwx-- [ anon ]<br/> FFFFFFFF7F600000 176 176 - - r-x-- ld.so.1<br/> FFFFFFFF7F72C000 16 16 8 - rwx-- ld.so.1<br/> FFFFFFFF7FFF0000 64 48 24 - rw--- [ stack ]<br/> ---------------- ---------- ---------- ---------- ----------<br/> total Kb 8859344 4329168 1143232 835600<br/> bash-3.00$</p> <p> bash-3.00$ truss -p 2250</p> <p><font color="red">open("/dev/kstat", O_RDONLY) = 58843</font><br/> ioctl(58843, KSTAT_IOC_CHAIN_ID, 0x00000000) = 755<br/> ioctl(58843, KSTAT_IOC_READ, "kstat_headers") Err#12 ENOMEM<br/> brk(0x2EBC064D0) = 0<br/> brk(0x2EBC264D0) = 0<br/> ioctl(58843, KSTAT_IOC_READ, "kstat_headers") = 755<br/> brk(0x2EBC264D0) = 0<br/> brk(0x2EBC2A4D0) = 0<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info0") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info1") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info2") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info3") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info8") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info9") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info10") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info11") = 755<br/> ioctl(58843, KSTAT_IOC_READ, "cpu_info512") = 755<br/> 
pset_bind(PS_QUERY, P_PID, -1, 0xFFFFFFFF7FFFD5AC) = 0<br/> <font color="red">open("/dev/kstat", O_RDONLY) = 58844</font></p></blockquote> QA-30 Memory leak on MMNL background process. Oracle - Administration Major Closed Answered ubTools Support ubTools Support Mon, 16 Jul 2007 05:35:29 +0000 (UTC) Tue, 18 Sep 2007 05:15:33 +0000 (UTC) 0 <h4><a name="Cause%3A"></a>Cause:</h4> <p><em>session pga memory</em> is only 463496 bytes. However, the resident memory of the process at the OS level is far higher, even after the shared memory segments are subtracted:</p> <p>4329168 - (253952+290816+290816+16)= 3493568 (KB)</p> <p>3493568 KB (about 3.3 GB) is far too high. There is a huge memory allocation in the heap.</p> <p>Oracle opens <em>/dev/kstat</em> to read operating system kernel statistics but does not close it before the next open, as the increasing file descriptors (58843, 58844) in the truss output show. There should be one close() system call for each open() call.</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>ORACLE BUG: 3701351.</p> <p>Base Bug#3559340 includes a fix for Oracle 10.1.0.3.</p> Base Bug#3559340 is fixed in Oracle 10.1.0.4. Operating System Operating System Version 5.10 Product Version 10.1.0.3.0 [QA-29] ORA-600 [2845] while selecting, Invalid ROWID. http://jira.ubtools.com/jira/browse/QA-29 ORA-600 <span class="error">&#91;2845&#93;</span> while selecting, Invalid ROWID. QA-29 ORA-600 [2845] while selecting, Invalid ROWID. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 18:09:20 +0000 (UTC) Sun, 16 Sep 2007 16:24:50 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [2845], [0], [30], [0], [], [], [], [] </pre> </div></div> <h4><a name="Errorcodedefinition%3A"></a>Error code definition:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Oracle is reading a range of blocks from a database file. 
If the starting block number or file number is 0, or the file number is greater than can be accommodated in the SGA (DB_FILES), error ORA-600 [2845] is raised. Ref: Metalink Note: 31057.1 ORA-600 [2845] "Read of bad DBA Requested" </pre> </div></div> <h4><a name="Cursordump%3A"></a>Cursor dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>******************** Cursor Dump ************************ Current cursor: 30, pgadep: 0 pgactx: 14a415b8 ctxcbk: 0 ctxqbc: 14a41990 qbcrws: 14a40ef0 Cursor Dump: ---------------------------------------- ... ---------------------------------------- Cursor 30 (1400ea8e0): CURFETCH curiob: 14019be40 curflg: 46 curpar: 0 curusr: c curses 8a2f38 cursor name: SELECT "NOTE" FROM "MEDIX"."PAT_SES" WHERE "ROWID"=:1 child pin: 5485b30, child lock: 54f2960, parent lock: 54c5f68 xscflg: 80110676, parent handle: 14a46070 nxt: 3.0x00000018 nxt: 2.0x000007d8 nxt: 1.0x000004e0 Cursor frame allocation dump: frm: -------- Comment -------- Size Seg Off bind 0: dty=1 mxl=32(18) mal=00 scl=00 pre=00 oacflg=01 bfp=140192748 bln=18 avl=18 flg=05 value="00000000.0000.0000" ---------------------------------------- ... </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>As seen in the cursor dump above, the current cursor number is 30. The cursor#30 has a bind variable using a ROWID. But, the value of this bind variable is "00000000.0000.0000". In Oracle7, this ROWID points to block#0, slot#0, file#0. This is wrong.</p> <h4><a name="Recommendation%3A"></a>Recommendation:</h4> <ul class="alternate" type="square"> <li>Check application against any possible wrong ROWID usage.</li> <li>Call Oracle Support.</li> </ul> Operating System Product Version 7.3.4.5.0 [QA-28] ORA-00600 [729]: UGA memory leak. http://jira.ubtools.com/jira/browse/QA-28 ORA-00600 <span class="error">&#91;729&#93;</span>: UGA memory leak. QA-28 ORA-00600 [729]: UGA memory leak. 
Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 18:00:56 +0000 (UTC) Sun, 16 Sep 2007 16:25:06 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [729], [480], [space leak], [], [], [], [], [] </pre> </div></div> <h4><a name="Errorcodedefinition%3A"></a>Error code definition:</h4> <p>A space leak has been detected in the User Global Area (UGA). There is no data corruption as a result of this error. It is an internal memory housekeeping problem. Second argument is the number of bytes leaked.</p> <h4><a name="UGAHeapdump%3A"></a>UGA Heap dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>******** ERROR: UGA memory leak detected 480 ******** ****************************************************** HEAP DUMP heap name="session heap" desc=0x222bd6f4 extent sz=0x108c alt=32767 het=32767 rec=0 flg=3 opc=3 parent=212550 owner=ad83d50 nex=0 xsz=0x108c EXTENT 0 Chunk 2330b100 sz= 3844 free " " EXTENT 1 Chunk 232f5174 sz= 516 free " " EXTENT 2 Chunk 236f0050 sz= 4176 free " " EXTENT 3 Chunk 236d0050 sz= 1228 free " " EXTENT 4 Chunk 236f18e4 sz= 1280 free " " EXTENT 5 Chunk 23307098 sz= 4228 free " " EXTENT 6 Chunk 2330a27c sz= 3696 free " " EXTENT 7 Chunk 23308130 sz= 1008 free " " Chunk 23308520 sz= 480 freeable "define var info" Chunk 23308700 sz= 2740 free " " EXTENT 8 Chunk 23306214 sz= 2832 perm "perm " alo=2832 Chunk 23306d24 sz= 864 free " " EXTENT 9 Chunk 233091c8 sz= 4228 free " " EXTENT 10 Chunk 232f405c sz= 612 free " " EXTENT 11 ... </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>As seen in the UGA heap dump, there is a freeable chunk of <em>define var info</em> memory type. 
This chunk appears to have been leaked.</p> <h4><a name="Workaround%3A"></a>Workaround:</h4> <p>There is no data corruption with this error, and it can be safely ignored for small memory leaks by adding the following event to init.ora:</p> <ul class="alternate" type="square"> <li>event = "10262 trace name context forever, level 500"</li> </ul> <p>Then, restart your database. This event suppresses the error for space leaks smaller than 500 bytes.</p> <p>You can see the details at Metalink Note:31056.1 ORA-600 <span class="error">&#91;729&#93;</span> "UGA Space Leak"</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>Bug:2177050: ORA-600 <span class="error">&#91;729&#93;</span> after application of the 8.1.7.3 patchset. The resulting trace file will include a memory dump which shows unfreed memory chunks with the tags "define var info" and/or "oactoid info".<br/> Ref: Metalink Note:31056.1 ORA-600 <span class="error">&#91;729&#93;</span> "UGA Space Leak"</p> Operating System Operating System Version 2000 Product Version 8.1.7.3.0 [QA-27] ORA-00600 [kcbgcur_1] by PQ operation. http://jira.ubtools.com/jira/browse/QA-27 ORA-00600 <span class="error">&#91;kcbgcur_1&#93;</span> by PQ operation. QA-27 ORA-00600 [kcbgcur_1] by PQ operation. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:56:13 +0000 (UTC) Sun, 16 Sep 2007 16:25:21 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [kcbgcur_1], [], [], [], [], [], [], [] </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>kcbgcur(). This is a function of the Oracle Cache Layer. 
</pre> </div></div> <h4><a name="Undoblockdump%3A"></a>Undo block dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>UNDO BLK: xid: 0x0005.05e.000000c4 seq: 0x8c cnt: 0x31 irb: 0x19 icl: 0x0 flg: 0x0000 Rec Offset Rec Offset Rec Offset Rec Offset Rec Offset --------------------------------------------------------------------------- 0x01 0x1f38 0x02 0x1e88 0x03 0x1de4 0x04 0x1d3c 0x05 0x1c94 0x06 0x1bf4 0x07 0x1b54 0x08 0x1ac4 0x09 0x1a20 0x0a 0x1978 0x0b 0x18d4 0x0c 0x1820 0x0d 0x1784 0x0e 0x16e0 0x0f 0x1638 0x10 0x1598 0x11 0x14e8 0x12 0x1448 0x13 0x13a4 0x14 0x1308 0x15 0x126c 0x16 0x11d0 0x17 0x112c 0x18 0x1084 0x19 0x0fe0 0x1a 0x0f0c 0x1b 0x0e60 0x1c 0x0db8 0x1d 0x0d28 0x1e 0x0c90 0x1f 0x0bf0 0x20 0x0b28 0x21 0x0a88 0x22 0x09ec 0x23 0x0950 0x24 0x08ac 0x25 0x0814 0x26 0x077c 0x27 0x06e4 0x28 0x0650 0x29 0x05b4 0x2a 0x0524 0x2b 0x0480 0x2c 0x03f4 0x2d 0x035c 0x2e 0x02c0 0x2f 0x0230 0x30 0x01a0 0x31 0x0108 ... *----------------------------- * Rec #0x19 slt: 0x5e objn: 0(0x00000000) objd: 0 tblspc: 0(0x00000000) * Layer: 11 (Row) opc: 1 rci 0x18 Undo type: Regular undo Last buffer split: No Temp Object: No Tablespace Undo: No rdba: 0x00000000 *----------------------------- KDO undo record: KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x00c0083d.008c.18 KDO Op code: IRP xtype: XA bdba: 0x0040760a hdba: 0x004075d9 itli: 1 ispac: 0 maxfr: 4863 tabn: 0 slot: 130(0x82) size/delt: 56 fb: --H-FL-- lb: 0x0 cc: 4 null: ---- col 0: [ 3] 37 34 34 col 1: [20] 45 6c 65 63 74 72 6f 6e 69 63 20 73 74 72 75 63 74 75 72 65 col 2: [ 0] col 3: [ 0] *----------------------------- ... </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>irb points to the first undo record in the undo block at which rollback begins. So, record 0x19 is your first undo record. The object numbers recorded in this undo record (objn and objd) are both 0. I think this may be your problem. 
Oracle may not be able to know the real object number during this rollback.</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>It looks like:</p> <ul class="alternate" type="square"> <li>Bug:984947 A PARALLEL QUERY SLAVE GOT ORA-600<span class="error">&#91;KCBGCUR_1&#93;</span></li> </ul> Operating System Operating System Version 2.2.14-5.0 Product Version 8.1.6.1.0 [QA-26] ORA-00600 [12700] by SNP process. http://jira.ubtools.com/jira/browse/QA-26 ORA-00600 <span class="error">&#91;12700&#93;</span> by SNP process. QA-26 ORA-00600 [12700] by SNP process. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:44:31 +0000 (UTC) Sun, 16 Sep 2007 16:25:36 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [12700], [62], [4202128], [133], [], [], [], [] </pre> </div></div> <h4><a name="CurrentSQLstatementforthissession%3A"></a>Current SQL statement for this session:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SELECT source from source$ WHERE obj# =:1 ORDER BY line </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>rtbhiopn(). </pre> </div></div> <h4><a name="Errorcodedefinition%3A"></a>Error code definition:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Oracle is trying to access a row using its ROWID, which has been obtained from an index. A mismatch was found between the index rowid and the data block it is pointing to. The rowid points to a non-existent row in the data block. The corruption can be in data and/or index blocks. 
ORA-600 [12700] can also be reported due to a consistent read (CR) problem. The information dumped to the trace file varies greatly between releases: - in Oracle 7.3.x it is ORA-600 [12700][a1][a2] , where Arg [a1] dba (Data Block Address) Arg [a2] slot number (number of the row in the block pointed by the dba) - in Oracle 8.x and 9.x, it is ORA-600 [12700][a1][a2][a3] , where Arg [a1] dataobj# from sys.obj$ Arg [a2] relative dba of the data block Arg [a3] slot number of the row in the data block Details: Metalink Note:28229.1 ORA-600 [12700] "Index entry Points to Missing ROWID" </pre> </div></div> <h4><a name="Errorcodeinterpretation%3A"></a>Error code interpretation:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Argument Dec Hex ---------- ---------- ---------- [62] 62 0x3E [4202128] 4202128 0x401E90 [133] 133 0x85 This problem is related to the slot#133 of the rdba#4202128 of the object#62. </pre> </div></div> <h4><a name="Indexblockdump%3A"></a>Index block dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Block header dump: rdba: 0x00401ede Object id on Block? Y seg/obj: 0x63 csc: 0x00.2fbe43 itc: 2 flg: - typ: 2 - INDEX fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0012.01a.00000130 0x008027fb.000e.02 C--- 0 scn 0x0000.0000c71c 0x02 0x0002.013.00000768 0x00807cf2.4f11.08 --U- 217 fsc 0x0000.002fbe45 Leaf block dump =============== header address 74698844=0x473d05c kdxcolev 0 kdxcolok 0 kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y kdxconco 2 kdxcosdc 1 kdxconro 217 kdxcofbo 470=0x1d6 kdxcofeo 471=0x1d7 kdxcoavs 1 kdxlespl 0 kdxlende 0 kdxlenxt 4202207=0x401edf kdxleprv 4203488=0x4023e0 kdxledsz 6 kdxlecol 0 kdxlebksz 3940 row#0[471] flag: ----, lock: 2, data:(6): 00 40 1e 93 00 5b col 0; len 3; (3): c2 27 11 col 1; len 3; (3): c2 02 62 ... 
row#213[3876] flag: ----, lock: 2, data:(6): 00 40 1e 90 00 85 col 0; len 3; (3): c2 27 11 col 1; len 3; (3): c2 05 0b row#214[3892] flag: ----, lock: 2, data:(6): 00 40 1e 90 00 86 col 0; len 3; (3): c2 27 11 col 1; len 3; (3): c2 05 0c ... </pre> </div></div> <h4><a name="Datablockdump%3A"></a>Data block dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>Block header dump: rdba: 0x00401e90 Object id on Block? Y seg/obj: 0x3e csc: 0x00.2fbe43 itc: 1 flg: - typ: 1 - DATA fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0002.013.00000768 0x00807cf2.4f11.0b --U- 139 fsc 0x009b.002fbe45 data_block_dump =============== tsiz: 0xfb8 hsiz: 0x128 pbl: 0x04b43044 bdba: 0x00401e90 flag=--------- ntab=1 nrow=139 frre=-1 fsbo=0x128 fseo=0x2d4 avsp=0x111 tosp=0x23a 0xe:pti[0] nrow=139 offs=0 0x12:pri[0] offs=0xfb6 0x14:pri[1] offs=0xfb4 0x16:pri[2] offs=0xfb2 . 0x11a:pri[132] offs=0x382 0x11c:pri[133] sfll=0 0x11e:pri[134] sfll=0 0x120:pri[135] sfll=0 0x122:pri[136] sfll=0 0x124:pri[137] sfll=0 0x126:pri[138] sfll=0 block_row_dump: tab 0, row 0, @0xfb6 tl: 2 fb: --HDFL-- lb: 0x1 tab 0, row 1, @0xfb4 tl: 2 fb: --HDFL-- lb: 0x1 . tab 0, row 132, @0x382 tl: 42 fb: --H-FL-- lb: 0x1 cc: 3 col 0: [ 3] c2 27 11 col 1: [ 3] c2 05 0a col 2: [30] 09 09 09 09 09 09 6c 5f 6e 65 78 74 48 6f 6c 64 44 65 73 69 72 65 4e 75 6d 62 65 72 2c 0a end_of_block_dump </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>As seen in the index block dump, kdxledsz is 6. That means this index is a unique B*Tree index which uses restricted ROWID format in 6 bytes. The first 4 bytes are used for rdba, and the last 2 bytes are used for slot#.</p> <p>This internal error code had returned 0x401E90 for the rdba, and 0x85 for the slot#. The restricted ROWID in the index dump has to be the combination of them. So, it's 0x00401E900085. 
This restricted ROWID appears in the index dump (row#213).</p> <p>The pri[] entries show the slot# of each row in the data block. In this error, the returned slot# is 133. But, as seen in the data block dump, there is no row allocated for this slot. The highest slot# holding a row in the block dump is 132.</p> <p>Although there is an entry in the index block, there is no matching row in the data block. The data block looks corrupted.</p> <h4><a name="Workaround%3A"></a>Workaround:</h4> <p>Most probably, object#62 is source$. Restore the SYSTEM tablespace from backup, and recover it.</p> Operating System Product Version 8.0.5.2.1 [QA-25] ORA-00600 [kkslgop1] in SELECT when CURSOR_SHARING IS NOT EXACT. http://jira.ubtools.com/jira/browse/QA-25 ORA-00600 <span class="error">&#91;kkslgop1&#93;</span> in SELECT when CURSOR_SHARING IS NOT EXACT. QA-25 ORA-00600 [kkslgop1] in SELECT when CURSOR_SHARING IS NOT EXACT. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:39:50 +0000 (UTC) Sun, 16 Sep 2007 16:26:04 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [kkslgop1], [], [], [], [], [], [], [] </pre> </div></div> <h4><a name="CurrentSQLstatementforthissession%3A"></a>Current SQL statement for this session:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>SELECT COMP_TIME FROM CSTMAPSTATUS WHERE CSTID = :"SYS_B_0" AND SLOTNO = :"SYS_B_1" </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>kkslgop(). This is a function of Oracle Compilation Layer. 
</pre> </div></div> <h4><a name="Processstate%3A"></a>Process state:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>PROCESS STATE ------------- ... ---------------------------------------- SO: 404c6264, type: 3, owner: 403cde98, pt: 0, flag: INIT/-/-/0x00 (session) trans: 40e83928, creator: 403cde98, flag: (8000041) USR/- BSY/-/-/-/-/- DID: 0001-0014-00000002, short-term DID: 0000-0000-00000000 txn branch: 40f8201c oct: 3, prv: 0, user: 24/APPMGR O/S info: user: Administrator, term: CIMMB, ospid: 219:228, machine: PDP1_MES_DOM\CIMMB program: TIME_GAP.exe last wait for 'SQL*Net message from dblink' blocking sess=0x0 seq=60687 wait_time=-1 driver id=54435000, #bytes=1, =0 ---------------------------------------- ... </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>As you can see in your SQL statement, your bind variables (SYS_B_0, SYS_B_1) are system-generated bind variables. In other words, cursor sharing is enabled in your database.</p> <p>Also, as seen in the process state, your last wait event is <em>SQL*Net message from dblink</em>. That means the session had performed a database link operation beforehand.</p> <h4><a name="Workaround%3A"></a>Workaround:</h4> <p>Use cursor_sharing=exact</p> <h4><a name="Bug%3A"></a>Bug:</h4> <ul class="alternate" type="square"> <li>Bug:2169897 ORA-600 ARGUMENTS: <span class="error">&#91;KKSLGOP1&#93;</span> VIA SELECT ACROSS DB_LINK</li> </ul> <ul class="alternate" type="square"> <li>Bug:2159152 CURSOR_SHARING=FORCE MAY NOT SHARE STATEMENTS USING VIEWS IN 8172/8173</li> </ul> Operating System Product Version 8.1.7.2.0 [QA-24] ORA-07445 [000000010112A75C] during import. 
http://jira.ubtools.com/jira/browse/QA-24 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-07445: exception encountered: core dump [000000010112A75C] [SIGSEGV] [Address not mapped to object] [260] [] [] </pre> </div></div> <h4><a name="CurrentSQLstatementforthissession%3A"></a>Current SQL statement for this session:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>CREATE PROCEDURE TableParse_Proc wrapped 0 abcd abcd abcd abcd abcd .. </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>parfs4_freelist_sort() </pre> </div></div> <h4><a name="Processstate%3A"></a>Process state:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>PROCESS STATE ------------- ... ---------------------------------------- SO: 399071ce0, type: 3, owner: 39905dc18, pt: 0, flag: INIT/-/-/0x00 (session) trans: 399e4f3b8, creator: 39905dc18, flag: (10000041) USR/- BSY/-/-/-/-/- DID: 0001-0008-00000002, short-term DID: 0000-0000-00000000 txn branch: 0 oct: 24, prv: 0, user: 360/WINRLS O/S info: user: mtrxdev, term: pts/2, ospid: 8915, machine: vhdcap5g program: imp@vhdcap5g (TNS V1-V3) last wait for 'SQL*Net more data from client' blocking sess=0x0 seq=45627 wait_time=-2 driver id=62657100, #bytes=2882, =0 ---------------------------------------- </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>As seen above, this problem was encountered in import while creating a wrapped package.</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>There are several bugs about this problem with additional ORA-4030 error. 
The base bug is below:</p> <ul class="alternate" type="square"> <li>Bug:2278310 IMPORT OF WRAPPED PL/SQL PROCEDURE FAILS WITH ORA-04030</li> </ul> QA-24 ORA-07445 [000000010112A75C] during import. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:35:57 +0000 (UTC) Sun, 16 Sep 2007 16:26:20 +0000 (UTC) 0 Operating System Operating System Version 5.8 Product Version 8.1.7.2.0 [QA-23] ORA-00600 [15851] while creating unique index. http://jira.ubtools.com/jira/browse/QA-23 ORA-00600 <span class="error">&#91;15851&#93;</span> while creating unique index. QA-23 ORA-00600 [15851] while creating unique index. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:29:05 +0000 (UTC) Sun, 16 Sep 2007 16:26:35 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [15851], [8], [8], [1], [2], [], [], [] </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>srsqb1nx(). </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>Most probably, this is a sort problem while creating index.</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>Metalink Note:1032586.6 ORA-600 <span class="error">&#91;15851&#93;</span></p> Operating System Operating System Version 4.0 Product Version 7.3.4.0.0 [QA-22] ORA-00600 [13004] while creating index. http://jira.ubtools.com/jira/browse/QA-22 ORA-00600 <span class="error">&#91;13004&#93;</span> while creating index. QA-22 ORA-00600 [13004] while creating index. 
Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:25:20 +0000 (UTC) Sun, 16 Sep 2007 16:26:49 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>ORA-00600: internal error code, arguments: [13004], [], [], [], [], [], [], [] </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>kkrirop(). This is a function of Oracle Compilation Layer. </pre> </div></div> <h4><a name="Bug%3A"></a>Bug:</h4> <p>Bug:994802 CREATE INDEX RESULTS IN ORA-600 <span class="error">&#91;13004&#93;</span></p> Operating System Operating System Version 4.0 Product Version 7.3.4.0.0 [QA-21] ORA-07445 [11]: SMON crashed. http://jira.ubtools.com/jira/browse/QA-21 ORA-07445 <span class="error">&#91;11&#93;</span>: SMON crashed. QA-21 ORA-07445 [11]: SMON crashed. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:19:24 +0000 (UTC) Sun, 16 Sep 2007 16:27:05 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ORA-07445: exception encountered: core dump [11] [3221212616] [240] [0] [] [] </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> kdb4_dup_keys(). This is a function of Oracle Data Layer. 
</pre> </div></div> <h4><a name="Cursordump%3A"></a>Cursor dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ******************** Cursor Dump ************************ Current cursor: 1, pgadep: 1 pgactx: c00000014e736d90 ctxcbk: c00000014e776720 ctxqbc: 0 ctxrws: c00000014e7253c8 Cursor Dump: ---------------------------------------- Cursor 1 (80000001000befe8): CURBOUND curiob: 80000001000c1358 curflg: 5 curpar: 0 curusr: 0 curses c00000012c18a070 cursor name: delete from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4 child pin: c00000013511e670, child lock: c00000013511d630, parent lock: c00000013511d6a0 xscflg: 20100466, parent handle: c00000014e748b20, xscfl2: 5100400 nxt: 3.0x00000560 nxt: 2.0x000005e0 nxt: 1.0x000005e0 Cursor frame allocation dump: frm: -------- Comment -------- Size Seg Off bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0 bfp=80000001000d2470 bln=22 avl=02 flg=05 value=1 bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0 bfp=80000001000d2440 bln=24 avl=02 flg=05 value=2 bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0 bfp=80000001000d2410 bln=24 avl=02 flg=05 value=2 bind 3: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0 bfp=80000001000d23e0 bln=24 avl=02 flg=05 value=59 End of cursor dump </pre> </div></div> <h4><a name="Recomendation%3A"></a>Recommendation:</h4> <p>Check whether sys.uet$ is corrupted.</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>Bug:2106455 SMON CRASHES WITH ORA-07445 IN KDB4_DUP_KEYS</p> Operating System Operating System Version B.11.00 Product Version 8.1.7.1.0 [QA-20] ORA-00600 [723]: Memory leak in LGWR. http://jira.ubtools.com/jira/browse/QA-20 ORA-00600 <span class="error">&#91;723&#93;</span>: Memory leak in LGWR. QA-20 ORA-00600 [723]: Memory leak in LGWR. 
Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:11:21 +0000 (UTC) Sun, 16 Sep 2007 16:27:20 +0000 (UTC) 0 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ORA-00600: internal error code, arguments: [723], [5200], [5200], [memory leak], [], [], [], [] </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ksmdpg() Deallocate variable PGA. Just free top PGA heap, the callback will free the extents to the OSD. Ref: Bug:1283286 </pre> </div></div> <h4><a name="Processstate%3A"></a>Process state:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> PROCESS STATE ------------- Process global information: process: 0, call: 0, xact: 0, curses: 0, usrses: 0 No process is allocated. END OF PROCESS STATE </pre> </div></div> <h4><a name="PGAHeapdump%3A"></a>PGA Heap dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ******** ERROR: PGA memory leak detected 5200 &gt; 3616 ******** ****************************************************** HEAP DUMP heap name="pga heap" desc=0x40003190 extent sz=0x2148 alt=40 het=32767 rec=0 flg=3 opc=3 parent=0 owner=0 nex=0 xsz=0x2148 EXTENT 0 Chunk 400de8e0 sz= 4432 free " " Chunk 400dfa30 sz= 256 freeable "LGWR PIC bds ar" Chunk 400dfb30 sz= 896 freeable "LGWR PIC ins ar" Chunk 400dfeb0 sz= 896 freeable "LGWR PIC ins ar" Chunk 400e0230 sz= 568 free " " Chunk 400e0468 sz= 896 freeable "LGWR PIC ins ar" Chunk 400e07e8 sz= 568 free " " EXTENT 1 . 
</pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>As seen above and included in your trace, the memory class of some chunks is "LGWR PIC ins ar" or similar. Note that their sizes sum to 5200 bytes, and that they are freeable chunks. These chunks are leaked.</p> <p>Also, there is no allocated process for LGWR. Most probably, you were shutting down the database.</p> <h4><a name="Workaround%3A"></a>Workaround:</h4> <p>There is no data corruption with this error, and it can be safely ignored for small memory leaks by adding the following event to init.ora:</p> <ul class="alternate" type="square"> <li>event = "10262 trace name context forever, level 6000"</li> </ul> <p>Then, restart your database. This event suppresses the error for space leaks smaller than 6000 bytes.</p> <p>You can see the details at Metalink Note:39308.1 ORA-600 <span class="error">&#91;723&#93;</span> "PGA memory leak"</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>Bug:1125724 ORA-600<span class="error">&#91;723&#93;</span> DURING SHUTDOWN</p> Operating System Operating System Version B.11.00 Product Version 8.1.6.0.0 [QA-19] ORA-00600 [2845] in UPDATE. WRONG ROWID VALUE. 
http://jira.ubtools.com/jira/browse/QA-19 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ORA-00600: internal error code, arguments: [2845], [0], [50], [39314], [], [], [], [] </pre> </div></div> <h4><a name="CurrentSQLstatementforthissession%3A"></a>Current SQL statement for this session:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> update pers_auth_str_tbl set asgn_str=:b1 where rowid=:b2 </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> kcfrbd() This is a function of Oracle's Cache Layer. </pre> </div></div> <h4><a name="Valuesofbindvariables%3A"></a>Values of bind variables:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> :b1 = 0 :b2 = "9992" </pre> </div></div> <h4><a name="Datatypesofbindvariables%3A"></a>Data types of bind variables:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> :b1 : Number :b2 : Varchar2 </pre> </div></div> <h4><a name="Problemexplanation%3A"></a>Problem explanation:</h4> <p>As you can see, your ROWID value in :b2 is "9992". This is incorrect. The ROWID format in Oracle7 is 'BBBBBBBB.SSSS.FFFF' (Block.Slot.File).</p> <h4><a name="Bug%3A"></a>Bug:</h4> <p>Check Bug#632396. This bug says:</p> <ul class="alternate" type="square"> <li>The correct behaviour is to return an "Invalid Rowid" message.</li> </ul> <h4><a name="Recomendation%3A"></a>Recommendation:</h4> <p>Use the proper datatype for the bind variable.</p> QA-19 ORA-00600 [2845] in UPDATE. WRONG ROWID VALUE. 
Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:08:53 +0000 (UTC) Sun, 16 Sep 2007 16:27:35 +0000 (UTC) 0 Operating System Operating System Version B.10.20 Product Version 7.3.2.3.0 [QA-18] ORA-00600 [6033] in SELECT. http://jira.ubtools.com/jira/browse/QA-18 <h4><a name="Errorcode%3A"></a>Error code:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ORA-00600: internal error code, arguments: [6033], [], [], [], [], [], [], [] </pre> </div></div> <h4><a name="CurrentSQLstatementforthissession%3A"></a>Current SQL statement for this session:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> SELECT * FROM CSD_BOUNCE_CONTENT_BODY WHERE BOUNCE_CONTENT_ID = :b1 </pre> </div></div> <h4><a name="Oraclekernelfunctionfromwhichtheproblemisraised%3A"></a>Oracle kernel function from which the problem is raised:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> kdifxs() This is a function of Oracle Data Layer and responsible for fetching a row in an index scan. </pre> </div></div> <h4><a name="Leafblockdump%3A"></a>Leaf block dump:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> Leaf block dump =============== header address 2567438500=0x990800a4 kdxcolev 0 kdxcolok 0 kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y kdxconco 2 kdxcosdc 0 kdxconro 663 kdxcofbo 1410=0x582 kdxcofeo 4421=0x1145 kdxcoavs 8687 kdxlespl 0 kdxlende 0 kdxlenxt 0=0x0 kdxleprv 62923132=0x3c0217c kdxledsz 0 kdxlebksz 16152 kdxlepnro 11 kdxlepnco 1 ... </pre> </div></div> <p>This is an index object# 0x1ec41. As seen above, kdxcoopc is 0xa0. That means, this index is a key compressed V8 B*Tree index. Also, kdxledsz is 0. 
In other words, this index is a non-unique index.</p> <h4><a name="Recommendations%3A"></a>Recommendations:</h4> <p>Check your table by the following statement against any possible corruption:</p> <ul class="alternate" type="square"> <li>SQL &gt; analyze table CSD_BOUNCE_CONTENT_BODY validate structure cascade;</li> </ul> <p>If no corruption is detected, please see the following bugs:</p> <ul class="alternate" type="square"> <li>ORA-600 <span class="error">&#91;6033&#93;</span> DURING WORK FLOW ORDER IMPORT PROCESS</li> </ul> <ul class="alternate" type="square"> <li>ORA-600 <span class="error">&#91;711&#93;</span>, <span class="error">&#91;1&#93;</span>, <span class="error">&#91;0X2EDFE84&#93;</span> <span class="error">&#91;KDIFXS - PREFIX CONTEXT&#93;</span> WITH COMPRESSED INDEX</li> </ul> QA-18 ORA-00600 [6033] in SELECT. Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 17:02:05 +0000 (UTC) Sun, 16 Sep 2007 16:27:49 +0000 (UTC) 0 Operating System Operating System Version 5.8 Product Version 8.1.7.0.0 [QA-17] Which parameters affect CBO ? http://jira.ubtools.com/jira/browse/QA-17 Which parameters affect CBO ? QA-17 Which parameters affect CBO ? Oracle - SQL Tuning Major Closed Answered ubTools Admin ubTools Support Sun, 15 Jul 2007 14:35:07 +0000 (UTC) Sun, 16 Sep 2007 16:28:03 +0000 (UTC) 0 The parameters affecting CBO are included in Event 10053 trace files. <p>Sample:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> SQL&gt; alter session set events '10053 trace name context forever, level 1'; Session altered. SQL&gt; select f1 from test10053 where f1=23; F1 ---------- 23 </pre> </div></div> <p>Trace file is generated under USER_DUMP_DEST. 
Here is an excerpt from the trace file:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> *** 2005-01-10 13:09:03.010 *** SESSION ID:(8.1072) 2005-01-10 13:09:03.008 QUERY select f1 from test10053 where f1=23 *************************************** PARAMETERS USED BY THE OPTIMIZER ******************************** OPTIMIZER_FEATURES_ENABLE = 9.2.0 OPTIMIZER_MODE/GOAL = Choose _OPTIMIZER_PERCENT_PARALLEL = 101 HASH_AREA_SIZE = 1048576 HASH_JOIN_ENABLED = TRUE HASH_MULTIBLOCK_IO_COUNT = 0 SORT_AREA_SIZE = 524288 OPTIMIZER_SEARCH_LIMIT = 5 PARTITION_VIEW_ENABLED = FALSE _ALWAYS_STAR_TRANSFORMATION = FALSE _B_TREE_BITMAP_PLANS = TRUE STAR_TRANSFORMATION_ENABLED = FALSE _COMPLEX_VIEW_MERGING = TRUE _PUSH_JOIN_PREDICATE = TRUE PARALLEL_BROADCAST_ENABLED = TRUE OPTIMIZER_MAX_PERMUTATIONS = 2000 OPTIMIZER_INDEX_CACHING = 0 _SYSTEM_INDEX_CACHING = 0 OPTIMIZER_INDEX_COST_ADJ = 100 OPTIMIZER_DYNAMIC_SAMPLING = 1 _OPTIMIZER_DYN_SMP_BLKS = 32 QUERY_REWRITE_ENABLED = FALSE QUERY_REWRITE_INTEGRITY = ENFORCED _INDEX_JOIN_ENABLED = TRUE _SORT_ELIMINATION_COST_RATIO = 0 _OR_EXPAND_NVL_PREDICATE = TRUE _NEW_INITIAL_JOIN_ORDERS = TRUE ALWAYS_ANTI_JOIN = CHOOSE ALWAYS_SEMI_JOIN = CHOOSE _OPTIMIZER_MODE_FORCE = TRUE _OPTIMIZER_UNDO_CHANGES = FALSE _UNNEST_SUBQUERY = TRUE _PUSH_JOIN_UNION_VIEW = TRUE _FAST_FULL_SCAN_ENABLED = TRUE _OPTIM_ENHANCE_NNULL_DETECTION = TRUE _ORDERED_NESTED_LOOP = TRUE _NESTED_LOOP_FUDGE = 100 _NO_OR_EXPANSION = FALSE _QUERY_COST_REWRITE = TRUE QUERY_REWRITE_EXPRESSION = TRUE _IMPROVED_ROW_LENGTH_ENABLED = TRUE _USE_NOSEGMENT_INDEXES = FALSE _ENABLE_TYPE_DEP_SELECTIVITY = TRUE _IMPROVED_OUTERJOIN_CARD = TRUE _OPTIMIZER_ADJUST_FOR_NULLS = TRUE _OPTIMIZER_CHOOSE_PERMUTATION = 0 _USE_COLUMN_STATS_FOR_FUNCTION = TRUE _SUBQUERY_PRUNING_ENABLED = TRUE _SUBQUERY_PRUNING_REDUCTION_FACTOR = 50 _SUBQUERY_PRUNING_COST_FACTOR = 20 _LIKE_WITH_BIND_AS_EQUALITY = FALSE _TABLE_SCAN_COST_PLUS_ONE = TRUE 
_SORTMERGE_INEQUALITY_JOIN_OFF = FALSE _DEFAULT_NON_EQUALITY_SEL_CHECK = TRUE _ONESIDE_COLSTAT_FOR_EQUIJOINS = TRUE _OPTIMIZER_COST_MODEL = CHOOSE _GSETS_ALWAYS_USE_TEMPTABLES = FALSE DB_FILE_MULTIBLOCK_READ_COUNT = 16 _NEW_SORT_COST_ESTIMATE = TRUE _GS_ANTI_SEMI_JOIN_ALLOWED = TRUE _CPU_TO_IO = 0 _PRED_MOVE_AROUND = TRUE *************************************** BASE STATISTICAL INFORMATION *********************** Table stats Table: TEST10053 Alias: TEST10053 TOTAL :: CDN: 1000 NBLKS: 2 AVG_ROW_LEN: 7 -- Index stats INDEX NAME: I_TEST10053 COL#: 1 TOTAL :: LVLS: 1 #LB: 3 #DK: 1000 LB/K: 1 DB/K: 1 CLUF: 2 _OPTIMIZER_PERCENT_PARALLEL = 0 *************************************** SINGLE TABLE ACCESS PATH Column: F1 Col#: 1 Table: TEST10053 Alias: TEST10053 NDV: 1000 NULLS: 0 DENS: 1.0000e-03 LO: 1 HI: 1000 NO HISTOGRAM: #BKT: 1 #VAL: 2 TABLE: TEST10053 ORIG CDN: 1000 ROUNDED CDN: 1 CMPTD CDN: 1 Access path: tsc Resc: 2 Resp: 2 Access path: index (iff) Index: I_TEST10053 TABLE: TEST10053 RSC_CPU: 0 RSC_IO: 2 IX_SEL: 0.0000e+00 TB_SEL: 1.0000e+00 Access path: iff Resc: 2 Resp: 2 Access path: index (equal) Index: I_TEST10053 TABLE: TEST10053 RSC_CPU: 0 RSC_IO: 1 IX_SEL: 0.0000e+00 TB_SEL: 1.0000e-03 BEST_CST: 1.00 PATH: 4 Degree: 1 *************************************** OPTIMIZER STATISTICS AND COMPUTATIONS *************************************** GENERAL PLANS *********************** Join order[1]: TEST10053 [TEST10053] Best so far: TABLE#: 0 CST: 1 CDN: 1 BYTES: 3 Final: CST: 1 CDN: 1 RSC: 1 RSP: 1 BYTES: 3 IO-RSC: 1 IO-RSP: 1 CPU-RSC: 0 CPU-RSP: 0 </pre> </div></div> <p>Warnings:</p> <ul class="alternate" type="square"> <li>To generate Event 10053 data, statement must be HARD PARSED.</li> <li>RBO doesn't generate Event 10053 data.</li> </ul> Operating System Product Version Generic [QA-16] Does commit cause checkpoint ? http://jira.ubtools.com/jira/browse/QA-16 Does commit cause checkpoint ? QA-16 Does commit cause checkpoint ? 
Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 14:33:36 +0000 (UTC) Sun, 16 Sep 2007 16:28:22 +0000 (UTC) 0 A commit does not itself cause a checkpoint. If it did, there would be no need for redo logging: the redo log exists so that a commit can be made durable without writing the changed data blocks to disk. <p>K Gopalakrishnan has published an article on this subject in Oracle Internals magazine. Gopal explains the relationship and differences between commit-SCN and SCN.</p> Operating System Product Version Generic [QA-15] SQ enqueue problem. http://jira.ubtools.com/jira/browse/QA-15 Other than SYSDBA, no new connections are allowed to the database. QA-15 SQ enqueue problem. Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 14:18:33 +0000 (UTC) Sun, 16 Sep 2007 16:28:38 +0000 (UTC) 0 <h4><a name="AnexcerptfromSYSTEMSTATEdump%3A"></a>An excerpt from SYSTEMSTATE dump:</h4> <p>The SYSTEMSTATE dumps were generated in USER_DUMP_DEST as follows:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
SQL&gt; connect / as sysdba
SQL&gt; alter session set max_dump_file_size=UNLIMITED;
SQL&gt; alter session set events 'IMMEDIATE trace name SYSTEMSTATE level 10';
-- 2 or 3 minutes later
SQL&gt; alter session set events 'IMMEDIATE trace name SYSTEMSTATE level 10';
</pre> </div></div> <p>Many sessions in the SYSTEMSTATE dump are waiting for enq: SQ - contention, as below:</p> <blockquote> <p>(session) trans: 0, creator: 41b5c31f8, flag: (e1) USR/- BSY/-/-/-/-/-<br/>
DID: 0001-0057-00000021, short-term DID: 0000-0000-00000000<br/>
txn branch: 0<br/>
oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS<br/>
O/S info: user: , term: , ospid: , machine:<br/>
program:<br/>
<font color="red">waiting for 'enq: SQ - contention' blocking sess=0x4234f5b68</font> seq=1 wait_time=0<br/>
name|mode=53510006, object #=8e, 0=0<br/>
Dumping Session Wait History<br/>
for 'enq: SQ - contention' count=1 wait_time=3007641<br/> 
name|mode=53510006, object #=8e, 0=0<br/>
for 'enq: SQ - contention' count=1 wait_time=3007783<br/>
...<br/>
SO: 40e830a70, type: 54, owner: 4234e5f20, flag: INIT/-/-/0x00<br/>
LIBRARY OBJECT LOCK: lock=40e830a70 handle=41985e8b8 mode=N<br/>
call pin=41a6d2740 session pin=0 hpc=0000 hlc=0000<br/>
htl=40e830ae0[40e8308e8,40e6833d8] htb=40e8d2cc8<br/>
user=4234e5f20 session=4234e5f20 count=1 flags=PNC/[0400] savepoint=2<br/>
LIBRARY OBJECT HANDLE: handle=41985e8b8<br/>
<font color="red">name=SYS.AUDSES$</font><br/>
hash=deaeba50687d3d62c586aafe9b84f98c timestamp=07-20-2004 15:34:57<br/>
namespace=TABL flags=KGHP/TIM/SML/[02000000]<br/>
kkkk-dddd-llll=0000-0001-0001 lock=N pin=S latch#=34 hpc=022e hlc=022e</p></blockquote> <p>The blocking session (0x4234f5b68):</p> <blockquote> <p> <font color="red">SO: 4234f5b68 </font>, type: 4, owner: 41b5c8c60, flag: INIT/-/-/0x00<br/>
(session) trans: 41cde12f8, creator: 41b5c8c60, flag: (e1) USR/- BSY/-/-/-/-/-<br/>
DID: 0001-0056-00000020, short-term DID: 0000-0000-00000000<br/>
txn branch: 0<br/>
oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS<br/>
O/S info: user: , term: , ospid: , machine:<br/>
program:<br/>
<font color="red">waiting for 'gc cr request'</font> blocking sess=0x0 seq=2 wait_time=0<br/>
file#=1, block#=1ea, class#=1<br/>
Dumping Session Wait History<br/>
for 'gc cr request' count=1 wait_time=1230415<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1230287<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1220829<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1230501<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1230312<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1230295<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr 
request' count=1 wait_time=1230618<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1230445<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1229421<br/>
file#=1, block#=1ea, class#=1<br/>
for 'gc cr request' count=1 wait_time=1231336<br/>
file#=1, block#=1ea, class#=1<br/>
temporary object counter: 0</p></blockquote> <h4><a name="Probleminterpretation%3A"></a>Problem interpretation:</h4> <p>The blocking session was waiting on the RAC-related wait event <em>gc cr request</em>, repeatedly for the same file# and block#. Unfortunately, no SYSTEMSTATE dumps were generated on the other nodes at the time of the problem. So, it was not possible to identify the root blocker on the other node or to find out why it held the same buffer for so long.</p> <p>The sessions were waiting for the SQ enqueue on the SYS.AUDSES$ sequence. During connection, the value of V$SESSION.AUDSID is obtained from the SYS.AUDSES$ sequence. SYSDBA connections don't use this sequence, so they were not blocked.</p> <h4><a name="Solution%3A"></a>Solution:</h4> <p>The default cache size of SYS.AUDSES$ was 20. It has been increased to 1000.</p> Operating System Product Version 10.1.0.3 [QA-14] is the current CPU breakdown formula correct ? http://jira.ubtools.com/jira/browse/QA-14 is the current CPU breakdown formula correct ? <blockquote><p><font color="red"> CPU used by this session = parse time cpu + recursive cpu usage + others</font> </p></blockquote> QA-14 is the current CPU breakdown formula correct ? Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 13:49:04 +0000 (UTC) Wed, 19 Sep 2007 12:13:26 +0000 (UTC) 0 <h4><a name="Answer%3A"></a>Answer:</h4> <p>This is the best-known formula, and I've read it in many Oracle documents, but it is wrong.</p> <p><em>parse time cpu</em> includes the parse CPU time of both recursive and user statements. 
<em>recursive cpu usage</em> includes both the parse CPU time and the non-parse CPU time of recursive statements. That means the parse CPU usage of recursive statements is included in both <em>parse time cpu</em> and <em>recursive cpu usage</em>. In other words, it is counted twice, and the formula above is not correct.</p> <p>ubTools offers the following formula:</p> <blockquote><p><font color="red"> CPU used by this session = parse time cpu + others(exec_and_fetch_time_cpu)</font> </p></blockquote> <h4><a name="Question%3A"></a>Question:</h4> <p>If there is little or no SQL processing done within PL/SQL, should I also subtract <em>recursive cpu usage</em> from <em>CPU used by this session</em> to get the others cpu component ?</p> <h4><a name="Answer%3A"></a>Answer:</h4> <p>NO. A formula should explain all cases. It should not work for only some scenarios.</p> <p>Also, both SQL statements and statements in PL/SQL are associated with a cursor internally, from Oracle's perspective. In other words, they are not treated differently in PARSE, EXEC and FETCH calls. If a statement is called by another statement, it's called a recursive statement. So, both an SQL and a PL/SQL statement can be recursive statements.</p> <ul class="alternate" type="square"> <li>If there is no SQL processing in PL/SQL, it means there is no SQL in the parent and child PL/SQLs. There are 2 scenarios for this case: <ul class="alternate" type="square"> <li>If there is no child PL/SQL in the parent PL/SQL, <em>recursive cpu usage</em> is ZERO. Since it's zero, there is no need to subtract it from <em>CPU used by this session</em>.</li> <li>If there is a child PL/SQL in the parent PL/SQL, the PARSE call is done for the child PL/SQL in recursive mode. In this case, <em>parse time cpu</em> of the recursive statement is already included in <em>recursive cpu usage</em>. 
So, <em>recursive cpu usage</em> should not be subtracted from <em>CPU used by this session</em>.</li> </ul> </li> </ul> <ul class="alternate" type="square"> <li>If there is little SQL processing in PL/SQL with little <em>parse time cpu</em>, the distortion in the wrong formula mentioned above is small. But <em>recursive cpu usage</em> should still NOT be subtracted from <em>CPU used by this session</em>, even if the distortion is small. Why should DBAs subtract it if they have a more correct formula? There is no need.</li> </ul> <h4><a name="Recommendation%3A"></a>Recommendation:</h4> <p>The current Response Time Performance Analysis (RTA) implementations are not correct. RTA has not reached its next level yet. That's why ubTools offered a new technique, <em>Microstate Response-time Performance Profiling (MRPP)</em>.</p> <p>A question on this topic, referring to ubTools, has been asked at <span class="nobr"><a href="http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:7641015793792">Tom Kyte's site</a></span>:</p> <h4><a name="Question%3A"></a>Question:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> Tom, Just wanted to: what exactly is "CPU used by this session". One site( http://www.ubtools.com/cgi-bin/ib/ikonboard.cgi?act=ST;f=25;t=4 says &lt;&gt; CPU used by this session = parse time cpu + recursive cpu usage + others This is the most well-known, but wrong formula I've read in many Oracle documentations. parse time cpu includes parse cpu time of both recursive and user statements. recursive cpu usage includes both parse cpu time and non-parse cpu of recursive statements. That means parse cpu usage of recursive statements is included in both parse time cpu and recursive cpu usage. 
In other words, it's duplicated and formula above is not correct. ubTools offers the following formula: CPU used by this session = parse time cpu + others(exec_and_fetch_time_cpu) &lt;&gt; what is exec_and_fetch_time_cpu ? Regards </pre> </div></div> <h4><a name="TomKyte%27sanswer%3A"></a>Tom Kyte's answer:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> I am not so sure they are correct. unless they are talking about the description of cpu used by this session (i is not clear to me whether they are saying "the description is wrong" or "the value reported by the statistic is wrong" if the values were wrong, the cpu times reported for most things would exceed elapsed time by large margins. so, they should be able to demonstrate that for us. (and you would have to ask the author of an article in most cases "what did you mean by this "exec and fetch time cpu" and how exactly do you think we could find it) I think they were saying "the description provided is wrong", but I have an easier description. cpu use by this session is cpu used by that session. </pre> </div></div> <h4><a name="Ouranswer%3A"></a>Our answer:</h4> <p>We said that the current CPU breakdown formula is incorrect, not the description of the Oracle statistics.</p> <p><em>CPU used by this session</em> is the total CPU usage at the session or instance level. CPU usage has 3 components:</p> <ul class="alternate" type="square"> <li>Parse</li> <li>Exec</li> <li>Fetch</li> </ul> <p>These components can be seen in SQL_TRACE / Event 10046 traces. The Parse component is available from the <em>parse time cpu</em> statistic. Since there is no Oracle statistic for the Exec and Fetch components, we call them <em>others</em>.</p> <p>We did not mention the values of the CPU usage statistics in this discussion. Discussing the values opens another, separate topic. 
Here is a brief explanation:</p> <ul class="alternate" type="square"> <li>In busy environments, the distortion in CPU measurement is minimal. In non-busy environments, it may not be minimal. In many cases, there is no big performance problem in non-busy environments, so the distortion in CPU usage does not matter much there.</li> <li>The wait measurement includes serious distortions in busy environments.</li> </ul> <p>ubTools has said for years that <font color="red"> RESPONSE TIME ANALYSIS (RTA) CANNOT BE IMPLEMENTED AT INSTANCE LEVEL. RTA IS A METHOD FOR SESSION LEVEL.</font> </p> <p>For the full details with proven samples, see <em>Microstate Response-time Performance Profiling (MRPP).</em></p> Operating System Product Version Generic [QA-13] Dumping a stack trace is too slow in 10g. http://jira.ubtools.com/jira/browse/QA-13 <h4><a name="Errorcode%3A"></a>Error code:</h4> <p>ORA-4031 (may not be seen by end users).</p> <h4><a name="Errorcodedefinition%3A"></a>Error code definition:</h4> <p>The CPU usage reaches 98% in <b>KERNEL</b> mode. 
The <em>strace</em> utility on Linux reports that the process spins on the <b>read()</b> system call.</p> <h4><a name="Systemcalls%3A"></a>System calls:</h4> <p>An excerpt from <em>strace -p &lt;OSPID&gt;</em> output:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
read(29, "&lt;\345&amp;\0\1\0\21\0\330Wk\10\0\0\0@\30\0\0\0\0\0\0\0", 24) = 24
read(29, "I\345&amp;\0\1\0\22\0\230\226\217\10\0\0\0@\30\0\0\0\0\0\0"..., 24) = 24
read(29, "\20\1\0\0\1\0\24\0\260\n3\0\0\0\0`\20\0\0\0\0\0\0\0", 24) = 24
read(29, "\34\1\0\0\1\0\34\0`\200\353\0\0\0\0`\10\0\0\0\0\0\0\0", 24) = 24
read(29, "\'\1\0\0\1\0\34\0h\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24
read(29, "3\1\0\0\1\0\34\0l\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24
read(29, "V\345&amp;\0\4\0\361\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 24) = 24
read(29, "`\345&amp;\0\1\0\f\0\300XF\4\0\0\0@\4\0\0\0\0\0\0\0", 24) = 24
read(29, "k\345&amp;\0\1\0\21\0\20Xk\10\0\0\0@\30\0\0\0\0\0\0\0", 24) = 24
read(29, "y\345&amp;\0\1\0\22\0\260\226\217\10\0\0\0@\30\0\0\0\0\0\0"..., 24) = 24
read(29, "\20\1\0\0\1\0\24\0\300\n3\0\0\0\0`\20\0\0\0\0\0\0\0", 24) = 24
read(29, "\34\1\0\0\1\0\34\0p\200\353\0\0\0\0`\10\0\0\0\0\0\0\0", 24) = 24
read(29, "\'\1\0\0\1\0\34\0x\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24
read(29, "3\1\0\0\1\0\34\0|\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24
</pre> </div></div> <p>The file descriptor is 29 (the first argument of the read() system call). According to the Linux <em>lsof</em> command, descriptor 29 is 1081732 <em>/oracle/product/10.1.0/bin/oracle</em>. 
In other words, the process is reading the Oracle executable.</p> <p><em>strace -c -p &lt;OSPID&gt;</em> output for 1 minute:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00   21.570914          40    543060           read
  0.00    0.000016           2         7           lseek
  0.00    0.000003           3         1           getpid
  0.00    0.000001           1         1           open
  0.00    0.000001           1         1           readlink
------ ----------- ----------- --------- --------- ----------------
100.00   21.570935                543070           total
</pre> </div></div> <p>The <em>read()</em> system call was called 543,060 times in one minute (about 9,000 calls per second). That's why CPU utilization in KERNEL mode is high.</p> <p>An excerpt from the stack trace taken with an OS debugger:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
#0 0x200000000137fa81 in read () from /lib/tls/libpthread.so.0
#1 0x4000000004d7a7e0 in sskgds_getsnm ()
#2 0x4000000002766160 in skdsttpcs ()
#3 0x4000000001131920 in ksedst ()
#4 0x40000000011b0140 in ksm_4031_dump ()
</pre> </div></div> <p>Oracle's <em>ksm_4031_dump()</em> function dumps ORA-4031 traces. The top of the stack is the <em>read()</em> system call.</p> <h4><a name="Probleminterpretation%3A"></a>Problem interpretation:</h4> <p>The process gets an ORA-4031 error, then tries to dump a trace file for this error. But while dumping the trace, it spins on <em>read()</em> system calls.</p> <h4><a name="Workaround%3A"></a>Workaround:</h4> <p>Set the following parameters in the pfile/spfile:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
_4031_dump_bitvec = 0
_4031_max_dumps = 0
</pre> </div></div> <h4><a name="Bug%3A"></a>Bug:</h4> <p>Ref: Oracle Note:3964602 DUMPING A CALL STACK TRACE IS SLOW.</p> QA-13 Dumping a stack trace is too slow in 10g. 
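<p>For QA-13, the per-second <em>read()</em> rate can be derived from the <em>strace -c</em> summary with a small script. The following is a minimal sketch, not part of the original answer: the function name <em>parse_strace_summary</em> and the constant names are hypothetical, and the 60-second window is taken from the "for 1 minute" note in the answer.</p>

```python
def parse_strace_summary(text):
    """Return {syscall: calls} from the data rows of an `strace -c` summary."""
    calls = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have at least 5 fields:
        # % time, seconds, usecs/call, calls, [errors,] syscall.
        # The "total" row is skipped explicitly; the header and the dashed
        # rules are skipped below because their first field is not a number.
        if len(fields) < 5 or fields[-1] == "total":
            continue
        try:
            float(fields[0])
            calls[fields[-1]] = int(fields[3])
        except ValueError:
            continue
    return calls


# Summary copied from the QA-13 answer.
SUMMARY = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00   21.570914          40    543060           read
  0.00    0.000016           2         7           lseek
  0.00    0.000003           3         1           getpid
  0.00    0.000001           1         1           open
  0.00    0.000001           1         1           readlink
------ ----------- ----------- --------- --------- ----------------
100.00   21.570935                543070           total
"""

SAMPLE_SECONDS = 60  # assumption: strace ran for exactly one minute

calls = parse_strace_summary(SUMMARY)
read_rate = calls["read"] / SAMPLE_SECONDS
read_share = calls["read"] / sum(calls.values())
print(f"read(): {read_rate:.0f} calls/s, {read_share:.2%} of all system calls")
```

<p>A rate of this size for a single system call on a single descriptor is the pattern to look for when a process spins in KERNEL mode.</p>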
Oracle - Database Tuning Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 13:45:31 +0000 (UTC) Sun, 16 Sep 2007 16:29:13 +0000 (UTC) 0 Operating System Product Version 10.1.0.2 [QA-12] Do read()/write() system calls block users in physical IO ? http://jira.ubtools.com/jira/browse/QA-12 do read()/write() system calls block users until physical IO to disk is completed ? QA-12 Do read()/write() system calls block users in physical IO ? Oracle - Operating System Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 13:34:33 +0000 (UTC) Sun, 16 Sep 2007 16:29:29 +0000 (UTC) 0 There is a common misconception that read()/write() system calls block users until physical IO to disk is completed. <p>read()/write() system calls do not block users during physical IO unless the file is opened with the O_DIRECT or O_SYNC flags. Users are blocked only while buffers are copied between user address space and kernel address space. A write() returns once the data is in the kernel's buffer cache, and the kernel flushes it to disk later; a read() returns without physical IO when the data is already in the buffer cache. So, although read()/write() calls look synchronous from the user's perspective, they don't perform physical IO synchronously.</p> <p>With asynchronous IO calls (e.g. aio_read()/aio_write()), users are blocked only while the IO requests are enqueued, not while buffers are copied between user address space and kernel address space, and not during physical IO.</p> Operating System Product Version ??? [QA-11] How to see the tasks of Oracle background processes ? http://jira.ubtools.com/jira/browse/QA-11 How to see the tasks of Oracle background processes ? QA-11 How to see the tasks of Oracle background processes ? 
Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 13:26:17 +0000 (UTC) Sun, 16 Sep 2007 16:29:47 +0000 (UTC) 0 <h4><a name="Answer%3A"></a>Answer:</h4> <p>Use the following query:</p> <p>select substr(DEST,1,10) DEST, DESCRIPTION from x$messages order by DEST;</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> DEST DESCRIPTION ---------- ---------------------------------------------------------------- * Monitor Cleanup * KSB action for X-instance calls * generic shutdown background * Scumnt mount lock * database close in progress * Poll system events broadcast channel * svr actn for shrd grp reg/dereg ARB* ASM to slave BG msg ARC* Archiver wakeup ARCH Archiver Message ARCH Archiver shutdown DEST DESCRIPTION ---------- ---------------------------------------------------------------- CJQ* Shutdown Job Queue Process CJQ* Job Queue Interupt CJQ* Job Queue Interupt CJQ* Job Queue Interupt CJQ* Job Queue Timout CJQ0 Check for async messages from other instances CJQ0 Coordinator send broadcast timeout CKPT create/scrub cmon foregrounds CKPT perform RM action in CKPT CKPT identify control file CKPT close control file DEST DESCRIPTION ---------- ---------------------------------------------------------------- CKPT release (XR,4,0) enqueue CKPT CKPT stat update timeout action CKPT CKPT reuse call completion action CKPT CKPT reuse range call continuation CKPT CKPT reuse call continuation CKPT refresh control file CKPT check for parameters from other instances CKPT start background CKPT CPU dynamic reconfiguration CKPT check for quiesce messages CKPT unquiesce the instance during database close DEST DESCRIPTION ---------- ---------------------------------------------------------------- CKPT unsubscribe to quiesce channel CKPT subscribe to quiesce channel CKPT Get Proxy Lock CKPT Db Checkpt Compl check CKPT Db Checkpt Request check CKPT update recovery-based i/o statistics 
CKPT Compile Environment Monitor CKPT SQL Memory Management Calculation CKPT free PX memory chunks in background CKPT KKX: drop ncomp dll action CKPT Flashback barrier DEST DESCRIPTION ---------- ---------------------------------------------------------------- CKPT hold alert level CKPT recovery area alert action CKPT start change tracking in ckpt CKPT get (XR,4,0) enqueue CKPT sense a heartbeat CKPT set heartbeat sensing CKPT emulate i/o errors on a disk CKPT timeout CKPT Run self test on group CKPT asynchronously dismount disk group CKPT dismount disk group DEST DESCRIPTION ---------- ---------------------------------------------------------------- CKPT query disk group status CKPT check disk status CKPT update disk status CKPT update disk group status CKPT kfc CKPT dismount disk group CKPT kfc CKPT mount disk group CTWR change tracking message CTWR change tracking timeout action DBW* hardware clock went backwards DBW* DBWR write buffers DBW* get/release open thread enqueue DEST DESCRIPTION ---------- ---------------------------------------------------------------- DBW* mount/dismount all db files DBW0 SGA memory tuning parameter update - DBW0 DBW0 Db mount lock DBW0 kfcb Poke DBW0 DBW0 kfc mount disk group DBW0 kfc dismount disk group DBW0 kfc invalidate file extent DBW0 Reserve lock name space lock DBW0 Release lock name space lock DBW0 complete Release space call DBW0 verify/invalidate all db files DEST DESCRIPTION ---------- ---------------------------------------------------------------- DBW0 recovery db file verification DBW0 identify db file DBW0 close and unlock db file DBW0 lock db file DBW0 offline db file DBW0 Db File check DBW0 Message to flush IMU txns DBW0 Db Instance Lock Mgmt DIAG write trace records out DIAG Clusterwise dump request DIAG poradebug commands DEST DESCRIPTION ---------- ---------------------------------------------------------------- DIAG write trace records out DIAG write trace records out DIAG write trace records out DMON DMON 
Wakeup DMON DMON shutdown DMON DMON Verify Standby shutdown for PM violation DMON Standby site request resync DMON Metadata file available DMON DMON rcv NS status DMON DMON Receive Message DMON DMON Disable DRC DEST DESCRIPTION ---------- ---------------------------------------------------------------- DMON DMON Interrupt Routine INSV INSV Wakeup INSV NetSlave Shutdown Message INSV INSV Receive Message LCK0 ksim LCK0 functions LCK0 ksim reg/dereg instance group LCK0 ksim query instance group LCK0 ksim polling interrupt action LCK0 KSXR remote instance died LCK0 KSXR finialize LCK0 kxfp signal recv function DEST DESCRIPTION ---------- ---------------------------------------------------------------- LCK0 get and hold global enqueue LCK0 perform a user instance lock operation LCK0 SMON purge object number cache LCK0 KQLM interrupt action LCK0 KQLM invalidation instance lock operation LCK0 KQLM pin instance lock operation LCK0 KQR timeout action LCK0 KQR get instance lock LCK0 sequence bckgrnd instance lock LCK0 release TS enq for sort segment LCK0 kea signal recv function DEST DESCRIPTION ---------- ---------------------------------------------------------------- LCK0 get TS enq for sort segment LCK0 release quiesce enqueue LCK0 get quiesce enqueue LCK0 KCL lock affinity timeout action LCK0 Check SCN adjust LCK0 Cross-instance broadcast message LCK0 ksim get value LGWR LGWR failure LGWR kfr ACD relocation LGWR kfr Incr Ckpt LGWR kfr Poke LGWR DEST DESCRIPTION ---------- ---------------------------------------------------------------- LGWR kfr Dismount disk group LGWR kfr mount disk group LGWR LGWR to Start DMON LGWR free KTU instance lock LGWR convert KTU instance lock LGWR get KTU instance lock LGWR dml_locks = 0 global enforcement LGWR Open/close/mount/dismount thread LGWR Redo writer generate offline immed marker LGWR Redo writer log switch operations LGWR LGWR re-eval standby locks DEST DESCRIPTION ---------- 
---------------------------------------------------------------- LGWR Redo writer interrupt action LGWR Redo writer IO's LMD* Flush side-channel msgs LMD LNS* Network Server wakeup LNS* Network Server forced LNS* Network Server shutdown LNS* Network Server reinit MMAN lock memory at startup MMAN Memory Management MMAN Handle sga_target resize MMAN Reset advisory pool when advisory turned ON DEST DESCRIPTION ---------- ---------------------------------------------------------------- MMAN Complete deferred initialization of components MMAN lock memory timeout action MMNL tune undo retention MMNL MMNL Periodic MQL Selector MMNL ASH Sampler (KEWA) MMNL MMON SWRF Raw Metrics Capture MMON reload failed KSPD callbacks MMON SGA memory tuning MMON background recovery area alert action MMON Flashback Marker MMON tablespace alert monitor DEST DESCRIPTION ---------- ---------------------------------------------------------------- MMON UNDO MMON ACTION MMON MMON Local action Listener MMON MMON Remote action Listener MMON Advisor delete expired tasks MMON ASH Emergency Flusher (KEWA) MMON MMON SWRF Auto DBFUS Task MMON MMON SWRF Auto Purge Task MMON MMON SWRF Auto Flush Task MMON alert message purge MMON alert message cleanup MMON Check for sync messages from other instances DEST DESCRIPTION ---------- ---------------------------------------------------------------- MMON ADDM (KEH) MMON threshold reconciliation MMON metrics monitoring MMON shutdown MMON MMON run-once action driver MMON MMON testing slave MMON MMON testing action MMON MMON Completion Callback Dispatcher MMON Job Autostart action force MMON Coordinator autostart timeout MMON Check for autostart messages from other instances DEST DESCRIPTION ---------- ---------------------------------------------------------------- MMON Compute cache stats in background MMON undo usage MMON recovery area alert action MMON SGA memory tuning parameter update MMON reconfiguration MMON action NSV* NetSlave Wakeup Message NSV* NetSlave 
Receive Message NSV* NetSlave Metadata Resync NSV* NetSlave Health Check Message NSV* NetSlave Shutdown Message NSV* NetSlave request Primary to resync DEST DESCRIPTION ---------- ---------------------------------------------------------------- NSV* NetSlave Check DRC version QMNC Shutdown Q Monitor Coord RBAL ASM to master BG msg RBAL BG load lib msg RBAL|SMON OSM to BG mesg RECO distributed recovery wakeup RECO distributed recovery shutdown RSM* RSM Wakeup RSM* RSM Receive Message RSM* RSM Receive Message Response RVWR Open/close flashback thread DEST DESCRIPTION ---------- ---------------------------------------------------------------- RVWR RVWR IO's SMON kfcl instance recovery TEST Reliable Test Dummy Call 212 rows selected. SQL&gt; </pre> </div></div> Operating System Product Version 10.1.0.3.0 [QA-9] How to set an event in other session ? http://jira.ubtools.com/jira/browse/QA-9 How to set an event in other session ? QA-9 How to set an event in other session ? Oracle - Administration Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 13:06:07 +0000 (UTC) Sun, 16 Sep 2007 16:30:07 +0000 (UTC) 0 <h4><a name="Answer%3A"></a>Answer:</h4> <p>Use the SYS.DBMS_SYSTEM.SET_EV() procedure. Here is the specification of this procedure:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
PROCEDURE SET_EV
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 SI                             BINARY_INTEGER          IN
 SE                             BINARY_INTEGER          IN
 EV                             BINARY_INTEGER          IN
 LE                             BINARY_INTEGER          IN
 NM                             VARCHAR2                IN
</pre> </div></div> <ul class="alternate" type="square"> <li>SI: V$SESSION.SID</li> <li>SE: V$SESSION.SERIAL#</li> <li>EV: Event number. For example: <ul class="alternate" type="square"> <li>10046: SQL traces.</li> <li>10053: Optimizer traces.</li> <li>NNN : ORA-NNN errors.</li> <li>65535: IMMEDIATE traces.</li> </ul> </li> <li>LE: Event level. 
For Event 10046: <ul class="alternate" type="square"> <li>0: Disable event.</li> <li>1: PARSE, FETCH, EXEC, EXECUTION PLAN</li> <li>4: Level 1 + BINDS</li> <li>8: Level 1 + WAITS</li> <li>12: Level 4 + Level 8</li> </ul> </li> <li>NM: Event name. For example: <ul class="alternate" type="square"> <li>ERRORSTACK: For error stack traces.</li> <li>PROCESSSTATE: For process states.</li> <li>SYSTEMSTATE: For system states.</li> <li>'' (empty string): For CONTEXT FOREVER.</li> </ul> </li> </ul> <h4><a name="Sample%3A"></a>Sample:</h4> <p>Dump a PROCESSSTATE trace IMMEDIATELY at LEVEL 10:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> SQL&gt; exec dbms_system.set_ev(8,1056,65535,10,'PROCESSSTATE'); </pre> </div></div> <p>Dump an ERRORSTACK trace at LEVEL 3 on an ORA-942 error:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> SQL&gt; exec dbms_system.set_ev(8,1060,942,3,'ERRORSTACK'); </pre> </div></div> <p>Enable Event 10046 tracing at LEVEL 8 with CONTEXT FOREVER:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> SQL&gt; exec dbms_system.set_ev(8,1060,10046,8,''); </pre> </div></div> Operating System Product Version ??? [QA-8] Heapdump Interpretation http://jira.ubtools.com/jira/browse/QA-8 I have a process which is taking up way more memory than I'd expected. The process runs a PL/SQL that does some nested loop joins on a PL/SQL table. <p>The background process is using &gt; 200Mb of private memory and this number goes up if we tweak the WHERE clause in the join to return more data.</p> <p>I did a heapdump of the process and the trace file looks like this (lots of stuff trimmed):</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre> ... 
EXTENT 437
Chunk 925dfe4 sz= 1836 perm "perm " alo=1836
Chunk 925e710 sz= 1156 recreate "session heap " latch=0
ds 92693fc sz= 30315156 ct= 440
b7aa56c sz= 3980
92f30a0 sz= 1072
afb6e34 sz= 16472
afb2dcc sz= 16472
afaed64 sz= 16472
...
</pre> </div></div> <p>I presume that "session heap" is the UGA for this process'<br/> session. Basically it goes on like this for several pages with sz anywhere between 16k and 1Mb. How can I interpret this? I presume the memory is to do with cursor information. This is a sort but the sort area size is only 10Mb and cannot account for all the private memory in use.</p> <p>I'm just trying to decide if this is a reasonable amount of memory to be using (i.e. explain what it is using it <b>for</b>) and just put up with it, or if something has gone wrong. I'm on 8.1.5 on Linux 2.2 (I know, I know...)</p> <p>Thanks for any insight!</p> QA-8 Heapdump Interpretation Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 12:58:14 +0000 (UTC) Sun, 16 Sep 2007 16:30:20 +0000 (UTC) 0 <h4><a name="Answer%3A"></a>Answer:</h4> <p>A heap consists of memory areas called extents. Each extent consists of memory areas called chunks.</p> <h4><a name="Interpretation%3A"></a>Interpretation:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
EXTENT 437
Chunk 925dfe4 sz= 1836 perm "perm " alo=1836
Chunk 925e710 sz= 1156 recreate "session heap " latch=0

EXTENT 437 ---&gt; extent number
925dfe4 ----&gt; chunk address
sz= -----&gt; size of chunk
perm ------&gt; permanent memory class
"perm " ------&gt; chunk comment
</pre> </div></div> <p>The memory classes are the following:</p> <ul class="alternate" type="square"> <li>Recreatable (can be removed and then recreated when requested. 
e.g. shared SQL statements)</li> <li>Free (free, no object in it)</li> <li>Freeable (used for the duration of a session or call)</li> <li>Permanent (for permanent objects)</li> </ul> <p>Chunks in the same extent are contiguous. In your case, the first chunk address (0x925dfe4) + its size (1836, i.e. 0x72c) = the second chunk address (0x925e710).</p> <h4><a name="Foryourproblem%3A"></a>For your problem:</h4> <p>Shared memory segments such as the SGA are included in the process address space, so the memory figure you see for the process may include them. Search MetaLink for the pmap command.</p> Operating System Product Version ??? [QA-7] _TRACE_FILES_PUBLIC parameter http://jira.ubtools.com/jira/browse/QA-7 <h4><a name="Parameter%3A"></a>Parameter:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
Name.................: _TRACE_FILES_PUBLIC
Values...............: TRUE/FALSE
Default value........: FALSE
Initial Release......: ?
Scope................: Instance
</pre> </div></div> <h4><a name="Explanation%3A"></a>Explanation:</h4> <p>By default, trace files are created without read permission for non-DBA groups.
Here is a sample on Linux:</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
$ ls -ltr
total 4
-rw-r----- 1 oracle oinstall 2146 Jan 6 11:37 linkplus_ora_18653.trc
</pre> </div></div> <p>With _TRACE_FILES_PUBLIC=TRUE, trace files are created world-readable (-rw-r--r--), so users outside the owning group can read them.</p> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
$ ls -ltr
total 8
-rw-r--r-- 1 oracle oinstall 2742 Jan 6 12:00 linkplus_ora_18759.trc
</pre> </div></div> <h4><a name="Warning%3A"></a>Warning:</h4> <p>Set this parameter to TRUE only if every user who can read the trace directory is trusted, since trace files may contain sensitive data such as bind variable values.</p> QA-7 _TRACE_FILES_PUBLIC parameter Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 12:53:58 +0000 (UTC) Sun, 16 Sep 2007 16:30:34 +0000 (UTC) 0 Operating System Product Version ??? [QA-6] _OPTIM_PEEK_USER_BINDS parameter http://jira.ubtools.com/jira/browse/QA-6 <h4><a name="Parameter%3A"></a>Parameter:</h4> <div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent panelContent"> <pre>
Name.................: _OPTIM_PEEK_USER_BINDS
Values...............: TRUE/FALSE
Default value........: TRUE
Initial Release......: 9.0.1
Scope................: Instance/Session
</pre> </div></div> <h4><a name="Explanation%3A"></a>Explanation:</h4> <p>Before Oracle 9.0.1, the values of bind variables are not known in the PARSE phase. Since they are not known, it is not possible to generate execution plans according to bind values.</p> <p>From 9i onwards, Oracle peeks at the values of bind variables in the FIRST PARSE phase and generates execution plans according to the values in this first PARSE.
If the data is skewed and subsequent bind values differ in selectivity from the peeked ones, the cached execution plans may not be optimal for those subsequent binds.</p> QA-6 _OPTIM_PEEK_USER_BINDS parameter Oracle - Internals Major Closed Answered ubTools Support ubTools Support Sun, 15 Jul 2007 12:47:08 +0000 (UTC) Sun, 16 Sep 2007 16:31:03 +0000 (UTC) 0 Operating System Product Version 9.0.1
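<p>The bind peeking behaviour described for _OPTIM_PEEK_USER_BINDS can be illustrated with a small toy model. This is only a sketch and not Oracle's optimizer: the row counts, the 5% selectivity threshold, the 1/(number of distinct values) estimate used when peeking is off, and every name in the code (ROW_COUNTS, choose_plan, CursorCache) are assumptions made up for the illustration.</p>

```python
# Toy model of bind peeking. This is NOT Oracle's optimizer: the statistics,
# the 5% threshold and all names below are hypothetical.

# Skewed column: value -> number of rows holding that value.
ROW_COUNTS = {"RARE": 10, "COMMON": 990_000}
TOTAL_ROWS = sum(ROW_COUNTS.values())


def choose_plan(selectivity):
    """Pick an access path from the estimated fraction of rows returned."""
    return "INDEX RANGE SCAN" if selectivity < 0.05 else "FULL TABLE SCAN"


class CursorCache:
    """Caches one plan per SQL text, loosely like a shared cursor."""

    def __init__(self, peek_binds=True):
        self.peek_binds = peek_binds
        self.plans = {}

    def execute(self, sql_text, bind_value):
        if sql_text not in self.plans:  # only the first parse builds a plan
            if self.peek_binds:
                # Estimate from the actual first bind value.
                selectivity = ROW_COUNTS.get(bind_value, 0) / TOTAL_ROWS
            else:
                # Value-independent guess: 1 / number of distinct values.
                selectivity = 1 / len(ROW_COUNTS)
            self.plans[sql_text] = choose_plan(selectivity)
        return self.plans[sql_text]


SQL = "SELECT * FROM t WHERE c = :b"

peeking = CursorCache(peek_binds=True)
first = peeking.execute(SQL, "RARE")     # plan built for the rare value
second = peeking.execute(SQL, "COMMON")  # the same cached plan is reused
print(first, "|", second)

no_peeking = CursorCache(peek_binds=False)
print(no_peeking.execute(SQL, "RARE"))
```

<p>In this model the plan chosen for the rare value is reused for the common value, which is the skew problem described above. With peeking disabled, the plan no longer depends on which value happens to arrive first.</p>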