<< Back to previous view

[QA-56] "cursor: mutex X" Created: 27/Jul/13  Updated: 28/Jul/13

Status: Open
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Unresolved Votes: 0

Product Version: 11.2.0.3.7
Operating System: Solaris
Host Name: .
Database Name: .

 Description   
The customer has upgraded from Oracle 10.2.0.5 to Oracle 11.2.0.3.7. They encountered library cache lock problem.

Some excerpt from the AWR:

Elapsed:	 	 15.01 (mins)	 	 
DB Time:	 	 6,548.72 (mins)	 	 
.....
Top 5 Timed Foreground Events

Event	Waits	Time(s)	Avg wait (ms)	% DB time	Wait Class
library cache lock	425	364,186	856909	92.69	Concurrency
enq: TX - row lock contention	288	11,500	39930	2.93	Application
TCP Socket (KGAS)	136,317	9,552	70	2.43	Network
DB CPU	 	4,167	 	1.06	 
db file sequential read	742,541	2,176	3	0.55	User I/O


 Comments   
Comment by ubTools Support [ 27/Jul/13 10:42 AM ]
Some excerpt from the HANGANALYZE trace:
Chain 1:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (opusdata.opusdata)
                   os id: 25465
              process id: 256, oracle@<hostname>
              session id: 39
        session serial #: 61345
    }
    is waiting for 'library cache lock' with wait info:
    {
                      p1: 'handle address'=0x22038dff68
                      p2: 'lock address'=0x21ef361978
                      p3: '100*mode+namespace'=0x520002
            time in wait: 5 min 33 sec
           timeout after: never
                 wait id: 5128
                blocking: 0 sessions
            wait history:
              * time between current wait and wait #1: 0.001507 sec
              1.       event: 'SQL*Net message from client'
                 time waited: 0.003166 sec
                     wait id: 5127            p1: 'driver id'=0x54435000
                                              p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000002 sec
              2.       event: 'SQL*Net message to client'
                 time waited: 0.000002 sec
                     wait id: 5126            p1: 'driver id'=0x54435000
                                              p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000011 sec
              3.       event: 'SQL*Net message from client'
                 time waited: 0.003906 sec
                     wait id: 5125            p1: 'driver id'=0x54435000
                                              p2: '#bytes'=0x1
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 1 (opusdata.opusdata)
                   os id: 22379
              process id: 500, oracle@<hostname>
              session id: 8008
        session serial #: 62080
    }
    which is waiting for 'cursor: mutex X' with wait info:
    {
                      p1: 'idn'=0xd4d88873
                      p2: 'value'=0x1
                      p3: 'where'=0x400000000
            time in wait: 0.000000 sec
      heur. time in wait: 1 min 3 sec
           timeout after: never
                 wait id: 7092018
                blocking: 426 sessions
            wait history:
              * time between current wait and wait #1: 0.000007 sec
              1.       event: 'cursor: mutex X'
                 time waited: 0.000002 sec
                     wait id: 7092017         p1: 'idn'=0xd4d88873
                                              p2: 'value'=0x1
                                              p3: 'where'=0x400000000
              * time between wait #1 and #2: 0.000009 sec
              2.       event: 'cursor: mutex X'
                 time waited: 0.000002 sec
                     wait id: 7092016         p1: 'idn'=0xd4d88873
                                              p2: 'value'=0x1
                                              p3: 'where'=0x400000000
              * time between wait #2 and #3: 0.000007 sec
              3.       event: 'cursor: mutex X'
                 time waited: 0.000003 sec
                     wait id: 7092015         p1: 'idn'=0xd4d88873
                                              p2: 'value'=0x1
                                              p3: 'where'=0x400000000
    }
 
Chain 1 Signature: 'cursor: mutex X'<='library cache lock'
Chain 1 Signature Hash: 0xfbcb6c60
.....
              process id: 3136, oracle@<hostname>
.....
    is waiting for 'library cache lock' with wait info:
.....
    and is blocked by 'instance: 1, os id: 22379, session id: 8008',

There are many other sessions waiting on library cache lock in the trace. And, they are blocked by SID#8008 which is waiting on cursor: mutex X. SID#8008 blocks 426 sessions.

Comment by ubTools Support [ 27/Jul/13 01:03 PM ]
Unfortunately, the blocker session SID#8008 exited. But, a new blocker SID#7579 appeared. Some excerpt from its ERRORSTACK LEVEL 3 trace:
KGX Atomic Operation Log 3c6bb9388
       Mutex 22c1df2c18(7579, 0) idn d4d88873 oper EXCL
       Cursor Parent uid 7579 efd 15 whr 1 slp 0
       oper=OPERATION_DEFAULT pt1=0 pt2=0 pt3=0
       pt4=0 u41=0 stt=0
      KGX Atomic Operation Log 3c6bb93d8
       Mutex 22c1df2d30(0, 1) idn d4d88873 oper GET_EXCL
       hash table uid 7579 efd 15 whr 4 slp 36782
       oper=OPERATION_DEFAULT pt1=0 pt2=0 pt3=0
       pt4=0 u41=0 stt=0

Mutex IDN 0xd4d88873 is hold in EXCL mode at mutex address 0x22c1df2c18. The holder SID is 7579. The mutex type is Cursor Parent
The same mutex IDN is requested in EXCL(oper GET_EXCL) mode at mutex address 0x22c1df2d30 and waited. The mutex type is hash table
See Oracle note Understanding and Reading Systemstates (Doc ID 423153.1) for interpreting.

Waiting session holds the same mutex IDN in EXCL mode but in the different mutex address and the different mutex type. No other holder encountered in the SYSTEMSTATE trace.

Comment by ubTools Support [ 28/Jul/13 01:26 PM ]
Some excerpt from AWR:
Foreground Wait Events
.....
Event	Waits	%Time -outs	Total Wait Time (s)	Avg wait (ms)	Waits /txn	% DB time
.....
cursor: mutex X	8,066,619	0	835	0	138.17	0.21
.....
Mutex Sleep Summary

ordered by number of sleeps desc

Mutex Type	Location	Sleeps	Wait Time (ms)
hash table	kkshhcdel [KKSHBKLOC4]	8,061,510	0

The mutex sleep location is kkshhcdel [KKSHBKLOC4]. Nothing found about it in Metalink.

Comment by ubTools Support [ 28/Jul/13 05:33 PM ]
Workaround:

Change the SQL text.

Comment by ubTools Support [ 28/Jul/13 05:36 PM ]
The customer will open an SR to Oracle Support. I'll update this issue later with SR result.




[QA-49] ORA-4031: High Allocation for "Oracle Text Commit new id" in Shared Pool. Created: 05/Nov/10  Updated: 05/Nov/10

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: 10.2.0.4
Operating System: Solaris
Host Name: .
Database Name: .

 Description   
The customer encountered ORA-4031 and trace file generated. SGA is an ASMM SGA. The application uses Oracle Text.

 Comments   
Comment by ubTools Support [ 05/Nov/10 10:24 PM ]
Analysis of the Trace:

The Requested SUBPOOL:

.....
=================================
Begin 4031 Diagnostic Information
=================================
.....
HEAP DUMP heap name="sga heap(3,0)"  desc=380043660
 extent sz=0xfe0 alt=216 het=32767 rec=9 flg=-126 opc=0
 parent=0 owner=0 nex=0 xsz=0x1000000
 latch set 3 of 4
 durations enabled for this heap
 reserved granules for root 0 (granule size 16777216)
.....

The allocation was requested from sga heap(3,0), which is (SUBPOOL:3,DURATION:0).

All SUBPOOLS and Their DURATION Memories:

.....
HEAP DUMP heap name="sga heap(1,0)"  desc=380030610
Total heap size    =218102664
Total free space   =  1066928
Total reserved free space   =  8439520
Unpinned space     = 38812528  rcr=11971 trn=17906
Permanent space    =208595160
HEAP DUMP heap name="sga heap(1,1)"  desc=380031e68
Total heap size    = 67108512
Total free space   =  2912528
Total reserved free space   =  1382816
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(1,2)"  desc=3800336c0
Total heap size    =167771280
Total free space   = 92743480
Total reserved free space   =  3852856
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(1,3)"  desc=380034f18
Total heap size    =268434048
Total free space   = 74547592
Total reserved free space   = 13497472
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(2,0)"  desc=380039e38
Total heap size    =201325536
Total free space   =    17200
Total reserved free space   =  8435920
Unpinned space     = 26474112  rcr=7934 trn=8094
Permanent space    =192871456
HEAP DUMP heap name="sga heap(2,1)"  desc=38003b690
Total heap size    = 83885640
Total free space   = 48723768
Total reserved free space   =  1035792
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(2,2)"  desc=38003cee8
Total heap size    =369096816
Total free space   =258674312
Total reserved free space   = 16982464
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(2,3)"  desc=38003e740
Total heap size    =218102664
Total free space   = 17202608
Total reserved free space   = 10966696
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(3,0)"  desc=380043660
Total heap size    =184548408
Total free space   =    13008
Total reserved free space   =  5061928
Unpinned space     = 26943408  rcr=4930 trn=9425
Permanent space    =179472608
HEAP DUMP heap name="sga heap(3,1)"  desc=380044eb8
Total heap size    = 67108512
Total free space   = 27568352
Total reserved free space   =     4744
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(3,2)"  desc=380046710
Total heap size    =352319688
Total free space   =233302736
Total reserved free space   = 15981216
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(3,3)"  desc=380047f68
Total heap size    =385873944
Total free space   =143746536
Total reserved free space   = 19402616
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(4,0)"  desc=38004ce88
Total heap size    =184548408
Total free space   =     8616
Total reserved free space   =  7592328
Unpinned space     = 28725496  rcr=8459 trn=9864
Permanent space    =176946600
HEAP DUMP heap name="sga heap(4,1)"  desc=38004e6e0
Total heap size    = 83885640
Total free space   = 33356784
Total reserved free space   =  1189120
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(4,2)"  desc=38004ff38
Total heap size    =335542560
Total free space   =238988592
Total reserved free space   = 16293768
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
HEAP DUMP heap name="sga heap(4,3)"  desc=380051790
Total heap size    =721416504
Total free space   =445595432
Total reserved free space   = 33743680
Unpinned space     =        0  rcr=0 trn=0
Permanent space    =        0
.....

All PERMANENT SPACES were allocated in DURATION 0. Although there are enough free spaces in the other DURATIONS of (3,1),(3,2),(3,3); free space can not be allocated from them.

.....
duration memory (duration 0) cannot take free memory from other durations within the same subpool.  
It can only get more memory by being given a new complete EXTENT (granule) from the granule management code.
.....

Ref: Oracle Bug 9911213: ORA-04031 AFTER APPLYING 10.2.0.4 PATCSHET

Since the lower limit of BUFFER CACHE was determined by DB_CAHCE_SIZE parameter; SHARED POOL could not grow by allocating a new EXTENT, then ORA-4031 appeared.

SUBPOOL Allocations:

.....
==============================
Memory Utilization of Subpool 1
================================
     Allocation Name          Size   
_________________________  __________
"free memory              "   215299680  
.....
"sql area                 "   151923248
.....
"Oracle Text Commit new id"   399237696
.....
"library cache            "    30711448
.....
==============================
Memory Utilization of Subpool 2
================================
     Allocation Name          Size   
_________________________  __________
"free memory              "   367295736
.....
"sql area                 "   160984248
.....
"Oracle Text Commit new id"   392833064
.....
"library cache            "    35069800
.....
==============================
Memory Utilization of Subpool 3
================================
     Allocation Name          Size   
_________________________  __________
"free memory              "   450731968  
.....
"sql area                 "   182415376
.....
"Oracle Text Commit new id"   417149240
.....
"library cache            "    39156336
.....
==============================
Memory Utilization of Subpool 4
================================
     Allocation Name          Size   
_________________________  __________
"free memory              "   781766288
.....
"sql area                 "   156513808
.....
"Oracle Text Commit new id"   410783408
.....
"library cache            "    31300664

The total size of Oracle Text Commit new id is 1.5GB (399237696+392833064+417149240+410783408). It's high.

Comment by ubTools Support [ 05/Nov/10 10:35 PM ]
Oracle Text Commit new id Allocation Trend:

An Excerpt from SGA Stat:

SQL> select a.instance_number,begin_interval_time, bytes from dba_hist_sgastat a, dba_hist_snapshot b
  2  where pool='shared pool' and
  3        a.snap_id=b.snap_id and
  4        a.instance_number=b.instance_number and
  5        name='Oracle Text Commit new id'
  6  order by begin_interval_time;

.....
              1 06/10/2010 01:00:07,750                                                      352864368
              1 06/10/2010 02:00:55,107                                                      353711568
.....
              1 12/10/2010 11:00:12,212                                                      448444792
              1 12/10/2010 12:00:27,412                                                      449299672
              1 12/10/2010 13:00:12,435                                                      450157752
              1 12/10/2010 14:00:19,294                                                      450179512
.....
              1 04/11/2010 14:31:10,604                                                     1622639416
              1 04/11/2010 14:40:18,341                                                     1623339552
              1 04/11/2010 14:50:28,971                                                     1623879936
              1 04/11/2010 15:00:40,721                                                     1623880712

722 rows selected.

SQL>

Oracle Text Commit new id had increased in small sizes.

Comment by ubTools Support [ 05/Nov/10 10:43 PM ]
Summary:

Root Cause:

This problem is Oracle BUG:8593562 encountered in Oracle Text environment.

.....
It is incremented as the space is allocated, but not decremented as it is freed.
 It will reset when the instance is restarted.
.....
The bug is currently in work by Development and expected to be resolved in a future release.
.....

Ref: Growth of "Oracle Text Commit new id" memory with Sync on Commit Index [ID 872413.1]

Workaround:

  • Restart the INSTANCE.




[QA-44] TNS connection lost for big SQL*Net packets, and slow performance for small SQL*Net packets. Created: 26/Nov/08  Updated: 26/Nov/08

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: Oracle 9i
Operating System: TRU64

 Description   
The customer has the following errors in SQ*Net SERVER trace:
[17-KAS-2008 06:02:33:866] nspsend: transport write error
[17-KAS-2008 06:02:33:867] nserror: nsres: id=0, op=67, ns=12547, ns2=12560; nt[0]=517, nt[1]=32, nt[2]=0; ora[
0]=0, ora[1]=0, ora[2]=0
[17-KAS-2008 06:02:33:868] nsdo: nsctxrnk=0
[17-KAS-2008 06:02:33:869] nioqsn: send failed: bl = 47, nicbl = 59
[17-KAS-2008 06:02:33:870] nioqper:  error from nioqsn
[17-KAS-2008 06:02:33:871] nioqper:    nr err code: 0
[17-KAS-2008 06:02:33:872] nioqper:    ns main err code: 12547
[17-KAS-2008 06:02:33:873] nioqper:    ns (2)  err code: 12560
[17-KAS-2008 06:02:33:875] nioqper:    nt main err code: 517
[17-KAS-2008 06:02:33:876] nioqper:    nt (2)  err code: 32
[17-KAS-2008 06:02:33:877] nioqper:    nt OS   err code: 0
[17-KAS-2008 06:02:33:878] nioqer: entry
[17-KAS-2008 06:02:33:879] nioqce: entry
[17-KAS-2008 06:02:33:880] nioqce: exit
[17-KAS-2008 06:02:33:881] nioqer: exit
[17-KAS-2008 06:02:33:882] nioqsn:  returning error: 3113

nt[1]=32 is Operating System Dependent(OSD) error code.

An excerpt from truss output of SERVER process:

531245: lseek(7, 0, SEEK_CUR)                           = 501528
531245: write(7, " [ 1 7 - K A S - 2 0 0 8".., 34)      = 34
531245: lseek(7, 0, SEEK_CUR)                           = 501562
531245: write(7, " e n t r y\n", 6)                     = 6
531245: write(12, "07DB\0\006\0\0\0\0\002C2".., 2011)   Err#32 Broken pipe
531245:     Received signal #13, SIGPIPE [ignored]
531245:       siginfo: SIGPIPE
531245: lseek(8, 69632, SEEK_SET)                       = 69632
531245: read(8, "12\0DB0F\0\0 t\0DC0F\0\0".., 512)      = 512
531245: lseek(8, 13312, SEEK_SET)                       = 13312
531245: read(8, "19\0A203\0\09E\0A303\0\0".., 512)      = 512
531245: gettimeofday(0x000000011FFF7A90, 0x00000000)    = 0
531245: lseek(7, 0, SEEK_CUR)                           = 501568

OSD error is Err#32 Broken pipe. This OSD error is also defined in errno.h:

  • #define EPIPE 32 /* Broken pipe */

Client side SQL*Net trace shows that client is waiting for a response from server on nttrd() call.

Since the server process is lost connection, it's not able to send a message to the client side. Since the client side is not getting a response, his screen waits in "Not Responding" state in Windows.



 Comments   
Comment by ubTools Support [ 26/Nov/08 01:38 PM ]
The customer uses CISCO ASA 5520 series, Version 8.0.4 FIREWALL. This has an option of inspect sqlnet. After this option has been disabled, the problem has been solved.




[QA-43] Slow performance while navigating on the forms items. Created: 26/Nov/08  Updated: 26/Nov/08

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: Oracle 10g 10.2.0.4
Operating System: Windows
Operating System Version: 2003

 Description   
The customer encountered slow performance while navigating on their forms items. The problem occured sporadically. When it occurs it takes 3-4 seconds, which are not acceptable.

An excerpt from EVENT 10046 trace file:

*** [ Windows thread id: 5580 ]
*** 2008-11-24 21:27:20.390
RPC CALL:...(stament removed);
=====================
PARSING IN CURSOR #5 len=56 dep=1 uid=78 oct=3 lid=78 tim=1420397928
hv=3822139714 ad='57c1a3bc'
SELECT ...(stament removed)
END OF STMT
PARSE #5:c=0,e=32,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1420397924
EXEC #5:c=0,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1420398028
FETCH #5:c=0,e=14,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=1,tim=1420398156
RPC EXEC:c=0,e=0
WAIT #0: nam='SQL*Net message to client' ela= 4 driver id=1297371904
#bytes=1 p3=0 obj#=-1 tim=1420398646
*** 2008-11-24 21:27:20.765
WAIT #0: nam='SQL*Net message from client' ela= 3 driver id=1297371904
#bytes=1 p3=0 obj#=-1 tim=1420772565
RPC CALL:...(stament removed);
EXEC #5:c=0,e=38,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1420773632
FETCH #5:c=0,e=14,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=1,tim=1420773671
RPC EXEC:c=0,e=0
WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1297371904
#bytes=1 p3=0 obj#=-1 tim=1420774146
*** [ Windows thread id: 4416 ]
*** 2008-11-24 21:27:24.765
WAIT #0: nam='SQL*Net message from client' ela= 6 driver id=1297371904
#bytes=1 p3=0 obj#=-1 tim=1424766291
RPC CALL:...(stament removed);
=====================
PARSING IN CURSOR #7 len=70 dep=1 uid=78 oct=3 lid=78 tim=1424767161
hv=2516830579 ad='5af8ae9c'
SELECT ...(stament removed)
END OF STMT
PARSE #7:c=0,e=78,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424767156
EXEC #7:c=0,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424767987
FETCH #7:c=0,e=94,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=1,tim=1424768238
RPC EXEC:c=0,e=0
WAIT #0: nam='SQL*Net message to client' ela= 7 driver id=1297371904
#bytes=1 p3=0 obj#=-1 tim=1424768583
PUT: vc (659FD948), msg (5CF1D5B0), size (90), flgs (3)
*** [ Windows thread id: 5580 ]
*** 2008-11-24 21:27:24.765
WAIT #0: nam='SQL*Net message from client' ela= 4 driver id=1297371904
#bytes=1 p3=0 obj#=-1 tim=1424771121
RPC CALL:...(stament removed);
=====================
PARSING IN CURSOR #8 len=90 dep=1 uid=78 oct=3 lid=78 tim=1424772412
hv=1036237733 ad='5d1575f0'
SELECT ...(stament removed)
END OF STMT
PARSE #8:c=0,e=56,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424772407
EXEC #8:c=0,e=55,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424772567
FETCH #8:c=0,e=14,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1424772608
RPC EXEC:c=0,e=0
WAIT #0: nam='SQL*Net message to client' ela= 6 driver id=1297371904
#bytes=4 p3=0 obj#=-1 tim=1424773339


 Comments   
Comment by ubTools Support [ 26/Nov/08 12:10 PM ]
The time had been spent between the following operations:
WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1297371904
 #bytes=1 p3=0 obj#=-1 tim=1420774146
*** [ Windows thread id: 4416 ]
*** 2008-11-24 21:27:24.765
WAIT #0: nam='SQL*Net message from client' ela= 6 driver id=1297371904
 #bytes=1 p3=0 obj#=-1 tim=1424766291

As seen from the excerpt above, the windows thread ID had switched to 4416.

elapsed time: (tim=1424766291) - (tim=1420774146) = 3992145 microseconds = 3.992145 seconds.

The customer has SHARED SERVER configuration. But, SHARED_SERVERS parameter was set to 1. Unfortunately, we had not got an opportunity to debug SHARED SERVER operations. But, increasing SHARED_SERVERS parameter has solved this thread switch problem since there are now pre-created SHARED SERVERS.





[QA-39] Database hangs on "cursor: pin S wait on X" wait events. Created: 07/Jun/08  Updated: 07/Jun/08

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
Operating System: IBM-AIX

 Description   
The Database hangs.

The ASH report shows the activity on cursor: pin S wait on X wait event.

Top User Events
Event                        Event Class   % Activity     Avg Active Sessions
cursor: pin S wait on X      Concurrency        98.90                   13.74

An excerpt from SYSTEMSTATE (level 10) dump

PROCESS 23:
...
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=42033 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f3271
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000d44f4280
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4939 efd 0 whr 5 slp 44251
       opr=2 pso=7000000cefa5df0 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
PROCESS 33:
...
    ----------------------------------------
    SO: 7000000f78e0770, type: 4, owner: 7000000fa8bbcf8, flag: INIT/-/-/0x00
    (session) sid: 4976 trans: 7000000f10ce318, creator: 7000000fa8bbcf8, flag: (8100041) USR/- BSY/-/-/-/-/-
              DID: 0001-0021-000193D1, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 3, prv: 0, sql: 7000000fc179fa8, psql: 7000000fc68f018, user: 0/SYS
    O/S info: user: orapaky0, term: , ospid: 5099938, machine: akmenkulp2
              program: sqlplus@akmenkulp2 (TNS V1-V3)
    application name: sqlplus@akmenkulp2 (TNS V1-V3), hash value=0
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=27479 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f33be
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fc64adb8
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4976 efd 0 whr 5 slp 15527
       opr=2 pso=7000000d86f7d88 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fc64ad80
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper EXCL
       Cursor Pin uid 4976 efd 0 whr 1 slp 0
       opr=3 pso=7000000cee2b2c0 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
PROCESS 34:
...
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=63882 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f348e
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fdd75c30
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4693 efd 0 whr 5 slp 62048
       opr=2 pso=7000000e7548610 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
PROCESS 39:
...
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=12512 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f36cf
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fd857ad8
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4970 efd 0 whr 5 slp 12395
       opr=2 pso=7000000cea1c250 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
PROCESS 47:
...
      waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=48746 wait_time=0 seconds since wait started=0
                  idn=16a1ebe6, value=132e00000000, where|sleeps=5002494fe
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000c5ca4690
         Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD
         Cursor Pin uid 4896 efd 0 whr 5 slp 48634
         opr=2 pso=7000000e24f7e30 flg=0
         pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0
         ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0
         hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788
        ----------------------------------------
...
PROCESS 53:
...
      waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=38686 wait_time=0 seconds since wait started=0
                  idn=16a1ebe6, value=132e00000000, where|sleeps=500249647
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000db1dee98
         Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD
         Cursor Pin uid 4828 efd 0 whr 5 slp 38647
         opr=2 pso=7000000e7f58940 flg=0
         pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0
         ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0
         hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788
        ----------------------------------------
...
PROCESS 54:
...
      (session) sid: 4910 trans: 0, creator: 7000000d81123c0, flag: (e1) USR/- BSY/-/-/-/-/-
                DID: 0001-002F-00044CCF, short-term DID: 0000-0000-00000000
                txn branch: 0
                oct: 3, prv: 0, sql: 7000000dbd12ce0, psql: 7000000dbb98da0, user: 55/SYSMAN
      O/S info: user: orapaky0, term: unknown, ospid: 1234, machine: akmenkulp2
                program: OMS
      client info: akmenkulp2_Management_Service
      application name: OEM.DefaultPool, hash value=3997945242
      action name: /database/instance/sitemap, hash value=105676648
      waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=54571 wait_time=0 seconds since wait started=0
                  idn=ec048ba, value=137000000000, where|sleeps=5007f3abc
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000dcd36f50
         Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
         Cursor Pin uid 4910 efd 0 whr 5 slp 53641
         opr=2 pso=7000000e7c7f338 flg=0
         pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
         ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
         hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
        ----------------------------------------
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000dcd36e38
         Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper EXCL
         Cursor Pin uid 4910 efd 0 whr 1 slp 0
         opr=3 pso=7000000e79b9030 flg=0
         pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0
         ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0
         hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788
        ----------------------------------------
...
PROCESS 55:
...
      waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=39147 wait_time=0 seconds since wait started=0
                  idn=16a1ebe6, value=132e00000000, where|sleeps=5002496e9
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000dbb57a10
         Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD
         Cursor Pin uid 4973 efd 0 whr 5 slp 39118
         opr=2 pso=7000000ce4e6fe8 flg=0
         pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0
         ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0
         hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788
        ----------------------------------------
...
PROCESS 62:
...
      (session) sid: 4805 trans: 0, creator: 7000000eab6d018, flag: (e1) USR/- BSY/-/-/-/-/-
                DID: 0001-003E-00010563, short-term DID: 0000-0000-00000000
                txn branch: 0
                oct: 3, prv: 0, sql: 7000000dce39e08, psql: 7000000dc0fc240, user: 72/MENKUL2008
      O/S info: user: akbank, term: L1058, ospid: 3468:3428, machine: AA\L1058
                program: toad.exe
      application name: TOAD 9.1.0.62, hash value=3156025525
      waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=41562 wait_time=0 seconds since wait started=0
                  idn=ec048ba, value=137000000000, where|sleeps=5007f3dc4
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000db0f6b20
         Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
         Cursor Pin uid 4805 efd 0 whr 5 slp 41523
         opr=2 pso=7000000ce1b0070 flg=0
         pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
         ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
         hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
        ----------------------------------------
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000db0f6ae8
         Mutex 7000000e51cde60(4805, 0) idn 7a99f649 oper EXCL
         Cursor Pin uid 4805 efd 0 whr 1 slp 0
         opr=3 pso=7000000d8b887e8 flg=0
         pcs=7000000e51cde60 nxt=0 flg=35 cld=0 hd=7000000dcd2b108 par=7000000d9991e20
         ct=0 hsh=0 unp=0 unn=0 hvl=d99920f8 nhv=1 ses=7000000f88942e0
         hep=7000000e51cdee0 flg=80 ld=1 ob=7000000e59bf060 ptr=7000000882ed3d8 fex=7000000882ec6e8
        ----------------------------------------
...
PROCESS 65:
...
      waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=35648 wait_time=0 seconds since wait started=0
                  idn=7a99f649, value=12c500000000, where|sleeps=500028b22
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000fd34bbf8
         Mutex 7000000e51cde60(4805, 0) idn 7a99f649 oper GET_SHRD
         Cursor Pin uid 4570 efd 0 whr 5 slp 35618
         opr=2 pso=7000000d898d0c0 flg=0
         pcs=7000000e51cde60 nxt=0 flg=35 cld=0 hd=7000000dcd2b108 par=7000000d9991e20
         ct=0 hsh=0 unp=0 unn=0 hvl=d99920f8 nhv=1 ses=7000000f88942e0
         hep=7000000e51cdee0 flg=80 ld=1 ob=7000000e59bf060 ptr=7000000882ed3d8 fex=7000000882ec6e8
        ----------------------------------------
...
PROCESS 68:
...
   (session) sid: 4698 trans: 0, creator: 7000000f98a7320, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0001-0044-00000344, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 3, prv: 0, sql: 7000000c5281eb8, psql: 7000000d4d7e910, user: 72/MENKUL2008
    O/S info: user: geneks, term: AKYGM011, ospid: 3168:3148, machine: AKYATIRIM\AKYGM011
              program:
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=44607 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f3dfe
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000dc155c48
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4698 efd 0 whr 5 slp 8722
       opr=2 pso=7000000cea6fce8 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000dc155c10
       Mutex 7000000e5f9cd90(4698, 0) idn 651e7adb oper EXCL
       Cursor Pin uid 4698 efd 0 whr 1 slp 0
       opr=3 pso=7000000cef62b68 flg=0
       pcs=7000000e5f9cd90 nxt=0 flg=34 cld=1 hd=7000000dcea52a8 par=7000000e5f9d708
       ct=1 hsh=0 unp=0 unn=0 hvl=e5f9d098 nhv=1 ses=7000000f782cbe0
       hep=7000000e5f9ce10 flg=80 ld=1 ob=7000000e53207c8 ptr=700000099b3ef88 fex=700000099b3e298
      ----------------------------------------
...
PROCESS 70:
...
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=34939 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f3e0f
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fc176d20
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4784 efd 0 whr 5 slp 33075
       opr=2 pso=7000000ceb24778 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
PROCESS 73:
...
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=24155 wait_time=0 seconds since wait started=0
                idn=651e7adb, value=125a00000000, where|sleeps=5000121e2
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000dc4eba08
       Mutex 7000000e5f9cd90(4698, 0) idn 651e7adb oper GET_SHRD
       Cursor Pin uid 4637 efd 0 whr 5 slp 8674
       opr=2 pso=7000000cef8dcb0 flg=0
       pcs=7000000e5f9cd90 nxt=0 flg=34 cld=1 hd=7000000dcea52a8 par=7000000e5f9d708
       ct=1 hsh=0 unp=0 unn=0 hvl=e5f9d098 nhv=1 ses=7000000f782cbe0
       hep=7000000e5f9ce10 flg=80 ld=1 ob=7000000e53207c8 ptr=700000099b3ef88 fex=700000099b3e298
      ----------------------------------------
...
PROCESS 76:
...
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=35688 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f3ed4
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fc655c80
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4806 efd 0 whr 5 slp 35673
       opr=2 pso=7000000ce702c90 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
PROCESS 142:
...
      waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=48159 wait_time=0 seconds since wait started=0
                  idn=16a1ebe6, value=132e00000000, where|sleeps=5002499ac
...
        ----------------------------------------
        KGX Atomic Operation Log 7000000db235510
         Mutex 7000000d71fd6b0(4910, 0) idn 16a1ebe6 oper GET_SHRD
         Cursor Pin uid 4592 efd 0 whr 5 slp 48111
         opr=2 pso=7000000e2252830 flg=0
         pcs=7000000d71fd6b0 nxt=0 flg=35 cld=0 hd=7000000fda5fc00 par=7000000d71fdaa0
         ct=0 hsh=0 unp=0 unn=0 hvl=d71fdd78 nhv=1 ses=7000000f78b5cc0
         hep=7000000d71fd730 flg=80 ld=1 ob=7000000d76d3848 ptr=70000008a8ff478 fex=70000008a8fe788
        ----------------------------------------
...


 Comments   
Comment by ubTools Support [ 07/Jun/08 05:46 PM ]
Mutex identifier value helps to locate address of mutex. For example:
PROCESS 23:
...
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=42033 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f3271
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000d44f4280
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4939 efd 0 whr 5 slp 44251
       opr=2 pso=7000000cefa5df0 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------

The mutex identifier 0xec048ba at the address 0x7000000d92bce40 is requested in the shared mode (GET_SHRD)

Comment by ubTools Support [ 07/Jun/08 06:11 PM ]
According to SYSTEMSTATE dump, the following table shows the which process is waiting for the which process:
Waiter Process# Holder Process#
23 33
33 33
34 33
39 33
47 54
53 54
54 33
55 54
62 33
65 62
68 33
70 33
73 68
76 33
142 54

According to the table above, there is a deadlock and the root holder is the process#33. This process waits for itself. That means there is a self-deadlock problem.

Comment by ubTools Support [ 07/Jun/08 06:25 PM ]
Process#33 state:
PROCESS 33:
...
    ----------------------------------------
    SO: 7000000f78e0770, type: 4, owner: 7000000fa8bbcf8, flag: INIT/-/-/0x00
    (session) sid: 4976 trans: 7000000f10ce318, creator: 7000000fa8bbcf8, flag: (8100041) USR/- BSY/-/-/-/-/-
              DID: 0001-0021-000193D1, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 3, prv: 0, sql: 7000000fc179fa8, psql: 7000000fc68f018, user: 0/SYS
    O/S info: user: orapaky0, term: , ospid: 5099938, machine: akmenkulp2
              program: sqlplus@akmenkulp2 (TNS V1-V3)
    application name: sqlplus@akmenkulp2 (TNS V1-V3), hash value=0
    waiting for 'cursor: pin S wait on X' blocking sess=0x0 seq=27479 wait_time=0 seconds since wait started=0
                idn=ec048ba, value=137000000000, where|sleeps=5007f33be
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fc64adb8
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper GET_SHRD
       Cursor Pin uid 4976 efd 0 whr 5 slp 15527
       opr=2 pso=7000000d86f7d88 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...
      ----------------------------------------
      KGX Atomic Operation Log 7000000fc64ad80
       Mutex 7000000d92bce40(4976, 0) idn ec048ba oper EXCL
       Cursor Pin uid 4976 efd 0 whr 1 slp 0
       opr=3 pso=7000000cee2b2c0 flg=0
       pcs=7000000d92bce40 nxt=7000000cb6ed710 flg=35 cld=0 hd=7000000fc8611f0 par=7000000d94637a0
       ct=4 hsh=0 unp=0 unn=0 hvl=cb6eda00 nhv=1 ses=7000000f78e0770
       hep=7000000d92bcec0 flg=80 ld=1 ob=7000000d9c1b3a8 ptr=7000000991e6108 fex=7000000991e5418
      ----------------------------------------
...

Process#33 holds mutex identifier 0xec048ba in the exclusive mode. But, it requests the same identifier in the shared mode. It's self-deadlock bug.

Comment by ubTools Support [ 07/Jun/08 06:39 PM ]
Unfortunately, there is no stack trace of process#33 to find the kernel function in which it's was running. But, the following SQL ran by process#33 helps to narrow down the Oracle bugs:
      LIBRARY OBJECT HANDLE: handle=7000000fc179fa8 mtx=7000000fc17a0d8(1) cdp=1
      name=
select i.obj#, i.rowcnt, i.leafcnt, i.distkey, i.lblkkey, i.dblkkey,i.clufac, i.blevel
, i.analyzetime, i.samplesize, decode(i.pctthres$,null,null,mod(trunc(i.pctthres$/256),256)),
 i.flags, ist.cachedblk, ist.cachehit, ist.logicalread from ind$ i, ind_stats$ ist
 where i.obj# = ist.obj#(+) and i.bo#=:1 order by i.obj#
      hash=25d75620e6d3487e18921ac30ec048ba timestamp=06-06-2008 23:01:28
...

Looks like a statistic collection SQL...

Comment by ubTools Support [ 07/Jun/08 06:44 PM ]
Oracle Note:5907779.8:
This problem is introduced in 10.2.0.3 

A process may hang with a self deadlock typically when executing 
DBMS_STATS. The hung process shows itself waiting on a
"cursor: pin S wait on X" waitevent waiting for an object
that it has pinned itself.

According to the note, this problem has been fixed in Oracle 10.2.0.4.





[QA-36] Who is the inventor of Response Time Analysis(RTA) in Oracle ? Created: 08/Apr/08  Updated: 01/May/08

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: X
Operating System: Generic

 Description   
This issue moved to http://www.ubtools.com/web/public/resources/logs/rta_inventor.




[QA-16] Does commit cause checkpoint ? Created: 15/Jul/07  Updated: 16/Sep/07

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: Generic
Operating System: Generic

 Description   
Does commit cause checkpoint ?

 Comments   
Comment by ubTools Support [ 15/Jul/07 02:34 PM ]
Commit doesn't cause a checkpoint itself. If it was so, there would not be a need for redo-logging.

There is an article of K Gopalakrishnan. It's published in Oracle Internals magazine. Gopal explains the relationship and differences between commit-SCN and SCN.





[QA-15] SQ enqueue problem. Created: 15/Jul/07  Updated: 16/Sep/07

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: 10.1.0.3
Operating System: Generic

 Description   
Other than SYSDBA, no new connections allowed to the database.

 Comments   
Comment by ubTools Support [ 15/Jul/07 02:32 PM ]

An excerpt from SYSTEMSTATE dump:

The SYSTEMSTATE dump was generated in USER_DUMP_DEST as below:

 
SQL> connect / as sysdba
SQL> alter session set max_dump_file_size=UNLIMITED;
SQL> alter session set events 'IMMEDIATE trace name SYSTEMSTATE level 10';
-- 2 or 3 minutes later
SQL> alter session set events 'IMMEDIATE trace name SYSTEMSTATE level 10';

There are many sessions in SYSTEMSTATE dump waiting for enq: SQ - contention as below:

(session) trans: 0, creator: 41b5c31f8, flag: (e1) USR/- BSY/////-
DID: 0001-0057-00000021, short-term DID: 0000-0000-00000000
txn branch: 0
oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
O/S info: user: , term: , ospid: , machine:
program:
waiting for 'enq: SQ - contention' blocking sess=0x4234f5b68 seq=1 wait_time=0
name|mode=53510006, object #=8e, 0=0
Dumping Session Wait History
for 'enq: SQ - contention' count=1 wait_time=3007641
name|mode=53510006, object #=8e, 0=0
for 'enq: SQ - contention' count=1 wait_time=3007783
...
SO: 40e830a70, type: 54, owner: 4234e5f20, flag: INIT///0x00
LIBRARY OBJECT LOCK: lock=40e830a70 handle=41985e8b8 mode=N
call pin=41a6d2740 session pin=0 hpc=0000 hlc=0000
htl=40e830ae0[40e8308e8,40e6833d8] htb=40e8d2cc8
user=4234e5f20 session=4234e5f20 count=1 flags=PNC/[0400] savepoint=2
LIBRARY OBJECT HANDLE: handle=41985e8b8
name=SYS.AUDSES$
hash=deaeba50687d3d62c586aafe9b84f98c timestamp=07-20-2004 15:34:57
namespace=TABL flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0001-0001 lock=N pin=S latch#=34 hpc=022e hlc=022e

The blocking session(0x4234f5b68):

SO: 4234f5b68 , type: 4, owner: 41b5c8c60, flag: INIT///0x00
(session) trans: 41cde12f8, creator: 41b5c8c60, flag: (e1) USR/- BSY/////-
DID: 0001-0056-00000020, short-term DID: 0000-0000-00000000
txn branch: 0
oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
O/S info: user: , term: , ospid: , machine:
program:
waiting for 'gc cr request' blocking sess=0x0 seq=2 wait_time=0
file#=1, block#=1ea, class#=1
Dumping Session Wait History
for 'gc cr request' count=1 wait_time=1230415
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1230287
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1220829
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1230501
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1230312
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1230295
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1230618
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1230445
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1229421
file#=1, block#=1ea, class#=1
for 'gc cr request' count=1 wait_time=1231336
file#=1, block#=1ea, class#=1
temporary object counter: 0

Problem interpretation:

The blocker session waited for a RAC related wait event named gc cr request for the same file# and block#. Unfortunately, at the time of the problem happened, no SYSTEMSTATE dumps were generated for the other nodes. So, it was not possible to diagnose the root blocker on the other node to find why it holds same buffer too long.

The sessions were waiting for SQ enqueue on SYS.AUDSES$ sequence. During connection, the value of V$SESSION.AUDSID is obtained from SYS.AUDSES$ sequence. SYSDBA doesn't use this sequence in connection. So, it was not blocked.

Solution:

The default cache size of SYS.AUDSES$ was 20. It has been increased to 1000.





[QA-14] is the current CPU breakdown formula correct ? Created: 15/Jul/07  Updated: 19/Sep/07

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: Generic
Operating System: Generic

 Description   
is the current CPU breakdown formula correct ?

CPU used by this session = parse time cpu + recursive cpu usage + others



 Comments   
Comment by ubTools Support [ 15/Jul/07 02:14 PM ]

Answer:

This is the most well-known, but wrong formula I've read in many Oracle documentations.

parse time cpu includes parse cpu time of both recursive and user statements. recursive cpu usage includes both parse cpu time and non-parse cpu time of recursive statements. That means parse cpu usage of recursive statements is included in both parse time cpu and recursive cpu usage. In other words, it's duplicated and formula above is not correct.

ubTools offers the following formula:

CPU used by this session = parse time cpu + others(exec_and_fetch_time_cpu)

Question:

If there is little or no SQL processing done within PL/SQL, should I also subtract recursive cpu usage from CPU used by this session to get the others cpu component ?

Answer:

NO. A formula should explain all cases. It should not work for just some scenarios only.

Also, both SQL and statements in PL/SQL are associated with a cursor internally in Oracle perspective. In other words, they are not different things in PARSE,EXEC,FETCH calls. If a statement is called by an other statement, it's called recursive statement. So, both an SQL and a PL/SQL can be recursive statements.

  • If there is no SQL processing in PL/SQL, it means there is no SQL in parent and child PL/SQLs. There are 2 scenarios for this case:
    • If There is no child PL/SQL in the parent PL/SQL, recursive cpu usage is ZERO. Since it's zero, no need to substruct it from CPU used by this session.
    • If there is child PL/SQL in the parent PL/SQL, PARSE call is done for child PL/SQL in recursive mode. In this case, parse time cpu of recursive statement is already included in recursive cpu usage. So, recursive cpu usage should not be substructed from CPU used by this session.
  • If there is little SQL processing in PL/SQL with little parse time cpu, the distortion in the mentioned wrong formula is small. But, recursive cpu usage should still NOT be substructed from CPU used by this session even if the distortion is small. Why should DBAs substruct it if they have more correct formula ? No need.

Recommendation:

The current Reponse Time Performance Analysis(RTA) implementaions are not correct. RTA has not reached its next level, yet. That's why ubTools offered a new technique by Microstate Response-time Performance Profiling (MRPP).

There has been a question on this topic at Tom Kyte's site by referring ubTools:

Question:

 
Tom,

Just wanted to: what exactly is "CPU used by this session". One site(
http://www.ubtools.com/cgi-bin/ib/ikonboard.cgi?act=ST;f=25;t=4
says
<>
CPU used by this session = parse time cpu + recursive cpu usage + others

This is the most well-known, but wrong formula I've read in many Oracle
documentations.

parse time cpu includes parse cpu time of both recursive and user statements.
recursive cpu usage includes both parse cpu time and non-parse cpu of recursive
statements. That means parse cpu usage of recursive statements is included in
both parse time cpu and recursive cpu usage. In other words, it's duplicated and
formula above is not correct.

ubTools offers the following formula:

CPU used by this session = parse time cpu + others(exec_and_fetch_time_cpu)
<>

what is exec_and_fetch_time_cpu ?

Regards

Tom Kyte's answer:

 
I am not so sure they are correct.  unless they are talking about the
description of cpu used by this session (i is not clear to me whether they are
saying "the description is wrong" or "the value reported by the statistic is
wrong"

if the values were wrong, the cpu times reported for most things would exceed
elapsed time by large margins.  so, they should be able to demonstrate that for
us.

(and you would have to ask the author of an article in most cases "what did you
mean by this "exec and fetch time cpu" and how exactly do you think we could
find it)

I think they were saying "the description provided is wrong", but I have an
easier description.

cpu use by this session is cpu used by that session.

Our answer:

We said the current CPU breakdown formula is incorrect, not the description of Oracle statistics.

CPU used by this session is the total CPU usage in session or instance level. And, there are 3 components in CPU usage:

  • Parse
  • Exec
  • Fetch

These components can be seen in SQL_TRACE / EVENT10046 traces. Parse component is available by parse time cpu statistic. Since there is no Oracle statistic for Exec/Fetch components, we call them as others.

We had not mentioned values of the CPU usage statistics in this discussion. If we start talking about the values, it gets started another wrong topic on the values. Here is a brief explanation:

  • In busy environments, distortion on CPU measurement is minimal. In, non-busy environments, it may not be minimal. In many cases, there is no big performance problem in non-busy environments. So, the distortion on CPU usage doesn't make sense in many cases.
  • The wait measurement includes serious distortions in busy environments.

ubTools says for years that RESPONSE TIME ANALYSIS(RTA) CAN NOT BE IMPLEMENTED IN INSTANCE LEVEL. RTA IS A METHOD FOR SESSION LEVEL.

For the full details with the proven samples, see Microstate Response-time Performance Profiling (MRPP).





[QA-13] Dumping a stack trace is too slow in 10g. Created: 15/Jul/07  Updated: 16/Sep/07

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Database Tuning Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: 10.1.0.2
Operating System: Generic

 Description   

Error code:

ORA-4031 (may not be seen by end users).

Error code definition:

The CPU usage reaches 98% in KERNEL mode. strace utility on linux reports that the process spins on read() system call.

System calls:

An excerpt from strace -p <OSPID> output:

 
read(29, "<\345&\0\1\0\21\0\330Wk\10\0\0\0@\30\0\0\0\0\0\0\0", 24) = 24
read(29, "I\345&\0\1\0\22\0\230\226\217\10\0\0\0@\30\0\0\0\0\0\0"..., 24) = 24
read(29, "\20\1\0\0\1\0\24\0\260\n3\0\0\0\0`\20\0\0\0\0\0\0\0", 24) = 24
read(29, "\34\1\0\0\1\0\34\0`\200\353\0\0\0\0`\10\0\0\0\0\0\0\0", 24) = 24
read(29, "\'\1\0\0\1\0\34\0h\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24
read(29, "3\1\0\0\1\0\34\0l\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24
read(29, "V\345&\0\4\0\361\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 24) = 24
read(29, "`\345&\0\1\0\f\0\300XF\4\0\0\0@\4\0\0\0\0\0\0\0", 24) = 24
read(29, "k\345&\0\1\0\21\0\20Xk\10\0\0\0@\30\0\0\0\0\0\0\0", 24) = 24
read(29, "y\345&\0\1\0\22\0\260\226\217\10\0\0\0@\30\0\0\0\0\0\0"..., 24) = 24
read(29, "\20\1\0\0\1\0\24\0\300\n3\0\0\0\0`\20\0\0\0\0\0\0\0", 24) = 24
read(29, "\34\1\0\0\1\0\34\0p\200\353\0\0\0\0`\10\0\0\0\0\0\0\0", 24) = 24
read(29, "\'\1\0\0\1\0\34\0x\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24
read(29, "3\1\0\0\1\0\34\0|\200\353\0\0\0\0`\4\0\0\0\0\0\0\0", 24) = 24

The file descriptor is 29(the first argument in read() system call). By linux lsof command, the descriptor#29 is 1081732 /oracle/product/10.1.0/bin/oracle. In other words, the process is reading Oracle executable.

strace -c -p <OSPID> output for 1 minute:

 
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 21.570914 40 543060 read
0.00 0.000016 2 7 lseek
0.00 0.000003 3 1 getpid
0.00 0.000001 1 1 open
0.00 0.000001 1 1 readlink
------ ----------- ----------- --------- --------- ----------------
100.00 21.570935 543070 total

read() system call had been called 543,060 times per minute. That's why CPU utilization in KERNEL mode is high.

An excerpt from stack trace by OS debugger:

 
#0 0x200000000137fa81 in read () from /lib/tls/libpthread.so.0
#1 0x4000000004d7a7e0 in sskgds_getsnm ()
#2 0x4000000002766160 in skdsttpcs ()
#3 0x4000000001131920 in ksedst ()
#4 0x40000000011b0140 in ksm_4031_dump ()

ksm_4031_dump() function of Oracle dumps ORA-4031 traces. The top of the stack includes read() system calls.

Problem interpretation:

The process gets ORA-4031 error, then tries to dump trace file for this error. But, while dumping the trace, it spins on read() system calls.

Workaround:

Set the following parameters in pfile/spfile:

 
_4031_dump_bitvec        = 0
_4031_max_dumps         = 0

Bug:

Ref: Oracle Note:3964602 DUMPING A CALL STACK TRACE IS SLOW.






Generated at Thu Mar 28 17:38:05 UTC 2024 using JIRA Standard Edition, Version: 3.12.3-#302.