<< Back to previous view

[QA-37] "ORA-01187: cannot read from file" in one of the RAC Node. Created: 08/May/08  Updated: 12/May/08

Status: Closed
Project: Questions & Answers
Fix Version/s: None

Type: Oracle - Administration Priority: Major
Reporter: ubTools Support Assignee: ubTools Support
Resolution: Answered Votes: 0

Product Version: Oracle 10g 10.2.0.3
Operating System: Linux
Operating System Version: RHEL 4

 Description   
The one of RAC Nodes encounters the following error codes while no problem occurs on the other node:

From ALERT LOG:

Generic Alert Log Error May 2, 2008 11:21:30 PM ORA-12012: error on auto execute of job 8913
ORA-01187: cannot read from file ORA-01187: cannot read from file 96 because it failed verification tests
ORA-01110: data file 96: '/u64/oradata/DMSDB/LBPRD_IDX_SKU_005DMSDB.dbf'
ORA-06512: at "SYS.PRVT_ADVISOR", line 1624
ORA-06512: at "SYS.DBMS_ADVISOR", line 186
ORA-06512: at "SYS.DBMS_SPACE", line 1347
ORA-06512: at "SYS.DBMS_SPACE", line 1566
because it failed verification tests
Trace File:  /u00/app/oracle/oracle/admin/DMSDB/bdump/dmsdb2_j000_21660.trc 

From dmsdb2_j000_21660.trc:

*** 2008-04-07 23:03:34.750
GATHER_STATS_JOB: GATHER_TABLE_STATS('"LOADOWNER"','"MARKEDPRODUCT"','"MARKEDPRODUCT_20080406"', ...)
ORA-01187: cannot read from file 88 because it failed verification tests
ORA-01110: data file 88: '/u64/oradata/DMSDB/MRK_IDX_004DMSDB.dbf'


 Comments   
Comment by ubTools Support [ 08/May/08 12:32 PM ]
From the trace file, the problem can be reproduced by DBMS_STATS.GATHER_TABLE_STATS().

The following strace output will give the system calls:

strace -f -o strace.log sqlplus / as sysdba <<EOF
exec DBMS_STATS.GATHER_TABLE_STATS('<owner>','<tableName>','<partitionName',1,DEGREE=>2);
exit;
EOF
Comment by ubTools Support [ 08/May/08 12:35 PM ]
An excerpt from strace.log that the db files were opened with O_DIRECT flag:
16822 open("/u51/oradata/DMSDB/system001DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LAR
GEFILE) = 14
...
16822 open("/u31/oradata/DMSDB/ctl1DMSDB.ctl", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFIL
E) = 15
16822 open("/u32/oradata/DMSDB/ctl2DMSDB.ctl", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFIL
E) = 16
16822 open("/u33/oradata/DMSDB/ctl3DMSDB.ctl", O_RDWR|O_SYNC|O_DIRECT|O_LARGEFIL
E) = 17
16822 open("/u52/oradata/DMSDB/undotbs002DMSDB2.dbf", O_RDWR|O_SYNC|O_DIRECT|O_L
ARGEFILE) = 18
...
16822 open("/u70/oradata/DMSDB/TEMPDMSDB_002.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LARG
EFILE) = 20
16822 open("/u70/oradata/DMSDB/TEMPDMSDB_002.dbf", O_RDWR|O_DIRECT|O_LARGEFILE)
= 21
...
16822 open("/u52/oradata/DMSDB/sysaux001DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LAR
GEFILE) = 28
16822 open("/u51/oradata/DMSDB/system002DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_LAR
GEFILE) = 29
16822 open("/u55/oradata/DMSDB/undotbs001DMSDB1.dbf", O_RDWR|O_SYNC|O_DIRECT|O_L
ARGEFILE) = 30
16822 open("/u61/oradata/DMSDB/MRK_IDX_001DMSDB.dbf", O_RDWR|O_SYNC|O_DIRECT|O_L
ARGEFILE) = 31
...

The second excerpt from strace.log that the db files were opened without O_DIRECT flag:

16822 open("/u65/oradata/DMSDB/dor_data_200805_001_DMSDB.dbf", O_RDWR|O_SYNC|O_L
ARGEFILE) = 19
16822 open("/u65/oradata/DMSDB/dor_data_200805_002_DMSDB.dbf", O_RDWR|O_SYNC|O_L
ARGEFILE) = 27
...
16822 open("/u64/oradata/DMSDB/MRK_IDX_004DMSDB.dbf", O_RDWR|O_SYNC|O_LARGEFILE)
 = 32
Comment by ubTools Support [ 08/May/08 12:52 PM ]
The customer uses OCFS2.

O_DIRECT flag of open() system call bypasses File System(FS) cache; and DISK-IO occurs between user address space and disk.

OCFS opens dbfiles with O_DIRECT flag to eliminate inconsistency among FS caches of nodes. Since RAC provides consistency among SGAs and there will be no db buffers is FS cache, no consistency problem occurs.

From Ref: Oracle Note:391771.1:

48. Any special flags to run Oracle RAC?
OCFS2 volumes containing the Voting diskfile (CRS), Cluster registry (OCR),
Data files, Redo logs, Archive logs and Control files must be mounted
with the datavolume and nointr mount options.
The datavolume option ensures that the Oracle processes opens these files
with the o_direct flag.
The nointr option ensures that the ios are not interrupted by signals.

# mount -o datavolume,nointr -t ocfs2 /dev/sda1 /u01/db

The customer was not using the datavolume,nointr option. After mounting with the datavolume,nointr, the problem has been solved.

Generated at Fri May 02 04:26:07 UTC 2025 using JIRA Standard Edition, Version: 3.12.3-#302.