ORA-00600: internal error code, arguments: [kcbzwfcro_2] 故障案例一则

上周末,一套RAC生产库实例1 alert日志中出现大量ORA-00600报错,触发了监控报警,当时值班人员重启数据库实例后恢复。实例重启应急操作过程中正常一致性关闭未果,而是kill lgwr进程得以让实例终止。

Sat Feb 08 02:34:03 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc  (incident=193747):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Incident details in: /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/incident/incdir_193747/Travelskydba1_p000_166150_i193747.trc
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 02:34:05 2020
 Dumping diagnostic data in directory=[cdmp_20200208023405], requested by (instance=1, osid=166150 (P000)), summary=[incident=193747].
 Sat Feb 08 02:34:05 2020
 Sweep [inc][193747]: completed
 Sweep [inc2][193747]: completed
 Sat Feb 08 02:36:14 2020
 ALTER SYSTEM ARCHIVE LOG
 Sat Feb 08 02:36:14 2020
 Thread 1 advanced to log sequence 29734 (LGWR switch)
   Current log# 2 seq# 29734 mem# 0: +DG_REDO/Travelskydba/onlinelog/group_2.258.975692641
   Current log# 2 seq# 29734 mem# 1: +DG_REDO/Travelskydba/onlinelog/group_2.259.975692641
 Sat Feb 08 02:36:14 2020
 LNS: Standby redo logfile selected for thread 1 sequence 29734 for destination LOG_ARCHIVE_DEST_2
 Archived Log entry 179716 added for thread 1 sequence 29733 ID 0xd45ba67e dest 1:
 ALTER SYSTEM ARCHIVE LOG
 Sat Feb 08 02:36:25 2020
 Thread 1 advanced to log sequence 29735 (LGWR switch)
   Current log# 3 seq# 29735 mem# 0: +DG_REDO/Travelskydba/onlinelog/group_3.260.975692643
   Current log# 3 seq# 29735 mem# 1: +DG_REDO/Travelskydba/onlinelog/group_3.261.975692645
 Sat Feb 08 02:36:25 2020
 Archived Log entry 179724 added for thread 1 sequence 29734 ID 0xd45ba67e dest 1:
 Sat Feb 08 02:36:25 2020
 LNS: Standby redo logfile selected for thread 1 sequence 29735 for destination LOG_ARCHIVE_DEST_2
 Sat Feb 08 02:39:25 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p046_192910.trc  (incident=195547):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Incident details in: /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/incident/incdir_195547/Travelskydba1_p046_192910_i195547.trc
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Sat Feb 08 02:39:26 2020
 Dumping diagnostic data in directory=[cdmp_20200208023926], requested by (instance=1, osid=192910 (P046)), summary=[incident=195547].
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p046_192910.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p046_192910.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 02:39:26 2020
 SMON: Restarting fast_start parallel rollback
 Sat Feb 08 02:39:27 2020
 Sweep [inc][195547]: completed
 Sweep [inc2][195547]: completed
 Sat Feb 08 02:44:39 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p002_190908.trc  (incident=193795):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Incident details in: /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/incident/incdir_193795/Travelskydba1_p002_190908_i193795.trc
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p002_190908.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p002_190908.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 02:44:41 2020
 Dumping diagnostic data in directory=[cdmp_20200208024441], requested by (instance=1, osid=190908 (P002)), summary=[incident=193795].
 Sat Feb 08 02:44:43 2020
 Sweep [inc][193795]: completed
 Sweep [inc2][193795]: completed
 Sat Feb 08 02:50:02 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_53435.trc  (incident=193859):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Incident details in: /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/incident/incdir_193859/Travelskydba1_p005_53435_i193859.trc
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_53435.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_53435.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 02:50:03 2020
 SMON: Restarting fast_start parallel rollback
 Sat Feb 08 02:50:03 2020
 Dumping diagnostic data in directory=[cdmp_20200208025003], requested by (instance=1, osid=53435 (P005)), summary=[incident=193859].
 Sat Feb 08 02:50:04 2020
 Sweep [inc][193859]: completed
 Sweep [inc2][193859]: completed
 Sat Feb 08 02:55:15 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc  (incident=193748):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Incident details in: /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/incident/incdir_193748/Travelskydba1_p000_166150_i193748.trc
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Sat Feb 08 02:55:16 2020
 Dumping diagnostic data in directory=[cdmp_20200208025516], requested by (instance=1, osid=166150 (P000)), summary=[incident=193748].
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 02:55:19 2020
 Sweep [inc][193748]: completed
 Sweep [inc2][193748]: completed
 Sat Feb 08 03:00:37 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_104212.trc  (incident=193796):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_104212.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_104212.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:00:37 2020
 SMON: Restarting fast_start parallel rollback
 Sat Feb 08 03:05:49 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc  (incident=193749):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:11:10 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_155132.trc  (incident=193787):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_155132.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_155132.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:11:10 2020
 SMON: Restarting fast_start parallel rollback
 Sat Feb 08 03:13:29 2020
 Stopping background process CJQ0
 Sat Feb 08 03:16:22 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc  (incident=193750):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:21:43 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_17817.trc  (incident=193531):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_17817.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_17817.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:21:43 2020
 SMON: Restarting fast_start parallel rollback
 Sat Feb 08 03:26:55 2020
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:32:15 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_65986.trc  (incident=193491):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_65986.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_65986.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:32:15 2020
 SMON: Restarting fast_start parallel rollback
 Sat Feb 08 03:36:11 2020
 ALTER SYSTEM ARCHIVE LOG
 Sat Feb 08 03:36:12 2020
 Thread 1 advanced to log sequence 29736 (LGWR switch)
   Current log# 4 seq# 29736 mem# 0: +DG_REDO/Travelskydba/onlinelog/group_4.262.975692647
   Current log# 4 seq# 29736 mem# 1: +DG_REDO/Travelskydba/onlinelog/group_4.263.975692649
 Sat Feb 08 03:36:12 2020
 LNS: Standby redo logfile selected for thread 1 sequence 29736 for destination LOG_ARCHIVE_DEST_2
 Sat Feb 08 03:36:13 2020
 Archived Log entry 179730 added for thread 1 sequence 29735 ID 0xd45ba67e dest 1:
 Sat Feb 08 03:36:24 2020
 ALTER SYSTEM ARCHIVE LOG
 Sat Feb 08 03:36:27 2020
 Thread 1 advanced to log sequence 29737 (LGWR switch)
   Current log# 5 seq# 29737 mem# 0: +DG_REDO/Travelskydba/onlinelog/group_5.264.975692651
   Current log# 5 seq# 29737 mem# 1: +DG_REDO/Travelskydba/onlinelog/group_5.265.975692651
 Sat Feb 08 03:36:27 2020
 LNS: Standby redo logfile selected for thread 1 sequence 29737 for destination LOG_ARCHIVE_DEST_2
 Archived Log entry 179738 added for thread 1 sequence 29736 ID 0xd45ba67e dest 1:
 Sat Feb 08 03:37:28 2020
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p000_166150.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:39:16 2020
 Shutting down instance (immediate)
 Stopping background process SMCO
 Shutting down instance: further logons disabled
 Stopping background process QMNC
 Stopping background process MMNL
 Stopping background process MMON
 License high water mark = 375
 Sat Feb 08 03:42:48 2020
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_114400.trc  (incident=193539):
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Incident details in: /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/incident/incdir_193539/Travelskydba1_p005_114400_i193539.trc
 Use ADRCI or Support Workbench to package the incident.
 See Note 411.1 at My Oracle Support for error and packaging details.
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_114400.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Errors in file /opt/app/ora11g/diag/rdbms/Travelskydba/Travelskydba1/trace/Travelskydba1_p005_114400.trc:
 ORA-10388: parallel query server interrupt (failure)
 ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
 Sat Feb 08 03:42:49 2020
 SMON: Restarting fast_start parallel rollback


从alert日志中我们发现,基本上ORA-00600与ORA-10388成对出现,并且SMON在回滚并行进程,ORA-00600的trace文件也是并行进程-Travelskydba1_p00产生。

查看报错trace文件:

*** 2020-02-08 02:55:15.334
*** SESSION ID:(3707.19965) 2020-02-08 02:55:15.334
*** SERVICE NAME:(SYS$BACKGROUND) 2020-02-08 02:55:15.334
 
Incident 193748 created, dump file: /opt/app/ora11g/diag/rdbms/ora624a/ora624a1/incident/incdir_193748/ora624a1_p000_166150_i193748.trc
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []

fatal error occurred while scanning redo
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
Error 10388 occurred while dumping log
ORA-10388: parallel query server interrupt (failure)
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
Parallel Transaction recovery coordinatorcaught exception 10388

*** 2020-02-08 02:55:25.706
*** SESSION ID:(3707.19967) 2020-02-08 02:55:25.707
*** SERVICE NAME:(SYS$BACKGROUND) 2020-02-08 02:55:25.707
 
Parallel Transaction recovery coordinatorcaught exception 10388
DDE: Problem Key 'ORA 600 [kcbzwfcro_2]' was flood controlled (0x2) (incident: 193749)

*** 2020-02-08 03:05:49.294
*** SESSION ID:(3709.435) 2020-02-08 03:05:49.294
*** SERVICE NAME:(SYS$BACKGROUND) 2020-02-08 03:05:49.294
 
ORA-00600: internal error code, arguments: [kcbzwfcro_2], [136046], [1], [32768], [0], [], [], [], [], [], [], []
Potentially stale force-CR-override buffer found before OBJD MISMATCH check.
This issue should be investigated by both cache fusion and space layer.
BH (0x132f60d8d8) file#: 15 rdba: 0x03c01ae3 (15/6883) class: 1 ba: 0x132e6be000
  set: 75 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,3
  dbwrid: 2 obj: 136046 objn: 136046 tsn: 10 afn: 15 hint: f
  hash: [0x164ee2ecd0,0xe6f484db0] lru: [0xf1f515308,0xf1f51b968]
  ckptq: [NULL] fileq: [NULL] objq: [0x106f19fe10,0x12cec34f10] objaq: [0xf2f496400,0xf2f473b80]
  st: XCURRENT md: NULL fpin: 'kdgwh05: kdglfe' tch: 0 le: 0x3afe991a8
  flags: block_written_once redo_since_read remote_transfered
          force_cr_override
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [1]
GLOBAL CACHE ELEMENT DUMP (address: 0x3afe991a8):
  id1: 0x1ae3 id2: 0xf pkey: OBJ#136046 block: (15/6883)
  lock: X rls: 0x0 acq: 0x0 latch: 23
  flags: 0x20 fair: 0 recovery: 0 fpin: 'kdgwh05: kdglfe'
  bscn: 0x0.1c9bcdb5 bctx: (nil) write: 0 scan: 0x0 
  lcp: (nil) lnk: [NULL] lch: [0x132f60da10,0x132f60da10]
  seq: 72 hist: 113 238 180 113 238 180 113 238 180 113 238 180 113 238 180 113
    238 180 113 238
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x0a200000 sflg: 0x2000 state: XCURRENT tsn: 10 tsh: 0
      addr: 0x132f60d8d8 obj: 136046 cls: DATA bscn: 0x0.1c9bcdb5
  buffer tsn: 10 rdba: 0x03c01ae3 (15/6883)
  scn: 0x0000.1c9bcdb5 seq: 0x01 flg: 0x04 tail: 0xcdb50601
  frmt: 0x02 chkval: 0x84e4 type: 0x06=trans data
堆栈信息:
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFDF8B56F90 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
ksedst1()+103        call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFDF8B56F90 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
ksedst()+39          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFDF8B56F90 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
dbkedDefDump()+2746  call     ksedst()             000000000 ? 000000001 ?
                                                   7FFDF8B56F90 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
ksedmp()+41          call     dbkedDefDump()       000000003 ? 000000002 ?
                                                   7FFDF8B56F90 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
ksfdmp()+69          call     ksedmp()             000000003 ? 000000002 ?
                                                   7FFDF8B56F90 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
dbgexPhaseII()+1764  call     ksfdmp()             000000003 ? 000000002 ?
                                                   7FFDF8B56F90 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
dbgexExplicitEndInc  call     dbgexPhaseII()       7F96E5CAD730 ? 7F96E5CB4700 ?
()+755                                             7FFDF8B5FDA8 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
dbgeEndDDEInvocatio  call     dbgexExplicitEndInc  7F96E5CAD730 ? 7F96E5CB4700 ?
nImpl()+769                   ()                   7FFDF8B5FDA8 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
dbgeEndSpltInvokOnR  call     dbgeEndDDEInvocatio  7F96E5CAD730 ? 7F96E5CB4700 ?
ec()+270                      nImpl()              7FFDF8B5FDA8 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
dbgePostErrorKGE()+  call     dbgeEndSpltInvokOnR  7F96E5CAD730 ? 7F96E5CB4700 ?
257                           ec()                 7FFDF8B5FDA8 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   00C113DA0 ? 7F96E5CBCCC0 ?
71                                                 000002894 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
kgeselv()+276        call     dbkePostKGE_kgsf()   00C113DA0 ? 7F96E5CBCCC0 ?
                                                   000002894 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000002 ?
ksesecl0()+162       call     kgeselv()            00C113DA0 ? 7F96E5CBCCC0 ?
                                                   000002894 ? 7FFDF8B57068 ?
                                                   7FFDF8B5BB10 ? 000000000 ?
ksucin()+147         call     ksesecl0()           00C113DA0 ? 7F96E5CBCCC0 ?
                                                   000002894 ? 000000000 ?
                                                   0000000B0 ? 000000000 ?
krr_thread_read()+8  call     ksucin()             00C113DA0 ? 7F96E5CBCCC0 ?
067                                                000002894 ? 000000000 ?
                                                   0000000B0 ? 000000000 ?
krr_read_buffer()+2  call     krr_thread_read()    7F96E42A0888 ? 000000002 ?
8                                                  000000000 ? 000000000 ?
                                                   7FFDF8B615A0 ? 7FFDF8B615A4 ?
krr_parse_redo()+12  call     krr_read_buffer()    7F96E42A0888 ? 000000002 ?
512                                                000000000 ? 000000000 ?
                                                   7FFDF8B615A0 ? 7FFDF8B615A4 ?
kcra_scan_redo()+11  call     krr_parse_redo()     7F96E42A0888 ? 000000002 ?
867                                                000000000 ? 000000000 ?
                                                   000000000 ? 7FFDF8B615A4 ?
kcra_dump_redo()+21  call     kcra_scan_redo()     7FFDF8B64560 ? 000000001 ?
04                                                 7FFDF8B640CC ? 000000000 ?
                                                   000000000 ? 7FFDF8B615A4 ?
kcra_dump_redo_inte  call     kcra_dump_redo()     7FFDF8B64560 ? 000000001 ?
rnal()+3362                                        7FFDF8B640CC ? 000000000 ?
                                                   000000000 ? 7FFDF8B615A4 ?
kcbzwfcro()+997      call     kcra_dump_redo_inte  7FFDF8B64C40 ? 7FFDF8B64C44 ?
                              rnal()               7FFDF8B64C48 ? 000000001 ?
                                                   1300000000 ? 1F00000000 ?
kcbgcur()+10141      call     kcbzwfcro()          132F60D8D8 ? 7FFDF8B685B0 ?
                                                   1F3953EC68 ? 000000001 ?
                                                   1300000000 ? 1F00000000 ?
ktbgcur()+123        call     kcbgcur()            7FFDF8B685B0 ? 000000002 ?
                                                   0000003C5 ? 000000000 ?
                                                   1300000000 ? 1F00000000 ?
kturGetCurPinForUnd  call     ktbgcur()            7FFDF8B685A0 ? 000000002 ?
o()+264                                            0000003C5 ? 000000000 ?
                                                   1300000000 ? 1F00000000 ?
kdiulk()+16575       call     kturGetCurPinForUnd  7FFDF8B685A0 ? 000000002 ?
                              o()                  0000003C5 ? 000000000 ?
                                                   0063CDCD8 ? 1F00000000 ?
kcoubk()+366         call     kdiulk()             7F96E5CC0E10 ? 18A5AF8518 ?
                                                   000000004 ? 000000080 ?
                                                   000000000 ? 1F00000000 ?
ktprundo_intra()+10  call     kcoubk()             7FFDF8B6874C ? 7F96E5CC0E10 ?
73                                                 18A5AF8518 ? 000000004 ?
                                                   000000080 ? 000000000 ?
ktprits()+1218       call     ktprundo_intra()     18A5620EA0 ? 18AD9DA7D0 ?
                                                   18A5AF8518 ? 000000004 ?
                                                   000000080 ? 000000000 ?
ktprmai()+213        call     ktprits()            7FFDF8B68A40 ? 18AD9DA7D0 ?
                                                   18A5AF8518 ? 000000004 ?
                                                   000000080 ? 000000000 ?
kxfprdp()+1466       call     ktprmai()            7FFDF8B68A40 ? 18AD9DA7D0 ?
                                                   18A5AF8518 ? 000000004 ?
                                                   000000080 ? 000000000 ?
opirip()+913         call     kxfprdp()            7FFDF8B68A40 ? 18AD9DA7D0 ?
                                                   1F4D61F8A0 ? 000000004 ?
                                                   000000080 ? 000000000 ?
opidrv()+603         call     opirip()             000000032 ? 000000004 ?
                                                   7FFDF8B69FE8 ? 000000004 ?
                                                   000000080 ? 000000000 ?
sou2o()+103          call     opidrv()             000000032 ? 000000004 ?
                                                   7FFDF8B69FE8 ? 000000004 ?
                                                   000000080 ? 000000000 ?
opimai_real()+250    call     sou2o()              7FFDF8B69FC0 ? 000000032 ?
                                                   000000004 ? 7FFDF8B69FE8 ?
                                                   000000080 ? 000000000 ?
ssthrdmain()+265     call     opimai_real()        000000000 ? 7FFDF8B6A1B0 ?
                                                   000000004 ? 7FFDF8B69FE8 ?
                                                   000000080 ? 000000000 ?
main()+201           call     ssthrdmain()         000000003 ? 7FFDF8B6A1B0 ?
                                                   000000001 ? 000000000 ?
                                                   000000080 ? 000000000 ?
__libc_start_main()  call     main()               000000003 ? 7FFDF8B6A350 ?
+253                                               000000001 ? 000000000 ?
                                                   000000080 ? 000000000 ?
_start()+41          call     __libc_start_main()  000A2A344 ? 000000001 ?
                                                   7FFDF8B6A348 ? 000000000 ?
                                                   000000080 ? 000000000 ?

根据trace文件,似乎与RAC间CR Block传输覆盖版本冲突有关:Potentially stale force-CR-override buffer found before OBJD MISMATCH check.

带着疑惑搜寻Mos(Oracle Support)后,确认此问题为Bug导致(ORA-600:[kcbzwfcro_2] Reported in Alert.log (Doc ID 2085507.1)

故障现象,Call Stack信息基本一致,并且Oracle给予了Workround

_gc_override_force_cr参数简单解释:if TRUE, try to override force-cr requests,在Undo header中会标识这个CR块是否为force_cr_override的flag,但是这个代表何意没有找到具体的官方解释,猜测是在节点间传输时本地CR与远程CR校验时出现错误,本地与远程均认为自己是最新版本CR。本次故障找到原因,但是此参数的含义还有继续找寻资料确认。