#1
|
| Marty,

The tid in this case is the thread id, and it can be used in iimonitor to find the session that is holding the mutex. It's likely that the session holding the DCB Mutex is itself blocked by something else, so once you find the session you can check the status of that session.

John

On Aug 28, 7:09 pm, "Martin Bowes" wrote:
> Hi everyone,
>
> I'm running II 9.1.1 (a64.lnx/103)NPTL + patch13001 on Red Hat
> Enterprise Linux Server release 5 (Tikanga).
>
> About once a week my servers freeze up and new connections stall in
> Mutex: DCB iidbdb
>
> This sucks big time.
>
> I've checked on the Ingres Tech support site and there are three old
> bugs listed against the Mutex but none of them seem relevant in this
> case.
>
> Can anyone shed some light on the details shown by show mutex?
>
> I get...
>
> show mutex 00002AAAB9806810
>
> Mutex at 00002AAAB9806810: Name: DCB iidbdb, EXCL owner: (tid: 1080768832, pid: 8227)
>
> Shared: 0 Collisions: 0 Hwm: 0
>
> Excl: 15 Collisions: 13
>
> The pid=8227 was the DBMS server pid.
>
> Does the tid tell me anything?
>
> Martin Bowes |
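For anyone who wants to script the lookup John describes, here is a minimal Python sketch. It only works on text you have already captured from iimonitor; it does not drive iimonitor itself. The two regular expressions are assumptions modeled on the single `show mutex` line and the single session line quoted in this thread, and the function names are made up for illustration, so check them against your own output before relying on them.

```python
import re

# Assumed layout of the owner line printed by "show mutex" (based on one sample).
MUTEX_OWNER_RE = re.compile(
    r"Name:\s*(?P<name>.+?),\s*(?P<mode>\w+)\s+owner:\s*"
    r"\(tid:\s*(?P<tid>\d+),\s*pid:\s*(?P<pid>\d+)\)"
)

# Assumed layout of a session line: "Session <hex sid>:<tid> (user) cs_state:<STATE> (<detail>)".
SESSION_RE = re.compile(
    r"Session\s+(?P<sid>[0-9A-Fa-f]+):(?P<tid>\d+)\s+\((?P<user>[^)]*)\)\s+"
    r"cs_state:\s*(?P<state>\w+)(?:\s+\((?P<detail>[^)]*)\))?"
)


def parse_mutex_owner(text):
    """Return name, mode, tid and pid of the mutex owner, or None if not found."""
    m = MUTEX_OWNER_RE.search(text)
    if m is None:
        return None
    return {
        "name": m["name"],
        "mode": m["mode"],
        "tid": int(m["tid"]),
        "pid": int(m["pid"]),
    }


def find_session_by_tid(session_lines, tid):
    """Return the parsed fields of the first session whose thread id equals tid."""
    for line in session_lines:
        m = SESSION_RE.search(line)
        if m and int(m["tid"]) == tid:
            return m.groupdict()
    return None


# Sample text copied from the posts in this thread.
mutex_text = (
    "Mutex at 00002AAAB9806810: Name: DCB iidbdb, "
    "EXCL owner: (tid: 1080768832, pid: 8227)"
)
session_lines = [
    "Session 00002AAAB9A9ED00:1080768832 (ingres) "
    "cs_state:CS_EVENT_WAIT (LOG) cs_mask:",
]

owner = parse_mutex_owner(mutex_text)
holder = find_session_by_tid(session_lines, owner["tid"])
print(holder["sid"], holder["state"], holder["detail"])
```

The state and wait detail of the matched session are the pieces to look at next, since they say what the mutex holder itself is waiting on.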
|
#2
|
| Hi John,

Thanks for the quick reply....

From the details I captured before restarting the system...

Session 00002AAAB9A9ED00:1080768832 (ingres) cs_state:CS_EVENT_WAIT (LOG) cs_mask:

At the time I stopped the server the following was captured by logstat:

logstat -header -statistics

======================Thu Aug 28 07:55:27 2008 Logging System Summary===========

    Database add                4147    Database removes            4073
    Transaction begins        789304    Transaction ends          789230
    Log read i/o's             15173    Log write i/o's           197624
    Log writes                460376    Log forces                 31180
    Log optimized writes         285    Log optimized pages         2607
    Log waits                  20055    Log splits                  8883
    Log group commit             877    Log group count              877
    Check commit timer             0    Timer write                    0
    Timer write, time              0    Timer write, idle              0
    Inconsistent Database          0    Kbytes written             57851
    ii_log_file read               8    ii_dual_log read           15165
    write complete             97709    dual write complete        97593
    All logwriters busy         1135    Max write queue len           60
    Max write queue cnt            1

    Log Waits By Type:
    Force                        583    Free Buffer                    0
    Split Buffer                   0    Log Header I/O                 0
    Ckpdb Stall                    0    Opendb                      5943
    BCP Stall                      1    Logfull Stall                  0
    Lastbuf                      154    Forced I/O                   140
    Event                      13234    Mini Transaction               0
    Logfull Commit                 0

----Buffer utilization profile--------------------------------------------------

    <10%    ***********************
    10-19%  ******************
    20-29%  *
    30-39%  ***********
    40-49%  *
    50-59%  *
    60-69%  ***
    70-79%  *
    80-89%  *
    >90%    *******

----Current log file header-----------------------------------------------------

    Block size: 4096    Block count: 512000    Partitions: 1    Buffer count: 2000
    CP interval: 10240    Logfull interval: 460800    Abort interval: 368640
    Last Transaction Id: 000048AE4955E972
    Last LSN: <48AE9A03,000C2FEB>
    Begin: <1219402243:162765:1260>    CP: <1219402243:162765:1260>    End: <1219402243:166765:1160>
    Forced LGA,LSN: <1219402243,166762,608>,<48AE9A03,000C2FE1>
    Percentage of log file in use or reserved: 0
    Log file blocks reserved by recovery system: 16
    Archive Window: <0,0,0>..<0,0,0>
    Previous CP: <0,0,0>
    Status: ONLINE,CPNEEDED,RECOVER,CLOSE_DB,BCPSTALL,JSWITCHDONE
    Active Log(s): II_LOG_FILE,II_DUAL_LOG

Marty

-----Original Message-----
From: info-ingres-bounces-at-kettleriverconsulting.com [mailto:info-ingres-bounces-at-kettleriverconsulting.com] On Behalf Of John Dennis
Sent: 28 August 2008 10:16
To: info-ingres-at-kettleriverconsulting.com
Subject: Re: [Info-Ingres] Mutex: DCB iidbdb

_______________________________________________
Info-Ingres mailing list
Info-Ingres-at-kettleriverconsulting.com
http://www.kettleriverconsulting.com...fo/info-ingres |
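A quick way to read the "Log Waits By Type" figures in the logstat capture is to rank them and check that they add up to the "Log waits" total. The Python sketch below does that with the numbers copied from the post; the dictionary and variable names are only for illustration, and the ranking is plain arithmetic on the captured counters, not an interpretation of what each wait type means inside the logging system.

```python
# Counters copied from the "Log Waits By Type" section of the logstat capture.
log_waits_by_type = {
    "Force": 583,
    "Free Buffer": 0,
    "Split Buffer": 0,
    "Log Header I/O": 0,
    "Ckpdb Stall": 0,
    "Opendb": 5943,
    "BCP Stall": 1,
    "Logfull Stall": 0,
    "Lastbuf": 154,
    "Forced I/O": 140,
    "Event": 13234,
    "Mini Transaction": 0,
    "Logfull Commit": 0,
}

# "Log waits" value from the summary section of the same capture.
reported_total = 20055

# The per-type counters should account for every reported wait.
computed_total = sum(log_waits_by_type.values())
print("totals match:", computed_total == reported_total)

# Rank the non-zero wait types, largest first, with their share of the total.
ranked = sorted(
    ((name, count) for name, count in log_waits_by_type.items() if count > 0),
    key=lambda item: item[1],
    reverse=True,
)
for name, count in ranked:
    share = 100.0 * count / computed_total
    print(f"{name:<16} {count:>6} {share:5.1f}%")
```

In this capture the per-type counters do sum to the reported total, and the Event and Opendb waits account for nearly all of it.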
|
#3
|
| Marty,

What does the logstat show for that session?

John

-----Original Message-----
From: info-ingres-bounces-at-kettleriverconsulting.com on behalf of Martin Bowes
Sent: Thu 28/08/2008 7:48 PM
To: Ingres and related product discussion forum
Cc: John Dennis
Subject: Re: [Info-Ingres] Mutex: DCB iidbdb |
|
#4
|
| On Aug 28, 2008, at 5:48 AM, Martin Bowes wrote:

> Status: ONLINE,CPNEEDED,RECOVER,CLOSE_DB,BCPSTALL,JSWITCHDONE

I might know what this is. There's a race in the free-buffer code in the logging system that freezes the installation when it happens to hit during a begin-CP stall. The fix is one of the relatively few Datallegro fixes (as opposed to enhancements) that haven't yet made it into the standard code line.

You might be able to make the stall go away, or at least be less frequent, by raising the log buffer count. That's not a sure thing though.

Karl |
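If you capture the logstat header periodically, a small script can flag when the BCPSTALL status persists across several samples, which is the condition Karl's explanation centers on. The Python sketch below is only a heuristic under stated assumptions: it works on already-captured "Status:" lines, the function names and the threshold of three consecutive samples are invented for illustration, and a persistent BCPSTALL flag does not by itself prove that the free-buffer race is the cause.

```python
import re


def status_flags(status_line):
    """Split a logstat "Status:" line into a list of flag names.

    Whitespace inside a flag is removed so that a flag broken by mail
    line wrapping is still read as one name.
    """
    value = status_line.split("Status:", 1)[-1]
    return [re.sub(r"\s+", "", flag) for flag in value.split(",") if flag.strip()]


def stalled_for(samples, min_consecutive=3):
    """Return True if the most recent min_consecutive samples all show BCPSTALL.

    samples is a list of flag lists, oldest first, one per logstat capture.
    The threshold is an arbitrary illustration value, not a recommendation.
    """
    if len(samples) < min_consecutive:
        return False
    return all("BCPSTALL" in flags for flags in samples[-min_consecutive:])


# Status line from the capture in this thread, plus a hypothetical healthy one.
frozen = "Status: ONLINE,CPNEEDED,RECOVER,CLOSE_DB,BCPSTALL,JSWITCHDONE"
healthy = "Status: ONLINE"

history = [status_flags(s) for s in (healthy, frozen, frozen, frozen)]
print(status_flags(frozen))
print("persistent BCP stall:", stalled_for(history))
```

Requiring several consecutive samples is a deliberate choice here: a begin-CP stall is presumably a normal, short-lived state, so a single sighting of the flag is not worth an alert.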
|
#5
|
| Hi John,

Sadly I only captured the logstat -header -statistics

Marty

-----Original Message-----
From: info-ingres-bounces-at-kettleriverconsulting.com [mailto:info-ingres-bounces-at-kettleriverconsulting.com] On Behalf Of John Dennis
Sent: 28 August 2008 11:52
To: Ingres and related product discussion forum
Subject: RE: [Info-Ingres] Mutex: DCB iidbdb |
|
#6
|
| Hi Karl,

Thanks for the input. I'll attach that to the issue.

FYI. I have 2000 x 4k log buffers.

Marty

-----Original Message-----
From: info-ingres-bounces-at-kettleriverconsulting.com [mailto:info-ingres-bounces-at-kettleriverconsulting.com] On Behalf Of Karl & Betty Schendel
Sent: 28 August 2008 12:06
To: Ingres and related product discussion forum
Subject: Re: [Info-Ingres] Mutex: DCB iidbdb |