| Register | FAQ | Calendar | Search | Today's Posts | Mark Forums Read |
|
#1
|
| select * from monProcessWaits w, monWaitEventInfo i where w.SPID=54 and i.WaitEventID=w.WaitEventID order by WaitTime desc go SPID KPID WaitEventID Waits WaitTime WaitEventID WaitClassID Description ------ ----------- ----------- ----------- ----------- ----------- ----------- -------------------------------------------------- 82 49938512 266 1633 50773300 266 9 waiting for message in worker thread mailbox 82 49938512 250 283 8864200 250 7 waiting for incoming network data 82 49938512 51 125603 123400 51 3 waiting for disk write to complete 82 49938512 29 290144 33700 29 2 wait for buffer read to complete 82 49938512 214 113351 17700 214 1 waiting on run queue after yield 82 49938512 55 98269 15300 55 3 waiting for disk write to complete 82 49938512 31 93558 15100 31 3 wait for buffer write to complete 82 49938512 124 51347 7500 124 9 wait for someone else to finish reading in mass 82 49938512 52 3698 1200 52 3 waiting for disk write to complete 82 49938512 36 178 1100 36 6 wait for mass to stop changing 82 49938512 169 145 400 169 9 wait for message 82 49938512 251 227 400 251 8 waiting for network send to complete 82 49938512 171 38 200 171 8 waiting for CTLIB event to complete 82 49938512 150 122 100 150 5 waiting for semaphore 82 49938512 37 1 0 37 6 wait for mass to finish changing 82 49938512 41 3 0 41 9 wait to acquire latch 82 49938512 48 4 0 48 6 wait for write on buffer to complete 82 49938512 53 3 0 53 6 waiting for mass to finish changing 82 49938512 54 12 0 54 3 waiting for disk write to complete 82 49938512 272 1 0 272 9 waiting for lock on PLC Hello, We are busy investigating performance issues on our ASE servers and concerned about the amount of WaitTime(in Milliseconds), lost between various events, These timings were gathered from the monProcessWaits and monWaitEventInfo MDA tables. Is it advisable to increase the default settings on the ASE, for "Number of Mailboxes" and "Number of Messages", given these stats? The documentation instructs me to only increase it if TECH SUPPORT tells me to do so. but these readings are really bothering me and I am thinking we are hitting configurable limits? Our system is BCP based and has one user connected to the database at any one time. The particular process. SPID 54 is a parent process that calls individual procedures, but has suffered a painful decrease of 11 hours to 29 hours processing time. Any suggestions on the TOP 12 WaitTimes will be fantastic, thanks guys, L ASE Version: Adaptive Server Enterprise/12.5.3/EBF 13325 ESD#7/P/Sun_svr4/OS 5.8/ase1253/1951/64-bit/FBO/Fri Mar 24 11:00:22 2006 |
|
#2
|
| Hard to say without seeing 2 samples at a decent interval - result is that the below is from first connect time for each SPID/KPID. Likely problem is that optimizer has chosen a parallel query plan due to no other reasonable option (i.e. lack of appropriate indexes consequently it will do a parallel partition scan or parallel hash scan of table) - or even more likely a parallel join strategy that is based on a unindexed join or other nasty issue. If SPID 54 is the parent, then it is more likely that given the below is all SPID 82, that SPID 82 is only one of several child processes of the SPID 54 parallel query. Increasing the number of mailboxes or other tuning likely not going to help. Suggest find the problem query/tables involved and reviewing the query plans. Since this appears to be a 12.5.x based system, suggest using monSysSQLText/monSysStatement to identify the queries of interest. Look for queries with high LogicalIO + PhysicalIO counts initially (although from the below PhysicalIO's may not be the problem ). Can't tell for sure from just one sample, but it does not appear that disk IO speed is the problem as disk writes are in the 1ms range (event 51) or worse case 7ms range (event 31 - diff context though). You do have some Physical Reads (~300K) but don't know the timeframe since this started - if it is a child process of a parallel query, it is definitely a bell ringer of the point I was making earlier of lack of indexing causing a bad parallel query plan. liezlnel wrote: > select * from monProcessWaits w, > monWaitEventInfo i where w.SPID=54 > and i.WaitEventID=w.WaitEventID > order by WaitTime desc > go > > SPID KPID WaitEventID Waits WaitTime > WaitEventID WaitClassID Description > ------ ----------- ----------- ----------- ----------- > ----------- ----------- > -------------------------------------------------- > 82 49938512 266 1633 50773300 > 266 9 waiting for message in worker > thread mailbox > 82 49938512 250 283 8864200 > 250 7 waiting for incoming network data > 82 49938512 51 125603 123400 > 51 3 waiting for disk write to > complete > 82 49938512 29 290144 33700 > 29 2 wait for buffer read to complete > 82 49938512 214 113351 17700 > 214 1 waiting on run queue after yield > 82 49938512 55 98269 15300 > 55 3 waiting for disk write to > complete > 82 49938512 31 93558 15100 > 31 3 wait for buffer write to complete > 82 49938512 124 51347 7500 > 124 9 wait for someone else to finish > reading in mass > 82 49938512 52 3698 1200 > 52 3 waiting for disk write to > complete > 82 49938512 36 178 1100 > 36 6 wait for mass to stop changing > 82 49938512 169 145 400 > 169 9 wait for message > 82 49938512 251 227 400 > 251 8 waiting for network send to > complete > 82 49938512 171 38 200 > 171 8 waiting for CTLIB event to > complete > 82 49938512 150 122 100 > 150 5 waiting for semaphore > 82 49938512 37 1 0 > 37 6 wait for mass to finish changing > 82 49938512 41 3 0 > 41 9 wait to acquire latch > 82 49938512 48 4 0 > 48 6 wait for write on buffer to > complete > 82 49938512 53 3 0 > 53 6 waiting for mass to finish > changing > 82 49938512 54 12 0 > 54 3 waiting for disk write to > complete > 82 49938512 272 1 0 > 272 9 waiting for lock on PLC > > > > Hello, > > We are busy investigating performance issues on our ASE > servers and concerned about the amount of WaitTime(in > Milliseconds), > lost between various events, > These timings were gathered from the monProcessWaits and > monWaitEventInfo MDA tables. > > Is it advisable to increase the default settings on the > ASE, for "Number of Mailboxes" and "Number of Messages", > given these stats? > > The documentation instructs me to only increase it if > TECH SUPPORT tells me to do so. but these readings are > really bothering me and I am thinking we are hitting > configurable limits? > > Our system is BCP based and has one user connected to > the database at any one time. > > The particular process. SPID 54 is a parent process that > calls individual procedures, > but has suffered a painful decrease of 11 hours to 29 > hours processing time. > > Any suggestions on the TOP 12 WaitTimes will be > fantastic, thanks guys, L > > ASE Version: > Adaptive Server Enterprise/12.5.3/EBF 13325 > ESD#7/P/Sun_svr4/OS 5.8/ase1253/1951/64-bit/FBO/Fri Mar > 24 > 11:00:22 2006 |
|
#3
|
| > On 2008-08-26 01:14:15 +1000, liezlnel said: > Any suggestions on the TOP 12 WaitTimes will be > fantastic, thanks guys, L This is not much to go on. Better if you can provide sp_sysmon for a reasonable duration (eg an hour or two). 1 Your parallelism is not set up apprpriately. The config parms related to parallelism need to set in concert with each other (and they are not). (That could lead to improving config parms overall (not limited to parallelism). ) After you fix parallelism, you may not have Mailbox/Message waits. 2 Use bcp ... -A2048 or -A4096 3 Disk writes are waited for. Ensure you are running at optimum for your disk/SAN layout. 4 Buffer reads and writes are waited for. Check your caches and cache stats. 5 Addressing Run Queue waits will have to wait for you to address the above four first. You may already be CPU bound. -- Cheers Derek Senior Sybase DBA / Information Architect Copyright © 2008 Software Gems Pty Ltd Quality Standards = Zero Maintenance + Zero Surprises Performance Standards = Predictability + Scaleability |
|
#4
|
| Would you happen to have the query??? It looks like an extended group by and they are easily mangled into partial cartesians. liezlnel wrote: > First of all, let me apologise for the typo SPID number 54. > The SPID's on the output were all numbered 82 and was one > user and one process that normally kicks off multiple > processes below it, but this spid belonged to the parent. > > Secondly, thanks so much for the feedback so far, it is > really helpful. I can get anything out of the MDA tables, > but can't run sp_sysmon on production as we are worried that > this will reset the table statistics? > > By the time of the initial event info from the MDA tables, > the attached showplan for spid 82 was extracted. > > I also include the sp_configure output after that(also > attached) > > Our DBA's have some parallel processing settings switched on > at the moment, but we are not making full use of this > performance enhancement as yet. For example kicking of > multiple bcp processes at the same time and partitioning > tables and other methods, in order to benefit from the > multiple worker processes etc. > > Our biggest issue is that the lifespan of our most > problematic tables are short. They get dropped and > recreated before each BCP load. > So the traditional Update stats, Index building and > perf/tune basics are not always possible, however we do > build indexes on most of the tables before major processing > are done on them and before they are recycled and > dropped/created for a new load. > > Your comments are well appreciated > Liezl > > > > > >> select * from monProcessWaits w, >> monWaitEventInfo i where w.SPID=82 >> and i.WaitEventID=w.WaitEventID >> order by WaitTime desc >> go >> >> SPID KPID WaitEventID Waits WaitTime >> WaitEventID WaitClassID Description >> ------ ----------- ----------- ----------- ----------- >> ----------- ----------- >> -------------------------------------------------- >> 82 49938512 266 1633 50773300 >> >> 266 9 waiting for message in worker >> thread mailbox >> 82 49938512 250 283 8864200 >> >> 250 7 waiting for incoming network >> data >> 82 49938512 51 125603 123400 >> >> 51 3 waiting for disk write to >> complete >> 82 49938512 29 290144 33700 >> >> 29 2 wait for buffer read to >> complete >> 82 49938512 214 113351 17700 >> >> 214 1 waiting on run queue after >> yield >> 82 49938512 55 98269 15300 >> >> 55 3 waiting for disk write to >> complete >> 82 49938512 31 93558 15100 >> >> 31 3 wait for buffer write to >> complete >> 82 49938512 124 51347 7500 >> >> 124 9 wait for someone else to finish >> reading in mass >> 82 49938512 52 3698 1200 >> >> 52 3 waiting for disk write to >> complete >> 82 49938512 36 178 1100 >> >> 36 6 wait for mass to stop changing >> 82 49938512 169 145 400 >> >> 169 9 wait for message >> 82 49938512 251 227 400 >> >> 251 8 waiting for network send to >> complete >> 82 49938512 171 38 200 >> >> 171 8 waiting for CTLIB event to >> complete >> 82 49938512 150 122 100 >> >> 150 5 waiting for semaphore >> 82 49938512 37 1 0 >> >> 37 6 wait for mass to finish >> changing >> 82 49938512 41 3 0 >> >> 41 9 wait to acquire latch >> 82 49938512 48 4 0 >> >> 48 6 wait for write on buffer to >> complete >> 82 49938512 53 3 0 >> >> 53 6 waiting for mass to finish >> changing >> 82 49938512 54 12 0 >> >> 54 3 waiting for disk write to >> complete >> 82 49938512 272 1 0 >> >> 272 9 waiting for lock on PLC >> >> >> >> Hello, >> >> We are busy investigating performance issues on our >> ASE servers and concerned about the amount of WaitTime(in >> Milliseconds), >> lost between various events, >> These timings were gathered from the monProcessWaits >> and monWaitEventInfo MDA tables. >> >> Is it advisable to increase the default settings on >> the ASE, for "Number of Mailboxes" and "Number of >> Messages", given these stats? >> >> The documentation instructs me to only increase it if >> TECH SUPPORT tells me to do so. but these readings are >> really bothering me and I am thinking we are hitting >> configurable limits? >> >> Our system is BCP based and has one user connected to >> the database at any one time. >> >> The particular process. SPID 54 is a parent process >> that calls individual procedures, >> but has suffered a painful decrease of 11 hours to 29 >> hours processing time. >> >> Any suggestions on the TOP 12 WaitTimes will be >> fantastic, thanks guys, L >> >> ASE Version: >> Adaptive Server Enterprise/12.5.3/EBF 13325 >> ESD#7/P/Sun_svr4/OS 5.8/ase1253/1951/64-bit/FBO/Fri >> Mar 24 >> 11:00:22 2006 |
|
#5
|
| I have included the specific lines of code that was reported according to procedureID and line number in the monProcessStatement table at the time of the issue.. We have since learned that the data that was provided had a lot of nulls and that the process was among other things defaulting these nulls, to reasonable values. However, I am still concerned that the initial event info may have exposed limits on the configurion overall, which is currently holding us back. Liz > Would you happen to have the query??? It looks like an > extended group by and they are easily mangled into > partial cartesians. > > liezlnel wrote: > > First of all, let me apologise for the typo SPID number > > 54. The SPID's on the output were all numbered 82 and > > was one user and one process that normally kicks off > > multiple processes below it, but this spid belonged to > > the parent. > > Secondly, thanks so much for the feedback so far, it is > > really helpful. I can get anything out of the MDA tables > > , but can't run sp_sysmon on production as we are > > worried that this will reset the table statistics? > > > > By the time of the initial event info from the MDA > > tables, the attached showplan for spid 82 was extracted. > > > > I also include the sp_configure output after that(also > > attached) > > > > Our DBA's have some parallel processing settings > > switched on at the moment, but we are not making full > > use of this performance enhancement as yet. For example > > kicking of multiple bcp processes at the same time and > > partitioning tables and other methods, in order to > > benefit from the multiple worker processes etc. > > > > Our biggest issue is that the lifespan of our most > > problematic tables are short. They get dropped and > > recreated before each BCP load. > > So the traditional Update stats, Index building and > > perf/tune basics are not always possible, however we do > > build indexes on most of the tables before major > > processing are done on them and before they are recycled > > and dropped/created for a new load. > > > > Your comments are well appreciated > > Liezl > > > > > > > > > > > >> select * from monProcessWaits w, > >> monWaitEventInfo i where w.SPID=82 > >> and i.WaitEventID=w.WaitEventID > >> order by WaitTime desc > >> go > >> > >> SPID KPID WaitEventID Waits WaitTime > >> WaitEventID WaitClassID Description > >> ------ ----------- ----------- ----------- ----------- > >> ----------- ----------- > >> -------------------------------------------------- > >> 82 49938512 266 1633 50773300 > >> > >> 266 9 waiting for message in > worker >> thread mailbox > >> 82 49938512 250 283 8864200 > >> > >> 250 7 waiting for incoming network > >> data > >> 82 49938512 51 125603 123400 > >> > >> 51 3 waiting for disk write to > >> complete > >> 82 49938512 29 290144 33700 > >> > >> 29 2 wait for buffer read to > >> complete > >> 82 49938512 214 113351 17700 > >> > >> 214 1 waiting on run queue after > >> yield > >> 82 49938512 55 98269 15300 > >> > >> 55 3 waiting for disk write to > >> complete > >> 82 49938512 31 93558 15100 > >> > >> 31 3 wait for buffer write to > >> complete > >> 82 49938512 124 51347 7500 > >> > >> 124 9 wait for someone else to > finish >> reading in mass > >> 82 49938512 52 3698 1200 > >> > >> 52 3 waiting for disk write to > >> complete > >> 82 49938512 36 178 1100 > >> > >> 36 6 wait for mass to stop > changing >> 82 49938512 169 145 > 400 >> > >> 169 9 wait for message > >> 82 49938512 251 227 400 > >> > >> 251 8 waiting for network send to > >> complete > >> 82 49938512 171 38 200 > >> > >> 171 8 waiting for CTLIB event to > >> complete > >> 82 49938512 150 122 100 > >> > >> 150 5 waiting for semaphore > >> 82 49938512 37 1 0 > >> > >> 37 6 wait for mass to finish > >> changing > >> 82 49938512 41 3 0 > >> > >> 41 9 wait to acquire latch > >> 82 49938512 48 4 0 > >> > >> 48 6 wait for write on buffer to > >> complete > >> 82 49938512 53 3 0 > >> > >> 53 6 waiting for mass to finish > >> changing > >> 82 49938512 54 12 0 > >> > >> 54 3 waiting for disk write to > >> complete > >> 82 49938512 272 1 0 > >> > >> 272 9 waiting for lock on PLC > >> > >> > >> > >> Hello, > >> > >> We are busy investigating performance issues on our > >> ASE servers and concerned about the amount of > WaitTime(in >> Milliseconds), > >> lost between various events, > >> These timings were gathered from the > monProcessWaits >> and monWaitEventInfo MDA tables. > >> > >> Is it advisable to increase the default settings on > >> the ASE, for "Number of Mailboxes" and "Number of > >> Messages", given these stats? > >> > >> The documentation instructs me to only increase it > if >> TECH SUPPORT tells me to do so. but these readings > are >> really bothering me and I am thinking we are > hitting >> configurable limits? > >> > >> Our system is BCP based and has one user connected > to >> the database at any one time. > >> > >> The particular process. SPID 54 is a parent process > >> that calls individual procedures, > >> but has suffered a painful decrease of 11 hours to > 29 >> hours processing time. > >> > >> Any suggestions on the TOP 12 WaitTimes will be > >> fantastic, thanks guys, L > >> > >> ASE Version: > >> Adaptive Server Enterprise/12.5.3/EBF 13325 > >> ESD#7/P/Sun_svr4/OS 5.8/ase1253/1951/64-bit/FBO/Fri > >> Mar 24 > >> 11:00:22 2006 |
![]() |
| Thread Tools | |
| Display Modes | |