Discussion:
Configuring tempdb cache and log cache
Veljko Zrnic
2007-11-08 14:38:06 UTC
Hi all,

We have Sybase ASE 12.5.4 running on a very well-provisioned Unix box with
90+ GB of memory and 20+ processors.
However, performance varies constantly: the same query can take anywhere
from 5 seconds to 45 seconds, and I'm trying to determine the cause.

I've examined sp_sysmon output from about 10 different executions over the
past few days, each covering a 10-minute period.

The most obvious thing is tempdb cache spinlock contention, which at times
reaches 40% (but usually hovers around 25%). Increasing the number of
partitions on the tempdb cache is my first thought, but I'm not sure what
value to use. I'm thinking either 4 or 8 (so contention would come in
under 10%).
Is there any downside to increasing the number of cache partitions?


Pool turnover is very different in each of the sysmon reports, so I can't
rely on that. Also, I've already checked that tempdb has been moved off the
master device and is using dedicated devices, so that should be fine.
Here is the sp_cacheconfig output:

sp_cacheconfig
go

Cache Name                     Status    Type     Config Value Run Value
------------------------------ --------- -------- ------------ ------------
default data cache             Active    Default  8192.00 Mb   8192.00 Mb
tempdb cache                   Active    Mixed     200.00 Mb    200.00 Mb
                                                  ------------ ------------
Total                                             8392.00 Mb   8392.00 Mb
==========================================================================
Cache: default data cache, Status: Active, Type: Default
Config Size: 8192.00 Mb, Run Size: 8192.00 Mb
Config Replacement: strict LRU, Run Replacement: strict LRU
Config Partition: 1, Run Partition: 4
IO Size Wash Size Config Size Run Size APF Percent
-------- --------- ------------ ------------ -----------
2 Kb 245760 Kb 0.00 Mb 3680.00 Mb 10
4 Kb 104848 Kb 512.00 Mb 512.00 Mb 10
16 Kb 245760 Kb 4000.00 Mb 4000.00 Mb 10
==========================================================================
Cache: tempdb cache, Status: Active, Type: Mixed
Config Size: 200.00 Mb, Run Size: 200.00 Mb
Config Replacement: strict LRU, Run Replacement: strict LRU
Config Partition: 1, Run Partition: 1
IO Size Wash Size Config Size Run Size APF Percent
-------- --------- ------------ ------------ -----------
2 Kb 10240 Kb 0.00 Mb 50.00 Mb 10
4 Kb 30720 Kb 150.00 Mb 150.00 Mb 10
(return status = 0)
Mark A. Parsons
2007-11-08 14:54:48 UTC
It's not uncommon to set the number of partitions to a value that's as
large as (or larger than) the number of dataserver engines.

[I believe that the number of partitions has to be a power of 2 (eg, 1, 2, 4, 8, 16, 32).]

If you have 4 dataserver engines, consider using 4 partitions.

If you have 12 dataserver engines, consider using 16 partitions.

If you have 20 dataserver engines, consider using 16 or 32 partitions.

I personally try to keep spinlock contention below 10%, and sometimes below 5%.
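
For reference, bumping the partition count on a named cache would be done
with something like the following. (Syntax from memory -- double-check it
against the 12.5 System Administration Guide; cache_partition is a static
setting, so I believe the change only takes effect after a dataserver
restart.)

sp_cacheconfig 'tempdb cache', 'cache_partition=8'
go

Run sp_cacheconfig afterwards and compare the Config Partition and Run
Partition values to confirm when the new count is actually in effect.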

----------------

There are no noticeable performance issues in increasing the number of data
cache partitions, especially compared to the alternative of (relatively)
high spinlock contention percentages.

----------------

As to whether the reduction in spinlock contention will help speed up your
queries ... *shrug* ... not enough information has been provided to
determine whether the query performance is in fact related to spinlock
contention or to other issues ...
Post by Veljko Zrnic
[full original post snipped]
Veljko Zrnic
2007-11-08 16:09:17 UTC
Post by Mark A. Parsons
It's not uncommon to set the number of partitions to a value that's as
large as (or larger than) the number of dataserver engines.
I personally try to keep spinlock contention below 10%, and sometimes below 5%.
That's a good rule of thumb. Thanks! We have 9 engines running, so 8
partitions shouldn't be a problem at all...
Post by Mark A. Parsons
There's no noticeable performance issues in increasing the number of
data cache partitions, especially when compared to the alternative of
(relatively) high spinlock contention percentages.
But are there any downsides to increasing the number of partitions? Why not
set it to 64, for example?
Post by Mark A. Parsons
As to whether or not the reduction in spinlock contention will help
speed up your queries ... *shrug* ... not enough information has been
provided to ascertain if the query performance is in fact related to
spinlock contention, or if the performance is related to other issues ...
Well, the tests so far aren't thorough, so a single sysmon report would've
led to wrong conclusions - that's why I didn't want to bother you with it.
If the partitioning doesn't work, I'll try summarizing more info.
Mark A. Parsons
2007-11-08 16:25:13 UTC
Post by Veljko Zrnic
Post by Mark A. Parsons
There's no noticeable performance issues in increasing the number of
data cache partitions, especially when compared to the alternative of
(relatively) high spinlock contention percentages.
But are there any downsides for increasing number of partitions? Why not
set it to 64 for example?
I'm sure there are, but I couldn't give you any specific technical details/explanations.

The typical process is one of trial-and-error: set the partition count to
8 and run sp_sysmon again during a period of heavy activity; if spinlock
contention is still > 5-10%, consider bumping the partition count up to 16
and run sp_sysmon again during heavy activity; if spinlock contention is
still > 5-10% after that, consider creating a new data cache to help
spread out cache hits/activity.

Chances are pretty good that you'll see a definite improvement with 8
partitions, with 16 probably being the max you'll need to consider.

-------------

Another approach to addressing spinlock contention would require some detailed analysis of the queries that are being
run during the time when you see heavy spinlock contention. Consider ...

A handful of queries performing in-memory table scans of smallish #temp tables as inner tables in multi-table joins.
Net result is heavy cache hit rates, high cpu utilization and (possibly) heavy spinlock contention. Putting an index on
the #temp table(s) could reduce the volume of cache hits, reduce overall cpu utilization, (probably) reduce spinlock
contention (due to reduced cache activity) *and* improve overall query response times.

Again, that's just one possible scenario ... the idea being to tune queries in such a way as to reduce the hit rates on
your cache(s).
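
As a concrete sketch of that #temp-table scenario -- the table and column
names here are made up purely for illustration:

select cust_id, amt
into   #totals
from   big_table
where  trade_date = @dt

create index ix_totals on #totals (cust_id)
go

With the index in place the optimizer can do index lookups on #totals when
it's the inner table of a join, instead of rescanning the whole table in
cache for every outer row -- which is where the heavy cache hit rates come
from in the first place.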
Mark A. Parsons
2007-11-08 17:50:57 UTC
Assuming you're running at least ASE 12.5.0.3, you could look at using the
MDA tables to sniff out those queries which are performing excessive
logical IOs (aka cache hits). As a first pass, monSysStatement would give
you an idea of the volume of logical IOs being performed by various
queries. Tying this info back to monSysSQLText would then match up heavy
logical IO users with their associated SQL text. And a bit of perseverance
and/or creativity would allow you to match this info back to monProcess%
(or sysprocesses) info (eg, login, hostname, program/application name, etc.).

You could also set up some Historical Server queries with filters that pull
out only those queries/statements which perform at least XX logical IOs.
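
By way of illustration, a first pass might look something like this. This
is only a sketch -- the column names (KPID, BatchID, LogicalReads, SQLText)
are from memory, so verify them against the MDA documentation for your
exact 12.5.x build, and the MDA tables must already be installed and
enabled (enable monitoring, SQL batch capture, etc.):

select   s.SPID, s.LogicalReads, t.SQLText
from     master..monSysStatement s,
         master..monSysSQLText   t
where    s.KPID    = t.KPID
and      s.BatchID = t.BatchID
and      s.LogicalReads > 10000
order by s.LogicalReads desc
go

Keep in mind that monSysStatement and monSysSQLText are historical ring
buffers, so you need to poll them before the data wraps.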

Sooo, when you have some free time (heh, Sybase DBA's have lots of free time, right? ;-)) there are options for zeroing
in on inefficient queries.
Post by Mark A. Parsons
Another approach to addressing spinlock contention would require some
detailed analysis of the queries that are being run during the time
when you see heavy spinlock contention. Consider ...
A handful of queries performing in-memory table scans of smallish
#temp tables as inner tables in multi-table joins. Net result is heavy
cache hit rates, high cpu utilization and (possibly) heavy spinlock
contention. Putting an index on the #temp table(s) could reduce the
volume of cache hits, reduce overall cpu utilization, (probably)
reduce spinlock contention (due to reduced cache activity) *and*
improve overall query response times.
Again, that's just one possible scenario ... the idea being to tune
queries in such a way as to reduce the hit rates on your cache(s).
Post by Veljko Zrnic
Analyzing queries is virtually impossible - there are over 1500 tables,
and over 1000 applications that use them... there are thousands of db
hits each second.
But I'll let you know how this partitioning turns out. Thanks!
Veljko Zrnic
2007-11-08 17:35:04 UTC
Post by Mark A. Parsons
Another approach to addressing spinlock contention would require some
detailed analysis of the queries that are being run during the time when
you see heavy spinlock contention. Consider ...
A handful of queries performing in-memory table scans of smallish #temp
tables as inner tables in multi-table joins. Net result is heavy cache
hit rates, high cpu utilization and (possibly) heavy spinlock
contention. Putting an index on the #temp table(s) could reduce the
volume of cache hits, reduce overall cpu utilization, (probably) reduce
spinlock contention (due to reduced cache activity) *and* improve
overall query response times.
Again, that's just one possible scenario ... the idea being to tune
queries in such a way as to reduce the hit rates on your cache(s).
Analyzing queries is virtually impossible - there are over 1500 tables and
over 1000 applications that use them... there are thousands of db hits
each second.

But I'll let you know how this partitioning turns out. Thanks!

Carl Kayser
2007-11-08 15:50:25 UTC
Post by Veljko Zrnic
[full original post snipped]
I'm not saying to do it, but one SA makes the general recommendation of
providing separate caches for the tempdb data and for the tempdb log. My
experience is inconclusive.
Veljko Zrnic
2007-11-08 16:04:07 UTC
Post by Carl Kayser
I'm not saying to do it, but one SA makes the general recommendation of
providing separate caches for the tempdb data and for the tempdb log. My
experience is inconclusive.
Well, for now I've ruled out adding additional caches because the hit/miss
ratio in both caches is quite good (~99%).
Mark A. Parsons
2007-11-08 16:16:44 UTC
Post by Veljko Zrnic
Post by Carl Kayser
I'm not saying to do it, but one SA makes the general recommendation of
providing separate caches for the tempdb data and for the tempdb log. My
experience is inconclusive.
Post by Veljko Zrnic
Well, for now I've ruled out adding additional caches because the hit/miss
ratio in both caches is quite good (~99%).
Cache hit ratios aren't necessarily tied to spinlock contention.

For example, you could have a 100% cache hit rate (eg, everything fits into
cache) but still have heavy spinlock contention (eg, 1 cache partition
shared by 8 dataserver engines under heavy use).

Generally speaking the creation of additional data caches may help alleviate some occurrences of heavy spinlock contention.

As an example, assume you have 1 data cache with heavy spinlock contention and a good bit of tempdb activity. Creating
a cache for tempdb activity would offload some of the hits on the default data cache thus (possibly) reducing the
spinlock contention in the default data cache. This could apply even if you're seeing a ~99% cache hit rate in the
original single-cache configuration.

Obviously (?) you would have to do a little research into sizing the new
data cache(s) and deciding which objects (eg, db, table, index) to bind to
said new cache(s).

While I'm not giving a thumbs-up or thumbs-down to what Carl had mentioned, I would suggest that you consider the option
(of additional caches) in case you find that a) you still have heavy spinlock contention after you bump up the partition
count on your current data caches and b) other solutions to addressing your query performance issues prove fruitless.
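
For reference, setting up a separate tempdb log cache along the lines Carl
mentioned would look roughly like the following. The size here is a
placeholder, and the exact syntax should be double-checked against the
12.5 System Administration Guide before running it:

sp_cacheconfig 'tempdb_log_cache', '100M', logonly
go
sp_bindcache 'tempdb_log_cache', tempdb, syslogs
go

A log-only cache normally also wants a 4K buffer pool sized to match the
server's log IO size (see sp_logiosize), so check that as part of the same
exercise.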