The following table is taken from Wikipedia:
| Drive | Type | IOPS | Interface |
|---|---|---|---|
| 5,400 rpm SATA drives | HDD | ~50–80 IOPS | SATA 3 Gbit/s |
| 7,200 rpm SATA drives | HDD | ~75–100 IOPS | SATA 3 Gbit/s, SAS 12 Gbit/s |
| 10,000 rpm SAS drives | HDD | ~125–150 IOPS | SAS |
| 15,000 rpm SAS drives | HDD | ~175–210 IOPS | SAS |
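The figures in the table follow roughly from mechanical latency: peak random IOPS is about 1 / (average seek time + average rotational latency), where rotational latency is half a revolution, i.e. 30/rpm seconds. A back-of-envelope sketch (the 8.5 ms seek time below is an illustrative value for a consumer 7,200 rpm drive, not a measured figure):

```shell
# Estimate theoretical peak random IOPS for a spinning disk.
# rot_ms  = half-revolution time in milliseconds (30000 / rpm)
# IOPS   ~= 1000 / (seek_ms + rot_ms)
rpm=7200
seek_ms=8.5   # assumed average seek time; adjust for your drive
awk -v rpm="$rpm" -v seek="$seek_ms" 'BEGIN {
  rot_ms = 30000 / rpm
  printf "%.0f IOPS\n", 1000 / (seek + rot_ms)
}'
# → 79 IOPS, which lands inside the ~75–100 range quoted above
```

SSDs have no seek or rotational latency, which is why they sit orders of magnitude above every row in this table.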
If you are experiencing high sysload coupled with low CPU usage, disk I/O may be the bottleneck; turn your attention to the tps value produced by:
# iostat -cd 60
This command will give you output similar to the following, every 60 seconds:
avg-cpu:  %user   %nice  %system  %iowait  %steal   %idle
          23.45   11.40    12.74     0.35    0.02   52.03

Device:      tps   Blk_read/s   Blk_wrtn/s   Blk_read     Blk_wrtn
xvdj        0.03         0.30         7.00     232578      5465744
xvde      112.75        39.67      2510.80   30952562   1959249512
As you can see in this example, xvde is seeing ~113 transfers per second. That number of transfers is not unusual if you are frequently running thousands of checks and writing their results to disk.
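If you want to flag busy devices automatically rather than eyeball the output, a small awk filter over iostat works. This is a sketch: the 100 tps threshold and the xvd/sd device-name patterns are assumptions to adjust for your environment.

```shell
# Print any device whose tps (column 2 of iostat -d output) exceeds 100.
# Runs every 60 seconds; tune the threshold and device pattern to taste.
iostat -d 60 | awk '$1 ~ /^(xvd|sd)/ && $2 > 100 { print $1, $2, "tps" }'
```

Against the sample output above, this would print only the xvde line, since xvdj sits at 0.03 tps.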
Another indication of this type of issue could be the output of:
# sar -q
00:00:01  runq-sz  %runocc  swpq-sz  %swpocc
00:05:02     26.4       72      0.0        0
00:10:02     25.9       71      0.0        0
00:15:02     27.4       73      0.0        0
00:20:01     27.3       62      0.0        0
00:25:01     25.5       66      0.0        0
The common guidance for the runq-sz value seems to be:
The number of kernel threads in memory that are waiting for a CPU to run. Typically, this value should be less than 2. Consistently higher values mean that the system might be CPU-bound.
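The "less than 2" rule of thumb dates from systems with few CPUs; on modern multi-core machines it is more meaningful to compare runq-sz against the number of online CPUs. A sketch that extracts the most recent runq-sz value (it assumes the column layout shown in the sample above; adjust the awk field if your sar build differs):

```shell
# Compare the latest sar run-queue length against the CPU count.
# Column 2 of each data line is assumed to be runq-sz, as in the sample.
cpus=$(nproc)
runq=$(sar -q | awk 'NF && $2 ~ /^[0-9.]+$/ { last = $2 } END { print last }')
echo "runq-sz=$runq cpus=$cpus"
```

A run queue persistently well above the CPU count, as in the sample above, points at saturation; combined with low CPU usage it suggests the queued threads are waiting on I/O rather than compute.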
If you are seeing high sysload together with measurements like the above, and your disk setup appears below specification according to the table, we strongly suggest adding more I/O capacity to the server to lower the load.