Monday, February 22, 2010

HOWTO: Monitor I/O performance

sudo iostat -xcknd 15

http://www.aspdeveloper.net/tiki-index.php?page=Linuxiostat

Read I/Os

  • rrqm/s: Number of read requests per second that were merged while queued for the device
  • r/s: Read requests issued to the device per second. Requests can be merged before they are issued, so to estimate the total reads that were requested you are basically looking at rrqm/s + r/s. I'm not 100% sure about that, but at a minimum r/s is the number of reads per second that actually reached the device.
  • rkB/s: Number of kilobytes read from the device per second
  • wrqm/s: Write version of rrqm/s
  • w/s: Write version of r/s
  • wkB/s: Write version of rkB/s
  • avgrq-sz: Average size (in sectors) of the requests that were issued to the device. iostat counts sectors in 512-byte units, whatever sector size the device itself uses. I don't care a lot about this column even though it sounds fairly important, because we're primarily looking for what is relatively busy, either to isolate WHY it's busy or to determine which subsystem needs to be upgraded.
  • avgqu-sz: The average queue length of the requests that were issued to the device - This one I do care about more. When requests are queued up for the disks, the disks aren't keeping up. Does that matter? That depends on a LOT of factors: is your software WAITING for the disk to complete its operation? For reads that is usually the case; writes may be sitting in a writeback cache, in which case it is not a big concern.
  • await: The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. - This one again I DO care about. For some applications it is even more important than the number in the queue (for example, when copying a file with sequential writes you really don't care about the NEXT item in the queue). On the other hand, if you are a file server with a million small files you care about this, but you care at least as much about how many other requests are pending.
  • svctm: The average service time (in milliseconds) for I/O requests that were issued to the device - I view this one as useful for finding out whether your devices are performing roughly as expected. In this case my demo is using a quite old single disk, and I don't doubt it took 11.5ms to seek to another sector, read/write it, and return. If I had a latest-generation 15k rpm SCSI drive and saw 11.5ms, I'd wonder why that was. This also depends on how much seeking your disk has to do, and a high number is NOT always bad - for example, a drive that re-orders read/write operations to reduce head movement could show a higher number here yet deliver significantly higher throughput. This is probably confusing, but think about playing hot potato with 100 people between point A and point B - if your ONLY goal is to get every person to hold the potato, do you want it tossed about randomly just to decrease the time it spends in someone's hands? Passing it along so that each person only sees it once should decrease your total time to reach all 100 people.
  • %util: Percentage of elapsed time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.
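
To make the columns above a bit more concrete, here is a rough Python sketch that takes one device row from the extended output and derives the figures I talked about: reads before merging (r/s + rrqm/s), the average request size in kB (assuming 512-byte sectors), and a simple "looks busy" flag. The column order, the function names, the three thresholds and the sample row are all my own assumptions for illustration; check the header line your version of iostat prints and pick limits that make sense for your hardware.

```python
# Assumed column order for one device row of `iostat -xk` output.
# Verify against the header line printed by your own iostat version.
COLUMNS = ["device", "rrqm/s", "wrqm/s", "r/s", "w/s", "rkB/s", "wkB/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

SECTOR_BYTES = 512  # assumption: sectors are reported in 512-byte units


def parse_device_line(line, columns=COLUMNS):
    """Turn one whitespace-separated device row into a dict of floats."""
    parts = line.split()
    row = {"device": parts[0]}
    for name, value in zip(columns[1:], parts[1:]):
        row[name] = float(value)
    return row


def summarize(row, await_ms_limit=20.0, queue_limit=1.0, util_limit=90.0):
    """Derive a few figures and flag a device that looks busy.

    The three limits are illustrative placeholders, not recommendations.
    """
    return {
        "device": row["device"],
        # Reads requested before merging: issued reads plus merged reads.
        "reads_before_merge": row["r/s"] + row["rrqm/s"],
        # Average request size converted from sectors to kilobytes.
        "avg_request_kb": row["avgrq-sz"] * SECTOR_BYTES / 1024,
        # Busy if any one of the three indicators exceeds its limit.
        "busy": (row["await"] > await_ms_limit
                 or row["avgqu-sz"] > queue_limit
                 or row["%util"] > util_limit),
    }


# Made-up sample row, for illustration only (not real measurements).
sample = "sda 1.50 4.00 20.00 10.00 320.00 160.00 32.00 0.45 15.00 11.50 34.50"
print(summarize(parse_device_line(sample)))
```

With the default limits the sample row is not flagged as busy; lowering await_ms_limit below 15 would flag it. Whether a flagged device is actually a problem still depends on the factors above, such as whether your software is waiting on reads or the writes are sitting in a writeback cache.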
