[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Full-disclosure] "IO wait chains" in Linux??



On Mon, 07 Feb 2011 18:29:04 GMT, "Cal Leeming [Simplicity Media Ltd]" said:

> 21000 - /usr/local/sbin/nginx - [D]
>  - /tmp/.somefile
>     - other PIDs waiting on this file (not just children of the parent)
>         - 51283 - /usr/local/sbin/apache (4.6 seconds)
>         - 31028 - /usr/local/sbin/python2.6 (1.9 seconds)
> 
> Sadly, I don't know much about how the kernel and the IO schedulers handle
> these things behind the scenes, so what I'm asking for may be impossible
> (apart from your other suggestion using watchdog+dmesg).

You need to distinguish between I/O that's supposed to complete quickly (for
instance, local disk I/O), and stuff that can reasonably take a while (reads
from a network connection, waiting for a file lock, etc).  The 'D' state is for
stuff that's supposed to be fast, if it's taking more than milliseconds you
have a big problem - as in "you have hit an actual kernel bug or hardware
has failed on you".  More often, you'll block inside open(), flock(), poll(),
and similar places, resulting in a 'sleeping' status.

So the big question is "what you're trying to accomplish" rather than "is there
a CLI tool that does XYZ" - most of the problems won't be found by checking for
XYZ at all...

Attachment: pgprTcxuvMpmu.pgp
Description: PGP signature

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/