Proc Filesystem For Quick'n'Dirty Troubleshooting - Tanel Poder's Performance & Troubleshooting Blog

Peeking into Linux kernel-land using /proc filesystem for quickndirty troubleshooting | Tanel Poder'...
1 of 22
http://blog.tanelpoder.com/2013/02/21/peeking-into-linux-kernel-land-using-proc-filesystem-for-quic...
Peeking into Linux kernel-land using /proc

filesystem for quickndirty troubleshooting
This blog entry is about modern Linuxes. In other words RHEL6 equivalents with 2.6.3x kernels and
not the ancient RHEL5 with 2.6.18 kernel (wtf?!), which is the most common in enterprises
unfortunately. And no, Im not going to use kernel debuggers or SystemTap scripts here, just plain old
cat /proc/PID/xyz commands against some useful /proc filesystem entries.
Heres one systematic troubleshooting example I reproduced in my laptop. A DBA was wondering why
their find command had been running much slower, without returning any results for a while.
Knowing the environment, we had a hunch, but I got asked about what would be the systematic
approach for troubleshooting this already ongoing problem right now.
Luckily the system was running OEL6, so had a pretty new kernel. Actually the 2.6.39 UEK2.
So, lets do some troubleshooting. First lets see whether that find process is still alive:
[root@oel6 ~]# ps -ef | grep find

root
27288 27245
4 11:57 pts/0
00:00:01 find . -type f
root
27334 27315
0 11:57 pts/1
00:00:00 grep find
Yep its there PID 27288 (Ill use that pid throughout the troubleshooting example).
Lets start from the basics and take a quick look whats the bottleneck for this process if its not
5/25/2015 12:34 PM
2 of 22
blocked by anything (for example reading everything it needs from cache) it should be 100% on CPU.
If its bottlenecked by some IO or contention issues, the CPU usage should be lower or completely
0%.
[root@oel6 ~]# top -cbp 27288

top - 11:58:15 up 7 days,
Tasks:
1 total,
Cpu(s):
0.1%us,
3:38,
0 running,
0.1%sy,
2 users,
load average: 1.21, 0.65, 0.47
1 sleeping,
0.0%ni, 99.8%id,
0 stopped,
0.0%wa,
0.0%hi,
Mem:
2026460k total,
1935780k used,
90680k free,
Swap:
4128764k total,
251004k used,
3877760k free,
PID USER
PR
NI
27288 root
20
VIRT
RES
109m 1160
SHR S %CPU %MEM

844 D
0.0
0.1
0 zombie
0.0%si,
0.0%st
64416k buffers
662280k cached
TIME+
COMMAND
0:01.11 find . -type f
Top tells me this process is either 0% on CPU or very close to zero percent (so it gets rounded to 0% in
the output). Theres an important difference though, as a process that is completely stuck, not having a
chance of getting onto CPU at all vs. a process which is getting out of its wait state every now and then
(for example some polling operation times out every now and then and thee process chooses to go back
to sleep). So, top on Linux is not a good enough tool to show that difference for sure but at least we
know that this process is not burning serious amounts of CPU time.
Lets use something else then. Normally when a process seems to be stuck like that (0% CPU usually
means that the process is stuck in some blocking system call which causes the kernel to put the
process to sleep) I run strace on that process to trace in which system call the process is currently
stuck. Also if the process is actually not completely stuck, but returns from a system call and wakes up
briefly every now and then, it would show up in strace (as the blocking system call would complete and
be entered again a little later):
5/25/2015 12:34 PM
3 of 22
[root@oel6 ~]# strace -cp 27288

Process 27288 attached - interrupt to quit
^C
^Z
[1]+
Stopped
strace -cp 27288
[root@oel6 ~]# kill -9 %%

[1]+
Stopped
strace -cp 27288
[root@oel6 ~]#
[1]+
Killed
strace -cp 27288
Oops, the strace command itself got hung too! It didnt print any output for a long time and didnt
respond to CTRL+C, so I had to put it into background with CTRL+Z and kill it from there. So much
for easy diagnosis.
Lets try pstack then (on Linux, pstack is just a shell script wrapper around the GDB debugger). While
pstack does not see into kernel-land, it will still give us a clue about which system call was requested
(as usually theres a corresponding libc library call in the top of the displayed userland stack):
[root@oel6 ~]# pstack 27288

^C
^Z
[1]+
Stopped
pstack 27288
[root@oel6 ~]# kill %%

[1]+
Stopped
pstack 27288
[root@oel6 ~]#
[1]+
Terminated
pstack 27288
Pstack also got stuck without returning anything!
5/25/2015 12:34 PM
4 of 22
So we still dont know whether our process is 100% (hopelessly) stuck or just 99.99% stuck (the
process keeps going back to sleep) and where its stuck.
Ok, where else to look? Theres one more commonly available thing to check the process status and
WCHAN fields, which can be reported via good old ps (perhaps I should have ran this command
earlier to make sure this process isnt a zombie by now):
[root@oel6 ~]# ps -flp 27288

F S UID
0 D root
PID
PPID
27288 27245
C PRI
0
80
NI ADDR SZ WCHAN
STIME TTY
TIME CMD
0 - 28070 rpc_wa 11:57 pts/0
00:00:01 find . -type f
You should run ps multiple times in a row to make sure that the process is still in the same state (you
dont want to get misled by a single sample taken at an unlucky time), but Im displaying it here only
once for brevity.
The process is in state D (uninterruptible sleep) which is usually disk IO related (as the ps man pages
also say). And the WCHAN (the function which caused this sleep/wait in this process) field is a bit
truncated. I could use a ps option (read the man) to print it out a bit wider, but as this info comes from
the proc filesystem anyway, lets query it directly from the source (again, it would be a good idea to
sample this multiple times as we are not sure yet whether our process is completely stuck or just
mostly sleeping):
[root@oel6 ~]# cat /proc/27288/wchan

rpc_wait_bit_killable
Hmm this process is waiting for some RPC call. RPC usually means that the process is talking to
other processes (either in the local server or even remote servers). But we still dont know why.
?
Before we go on to the meat of this article, lets figure out whether this process is completely stuck or
not. The /proc/PID/status can tell us that on modern kernels. I have highlighted the interesting bits
5/25/2015 12:34 PM
5 of 22
for our purpose:
5/25/2015 12:34 PM
6 of 22
[root@oel6 ~]# cat /proc/27288/status

Name:
find
State:
D (disk sleep)
Tgid:
27288
Pid:
27288
PPid:
27245
TracerPid:
Uid:
Gid:
FDSize: 256
Groups: 0 1 2 3 4 6 10
VmPeak:
112628 kB
VmSize:
112280 kB
VmLck:
0 kB
VmHWM:
1508 kB
VmRSS:
1160 kB
VmData:
260 kB
VmStk:
136 kB
VmExe:
224 kB
VmLib:
2468 kB
VmPTE:
88 kB
VmSwap:
0 kB
Threads:
SigQ:
4/15831
SigPnd: 0000000000040000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: ffffffffffffffff
CapBnd: ffffffffffffffff
Cpus_allowed:
ffffffff,ffffffff
Cpus_allowed_list:
0-63
5/25/2015 12:34 PM
7 of 22
Mems_allowed:
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
Mems_allowed_list:
voluntary_ctxt_switches:
9950
nonvoluntary_ctxt_switches:
17104
The process is in the state D Disk Sleep (uninterruptible sleep). And look into the number of
voluntary_ctxt_switches and nonvoluntary_ctxt_switches this tells you how many times the process
was taken off CPU (or put back). Then run the same command again a few seconds later and see if
those numbers increase. In my case these numbers did not increase, thus I could conclude that the
process was completely stuck (well, at least during the few seconds between these commands). So I
can be more confident now that this process is completely stuck (and its not just flying under the
radar by burning 0.04% of CPU all time).
By the way, there are two more locations where to get the context switch counts (and the 2nd one
works on ancient kernels as well):
5/25/2015 12:34 PM
8 of 22
[root@oel6 ~]# cat /proc/27288/sched

find (27288, #threads: 1)
--------------------------------------------------------se.exec_start
617547410.689282
se.vruntime
2471987.542895
se.sum_exec_runtime
1119.480311
se.statistics.wait_start
0.000000
se.statistics.sleep_start
0.000000
se.statistics.block_start
617547410.689282
se.statistics.sleep_max
0.089192
se.statistics.block_max
60082.951331
se.statistics.exec_max
1.110465
se.statistics.slice_max
0.334211
se.statistics.wait_max
0.812834
se.statistics.wait_sum
724.745506
se.statistics.wait_count
27211
se.statistics.iowait_sum
0.000000
se.statistics.iowait_count
se.nr_migrations
312
se.statistics.nr_migrations_cold
se.statistics.nr_failed_migrations_affine:
se.statistics.nr_failed_migrations_running:
96
se.statistics.nr_failed_migrations_hot:
se.statistics.nr_forced_migrations :
1794
150
se.statistics.nr_wakeups
18507
se.statistics.nr_wakeups_sync
se.statistics.nr_wakeups_migrate
155
se.statistics.nr_wakeups_local
18504
se.statistics.nr_wakeups_remote
se.statistics.nr_wakeups_affine
155
se.statistics.nr_wakeups_affine_attempts:
158
se.statistics.nr_wakeups_passive
se.statistics.nr_wakeups_idle
avg_atom
0.041379
avg_per_cpu
3.588077
5/25/2015 12:34 PM
9 of 22
nr_switches
27054
nr_voluntary_switches
9950
nr_involuntary_switches
17104
se.load.weight
1024
policy
prio
120
clock-delta
72
Here you need to look into the
nr_switches
number (which equals
nr_voluntary_switches
nr_involuntary_switches ).
The total nr_switches is 27054 in the above output and this is also what the 3rd field in /proc/PID
/schedstat shows:
[root@oel6 ~]# cat /proc/27288/schedstat

1119480311 724745506 27054
And it doesnt increase

-
So, it looks like our process is pretty stuck :) Strace and pstack were useless. They use ptrace() system
call for attaching to the process and peeking into its memory, but as the process was hopelessly stuck,
very likely in some system call, so I guess the ptrace() call got hung itself. (BTW, I once tried stracing
the strace process itself when attaching to a target process and it crashed the target. You have been
warned :)
So, how to see in which system call are we stuck without strace or pstack? Luckily I was running on a
modern kernel say hi to /proc/PID/syscall!
[root@oel6 ~]# cat /proc/27288/syscall

262 0xffffffffffffff9c 0x20cf6c8 0x7fff97c52710 0x100 0x100 0x676e776f645f616d 0x7fff97c52658 0x390e2da8ea
5/25/2015 12:34 PM
10 of 22
Ok WTF am I supposed to do with that?

Well, these numbers usually stand for something. If its 0xAVeryBigNumber then its usually a
memory address (and pmap-like tools could be used where they point), but if its a small number then
perhaps its an index into some array like open file descriptor array (which you can read from
/proc/PID/fd ) or in this case, as we are dealing with system calls its the system call number where
your process is currently in! So, this process is stuck in system call #262!
Note that the system call numbers may differ between OS type, OS version and platform, so you should
read it from the proper .h file on your flavor of OS. Usually searching for syscall* from /usr/include
is a good starting point. On my version and platform (64bit) of Linux the system calls are defined here
in /usr/include/asm/unistd_64.h :
[root@oel6 ~]# grep 262 /usr/include/asm/unistd_64.h

#define __NR_newfstatat
262
Here you go! The syscall 262 is something called newfstatat . Just run man to find out what it is. Heres
a hint regarding system call names if man page doesnt find a call, then try it without possible
suffixes and prefixes (for example man pread instead of man pread64) in this case, run man
without the new prefix man fstatat . Or just google.
Anyway, this system call new-fstat-at allows you to read file properties pretty much like the usual
stat system call. And we are stuck in this file metadata reading operation. So weve gotten one step
further. But we still dont know why are we stuck there?
Well, Say Hello To My Little Friend /proc/PID/stack, which allows you to read kernel stack
backtraces of a process by just cating a proc file!!!
5/25/2015 12:34 PM
11 of 22
[root@oel6 ~]# cat /proc/27288/stack

[] rpc_wait_bit_killable+0x24/0x40 [sunrpc]
[] __rpc_execute+0xf5/0x1d0 [sunrpc]
[] rpc_execute+0x43/0x50 [sunrpc]
[] rpc_run_task+0x75/0x90 [sunrpc]
[] rpc_call_sync+0x42/0x70 [sunrpc]
[] nfs3_rpc_wrapper.clone.0+0x35/0x80 [nfs]
[] nfs3_proc_getattr+0x47/0x90 [nfs]
[] __nfs_revalidate_inode+0xcc/0x1f0 [nfs]
[] nfs_revalidate_inode+0x36/0x60 [nfs]
[] nfs_getattr+0x5f/0x110 [nfs]
[] vfs_getattr+0x4e/0x80
[] vfs_fstatat+0x70/0x90
[] sys_newfstatat+0x24/0x50
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff
The top function is where in kernel code we are stuck right now which is exactly what the WCHAN
output showed us (note that actually there are some more functions on top of it, such as the kernel
schedule() function which puts processes to sleep or on CPU as needed, but these are not shown here
probably because theyre a result of this wait condition itself, not the cause).
Thanks to the full kernel-land stack of this task we can read it from bottom up to understand how did
we end up calling this rpc_wait_bit_killable function in the top, which ended up calling the scheduler
and put us into sleep mode.
The system_call_fastpath in the bottom is a generic kernel system call handler function which has
called the kernel code for that newfstatat syscall we issued ( sys_newfstatat ). And then, continuing to
go upwards through the child functions we end up seeing a bunch of NFS functions in the stack! This
is a 100% undeniable proof that we are somewhere under NFS codepath. Im not saying in NFS
codepath as when you continue reading upwards past the NFS functions, you see that the topmost
NFS function has in turn called some RPC function (rpc_call_sync) for talking to some other process
in this case probably the [kworker/N:N] , [nfsiod] , [lockd] or [rpciod] kernel IO threads. And we are
5/25/2015 12:34 PM
12 of 22
never getting a reply from those threads for some reason (the usual suspect would be dropped network
connection, packets or just network connectivity issues).
To see whether any of the helper threads are stuck on network-related code, you could sample their
kernel stacks too, although the kworkers do much more than just NFS RPC communication. During a
separate experiment (just copying large files over NFS), I have caught one of the kworkers waiting in
networking code:
[root@oel6 proc]# for i in `pgrep worker` ; do ps -fp $i ; cat /proc/$i/stack ; done

UID
PID
PPID
root
53
C STIME TTY
0 Feb14 ?
TIME CMD
00:04:34 [kworker/1:1]
[] __cond_resched+0x2a/0x40
[] lock_sock_nested+0x35/0x70
[] tcp_sendmsg+0x29/0xbe0
[] inet_sendmsg+0x48/0xb0
[] sock_sendmsg+0xef/0x120
[] kernel_sendmsg+0x41/0x60
[] xs_send_kvec+0x8e/0xa0 [sunrpc]
[] xs_sendpages+0x173/0x220 [sunrpc]
[] xs_tcp_send_request+0x5d/0x160 [sunrpc]
[] xprt_transmit+0x83/0x2e0 [sunrpc]
[] call_transmit+0xa8/0x130 [sunrpc]
[] __rpc_execute+0x66/0x1d0 [sunrpc]
[] rpc_async_schedule+0x15/0x20 [sunrpc]
[] process_one_work+0x13e/0x460
[] worker_thread+0x17c/0x3b0
[] kthread+0x96/0xa0
[] kernel_thread_helper+0x4/0x10
It should be possible to enable kernel tracing for knowing which exact kernel thread communicates
with other kernel threads, but I wont go in there in this post as its supposed to be a practical (and
simple :) troubleshooting exercise!
5/25/2015 12:34 PM
13 of 22
Anyway, thanks to very easy kernel stack sampling in modern Linux kernels (I dont know exact
version when it was introduced) we managed to systematically figure out where the find command was
stuck in NFS code in the Linux kernel. And when you have NFS-related hangs, the most usual
suspects are network issues. If you are wondering how did I reproduce this problem, I mounted a NFS
volume from a VM, started the find command and then just suspended the VM. This produces similar
symptoms as a network (configuration, firewall) issue where a connection is silently dropped without
notifying the TCP endpoints or packets just dont go through for some reason.
As the top function listed in the stack was one of the killable safe-to-kill functions
( rpc_wait_bit_killable ) then we can kill it with kill -9 :
[root@oel6 ~]# ps -fp 27288

UID
PID
root
PPID
27288 27245
C STIME TTY
0 11:57 pts/0
TIME CMD
00:00:01 find . -type f
[root@oel6 ~]# kill -9 27288

[root@oel6 ~]# ls -l /proc/27288/stack
ls: cannot access /proc/27288/stack: No such file or directory
[root@oel6 ~]# ps -fp 27288
UID
PID
PPID
C STIME TTY
TIME CMD
[root@oel6 ~]#
The process is gone.

-
Note that as the /proc/PID/stack looks like just a plain text proc file, you can do poor-mans stack
profiling also on the kernel threads! This is an example of sampling the current system call and the
kernel stack (if in system call) and aggregating it into a semi-hierarchical profile in a poor-mans way:
5/25/2015 12:34 PM
14 of 22
[root@oel6 ~]# export LC_ALL=C ; for i in {1..100} ; do cat /proc/29797/syscall | awk '{ print $1 }' ; cat
69 running
1 ffffff81534c83
2 ffffff81534820
6 247
25 180
100
1
27
3
1
27
0xffffffffffffffff
thread_group_cputime
sysenter_dispatch
ia32_sysret
task_sched_runtime
sys32_pread
compat_sys_io_submit
compat_sys_io_getevents
27
sys_pread64
sys_io_getevents
do_io_submit
27
vfs_read
read_events
io_submit_one
27
do_sync_read
aio_run_iocb
27
1
generic_file_aio_read
aio_rw_vect_retry
27
generic_file_read_iter
generic_file_aio_read
27
1
27
27
mapping_direct_IO
generic_file_read_iter
blkdev_direct_IO
__blockdev_direct_IO
27
do_blockdev_direct_IO
27
dio_post_submission
27
dio_await_completion
blk_flush_plug_list
5/25/2015 12:34 PM
15 of 22
This gives you a very rough sample of where in kernel this process spent its time if at all. The system
call numbers are separated, in bold, above running means that this process was in userland (not in
a system call) during the sample. So, 69% of the time during the sampling, this process was running
userland code. 25% of time in syscall #180 (nfsservctl on my system) and 6% in syscall #247 (waitid).
There are two more functions seen in this output but for some reason they havent been properly
translated to a function name. Well, this address must stand for something, so lets try our luck
manually:
[root@oel6 ~]# cat /proc/kallsyms | grep -i ffffff81534c83

ffffffff81534c83 t ia32_sysret
So it looks that these samples are about a 32-bit architecture-compatible system call return function
but as this function isnt a system call itself (just an internal helper function), perhaps thats why the
/proc/stack routine didnt translate it. Perhaps these addresses show up because theres no read
consistency on the /proc views, when owning threads modify these memory structures and entries,
the reading threads may sometimes read data in-flux.
Let check the other address too:
[root@oel6 ~]# cat /proc/kallsyms | grep -i ffffff81534820

[root@oel6 ~]#
Nothing? Well, the troubleshooting doesnt have to stop yet lets see if theres anything interesting
around that address. I just removed a couple of trailing characters from the address:
5/25/2015 12:34 PM
16 of 22
[root@oel6 ~]# cat /proc/kallsyms | grep -i ffffff815348

ffffffff8153480d t sysenter_do_call
ffffffff81534819 t sysenter_dispatch
ffffffff81534847 t sysexit_from_sys_call
ffffffff8153487a t sysenter_auditsys
ffffffff815348b9 t sysexit_audit
Looks like the sysenter_dispatch function starts just 1 byte before the original address displayed by
/proc/PID/stack. So we had possibly already executed the one byte (possibly a NOP operation left in
there for dynamic tracing probe traps). But nevertheless, looks like these stack samples were in the
sysenter_dispatch
function, which isnt a system call itself, but a system call helper function.
Note that there are different types of stack samplers the Linux Perf, Oprofile tools and Solaris
DTraces profile provider do sample the instruction pointer (EIP on 32bit intel or RIP on x64) and
stack pointer (ESP on 32bit and RSP on 64bit) registers of the currently running thread on CPU and
walk the stack pointers backwards from there. Therefore, these tools only show threads which happen
to be on CPU when sampling!!! This is of course perfect, when troubleshooting high CPU utilization
issues, but not useful at all when troubleshooting hung processes or too long process sleeps/waits.
Tools like pstack on Linux,Solaris,HP-UX, procstack (on AIX), ORADEBUG SHORT_STACK and just
reading the /proc/PID/stack pseudofile provide a good addition (not replacement) to the CPU
profiling tools as they access the probed process memory regardless of their CPU scheduling state
and read the stack from there. If the process is sleeping, not on CPU, the starting point of the stack can
be read from the stored context which is saved to kernel memory by OS scheduler during the context
switch.
Of course, the CPU event profiling tools can often do much more than just a pstack, OProfile, Perf and
even DTraces CPC provider (in Solaris11) can set up and sample CPU internal performance counters
to estimate things like the amount of CPU cycles stalled waiting for main memory, amount of L1/L2
cache misses, etc. Better read what Kevin Closson has to say on these topics: (Perf, Oprofile)
5/25/2015 12:34 PM
17 of 22
Enjoy :-)
What the heck are the /dev/shm/JOXSHM_EXT_x files on Linux?

SGA bigger than than the amount of HugePages configured (Linux 11.2.0.3)
Oracle Exadata Troubleshooting
Oracle Performance & Troubleshooting Online Seminars in 2013
Session Snapper v4 The Worlds Most Advanced Oracle Troubleshooting Script!
120
31
More
Stoyan says:
F E BR UA RY 21 , 20 13 A T 1:48 P M
Just awesome thank you

R EP LY
Panagiotis Papadomitsos says:

M A RC H 7, 2 01 3 AT 4:5 0 AM
This must be one of the best hardcore troubleshooting articles Ive ever read. Thanks!
R EP LY
Renato Araujo says:

D E C EM B ER 18 , 2 01 3 AT 7:1 0 AM
5/25/2015 12:34 PM
18 of 22
+1
Many Thanks
R EP LY
Mani says:
M A RC H 7, 2 01 3 AT 1:5 3 PM
Thank you Tanel for Sharing, it was awesome ride on Linux kernel-land!
R EP LY
Jared says:
A P RIL 2, 20 13 A T 8 :09 P M
Thanks Tanel, this article has been on my todo list for some time, and finally read it today.
Excellent stuff, I need to save it somewhere I can easily find it
My favorite type of work is troubleshooting down to the level where there is no further
visibility,
learning new OS bits and tools, and writing tools as necessary as I go.
Either find the issue, or be able to point to the vendor and prove it is their problem. :)
That threshold of visibility is at a somewhat lower level with *nix than with the Oracle
database.
Although, the OS tools in Linuxland make it possible to delve deeper into Oracle issues than
would be possible otherwise.
5/25/2015 12:34 PM
19 of 22
R EP LY
LAKS says:
A P RIL 9, 20 13 A T 6 :19 A M
Late to post a query here but still curious to know what is the environment details where this
hang was seen. I mean what was the virtualization product used(KVM or xen) to create the
linux guest VM(32 or 64bit). if possible pls reply.thx in advance
R EP LY
Tanel Poder says:

A P RIL 9, 20 13 A T 8 :56 A M
I was using VMWare Fusion 5 on my Mac server.

R EP LY
Tanel Poder says:

A P RIL 9, 20 13 A T 8 :57 A M
But you can use any VM for this VirtualBox is easy to use and
free
R EP LY
Jeyanthan says:
A P RIL 13 , 2 01 3 A T 12 :10 P M
That was a heck of a troubleshooting ! Thanks for your detailed analysis walkthrough, Tanel.
If I were you, Id have done kill -9 27288 right away ;)
5/25/2015 12:34 PM
20 of 22
R EP LY
Ravikumar says:
D E C EM B ER 30 , 2 01 3 AT 3:3 1 AM
Is it possible to get no of kernel memory pages.

R EP LY
Tanel Poder says:

D E C EM B ER 30 , 2 01 3 AT 1:0 3 PM
There are different types (and reasons) why kernel memory pages get allocated.
On linux just cat /proc/meminfo (some homework is needed for understanding
which types are for kernel purposes).
R EP LY
Aleksey says:
J A NU A RY 1 7, 201 4 AT 7 :30 A M
You ref to /home/oracle/os_explain in your poor-man script. What is the content of it?
Also,
ffffffff81534819 t sysenter_dispatch
doesnt starts just 1 byte before ffffff81534820, since there are HEX numbers here. (so, it is 7
bytes actually).
R EP LY
Tanel Poder says:

J A NU A RY 1 7, 201 4 AT 1 1:00 A M
5/25/2015 12:34 PM
21 of 22
check this: http://blog.tanelpoder.com/files/scripts/tools/unix/os_explain

Ha, thanks for catching my hex to dec conversion error :-)
R EP LY
Adi To says:
J U LY 6, 201 4 AT 6 :27 PM
Dear Tanel,
I am late to the party, but chanced upon your blog post: very much appreciate you sharing
your tricks. Very useful for me, delving into Linux from Unix-land. Regards
R EP LY
Frank Scholten says:

J U LY 25, 20 14 A T 10:4 5 AM
Hi Tanel,
Great article!
How come you could kill the process at the end of your article? I thought it was in an disk
sleep state?
Cheers,
Frank
R EP LY
5/25/2015 12:34 PM
22 of 22
Tanel Poder says:

J U LY 31, 20 14 A T 8:55 A M
Hi Frank, the D means uninterruptible sleep (not only Disk sleep), the
WCHAN column shows in which kernel function its stuck and in this case its
rpc_wait, which has something to do with interprocess communication and the
later diagnosis showed it was the NFS codepath. So it wasnt stuck in the disk
block IO layer somewhere, it was waiting for a reply from another process via
RPC.
R EP LY
5/25/2015 12:34 PM

Proc Filesystem For Quick'n'Dirty Troubleshooting - Tanel Poder's Performance & Troubleshooting Blog

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Proc Filesystem For Quick'n'Dirty Troubleshooting - Tanel Poder's Performance & Troubleshooting Blog

Uploaded by

Copyright:

Available Formats

Peeking into Linux kernel-land using /proc filesystem for quickndirty troubleshooting | Tanel Poder'...

Peeking into Linux kernel-land using /proc

[root@oel6 ~]# ps -ef | grep find

00:00:01 find . -type f

00:00:00 grep find

[root@oel6 ~]# top -cbp 27288

load average: 1.21, 0.65, 0.47

SHR S %CPU %MEM

0:01.11 find . -type f

[root@oel6 ~]# strace -cp 27288

strace -cp 27288

[root@oel6 ~]# kill -9 %%

strace -cp 27288

strace -cp 27288

[root@oel6 ~]# pstack 27288

[root@oel6 ~]# kill %%

Pstack also got stuck without returning anything!

[root@oel6 ~]# ps -flp 27288

0 - 28070 rpc_wa 11:57 pts/0

00:00:01 find . -type f

[root@oel6 ~]# cat /proc/27288/wchan

for our purpose:

[root@oel6 ~]# cat /proc/27288/status

[root@oel6 ~]# cat /proc/27288/sched

Here you need to look into the

number (which equals

[root@oel6 ~]# cat /proc/27288/schedstat

And it doesnt increase

[root@oel6 ~]# cat /proc/27288/syscall

Ok WTF am I supposed to do with that?

[root@oel6 ~]# grep 262 /usr/include/asm/unistd_64.h

[root@oel6 ~]# cat /proc/27288/stack

[root@oel6 proc]# for i in `pgrep worker` ; do ps -fp $i ; cat /proc/$i/stack ; done

[root@oel6 ~]# ps -fp 27288

[root@oel6 ~]# kill -9 27288

The process is gone.

[root@oel6 ~]# cat /proc/kallsyms | grep -i ffffff81534c83

[root@oel6 ~]# cat /proc/kallsyms | grep -i ffffff81534820

[root@oel6 ~]# cat /proc/kallsyms | grep -i ffffff815348

What the heck are the /dev/shm/JOXSHM_EXT_x files on Linux?

Just awesome thank you

Panagiotis Papadomitsos says:

Renato Araujo says:

Tanel Poder says:

I was using VMWare Fusion 5 on my Mac server.

Tanel Poder says:

Is it possible to get no of kernel memory pages.

Tanel Poder says:

Tanel Poder says:

check this: http://blog.tanelpoder.com/files/scripts/tools/unix/os_explain

Frank Scholten says:

Tanel Poder says:

You might also like