
I'd like to somehow "watch" a variable (or a memory address) in the Linux kernel (a kernel module/driver, to be exact) and find out what changes it: basically, print out a stack trace whenever the variable changes.

For instance, in the kernel module testjiffy-hr.c listed at the end of this answer, I'd like to print a stack trace each time the runcount variable changes; hopefully the stack trace would then mention testjiffy_timer_function, which is indeed the function that changes that variable.

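Now, I know I can use kgdb to connect to a debug Linux kernel running in a virtual machine, and even set breakpoints (so hopefully watchpoints too). The problem is that I actually want to debug an ALSA driver, specifically the playback dma_area buffer (where I get some unexpected data), and that is highly timing-sensitive; running a debug kernel would itself disturb the timing (let alone running it in a virtual machine).

The bigger problem is that the playback dma_area pointer exists only during a playback operation (in other words, between the _start and _stop handlers), so I would have to record the dma_area address at each _start callback and then somehow "schedule" it for "watching" while the playback operation runs.

So I was hoping there is a way to do something like this directly in the driver code: for example, add some code in the _start callback that records the dma_area pointer and uses it as an argument to a command that starts a "watch" for changes, with the stack trace printed from a corresponding callback function. (I know this will affect timing too, but I hope it would be "light" enough not to disturb the "live" driver operation too much.)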

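So my question is: does such a debugging technique exist in the Linux kernel?

If not: is it possible to set up a hardware (or software) interrupt that reacts to a change at a specific memory address? Could I then install a handler for that interrupt that prints a stack trace? (Although I think the whole context changes when an IRQ handler runs, so the stack trace obtained there might be wrong.)

If not: is there any other technique that would let me print the stack trace of the process that changes a value stored at a given memory location in the kernel (hopefully on a live, non-debug kernel)?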

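Solution

Many thanks to @CosminRatiu and Eugene for their replies; thanks to those, I found:

> debugging - Linux kernel hardware break points - Stack Overflow
> Hardware Breakpoint (or watchpoint) - The Linux Kernel Archives

…which I could use to develop the example I'm posting here: the testhrarr.c kernel module/driver and its Makefile (below). It demonstrates that hardware watchpoint tracing can be achieved in two ways: by using the perf program, which can probe the driver unchanged; or by adding some hardware breakpoint code to the driver (in the example, enclosed by the HWDEBUG_STACK define).

Essentially, debugging the contents of standard atomic variable types like int (such as the runcount variable) is straightforward, as long as they are defined as global variables in the kernel module, so that they end up visible globally as kernel symbols. That is why the code below adds testhrarr_ as a prefix to the variables (to avoid naming conflicts). However, debugging the contents of an array can be a bit trickier because of the need for dereferencing, and that is what this post demonstrates: debugging the first byte of the testhrarr_arr array. It was done on: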

    $ echo `cat /etc/lsb-release`
    DISTRIB_ID=Ubuntu DISTRIB_RELEASE=11.04 DISTRIB_CODENAME=natty DISTRIB_DESCRIPTION="Ubuntu 11.04"
    $ uname -a
    Linux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/Linux
    $ cat /proc/cpuinfo | grep "model name"
    model name  : Intel(R) Atom(TM) CPU N450   @ 1.66GHz
    model name  : Intel(R) Atom(TM) CPU N450   @ 1.66GHz

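The testhrarr module basically allocates memory for a small array at module init, sets up a timer function, and exposes a /proc/testhrarr_proc file (using the newer proc_create interface). Then, trying to read from the /proc/testhrarr_proc file (say, using cat) triggers the timer function, which modifies the testhrarr_arr array values and dumps messages to /var/log/syslog. We expect testhrarr_arr[0] to change three times during the operation: once in testhrarr_startup, and twice in testhrarr_timer_function (because the array index wraps around).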

Using perf

After building the module with make, you can load it with:

    sudo insmod ./testhrarr.ko

At this point, /var/log/syslog will contain:

    kernel: [40277.199913] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
    kernel: [40277.199930]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xed182a80 (0xed182a80) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
    kernel: [40277.220329] HW Breakpoint for testhrarr_arr write installed (0xf84be2a0)

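Note that just passing testhrarr_arr as the symbol for the hardware watchpoint watches the address of that variable (0xf84be2a0), not the address of the first element of the array (0xed182a80)! Because of that, the hardware breakpoint never triggers, so the module behaves as if the hardware breakpoint code were not present at all (which can also be achieved by undefining HWDEBUG_STACK).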

So, even without a hardware breakpoint set through the kernel module code, we can still use perf to observe a change at a memory address. In perf we specify the address we want to watch (here the address of the first element of testhrarr_arr, 0xed182a80) and the process that should be run: here we run bash, so that we can execute cat /proc/testhrarr_proc, which triggers the kernel module timer, followed by sleep 0.5, which allows the timer to complete. The -a parameter is also needed, otherwise some events may be missed:

    $ sudo perf record -a --call-graph --event=mem:0xed182a80:w bash -c 'cat /proc/testhrarr_proc ; sleep 0.5'
    testhrarr proc: startup
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.485 MB perf.data (~21172 samples) ]

At this point, /var/log/syslog also contains something like:

    [40822.114964]  testhrarr_timer_function: testhrarr_runcount 0
    [40822.114980]  testhrarr jiffies 10130528 ; ret: 1 ; ktnsec: 40822114975062
    [40822.118956]  testhrarr_timer_function: testhrarr_runcount 1
    [40822.118977]  testhrarr jiffies 10130529 ; ret: 1 ; ktnsec: 40822118973195
    [40822.122940]  testhrarr_timer_function: testhrarr_runcount 2
    [40822.122956]  testhrarr jiffies 10130530 ; ret: 1 ; ktnsec: 40822122951143
    [40822.126962]  testhrarr_timer_function: testhrarr_runcount 3
    [40822.126978]  testhrarr jiffies 10130531 ; ret: 1 ; ktnsec: 40822126973583
    [40822.130941]  testhrarr_timer_function: testhrarr_runcount 4
    [40822.130961]  testhrarr jiffies 10130532 ; ret: 1 ; ktnsec: 40822130955167
    [40822.134940]  testhrarr_timer_function: testhrarr_runcount 5
    [40822.134962]  testhrarr jiffies 10130533 ; ret: 1 ; ktnsec: 40822134958888
    [40822.138936]  testhrarr_timer_function: testhrarr_runcount 6
    [40822.138958]  testhrarr jiffies 10130534 ; ret: 1 ; ktnsec: 40822138955693
    [40822.142940]  testhrarr_timer_function: testhrarr_runcount 7
    [40822.142962]  testhrarr jiffies 10130535 ; ret: 1 ; ktnsec: 40822142959345
    [40822.146936]  testhrarr_timer_function: testhrarr_runcount 8
    [40822.146957]  testhrarr jiffies 10130536 ; ret: 1 ; ktnsec: 40822146954479
    [40822.150949]  testhrarr_timer_function: testhrarr_runcount 9
    [40822.150970]  testhrarr jiffies 10130537 ; ret: 1 ; ktnsec: 40822150963438
    [40822.154974]  testhrarr_timer_function: testhrarr_runcount 10
    [40822.154988] testhrarr [ 5,7,9,11,13,]

To read the perf capture (a file called perf.data), we can use:

    $ sudo perf report --call-graph flat --stdio
    No kallsyms or vmlinux with build-id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 was found
    [testhrarr] with build id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 not found, continuing without symbols
    Failed to open /bin/cat, continuing without symbols
    Failed to open /usr/lib/libpixman-1.so.0.20.2, continuing without symbols
    Failed to open /usr/lib/xorg/modules/drivers/intel_drv.so, continuing without symbols
    Failed to open /usr/bin/Xorg, continuing without symbols
    # Events: 5  unknown
    #
    # Overhead  Command  Shared Object                                Symbol
    # ........  .......  .............  ....................................
    #
        87.50%     Xorg  [testhrarr]    [k] testhrarr_timer_function
                87.50%
                    testhrarr_timer_function
                    __run_hrtimer
                    hrtimer_interrupt
                    smp_apic_timer_interrupt
                    apic_timer_interrupt
                    0x30185d
                    0x2ed701
                    0x2ed8cc
                    0x2edba0
                    0x9d0386
                    0x8126fc8
                    0x81217a1
                    0x811bdd3
                    0x8070aa7
                    0x806281c
                    __libc_start_main
                    0x8062411

         6.25%      cat  [testhrarr]    [k] testhrarr_timer_function
                 6.25%
                    testhrarr_timer_function
                    testhrarr_proc_show
                    seq_read
                    proc_reg_read
                    vfs_read
                    sys_read
                    syscall_call
                    0xaa2416
                    0x8049f4d
                    __libc_start_main
                    0x8049081

         3.12%  swapper  [testhrarr]    [k] testhrarr_timer_function
                 3.12%
                    testhrarr_timer_function
                    __run_hrtimer
                    hrtimer_interrupt
                    smp_apic_timer_interrupt
                    apic_timer_interrupt
                    cpuidle_idle_call
                    cpu_idle
                    start_secondary

         3.12%      cat  [testhrarr]    [k] 0x356
                 3.12%
                    0xf84bc356
                    0xf84bc3a7
                    seq_read
                    proc_reg_read
                    vfs_read
                    sys_read
                    syscall_call
                    0xaa2416
                    0x8049f4d
                    __libc_start_main
                    0x8049081

    #
    # (For a higher level overview, try: perf report --sort comm,dso)
    #

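So, since we build the kernel module with debugging on (-g in the Makefile), perf has no problem finding this module's symbols, even though the live kernel is not a debug kernel. It therefore correctly identifies testhrarr_timer_function as the setter most of the time, although it does not report testhrarr_startup (but it does report testhrarr_proc_show, which calls it). There are also references to 0xf84bc3a7 and 0xf84bc356 that could not be resolved; note, however, that the module is loaded at 0xf84bc000:

    $ sudo cat /proc/modules | grep testhr
    testhrarr 13433 0 - Live 0xf84bc000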

…and that entry also starts with … [k] 0x356; and if we look at the objdump of the kernel module:

    $ objdump -S testhrarr.ko | less
    ...
    00000323 <testhrarr_startup>:
    static void testhrarr_startup(void)
    {
    ...
        testhrarr_arr[0] = 0; //just the first element
     34b:   a1 80 00 00 00          mov    0x80,%eax
     350:   c7 00 00 00 00 00       movl   $0x0,(%eax)
        hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
     356:   c7 04 24 01 00 00 00    movl   $0x1,(%esp)     **********
     35d:   8b 15 1c 00 00 00       mov    0x1c,%edx
    ...
    00000375 <testhrarr_proc_show>:
    static int testhrarr_proc_show(struct seq_file *m, void *v) {
    ...
        seq_printf(m, "testhrarr proc: startup\n");
     38f:   c7 44 24 04 79 00 00    movl   $0x79,0x4(%esp)
     396:   00
     397:   8b 45 fc                mov    -0x4(%ebp),%eax
     39a:   89 04 24                mov    %eax,(%esp)
     39d:   e8 fc ff ff ff          call   39e
        testhrarr_startup();
     3a2:   e8 7c ff ff ff          call   323
     3a7:   eb 1c                   jmp    3c5             **********
      } else {
        seq_printf(m, "testhrarr proc: (is running, %d)\n", testhrarr_runcount);
     3a9:   a1 0c 00 00 00          mov    0xc,%eax
    ...

…so 0xf84bc356 apparently refers to hrtimer_start, and 0xf84bc3a7 -> 3a7 refers to its calling function, testhrarr_proc_show; which thankfully makes sense. (Note that I've experienced, with different versions of the driver, that _start could show up while timer_function was represented by bare addresses; not sure what that is due to.)

One problem with perf, however, is that it gives me a statistical "Overhead" for these functions (not sure what it refers to; probably the time spent between entry into and exit from a function?), whereas what I really want is a log of stack traces in sequential order. Not sure if perf can be set up for that, but it can definitely be done with kernel module code for hardware breakpoints.

Using kernel module HW breakpoints

The code enclosed in HWDEBUG_STACK implements the setup and handling of the HW breakpoint. As noted, the default for the symbol ksym_name (if unspecified) is testhrarr_arr, which does not trigger the hardware breakpoint at all. The ksym_name parameter can be specified on the command line during insmod; here we can note that:

    $ sudo rmmod testhrarr    # remove module if still loaded
    $ sudo insmod ./testhrarr.ko ksym=testhrarr_arr[0]

…results in HW Breakpoint for testhrarr_arr[0] write installed (0x(null)) in /var/log/syslog, which means that we cannot use symbols with bracket notation for array access. Thankfully, the null pointer here simply means that the HW breakpoint will again not fire; it does not crash the OS completely :)

However, there is a global variable made to refer to the first element of the testhrarr_arr array, called testhrarr_arr_first. Note how this global variable is handled specially in the code, and needs to be dereferenced so that the correct address is obtained. So we do:

    $ sudo rmmod testhrarr    # remove module if still loaded
    $ sudo insmod ./testhrarr.ko ksym=testhrarr_arr_first

…and the syslog notes:

    kernel: [43910.509726] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
    kernel: [43910.509765]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xedf6c5c0 (0xedf6c5c0) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
    kernel: [43910.538535] HW Breakpoint for testhrarr_arr_first write installed (0xedf6c5c0)

…and we can see that the HW breakpoint is set to 0xedf6c5c0, which is the address of testhrarr_arr[0]. Now, if we trigger the driver through the /proc file:

    $ cat /proc/testhrarr_proc
    testhrarr proc: startup

…we obtain in syslog:

    kernel: [44069.735695] testhrarr_arr_first value is changed
    [44069.735711] Pid: 29320, comm: cat Not tainted 2.6.38-16-generic #67-Ubuntu
    [44069.735719] Call Trace:
    [44069.735737] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
    [44069.735755] [] ? __perf_event_overflow+0x90/0x240
    [44069.735768] [] ? proc_alloc_inode+0x23/0x90
    [44069.735778] [] ? proc_alloc_inode+0x23/0x90
    [44069.735790] [] ? perf_swevent_event+0x136/0x140
    [44069.735801] [] ? perf_bp_event+0x70/0x80
    [44069.735812] [] ? prep_new_page+0x110/0x1a0
    [44069.735824] [] ? get_page_from_freelist+0x12e/0x320
    [44069.735836] [] ? seq_open+0x3d/0xa0
    [44069.735848] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
    [44069.735861] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
    [44069.735872] [] ? notifier_call_chain+0x45/0x60
    [44069.735883] [] ? atomic_notifier_call_chain+0x22/0x30
    [44069.735894] [] ? notify_die+0x2d/0x30
    [44069.735904] [] ? do_debug+0x88/0x180
    [44069.735915] [] ? debug_stack_correct+0x30/0x38
    [44069.735928] [] ? testhrarr_startup+0x33/0x52 [testhrarr]
    [44069.735940] [] ? testhrarr_proc_show+0x32/0x57 [testhrarr]
    [44069.735952] [] ? seq_read+0x145/0x390
    [44069.735963] [] ? seq_read+0x0/0x390
    [44069.735973] [] ? proc_reg_read+0x64/0xa0
    [44069.735985] [] ? vfs_read+0x9f/0x160
    [44069.735995] [] ? proc_reg_read+0x0/0xa0
    [44069.736003] [] ? sys_read+0x42/0x70
    [44069.736013] [] ? syscall_call+0x7/0xb
    [44069.736019] Dump stack from sample_hbp_handler
    [44069.740132] testhrarr_timer_function: testhrarr_runcount 0
    [44069.740146] testhrarr jiffies 10942435 ; ret: 1 ; ktnsec: 44069740142485
    [44069.740159] testhrarr_arr_first value is changed
    [44069.740169] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
    [44069.740176] Call Trace:
    [44069.740195] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
    [44069.740213] [] ? __perf_event_overflow+0x90/0x240
    [44069.740227] [] ? perf_swevent_event+0x136/0x140
    [44069.740239] [] ? perf_bp_event+0x70/0x80
    [44069.740253] [] ? sched_clock_local+0xd3/0x1c0
    [44069.740267] [] ? format_decode+0x323/0x380
    [44069.740280] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
    [44069.740292] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
    [44069.740302] [] ? notifier_call_chain+0x45/0x60
    [44069.740313] [] ? atomic_notifier_call_chain+0x22/0x30
    [44069.740324] [] ? notify_die+0x2d/0x30
    [44069.740335] [] ? do_debug+0x88/0x180
    [44069.740345] [] ? debug_stack_correct+0x30/0x38
    [44069.740364] [] ? init_intel_cacheinfo+0x103/0x394
    [44069.740379] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
    [44069.740391] [] ? __run_hrtimer+0x6f/0x190
    [44069.740404] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
    [44069.740416] [] ? hrtimer_interrupt+0x108/0x240
    [44069.740430] [] ? smp_apic_timer_interrupt+0x56/0x8a
    [44069.740441] [] ? apic_timer_interrupt+0x31/0x38
    [44069.740453] [] ? _raw_spin_unlock_irqrestore+0x15/0x20
    [44069.740465] [] ? try_to_del_timer_sync+0x67/0xb0
    [44069.740476] [] ? del_timer_sync+0x29/0x50
    [44069.740486] [] ? flush_delayed_work+0x13/0x40
    [44069.740500] [] ? tty_flush_to_ldisc+0x12/0x20
    [44069.740510] [] ? n_tty_poll+0x4f/0x190
    [44069.740523] [] ? tty_poll+0x6d/0x90
    [44069.740531] [] ? n_tty_poll+0x0/0x190
    [44069.740542] [] ? do_poll.clone.3+0xd0/0x210
    [44069.740553] [] ? do_sys_poll+0x134/0x1e0
    [44069.740563] [] ? __pollwait+0x0/0xd0
    [44069.740572] [] ? pollwake+0x0/0x60
    ...
    [44069.740742] [] ? pollwake+0x0/0x60
    [44069.740757] [] ? rw_verify_area+0x6c/0x130
    [44069.740770] [] ? ktime_get_ts+0xf8/0x120
    [44069.740781] [] ? poll_select_set_timeout+0x64/0x70
    [44069.740793] [] ? sys_poll+0x5a/0xd0
    [44069.740804] [] ? syscall_call+0x7/0xb
    [44069.740815] [] ? init_intel_cacheinfo+0x23/0x394
    [44069.740822] Dump stack from sample_hbp_handler
    [44069.744130] testhrarr_timer_function: testhrarr_runcount 1
    [44069.744143] testhrarr jiffies 10942436 ; ret: 1 ; ktnsec: 44069744140055
    [44069.748132] testhrarr_timer_function: testhrarr_runcount 2
    [44069.748145] testhrarr jiffies 10942437 ; ret: 1 ; ktnsec: 44069748141271
    [44069.752131] testhrarr_timer_function: testhrarr_runcount 3
    [44069.752145] testhrarr jiffies 10942438 ; ret: 1 ; ktnsec: 44069752141164
    [44069.756131] testhrarr_timer_function: testhrarr_runcount 4
    [44069.756141] testhrarr jiffies 10942439 ; ret: 1 ; ktnsec: 44069756138318
    [44069.760130] testhrarr_timer_function: testhrarr_runcount 5
    [44069.760141] testhrarr jiffies 10942440 ; ret: 1 ; ktnsec: 44069760138469
    [44069.760154] testhrarr_arr_first value is changed
    [44069.760164] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
    [44069.760170] Call Trace:
    [44069.760187] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
    [44069.760202] [] ? __perf_event_overflow+0x90/0x240
    [44069.760213] [] ? perf_swevent_event+0x136/0x140
    [44069.760224] [] ? perf_bp_event+0x70/0x80
    [44069.760235] [] ? sched_clock_local+0xd3/0x1c0
    [44069.760247] [] ? format_decode+0x323/0x380
    [44069.760258] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
    [44069.760269] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
    [44069.760279] [] ? notifier_call_chain+0x45/0x60
    [44069.760289] [] ? atomic_notifier_call_chain+0x22/0x30
    [44069.760299] [] ? notify_die+0x2d/0x30
    [44069.760308] [] ? do_debug+0x88/0x180
    [44069.760318] [] ? debug_stack_correct+0x30/0x38
    [44069.760334] [] ? init_intel_cacheinfo+0x103/0x394
    [44069.760345] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
    [44069.760356] [] ? __run_hrtimer+0x6f/0x190
    [44069.760366] [] ? send_to_group.clone.1+0xf8/0x150
    [44069.760376] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
    [44069.760387] [] ? hrtimer_interrupt+0x108/0x240
    [44069.760396] [] ? fsnotify+0x1a5/0x290
    [44069.760407] [] ? smp_apic_timer_interrupt+0x56/0x8a
    [44069.760416] [] ? apic_timer_interrupt+0x31/0x38
    [44069.760428] [] ? mem_cgroup_resize_limit+0x108/0x1c0
    [44069.760437] [] ? fput+0x0/0x30
    [44069.760446] [] ? sys_write+0x67/0x70
    [44069.760455] [] ? syscall_call+0x7/0xb
    [44069.760464] [] ? init_intel_cacheinfo+0x23/0x394
    [44069.760470] Dump stack from sample_hbp_handler
    [44069.764134] testhrarr_timer_function: testhrarr_runcount 6
    [44069.764147] testhrarr jiffies 10942441 ; ret: 1 ; ktnsec: 44069764144141
    [44069.768133] testhrarr_timer_function: testhrarr_runcount 7
    [44069.768146] testhrarr jiffies 10942442 ; ret: 1 ; ktnsec: 44069768142976
    [44069.772134] testhrarr_timer_function: testhrarr_runcount 8
    [44069.772148] testhrarr jiffies 10942443 ; ret: 1 ; ktnsec: 44069772144121
    [44069.776132] testhrarr_timer_function: testhrarr_runcount 9
    [44069.776145] testhrarr jiffies 10942444 ; ret: 1 ; ktnsec: 44069776141971
    [44069.780133] testhrarr_timer_function: testhrarr_runcount 10
    [44069.780141] testhrarr [ 5,]

…we get a stack trace exactly three times: once during testhrarr_startup, and twice in testhrarr_timer_function, once for runcount == 0 and once for runcount == 5, as expected.

Well, hope this helps someone,
Cheers!

Makefile

    CONFIG_MODULE_FORCE_UNLOAD=y
    # debug build:
    # "CFLAGS was changed ... Fix it to use EXTRA_CFLAGS."
    override EXTRA_CFLAGS+=-g -O0

    obj-m += testhrarr.o
    #testhrarr-objs  := testhrarr.o

    all:
    	@echo EXTRA_CFLAGS = $(EXTRA_CFLAGS)
    	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

    clean:
    	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

testhrarr.c

    /*
     *  [http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html#AEN189 The Linux Kernel Module Programming Guide]
     *  https://stackoverflow.com/questions/16920238/reliability-of-linux-kernel-add-timer-at-resolution-of-one-jiffy/17055867#17055867
     *  https://stackoverflow.com/questions/8516021/proc-create-example-for-kernel-module/18924359#18924359
     *  http://lxr.free-electrons.com/source/samples/hw_breakpoint/data_breakpoint.c
     */

    #include <linux/module.h>   /* Needed by all modules */
    #include <linux/kernel.h>   /* Needed for KERN_INFO */
    #include <linux/init.h>     /* Needed for the macros */
    #include <linux/jiffies.h>
    #include <linux/time.h>
    #include <linux/slab.h>     /* kcalloc, kfree */
    #include <linux/proc_fs.h>  /* /proc entry */
    #include <linux/seq_file.h> /* /proc entry */
    #define ARRSIZE 5
    #define MAXRUNS 2*ARRSIZE

    #include <linux/hrtimer.h>

    #define HWDEBUG_STACK 1

    #if (HWDEBUG_STACK == 1)
    #include <linux/perf_event.h>
    #include <linux/hw_breakpoint.h>
    #include <linux/kallsyms.h> /* kallsyms_lookup_name */

    struct perf_event * __percpu *sample_hbp;

    static char ksym_name[KSYM_NAME_LEN] = "testhrarr_arr";
    module_param_string(ksym, ksym_name, KSYM_NAME_LEN, S_IRUGO);
    MODULE_PARM_DESC(ksym, "Kernel symbol to monitor; this module will report any"
          " write operations on the kernel symbol");
    #endif

    static volatile int testhrarr_runcount = 0;
    static volatile int testhrarr_isRunning = 0;

    static unsigned long period_ms;
    static unsigned long period_ns;
    static ktime_t ktime_period_ns;
    static struct hrtimer my_hrtimer;

    static int* testhrarr_arr;
    static int* testhrarr_arr_first;

    /* Timer callback: adds runcount to one array slot per run; the slot
     * index wraps around, so slot 0 is written at runcount 0 and 5. */
    static enum hrtimer_restart testhrarr_timer_function(struct hrtimer *timer)
    {
      unsigned long tjnow;
      ktime_t kt_now;
      int ret_overrun;

      printk(KERN_INFO
        " %s: testhrarr_runcount %d \n",
        __func__, testhrarr_runcount);

      if (testhrarr_runcount < MAXRUNS) {
        tjnow = jiffies;
        kt_now = hrtimer_cb_get_time(&my_hrtimer);
        ret_overrun = hrtimer_forward(&my_hrtimer, kt_now, ktime_period_ns);
        printk(KERN_INFO
          " testhrarr jiffies %lu ; ret: %d ; ktnsec: %lld\n",
          tjnow, ret_overrun, ktime_to_ns(kt_now));
        testhrarr_arr[(testhrarr_runcount % ARRSIZE)] += testhrarr_runcount;
        testhrarr_runcount++;
        return HRTIMER_RESTART;
      }
      else {
        int i;
        testhrarr_isRunning = 0;
        // do not use KERN_DEBUG etc, if printk buffering until newline is desired!
        printk("testhrarr_arr [ ");
        for(i=0; i<ARRSIZE; i++) {
          printk("%d,", testhrarr_arr[i]);
        }
        printk("]\n");
        return HRTIMER_NORESTART;
      }
    }

    static void testhrarr_startup(void)
    {
      if (testhrarr_isRunning == 0) {
        testhrarr_isRunning = 1;
        testhrarr_runcount = 0;
        testhrarr_arr[0] = 0; //just the first element
        hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
      }
    }

    static int testhrarr_proc_show(struct seq_file *m, void *v) {
      if (testhrarr_isRunning == 0) {
        seq_printf(m, "testhrarr proc: startup\n");
        testhrarr_startup();
      } else {
        seq_printf(m, "testhrarr proc: (is running, %d)\n", testhrarr_runcount);
      }
      return 0;
    }

    static int testhrarr_proc_open(struct inode *inode, struct  file *file) {
      return single_open(file, testhrarr_proc_show, NULL);
    }

    static const struct file_operations testhrarr_proc_fops = {
      .owner = THIS_MODULE,
      .open = testhrarr_proc_open,
      .read = seq_read,
      .llseek = seq_lseek,
      .release = single_release,
    };

    #if (HWDEBUG_STACK == 1)
    /* Called on every write to the watched address: logs a stack trace. */
    static void sample_hbp_handler(struct perf_event *bp,
                 struct perf_sample_data *data,
                 struct pt_regs *regs)
    {
      printk(KERN_INFO "%s value is changed\n", ksym_name);
      dump_stack();
      printk(KERN_INFO "Dump stack from sample_hbp_handler\n");
    }
    #endif

    static int __init testhrarr_init(void)
    {
      struct timespec tp_hr_res;
      #if (HWDEBUG_STACK == 1)
      struct perf_event_attr attr;
      #endif

      period_ms = 1000/HZ;
      hrtimer_get_res(CLOCK_MONOTONIC, &tp_hr_res);
      printk(KERN_INFO
        "Init testhrarr: %d ; HZ: %d ; 1/HZ (ms): %ld ; hrres: %lld.%.9ld\n",
        testhrarr_runcount, HZ, period_ms,
        (long long)tp_hr_res.tv_sec, tp_hr_res.tv_nsec );

      testhrarr_arr = (int*)kcalloc(ARRSIZE, sizeof(int), GFP_ATOMIC);
      testhrarr_arr_first = &testhrarr_arr[0];

      hrtimer_init(&my_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
      my_hrtimer.function = &testhrarr_timer_function;
      period_ns = period_ms*( (unsigned long)1E6L );
      ktime_period_ns = ktime_set(0, period_ns);

      printk(KERN_INFO
        " Addresses: _runcount 0x%p ; _arr 0x%p ; _arr[0] 0x%p (0x%p) ; _timer_function 0x%p ; my_hrtimer 0x%p; my_hrt.f 0x%p\n",
        &testhrarr_runcount, &testhrarr_arr, &(testhrarr_arr[0]), testhrarr_arr_first,
        &testhrarr_timer_function, &my_hrtimer, &my_hrtimer.function);

      proc_create("testhrarr_proc", 0, NULL, &testhrarr_proc_fops);

      #if (HWDEBUG_STACK == 1)
      hw_breakpoint_init(&attr);
      if (strcmp(ksym_name, "testhrarr_arr_first") == 0) {
        // just for testhrarr_arr_first - interpret the found symbol address
        // as int*, and dereference it to get the "real" address it points to
        attr.bp_addr = *((int*)kallsyms_lookup_name(ksym_name));
      } else {
        // the usual - address is kallsyms_lookup_name result
        attr.bp_addr = kallsyms_lookup_name(ksym_name);
      }
      attr.bp_len = HW_BREAKPOINT_LEN_1;
      attr.bp_type = HW_BREAKPOINT_W ; //| HW_BREAKPOINT_R;

      sample_hbp = register_wide_hw_breakpoint(&attr, (perf_overflow_handler_t)sample_hbp_handler);
      if (IS_ERR((void __force *)sample_hbp)) {
        int ret = PTR_ERR((void __force *)sample_hbp);
        printk(KERN_INFO "Breakpoint registration failed\n");
        return ret;
      }

      // explicit cast needed to show 64-bit bp_addr as 32-bit address
      // https://stackoverflow.com/questions/11796909/how-to-resolve-cast-to-pointer-from-integer-of-different-size-warning-in-c-co/11797103#11797103
      printk(KERN_INFO "HW Breakpoint for %s write installed (0x%p)\n",
        ksym_name, (void*)(uintptr_t)attr.bp_addr);
      #endif

      return 0;
    }

    static void __exit testhrarr_exit(void)
    {
      int ret_cancel = 0;
      kfree(testhrarr_arr);
      while( hrtimer_callback_running(&my_hrtimer) ) {
        ret_cancel++;
      }
      if (ret_cancel != 0) {
        printk(KERN_INFO " testhrarr Waited for hrtimer callback to finish (%d)\n", ret_cancel);
      }
      if (hrtimer_active(&my_hrtimer) != 0) {
        ret_cancel = hrtimer_cancel(&my_hrtimer);
        printk(KERN_INFO " testhrarr active hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
      }
      if (hrtimer_is_queued(&my_hrtimer) != 0) {
        ret_cancel = hrtimer_cancel(&my_hrtimer);
        printk(KERN_INFO " testhrarr queued hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
      }
      remove_proc_entry("testhrarr_proc", NULL);

      #if (HWDEBUG_STACK == 1)
      unregister_wide_hw_breakpoint(sample_hbp);
      printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n", ksym_name);
      #endif

      printk(KERN_INFO "Exit testhrarr\n");
    }

    module_init(testhrarr_init);
    module_exit(testhrarr_exit);

    MODULE_LICENSE("GPL");

以上是内存溢出为你收集整理的调试 – 观察Linux内核中的变量(内存地址)更改,并在更改时打印堆栈跟踪?全部内容,希望文章能够帮你解决调试 – 观察Linux内核中的变量(内存地址)更改,并在更改时打印堆栈跟踪?所遇到的程序开发问题。

如果觉得内存溢出网站内容还不错,欢迎将内存溢出网站推荐给程序员好友。

)
File: /www/wwwroot/outofmemory.cn/tmp/route_read.php, Line: 126, InsideLink()
File: /www/wwwroot/outofmemory.cn/tmp/index.inc.php, Line: 166, include(/www/wwwroot/outofmemory.cn/tmp/route_read.php)
File: /www/wwwroot/outofmemory.cn/index.php, Line: 30, include(/www/wwwroot/outofmemory.cn/tmp/index.inc.php)
Error[8]: Undefined offset: 16, File: /www/wwwroot/outofmemory.cn/tmp/plugin_ss_superseo_model_superseo.php, Line: 121
File: /www/wwwroot/outofmemory.cn/tmp/plugin_ss_superseo_model_superseo.php, Line: 473, decode(

概述我想以某种方式“监视” Linux内核中的变量(或内存地址)(确切地说是内核模块/驱动程序);并找出改变它的原因 – 基本上,当变量改变时打印出堆栈跟踪. 例如,在this answer年末列出的内核模块testjiffy-hr.c中,我想在每次runco​​unt变量更改时打印出堆栈跟踪;希望堆栈跟踪然后会提到testjiffy_timer_function,这确实是改变该变量的函数. 现在,我 我想以某种方式“监视” Linux内核中的变量(或内存地址)(确切地说是内核模块/驱动程序);并找出改变它的原因 – 基本上,当变量改变时打印出堆栈跟踪.

例如,在this answer年末列出的内核模块testjiffy-hr.c中,我想在每次runco​​unt变量更改时打印出堆栈跟踪;希望堆栈跟踪然后会提到testjiffy_timer_function,这确实是改变该变量的函数.

现在,我知道我可以使用kgdb连接到在虚拟机中运行的调试linux内核,甚至可以设置断点(所以希望也是观察点) – 但问题是我实际上想要调试ALSA驱动程序,特别是播放dma_area缓冲区(我得到了一些意想不到的数据) – 这对时间非常敏感;并且运行调试内核本身会弄乱时间(更不用说在虚拟机中运行它)了.

这里更大的问题是回放dma_area指针仅在回放 *** 作期间(或者换句话说,在_start和_stop处理程序之间)存在 – 所以我必须在每个_start回调中记录dma_area地址,然后以某种方式“在播放 *** 作期间安排“用于”观看“.

所以我希望有一种方法可以直接在驱动程序代码中执行此类 *** 作 – 例如,在此_start回调中添加一些代码来记录dma_area指针,并将其用作启动“watch”的命令的参数改变;从相应的回调函数打印堆栈跟踪. (我知道这也会影响时间,但我希望它能够“轻松”,不会过多地影响“实时”驱动程序 *** 作).

所以我的问题是:在linux内核中进行调试的这种技术是否存在?

如果不是:是否可以设置硬件(或软件)中断,该中断会对特定内存地址的更改做出反应?然后,我可以设置这样的中断处理程序,可以打印出堆栈跟踪吗? (虽然,我认为整个上下文在IRQ处理程序运行时会发生变化,因此可能会出现堆栈跟踪错误)?

如果没有:是否还有其他技术,这将允许我打印进程的堆栈跟踪,该跟踪更改存储在内核中给定内存位置的值(希望在实时的非调试内核中)?

解决方法 非常感谢 @CosminRatiu和 Eugene的回复;多亏了那些,我发现:

> debugging – Linux kernel hardware break points – Stack Overflow
> Hardware Breakpoint (or watchpoint) – The Linux Kernel Archives

…我可以用它来开发我在这里发布的示例,testhrarr.c内核模块/驱动程序和Makefile(下面).它表明硬件观察点跟踪可以通过两种方式实现:使用perf程序,它可以不变地探测驱动程序;或者通过向驱动程序添加一些硬件断点代码(在示例中,由HWDEBUG_STACK定义变量包含).

本质上,调试像int这样的标准原子变量类型(如runco​​unt变量)的内容很简单,只要它们被定义为内核模块中的全局变量,因此它们最终全局显示为内核符号.因此,下面的代码将testhrarr_添加为变量的前缀(以避免命名冲突).但是,由于需要解除引用,调试数组的内容可能有点棘手 – 这就是本文演示的内容,调试testhrarr_arr数组的第一个字节.它完成于:

$echo `cat /etc/lsb-release` disTRIB_ID=Ubuntu disTRIB_RELEASE=11.04 disTRIB_CODEname=natty disTRIB_DESCRIPTION="Ubuntu 11.04"$uname -alinux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/linux$cat /proc/cpuinfo | grep "model name"model name  : Intel(R) Atom(TM) cpu N450   @ 1.66GHzmodel name  : Intel(R) Atom(TM) cpu N450   @ 1.66GHz

testhrarr模块基本上在模块初始化时为小数组分配内存,设置定时器函数,并公开/ proc / testhrarr_proc文件(使用较新的proc_create接口).然后,尝试从/ proc / testhrarr_proc文件(例如,使用cat)读取将触发计时器功能,该功能将修改testhrarr_arr数组值,并将消息转储到/ var / log / syslog.我们期望testhrarr_arr [0]在 *** 作期间会改变三次;一次在testhrarr_startup中,两次在testhrarr_timer_function中(由于换行).

使用perf

使用make构建模块后,您可以使用以下命令加载它:

sudo insmod ./testhrarr.ko

此时,/ var / log / syslog将包含:

kernel: [40277.199913] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001kernel: [40277.199930]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xed182a80 (0xed182a80) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27ckernel: [40277.220329] HW Breakpoint for testhrarr_arr write installed (0xf84be2a0)

注意,只是将testhrarr_arr作为硬件观察点的符号传递,扫描该变量的地址(0xf84be2a0),而不是数组的第一个元素的地址(0xed182a80)!因此,硬件断点不会触发 – 因此行为就好像硬件断点代码根本不存在(可以通过取消定义HWDEBUG_STACK来实现)!

因此,即使没有通过内核模块代码设置的硬件断点,我们仍然可以使用perf来观察内存地址的变化 – 在perf中,我们指定我们要监视的地址(这里是testhrarr_arr的第一个元素的地址,0xed182a80),以及应该运行的进程:这里我们运行bash,所以我们可以执行cat / proc / testhrarr_proc,它将触发内核模块定时器,然后是sleep 0.5,这将允许定时器完成. -a参数也是必需的,否则可能会遗漏一些事件:

$sudo perf record -a --call-graph --event=mem:0xed182a80:w bash -c 'cat /proc/testhrarr_proc ; sleep 0.5'testhrarr proc: startup[ perf record: Woken up 1 times to write data ][ perf record: Captured and wrote 0.485 MB perf.data (~21172 samples) ]

此时,/ var / log / syslog还包含以下内容:

[40822.114964]  testhrarr_timer_function: testhrarr_runcount 0 [40822.114980]  testhrarr jiffIEs 10130528 ; ret: 1 ; ktnsec: 40822114975062[40822.118956]  testhrarr_timer_function: testhrarr_runcount 1 [40822.118977]  testhrarr jiffIEs 10130529 ; ret: 1 ; ktnsec: 40822118973195[40822.122940]  testhrarr_timer_function: testhrarr_runcount 2 [40822.122956]  testhrarr jiffIEs 10130530 ; ret: 1 ; ktnsec: 40822122951143[40822.126962]  testhrarr_timer_function: testhrarr_runcount 3 [40822.126978]  testhrarr jiffIEs 10130531 ; ret: 1 ; ktnsec: 40822126973583[40822.130941]  testhrarr_timer_function: testhrarr_runcount 4 [40822.130961]  testhrarr jiffIEs 10130532 ; ret: 1 ; ktnsec: 40822130955167[40822.134940]  testhrarr_timer_function: testhrarr_runcount 5 [40822.134962]  testhrarr jiffIEs 10130533 ; ret: 1 ; ktnsec: 40822134958888[40822.138936]  testhrarr_timer_function: testhrarr_runcount 6 [40822.138958]  testhrarr jiffIEs 10130534 ; ret: 1 ; ktnsec: 40822138955693[40822.142940]  testhrarr_timer_function: testhrarr_runcount 7 [40822.142962]  testhrarr jiffIEs 10130535 ; ret: 1 ; ktnsec: 40822142959345[40822.146936]  testhrarr_timer_function: testhrarr_runcount 8 [40822.146957]  testhrarr jiffIEs 10130536 ; ret: 1 ; ktnsec: 40822146954479[40822.150949]  testhrarr_timer_function: testhrarr_runcount 9 [40822.150970]  testhrarr jiffIEs 10130537 ; ret: 1 ; ktnsec: 40822150963438[40822.154974]  testhrarr_timer_function: testhrarr_runcount 10 [40822.154988] testhrarr [ 5,7,9,11,13,]

要读取perf(一个名为perf.data的文件)的捕获,我们可以使用:

$sudo perf report --call-graph flat --stdioNo kallsyms or vmlinux with build-ID 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 was found[testhrarr] with build ID 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 not found,continuing without symbolsFailed to open /bin/cat,continuing without symbolsFailed to open /usr/lib/libpixman-1.so.0.20.2,continuing without symbolsFailed to open /usr/lib/xorg/modules/drivers/intel_drv.so,continuing without symbolsFailed to open /usr/bin/Xorg,continuing without symbols# Events: 5  unkNown## Overhead  Command  Shared Object                                Symbol# ........  .......  .............  ....................................#    87.50%     Xorg  [testhrarr]    [k] testhrarr_timer_function            87.50%                testhrarr_timer_function                __run_hrtimer                hrtimer_interrupt                smp_APIc_timer_interrupt                APIc_timer_interrupt                0x30185d                0x2ed701                0x2ed8cc                0x2edba0                0x9d0386                0x8126fc8                0x81217a1                0x811bdd3                0x8070aa7                0x806281c                __libc_start_main                0x8062411     6.25%      cat  [testhrarr]    [k] testhrarr_timer_function             6.25%                testhrarr_timer_function                testhrarr_proc_show                seq_read                proc_reg_read                vfs_read                sys_read                syscall_call                0xaa2416                0x8049f4d                __libc_start_main                0x8049081     3.12%  swapper  [testhrarr]    [k] testhrarr_timer_function             3.12%                testhrarr_timer_function                __run_hrtimer                hrtimer_interrupt                smp_APIc_timer_interrupt                APIc_timer_interrupt                cpuIDle_IDle_call                cpu_IDle                start_secondary     3.12%      cat  [testhrarr]    
[k] 0x356                3.12%                0xf84bc356                0xf84bc3a7                seq_read                proc_reg_read                vfs_read                sys_read                syscall_call                0xaa2416                0x8049f4d                __libc_start_main                0x8049081## (For a higher level overvIEw,try: perf report --sort comm,dso)#

因此,由于我们正在使用调试on(Makefile中的-g)构建内核模块,所以即使实时内核不是调试内核,perf也不能找到该模块的符号.所以它在大多数时候正确地将testhrarr_timer_function解释为setter,虽然它没有报告testhrarr_startup(但它报告了testhrarr_proc_show调用它).还有对0xf84bc3a7和0xf84bc356的引用无法解析;但请注意,模块加载为0xf84bc000:

$sudo cat /proc/modules | grep testhrtesthrarr 13433 0 - live 0xf84bc000

……并且该条目也以…开头[k] 0x356;如果我们查看内核模块的objdump:

$ objdump -S testhrarr.ko | less
...
00000323 <testhrarr_startup>:

static void testhrarr_startup(void)
{
...
    testhrarr_arr[0] = 0; //just the first element
 34b:   a1 80 00 00 00          mov    0x80,%eax
 350:   c7 00 00 00 00 00       movl   $0x0,(%eax)
    hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
 356:   c7 04 24 01 00 00 00    movl   $0x1,(%esp)   **********
 35d:   8b 15 1c 00 00 00       mov    0x1c,%edx
...
00000375 <testhrarr_proc_show>:

static int testhrarr_proc_show(struct seq_file *m, void *v) {
...
    seq_printf(m, "testhrarr proc: startup\n");
 38f:   c7 44 24 04 79 00 00    movl   $0x79,0x4(%esp)
 396:   00
 397:   8b 45 fc                mov    -0x4(%ebp),%eax
 39a:   89 04 24                mov    %eax,(%esp)
 39d:   e8 fc ff ff ff          call   39e
    testhrarr_startup();
 3a2:   e8 7c ff ff ff          call   323
 3a7:   eb 1c                   jmp    3c5   **********
  } else {
    seq_printf(m, "testhrarr proc: (is running,%d)\n", testhrarr_runcount);
 3a9:   a1 0c 00 00 00          mov    0xc,%eax
...

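...so 0xf84bc356 (offset 0x356) is the instruction that objdump lists under the hrtimer_start line. It comes immediately after the store to testhrarr_arr[0] at 0x350, which fits: an x86 data breakpoint fires after the write has completed, so the reported address is that of the following instruction. Likewise 0xf84bc3a7 (offset 0x3a7) is the return address in the calling testhrarr_proc_show function, right after the call to testhrarr_startup; thankfully, that all makes sense. (Note that with different versions of the driver I have seen the opposite, where _startup was shown by name and timer_function was represented by bare addresses; I am not sure what causes this.)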
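The manual lookup above can be sketched as a small C helper: subtract the module's load address (the last column of /proc/modules) from the unresolved address, then search for the resulting offset in the objdump listing. The name module_offset is made up for this illustration and is not part of testhrarr.c; the sketch also assumes the module's code starts at the load address, as it appears to in this run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of testhrarr.c): convert an absolute
 * kernel address into an offset inside a module, given the module's
 * load address as shown in the last column of /proc/modules. */
uint32_t module_offset(uint32_t addr, uint32_t module_base)
{
    /* The address must lie at or above the load address. */
    assert(addr >= module_base);
    return addr - module_base;
}
```

With the values from this run, module_offset(0xf84bc356, 0xf84bc000) yields 0x356 and module_offset(0xf84bc3a7, 0xf84bc000) yields 0x3a7, which are the two offsets marked with asterisks in the objdump output.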

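However, one problem with perf is that it gives me a statistical "Overhead" for these functions (I am not sure what that refers to; it looks like each entry's share of the recorded samples rather than time spent between entry to and exit from the function), whereas what I really want is a sequential log of stack traces. I am not sure whether perf can be set up for that, but it can definitely be done with kernel module code for hardware breakpoints.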

Using the kernel module HW breakpoint

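The code guarded by HWDEBUG_STACK implements the setup and handling of the HW breakpoint. As noted above, the default value of the symbol name ksym_name (if unspecified) is testhrarr_arr, which does not trigger the hardware breakpoint at all. The symbol can be specified on the command line during insmod through the ksym parameter; here we can note that:

$ sudo rmmod testhrarr    # remove module if still loaded
$ sudo insmod ./testhrarr.ko ksym=testhrarr_arr[0]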

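...results in "HW Breakpoint for testhrarr_arr[0] write installed (0x(null))" in /var/log/syslog, which means that we cannot use symbols with bracket notation for array access. Thankfully, the null pointer here simply means that the HW breakpoint will again not fire; it does not completely crash the OS :)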
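The (0x(null)) result is what one would expect if the lookup matches only exact symbol names: kallsyms_lookup_name returns 0 when the name is not found, and "testhrarr_arr[0]" is a C expression, not a symbol. The following is a userspace model of that behaviour, not kernel code; the table and the names model_lookup_name and can_watch are assumptions made for illustration, and the addresses are the ones printed by the module in this run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace model of a symbol table: exact names map to addresses. */
struct sym {
    const char *name;
    uintptr_t addr;
};

static const struct sym symtab[] = {
    { "testhrarr_runcount", 0xf84be22cu },
    { "testhrarr_arr",      0xf84be2a0u },
};

/* Hypothetical stand-in for kallsyms_lookup_name(): returns 0 when the
 * string is not an exact symbol name. */
uintptr_t model_lookup_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].addr;
    }
    return 0;
}

/* Returns 1 if a breakpoint on this name would get a usable address. */
int can_watch(const char *name)
{
    return model_lookup_name(name) != 0;
}
```

In the module itself, a similar check on the lookup result before registering the breakpoint would let the load fail early instead of silently installing a watchpoint at address 0.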

However, there is a global variable made to refer to the first element of the testhrarr_arr array, called testhrarr_arr_first. Note how this global variable is handled specially in the code: it needs to be dereferenced so that the proper address is obtained. So we do:

$ sudo rmmod testhrarr    # remove module if still loaded
$ sudo insmod ./testhrarr.ko ksym=testhrarr_arr_first

...and the syslog notes:

kernel: [43910.509726] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
kernel: [43910.509765]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xedf6c5c0 (0xedf6c5c0) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
kernel: [43910.538535] HW Breakpoint for testhrarr_arr_first write installed (0xedf6c5c0)

...and we can see that the HW breakpoint is set at 0xedf6c5c0, which is the address of testhrarr_arr[0]. Now, if we trigger the driver through the /proc file:

$ cat /proc/testhrarr_proc
testhrarr proc: startup

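...we get in syslog:

kernel: [44069.735695] testhrarr_arr_first value is changed
[44069.735711] Pid: 29320, comm: cat Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.735719] Call Trace:
[44069.735737] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.735755] [] ? __perf_event_overflow+0x90/0x240
[44069.735768] [] ? proc_alloc_inode+0x23/0x90
[44069.735778] [] ? proc_alloc_inode+0x23/0x90
[44069.735790] [] ? perf_swevent_event+0x136/0x140
[44069.735801] [] ? perf_bp_event+0x70/0x80
[44069.735812] [] ? prep_new_page+0x110/0x1a0
[44069.735824] [] ? get_page_from_freelist+0x12e/0x320
[44069.735836] [] ? seq_open+0x3d/0xa0
[44069.735848] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.735861] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.735872] [] ? notifier_call_chain+0x45/0x60
[44069.735883] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.735894] [] ? notify_die+0x2d/0x30
[44069.735904] [] ? do_debug+0x88/0x180
[44069.735915] [] ? debug_stack_correct+0x30/0x38
[44069.735928] [] ? testhrarr_startup+0x33/0x52 [testhrarr]
[44069.735940] [] ? testhrarr_proc_show+0x32/0x57 [testhrarr]
[44069.735952] [] ? seq_read+0x145/0x390
[44069.735963] [] ? seq_read+0x0/0x390
[44069.735973] [] ? proc_reg_read+0x64/0xa0
[44069.735985] [] ? vfs_read+0x9f/0x160
[44069.735995] [] ? proc_reg_read+0x0/0xa0
[44069.736003] [] ? sys_read+0x42/0x70
[44069.736013] [] ? syscall_call+0x7/0xb
[44069.736019] Dump stack from sample_hbp_handler
[44069.740132] testhrarr_timer_function: testhrarr_runcount 0
[44069.740146] testhrarr jiffies 10942435 ; ret: 1 ; ktnsec: 44069740142485
[44069.740159] testhrarr_arr_first value is changed
[44069.740169] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.740176] Call Trace:
[44069.740195] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.740213] [] ? __perf_event_overflow+0x90/0x240
[44069.740227] [] ? perf_swevent_event+0x136/0x140
[44069.740239] [] ? perf_bp_event+0x70/0x80
[44069.740253] [] ? sched_clock_local+0xd3/0x1c0
[44069.740267] [] ? format_decode+0x323/0x380
[44069.740280] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.740292] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.740302] [] ? notifier_call_chain+0x45/0x60
[44069.740313] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.740324] [] ? notify_die+0x2d/0x30
[44069.740335] [] ? do_debug+0x88/0x180
[44069.740345] [] ? debug_stack_correct+0x30/0x38
[44069.740364] [] ? init_intel_cacheinfo+0x103/0x394
[44069.740379] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
[44069.740391] [] ? __run_hrtimer+0x6f/0x190
[44069.740404] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
[44069.740416] [] ? hrtimer_interrupt+0x108/0x240
[44069.740430] [] ? smp_apic_timer_interrupt+0x56/0x8a
[44069.740441] [] ? apic_timer_interrupt+0x31/0x38
[44069.740453] [] ? _raw_spin_unlock_irqrestore+0x15/0x20
[44069.740465] [] ? try_to_del_timer_sync+0x67/0xb0
[44069.740476] [] ? del_timer_sync+0x29/0x50
[44069.740486] [] ? flush_delayed_work+0x13/0x40
[44069.740500] [] ? tty_flush_to_ldisc+0x12/0x20
[44069.740510] [] ? n_tty_poll+0x4f/0x190
[44069.740523] [] ? tty_poll+0x6d/0x90
[44069.740531] [] ? n_tty_poll+0x0/0x190
[44069.740542] [] ? do_poll.clone.3+0xd0/0x210
[44069.740553] [] ? do_sys_poll+0x134/0x1e0
[44069.740563] [] ? __pollwait+0x0/0xd0
[44069.740572] [] ? pollwake+0x0/0x60
...
[44069.740742] [] ? pollwake+0x0/0x60
[44069.740757] [] ? rw_verify_area+0x6c/0x130
[44069.740770] [] ? ktime_get_ts+0xf8/0x120
[44069.740781] [] ? poll_select_set_timeout+0x64/0x70
[44069.740793] [] ? sys_poll+0x5a/0xd0
[44069.740804] [] ? syscall_call+0x7/0xb
[44069.740815] [] ? init_intel_cacheinfo+0x23/0x394
[44069.740822] Dump stack from sample_hbp_handler
[44069.744130] testhrarr_timer_function: testhrarr_runcount 1
[44069.744143] testhrarr jiffies 10942436 ; ret: 1 ; ktnsec: 44069744140055
[44069.748132] testhrarr_timer_function: testhrarr_runcount 2
[44069.748145] testhrarr jiffies 10942437 ; ret: 1 ; ktnsec: 44069748141271
[44069.752131] testhrarr_timer_function: testhrarr_runcount 3
[44069.752145] testhrarr jiffies 10942438 ; ret: 1 ; ktnsec: 44069752141164
[44069.756131] testhrarr_timer_function: testhrarr_runcount 4
[44069.756141] testhrarr jiffies 10942439 ; ret: 1 ; ktnsec: 44069756138318
[44069.760130] testhrarr_timer_function: testhrarr_runcount 5
[44069.760141] testhrarr jiffies 10942440 ; ret: 1 ; ktnsec: 44069760138469
[44069.760154] testhrarr_arr_first value is changed
[44069.760164] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.760170] Call Trace:
[44069.760187] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.760202] [] ? __perf_event_overflow+0x90/0x240
[44069.760213] [] ? perf_swevent_event+0x136/0x140
[44069.760224] [] ? perf_bp_event+0x70/0x80
[44069.760235] [] ? sched_clock_local+0xd3/0x1c0
[44069.760247] [] ? format_decode+0x323/0x380
[44069.760258] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.760269] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.760279] [] ? notifier_call_chain+0x45/0x60
[44069.760289] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.760299] [] ? notify_die+0x2d/0x30
[44069.760308] [] ? do_debug+0x88/0x180
[44069.760318] [] ? debug_stack_correct+0x30/0x38
[44069.760334] [] ? init_intel_cacheinfo+0x103/0x394
[44069.760345] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
[44069.760356] [] ? __run_hrtimer+0x6f/0x190
[44069.760366] [] ? send_to_group.clone.1+0xf8/0x150
[44069.760376] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
[44069.760387] [] ? hrtimer_interrupt+0x108/0x240
[44069.760396] [] ? fsnotify+0x1a5/0x290
[44069.760407] [] ? smp_apic_timer_interrupt+0x56/0x8a
[44069.760416] [] ? apic_timer_interrupt+0x31/0x38
[44069.760428] [] ? mem_cgroup_resize_limit+0x108/0x1c0
[44069.760437] [] ? fput+0x0/0x30
[44069.760446] [] ? sys_write+0x67/0x70
[44069.760455] [] ? syscall_call+0x7/0xb
[44069.760464] [] ? init_intel_cacheinfo+0x23/0x394
[44069.760470] Dump stack from sample_hbp_handler
[44069.764134] testhrarr_timer_function: testhrarr_runcount 6
[44069.764147] testhrarr jiffies 10942441 ; ret: 1 ; ktnsec: 44069764144141
[44069.768133] testhrarr_timer_function: testhrarr_runcount 7
[44069.768146] testhrarr jiffies 10942442 ; ret: 1 ; ktnsec: 44069768142976
[44069.772134] testhrarr_timer_function: testhrarr_runcount 8
[44069.772148] testhrarr jiffies 10942443 ; ret: 1 ; ktnsec: 44069772144121
[44069.776132] testhrarr_timer_function: testhrarr_runcount 9
[44069.776145] testhrarr jiffies 10942444 ; ret: 1 ; ktnsec: 44069776141971
[44069.780133] testhrarr_timer_function: testhrarr_runcount 10
[44069.780141] testhrarr [ 5,7,9,11,13,]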

...we get a stack trace exactly three times: once during testhrarr_startup, and twice in testhrarr_timer_function, once for runcount == 0 and once for runcount == 5, as expected.
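The count of three can be checked with a userspace model of the module's write pattern. This is only a sketch under stated assumptions: there are no timers, simulate_writes is a name made up for this illustration, and the loop simply replays what testhrarr_startup and testhrarr_timer_function do to the array.

```c
#include <assert.h>
#include <string.h>

#define ARRSIZE 5
#define MAXRUNS (2 * ARRSIZE)

/* Userspace model of the module's writes (hypothetical helper).
 * Fills arr with the final array contents and returns how many times
 * element 0 was written, i.e. how often a write watchpoint on
 * testhrarr_arr[0] would be expected to fire. */
int simulate_writes(int arr[ARRSIZE])
{
    int writes_to_first = 0;
    int runcount;

    assert(arr != NULL);
    memset(arr, 0, ARRSIZE * sizeof(int));  /* kcalloc() zeroes the array */

    arr[0] = 0;                             /* testhrarr_startup() */
    writes_to_first++;

    for (runcount = 0; runcount < MAXRUNS; runcount++) {
        int idx = runcount % ARRSIZE;       /* index wraps after ARRSIZE runs */

        arr[idx] += runcount;               /* testhrarr_timer_function() */
        if (idx == 0)
            writes_to_first++;
    }
    return writes_to_first;
}
```

The model counts one write in startup plus the runs where the index is 0 (runcount 0 and 5). Note that the write at runcount 0 adds zero and leaves the value unchanged, yet the log still shows a trace for it: a write watchpoint reacts to the store itself, not to a change in value.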

Well, hope this helps someone,
Cheers!

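Makefile

CONFIG_MODULE_FORCE_UNLOAD=y
# debug build:
# "CFLAGS was changed ... Fix it to use EXTRA_CFLAGS."
override EXTRA_CFLAGS+=-g -O0

obj-m += testhrarr.o
#testhrarr-objs  := testhrarr.o

all:
	@echo EXTRA_CFLAGS = $(EXTRA_CFLAGS)
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean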

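testhrarr.c

/*
 * [http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html#AEN189 The Linux Kernel Module Programming Guide]
 * https://stackoverflow.com/questions/16920238/reliability-of-linux-kernel-add-timer-at-resolution-of-one-jiffy/17055867#17055867
 * https://stackoverflow.com/questions/8516021/proc-create-example-for-kernel-module/18924359#18924359
 * http://lxr.free-electrons.com/source/samples/hw_breakpoint/data_breakpoint.c
 */

#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_INFO */
#include <linux/init.h>     /* Needed for the macros */
#include <linux/jiffies.h>
#include <linux/time.h>
#include <linux/slab.h>     /* kcalloc, kfree */
#include <linux/kallsyms.h> /* kallsyms_lookup_name, KSYM_NAME_LEN */
#include <linux/proc_fs.h>  /* /proc entry */
#include <linux/seq_file.h> /* /proc entry */
#define ARRSIZE 5
#define MAXRUNS 2*ARRSIZE

#include <linux/hrtimer.h>

#define HWDEBUG_STACK 1

#if (HWDEBUG_STACK == 1)
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>

struct perf_event * __percpu *sample_hbp;

static char ksym_name[KSYM_NAME_LEN] = "testhrarr_arr";
module_param_string(ksym, ksym_name, KSYM_NAME_LEN, S_IRUGO);
MODULE_PARM_DESC(ksym, "Kernel symbol to monitor; this module will report any"
      " write operations on the kernel symbol");
#endif

static volatile int testhrarr_runcount = 0;
static volatile int testhrarr_isRunning = 0;

static unsigned long period_ms;
static unsigned long period_ns;
static ktime_t ktime_period_ns;
static struct hrtimer my_hrtimer;

static int* testhrarr_arr;
static int* testhrarr_arr_first;

/* Timer callback: adds runcount to one array element per run, wrapping
 * around the array, and stops after MAXRUNS runs. */
static enum hrtimer_restart testhrarr_timer_function(struct hrtimer *timer)
{
  unsigned long tjnow;
  ktime_t kt_now;
  int ret_overrun;

  printk(KERN_INFO
    " %s: testhrarr_runcount %d \n",
    __func__, testhrarr_runcount);

  if (testhrarr_runcount < MAXRUNS) {
    tjnow = jiffies;
    kt_now = hrtimer_cb_get_time(&my_hrtimer);
    ret_overrun = hrtimer_forward(&my_hrtimer, kt_now, ktime_period_ns);
    printk(KERN_INFO
      " testhrarr jiffies %lu ; ret: %d ; ktnsec: %lld\n",
      tjnow, ret_overrun, ktime_to_ns(kt_now));
    testhrarr_arr[(testhrarr_runcount % ARRSIZE)] += testhrarr_runcount;
    testhrarr_runcount++;
    return HRTIMER_RESTART;
  }
  else {
    int i;
    testhrarr_isRunning = 0;
    // do not use KERN_DEBUG etc, if printk buffering until newline is desired!
    printk("testhrarr_arr [ ");
    for(i=0; i<ARRSIZE; i++) {
      printk("%d,", testhrarr_arr[i]);
    }
    printk("]\n");
    return HRTIMER_NORESTART;
  }
}

static void testhrarr_startup(void)
{
  if (testhrarr_isRunning == 0) {
    testhrarr_isRunning = 1;
    testhrarr_runcount = 0;
    testhrarr_arr[0] = 0; //just the first element
    hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
  }
}

/* Reading /proc/testhrarr_proc starts the timer run if it is idle. */
static int testhrarr_proc_show(struct seq_file *m, void *v) {
  if (testhrarr_isRunning == 0) {
    seq_printf(m, "testhrarr proc: startup\n");
    testhrarr_startup();
  } else {
    seq_printf(m, "testhrarr proc: (is running,%d)\n", testhrarr_runcount);
  }
  return 0;
}

static int testhrarr_proc_open(struct inode *inode, struct  file *file) {
  return single_open(file, testhrarr_proc_show, NULL);
}

static const struct file_operations testhrarr_proc_fops = {
  .owner = THIS_MODULE,
  .open = testhrarr_proc_open,
  .read = seq_read,
  .llseek = seq_lseek,
  .release = single_release,
};

#if (HWDEBUG_STACK == 1)
/* Called on every write to the watched address; logs a stack trace. */
static void sample_hbp_handler(struct perf_event *bp,
             struct perf_sample_data *data,
             struct pt_regs *regs)
{
  printk(KERN_INFO "%s value is changed\n", ksym_name);
  dump_stack();
  printk(KERN_INFO "Dump stack from sample_hbp_handler\n");
}
#endif

static int __init testhrarr_init(void)
{
  struct timespec tp_hr_res;
  #if (HWDEBUG_STACK == 1)
  struct perf_event_attr attr;
  #endif

  period_ms = 1000/HZ;
  hrtimer_get_res(CLOCK_MONOTONIC, &tp_hr_res);
  printk(KERN_INFO
    "Init testhrarr: %d ; HZ: %d ; 1/HZ (ms): %ld ; hrres: %lld.%.9ld\n",
    testhrarr_runcount, HZ, period_ms, (long long)tp_hr_res.tv_sec, tp_hr_res.tv_nsec );

  testhrarr_arr = (int*)kcalloc(ARRSIZE, sizeof(int), GFP_ATOMIC);
  testhrarr_arr_first = &testhrarr_arr[0];

  hrtimer_init(&my_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
  my_hrtimer.function = &testhrarr_timer_function;
  period_ns = period_ms*( (unsigned long)1E6L );
  ktime_period_ns = ktime_set(0, period_ns);

  printk(KERN_INFO
    " Addresses: _runcount 0x%p ; _arr 0x%p ; _arr[0] 0x%p (0x%p) ; _timer_function 0x%p ; my_hrtimer 0x%p; my_hrt.f 0x%p\n",
    &testhrarr_runcount, &testhrarr_arr, &(testhrarr_arr[0]), testhrarr_arr_first,
    &testhrarr_timer_function, &my_hrtimer, &my_hrtimer.function);

  proc_create("testhrarr_proc", 0, NULL, &testhrarr_proc_fops);

  #if (HWDEBUG_STACK == 1)
  hw_breakpoint_init(&attr);
  if (strcmp(ksym_name, "testhrarr_arr_first") == 0) {
    // just for testhrarr_arr_first - interpret the found symbol address
    // as int*, and dereference it to get the "real" address it points to
    attr.bp_addr = *((int*)kallsyms_lookup_name(ksym_name));
  } else {
    // the usual - address is kallsyms_lookup_name result
    attr.bp_addr = kallsyms_lookup_name(ksym_name);
  }
  attr.bp_len = HW_BREAKPOINT_LEN_1;
  attr.bp_type = HW_BREAKPOINT_W ; //| HW_BREAKPOINT_R;

  sample_hbp = register_wide_hw_breakpoint(&attr, (perf_overflow_handler_t)sample_hbp_handler);
  if (IS_ERR((void __force *)sample_hbp)) {
    int ret = PTR_ERR((void __force *)sample_hbp);
    printk(KERN_INFO "Breakpoint registration failed\n");
    return ret;
  }

  // explicit cast needed to show 64-bit bp_addr as 32-bit address
  // https://stackoverflow.com/questions/11796909/how-to-resolve-cast-to-pointer-from-integer-of-different-size-warning-in-c-co/11797103#11797103
  printk(KERN_INFO "HW Breakpoint for %s write installed (0x%p)\n", ksym_name, (void*)(uintptr_t)attr.bp_addr);
  #endif

  return 0;
}

static void __exit testhrarr_exit(void)
{
  int ret_cancel = 0;
  while( hrtimer_callback_running(&my_hrtimer) ) {
    ret_cancel++;
  }
  if (ret_cancel != 0) {
    printk(KERN_INFO " testhrarr Waited for hrtimer callback to finish (%d)\n", ret_cancel);
  }
  if (hrtimer_active(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testhrarr active hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
  }
  if (hrtimer_is_queued(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testhrarr queued hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
  }
  // free the array only after the timer is stopped, so the callback
  // cannot touch freed memory
  kfree(testhrarr_arr);
  remove_proc_entry("testhrarr_proc", NULL);
  #if (HWDEBUG_STACK == 1)
  unregister_wide_hw_breakpoint(sample_hbp);
  printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n", ksym_name);
  #endif
  printk(KERN_INFO "Exit testhrarr\n");
}

module_init(testhrarr_init);
module_exit(testhrarr_exit);

MODULE_LICENSE("GPL");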
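The special case for testhrarr_arr_first in the listing comes down to the difference between the address of a pointer variable (what a symbol lookup returns) and the address stored in it (the first array element, which is what should be watched). Here is a minimal userspace sketch of that step; pointee_address is a hypothetical name, not something the module defines. The module reads the stored pointer through an int*, which matches the pointer size on the 32-bit i686 system used here; the sketch copies a full pointer-sized value instead, on the assumption that this is what a 64-bit build would need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: given the address of a pointer variable (what a
 * symbol lookup on "testhrarr_arr_first" would return), read the
 * pointer stored there and return the address it points to. */
uintptr_t pointee_address(uintptr_t symbol_addr)
{
    void *stored;

    assert(symbol_addr != 0);
    /* Copy a full pointer-sized value out of the variable, so the
     * result is not truncated where pointers are wider than int. */
    memcpy(&stored, (const void *)symbol_addr, sizeof(stored));
    return (uintptr_t)stored;
}
```

For a local `int arr[5]` and `int *first = &arr[0]`, pointee_address((uintptr_t)&first) returns the address of arr[0], not the address of first itself. That mirrors the two addresses in the syslog above: 0xf84be2a0 for the _arr pointer variable versus 0xedf6c5c0 for _arr[0].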


ARRSIZE)] += testhrarr_runcount;    testhrarr_runcount++;    return HRTIMER_RESTART;  }  else {    int i;    testhrarr_isRunning = 0;    // do not use KERN_DEBUG etc,if printk buffering until newline is desired!    printk("testhrarr_arr [ ");    for(i=0; i<ARRSIZE; i++) {      printk("%d,",testhrarr_arr[i]);    }    printk("]\n");    return HRTIMER_norESTART;  }}static voID testhrarr_startup(voID){  if (testhrarr_isRunning == 0) {    testhrarr_isRunning = 1;    testhrarr_runcount = 0;    testhrarr_arr[0] = 0; //just the first element    hrtimer_start(&my_hrtimer,HRTIMER_MODE_REL);  }}static int testhrarr_proc_show(struct seq_file *m,voID *v) {  if (testhrarr_isRunning == 0) {    seq_printf(m,"testhrarr proc: startup\n");    testhrarr_startup();  } else {    seq_printf(m,testhrarr_runcount);  }  return 0;}static int testhrarr_proc_open(struct inode *inode,struct  file *file) {  return single_open(file,testhrarr_proc_show,NulL);}static const struct file_operations testhrarr_proc_fops = {  .owner = THIS_MODulE,.open = testhrarr_proc_open,.read = seq_read,.llseek = seq_lseek,.release = single_release,};#if (HWDEBUG_STACK == 1)static voID sample_hbp_handler(struct perf_event *bp,struct perf_sample_data *data,struct pt_regs *regs){  printk(KERN_INFO "%s value is changed\n",ksym_name);  dump_stack();  printk(KERN_INFO "Dump stack from sample_hbp_handler\n");}#endifstatic int __init testhrarr_init(voID){  struct timespec tp_hr_res;  #if (HWDEBUG_STACK == 1)  struct perf_event_attr attr;  #endif  period_ms = 1000/HZ;  hrtimer_get_res(CLOCK_MONOTONIC,&tp_hr_res);  printk(KERN_INFO    "Init testhrarr: %d ; HZ: %d ; 1/HZ (ms): %ld ; hrres: %lld.%.9ld\n",testhrarr_runcount,HZ,period_ms,(long long)tp_hr_res.tv_sec,tp_hr_res.tv_nsec );  testhrarr_arr = (int*)kcalloc(ARRSIZE,sizeof(int),GFP_ATOMIC);  testhrarr_arr_first = &testhrarr_arr[0];  hrtimer_init(&my_hrtimer,CLOCK_MONOTONIC,HRTIMER_MODE_REL);  my_hrtimer.function = &testhrarr_timer_function;  period_ns = period_ms*( 
(unsigned long)1E6L );  ktime_period_ns = ktime_set(0,period_ns);  printk(KERN_INFO    " Addresses: _runcount 0x%p ; _arr 0x%p ; _arr[0] 0x%p (0x%p) ; _timer_function 0x%p ; my_hrtimer 0x%p; my_hrt.f 0x%p\n",&testhrarr_runcount,&testhrarr_arr,&(testhrarr_arr[0]),testhrarr_arr_first,&testhrarr_timer_function,&my_hrtimer,&my_hrtimer.function);  proc_create("testhrarr_proc",NulL,&testhrarr_proc_fops);  #if (HWDEBUG_STACK == 1)  hw_breakpoint_init(&attr);  if (strcmp(ksym_name,"testhrarr_arr_first") == 0) {    // just for testhrarr_arr_first - interpret the found symbol address    // as int*,and dereference it to get the "real" address it points to    attr.bp_addr = *((int*)kallsyms_lookup_name(ksym_name));  } else {    // the usual - address is kallsyms_lookup_name result    attr.bp_addr = kallsyms_lookup_name(ksym_name);  }  attr.bp_len = HW_BREAKPOINT_LEN_1;  attr.bp_type = HW_BREAKPOINT_W ; //| HW_BREAKPOINT_R;  sample_hbp = register_wIDe_hw_breakpoint(&attr,(perf_overflow_handler_t)sample_hbp_handler);  if (IS_ERR((voID __force *)sample_hbp)) {    int ret = PTR_ERR((voID __force *)sample_hbp);    printk(KERN_INFO "Breakpoint registration Failed\n");    return ret;  }  // explicit cast needed to show 64-bit bp_addr as 32-bit address  // https://stackoverflow.com/questions/11796909/how-to-resolve-cast-to-pointer-from-integer-of-different-size-warning-in-c-co/11797103#11797103  printk(KERN_INFO "HW Breakpoint for %s write installed (0x%p)\n",(voID*)(uintptr_t)attr.bp_addr);  #endif  return 0;}static voID __exit testhrarr_exit(voID){  int ret_cancel = 0;  kfree(testhrarr_arr);  while( hrtimer_callback_running(&my_hrtimer) ) {    ret_canceL++;  }  if (ret_cancel != 0) {    printk(KERN_INFO " testhrarr Waited for hrtimer callback to finish (%d)\n",ret_cancel);  }  if (hrtimer_active(&my_hrtimer) != 0) {    ret_cancel = hrtimer_cancel(&my_hrtimer);    printk(KERN_INFO " testhrarr active hrtimer cancelled: %d (%d)\n",ret_cancel,testhrarr_runcount);  }  if 
(hrtimer_is_queued(&my_hrtimer) != 0) {    ret_cancel = hrtimer_cancel(&my_hrtimer);    printk(KERN_INFO " testhrarr queued hrtimer cancelled: %d (%d)\n",testhrarr_runcount);  }  remove_proc_entry("testhrarr_proc",NulL);  #if (HWDEBUG_STACK == 1)  unregister_wIDe_hw_breakpoint(sample_hbp);  printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n",ksym_name);  #endif  printk(KERN_INFO "Exit testhrarr\n");}module_init(testhrarr_init);module_exit(testhrarr_exit);MODulE_liCENSE("GPL");

…我们在syslog中获取:

  

…我们得到一个堆栈跟踪正好三次 – 一次在testhrarr_startup期间,两次在testhrarr_timer_function中:一次用于runco​​unt == 0,一次用于runco​​unt == 5,如预期的那样.

嗯,希望这有助于某人,
干杯!

Makefile文件

testhrarr.c

[+++] 总结

以上是内存溢出为你收集整理的调试 – 观察Linux内核中的变量(内存地址)更改,并在更改时打印堆栈跟踪?全部内容,希望文章能够帮你解决调试 – 观察Linux内核中的变量(内存地址)更改,并在更改时打印堆栈跟踪?所遇到的程序开发问题。

如果觉得内存溢出网站内容还不错,欢迎将内存溢出网站推荐给程序员好友。

)
File: /www/wwwroot/outofmemory.cn/tmp/route_read.php, Line: 126, InsideLink()
File: /www/wwwroot/outofmemory.cn/tmp/index.inc.php, Line: 166, include(/www/wwwroot/outofmemory.cn/tmp/route_read.php)
File: /www/wwwroot/outofmemory.cn/index.php, Line: 30, include(/www/wwwroot/outofmemory.cn/tmp/index.inc.php)
Debugging: watching a variable (memory address) change in the Linux kernel, and printing a stack trace when it changes?

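I would like to somehow "watch" a variable (or rather, a memory address) in the Linux kernel (in a kernel module/driver, to be exact) and find out what changed it; basically, print a stack trace whenever the variable changes.

For example, in the kernel module testjiffy-hr.c listed at the end of this answer, I would like to print a stack trace every time the runcount variable changes; hopefully the stack trace would then mention testjiffy_timer_function, which is indeed the function that changes that variable.

Now, I know that I can use kgdb to connect to a debug Linux kernel running in a virtual machine, and even set breakpoints (so hopefully watchpoints too). The problem is that I actually want to debug an ALSA driver, specifically the playback dma_area buffer (where I am getting some unexpected data), which is highly timing-sensitive; running a debug kernel would by itself disturb the timing (let alone running it in a virtual machine).

The bigger problem here is that the playback dma_area pointer exists only during a playback operation (in other words, between the _start and _stop handlers), so I would have to record the dma_area address in every _start callback, and then somehow "schedule" it for "watching" during the playback operation.

So I was hoping there is a way to do something like this directly in driver code: for example, add some code in the _start callback that records the dma_area pointer and uses it as the argument to a command that starts a "watch" for changes, with the stack trace printed from a corresponding callback function. (I know that this would influence the timing too, but I hope it would be "light" enough not to affect "live" driver operation too much.)

So my question is: does such a technique for debugging in the Linux kernel exist?

If not: is it possible to set up a hardware (or software) interrupt that reacts to a change at a specific memory address? Could I then set up an interrupt handler that prints a stack trace? (Although I think the whole context changes when IRQ handlers run, so a stack trace obtained there might be wrong.)

If not: are there any other techniques that would allow me to print the stack trace of the process that changes a value stored at a given memory location in the kernel (hopefully in a live, non-debug kernel)?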

Solution

Many thanks to @CosminRatiu and Eugene for their answers; thanks to those, I found:

> debugging – Linux kernel hardware break points – Stack Overflow
> Hardware Breakpoint (or watchpoint) – The Linux Kernel Archives

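...with which I could develop the example I am posting here: the testhrarr.c kernel module/driver and its Makefile (below). It shows that hardware watchpoint tracing can be achieved in two ways: using the perf program, which can probe the driver unchanged; or by adding some hardware breakpoint code to the driver (in the example, enclosed by the HWDEBUG_STACK define).

In essence, debugging the contents of standard atomic variable types such as int (like the runcount variable) is straightforward, as long as they are defined as global variables in the kernel module, so that they end up visible globally as kernel symbols. For that reason, the code below adds testhrarr_ as a prefix to the variables (to avoid naming conflicts). Debugging the contents of arrays can be a bit trickier, however, because of the need to dereference; that is what this post demonstrates, by debugging the first byte of the testhrarr_arr array. It was done on: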

$ echo `cat /etc/lsb-release`
DISTRIB_ID=Ubuntu DISTRIB_RELEASE=11.04 DISTRIB_CODENAME=natty DISTRIB_DESCRIPTION="Ubuntu 11.04"
$ uname -a
Linux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/Linux
$ cat /proc/cpuinfo | grep "model name"
model name  : Intel(R) Atom(TM) CPU N450   @ 1.66GHz
model name  : Intel(R) Atom(TM) CPU N450   @ 1.66GHz

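The testhrarr module basically allocates memory for a small array at module init, sets up a timer function, and exposes a /proc/testhrarr_proc file (using the newer proc_create interface). An attempt to read from /proc/testhrarr_proc (for example, with cat) then triggers the timer function, which modifies the testhrarr_arr array values and dumps messages to /var/log/syslog. We expect testhrarr_arr[0] to change three times during the operation: once in testhrarr_startup, and twice in testhrarr_timer_function (because the index wraps around).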
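The expected count of three writes can be checked without loading anything into the kernel. What follows is only a userspace sketch: replay_writes is a hypothetical stand-in that replays the index arithmetic of testhrarr_startup and testhrarr_timer_function, with ARRSIZE and MAXRUNS taken from the module listing; it is not part of the module itself.

```c
#define ARRSIZE 5
#define MAXRUNS (2 * ARRSIZE)

/* Hypothetical userspace stand-in for the module: replays the writes that
 * testhrarr_startup and testhrarr_timer_function perform on the array, and
 * returns how many of them hit index 0 (the watched element). */
int replay_writes(int arr[ARRSIZE])
{
    int hits = 0;
    int runcount;
    int i;

    for (i = 0; i < ARRSIZE; i++)
        arr[i] = 0;               /* kcalloc hands back a zeroed array */

    arr[0] = 0;                   /* the write in testhrarr_startup */
    hits++;

    for (runcount = 0; runcount < MAXRUNS; runcount++) {
        int idx = runcount % ARRSIZE;

        arr[idx] += runcount;     /* the write in testhrarr_timer_function */
        if (idx == 0)
            hits++;               /* index wraps to 0 at runcount 0 and 5 */
    }
    return hits;
}
```

Called with a five-element int array, replay_writes should report three writes to index 0 (the startup write, then runcount 0 and runcount 5) and leave the array holding 5, 7, 9, 11, 13, which agrees with the final array line in the syslog output.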

Using perf

After building the module with make, you can load it with:

sudo insmod ./testhrarr.ko

At that point, /var/log/syslog will contain:

kernel: [40277.199913] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
kernel: [40277.199930]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xed182a80 (0xed182a80) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
kernel: [40277.220329] HW Breakpoint for testhrarr_arr write installed (0xf84be2a0)

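Note that just passing testhrarr_arr as the symbol for the hardware watchpoint watches the address of that pointer variable (0xf84be2a0), not the address of the first element of the array (0xed182a80)! Because of that, the hardware breakpoint never triggers, so the behaviour is as if the hardware breakpoint code were not present at all (which can also be achieved by undefining HWDEBUG_STACK)!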
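The difference between the two addresses can be reproduced in plain userspace C. The sketch below is an illustration only: watch_address is a hypothetical helper, not a kernel API. It takes the address of a symbol and, when the symbol is itself a pointer (as testhrarr_arr is), reads the pointer stored there to reach the buffer that actually gets written.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: given the address of a symbol, return the address a
 * watchpoint should be placed on. For a plain variable that is the symbol
 * address itself; for a pointer variable the symbol address only holds the
 * pointer, so it is read once to reach the first element of the buffer. */
uintptr_t watch_address(const void *symbol_addr, int symbol_is_pointer)
{
    void *pointee;

    if (!symbol_is_pointer)
        return (uintptr_t)symbol_addr;

    /* Copy out the pointer value stored at the symbol's address. */
    memcpy(&pointee, symbol_addr, sizeof(pointee));
    return (uintptr_t)pointee;
}
```

For a variable declared as int *arr, watch_address(&arr, 0) gives the location of the pointer itself, while watch_address(&arr, 1) gives &arr[0]; only the latter is where element writes land.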

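So, even without a hardware breakpoint set through the kernel module code, we can still use perf to observe changes at a memory address. In perf, we specify the address we want to watch (here the address of the first element of testhrarr_arr, 0xed182a80) and the process that should be run: here we run bash, so that we can execute cat /proc/testhrarr_proc, which triggers the kernel module timer, followed by sleep 0.5, which gives the timer time to finish. The -a argument is also needed, otherwise some events may be missed: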

$ sudo perf record -a --call-graph --event=mem:0xed182a80:w bash -c 'cat /proc/testhrarr_proc ; sleep 0.5'
testhrarr proc: startup
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.485 MB perf.data (~21172 samples) ]
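Because the array address changes on every module load, the event string has to be rebuilt each time from the address printed in the syslog. A small helper can do the formatting; this is only a sketch, format_mem_event is a hypothetical name, and the only perf syntax it relies on is the mem:<address>:<access> form used in the command above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: build a perf breakpoint event string such as
 * "mem:0xed182a80:w" from an address and an access string ("w" here).
 * Returns 0 on success, or -1 if the buffer is too small. */
int format_mem_event(char *buf, size_t buflen, uint64_t addr, const char *access)
{
    int n = snprintf(buf, buflen, "mem:0x%llx:%s",
                     (unsigned long long)addr, access);

    if (n < 0 || (size_t)n >= buflen)
        return -1;                /* output error or truncated */
    return 0;
}
```

The resulting string is what would be passed to perf record through --event=.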

At this point, /var/log/syslog also contains something like:

[40822.114964]  testhrarr_timer_function: testhrarr_runcount 0
[40822.114980]  testhrarr jiffies 10130528 ; ret: 1 ; ktnsec: 40822114975062
[40822.118956]  testhrarr_timer_function: testhrarr_runcount 1
[40822.118977]  testhrarr jiffies 10130529 ; ret: 1 ; ktnsec: 40822118973195
[40822.122940]  testhrarr_timer_function: testhrarr_runcount 2
[40822.122956]  testhrarr jiffies 10130530 ; ret: 1 ; ktnsec: 40822122951143
[40822.126962]  testhrarr_timer_function: testhrarr_runcount 3
[40822.126978]  testhrarr jiffies 10130531 ; ret: 1 ; ktnsec: 40822126973583
[40822.130941]  testhrarr_timer_function: testhrarr_runcount 4
[40822.130961]  testhrarr jiffies 10130532 ; ret: 1 ; ktnsec: 40822130955167
[40822.134940]  testhrarr_timer_function: testhrarr_runcount 5
[40822.134962]  testhrarr jiffies 10130533 ; ret: 1 ; ktnsec: 40822134958888
[40822.138936]  testhrarr_timer_function: testhrarr_runcount 6
[40822.138958]  testhrarr jiffies 10130534 ; ret: 1 ; ktnsec: 40822138955693
[40822.142940]  testhrarr_timer_function: testhrarr_runcount 7
[40822.142962]  testhrarr jiffies 10130535 ; ret: 1 ; ktnsec: 40822142959345
[40822.146936]  testhrarr_timer_function: testhrarr_runcount 8
[40822.146957]  testhrarr jiffies 10130536 ; ret: 1 ; ktnsec: 40822146954479
[40822.150949]  testhrarr_timer_function: testhrarr_runcount 9
[40822.150970]  testhrarr jiffies 10130537 ; ret: 1 ; ktnsec: 40822150963438
[40822.154974]  testhrarr_timer_function: testhrarr_runcount 10
[40822.154988] testhrarr [ 5,7,9,11,13,]

To read the perf capture (a file called perf.data), we can use:

$ sudo perf report --call-graph flat --stdio
No kallsyms or vmlinux with build-id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 was found
[testhrarr] with build id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 not found, continuing without symbols
Failed to open /bin/cat, continuing without symbols
Failed to open /usr/lib/libpixman-1.so.0.20.2, continuing without symbols
Failed to open /usr/lib/xorg/modules/drivers/intel_drv.so, continuing without symbols
Failed to open /usr/bin/Xorg, continuing without symbols
# Events: 5  unknown
#
# Overhead  Command  Shared Object                                Symbol
# ........  .......  .............  ....................................
#
    87.50%     Xorg  [testhrarr]    [k] testhrarr_timer_function
            87.50%
                testhrarr_timer_function
                __run_hrtimer
                hrtimer_interrupt
                smp_apic_timer_interrupt
                apic_timer_interrupt
                0x30185d
                0x2ed701
                0x2ed8cc
                0x2edba0
                0x9d0386
                0x8126fc8
                0x81217a1
                0x811bdd3
                0x8070aa7
                0x806281c
                __libc_start_main
                0x8062411

     6.25%      cat  [testhrarr]    [k] testhrarr_timer_function
             6.25%
                testhrarr_timer_function
                testhrarr_proc_show
                seq_read
                proc_reg_read
                vfs_read
                sys_read
                syscall_call
                0xaa2416
                0x8049f4d
                __libc_start_main
                0x8049081

     3.12%  swapper  [testhrarr]    [k] testhrarr_timer_function
             3.12%
                testhrarr_timer_function
                __run_hrtimer
                hrtimer_interrupt
                smp_apic_timer_interrupt
                apic_timer_interrupt
                cpuidle_idle_call
                cpu_idle
                start_secondary

     3.12%      cat  [testhrarr]    [k] 0x356
             3.12%
                0xf84bc356
                0xf84bc3a7
                seq_read
                proc_reg_read
                vfs_read
                sys_read
                syscall_call
                0xaa2416
                0x8049f4d
                __libc_start_main
                0x8049081

#
# (For a higher level overview, try: perf report --sort comm,dso)
#

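So, since we build the kernel module with debugging on (-g in the Makefile), perf has no trouble finding this module's symbols, even though the live kernel is not a debug kernel. It correctly identifies testhrarr_timer_function as the setter most of the time, although it does not report testhrarr_startup (it does report testhrarr_proc_show, which calls it). There are also references to 0xf84bc3a7 and 0xf84bc356 that could not be resolved; note, however, that the module is loaded at 0xf84bc000: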

$ sudo cat /proc/modules | grep testhr
testhrarr 13433 0 - Live 0xf84bc000
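The load address is what lets an unresolved sample address be mapped back to an offset inside the module, which can then be compared with a disassembly. The arithmetic is sketched below under stated assumptions: module_offset is a hypothetical helper, and the base and size arguments are the address and size fields of the /proc/modules line above.

```c
#include <stdint.h>

/* Hypothetical helper: translate an absolute kernel address into an offset
 * inside a loaded module, given the load address and size reported by
 * /proc/modules. Returns 0 and stores the offset on success, or -1 if the
 * address does not fall inside the module. */
int module_offset(uint64_t addr, uint64_t base, uint64_t size, uint64_t *offset)
{
    if (addr < base || addr - base >= size)
        return -1;                /* address lies outside the module */
    *offset = addr - base;
    return 0;
}
```

With base 0xf84bc000 and size 13433, the two unresolved addresses 0xf84bc356 and 0xf84bc3a7 map to offsets 0x356 and 0x3a7, while the array address 0xed182a80 is rejected because it lies outside the module.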

The unresolved entry is also listed as [k] 0x356; if we look at the objdump of the kernel module:

$ objdump -S testhrarr.ko | less
...
00000323 <testhrarr_startup>:
static void testhrarr_startup(void)
{
...
    testhrarr_arr[0] = 0; //just the first element
 34b:   a1 80 00 00 00          mov    0x80,%eax
 350:   c7 00 00 00 00 00       movl   $0x0,(%eax)
    hrtimer_start(&my_hrtimer,ktime_period_ns,HRTIMER_MODE_REL);
 356:   c7 04 24 01 00 00 00    movl   $0x1,(%esp) **********
 35d:   8b 15 1c 00 00 00       mov    0x1c,%edx
...
00000375 <testhrarr_proc_show>:
static int testhrarr_proc_show(struct seq_file *m,void *v) {
...
    seq_printf(m,"testhrarr proc: startup\n");
 38f:   c7 44 24 04 79 00 00    movl   $0x79,0x4(%esp)
 396:   00
 397:   8b 45 fc                mov    -0x4(%ebp),%eax
 39a:   89 04 24                mov    %eax,(%esp)
 39d:   e8 fc ff ff ff          call   39e
    testhrarr_startup();
 3a2:   e8 7c ff ff ff          call   323
 3a7:   eb 1c                   jmp    3c5 **********
  } else {
    seq_printf(m,"testhrarr proc: (is running,%d)\n",testhrarr_runcount);
 3a9:   a1 0c 00 00 00          mov    0xc,%eax
...

...so 0xf84bc356 evidently refers to the hrtimer_start line in testhrarr_startup, that is, the instruction right after the write to testhrarr_arr[0]; and 0xf84bc3a7 -> 3a7 refers to its caller, the testhrarr_proc_show function; which thankfully makes sense. (Note that I have experienced, with different versions of the driver, that _start could be shown while timer_function was represented by bare addresses; I am not sure what that is due to.)

One problem with perf, however, is that it gives me a statistical "overhead" for these functions (not sure what that refers to; probably the time spent between entry to and exit from the function?), whereas what I really want is a sequential log of stack traces. I am not sure whether perf can be set up for that, but it can definitely be done with kernel module code for hardware breakpoints.

Using kernel module HW breakpoints

The code enclosed in HWDEBUG_STACK implements the setup and handling of the HW breakpoint. As noted, the default setting for the symbol ksym_name (if not specified) is testhrarr_arr, which does not trigger the hardware breakpoint at all. The ksym_name parameter can be specified on the command line during insmod; here we can note that:

$ sudo rmmod testhrarr    # remove module if still loaded
$ sudo insmod ./testhrarr.ko ksym=testhrarr_arr[0]

...results in "HW Breakpoint for testhrarr_arr[0] write installed (0x(null))" in /var/log/syslog, which means that we cannot use symbols with bracket notation for array access. Thankfully, the null pointer here simply means that the HW breakpoint will again not trigger; it does not crash the OS outright :)

However, there is a global variable made to refer to the first element of the testhrarr_arr array, called testhrarr_arr_first; note how this global variable is specially handled in the code, and needs to be dereferenced so that the correct address is obtained. So we do:

$ sudo rmmod testhrarr    # remove module if still loaded
$ sudo insmod ./testhrarr.ko ksym=testhrarr_arr_first

...and the syslog reports:

kernel: [43910.509726] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
kernel: [43910.509765]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xedf6c5c0 (0xedf6c5c0) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
kernel: [43910.538535] HW Breakpoint for testhrarr_arr_first write installed (0xedf6c5c0)

...where we can see that the HW breakpoint is set at 0xedf6c5c0, which is the address of testhrarr_arr[0]. Now, if we trigger the driver through the /proc file:

$ cat /proc/testhrarr_proc
testhrarr proc: startup

...we get in syslog:

kernel: [44069.735695] testhrarr_arr_first value is changed
[44069.735711] Pid: 29320, comm: cat Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.735719] Call Trace:
[44069.735737] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.735755] [] ? __perf_event_overflow+0x90/0x240
[44069.735768] [] ? proc_alloc_inode+0x23/0x90
[44069.735778] [] ? proc_alloc_inode+0x23/0x90
[44069.735790] [] ? perf_swevent_event+0x136/0x140
[44069.735801] [] ? perf_bp_event+0x70/0x80
[44069.735812] [] ? prep_new_page+0x110/0x1a0
[44069.735824] [] ? get_page_from_freelist+0x12e/0x320
[44069.735836] [] ? seq_open+0x3d/0xa0
[44069.735848] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.735861] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.735872] [] ? notifier_call_chain+0x45/0x60
[44069.735883] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.735894] [] ? notify_die+0x2d/0x30
[44069.735904] [] ? do_debug+0x88/0x180
[44069.735915] [] ? debug_stack_correct+0x30/0x38
[44069.735928] [] ? testhrarr_startup+0x33/0x52 [testhrarr]
[44069.735940] [] ? testhrarr_proc_show+0x32/0x57 [testhrarr]
[44069.735952] [] ? seq_read+0x145/0x390
[44069.735963] [] ? seq_read+0x0/0x390
[44069.735973] [] ? proc_reg_read+0x64/0xa0
[44069.735985] [] ? vfs_read+0x9f/0x160
[44069.735995] [] ? proc_reg_read+0x0/0xa0
[44069.736003] [] ? sys_read+0x42/0x70
[44069.736013] [] ? syscall_call+0x7/0xb
[44069.736019] Dump stack from sample_hbp_handler
[44069.740132] testhrarr_timer_function: testhrarr_runcount 0
[44069.740146] testhrarr jiffies 10942435 ; ret: 1 ; ktnsec: 44069740142485
[44069.740159] testhrarr_arr_first value is changed
[44069.740169] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.740176] Call Trace:
[44069.740195] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.740213] [] ? __perf_event_overflow+0x90/0x240
[44069.740227] [] ? perf_swevent_event+0x136/0x140
[44069.740239] [] ? perf_bp_event+0x70/0x80
[44069.740253] [] ? sched_clock_local+0xd3/0x1c0
[44069.740267] [] ? format_decode+0x323/0x380
[44069.740280] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.740292] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.740302] [] ? notifier_call_chain+0x45/0x60
[44069.740313] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.740324] [] ? notify_die+0x2d/0x30
[44069.740335] [] ? do_debug+0x88/0x180
[44069.740345] [] ? debug_stack_correct+0x30/0x38
[44069.740364] [] ? init_intel_cacheinfo+0x103/0x394
[44069.740379] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
[44069.740391] [] ? __run_hrtimer+0x6f/0x190
[44069.740404] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
[44069.740416] [] ? hrtimer_interrupt+0x108/0x240
[44069.740430] [] ? smp_apic_timer_interrupt+0x56/0x8a
[44069.740441] [] ? apic_timer_interrupt+0x31/0x38
[44069.740453] [] ? _raw_spin_unlock_irqrestore+0x15/0x20
[44069.740465] [] ? try_to_del_timer_sync+0x67/0xb0
[44069.740476] [] ? del_timer_sync+0x29/0x50
[44069.740486] [] ? flush_delayed_work+0x13/0x40
[44069.740500] [] ? tty_flush_to_ldisc+0x12/0x20
[44069.740510] [] ? n_tty_poll+0x4f/0x190
[44069.740523] [] ? tty_poll+0x6d/0x90
[44069.740531] [] ? n_tty_poll+0x0/0x190
[44069.740542] [] ? do_poll.clone.3+0xd0/0x210
[44069.740553] [] ? do_sys_poll+0x134/0x1e0
[44069.740563] [] ? __pollwait+0x0/0xd0
[44069.740572] [] ? pollwake+0x0/0x60
...
[44069.740742] [] ? pollwake+0x0/0x60
[44069.740757] [] ? rw_verify_area+0x6c/0x130
[44069.740770] [] ? ktime_get_ts+0xf8/0x120
[44069.740781] [] ? poll_select_set_timeout+0x64/0x70
[44069.740793] [] ? sys_poll+0x5a/0xd0
[44069.740804] [] ? syscall_call+0x7/0xb
[44069.740815] [] ? init_intel_cacheinfo+0x23/0x394
[44069.740822] Dump stack from sample_hbp_handler
[44069.744130] testhrarr_timer_function: testhrarr_runcount 1
[44069.744143] testhrarr jiffies 10942436 ; ret: 1 ; ktnsec: 44069744140055
[44069.748132] testhrarr_timer_function: testhrarr_runcount 2
[44069.748145] testhrarr jiffies 10942437 ; ret: 1 ; ktnsec: 44069748141271
[44069.752131] testhrarr_timer_function: testhrarr_runcount 3
[44069.752145] testhrarr jiffies 10942438 ; ret: 1 ; ktnsec: 44069752141164
[44069.756131] testhrarr_timer_function: testhrarr_runcount 4
[44069.756141] testhrarr jiffies 10942439 ; ret: 1 ; ktnsec: 44069756138318
[44069.760130] testhrarr_timer_function: testhrarr_runcount 5
[44069.760141] testhrarr jiffies 10942440 ; ret: 1 ; ktnsec: 44069760138469
[44069.760154] testhrarr_arr_first value is changed
[44069.760164] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
[44069.760170] Call Trace:
[44069.760187] [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
[44069.760202] [] ? __perf_event_overflow+0x90/0x240
[44069.760213] [] ? perf_swevent_event+0x136/0x140
[44069.760224] [] ? perf_bp_event+0x70/0x80
[44069.760235] [] ? sched_clock_local+0xd3/0x1c0
[44069.760247] [] ? format_decode+0x323/0x380
[44069.760258] [] ? hw_breakpoint_handler.clone.0+0x102/0x130
[44069.760269] [] ? hw_breakpoint_exceptions_notify+0x22/0x30
[44069.760279] [] ? notifier_call_chain+0x45/0x60
[44069.760289] [] ? atomic_notifier_call_chain+0x22/0x30
[44069.760299] [] ? notify_die+0x2d/0x30
[44069.760308] [] ? do_debug+0x88/0x180
[44069.760318] [] ? debug_stack_correct+0x30/0x38
[44069.760334] [] ? init_intel_cacheinfo+0x103/0x394
[44069.760345] [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
[44069.760356] [] ? __run_hrtimer+0x6f/0x190
[44069.760366] [] ? send_to_group.clone.1+0xf8/0x150
[44069.760376] [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
[44069.760387] [] ? hrtimer_interrupt+0x108/0x240
[44069.760396] [] ? fsnotify+0x1a5/0x290
[44069.760407] [] ? smp_apic_timer_interrupt+0x56/0x8a
[44069.760416] [] ? apic_timer_interrupt+0x31/0x38
[44069.760428] [] ? mem_cgroup_resize_limit+0x108/0x1c0
[44069.760437] [] ? fput+0x0/0x30
[44069.760446] [] ? sys_write+0x67/0x70
[44069.760455] [] ? syscall_call+0x7/0xb
[44069.760464] [] ? init_intel_cacheinfo+0x23/0x394
[44069.760470] Dump stack from sample_hbp_handler
[44069.764134] testhrarr_timer_function: testhrarr_runcount 6
[44069.764147] testhrarr jiffies 10942441 ; ret: 1 ; ktnsec: 44069764144141
[44069.768133] testhrarr_timer_function: testhrarr_runcount 7
[44069.768146] testhrarr jiffies 10942442 ; ret: 1 ; ktnsec: 44069768142976
[44069.772134] testhrarr_timer_function: testhrarr_runcount 8
[44069.772148] testhrarr jiffies 10942443 ; ret: 1 ; ktnsec: 44069772144121
[44069.776132] testhrarr_timer_function: testhrarr_runcount 9
[44069.776145] testhrarr jiffies 10942444 ; ret: 1 ; ktnsec: 44069776141971
[44069.780133] testhrarr_timer_function: testhrarr_runcount 10
[44069.780141] testhrarr [ 5,]

...we get a stack trace exactly three times: once during testhrarr_startup, and twice in testhrarr_timer_function (once for runcount == 0 and once for runcount == 5), as expected.

Well, hope this helps someone,
Cheers!

Makefile

CONFIG_MODULE_FORCE_UNLOAD=y
# debug build:
# "CFLAGS was changed ... Fix it to use EXTRA_CFLAGS."
override EXTRA_CFLAGS+=-g -O0

obj-m += testhrarr.o
#testhrarr-objs  := testhrarr.o

all:
	@echo EXTRA_CFLAGS = $(EXTRA_CFLAGS)
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

testhrarr.c

/*
 * [http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html#AEN189 The Linux Kernel Module Programming Guide]
 * https://stackoverflow.com/questions/16920238/reliability-of-linux-kernel-add-timer-at-resolution-of-one-jiffy/17055867#17055867
 * https://stackoverflow.com/questions/8516021/proc-create-example-for-kernel-module/18924359#18924359
 * http://lxr.free-electrons.com/source/samples/hw_breakpoint/data_breakpoint.c
 */
#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_INFO */
#include <linux/init.h>     /* Needed for the macros */
#include <linux/jiffies.h>
#include <linux/time.h>
#include <linux/slab.h>     /* kcalloc, kfree */
#include <linux/proc_fs.h>  /* /proc entry */
#include <linux/seq_file.h> /* /proc entry */
#define ARRSIZE 5
#define MAXRUNS (2*ARRSIZE)
#include <linux/hrtimer.h>
#define HWDEBUG_STACK 1
#if (HWDEBUG_STACK == 1)
#include <linux/kallsyms.h> /* kallsyms_lookup_name */
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>
struct perf_event * __percpu *sample_hbp;
static char ksym_name[KSYM_NAME_LEN] = "testhrarr_arr";
module_param_string(ksym, ksym_name, KSYM_NAME_LEN, S_IRUGO);
MODULE_PARM_DESC(ksym, "Kernel symbol to monitor; this module will report any"
      " write operations on the kernel symbol");
#endif
static volatile int testhrarr_runcount = 0;
static volatile int testhrarr_isRunning = 0;
static unsigned long period_ms;
static unsigned long period_ns;
static ktime_t ktime_period_ns;
static struct hrtimer my_hrtimer;
static int* testhrarr_arr;
static int* testhrarr_arr_first;

/* timer callback: adds runcount to one array element per tick, MAXRUNS times */
static enum hrtimer_restart testhrarr_timer_function(struct hrtimer *timer)
{
  unsigned long tjnow;
  ktime_t kt_now;
  int ret_overrun;
  printk(KERN_INFO
    " %s: testhrarr_runcount %d \n", __func__, testhrarr_runcount);
  if (testhrarr_runcount < MAXRUNS) {
    tjnow = jiffies;
    kt_now = hrtimer_cb_get_time(&my_hrtimer);
    ret_overrun = hrtimer_forward(&my_hrtimer, kt_now, ktime_period_ns);
    printk(KERN_INFO
      " testhrarr jiffies %lu ; ret: %d ; ktnsec: %lld\n", tjnow, ret_overrun, ktime_to_ns(kt_now));
    testhrarr_arr[(testhrarr_runcount % ARRSIZE)] += testhrarr_runcount;
    testhrarr_runcount++;
    return HRTIMER_RESTART;
  }
  else {
    int i;
    testhrarr_isRunning = 0;
    // do not use KERN_DEBUG etc, if printk buffering until newline is desired!
    printk("testhrarr_arr [ ");
    for(i=0; i<ARRSIZE; i++) {
      printk("%d,", testhrarr_arr[i]);
    }
    printk("]\n");
    return HRTIMER_NORESTART;
  }
}

static void testhrarr_startup(void)
{
  if (testhrarr_isRunning == 0) {
    testhrarr_isRunning = 1;
    testhrarr_runcount = 0;
    testhrarr_arr[0] = 0; //just the first element
    hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
  }
}

static int testhrarr_proc_show(struct seq_file *m, void *v) {
  if (testhrarr_isRunning == 0) {
    seq_printf(m, "testhrarr proc: startup\n");
    testhrarr_startup();
  } else {
    seq_printf(m, "testhrarr proc: (is running,%d)\n", testhrarr_runcount);
  }
  return 0;
}

static int testhrarr_proc_open(struct inode *inode, struct  file *file) {
  return single_open(file, testhrarr_proc_show, NULL);
}

static const struct file_operations testhrarr_proc_fops = {
  .owner = THIS_MODULE,
  .open = testhrarr_proc_open,
  .read = seq_read,
  .llseek = seq_lseek,
  .release = single_release,
};

#if (HWDEBUG_STACK == 1)
/* called whenever the watched address is written to */
static void sample_hbp_handler(struct perf_event *bp, struct perf_sample_data *data, struct pt_regs *regs)
{
  printk(KERN_INFO "%s value is changed\n", ksym_name);
  dump_stack();
  printk(KERN_INFO "Dump stack from sample_hbp_handler\n");
}
#endif

static int __init testhrarr_init(void)
{
  struct timespec tp_hr_res;
  #if (HWDEBUG_STACK == 1)
  struct perf_event_attr attr;
  #endif
  period_ms = 1000/HZ;
  hrtimer_get_res(CLOCK_MONOTONIC, &tp_hr_res);
  printk(KERN_INFO
    "Init testhrarr: %d ; HZ: %d ; 1/HZ (ms): %ld ; hrres: %lld.%.9ld\n", testhrarr_runcount, HZ, period_ms, (long long)tp_hr_res.tv_sec, tp_hr_res.tv_nsec );
  testhrarr_arr = (int*)kcalloc(ARRSIZE, sizeof(int), GFP_ATOMIC);
  if (!testhrarr_arr)
    return -ENOMEM; // the allocation can fail, especially with GFP_ATOMIC
  testhrarr_arr_first = &testhrarr_arr[0];
  hrtimer_init(&my_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
  my_hrtimer.function = &testhrarr_timer_function;
  period_ns = period_ms*( (unsigned long)1E6L );
  ktime_period_ns = ktime_set(0, period_ns);
  printk(KERN_INFO
    " Addresses: _runcount 0x%p ; _arr 0x%p ; _arr[0] 0x%p (0x%p) ; _timer_function 0x%p ; my_hrtimer 0x%p; my_hrt.f 0x%p\n", &testhrarr_runcount, &testhrarr_arr, &(testhrarr_arr[0]), testhrarr_arr_first, &testhrarr_timer_function, &my_hrtimer, &my_hrtimer.function);
  proc_create("testhrarr_proc", 0, NULL, &testhrarr_proc_fops);
  #if (HWDEBUG_STACK == 1)
  hw_breakpoint_init(&attr);
  if (strcmp(ksym_name, "testhrarr_arr_first") == 0) {
    // just for testhrarr_arr_first - interpret the found symbol address
    // as a pointer to a pointer-sized value, and dereference it to get the
    // "real" address it points to (unsigned long has pointer width on both
    // 32-bit and 64-bit kernels, whereas int would truncate on 64-bit)
    attr.bp_addr = *((unsigned long *)kallsyms_lookup_name(ksym_name));
  } else {
    // the usual - address is kallsyms_lookup_name result
    attr.bp_addr = kallsyms_lookup_name(ksym_name);
  }
  attr.bp_len = HW_BREAKPOINT_LEN_1;
  attr.bp_type = HW_BREAKPOINT_W ; //| HW_BREAKPOINT_R;
  sample_hbp = register_wide_hw_breakpoint(&attr, (perf_overflow_handler_t)sample_hbp_handler);
  if (IS_ERR((void __force *)sample_hbp)) {
    int ret = PTR_ERR((void __force *)sample_hbp);
    printk(KERN_INFO "Breakpoint registration failed\n");
    // undo what was set up above before bailing out
    remove_proc_entry("testhrarr_proc", NULL);
    kfree(testhrarr_arr);
    return ret;
  }
  // explicit cast needed to show 64-bit bp_addr as 32-bit address
  // https://stackoverflow.com/questions/11796909/how-to-resolve-cast-to-pointer-from-integer-of-different-size-warning-in-c-co/11797103#11797103
  printk(KERN_INFO "HW Breakpoint for %s write installed (0x%p)\n", ksym_name, (void*)(uintptr_t)attr.bp_addr);
  #endif
  return 0;
}

static void __exit testhrarr_exit(void)
{
  int ret_cancel = 0;
  while( hrtimer_callback_running(&my_hrtimer) ) {
    ret_cancel++;
  }
  if (ret_cancel != 0) {
    printk(KERN_INFO " testhrarr Waited for hrtimer callback to finish (%d)\n", ret_cancel);
  }
  if (hrtimer_active(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testhrarr active hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
  }
  if (hrtimer_is_queued(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testhrarr queued hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
  }
  remove_proc_entry("testhrarr_proc", NULL);
  #if (HWDEBUG_STACK == 1)
  unregister_wide_hw_breakpoint(sample_hbp);
  printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n", ksym_name);
  #endif
  // free the array only once neither the timer nor /proc can touch it
  kfree(testhrarr_arr);
  printk(KERN_INFO "Exit testhrarr\n");
}

module_init(testhrarr_init);
module_exit(testhrarr_exit);
MODULE_LICENSE("GPL");


Original article: http://outofmemory.cn/yw/1047955.html
