Nathan Friess
2018-08-19 18:50:55 UTC
Hi,
While testing out the new PVH support in a domU (which is running
great!), I discovered a kernel panic related to Xen and VIMAGE support
when trying to add an xn interface to a bridge.
I'm running r337024 from svn. Removing VIMAGE (which seems to be turned
on in 12-CURRENT now) allows using the bridge with no panics. As part
of attempting to debug this, I enabled VIMAGE in my 11.2 domU, and that
also panics in the same code.
I'm not sure whether the problem is a Xen issue or a VIMAGE issue, so I
haven't submitted a PR yet. The kernel output is listed below.
It looks like netfront_backend_changed() calls netfront_send_fake_arp(),
which calls arp_ifinit() on the interface. The top frame of the
backtrace, arprequest+0x454, corresponds to the call to
ARPSTAT_INC(txrequests) at the end of arprequest(), which expands to
VNET_PCPUSTAT_ADD(). I tried to debug further and got a little lost,
but that's where I figured out that VIMAGE is involved somehow.
Are there any thoughts on why the xn interface would cause a panic there?
Thanks,
Nathan
=======
Steps to reproduce:
# ifconfig bridge create
bridge0
# ifconfig bridge0 addm xn0
(panic...)
======
Kernel output:
xn0: performing interface reset due to feature change
(... lock reversal)
xn0: backend features: feature-sg feature-gso-tcp4
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x28
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80d15db4
stack pointer = 0x0:0xfffffe0000483840
frame pointer = 0x0:0xfffffe0000483940
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 14 (xenwatch)
[ thread pid 14 tid 100033 ]
Stopped at arprequest+0x454: movq ll+0x7(%rax),%rax
db> bt
Tracing pid 14 tid 100033 td 0xfffff800032f5000
arprequest() at arprequest+0x454/frame 0xfffffe0000483940
arp_ifinit() at arp_ifinit+0x58/frame 0xfffffe0000483980
netfront_backend_changed() at netfront_backend_changed+0x144/frame 0xfffffe0000483a40
xenwatch_thread() at xenwatch_thread+0x182/frame 0xfffffe0000483a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0000483ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000483ab0
======