panic-recovery 底层实现总结
1. 数据结构
panic 关键字在 Go 语言的源代码是由数据结构 runtime._panic 表示的:
// A _panic holds information about an active panic.
//
// A _panic value must only ever live on the stack.
//
// The argp and link fields are stack pointers, but don't need special
// handling during stack growth: because they are pointer-typed and
// _panic values only live on the stack, regular stack pointer
// adjustment takes care of them.
type _panic struct {
argp unsafe.Pointer // pointer to arguments of deferred call run during panic; cannot move - known to liblink
arg any // argument to panic
link *_panic // link to earlier panic
pc uintptr // where to return to in runtime if this panic is bypassed
sp unsafe.Pointer // where to return to in runtime if this panic is bypassed
recovered bool // whether this panic is over
aborted bool // the panic was aborted
goexit bool
}
argp:指向defer调用时参数的指针arg:调用panic时传入的参数link:指向更早调用的runtime._panic结构;recovered:当前panic是否被恢复aborted: 表示当前的panic是否被强行终止
panic 函数可以被连续多次调用,它们之间通过 link 可以组成链表。
2. 触发 panic
编译器会将关键字 panic 转换成 runtime.gopanic,该函数的执行过程包含以下几个步骤:
- 创建新的
runtime._panic并添加到所在 Goroutine 的_panic链表的最前面; - 在循环中不断从当前 Goroutine 的
_defer中链表获取runtime._defer并调用runtime.reflectcall运行延迟调用函数; - 调用
runtime.fatalpanic中止整个程序
func gopanic(e interface{}) {
gp := getg()
...
var p _panic
p.arg = e
p.link = gp._panic
gp._panic = (*_panic)(noescape(unsafe.Pointer(&p)))
for {
d := gp._defer
if d == nil {
break
}
d._panic = (*_panic)(noescape(unsafe.Pointer(&p)))
reflectcall(nil, unsafe.Pointer(d.fn), deferArgs(d), uint32(d.siz), uint32(d.siz))
d._panic = nil
d.fn = nil
gp._defer = d.link
freedefer(d)
if p.recovered {
...
}
}
fatalpanic(gp._panic)
*(*int)(nil) = 0
}
func fatalpanic(msgs *_panic) {
pc := getcallerpc()
sp := getcallersp()
gp := getg()
if startpanic_m() && msgs != nil {
atomic.Xadd(&runningPanicDefers, -1)
printpanics(msgs)
}
if dopanic_m(gp, pc, sp) {
crash()
}
exit(2)
}
3. 执行 recovery
编译器会将关键字 recover 转换成 runtime.gorecover:
func gorecover(argp uintptr) interface{} {
gp := getg()
p := gp._panic
if p != nil && !p.recovered && argp == uintptr(p.argp) {
p.recovered = true
return p.arg
}
return nil
}
如果当前 Goroutine 没有调用 panic,那么该函数会直接返回 nil,这也是崩溃恢复在非 defer 中调用会失效的原因。
它会修改 runtime._panic 的 recovered 字段,runtime.gorecover 函数中并不包含恢复程序的逻辑,程序的恢复是由 runtime.gopanic 函数负责的:
func gopanic(e interface{}) {
...
for {
// 执行延迟调用函数,可能会设置 p.recovered = true
...
pc := d.pc
sp := unsafe.Pointer(d.sp)
...
if p.recovered {
gp._panic = p.link
for gp._panic != nil && gp._panic.aborted {
gp._panic = gp._panic.link
}
if gp._panic == nil {
gp.sig = 0
}
gp.sigcode0 = uintptr(sp)
gp.sigcode1 = pc
mcall(recovery)
throw("recovery failed")
}
}
...
}
从 runtime._defer 中取出了程序计数器 pc 和栈指针 sp 并调用 runtime.recovery 函数触发 Goroutine 的调度,调度之前会准备好 sp、pc 以及函数的返回值:
func recovery(gp *g) {
sp := gp.sigcode0
pc := gp.sigcode1
gp.sched.sp = sp
gp.sched.pc = pc
gp.sched.lr = 0
gp.sched.ret = 1
gogo(&gp.sched)
}
在调用 defer 关键字时,调用时的栈指针 sp 和程序计数器 pc 就已经存储到了 runtime._defer 结构体中,这里的 runtime.gogo 函数会跳回 defer 关键字调用的位置。
runtime.recovery 在调度过程中会将函数的返回值设置成 1。当 runtime.deferproc 函数的返回值是 1 时,编译器生成的代码会直接跳转到调用方函数返回之前并执行 runtime.deferreturn:
func deferproc(siz int32, fn *funcval) {
...
return0()
}
跳转到 runtime.deferreturn 函数之后,程序就已经从 panic 中恢复了并执行正常的逻辑,而 runtime.gorecover 函数也能从 runtime._panic 结构中取出了调用 panic 时传入的 arg 参数并返回给调用方
