[Bug]: Unexpected log line appears while the server is running normally #434

Open
3 tasks done
zhj0811 opened this issue Jan 12, 2023 · 18 comments
Labels: bug, needs investigation, needs more info, waiting for response, working on it

Comments

zhj0811 commented Jan 12, 2023

Actions I've taken before I'm here

  • I've thoroughly read the documentation on this issue but still have no clue.
  • I've searched the current list of GitHub issues but didn't find any duplicate issues that have been solved.
  • I've searched the internet with this issue, but haven't found anything helpful.

What happened?

gnet is serving a UDP listener. It printed the log line v2@v2.1.2/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown, and then the process terminated.

Major version of gnet

v2

Specific version of gnet

v2.1.2

Operating system

Linux

Relevant log output

v2@v2.1.2/reactor_default_linux.go:124  event-loop(6) is exiting due to error: server is going to be shutdown

Code snippets (optional)

No response

How to Reproduce

No special steps. It is a simple UDP receiving service that was running normally.

Does this issue reproduce with the latest release?

It can reproduce with the latest release

zhj0811 added the bug label Jan 12, 2023
panjf2000 (Owner) commented

Does this happen frequently?

panjf2000 (Owner) commented

Do you see an error log starting with "error occurs in event-loop:"? If not, this was most likely caused by returning Shutdown from your OnTraffic().

zhj0811 (Author) commented Jan 13, 2023

> Does this happen frequently?

Not frequently. It only appeared after the service had been running for 3 days.

zhj0811 (Author) commented Jan 13, 2023

> Do you see an error log starting with "error occurs in event-loop:"? If not, this was most likely caused by returning Shutdown from your OnTraffic().

The log file contains only the following:

v2@v2.1.2/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown
2023/01/12 16:48:45 Init config success.

OnTraffic does not return Shutdown.
[screenshot]

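panjf2000 (Owner) commented

Do OnClose() or OnTick() return Shutdown? Or could it happen anywhere else? Try a global search for Shutdown. Generally this error is triggered deliberately by user code. Do you see any error logs from your own business code?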

zhj0811 (Author) commented Jan 13, 2023

None of the code I wrote uses Shutdown anywhere.
[screenshot]

zhj0811 (Author) commented Jan 13, 2023

[screenshot]
The logs all look normal. There is just this one extra abnormal line.

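zhj0811 (Author) commented Jan 13, 2023

> Do OnClose() or OnTick() return Shutdown? Or could it happen anywhere else? Try a global search for Shutdown. Generally this error is triggered deliberately by user code. Do you see any error logs from your own business code?

Could something like a kill command trigger Shutdown?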

panjf2000 (Owner) commented

For now I can't think of any other situation that would cause this. Can you provide demo code that reproduces the problem?

zhj0811 (Author) commented Feb 2, 2023

> For now I can't think of any other situation that would cause this. Can you provide demo code that reproduces the problem?

type netServer struct {
	gnet.BuiltinEventEngine
	eng       gnet.Engine
	network   string
	addr      string
	multicore bool
}

func (s *netServer) OnBoot(eng gnet.Engine) gnet.Action {
	logger.Infof("running server on %s with multi-core=%t", fmt.Sprintf("%s://%s", s.network, s.addr), s.multicore)
	s.eng = eng
	return gnet.None
}

func (s *netServer) OnOpen(c gnet.Conn) ([]byte, gnet.Action) {
	logger.Debugf("connected with fd: %d, remote_addr: %s\n", c.Fd(), c.RemoteAddr().String())
	//c.SetContext(new(protocol.NetProtoCodec))
	return nil, gnet.None
}

func (s *netServer) OnClose(c gnet.Conn, err error) (action gnet.Action) {
	if err != nil {
		logger.Errorf("error occurred on fd: %d, remote_addr: %s, %v\n", c.Fd(), c.RemoteAddr().String(), err)
	}

	return gnet.Close
}

func (s *netServer) OnTraffic(c gnet.Conn) (action gnet.Action) {
	codec := new(NetProtoHeader)
	data, err := codec.Decode(c)
	if err != nil {
		logger.Errorf("invalid packet: %v", err)
		return gnet.Close
	}

	logger.Infof("gnet codec: %v", codec)
	err = ants.Submit(func() {
		switch codec.MsgType {
		case MsgTypePTRQuery:

			req := new(PTRReq)
			err := req.Unpack(data)
			if err != nil {
				logger.Errorf("Query ptr invalid packet: %s", err.Error())
				return
			}
			logger.Infof("PTR request: %+v", req)
			payload, err := singlePTR(req)
			if err != nil {
				logger.Errorf("Query ptr record by ip %s failed: %s", req.Qip.IpAddr.String(), err.Error())
				//return
				header := NetProtoHeader{MsgType: MsgTypePTRRes}
				resBody := DNSProtoSignalRes{
					DstIp:     req.DstIp,
					ProtoBody: []byte{0x00, 0x00},
				}
				payload = header.Encode(resBody.Pack())
			}
			sendSignalReqChan <- signalReq{Dst: req.DstIp.IpAddr.String(), Type: "ptr response", Data: payload}

		case MsgTypeTypeRes:
			//ants.Submit(func(){
			var ip net.IP
			switch len(data) {
			case 5 + 4, 17 + 4:
				ip = data[1 : len(data)-4]
			default:
				logger.Errorf("invalid payload body length: %d", len(data))
				return
			}
			logger.Infof("AIO addr: %s", ip.String())
			key := fmt.Sprintf("%s,%d", ip.String(), MsgTypeTypeRes)
			fn := func() (interface{}, error) {
				cType := binary.BigEndian.Uint16(data[len(data)-4:])
				ch, ok := ip2CType.Get(ip.String())
				if !ok {
					return nil, fmt.Errorf("invalid ip %s to conn type channel", ip.String())
				}
				ch <- cType
				logger.Infof("Ip %s connection type %d", ip.String(), cType)
				return nil, nil
			}
			_, err, _ := singleGroup.Do(key, fn)
			if err != nil {
				logger.Warnf("Handle ip %s type response failed %s", ip.String(), err.Error())
				return
			}
		default:
			logger.Errorf("invalid payload type: %d", codec.MsgType)
			return
		}
	})
	if err != nil {
		logger.Errorf("Submits a task to ants pool failed %s", err.Error())
	}
	return
}

func InitPtrServer() {
	port := viper.GetInt("ptr.port")
	if port == 0 {
		port = 30053
	}
	service := &netServer{
		network:   "udp",
		addr:      fmt.Sprintf(":%d", port),
		multicore: true,
	}
	err := gnet.Run(service, service.network+"://"+service.addr, gnet.WithMulticore(service.multicore), gnet.WithLogger(logger))
	if err != nil {
		logger.Errorf("running server on %s with multi-core=%t failed", fmt.Sprintf("%s://%s", service.network, service.addr), service.multicore)
		panic(err)
	}
}



zhj0811 (Author) commented Feb 2, 2023

The above is all of the gnet-related code in the project.

zhj0811 (Author) commented Feb 2, 2023

The gnet server is started inside a goroutine. Would a shutdown of gnet alone bring down the whole process?

panjf2000 (Owner) commented

> The gnet server is started inside a goroutine. Would a shutdown of gnet alone bring down the whole process?

What do you mean? Did you call gnet.Stop yourself?

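zhj0811 (Author) commented Feb 6, 2023

>> The gnet server is started inside a goroutine. Would a shutdown of gnet alone bring down the whole process?

> What do you mean? Did you call gnet.Stop yourself?

I did not call gnet.Stop. All of the gnet-related code in the project is shown above. InitPtrServer() is started as a goroutine via go InitPtrServer(). Would the log line v2@v2.1.2/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown cause the whole process to exit?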

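panjf2000 (Owner) commented

>>> The gnet server is started inside a goroutine. Would a shutdown of gnet alone bring down the whole process?

>> What do you mean? Did you call gnet.Stop yourself?

> I did not call gnet.Stop. All of the gnet-related code in the project is shown above. InitPtrServer() is started as a goroutine via go InitPtrServer(). Would the log line v2@v2.1.2/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown cause the whole process to exit?

No, it will not make the whole process exit. It only makes gnet.Run() return. Also, if a "server is going to be shutdown" error had occurred, the exit logs of all event loops should have been printed, but yours shows only one of them. The whole sequence is very strange. Has the same problem occurred again since?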

zhj0811 (Author) commented Feb 6, 2023

This log has not appeared again since.

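panjf2000 (Owner) commented

I plan to improve the error-logging code so that it also prints the corresponding stack trace, which will make it easier to pinpoint problems in the future. As for your case, please keep observing, and if it reproduces we can look into it together. Thanks.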

zhj0811 (Author) commented Feb 8, 2023

OK, I will follow up if the problem reproduces.
