Hey,
We're looking for an experienced system admin or network engineer to help us solve an issue we've been struggling with for a few months. We will try to describe the problem and everything we know about it so far below:
Every few minutes, a player will (seemingly randomly) be kicked from the server. An error can be seen in the logs of our proxy when it occurs:
Code:
[16:42:31] [Netty Worker IO Thread #44/WARN]: [/xxx:54631|ttryy] -> UpstreamBridge - NativeIoException: recvAddress(..) failed: Connection reset by peer
[16:42:31] [Netty Worker IO Thread #44/INFO]: [xxx] disconnected with: NativeIoException : recvAddress(..) failed: Connection reset by peer
There are no errors on our main production server, nor in the clients' log files. We can confidently rule out this issue being on the clients' side for a few reasons:
1. It happens regardless of the player's region & ping
2. It happens regardless of how long the player has been online
3. It happens to players on all different versions (1.8.x <-> 1.20.x)
This leads us to believe that the cause of the issue is somewhere in our setup, whether that be on the production server, the proxy, or the machine itself. Here is a brief overview of our setup:
Setup:
- 5 Hub servers, 1 production (Prison) server
- Flamecord (Java proxy) & Bungeecord with Floodgate (Bedrock/Minehut proxy)
- 1.8 Server jar
- Base Pterodactyl install
- UFW for firewall
- Databases run within Docker
- Ubuntu 22.04
- Hosted at OVH
The issue occurs a lot on our Java and (less so) our Minehut proxy, but hasn't occurred once on our Bedrock proxy recently. It's difficult to know whether this is because the issue isn't present there, or whether there haven't been enough players on the Bedrock proxy for it to occur.
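In case it helps anyone comparing how often this happens on each proxy, here is a rough sketch of how the reset lines could be tallied per minute and per player. It assumes the WARN line format quoted in the error above; the function name `tally_resets` and the regex are purely illustrative and are not part of any proxy software.

```python
import re
from collections import Counter

# Assumed format, taken from the WARN line quoted earlier in this post:
# [HH:MM:SS] [<thread>/WARN]: [/<address>|<player>] -> ... Connection reset by peer
RESET_LINE = re.compile(
    r"^\[(?P<hh>\d{2}):(?P<mm>\d{2}):\d{2}\] "
    r"\[[^\]]*/WARN\]: "
    r"\[/(?P<addr>[^|\]]*)\|(?P<player>[^\]]*)\] "
    r".*Connection reset by peer"
)


def tally_resets(lines):
    """Count 'Connection reset by peer' WARN lines per minute and per player."""
    per_minute = Counter()
    per_player = Counter()
    for line in lines:
        m = RESET_LINE.match(line)
        if m is None:
            # Skips everything else, including the matching INFO line,
            # so each disconnect is only counted once.
            continue
        per_minute[f"{m['hh']}:{m['mm']}"] += 1
        per_player[m["player"]] += 1
    return per_minute, per_player


# The two log lines from this post, used as sample input.
sample = [
    "[16:42:31] [Netty Worker IO Thread #44/WARN]: [/xxx:54631|ttryy] -> UpstreamBridge - NativeIoException: recvAddress(..) failed: Connection reset by peer",
    "[16:42:31] [Netty Worker IO Thread #44/INFO]: [xxx] disconnected with: NativeIoException : recvAddress(..) failed: Connection reset by peer",
]
per_minute, per_player = tally_resets(sample)
print(per_minute.most_common(5))
print(per_player.most_common(5))
```

Running the same function over each proxy's log file and dividing by that proxy's player count would give a rough per-player reset rate, which might show whether the Bedrock proxy is genuinely unaffected or just too quiet to tell.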
Here's a list of what we have tried in the last couple of months to fix the issue:
- Updated all plugins
- Switched dedicated servers
- Switched DDoS protection
  - From TCPShield to Papyrus to Cosmic Guard to now bare OVH
- Changed server jar
  - From Vortex Spigot to a custom spigot to bare Paperspigot
- Changed proxy software
  - From Flamecord to Waterfall, and from Bungee based to Velocity
- Attempted no proxy at all (?)
  - For about a day, we ran with no proxy at all. We don't know for certain whether or not this fixed our issue because we had network compression disabled on the standalone server, so players were having ping issues and being kicked as a result. By the time we realised that was the issue, we had already switched back to a proxy (Velocity). There were no errors in the logs, but it is also possible that the stack trace was just not printed on the standalone server. We don't know. The only pointer we have is that we didn't see a rise in the player count during this time, which we would expect if the kicking issue were fixed.
- Tried removing unnecessary plugins
- Tried reverting old gameplay changes we had recently made
- Reached out to the Via team to confirm it wasn't caused by ViaVersion
- Tried running MTRs from clients to the server
- Checked the firewall configuration (no rate limits or bad rules)
- Ran Wireshark on clients (nothing was found)
Given what we know about the issue, we can make some educated guesses about what is likely not responsible. We can rule these out because we have already attempted to switch hosts, software and network provider:
- Network stability
- Network bandwidth issue
- Hardware issue with the dedi
- Server / proxy software (potentially the setup though)
Safe to say, we are almost out of ideas. These are some wild guesses about what could be causing it, or what might be worth trying next, but we have no idea anymore:
- Firewall/related
- Switching MC version
- Switching Java version
- Reinstalling our dedi and changing our setup in some way
- Some database connection issue
If you're interested in trying to solve this issue with us, please leave your Discord and any desired compensation below so I can get in contact.
Thanks in advance
- Type: Requesting
- Provided by: Team
- Operating system: Ubuntu