Daisy 9735 Rep Farm Posted September 19, 2024

Hey all,

I am happy to provide an update on the recent server crashes that we have been experiencing since the 1.20.6 upgrade. The root cause of the issue has been found, and stems from a bug in MyPet, our pet plugin for Aether VIPs. The bug has been reported to the plugin authors, and that feature will remain disabled until the issue is fixed. Other features that were disabled while narrowing this down, such as ViaVersion (1.21+ client support), DecentHolograms (floating text at Cloud Temple), and LibsDisguises (disguises for ST), have been re-enabled.

I have provided a full breakdown of how we identified the issue and a root cause analysis below, and if you want a bit more background you can read my response to @Orlanth's post about where we were on this issue just a few days ago.

Identification / Resolution

Last week we consulted folks in the PurpurMC (our server software) Discord server about our crash issue, though they said they could not provide support for outdated versions, as we were running 1.20.6. Annoying, but a fair policy, as version upgrades solve many common issues. Over the weekend, we began work to upgrade our plugins to 1.21.1, and concluded regression testing on the development server on Monday evening. Later that night, we updated the beta server to 1.21.1 and were testing Surge with only 3 players on, and when another player joined we encountered the crash. We forwarded this crash log to the PurpurMC Discord and they suggested a plugin was at fault, and linked a message from another server owner who was also encountering the issue. I compared the plugin list between that server and ours, and found around a dozen plugins in common, though many of them are ones we have run for years.

On Tuesday and Wednesday, I tried disabling a number of these plugins, such as ViaVersion, LibsDisguises, and DecentHolograms, though the crashes continued.
I also reached out to the other server owner on Wednesday, and they provided a list of plugins they disabled in their testing which solved the issue, with MyPet being the common denominator. At approximately 5 PM EST Wednesday, I disabled MyPet, and the server has been stable since.

Root Cause Analysis

The Minecraft server splits workloads between threads. The main server tick occurs on one thread - the main thread - which handles things like the world, entities, players, etc., while other processes such as chunk loading and networking occur on other threads. There are numerous networking threads running in parallel at any time, handling all inbound and outbound packets for the server. Generally, networking calls happen without blocking the main thread, though player logins do initiate an uninterruptible call to the network thread pool, which must be completed before the main thread is unblocked. The main thread being blocked here was the cause of the crash indicated in our logs.

MyPet has a packet listener for entity interaction packets, which they updated for 1.19+ servers. This listener intercepts inbound packets on one of the networking threads, and then blocks that thread while it waits for the main thread to fetch the entity from the world, so it can then modify the packet. The problem occurs if, during that time, a player joins the server and the same networking thread that is handling the entity interaction packet is selected to initiate that player's login. The main thread is blocked until that is completed, so it can’t respond to MyPet’s request, but the networking thread is blocked as well, waiting on the main thread. The networking thread and the main thread are stuck waiting on each other, so a deadlock occurs.

The issue seems to be random in nature, as it requires all of the following at the same time: an entity being interacted with, a player logging in, and that login being assigned to the same networking thread as the interact packet.
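
For anyone who wants to see the shape of that deadlock, here is a minimal sketch in plain Java. It is not MyPet or server code: the class name, the single-thread executors standing in for the main thread and two networking threads, and the 500 ms timeout are all assumptions made for illustration. The real login call is uninterruptible and has no timeout, which is why the server hung instead of recovering.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the server's threads; all names are illustrative.
public class DeadlockSketch {

    // Returns true if the "main thread" got stuck waiting on a network thread
    // that was itself waiting on the main thread.
    public static boolean simulate(boolean sameNetworkThread) throws Exception {
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        ExecutorService netThreadA = Executors.newSingleThreadExecutor();
        ExecutorService netThreadB = Executors.newSingleThreadExecutor();
        ExecutorService loginThread = sameNetworkThread ? netThreadA : netThreadB;
        CountDownLatch listenerRunning = new CountDownLatch(1);
        CountDownLatch loginWaiting = new CountDownLatch(1);
        try {
            // Packet listener: occupies network thread A, then blocks until the
            // main thread has fetched the entity.
            Future<String> listener = netThreadA.submit(() -> {
                listenerRunning.countDown();
                loginWaiting.await(); // force the unlucky timing for the demo
                return mainThread.submit(() -> "entity").get();
            });

            // Player login: the main thread blocks until a network thread
            // finishes its part of the login.
            Future<Boolean> login = mainThread.submit(() -> {
                listenerRunning.await();
                Future<String> step = loginThread.submit(() -> "login ok");
                loginWaiting.countDown();
                try {
                    // Demo-only timeout so the sketch can report the hang.
                    step.get(500, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException stuck) {
                    return true;
                }
            });

            boolean deadlocked = login.get(5, TimeUnit.SECONDS);
            listener.get(5, TimeUnit.SECONDS); // let the listener finish
            return deadlocked;
        } finally {
            mainThread.shutdownNow();
            netThreadA.shutdownNow();
            netThreadB.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("login on the blocked network thread: deadlocked = " + simulate(true));
        System.out.println("login on another network thread: deadlocked = " + simulate(false));
    }
}
```

When the login is handed to the same networking thread the listener is sitting on, each side waits for work queued behind the other. When it is handed to a different networking thread, the login completes and the listener gets its answer right after.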
This combination is somewhat rare but, as many of you experienced, almost guaranteed to happen right after a restart, when many players are logging in and all networking threads are in use.

Due to the nature of the crash, the logs don’t obviously point to MyPet, since all we see is that the main thread is blocked. Until we knew to look for MyPet in the crash logs, it was hard to single out one thread, as we have hundreds of threads running at any time for things such as database interactions, so seeing a plugin in a crash log isn't usually a cause for alarm unless it is within the stack trace of the main thread itself.

Thankfully we’ve gotten to the bottom of this and can continue enjoying the Premium Minecraft Roleplay Experience without these annoying interruptions and inventory rollbacks. A special thanks to @Priceflash for helping with both the root cause analysis and confirming my suspicion that MyPet was the issue; to @Java17, @Greehn, and the Administration for restarting the server every time it crashed; and a huge shoutout to the Triage Team (@_mady07, @ibleesian, @shay, @teeylin, @Liam5232, and @HobbitForHire) for handling the elevated number of inventory rollback tickets.

As a bonus, we'll be getting 1.21 in the next week or so, as the work has already been completed for it. Worry not for your mods, we'll be using ViaBackwards to provide support to 1.20.6 clients for some time (though not forever, as that plugin is kind of buggy). A further announcement post will follow with information about this upgrade.

Now I can sleep.

Cheers,
Llir | Technical Administrator

p.s. If you need an inventory rollback, please create a /treq following the guide in this thread so we can handle it in a timely manner. Tickets made without the needed details may not be able to be resolved.

p.p.s. I'm like 99.99% sure this is the cause of the issue, but only time will tell for sure. Maybe this post is jumping the gun a bit, but I want to be transparent with the community on this issue.
Java17 2216 Posted September 19, 2024 Big props to everyone involved in finding the root cause of this issue. I appreciate that we're making strides as a team to make sure that both the cause and resolution of these issues are transparent to our community.
wowj 9760 Posted September 19, 2024 I KNEW IT WAS MYPET ALL ALONG!!!
Pancho 4197 Posted September 19, 2024 W Llir + W Java, but I will be demanding for Aether compensation through new emote colors
Orlanth 4667 Posted September 19, 2024 Absolute W by the tech team. Exactly what we needed.
Crevel 7557 Posted September 19, 2024 We need reparations for the removal of MyPet right now (new aether emote colours will suffice)
Morigung-oog 5697 Posted September 19, 2024 I didn't pay $500 for this. Jk, y'all did really good on this one if it is the case. Ws all round. priceflash redemption arc goes HARD
Nug 2011 Posted September 19, 2024 another loss for VIP..........
Nooblius 7965 Posted September 19, 2024 you should all be terrified this means priceflash has a new redeemable get out of jail free card
Unwillingly 18126 Posted September 19, 2024 6 minutes ago, Llir said: "The root cause of the issue has been found, and stems from a bug in MyPet, our pet plugin for Aether VIPs." adding this to my extremely long list of why aether VIPs are the root cause of all evil on the server
SimplySeo 6999 Posted September 19, 2024 Thank you Priceflash!
Daisy 9735 Author Posted September 19, 2024 ironic af all those saying "why can't all this donation money fix the server" but the donation money was the thing breaking the server
Crevel 7557 Posted September 19, 2024 2 minutes ago, Llir said: "ironic af all those saying 'why can't all this donation money fix the server' but the donation money was the thing breaking the server" @satinkira are you going to take this?
Onnensr 1908 Posted September 19, 2024 eight MILLION new chat colors Spoiler: great work everyone involved, it was a quick fix really and I'm glad we can get back to what really matters in full confidence. pvp WAR
KamikazeReaper 457 Posted September 19, 2024 well done on the works boys, i know such bug hunting aint easy, yall should go get a few beers on the morrow!