Nightbot isn't saying anything in my live stream

I added some metrics today to better track what is causing these issues. Deploying them required a lot of rolling restarts, which means that Nightbot may have been sporadically missing from chat for 5 minutes at a time.

With that said, it seems there are some strange cases where Nightbot fails to poll a channel because YouTube serves an error, and then parts the channel because it thinks the channel is unavailable (polling is costly, so we leave channels which are constantly erroring).

For @Victor_Kuznetsov, @nourish, and @Jodeb, this can be verified in logs:

Parted liveChat Chillhop Music:UCOxqgCwgOqC2lMqC5PYz_Dg:EiEKGFVDT3hxZ0N3Z09xQzJsTXFDNVBZel9EZxIFL2xpdmU no successful polls
Parted liveChat KavaGames:UCdK-SBGzSwDkJoYHQp8FzFw:EiEKGFVDZEstU0JHelN3RGtKb1lIUXA4RnpGdxIFL2xpdmU no successful polls
Parted liveChat nourish.:UC7tdoGx0eQfRJm9Qj6GCs0A:EiEKGFVDN3Rkb0d4MGVRZlJKbTlRajZHQ3MwQRIFL2xpdmU no successful polls
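
If you want to pull the fields out of lines like these, here is a minimal sketch. The field layout (channel name, channel ID, live chat ID, then the reason) is inferred from the three lines above and is an assumption, not a documented log format; `PARTED_RE` and `parse_parted_line` are hypothetical names used only for illustration.

```python
import re
from typing import Optional

# Assumed layout: "Parted liveChat <name>:<channel id>:<live chat id> <reason>"
PARTED_RE = re.compile(
    r"^Parted liveChat (?P<name>.+):(?P<channel_id>[^:\s]+):"
    r"(?P<chat_id>[^:\s]+) (?P<reason>.+)$"
)

def parse_parted_line(line: str) -> Optional[dict]:
    """Return the named fields of a 'Parted liveChat' line, or None if it does not match."""
    match = PARTED_RE.match(line.strip())
    return match.groupdict() if match else None

example = (
    "Parted liveChat Chillhop Music:UCOxqgCwgOqC2lMqC5PYz_Dg:"
    "EiEKGFVDT3hxZ0N3Z09xQzJsTXFDNVBZel9EZxIFL2xpdmU no successful polls"
)
fields = parse_parted_line(example)
```

The name group is allowed to contain spaces because channel names like "Chillhop Music" do; the two ID fields are assumed to contain neither colons nor whitespace.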

To address this, I’ve increased the time a channel is allowed to keep erroring before Nightbot parts it to 2 minutes, up from 30 seconds. I will be monitoring to see if this helps. If it doesn’t, then the errors are non-recoverable and the issue is most likely on YouTube’s side.
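
To make the idea concrete, here is a rough sketch of that kind of error window. This is not Nightbot's actual code: the class, method names, and the exact rule (part once errors have continued for the full window with no successful poll in between) are assumptions for illustration. Only the 2 minute and 30 second values come from the change described above.

```python
from dataclasses import dataclass
from typing import Optional

ERROR_WINDOW_SECONDS = 120.0  # new value; previously 30.0

@dataclass
class ChatPollState:
    """Tracks how long one chat has been failing to poll (hypothetical helper)."""
    first_error_at: Optional[float] = None

    def record_success(self) -> None:
        # Any successful poll clears the error streak.
        self.first_error_at = None

    def record_error(self, now: float) -> bool:
        """Record a failed poll at time `now`; return True if the chat should be parted."""
        if self.first_error_at is None:
            self.first_error_at = now
        return now - self.first_error_at >= ERROR_WINDOW_SECONDS

state = ChatPollState()
early = state.record_error(0.0)     # first failure starts the window
later = state.record_error(60.0)    # still inside the 2 minute window
state.record_success()              # YouTube recovered, streak is reset
fresh = state.record_error(100.0)   # a new streak starts here
expired = state.record_error(220.0) # 120 seconds of errors with no success
```

With a 30 second window, the failure at 60 seconds in the example would already have parted the chat; with 2 minutes, a transient YouTube error has time to clear first.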

Another interesting bit is that it takes Nightbot longer to process polls the longer it has been online. I’m not sure why this is, but I presume it’s because YouTube is not marking channels offline, so Nightbot does not leave them after they become inactive and the number of chats it polls keeps growing. We used to rely on channel polling to remove chats, but YouTube’s polling API for that would randomly mark channels offline and then online again (false positives). Unfortunately, a lot of Nightbot’s stability issues on YouTube are a direct result of the difficulty of dealing with their API. I’ve been asking them for WebSocket support for over a year. As Nightbot gains more users, it will only become more unstable, since polling is not an efficient means of receiving chat messages.

Here’s a graph of the polling times for chats over time:

The steep drop is a reboot of the process to change the polling timeout to 2 minutes, as noted above.