GKE WebSocket timeouts in a chat / voice web server running in production.
I typically don’t write blogs about GKE and infrastructure; I’m not an expert in this space, and I’m sure lots of people can do a better job than me. I’m a conversational engineer. With that disclaimer said out loud, I do want to share the pain I struggled with while deploying my chat / Twilio web server to production — and more importantly, if you are facing similar issues, how you can solve them!
My chat server makes use of WebSockets. It’s a Node.js project running the Express framework, with express-ws as the WebSockets module. I containerized it as a back-end container. I also have a web container, with an Angular website, that talks to it.
It works perfectly! ...on my machine.
I deployed it to Google Cloud with GKE. I could have chosen Google Cloud App Engine Flex or Compute Engine for WebSocket-heavy applications as well, but based on my container architecture, my deployment scripts, and the granularity I wanted, I chose Kubernetes.
When I deployed the text chat server and tested my application in production, it seemed to work fine. However, when I looked in my browser’s network tab, or in my Google Cloud logs, I saw that all my WebSockets were getting killed, over and over again.
Hmm, well does that matter?
It seems it does, especially once I deployed my virtual agent in a call center using Twilio. I noticed that each phone call would automatically hang up after a certain number of seconds. After first rereading my Twilio code millions of times (assuming I had made a programming mistake, because working with streams is just hard), I dug into the logs and figured out that it had to do with WebSocket timeouts.
More specifically: Google Cloud load balancers are not configured for long-lived WebSockets by default, because they ship with a 30-second backend timeout that causes connections to close.
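One thing worth knowing before the fix: this backend timeout acts as a maximum connection duration, so application traffic alone won’t reset it. Still, protocol-level ping/pong keepalives are a useful companion, because they let clients and intermediaries detect dead connections quickly instead of waiting on a silent socket. Here is a minimal sketch, assuming a ws-style socket object that exposes readyState and ping() (which express-ws sockets do); the startKeepalive helper and the 25-second interval are my own, hypothetical choices:

```javascript
// Sketch of an application-level keepalive. Assumes a ws-style socket
// exposing readyState and ping(). Note: this does NOT lift the load
// balancer's maximum connection duration — it only keeps the link
// visibly alive so dead connections are detected quickly.
function startKeepalive(socket, intervalMs = 25000) {
  const timer = setInterval(() => {
    if (socket.readyState === 1) { // 1 === OPEN
      socket.ping(); // send a protocol-level ping frame
    } else {
      clearInterval(timer); // connection closed: stop pinging
    }
  }, intervalMs);
  return timer; // caller can clearInterval() on teardown
}
```

You would call startKeepalive(ws) in your connection handler; the real fix for the 30-second hangups, though, is the BackendConfig below.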
I figured out that you can create a BackendConfig that overrides these timeouts. So I created a YAML file that raises the timeout to 30 minutes (1800 seconds) per socket, which should be long enough for my phone calls.
backendconfig.yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: ccai-backendconfig
spec:
  timeoutSec: 1800
  connectionDraining:
    drainingTimeoutSec: 1800
Then I needed to update my services YAML file, which exposes the containers, to make use of my BackendConfig. In each Service I created this annotation: cloud.google.com/backend-config: '{"default": "ccai-backendconfig"}'
Here is my complete services.yaml:
apiVersion: v1
kind: Service
metadata:
  name: chatserver-service
  annotations:
    cloud.google.com/app-protocols: '{"my-https-port":"HTTPS","my-http-port":"HTTP"}'
    cloud.google.com/backend-config: '{"default": "ccai-backendconfig"}'
spec:
  type: NodePort
  selector:
    app: chatserver
  ports:
  - name: my-http-port
    port: 8080
    targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web-service
  annotations:
    cloud.google.com/backend-config: '{"default": "ccai-backendconfig"}'
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - name: my-http-port
    port: 80
    targetPort: 80
Then I applied both YAML files from the command line:
kubectl apply -f backendconfig.yaml
kubectl apply -f services.yaml
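If you want to double-check that the configuration actually landed before testing end to end, you can inspect the resources with standard kubectl commands (the resource names here match the files above; the output shapes depend on your cluster):

```shell
# Show the BackendConfig as applied — timeoutSec should read 1800
kubectl get backendconfig ccai-backendconfig -o yaml

# Confirm the backend-config annotation is attached to the Service
kubectl describe service chatserver-service
```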
After a short wait, the services were deployed. And there we go: after testing the text chat app, I immediately noticed that I no longer had all those disconnect logs! I started calling my virtual agent over the phone, and guess what, the virtual agent didn’t hang up on me! Phew! That saved my day. If you found this blog post through Google Search, then hopefully I solved your problem too!