11-hour deployment disaster
Starts with a relatable horror story that hooks anyone who has dealt with production failures.
Leo Sjoberg shares a story of a disastrous production deployment that took 11 hours, contrasting it with the ease of using Kubernetes. He demonstrates how to deploy and auto-scale a Laravel application on Kubernetes without changing any code, showing a live demo where a slow site recovers from 15-second load times to 0.1 seconds after auto-scaling kicks in.
A routine 30-minute deployment turned into an 11-hour ordeal, motivating the speaker to adopt Kubernetes.
Kubernetes ensures applications run smoothly and handles failures gracefully, allowing developers to deal with issues on their own time.
A service with 5 customers on a $20 VPS gets tweeted about by Taylor, and 4,000 people hit it at once. Without auto-scaling, the site slows to a halt; with it, the site handles the load seamlessly.
Siege with 25 concurrent requests shows 15-second response times. After applying the Horizontal Pod Autoscaler, a second instance spins up and load time drops to ~0.1 seconds.
Kubernetes auto-scaling works with existing code; no need to rewrite tests or add special cases.
A Deployment tracks replicas and specifies Docker containers (FPM + Nginx). Auto-scaling adjusts the replica count dynamically.
ConfigMap stores the Nginx configuration file as a key-value pair, mounted into the Nginx container. The config is identical to a VPS setup.
Secrets are base64-encoded key-value stores for sensitive data like DB passwords. They are automatically decoded when injected as environment variables.
A Pod has an IP, but for multiple replicas or services, a reverse proxy (Ingress) and load balancer (Service) are required.
Service forwards port 80 to target port 80 on pods. Ingress routes requests to the service by name.
HPA scales the number of pod replicas based on metrics like CPU, memory, latency, or custom metrics.
HPA specifies target reference (deployment), min/max replicas, and metrics. Example: scale when average request time exceeds 1000ms.
If the app cannot load faster than 400ms, setting a target of 300ms will make the autoscaler keep scaling up until it hits the maximum replica count, driving up costs.
Applying Deployment, ConfigMap, Secret, Service, Ingress, and HPA takes about 3 minutes. Live demo shows the process.
The demo is simplified; production deployment requires more research due to Kubernetes' steep learning curve.
Deploying and auto-scaling Laravel on Kubernetes is achievable with minimal YAML files and no code changes. While the demo is simplified, it demonstrates the core concepts that make Kubernetes powerful for handling traffic spikes.
"The title promises infinite scale and delivers a live demo of auto-scaling Laravel on Kubernetes, exactly as described."
What is the primary benefit of using Kubernetes for deployment?
It acts like a dedicated 24/7 ops team that ensures your application runs smoothly and handles failures gracefully.
01:01
What two things does a Kubernetes Deployment resource do?
It tracks how many copies (replicas) of the application are running and specifies what code (Docker containers) to run.
06:09
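As a rough illustration of the Deployment described in the card above, here is a minimal sketch that builds the manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The name `laravel-app` and the image names `example/laravel-fpm:latest` and `nginx:stable` are placeholders chosen for this example, not the speaker's actual values.

```python
import json


def build_deployment(name: str, replicas: int) -> dict:
    """Return a Deployment manifest with an FPM container and an Nginx container."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # The replica count is the number an autoscaler adjusts.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        # Hypothetical image bundling PHP-FPM and the app code.
                        {"name": "app", "image": "example/laravel-fpm:latest"},
                        {
                            "name": "nginx",
                            "image": "nginx:stable",
                            "ports": [{"containerPort": 80}],
                        },
                    ]
                },
            },
        },
    }


deployment = build_deployment("laravel-app", replicas=1)
print(json.dumps(deployment, indent=2))
```

Saving the printed JSON to a file and passing it to `kubectl apply -f` would be one way to try it; the structure mirrors what the YAML version would contain.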
What is a ConfigMap in Kubernetes?
A key-value store used to store configuration files, such as Nginx config, which can be mounted into containers.
07:52
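The ConfigMap idea above (file name as key, file content as value) can be sketched as follows. This is an illustrative example only: the ConfigMap name `nginx-config`, the volume name, and the stub Nginx configuration are assumptions, and the `subPath` mount is one common way to place a single file at `/etc/nginx/nginx.conf`.

```python
import json

# Placeholder content; a real setup would use the same nginx.conf as on a VPS.
NGINX_CONF = "events {}\nhttp { server { listen 80; } }\n"


def build_configmap(name: str, filename: str, content: str) -> dict:
    """Return a ConfigMap that stores one file under its file name."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {filename: content},
    }


def nginx_volume_fragments(configmap_name: str, filename: str) -> tuple:
    """Return the (volume, volumeMount) fragments for a pod spec."""
    volume = {"name": "nginx-config", "configMap": {"name": configmap_name}}
    mount = {
        "name": "nginx-config",
        "mountPath": "/etc/nginx/" + filename,
        "subPath": filename,  # mount only this key as a single file
    }
    return volume, mount


configmap = build_configmap("nginx-config", "nginx.conf", NGINX_CONF)
volume, mount = nginx_volume_fragments("nginx-config", "nginx.conf")
print(json.dumps(configmap, indent=2))
print(json.dumps({"volume": volume, "volumeMount": mount}, indent=2))
```

The volume fragment belongs in the pod spec's `volumes` list and the mount fragment in the Nginx container's `volumeMounts` list.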
How are secrets stored in Kubernetes?
Secrets are base64-encoded key-value stores. Kubernetes automatically decodes them when injecting into environment variables.
09:37
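To make the base64 point concrete, here is a small sketch that builds a Secret manifest and then decodes a value again. The secret name `laravel-env` and the password are made up for the example. Note that base64 is an encoding, not encryption: anyone who can read the manifest can recover the value.

```python
import base64
import json


def build_secret(name: str, values: dict) -> dict:
    """Return a Secret manifest whose values are base64-encoded strings."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# Hypothetical value for illustration; never commit real credentials.
secret = build_secret("laravel-env", {"DB_PASSWORD": "not-a-real-password"})
print(json.dumps(secret, indent=2))

# Decoding needs no key at all, which is why the values are not truly secret.
decoded = base64.b64decode(secret["data"]["DB_PASSWORD"]).decode("utf-8")
print(decoded)
```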
What is the purpose of a Service in Kubernetes?
It provides load balancing by forwarding requests from a port to the target port on pods.
14:01
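A Service like the one in the card above might look like the following sketch. The Service name `laravel-app` and the `app` label used as the selector are assumptions for illustration; the selector has to match the labels on the pods created by the Deployment.

```python
import json


def build_service(name: str, app_label: str) -> dict:
    """Return a Service that forwards port 80 to port 80 on matching pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # Requests are spread across every pod carrying this label.
            "selector": {"app": app_label},
            "ports": [{"port": 80, "targetPort": 80}],
        },
    }


service = build_service("laravel-app", app_label="laravel-app")
print(json.dumps(service, indent=2))
```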
What does the Horizontal Pod Autoscaler do?
It automatically scales the number of pod replicas up or down based on configured metrics like CPU, memory, or latency.
15:54
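The autoscaler resource can be sketched in the same way. This example assumes the `autoscaling/v2` API and a custom per-pod metric named `average_request_time_ms`; both are assumptions for illustration (the talk predates `autoscaling/v2`, and a custom metric only works if a metrics adapter exposes it to the cluster).

```python
import json


def build_hpa(deployment_name: str, min_replicas: int, max_replicas: int,
              metric_name: str, target_average: str) -> dict:
    """Return a HorizontalPodAutoscaler targeting a Deployment on a per-pod metric."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": deployment_name},
        "spec": {
            # What to scale.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment_name,
            },
            # How far to scale.
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # What to scale on.
            "metrics": [
                {
                    "type": "Pods",
                    "pods": {
                        "metric": {"name": metric_name},
                        "target": {
                            "type": "AverageValue",
                            "averageValue": target_average,
                        },
                    },
                }
            ],
        },
    }


hpa = build_hpa("laravel-app", 1, 10, "average_request_time_ms", "1000")
print(json.dumps(hpa, indent=2))
```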
What happens if you set the HPA target metric too low (e.g., 300ms when the app can only do 400ms)?
The autoscaler will keep scaling up until it reaches the maximum replicas, leading to high resource usage and costs.
18:41
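The runaway behaviour can be reproduced with a toy simulation. It uses the scaling rule from the Kubernetes documentation, desired = ceil(current * currentMetric / targetMetric), clamped to the min/max replica range. Everything else is a simplifying assumption: the app is modelled as always responding in 400 ms regardless of replica count, and the real autoscaler's tolerance and stabilization behaviour are ignored.

```python
import math


def desired_replicas(current: int, metric_ms: float, target_ms: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Apply the HPA scaling rule and clamp to the allowed range."""
    raw = math.ceil(current * metric_ms / target_ms)
    return max(min_replicas, min(max_replicas, raw))


def simulate(target_ms: float, floor_ms: float = 400,
             min_replicas: int = 1, max_replicas: int = 10,
             rounds: int = 20) -> list:
    """Return the replica counts over time for an app stuck at floor_ms."""
    replicas = min_replicas
    history = [replicas]
    for _ in range(rounds):
        nxt = desired_replicas(replicas, floor_ms, target_ms,
                               min_replicas, max_replicas)
        if nxt == replicas:
            break  # the autoscaler has settled
        replicas = nxt
        history.append(replicas)
    return history


too_tight = simulate(target_ms=300)    # target below what the app can achieve
realistic = simulate(target_ms=1000)   # target the app already meets
print(too_tight)
print(realistic)
```

With the 300 ms target the replica count climbs 1, 2, 3, 4, 6, 8, 10 and stops only because it hits the cap of 10, while the 1000 ms target stays at a single replica.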
What command is used to apply Kubernetes resource files?
kubectl apply -f <filename>
11:00
What is the role of the Ingress controller?
It acts as a reverse proxy that routes external requests to the appropriate Service based on rules.
12:28
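An Ingress resource that routes a path to a Service by name could be sketched like this. It assumes the `networking.k8s.io/v1` API and hypothetical names (`laravel-ingress`, `laravel-app`); the Service name must match an existing Service in the same namespace.

```python
import json


def build_ingress(name: str, service_name: str, path: str = "/") -> dict:
    """Return an Ingress that sends requests under `path` to the named Service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            "rules": [
                {
                    "http": {
                        "paths": [
                            {
                                "path": path,
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": 80},
                                    }
                                },
                            }
                        ]
                    }
                }
            ]
        },
    }


ingress = build_ingress("laravel-ingress", service_name="laravel-app")
print(json.dumps(ingress, indent=2))
```

Only two values really vary here, the path and the Service name; the rest is nesting required by the API.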
Why is the demo setup not production-ready?
Because it is simplified; production deployment requires more research due to Kubernetes' steep learning curve.
24:11
11-hour deployment disaster
Illustrates the pain point that Kubernetes solves, making the talk relatable.
00:05
Kubernetes as 24/7 ops team
Clear analogy that explains Kubernetes' value proposition.
01:01
Live scaling from 15s to 0.1s
Dramatic demo showing the power of auto-scaling in real time.
03:07
No code changes needed
Key selling point: existing Laravel code works without modification.
05:09
Setting realistic scaling constraints
Critical advice to avoid runaway scaling and high costs.
18:41
[00:00] [Music]
[00:05] all right so about four months ago it
[00:10] was a Thursday at 4 p.m. we were doing a
[00:14] routine production deployment right we
[00:16] do them every week they take maybe 30
[00:19] minutes very easy
[00:23] 11 hours later I left the office now
[00:29] this means one of two things can have
[00:31] happened either our production deploy
[00:35] went really well and I was celebrating
[00:39] until 3:00 in the morning alternatively
[00:44] everything that could have gone wrong
[00:46] went wrong with that deployment now you
[00:50] can probably guess which one it was now
[00:55] I'm telling you this story because that
[00:59] wouldn't have been a problem if we'd
[01:01] been using kubernetes kubernetes is like
[01:06] having your own dedicated 24/7 ops team
[01:10] that makes sure your application runs
[01:12] smoothly and when things do go wrong it
[01:16] makes sure that that is also handled
[01:18] smoothly so that you can deal with
[01:20] problems in your own time and that's the
[01:24] reason I submitted a talk for being here
[01:26] today and which is why for the next 27
[01:31] minutes we're gonna go over how to
[01:34] deploy and scale laravel on kubernetes
[01:38] now you can't really talk about
[01:42] kubernetes without mentioning auto
[01:45] scaling auto scaling is this fabled
[01:49] thing that everyone wants right you want
[01:51] it to scale instantly and automatically
[01:54] and infinitely and it's really quite
[01:58] simple
[01:59] and it's fast so let's take an example
[02:02] let's say you just deployed a new
[02:05] service a month ago right you've got it
[02:07] on like a twenty dollar digitalocean VPS
[02:10] you've got maybe five customers then
[02:15] this happens Taylor tweets out that he
[02:19] just used your service it's awesome
[02:21] you've got four thousand people there
[02:24] suddenly hitting your service now that
[02:29] can go one of two ways either it goes
[02:33] really well because you have auto
[02:37] scaling set up and so you can handle
[02:39] 4,000 people and you just got four
[02:42] thousand new customers it's pretty
[02:44] awesome alternatively of course you
[02:48] don't have any auto scaling your site
[02:50] just slowed to a halt and all the
[02:53] requests are taking 30 seconds no one
[02:56] can sign up your customers can't even
[02:58] use it you're getting angry tweets from
[03:00] people that think everything is terrible
[03:02] in life so let's look at how this works
[03:07] now if we just check out our super
[03:15] awesome service here which I should say
[03:18] it is the base installation of laravel
[03:21] which is a great service which I'm
[03:23] offering entirely free to everyone here
[03:25] by the way then we're also going to
[03:28] check we're running one instance there
[03:29] and then Taylor's tweet comes so we're
[03:32] gonna run siege for running 25
[03:34] concurrent requests and I'm not sure if
[03:37] you can see it you probably can't see it
[03:39] that well there actually I can barely
[03:41] see it but it's about 10 15 second
[03:45] requests here and in fact if we open up
[03:48] a new tab you will quickly see or rather
[03:52] not so quickly that it's loading
[03:55] incredibly slowly right it's gonna take
[03:58] about 15 seconds so what if we just
[04:02] deploy auto scaling just gonna do run
[04:06] one command here and we're gonna get the
[04:11] auto scaling running you can see it's
[04:13] still loading
[04:13] that request is getting up to 15
[04:16] seconds now it's finally loaded and now
[04:22] the horizontal pod autoscaler which
[04:25] I'm gonna get to far later it's gonna
[04:27] kick in you can see there in the top
[04:29] right right we just got a second
[04:32] instance and in fact if you now go and
[04:35] refresh that page or in this case open
[04:38] up a new tab to load it you'll find that
[04:42] we have auto scaling and you might be
[04:46] thinking oh it's gonna go from like 15
[04:48] seconds to 3 seconds no that's gonna go
[04:52] from 15 seconds to about 0.1 seconds so
[04:59] that's auto scaling now I know you
[05:02] all want this right everyone's like I
[05:04] want that right I don't want my site
[05:07] just slowed completely that looks
[05:09] awesome
[05:09] how do I get it and then there are some
[05:12] of you will be thinking I don't want to
[05:17] change my code again I just rewrote all
[05:19] my tests to run on Travis CI like a
[05:22] month ago and now you're telling me I
[05:24] need to do something else no you don't
[05:26] need to change your code and this is a
[05:29] really appealing thing nothing needs to
[05:33] change you set it up once and then you
[05:37] just run it you code your app just the
[05:40] way you used to you don't have to think
[05:42] about special cases and it all just
[05:45] works so without further ado we're just
[05:48] gonna jump into deploying laravel on
[05:50] kubernetes now I do apologize in advance
[05:54] because there's gonna be a lot of YAML
[05:56] and I know that YAML is probably most
[05:59] people's least favorite markup language
[06:02] here don't worry I hate it too so I feel
[06:05] you on that in kubernetes we have
[06:09] something called a deployment deployment
[06:12] looks pretty scary and don't worry
[06:14] you're not meant to be able to read this
[06:16] but it really just does two things right
[06:20] the first thing
[06:22] is it tracks how many copies of our
[06:26] application are running which you can see
[06:28] by the number of replicas right and this
[06:30] is the only thing that we care about
[06:32] when we're auto-scaling when we're
[06:34] auto-scaling we're just adjusting this
[06:37] number dynamically right so it's being
[06:40] changed on the fly the second thing it
[06:44] does of course is it specifies what code
[06:47] we're running specifies the docker
[06:49] containers so there's a section in there
[06:53] called spec and I'm gonna go through
[06:55] this reasonably quickly to not bore you
[06:58] with YAML so you add your containers
[07:01] you start off with an application
[07:03] container now i've already published a
[07:06] container on docker hub which contains
[07:08] FPM and the base laravel installation
[07:10] that's all that's in there secondly you
[07:15] add nginx right this is just as if
[07:17] you're deploying on a regular VPS you're
[07:20] adding FPM with your application code
[07:23] you're adding nginx and of course we need to
[07:27] expose the container port as well right
[07:29] we're running on port 80
[07:31] so we expose port 80 but there's
[07:34] another thing you need when you're
[07:36] deploying on a regular VPS as well right
[07:39] you don't just need FPM and nginx
[07:42] installed but you need an nginx
[07:44] configuration in kubernetes there's something
[07:52] called a config map config map is really
[07:56] straightforward it is a key value store
[07:59] you literally just insert keys and
[08:02] values when you store files you will
[08:05] usually use the file name as the key and
[08:09] then the content as the value now I have
[08:13] no reason at all to even show you the
[08:16] contents of this because this is the
[08:17] exact same nginx config file that you
[08:21] would use when you're deploying on a
[08:22] regular VPS that's what I'm saying right
[08:25] there's no changes that you need to make
[08:27] you deploy the same code
[08:29] so let's jump back to the deployment and
[08:32] we don't get this config map right our
[08:36] nginx configuration we need to get that
[08:37] into the nginx container somehow so in
[08:42] kubernetes you just declare a volume
[08:44] at the top here of the spec and then you
[08:49] give it a name you're saying I want to
[08:50] load a volume from this config map that
[08:53] we just created and you add a mount I
[08:57] know this can be a bit confusing but
[09:01] it's quite straightforward right you're
[09:02] mounting this nginx configuration and
[09:05] you're mounting it into the path
[09:08] /etc/nginx/nginx.conf now we have
[09:15] fpm and our code bundled in that we have
[09:20] nginx and the nginx config but if you
[09:24] were deploying on a regular server
[09:27] there's still one more thing you need
[09:29] but you need your environment variables
[09:32] in there you need your app key database
[09:34] password and all of that and of course
[09:37] kubernetes has a way to deal with this
[09:38] it's called a secret despite the name a
[09:43] secret isn't as secret as it sounds a
[09:48] secret is effectively a config map
[09:52] that's just base64 encoded so as
[09:56] you can see all the values here are just
[09:58] base64 encoded values right and so
[10:01] kubernetes will automatically decode
[10:04] that when it puts it into your
[10:05] environment variables but this means
[10:10] don't commit secrets now what I mean by
[10:15] this is you can still totally commit the
[10:19] kubernetes resource called a secret but
[10:22] don't commit your secret values don't
[10:24] commit your API keys don't commit your
[10:26] passwords don't commit any credentials
[10:28] or an T from the talk yesterday is gonna
[10:31] come shouting at you for security flaws
[10:35] but you can still commit the secret
[10:37] resource in kubernetes right you can use
[10:40] environment substitution or something
[10:42] similar so if we were deploying on a
[10:48] regular VPS now we'd be ready right so
[10:53] if we just deploy it it'll work naturally
[10:56] so to deploy in kubernetes you use
[11:00] the kubectl command which is the
[11:03] command-line interface for interacting
[11:04] with kubernetes and you call apply and
[11:08] then you use the dash F argument which
[11:10] passes a file or you can pass a
[11:12] directory of files and so in this
[11:15] kubernetes directory I just have the
[11:18] config map
[11:19] I have the secret we created and the
[11:21] deployment next we can just open up our
[11:27] website right and it's gonna be there
[11:28] and everything can be awesome nope
[11:33] unfortunately it's not quite that simple
[11:37] if you made a request with our current
[11:41] set up what happens is basically this
[11:45] right we get what's called a pod which
[11:48] contains fpm and nginx that's what we
[11:50] specified kubernetes assigns an IP
[11:54] address to it and so your natural
[11:56] instinct will be let's just point the
[11:59] internet at it right how you would
[12:02] normally do it just put DNS there but
[12:05] what happens if you deploy a second
[12:07] service right what if you want to split
[12:10] your app into an auth service and your
[12:13] primary service well you can't point it
[12:17] at both at once
[12:18] and so we basically need a reverse proxy
[12:22] right and so kubernetes has that built
[12:25] in which is really awesome and it's
[12:28] called the ingress controller the
[12:31] ingress controller for all intents and
[12:34] purposes right now anyway it's just a
[12:36] reverse proxy and so you can just create
[12:40] an ingress resource and you tell it to
[12:42] route to where you want it to go
[12:45] and that's all good and well so if we do
[12:50] that now we should have a production
[12:52] environment right but there's still one
[12:56] thing missing the problem is what if we
[13:01] start auto scaling it
[13:03] now we've got three IP addresses for one
[13:07] service and we got two for the other we
[13:09] need some form of load balancing here
[13:11] and our reverse proxy the ingress
[13:14] controller can't handle load balancing
[13:17] so we need some way of handling load
[13:20] balancing and you use a service for that
[13:23] now this is all a big massive theory
[13:26] lesson about kubernetes networking which
[13:29] can get a bit involved so let's jump
[13:31] back to where we're at now this is what
[13:34] we have we've a single pod with an IP
[13:38] address and we have an ingress
[13:40] controller now of course the ingress
[13:43] controller is a reverse proxy and if you
[13:45] don't tell it where to go what service
[13:48] to hit you get a 404 so unfortunately
[13:54] we're gonna have to dig in and create
[13:56] the service and the ingress the service
[14:01] is fairly straightforward you only
[14:05] specify really one thing you're saying
[14:07] if a request comes in to port 80
[14:10] that's the port then forward it to the
[14:14] target port 80 in the pod that is being
[14:17] targeted right which is the laravel
[14:18] application that we just deployed laracon
[14:20] 2019
[14:21] [Music]
[14:23] next you give it a name so that the
[14:26] ingress controller can then find the
[14:28] service right next more YAML I
[14:33] apologize we go through the ingress
[14:38] again this might be a bit confusing
[14:43] because it is very indented but there's
[14:46] actually only two things that you need
[14:48] right you need to have the path and you
[14:54] need to have a service name which we
[14:56] just created with the service
[14:58] so now we've created all this service
[15:02] and ingress stuff you've probably grown
[15:05] really tired of YAML by now and you
[15:08] know when you go back to try this on
[15:10] your own you'll be spending you know 20
[15:12] hours writing YAML files and being
[15:14] frustrated fortunately that's all there
[15:19] is we've deployed laravel we've deployed
[15:23] laravel on kubernetes and it all works
[15:27] so we're basically done right we're
[15:32] still missing the auto scaling so
[15:39] with all this done right our application
[15:42] is deployed we just have auto scaling
[15:44] left let's dig into that in kubernetes
[15:54] you have what's called horizontal pod
[15:57] autoscaler which is a really long name
[16:01] and a really convoluted way of saying
[16:03] this thing just keeps track of how many
[16:06] copies of your application are running
[16:09] all it does is it scales this pod right
[16:14] with fpm and nginx scales that up and
[16:17] down based on whatever metrics you
[16:19] choose in fact you can scale it based on
[16:23] any metrics you can scale it based on
[16:25] latency or you could scale it on CPU or
[16:28] memory or you could scale it based on
[16:31] how many people are in your office right
[16:33] now if you set up a custom metric to
[16:35] track that now we don't want our site to
[16:41] be slow that's why we added auto
[16:44] scaling in the first place that's why we
[16:46] want it and so a natural thing to do is
[16:49] of course you scale on latency right you
[16:51] want to make sure that your site loads
[16:52] quickly so you can scale on a request
[16:55] time this is what a horizontal pod
[16:59] autoscaler looks like I know it's a lot
[17:03] of YAML unfortunately this is also the
[17:06] most important YAML file of the
[17:09] entire talk which means we're gonna have
[17:11] to go through it so I'm gonna try to
[17:15] simplify this and only highlight the
[17:17] important parts here so you start with a
[17:21] target reference and even though it's
[17:25] like four attributes and nested three
[17:27] levels deep it does one thing it tells
[17:31] you what it is you want to scale so we
[17:35] deployed a deployment that's what we
[17:38] created that contains our containers
[17:40] that's really awkward to say and we just
[17:45] say we're targeting the deployment
[17:47] that's named laracon 2019 after that
[17:53] you've got the rest of it right so you
[17:56] specify the minimum and maximum number
[17:59] of copies that you want to run that's
[18:01] the minimum replicas and the maximum and
[18:04] then last but not least the most
[18:07] important part you specify what you're
[18:11] scaling on you specify the metrics that
[18:14] you want to scale based on now in this
[18:18] case I have a metric that's called
[18:20] average request time in milliseconds but
[18:23] you could use any metrics now you'll
[18:27] notice here I've set a constraint of a
[18:30] thousand milliseconds so what I'm saying
[18:33] is I don't want my site to load slower
[18:36] than one second but what if you reduce
[18:41] that to 500 right we don't want our
[18:44] sites to load in a second really we
[18:46] want it to load in 0.1 second so you
[18:52] need to know your constraints because if
[18:55] you know that your laravel application
[18:57] no matter how much power you give it
[19:01] will never load faster than 400
[19:04] milliseconds if you set your target to
[19:08] be 300 milliseconds the autoscaler is
[19:10] going to say but we're not at 300
[19:13] milliseconds yet I'm gonna scale up and
[19:15] then it's gonna scale up and as you can
[19:19] see well it's still 400 milliseconds
[19:21] so it's gonna keep scaling up
[19:23] and then you'll end up at your maximum
[19:25] number of replicas and you're probably
[19:28] gonna end up using effectively all the
[19:30] resources you have and you'll end up
[19:32] with a massive massive bill so make sure
[19:36] you don't set your constraints too tight
[19:41] now then let's go through what this all
[19:45] looks like the whole thing start to
[19:49] finish in three minutes
[19:52] now that sounds fast but it doesn't take
[19:55] more than three minutes to deploy to
[19:57] kubernetes and have auto scaling
[20:00] running so first of all let's just double
[20:06] check of course that we don't have
[20:08] anything deployed yet and so we're
[20:11] getting a 404 just as we're supposed to
[20:14] and so we can also just start watching
[20:18] these pods right so we're gonna check if
[20:20] we have any pods at all and as you'll
[20:24] see we don't have any pods running right
[20:26] because we haven't deployed anything yet
[20:28] so we've got nothing there so then we
[20:32] can jump in and we can just deploy the
[20:36] deployment first right we run kubectl
[20:39] apply onto the deployment and in the top
[20:43] right you can see it's starting to
[20:44] create that container next we're just
[20:49] creating the nginx configuration and
[20:52] follow that up by of course creating the
[20:56] secret with our environment variables
[21:01] now with all that applied we're
[21:04] basically at the first step right we
[21:05] have the setup that you need for
[21:07] deploying on a VPS and as we went
[21:11] through before you're gonna get a
[21:13] 404 on this see if you refresh it
[21:15] you're still getting a 404 but we
[21:20] already know how to fix this right you
[21:22] just add the service you add the
[21:24] ingress and it should work so let's pray
[21:28] and also if there I can type correctly
[21:33] so if we apply the service
[21:36] it should still 404 because we don't
[21:38] have an ingress yet but then as soon as
[21:43] we do add our ingress so again right now
[21:46] still 404
[21:48] though you add the ingress you go back
[21:51] and reload the page and behold you've
[21:57] got the most awesome service which
[22:00] Taylor tweeted about now of course we
[22:05] still have one thing left we have the
[22:07] auto scaling so we're gonna do what we
[22:11] did before we're gonna simulate this
[22:13] Taylor tweet we're doing 25 concurrent
[22:16] requests and just double check that
[22:23] everything is running incredibly slowly
[22:26] as you can see it's gonna keep loading
[22:29] that for a good good 15 seconds maybe up
[22:33] to 20 and then of course we apply the
[22:38] autoscaler now as you apply the
[22:42] autoscaler it's gonna start taking in
[22:45] those metrics and then after a few
[22:49] seconds it's gonna decide whether or not
[22:51] it should scale up and so in this case
[22:54] because it can detect that it's running
[22:56] so slowly right it's not responding
[22:58] faster than a second you'll see that
[23:01] it's gonna scale up in just a moment
[23:09] there we go
[23:11] so now if we go and reload the page
[23:15] again or rather open it up in a new tab
[23:17] just so you can see that I'm not using a
[23:20] cached version or anything you'll see
[23:24] that it's gonna load pretty fast it's
[23:27] not gonna load in 15 seconds it's gonna
[23:30] load in less than a second and so we
[23:32] don't need to keep scaling out and
[23:35] that's all there is to deploying
[23:38] kubernetes sorry deploying laravel on
[23:41] kubernetes and auto scaling it
[23:45] that just took three minutes of applying
[23:49] a few files which are all I believe less
[23:53] than a grand total of a hundred lines of
[23:55] YAML now I know 100 lines of YAML is
[23:58] like the equivalent of writing a
[24:00] thousand lines of PHP but it's still not
[24:03] bad so that's all there is to it now I
[24:11] should mention of course before you
[24:14] deploy to production right before you go
[24:17] back and sit down with your laptop's now
[24:19] and you're like I'm going to deploy in
[24:20] kubernetes this was great I just learned
[24:22] how to do it I should mention of
[24:25] course this is not production ready
[24:27] right and before you deploy to
[24:29] production make sure that you do all
[24:32] your research because there is a lot to
[24:35] learn unfortunately there is a steep
[24:37] learning curve to kubernetes but in this
[24:40] talk I just wanted to show that it can be
[24:42] easy you don't need all the bells and
[24:46] whistles sometimes you can just get it
[24:49] done with five files also if you're not
[24:55] that comfortable with docker then I
[24:58] highly recommend that you go and watch
[25:00] David McKay's talk from last year he did an
[25:03] excellent talk on effectively how docker
[25:09] works and how to write docker files and
[25:11] docker images with laravel as an example
[25:14] that was last year's laracon but
[25:17] now I'm going to provide you all a break
[25:20] from YAML let you download kubectl
[25:24] and hopefully enjoy the next talk thank
[25:26] you
[25:27] [Applause]