Benchmarking PHP code the right way

Transcribed Jun 14, 2026
Intermediate · 8 min read · For: PHP developers interested in performance benchmarking and in tools like Hyperfine.

AI Summary

This video discusses the challenges of benchmarking PHP code and introduces Hyperfine, a tool that helps avoid common pitfalls. It covers seven key problems to account for when benchmarking, such as environment differences, opcache optimizations, and machine load, and demonstrates how Hyperfine addresses them.

[00:40]
What is Hyperfine?

Hyperfine is a command-line benchmarking tool that can be installed via package managers like brew. It allows running multiple commands and comparing their performance.
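
As an illustration of the kind of script Hyperfine is pointed at, here is a minimal sketch of a sleep benchmark like the one the video runs. The file name bench_sleep.php, the helper sleepMicroseconds, and the way the factor argument is parsed are assumptions; the video only states that the script calls usleep for 100 milliseconds and can be run with a factor.

```php
<?php
declare(strict_types=1);

// bench_sleep.php: hypothetical reconstruction, not the script from the video.

/** Turn the optional CLI factor into a sleep duration in microseconds. */
function sleepMicroseconds(array $args, int $baseMs = 100): int
{
    $factor = isset($args[1]) ? max(1, (int) $args[1]) : 1;

    return $factor * $baseMs * 1000;
}

usleep(sleepMicroseconds($argv ?? []));

// Compare three factors from the shell:
//   hyperfine 'php bench_sleep.php 1' 'php bench_sleep.php 2' 'php bench_sleep.php 3'
```

Each quoted command is benchmarked separately, and Hyperfine then reports how the commands compare.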

[02:44]
Problem 1: Not using production-equivalent environment

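Benchmarks usually run through the PHP CLI, where opcache is often not loaded or not enabled (the opcache.enable_cli setting), and a development machine may have Xdebug loaded. Either one can change results drastically compared to a production web server. The video shows how to check for both and how to enable opcache for the CLI.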
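
A guard like the one described in the video could look roughly like the sketch below. The function name environmentProblems and the message texts are made up for illustration, and the code assumes PHP 8.0 or newer because of the union type. The extension list and the INI value are passed in as parameters so the check can be tried out without touching php.ini.

```php
<?php
declare(strict_types=1);

/**
 * List the reasons why a PHP process is not production-like for benchmarking.
 * Hypothetical helper; inputs are injected to keep the check easy to exercise.
 */
function environmentProblems(array $loadedExtensions, string|false $enableCli): array
{
    $loaded = array_map('strtolower', $loadedExtensions);
    $problems = [];

    if (!in_array('zend opcache', $loaded, true)) {
        $problems[] = 'Zend OPcache is not loaded';
    } elseif (!filter_var($enableCli, FILTER_VALIDATE_BOOLEAN)) {
        $problems[] = 'opcache.enable_cli is off';
    }

    if (in_array('xdebug', $loaded, true)) {
        $problems[] = 'Xdebug is loaded';
    }

    return $problems;
}

// Intended use at the top of a benchmark script:
//   $problems = environmentProblems(get_loaded_extensions(), ini_get('opcache.enable_cli'));
//   if ($problems !== []) {
//       throw new RuntimeException(implode('; ', $problems));
//   }
```

If the guard reports that opcache.enable_cli is off, one way to switch it on for a single run is php -d opcache.enable_cli=1 bench.php, instead of editing the CLI php.ini as the video does.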

[05:14]
Problem 2: Opcache optimizing away code

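Opcache may inline simple functions, such as one that only returns a constant, so the benchmark no longer measures a function call at all. The video prevents this by adding dummy code to the function so that the optimizer can no longer inline it.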
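
The video does not show the exact dummy code, so the following is only a sketch of the idea. It assumes that the optimizer inlines a function whose body is a single constant return and leaves a function alone once its result depends on a runtime argument; whether that holds depends on the PHP version and the opcache optimizer settings. The names constantOne, opaqueOne, and sumCalls are hypothetical.

```php
<?php
declare(strict_types=1);

// Inlining candidate: the body is nothing but a constant return.
function constantOne()
{
    return 1;
}

// Hypothetical "dummy code" variant: the result depends on a runtime
// argument, so the call cannot simply be replaced by the literal 1.
function opaqueOne(int $seed)
{
    return $seed > 0 ? 1 : 0;
}

function sumCalls(string $mode, int $iterations): int
{
    $sum = 0;
    if ($mode === 'constant') {
        for ($i = 0; $i < $iterations; $i++) {
            $sum += constantOne();
        }
    } else {
        for ($i = 0; $i < $iterations; $i++) {
            $sum += opaqueOne($iterations);
        }
    }

    return $sum;
}

// Runs only when a mode is passed on the command line, for example:
//   hyperfine 'php bench_fn.php constant' 'php bench_fn.php opaque'
if (isset($argv[1])) {
    var_dump(sumCalls($argv[1], 10_000_000)); // the result is dumped so it is actually used
}
```

If the two modes take about the same time, the second function is probably still being optimized away, and the benchmark does not measure a real call.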

[07:51]
Problem 3: Not accounting for machine load

Developer machines often have background processes that affect benchmarks. Hyperfine detects outliers and warns about load.

[09:12]
Problem 4: Not accounting for PHP startup time

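PHP has a startup overhead (about 50 ms for an empty script in the video's demo). The video recommends making benchmark scripts run for at least 500 ms, ideally around 800 ms, so that startup is only a small share of the measured time.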
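
To see why the longer runtime helps, the arithmetic can be sketched with the figures quoted in the video: roughly 50 ms of startup, a 167 ms run of the 100 ms sleep script, and an 800 ms target. The helper name startupShare is an assumption for illustration.

```php
<?php
declare(strict_types=1);

/** Fraction of a measured wall time that is interpreter startup. */
function startupShare(float $startupMs, float $totalMs): float
{
    if ($totalMs <= 0.0) {
        throw new InvalidArgumentException('Total runtime must be positive.');
    }

    return $startupMs / $totalMs;
}

// Figures taken from the video's demo machine; yours will differ.
printf("167 ms run: %.0f%% startup\n", 100 * startupShare(50, 167));
printf("800 ms run: %.0f%% startup\n", 100 * startupShare(50, 800));
```

With these numbers, startup is close to a third of the short run but only about six percent of the 800 ms run, which is why the longer script gives a cleaner comparison.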

[10:41]
Problem 5: Not repeating tests

Hyperfine automatically runs multiple iterations to detect outliers and provide statistically reliable results.

[12:42]
Problem 6: Not using statistical methods

Hyperfine reports the mean and standard deviation of each command. If the resulting ranges (mean plus or minus standard deviation) of two commands overlap, the difference between them may not be statistically meaningful.
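
The overlap check described here is simple enough to sketch in a few lines. This is the rough heuristic from the video, not a formal significance test, and the function name rangesOverlap is hypothetical. The sample values are the ones read out in the video: 127 ms ± 36 ms and 229 ms ± 18 ms.

```php
<?php
declare(strict_types=1);

/** True when [meanA - sdA, meanA + sdA] and [meanB - sdB, meanB + sdB] intersect. */
function rangesOverlap(float $meanA, float $sdA, float $meanB, float $sdB): bool
{
    return ($meanA - $sdA) <= ($meanB + $sdB)
        && ($meanB - $sdB) <= ($meanA + $sdA);
}

// 127 ± 36 gives 91..163 ms, 229 ± 18 gives 211..247 ms: no overlap.
var_dump(rangesOverlap(127, 36, 229, 18)); // bool(false)
```

A result of false matches the video's conclusion that the two scripts really perform differently.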

[14:42]
Problem 7: Optimizing what doesn't matter

Microbenchmarks may show large improvements that are negligible in real applications if the code is not executed frequently.
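
A quick way to judge relevance is to scale the measured difference down to the number of calls the real application makes. The sketch below uses rounded figures from the video (about 500 ms versus about 300 ms for 20 million calls); the function name savedMilliseconds is an assumption.

```php
<?php
declare(strict_types=1);

/** Scale a microbenchmark difference to a realistic number of calls. */
function savedMilliseconds(float $slowMs, float $fastMs, int $benchCalls, int $realCalls): float
{
    if ($benchCalls <= 0) {
        throw new InvalidArgumentException('Benchmark call count must be positive.');
    }

    return ($slowMs - $fastMs) / $benchCalls * $realCalls;
}

// Rounded figures from the video, scaled to 1,000 calls per request.
printf("%.4f ms saved per request\n", savedMilliseconds(500, 300, 20_000_000, 1_000));
```

With these inputs the saving per request is about a hundredth of a millisecond, which is far below what a benchmark of the whole request could detect.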

Proper benchmarking requires accounting for environment, opcache, machine load, startup time, repetition, statistics, and relevance. Hyperfine helps with many of these issues, but developers must still ensure their benchmarks reflect real-world usage.

Tutorial Checklist

1. [01:18] Install Hyperfine using a package manager (e.g., brew install hyperfine).
2. [01:28] Prefix any command with hyperfine to benchmark it, e.g., hyperfine 'php script.php'.
3. [01:43] Benchmark multiple commands by passing them as separate arguments to hyperfine for comparison.
4. [02:44] Ensure a production-equivalent environment: enable opcache for the CLI and disable Xdebug.
5. [05:14] Prevent opcache from optimizing away code by adding dummy operations that confuse the optimizer.
6. [09:12] Account for PHP startup time by ensuring benchmark scripts run for at least 500-800 ms.
7. [10:41] Let Hyperfine repeat tests automatically to detect outliers and ensure statistical reliability.
8. [12:42] Use Hyperfine's statistical output (mean and standard deviation) to compare results and check for overlapping ranges.
9. [17:54] Use Hyperfine's -L flag to parameterize benchmarks, e.g., hyperfine -L version 8.2,8.4 -L mode with_backslash,without_backslash 'php{version} bench.php {mode}'.
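
For the parameterized command in the last step, the benchmark script has to read the mode from its first argument. The sketch below shows one way such a bench.php could look. It is not the script from the blog post: the namespace Bench, the function names, and the format string are assumptions, chosen so that one variant calls the fully qualified \sprintf and the other the unqualified sprintf.

```php
<?php
declare(strict_types=1);

namespace Bench;

// bench.php: hypothetical sketch of a script driven by the {mode} parameter.

function withBackslash(int $id): string
{
    // Fully qualified call: the compiler knows this is the global sprintf.
    return \sprintf('%s-%d', 'item', $id);
}

function withoutBackslash(int $id): string
{
    // Unqualified call inside a namespace: resolved at runtime.
    return sprintf('%s-%d', 'item', $id);
}

function run(string $mode, int $iterations): string
{
    $last = '';
    for ($i = 0; $i < $iterations; $i++) {
        $last = $mode === 'with_backslash' ? withBackslash($i) : withoutBackslash($i);
    }

    return $last;
}

// Runs only when Hyperfine passes a mode as the first argument.
if (isset($argv[1])) {
    var_dump(run($argv[1], 10_000_000)); // the result is dumped so it is actually used
}
```

The php{version} part of the command assumes that binaries named php8.2 and php8.4 exist on the machine. With two values for each -L parameter, Hyperfine runs all four combinations.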

Study Flashcards (8)

What is the main purpose of Hyperfine?

Difficulty: easy

Hyperfine is a command-line benchmarking tool that helps compare the performance of commands and detect outliers.

00:40

Why should opcache be enabled when benchmarking PHP CLI?

Difficulty: medium

Because CLI PHP may not have opcache enabled by default, and without it, performance can drastically differ from a production web server environment.

02:44

How can opcache make benchmarks misleading?

Difficulty: medium

Opcache may inline simple functions, making them appear faster than they actually are in production code.

05:14

What is the recommended minimum runtime for a benchmark script to minimize the impact of PHP startup time?

Difficulty: medium

At least 500 milliseconds, preferably around 800 milliseconds.

09:12

How does Hyperfine help with machine load issues?

Difficulty: easy

It detects statistical outliers and warns the user to run benchmarks on a quieter system.

07:51

What statistical information does Hyperfine provide?

Difficulty: hard

It provides the mean runtime and standard deviation, allowing comparison of overlapping ranges to determine if differences are significant.

12:42

What is the danger of microbenchmarking?

Difficulty: medium

Large gains in a microbenchmark can be negligible in a real application when the optimized code runs only rarely.

14:42

How can you parameterize benchmarks in Hyperfine?

Difficulty: hard

Using the -L flag to introduce parameters, e.g., hyperfine -L version 8.2,8.4 'php{version} script.php'.

17:54

💡 Key Takeaways

🔧

Hyperfine as a benchmarking tool

Introduces a dedicated tool that addresses common benchmarking pitfalls.

00:40
💡

Production environment matters

Highlights the critical difference between CLI and web server environments.

02:44
📊

Opcache can skew results

Demonstrates how opcache optimizations can lead to false conclusions.

05:14
📊

PHP startup overhead

Quantifies the startup time and provides a practical mitigation strategy.

09:12
⚖️

Optimize what matters

Warns against micro-optimizations that don't impact real-world performance.

14:42

[00:00] Benchmarking PHP code is fun and it has

[00:02] always fascinated me, but it's also

[00:05] tricky and there are more ways to do it

[00:07] wrong than you can count on two hands.

[00:10] And if you're benchmarking wrong, then

[00:12] the results are at best useless, at

[00:15] worst harmful. And at the end of this

[00:18] video, you have learned a list of

[00:19] preconditions for good benchmarking in

[00:22] PHP. One tool that has helped me

[00:24] immensely in designing good benchmarks

[00:26] is Hyperfine. And I want to show you how

[00:28] it works, what its benefits are, and

[00:30] what problems you still need to account

[00:32] for. Moin, I am Benjamin and I have helped

[00:36] thousands of developers with PHP

[00:38] performance over the last 10 years. What

[00:40] is Hyperfine? You can write benchmarking

[00:43] code easily in PHP without additional

[00:46] help using the hrtime function or, more

[00:50] commonly seen, the microtime function.

[00:53] After doing this for myself for years,

[00:56] my colleague Fulker convinced me to give

[00:58] Hyperfine a try and it stuck. Why even

[01:02] use a dedicated tool for benchmarking?

[01:04] Fulker wrote about Hyperfine in our blog

[01:06] and we are going through his work and his

[01:09] examples to understand why Hyperfine

[01:11] helps with avoiding a bunch of

[01:13] benchmarking problems that are very

[01:15] common but also introduces some. You can

[01:18] install Hyperfine easily on a Mac

[01:21] with brew install hyperfine and on many

[01:24] other systems as well with a package

[01:26] manager. And if you have it installed,

[01:28] you can prefix any command with

[01:30] Hyperfine. For example, in this case for

[01:32] the bench sleep script, which runs

[01:36] usleep for 100 milliseconds,

[01:43] and then you can also run it for

[01:46] multiple commands. For example, PHP

[01:48] bench sleep.php with a factor of two.

[01:52] And then again PHP bench sleep

[01:56] with a factor of three. And then it will

[01:58] run each of them individually and it

[02:01] will compare the performance against

[02:03] each other. This way you can run

[02:05] alternative implementations for your

[02:07] code and then benchmark them against

[02:10] each other and see which one is faster,

[02:11] which is slower and pick one of the

[02:15] solutions that you want to use. We

[02:18] use it for example also for testing the

[02:21] performance of different PHP versions

[02:23] against each other. Let's say we ask,

[02:25] is there an improvement in a

[02:27] version, and then test the same code

[02:29] against multiple versions of PHP and see

[02:32] how the performance changed. So what

[02:34] problems do exist with benchmarking? Now

[02:36] I want to talk about seven different

[02:38] things that you should account for when

[02:40] benchmarking PHP code.

[02:44] The first problem we should look into is

[02:46] not using a production equivalent

[02:49] environment. And this can happen because

[02:51] we are running the benchmarks uh using

[02:53] the PHP CLI command. This means that by

[02:57] default opcache is not loaded or not

[03:00] enabled for the CLI. And this can

[03:03] drastically change the performance um

[03:06] compared to running the same code in a

[03:08] web server. So let's check that Zend

[03:11] OPcache is enabled and also that ini_get of

[03:18] opcache.enable_cli is set, otherwise

[03:23] throw new exception

[03:27] "opcache not loaded".

[03:31] So let's run

[03:36] the code again. We see, oh okay, opcache

[03:39] was not loaded, so let's change

[03:45] the ini for the CLI settings. We load

[03:48] opcache

[03:50] and the next run works.

[03:52] What we also should account for is

[03:54] that Xdebug is not loaded. So this is

[03:57] something where I had like several

[04:00] embarrassing moments where I tested for

[04:03] example, I think, PHP 5.6 against PHP 7

[04:09] or so, and I saw PHP 7 was so much faster,

[04:14] and then realized that, of course,

[04:17] PHP 7 was much faster than PHP 5.6,

[04:20] but I was also off by a big factor because on

[04:23] PHP 5.6 I also had Xdebug running and

[04:27] I didn't have that on

[04:32] PHP 7. So

[04:36] either Xdebug is not

[04:38] loaded, or we throw new exception "Xdebug

[04:43] is loaded", and then we run that. And this

[04:48] is something that specifically with

[04:50] benchmarking can happen because

[04:52] you run them on a system where

[04:54] you also develop. So it might be the

[04:57] case that Xdebug is actually on. So here

[05:00] it's off on my machine and that's

[05:02] fine.

[05:04] We could also account for other things

[05:06] that are non-production ready. It

[05:07] depends on your environment um what what

[05:10] you think of what you should account

[05:11] for.

[05:14] The second problem that you can have is

[05:16] writing PHP code that opcache optimizes

[05:19] away. So let's say we have code here

[05:24] that has a function foo returning one and

[05:29] we want to test if adding a function

[05:32] returning this is faster than just

[05:34] iterating the code here. So this is the

[05:37] original code that we want to test and

[05:40] we are iterating this and dumping at

[05:43] the end and then we have also the code

[05:46] where we call the foo function and we

[05:49] compare that against each other. So

[05:51] let's run PHP

[05:55] bench

[05:56] function loop.php against PHP

[06:01] bench function

[06:05] optimized away.php, and the file name

[06:08] already gives away the problem.

[06:11] What's happening here is that with

[06:13] opcache

[06:15] some of the functions are actually

[06:17] inlined and optimized away. So because

[06:20] the foo function is returning a static

[06:22] value,

[06:24] it will actually compile the code

[06:26] down to look exactly like this. And

[06:30] that means that you need to be careful

[06:32] with certain functions that you don't

[06:35] write them in a way that opcache

[06:37] optimizes them for your benchmarking

[06:40] scenario and in production you're using

[06:43] a different function. you write it a

[06:45] little bit differently or it's more

[06:47] complex because that's just the way

[06:49] production code is. So that way you

[06:52] would see that the performance is

[06:56] quite different. So what we found out

[06:58] here is that actually the code for

[07:03] the function optimized away ran faster

[07:06] than the code that's just

[07:09] incrementing the number. So if we relied

[07:11] on that benchmark, we would think that

[07:13] functions in PHP code are faster than

[07:16] just incrementing an integer, which is

[07:18] not true. Opcache optimized this away. In this

[07:22] case, we could fool opcache into

[07:24] thinking this function cannot be

[07:28] inlined

[07:30] by writing some dummy code that it gets

[07:34] confused by. And if we rerun the

[07:36] benchmarks now then we see the loop is

[07:39] at 100 milliseconds and then the

[07:41] function call is much slower. The

[07:44] difference is a factor of two and this

[07:47] is way more realistic. Now

[07:51] the third problem is not accounting for

[07:53] load on the machine and this happens

[07:55] with developer machines all the time.

[07:58] So, my developer machine has Docker

[08:00] running, has Spotify running, has a

[08:03] browser running, open with 10,000 tabs.

[08:05] Yes, I'm a tab messie. And it is usually

[08:09] under quite a bit of load. In this

[08:11] example, we see what's happening

[08:13] under load. The benchmark can be off

[08:17] because the machine actually

[08:21] doesn't have enough CPU to run a

[08:23] benchmark on a CPU core without

[08:27] getting interrupted. And Hyperfine

[08:29] checks for interruptions. It checks for

[08:32] statistical outliers and it warns you

[08:35] about this. If there are outliers, if

[08:37] there are problems, it explains to

[08:40] you that you should run it on a quieter

[08:42] system. And this is very helpful because

[08:44] you can now rerun the benchmarks until

[08:53] you have a run that doesn't show a

[08:55] warning of this kind. Maybe you need

[08:58] to stop a few programs for this to

[09:00] happen. But you will know that

[09:03] the benchmarks here are not

[09:04] statistically valid because there was

[09:08] a big spike or change.

[09:12] The fourth problem you can run into when

[09:14] benchmarking, at least with Hyperfine,

[09:16] is that you don't account for the

[09:18] startup time of PHP itself. So let's run

[09:22] a PHP script through Hyperfine that does

[09:25] absolutely nothing. So it's PHP

[09:27] empty.php. We run it and we see that

[09:32] running this empty script takes 50

[09:34] milliseconds. And this means that

[09:37] running any PHP script with this

[09:40] PHP needs 50

[09:43] milliseconds of startup time and we need

[09:46] to discount that from all the benchmarks

[09:48] that we are running. So maybe you

[09:50] remember our benchmark from the

[09:51] beginning with the sleep. We can run it

[09:54] again. PHP bench sleep.php with a factor

[09:58] of one. We just had 100 milliseconds of

[10:00] usleep.

[10:03] And as we can see the script is actually

[10:06] not running 100 milliseconds. On average

[10:08] it's running 167 milliseconds. And this

[10:11] is because of the startup time of PHP

[10:14] itself. And you need to discount that

[10:17] to be able to compare the different runs

[10:19] with each other. What we usually do is

[10:23] we make sure that the benchmark

[10:26] script runs at least 500 milliseconds

[10:29] or better, around 800 milliseconds,

[10:32] so that the actual startup time

[10:35] becomes a very small part of this

[10:38] runtime.

[10:41] The next problem that you can have with

[10:42] benchmarking is not repeating the tests

[10:45] and this is a problem that Hyperfine

[10:48] accounts for automatically. As you

[10:51] remember, when we run Hyperfine, for

[10:54] example on our

[10:58] bench function loop.php,

[11:03] where what we do is 10 million

[11:08] iterations of an increment,

[11:12] then it will run this 15 times, and

[11:16] Hyperfine looks at the execution length

[11:20] of the first test and then determines

[11:23] how often it should run them to be able

[11:25] to make a statistically good um estimate

[11:29] of the performance. And the reason is

[11:31] you should account for this is first um

[11:36] to detect outliers on the machine. So

[11:39] with this we are able to detect that one

[11:41] run took significantly uh longer like

[11:44] this one here. And this might

[11:46] potentially mean that the machine is

[11:47] under load and the benchmark is not

[11:49] reliable. But also you cannot just take

[11:53] one

[11:56] test and then use this number as the

[12:00] truth. That is not statistically good,

[12:03] and you can be way off by not accounting

[12:06] for changes or variances in

[12:09] this. Maybe you could argue that

[12:12] running 10 million iterations is already

[12:14] repetition, but for me it's really

[12:17] not. The problem is that you want to see

[12:20] that the performance runs the same when

[12:23] you do 10 million repetitions over and

[12:26] over again. And this is how Hyperfine is

[12:29] able to determine that there are

[12:31] outliers, there is load. And

[12:33] statistically, it's very important to

[12:35] have multiple runs and then average

[12:38] them.

[12:42] This goes into the sixth problem that

[12:44] you can have with benchmarking: not

[12:47] using statistical methods to compare the

[12:50] performance. And one problem with

[12:54] benchmarking and comparing stuff against

[12:55] each other is that there is a lot of

[12:58] variance in benchmarks. And with

[13:01] statistical tests, you can give a range

[13:06] that is statistically significant for

[13:09] benchmarks that you run. So let's take

[13:11] again the benchmarking loop code that we

[13:14] had before. PHP bench fn optimized

[13:21] away.php.

[13:24] We run both tests against each other

[13:28] and we can see that Hyperfine shows

[13:34] the range of the results it has and it

[13:38] also calculates the variance, and

[13:42] it says not only that the test ran

[13:45] for 127 milliseconds on average, but it

[13:50] also says that statistically it

[13:54] fluctuates by 36 milliseconds plus minus

[13:58] around this average. And the same is for

[14:01] the second test. It's 229 milliseconds

[14:05] plus minus 18 milliseconds

[14:07] statistically. So you see this sort of

[14:10] range and then you can compare if both

[14:14] ranges overlap each other, and if

[14:17] the ranges overlap for both tests then

[14:20] this means that statistically they might

[14:23] not actually be different that much

[14:26] because you could see the same

[14:27] performance for both of them.

[14:30] In this case we can see the

[14:33] ranges don't really overlap. So

[14:35] statistically this is really a different

[14:37] performance between the two scripts.

[14:42] The seventh problem is optimizing what

[14:45] doesn't matter, and this is a problem

[14:48] that happens a lot with

[14:50] microbenchmarking.

[14:52] So if you're microbenchmarking some

[14:54] construct against each other and then

[14:57] you're increasing the iterations to 10

[15:00] million, 20 million, 50 million, then

[15:02] you have to ask yourself, are you

[15:04] actually running this that often in your

[15:07] actual PHP code? And then if in the

[15:09] actual PHP code you're only running it

[15:11] like 100 times then maybe even if you

[15:14] can optimize something then it wouldn't

[15:17] matter in the end for your script

[15:19] itself, because in real production

[15:22] code it would be just a few nanoseconds

[15:25] of improvement that isn't really

[15:27] measurable. Let's take for example

[15:30] this case where we run the difference

[15:33] between calling a function prefixed

[15:37] with a namespace, so we're importing it or

[15:39] we're calling it with a prefixed name, and

[15:41] not doing that, so we're not

[15:43] importing that. And the difference

[15:46] here is that in the case

[15:49] where it's imported, opcache can run some

[15:52] optimizations and inline the function,

[15:55] and in this case here it cannot do

[15:57] that. So let's run both against each

[16:01] other. We can see for the unoptimized

[16:04] code it runs for 500 milliseconds. For

[16:07] the optimized code, it runs for 300

[16:10] something milliseconds and the

[16:12] difference is 1.6 times faster. However,

[16:17] when we look at this, we are really

[16:18] calling count and checking this 20

[16:21] million times. So, the question really

[16:23] is, is this happening 20 million times

[16:26] in our application? If yes, then

[16:29] this is a very good optimization. If

[16:31] count is only called 1,000 times, then

[16:34] you won't see a difference at all. So,

[16:37] let's go back to a full example from

[16:39] Fulker's blog post where he tests the

[16:43] sprintf changes that compile down to

[16:47] string concatenation that were added in

[16:50] PHP 8.4 against

[16:53] previous versions, and using backslashes

[16:55] and not. So, the example first

[16:58] checks for Xdebug as we've seen. It

[17:00] checks for Zend OPcache and that it's

[17:02] enabled. It has 10 million iterations.

[17:06] Two cases: I can run the script with

[17:09] backslash or without backslash. And the

[17:12] functions are declared here as using a

[17:15] backslash. Then this will trigger the

[17:19] optimization in PHP 8.4, or without a

[17:22] backslash it will not trigger the

[17:24] optimization. So we are

[17:29] running 10 million iterations calling

[17:32] this function with a string and an

[17:35] identifier concatenating this.

[17:38] So what Fulker showed is that Hyperfine

[17:41] can be used to parameterize tests in a

[17:46] very interesting way. So it allows

[17:49] you to more easily run benchmarks with a

[17:51] lot of different parameters.

[17:54] I have the command here. So we can use

[17:57] the -L flag to introduce a

[18:00] parameter. In our case, we want to run

[18:03] 8.2 against 8.4. And then we also

[18:06] introduce a second parameter mode with

[18:09] backslash and without backslash. And

[18:11] then we specify only one command, PHP,

[18:15] and then appending the version number

[18:17] running the bench script that we see

[18:20] here above and then the mode. This will

[18:22] create four tests run against each

[18:24] other. So PHP 8.2 with backslash

[18:30] runs around 800 milliseconds. PHP 8.4

[18:35] with backslash runs much faster, 580

[18:38] milliseconds. And then again PHP 8.2

[18:42] without a backslash runs similarly

[18:45] around 800 milliseconds. And then PHP

[18:49] 8.4 without a backslash also runs at

[18:52] around 800 milliseconds.

[18:55] And we see the end result. PHP 8.4

[18:59] with backslash runs 1.3 times faster than

[19:03] 8.2 with a backslash and 1.48

[19:08] times faster than without a backslash.

[19:12] So this helps seeing the performance

[19:15] difference. We run the same script and

[19:18] we add some variation and parameters

[19:20] through Hyperfine. This video gave a

[19:22] good overview about seven different

[19:25] problems that you can have

[19:26] benchmarking PHP code, how we use the

[19:29] Hyperfine tool, different ways of using

[19:31] Hyperfine and I hope that you really

[19:34] took something away from it for your own

[19:35] benchmarking. If you like this content

[19:38] about PHP performance, you can follow

[19:40] this channel on YouTube or subscribe to

[19:43] our newsletter. The link is also in the

[19:45] description. Thank you very much.