
Optimize JSON pretty print indentation performance #21474

Open

LamentXU123 wants to merge 8 commits into php:master from LamentXU123:optimaze-1

Conversation

@LamentXU123
Contributor

This PR optimizes the php_json_pretty_print_indent function in the JSON extension to improve performance when encoding structures with JSON_PRETTY_PRINT.

While reading this code, I found that the indentation logic uses a for loop to append spaces in increments of 4 characters per depth level. For a JSON structure of depth N, this results in N consecutive calls to smart_str_appendl, which introduces unnecessary overhead from repeated function calls.

So I introduced a space-time tradeoff: a static constant string of pre-allocated spaces. For almost all typical JSON depths, this reduces the number of smart_str_appendl calls from O(N) to exactly 1. This significantly reduces function call overhead and improves CPU performance during json_encode with JSON_PRETTY_PRINT, especially for deeply nested data.

@staabm
Contributor

staabm commented Mar 20, 2026

This significantly reduces function call overhead and improves CPU performance during json_encode with JSON_PRETTY_PRINT, especially for deeply nested data.

Could you give some before/after numbers?

@LamentXU123
Contributor Author

This significantly reduces function call overhead and improves CPU performance during json_encode with JSON_PRETTY_PRINT, especially for deeply nested data.

Could you give some before/after numbers?

Sure, but maybe tomorrow. I thought this was a fairly obvious optimization, so I didn't include benchmarks initially.

@iluuu1994
Member

A short benchmark would be appreciated. smart_str_appendl() is inlined, so whether this actually improves performance is hard to predict.

@LamentXU123
Contributor Author

LamentXU123 commented Mar 21, 2026

A short benchmark would be appreciated. smart_str_appendl() is inlined, so whether this actually improves performance is hard to predict.

(Edited) The benchmark script:

For very simple JSON structures (depth <= 2), the original version is slightly faster.

<?php
$data = [];
$ptr = &$data;
for ($i = 0; $i < 2; $i++) {
    $ptr["level_$i"] = ["msg" => "opt_test", "id" => $i];
    $ptr = &$ptr["level_$i"];
}
for ($i = 0; $i < 500000; $i++) {
    json_encode($data, JSON_PRETTY_PRINT);
}
?>
[benchmark screenshot]

For complex structures (depth=3), the optimized version is slightly faster.

<?php
$data = [];
$ptr = &$data;
for ($i = 0; $i < 3; $i++) {
    $ptr["level_$i"] = ["msg" => "opt_test", "id" => $i];
    $ptr = &$ptr["level_$i"];
}
for ($i = 0; $i < 500000; $i++) {
    json_encode($data, JSON_PRETTY_PRINT);
}
?>
[benchmark screenshot]

For very complex structures (depth=50), the optimized version is faster.

<?php
$data = [];
$ptr = &$data;
for ($i = 0; $i < 50; $i++) {
    $ptr["level_$i"] = ["msg" => "opt_test", "id" => $i];
    $ptr = &$ptr["level_$i"];
}
for ($i = 0; $i < 500000; $i++) {
    json_encode($data, JSON_PRETTY_PRINT);
}
?>
[benchmark screenshot]

It seems the original one is faster for very simple structures, so I am thinking of something like:

static inline void php_json_pretty_print_indent(smart_str *buf, int options, const php_json_encoder *encoder) /* {{{ */
{
    if (options & PHP_JSON_PRETTY_PRINT) {
        int depth = encoder->depth;
        if (depth <= 2) {
            int i;
            for (i = 0; i < depth; i++) {
                smart_str_appendl(buf, "    ", 4);
            }
        } else {
            size_t remaining = (size_t) depth * 4;
            char *dst = smart_str_extend(buf, remaining);
            memset(dst, ' ', remaining);
        }
    }
}

I ran the benchmark again with the code above.

With this code, the performance gap for simple structures (depth <= 2) drops to 1.03x:
[benchmark screenshot]

Overall, the optimized version is slightly slower for simple structures with depth below 3 (1.03x slower), but offers a significant improvement for nested data (2x faster at depth 50 and above, 1.02x faster at depth 3). This seems good to me.

@bukka
Member

bukka commented Mar 21, 2026

So you should probably raise the threshold in that condition (if (depth <= 2) {), right?

@LamentXU123
Contributor Author

So you should probably raise threshold in that condition (if (depth <= 2) {), right?

Well, when depth > 2 the optimized version starts to outperform the original, and the benefit grows as depth increases. I think the threshold can be raised to 8, since beyond that it provides a 1.10x speedup, which is useful enough.

Comment on lines +58 to +67
if (depth <= 8) {
    int i;
    for (i = 0; i < depth; i++) {
        smart_str_appendl(buf, "    ", 4);
    }
} else {
    size_t remaining = (size_t) depth * 4;
    char *dst = smart_str_extend(buf, remaining);
    memset(dst, ' ', remaining);
}
Contributor


How can the original loop be faster? The inlined functions are almost the same:

/* smart_str_appendl() */
size_t new_len = smart_str_alloc(dest, len, persistent);
memcpy(ZSTR_VAL(dest->s) + ZSTR_LEN(dest->s), str, len);
ZSTR_LEN(dest->s) = new_len;

/* smart_str_extend() */
size_t new_len = smart_str_alloc(dest, len, persistent);
char *ret = ZSTR_VAL(dest->s) + ZSTR_LEN(dest->s);
ZSTR_LEN(dest->s) = new_len;
return ret;

Is the return variable the bottleneck? Or maybe use the original approach with a longer spaces literal ("    ...") with like 64 spaces?

Contributor Author


How can the original loop be faster?

I am wondering this too; probably memset is causing extra cost... But I think that's OK, since the loss becomes negligible as depth grows.

I will run benchmarks to test the original approach later (though I highly doubt it could beat memset).

Member


Just a guess, memset with an arbitrary length may be overspecialized for large sizes. In that case, I don't see a big point of this PR. 50 levels of nesting seem pretty artificial. Nevertheless, I'm not code owner so I'll keep that decision up to those who are.

Contributor Author

LamentXU123 commented Mar 21, 2026


50 levels of nesting seem pretty artificial.

The original approach (defining a spaces constant) should work for simple JSON structures, I guess. I will post the test results later in this thread.

Contributor Author

LamentXU123 commented Mar 21, 2026


Just a guess, memset with an arbitrary length may be overspecialized for large sizes. In that case, I don't see a big point of this PR. 50 levels of nesting seem pretty artificial. Nevertheless, I'm not code owner so I'll keep that decision up to those who are.

<?php
$data = [];
$ptr = &$data;
for ($i = 0; $i < 2; $i++) {
    $ptr["level_$i"] = ["msg" => "opt_test", "id" => $i];
    $ptr = &$ptr["level_$i"];
}
for ($i = 0; $i < 500000; $i++) {
    json_encode($data, JSON_PRETTY_PRINT);
}
?>

depth 2: [benchmark screenshot]
depth 1: [benchmark screenshot]

With the original optimization approach (the pre-allocated spaces constant), the optimized version is 1.29x slower at depth 2 and 1.12x slower at depth 1.

So yes, this PR only helps with nested data.

Is the return variable the bottleneck? Or maybe use the original approach with a longer spaces literal ("    ...") with like 64 spaces?

This happens to be even slower, oddly. Probably due to compiler optimization: smart_str_appendl(buf, "    ", 4) lets the compiler turn the constant-size-4 memcpy into a single 32-bit integer store, avoiding the variable-length memcpy and branch evaluation present in the optimized block.
