PHP - Performance - Adjacency set to nested JSON

Before You Begin

In this article, we will compare the performance of Method 1 and Method 2 as used in - PHP – Convert Adjacency set to nested JSON.

We will be utilizing an extensive data set with approximately 500K records. I have downloaded these records from itis.gov and made some minor modification for this article. Data can be downloaded from here - https://drive.google.com/open?id=115_d5PV4OQ948cWP5TLXUpN2TvLvvo_y

We will be using MySQL CTE and break the data into chunks to test the performance for different size of records. You can refer to the CTE article here - MySQL – Adjacency List Model For Hierarchical Data Using CTE.

There is a slight change made to getDataFromDatabase() method as we want to limit the records to test the performance. Below you can find the updated definition. I have added the placeholder inside the CTE as [ID_OF_ENTITY]. This will be our starting point and then I will retrieve all descendants for this entity and convert it to a nested response.

Method 1

Method 2

Performance Comparision

For performance comparison, I am going to use blackfire.io. We will be ignoring the performance of all other components like MySQL except for transformTree().

Results

ID_OF_ENTITY = 846539, ~5,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 4,960 1.58 s 58.9 ms 1.52 s 3.88 MB
Method 2 - Result 847 88 ms 79.4 ms 8.65 ms 4.41 MB

ID_OF_ENTITY = 846542, ~10,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 12,014 21.3 s 40 ms 21.2 s 8.28 MB
Method 2 - Result 1,807 185 ms 158 ms 27.4 ms 10.3 MB

ID_OF_ENTITY = 846535, ~20,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 19,059 1 min 5 s 82.3 ms 1 min 5 s 13.6 MB
Method 2 - Result 3,061 179 ms 122 ms 57.6 ms 16.8 MB

ID_OF_ENTITY = 846504, ~50,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 54,703 11 min 33 s 1.03 s 11 min 32 s 37.4 MB
Method 2 - Result 8,589 534 ms 404 ms 129 ms 47.1 MB

Conclusion

As you can see from the above result set, Method 2 is significantly faster than Method 1 when it comes to big data set.

I tried to run Method 1 on the complete data set with ~ 500K records, and I waited for more than an hr, but it never executed. On the flip side - Method 2 was still able to complete within 10 seconds.

The main reason for Method 2 being superior is because of the recursive method called significantly less than Method 1.

In future articles, I will talk about how you can best mock any third party service while running PHPUnit tests inside Laravel. As making a real connection slows down the complete suite.

Posted in JSON, PERFORMANCE, PHP, BLOG
Akki K

Akki

Experienced Backend Developer.