When you look at Excel solutions for PHP, the first thing you find is PHPExcel, digging a little bit further lets you find libXl. The prior is a pure PHP solution, the latter a native library for a couple of operating systems. There is a PHP extension that exposes this library to PHP, so you can do the same things with it that you can do with PHPExcel.
TL;DR: PHPExcel is nice, libXl with php_excel can do the same WAY faster. Use the Diagrams below to convince your customer/boss/CFO to invest in performance.
Recently, over at Sitepoint, Taylor Ren wrote about PHPExcel for generating Excel files with pure PHP, reacting to feedback demanding windows-free solutions. The same applied to us in a recent project. Reading and writing XLS aswell as XLSX files sent us to the same library, since it is state-of-the-art, robust, open-source and most importantly runs on linux servers.
We had one problem though: XLS files can get huge. Any modern pure PHP solution has to represent Cells and Rows and Sheets with a huge number of objects and will thus grow exponentially. But there is an alternative:
libXl offers the same range of Excel formats as PHPExcel (sans ODS, HTML and other non-proprietary types). It is available for Linux, Mac and Windows, so it can support pure Linux environments. For use in PHP, there is an extension that exposes all library functionality to PHP. If your customer requires you to read and write files that are generated and might be edited by typical corporate users, the task can be solved by either LibXl or PHPExcel. One is free to use, the other one costs between 200 and 1500 EUR – not that much in corporate terms, but you need to explain why the customer should spend his money.
So we did some benchmarking. First a couple of writes to a document with 5 sheets of 26 columns and 2000 rows.
As you can see in the graphs (note the logarithmic y scale), both PHPExcel and libXl use an exponential amount of time, but libXl is more than one order of magnitude faster than PHPExcel.
Memory can not be compared directly unfortunately, since PHP can not measure what the system library uses. PHPExcel’s memory usage follows an exponential course with a peak value of 180 megabyte for 200000 writes to the XLSX document.
Next, reads are measured. We read from a document with 5 sheets and 20 columns.
Similar to the writes, the runtime of both PHPExcel and libXl follow an exponential curve, again libXl is significantly better, this times almost two orders of magnitude. In absolute numbers, for 200000 reads from a XLSX document with 40000 entries PHPExcel needs 55 seconds, whereas libXl is done in 700 milliseconds.
Reading lets libXl use some memory, so we at least have something to compare, although the difference can rather be used to determine what PHP uses to store the data, instead of the amount of memory used by the library itself.
Finally, we did some measurements of extreme cases, just to be able to reach the limits of what can be done.
Test-Case | Number of Writes | Execution Time (libxl) | Memory Usage* (libxl) | Execution Time (PHPExcel) | Memory Usage* (PHPExcel) |
---|---|---|---|---|---|
1 sheet, 10 columns, 1.000 rows | 10.000 | 58ms | 262k / 28M** | 599ms | 14M / 40M** |
1 sheet, 10 columns, 500.000 rows | 5.000.000 | 34s | 262k / 985M** | 37min | 4,2G / 4.2G** |
1 sheet, 10 columns, 1.048.576 rows (max rows since Excel 2007) |
10.485.760 | 74s | 262k / 2.1G** | PHP Fatal error: Allowed memory size of 8589934592 bytes exhausted | |
1 sheet, 10 rows, 1.000 columns | 10.000 | 58ms | 262k / 27M** | 567ms | 14M / 40M** |
1 sheet, 10 rows, 16.384 columns (max columns since Excel 2007) |
163.840 | 1.010ms | 262k / 54M** | 10s | 148M / 170M** |
1 sheet, 16.384 columns, 3.500 rows | 57.344.000 | 8min | 262k / > 8G** | PHP Fatal error: Allowed memory size of 8589934592 bytes exhausted | |
1 sheet, 16.384 columns, 1.048.576 rows | 17.179.869.184 | Neither tool can handle this | |||
10.000 sheets, 1 column, 1 row | 10.000 | 124s | 262k / 62M** | 124s | 93M / 123M** |
20.000 sheets, 1 column, 1 row | 20.000 | 10min | 252k / 94M** | 39min | 188M / 220M** |
* measurement with memory_get_peak_usage(true) gives an idea how much PHPExcel uses, libXl can’t be measured from within PHP.
** approximate overall memory use with ps
As you can see, due to a smaller memory footprint, libXl is slightly more useful when you have a special set of data to crunch. Apparently you should be able to solve problems with either by simply adding enough memory. PHPExcel is perfectly sufficient for all kinds of data that normally fits on your screen. For larger data sets, you can use libXl to boost your performance significantly.
Schreibe einen Kommentar