Processing CSV files in a memory-efficient way

A little while ago I had to dig deeper into performance-optimized usage of PHPExcel. Our users upload files such as Excel or CSV with a lot of data to process. Initially we used PHPExcel without any tuning of the default configuration, which led to heavy memory issues even on relatively small files. So I had to avoid reading the entire file content into a buffer at once (as file_get_contents does).
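To illustrate the difference to file_get_contents, here is a minimal sketch that streams a CSV file one row at a time using only PHP built-ins (it needs PHP 5.5+ for generators). The helper name readCsvRows and the sample data are made up for this example; this is neither PHPExcel nor Goodby/CSV code.

```php
<?php
// Hypothetical helper (not part of any library): yields one parsed row at a time,
// so only the current row is held in memory instead of the whole file.
function readCsvRows($path, $delimiter = ',')
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // fgetcsv() reads and parses exactly one record per call.
        while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
            if ($row === array(null)) {
                continue; // fgetcsv() returns array(null) for a blank line
            }
            yield $row;
        }
    } finally {
        fclose($handle);
    }
}

// Usage: write a small sample file (assumed data), then stream it.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,name\n1,Alice\n2,Bob\n");

$rowCount = 0;
$lastRow = null;
foreach (readCsvRows($path) as $row) {
    $rowCount++;
    $lastRow = $row; // process the row here; earlier rows are already discarded
}
unlink($path);
```

Because the generator only ever holds the current row, memory usage stays roughly constant no matter how large the file is.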

During my research, which was mainly about optimizing our usage of PHPExcel, I came across a tiny library I have grown really fond of. It is called Goodby/CSV. Both tools have solid documentation that explains the basics and the usage.

Goodby/CSV is highly memory efficient and describes itself as extensible, although I did not verify the second part. Goodby/CSV makes use of PHP's closures (introduced in PHP 5.3): you define an anonymous function as a callback that is invoked for each row read from the file.
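The callback-per-row idea can be sketched with plain PHP built-ins. Note that this is not Goodby/CSV's actual API: eachCsvRow is a hypothetical helper written for this example, and the sample data is made up. It only shows how a closure receives each row as it is read.

```php
<?php
// Hypothetical helper, NOT Goodby/CSV's API: reads the file row by row
// and hands each parsed row to the given callback.
function eachCsvRow($path, callable $onRow)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row !== array(null)) { // ignore blank lines
            $onRow($row);
        }
    }
    fclose($handle);
}

// Usage with an anonymous function; the sample data is an assumption.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "city,temp\nBerlin,21\nHamburg,18\n");

$temperatures = array();
eachCsvRow($path, function (array $row) use (&$temperatures) {
    if ($row[0] === 'city') {
        return; // skip the header row
    }
    $temperatures[$row[0]] = (int) $row[1];
});
unlink($path);
```

The closure captures $temperatures by reference, so the caller decides what to keep from each row while the reader itself never accumulates the whole file.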

As mighty as PHPExcel is, it brings a lot of overhead when reading files, especially CSV files.

I ran a small timing test on reading a CSV file (1988 entries, file size about 1.9 MB). Here are the results:

Run   PHPExcel                  Goodby/CSV
      Duration   Mem. usage     Duration   Mem. usage
1     14.99 s    51.76 MB       0.25 s     21.00 MB
1     14.91 s    51.76 MB       0.25 s     21.00 MB
1     15.27 s    51.76 MB       0.25 s     21.00 MB
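Numbers like the ones in the table can be collected with PHP's built-in microtime and memory_get_peak_usage. The following is only a sketch of one possible setup, not necessarily how the figures above were obtained; the measure helper and the stand-in workload are hypothetical.

```php
<?php
// Hypothetical helper: runs $task and reports wall-clock time and peak memory.
function measure(callable $task)
{
    $start = microtime(true);
    $task();
    return array(
        'seconds' => microtime(true) - $start,
        // Peak memory of the whole script so far, converted to megabytes.
        'peak_mb' => memory_get_peak_usage(true) / 1024 / 1024,
    );
}

// Stand-in workload; replace the closure body with the CSV reading code to compare.
$result = measure(function () {
    $sum = 0;
    for ($i = 0; $i < 100000; $i++) {
        $sum += $i;
    }
});

printf("Duration: %.2fs, peak memory: %.2fMB\n", $result['seconds'], $result['peak_mb']);
```

Since memory_get_peak_usage reports the peak for the entire script, each library is best measured in a separate PHP process so the runs do not influence each other.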

So there it is: in this test Goodby/CSV read the CSV file roughly 60 times faster than PHPExcel and used less than half the memory. The library also supports easy CSV file writing. It is licensed under the MIT License, so there should be no problems using the library in your application.

Posted by Florian Horn on 23.04.2015

Tags: PHP, CSV, Efficiency
