Posted on Wednesday, July 7, 2010 at 5:11AM by Jack Servedio
When iterating through a long queue and calling a web service for each item, especially a slow web service whose response is critical, a simple for or while loop can take an extremely long time to run. This is because PHP executes the calls sequentially: it has to wait for each response before it can move on to the next item. In some cases you can simply skip waiting for the response and move on to the next item to speed things up, but what about when the web service's response is critical to your application?
I ran into this problem while working on a system that sends SMS messages from a MySQL-backed queue through the Ericsson Gateway. To send a message through the Ericsson Gateway, you make a SOAP request to their web service and pass along a number of parameters: username, password, short code, phone number, message body, and so on. The system has to catch any errors in the request to ensure that the message makes it into the aggregator's queue; if the web service returns an error, the SMS message must be re-sent. As you can see, the only way to do this is to wait for the response from the web service and handle it.
When running through the queue with a simple loop, using the NuSoap class to make the SOAP request to Ericsson, the script sent between 0.75 and 1.5 messages per second. That may seem decent, but when you are sending blasts of over 100,000 messages, and your maximum inbound message rate at the aggregator is 20 messages per second, it is far too slow.
If you have to wait for each web service call to return, each call takes between 0.5 and 1.5 seconds, and PHP uses almost no resources while it waits, the logical way to speed things up is to run multiple web service calls at once. But how do you do that while maintaining queue integrity, keeping the processes from running away and spawning thousands at once, and, most importantly, handling and retrying web service errors? The best way is to execute each web service call in its own process while still sharing a single queue: forking, using the PCNTL and POSIX process control extensions. Below is my proof of concept, which uses forking to iterate through a MySQL queue, preserves the queue's integrity with transactions, and limits the number of processes running at once.
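To make the mechanics concrete, here is a minimal sketch of what a fork looks like in PHP (this is illustrative, not the code from the download; it assumes the pcntl and posix extensions are enabled and the script runs from the CLI):

```php
<?php
// A fork splits the script into two processes at this point; everything
// below runs in both the parent and the child.
$pid = pcntl_fork();

if ($pid === -1) {
    // Fork failed; handle the error in the (still single) process.
    die("fork failed\n");
} elseif ($pid === 0) {
    // Child process: pcntl_fork() returned 0, and posix_getpid() now
    // reports a new PID, so the script can branch on "am I the child?".
    exit(0);
} else {
    // Parent process: $pid holds the child's PID. Reaping the child
    // prevents it from lingering as a zombie.
    pcntl_waitpid($pid, $status);
}
```

The key point is that the return value of `pcntl_fork()` is the only thing that distinguishes the two processes, which is exactly what lets one copy go make the SOAP request while the other goes back to the queue.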
Download the Source Here and take a look at the code. I have included the MySQL table structure, a rudimentary database class to handle queries and returned data, the NuSoap library, and the script itself (forking_api.php). The configuration lets you set a maximum number of forks (instances of the script running in parallel, each with its own PID), the amount of time to wait before attempting to fork again once the maximum has been reached, and the maximum number of times to wait before shutting down.

The script starts by instantiating a NuSoap client object to make the SOAP request and connecting to the database to access the queue. Next, it begins a transaction and queries the queue for the next unsent message. That query locks the selected row, which preserves the integrity of the queue when multiple instances of the script access it at once. The query fetches the message ID for indexing, plus the phone number, short code, and message body needed for the SOAP request to the aggregator. If the queue is empty, the script stops. If the query retrieved a message, the script marks it as sent and commits the transaction.

Now is where it gets interesting: the script forks. When a script forks, everything below the fork runs in two processes, meaning it is executed twice. However, using POSIX, the script can determine its own process ID and act differently depending on it. If the script detects that it is running in the new process, it checks whether the maximum number of forks has been reached and, if so, goes into a holding pattern. If the maximum has not been reached, it uses passthru to open a new instance of itself in a new process and then closes.
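The transaction-plus-row-lock step might look roughly like this. This is a hedged sketch, not the code from the download: the table and column names (`sms_queue`, `sent`, and so on) and the PDO connection details are illustrative.

```php
<?php
// Illustrative only: connection details and schema are assumptions,
// not taken from the downloadable source.
$db = new PDO('mysql:host=localhost;dbname=sms', 'user', 'pass');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->beginTransaction();

// FOR UPDATE locks the selected row until the transaction ends, so two
// concurrent instances of the script can never grab the same message.
// (This requires an InnoDB table.)
$row = $db->query(
    "SELECT id, phone, short_code, body
       FROM sms_queue
      WHERE sent = 0
      ORDER BY id
      LIMIT 1
        FOR UPDATE"
)->fetch(PDO::FETCH_ASSOC);

if ($row === false) {
    $db->commit();
    exit; // queue is empty, stop this instance
}

// Mark the message as sent *before* making the SOAP call, then commit to
// release the lock so other instances can keep working the queue.
$stmt = $db->prepare("UPDATE sms_queue SET sent = 1 WHERE id = ?");
$stmt->execute(array($row['id']));
$db->commit();
```

Marking the row as sent before the SOAP call is what keeps other instances from picking up the same message; the error-handling path described below is what reverts it if the call fails.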
Now you are left with the original script, in the original process, at the point where it forked, and another instance of forking_api.php starting at line 1. The original process executes the SOAP request and waits for the response. If the web service call succeeds, the script ends; if there is an error, it reverts the message to unsent, putting it back at the front of the queue. While the SOAP request is executing, the second instance keeps running, keeps forking, and keeps executing SOAP requests until it reaches the maximum number of forks.
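The retry decision can be boiled down to a small helper. With NuSoap you would feed it `$client->fault` and `$client->getError()` after calling `$client->call(...)`; `getError()` returns `false` when nothing went wrong. The helper name and the `UPDATE` shown in the comment are illustrative, not from the real source:

```php
<?php
// shouldRevert() captures the decision the script makes after the SOAP
// call: revert the message to unsent whenever the request faulted or the
// client reported a transport/parse error.
function shouldRevert($fault, $error)
{
    // NuSoap's getError() returns false (or an empty string in some
    // failure-free paths) when there is no error.
    return (bool) ($fault || ($error !== false && $error !== ''));
}

// When shouldRevert(...) is true, the script would run something like:
//   UPDATE sms_queue SET sent = 0 WHERE id = ?
// so the message goes back to the front of the queue for a retry.
```

Because the row was already committed as sent, this revert is the only thing standing between a failed SOAP call and a silently dropped message, which is why the script must wait for the response.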
By forking, you ensure that a pre-defined number of SOAP requests is running asynchronously at all times until the queue is empty, limited only by the processing power, memory, and bandwidth of your servers. Using this script on a message queue, I was able to sustain 10 to 20 messages per second or more for extended periods of time.
Please remember this script is only a proof of concept and is not intended to go directly into production, so use it at your own risk.
Download Source - Contains self forking script, NuSoap Library, a very rudimentary database class, and a SQL file containing the table structure for the queue used.