The cURL library is a valuable resource for developers who need to exchange data over common URL-based protocols (e.g., HTTP and FTP). PHP’s cURL extension provides a set of curl_* wrapper functions that nicely integrate cURL’s functionality.
When you have to make multiple requests in a script, it’s often more efficient to use the curl_multi_* functions (e.g., curl_multi_init), which make it possible to process requests concurrently. For example, if a script has to make 2 web requests and each one takes 2 seconds to complete, making 2 separate curl requests, one right after the other, takes about 4 seconds. With the curl_multi_* functions, the requests are made concurrently (i.e., we no longer have to wait for one request to finish before starting the next one), so the total is closer to 2 seconds. The actual execution time depends on whether the requests truly run in parallel or merely concurrently.
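The arithmetic above can be sketched as a tiny model: sequential requests cost roughly the sum of their durations, while ideally overlapping requests cost roughly the longest one. The function names below are hypothetical, and the durations are the ones from the example, not measurements.

```php
<?php
// Hypothetical helpers modelling the timing argument; no network calls are made.

// Sequential: each request waits for the previous one, so durations add up.
function sequential_seconds(array $durations): float
{
    return (float) array_sum($durations);
}

// Concurrent (idealised): all requests overlap, so the slowest one dominates.
function concurrent_seconds(array $durations): float
{
    return $durations ? (float) max($durations) : 0.0;
}

$durations = [2.0, 2.0];
echo sequential_seconds($durations), "\n"; // prints 4
echo concurrent_seconds($durations), "\n"; // prints 2
```

Real transfers rarely overlap perfectly, so treat the second figure as a lower bound.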
Because the curl_multi_* functions can be cumbersome, let’s take a look at a function that provides a simple interface to cURL’s concurrent capabilities and is extensible to most situations.
/**
 * Simple wrapper function for concurrent request processing with PHP's cURL
 * functions (i.e., using the curl_multi_* functions).
 *
 * @param array $requests Array of requests, each an array containing url and, optionally, post_data and opts.
 * @param array $opts     Optional array containing general options for all requests.
 * @return array Array containing keys from requests array and values of arrays each containing
 *               data (response, null if response empty or error), info (curl info, null if error),
 *               and error (error string if there was an error, otherwise null).
 */
function multi(array $requests, array $opts = [])
{
    // create array for curl handles
    $chs = [];
    // merge general curl options args with defaults
    $opts += [CURLOPT_CONNECTTIMEOUT => 3, CURLOPT_TIMEOUT => 3, CURLOPT_RETURNTRANSFER => 1];
    // create array for responses
    $responses = [];
    // init curl multi handle
    $mh = curl_multi_init();
    // create running flag
    $running = null;

    // cycle through requests and set up
    foreach ($requests as $key => $request) {
        // init individual curl handle
        $chs[$key] = curl_init();
        // set url
        curl_setopt($chs[$key], CURLOPT_URL, $request['url']);
        // check for post data and handle if present
        if (!empty($request['post_data'])) {
            curl_setopt($chs[$key], CURLOPT_POST, 1);
            curl_setopt($chs[$key], CURLOPT_POSTFIELDS, $request['post_data']);
        }
        // set opts (per-request opts take precedence over general opts)
        curl_setopt_array($chs[$key], (isset($request['opts']) ? $request['opts'] + $opts : $opts));
        curl_multi_add_handle($mh, $chs[$key]);
    }

    do {
        // execute curl requests
        $status = curl_multi_exec($mh, $running);
        // block to avoid needless cycling until change in status
        if ($running > 0) {
            curl_multi_select($mh);
        }
        // check flag to see if we're done
    } while ($running > 0 && $status === CURLM_OK);

    // read the result code of each finished transfer; curl_errno() may not report
    // errors for handles driven by a multi handle until these messages are read
    $codes = [];
    while ($msg = curl_multi_info_read($mh)) {
        $key = array_search($msg['handle'], $chs, true);
        if ($key !== false) {
            $codes[$key] = $msg['result'];
        }
    }

    // cycle through requests
    foreach ($chs as $key => $ch) {
        $code = isset($codes[$key]) ? $codes[$key] : curl_errno($ch);
        if ($code !== CURLE_OK) {
            // handle error
            $responses[$key] = ['data' => null, 'info' => null, 'error' => curl_error($ch) ?: curl_strerror($code)];
        } else {
            // save successful response (an empty body is normalized to null)
            $data = curl_multi_getcontent($ch);
            $responses[$key] = ['data' => ($data === '' ? null : $data), 'info' => curl_getinfo($ch), 'error' => null];
        }
        // remove individual handle from the multi handle
        curl_multi_remove_handle($mh, $ch);
    }

    // close multi handle
    curl_multi_close($mh);

    // return responses
    return $responses;
}
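The function relies on PHP's array union operator (+) to layer options: defaults, then general options, then per-request options. The stand-alone sketch below, with illustrative values that are not from the original post, shows which value wins at each layer. It assumes the cURL extension is loaded so that the CURLOPT_* constants exist.

```php
<?php
// Hypothetical stand-alone demo of the array union (+) used inside multi().
// With +, keys already present on the left-hand side win.

$defaults = [CURLOPT_CONNECTTIMEOUT => 3, CURLOPT_TIMEOUT => 3, CURLOPT_RETURNTRANSFER => 1];

// General options passed by the caller override the defaults.
$general = [CURLOPT_TIMEOUT => 10];
$general += $defaults;

// Per-request options override the general options.
$perRequest = [CURLOPT_TIMEOUT => 2];
$effective = $perRequest + $general;

var_dump($general[CURLOPT_TIMEOUT]);          // int(10)
var_dump($effective[CURLOPT_TIMEOUT]);        // int(2)
var_dump($effective[CURLOPT_CONNECTTIMEOUT]); // int(3)
```

The union operator matters here because the CURLOPT_* constants are integers, and array_merge would renumber integer keys instead of preserving them.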
To use this function, you can call it like so:
$responses = multi([
    'google' => ['url' => 'http://google.com', 'opts' => [CURLOPT_TIMEOUT => 2]],
    'msu'    => ['url' => 'http://msu.edu'],
]);
You can then cycle through the responses:
foreach ($responses as $response) {
    if ($response['error']) {
        // handle error
        continue;
    }
    // check for empty response
    if ($response['data'] === null) {
        // examine $response['info']
        continue;
    }
    // handle data
    $data = $response['data'];
    // do something extraordinary
}
While the above function is helpful for a few requests, it starts every request at once. If you need to make a large number of requests (perhaps more than 5), you should instead have a look at the rolling curl library, which makes better use of resources.
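To give a feel for the rolling idea, here is a minimal sketch that keeps at most a fixed number of transfers in flight and starts a new one whenever one finishes. This is not the rolling curl library's API; the function name rolling_fetch, its parameters, and the timeout values are assumptions made for illustration, and it needs PHP 7.3 or later for array_key_first.

```php
<?php
/**
 * Hypothetical sketch of a rolling window: at most $window transfers are in
 * flight, and a new one starts whenever one finishes.
 */
function rolling_fetch(array $urls, int $window = 5): array
{
    $window = max(1, $window);
    $mh = curl_multi_init();
    $pending = $urls;  // key => url, not started yet
    $active = [];      // key => curl handle, currently in flight
    $results = [];     // key => ['data' => ..., 'error' => ...]

    // start the next pending request and track its handle
    $start = function () use (&$pending, &$active, $mh) {
        $key = array_key_first($pending);
        $ch = curl_init($pending[$key]);
        unset($pending[$key]);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_CONNECTTIMEOUT => 3,
            CURLOPT_TIMEOUT => 3,
        ]);
        curl_multi_add_handle($mh, $ch);
        $active[$key] = $ch;
    };

    // fill the initial window
    while ($pending && count($active) < $window) {
        $start();
    }

    while ($active) {
        curl_multi_exec($mh, $running);
        $started = false;
        // collect finished transfers and refill the window
        while ($msg = curl_multi_info_read($mh)) {
            $key = array_search($msg['handle'], $active, true);
            if ($key === false) {
                continue;
            }
            $ch = $active[$key];
            $results[$key] = $msg['result'] === CURLE_OK
                ? ['data' => curl_multi_getcontent($ch), 'error' => null]
                : ['data' => null, 'error' => curl_strerror($msg['result'])];
            curl_multi_remove_handle($mh, $ch);
            unset($active[$key]);
            if ($pending) {
                $start();
                $started = true;
            }
        }
        // wait for activity unless a fresh handle still needs its first exec
        if ($active && !$started && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000);
        }
    }

    curl_multi_close($mh);
    return $results;
}
```

A call such as rolling_fetch(['msu' => 'http://msu.edu'], 2) would return one entry per key, in the order the transfers finish rather than the order they were given.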
And your significant other said you couldn’t multitask 🙂