Making Concurrent cURL Requests Using PHP’s curl_multi* Functions

The cURL library proves a valuable resource for developers needing to make use of common URL-based protocols (e.g., HTTP, FTP, etc.) for exchanging data. PHP provides a set of curl* wrapper functions in an extension that nicely integrates cURL’s functionality.

When you have to make multiple requests in a script, it’s often more efficient to utilize the curl_multi* functions (e.g., curl_multi_init), which make it possible to process requests concurrently. For example, if you have to make 2 web requests in a script and each one requires 2 seconds to complete, making 2 separate curl requests, one right after the other, requires 4 seconds. However, if you make use of the curl_multi* functions, the requests will be made concurrently (i.e., we no longer have to wait for one request to finish to start the next one), and requires only 2 seconds (the actual execution time depends on if the scripts are truly running in parallel or merely concurrently.)

Let’s take a look at a function that provides a simple interface to the concurrent capabilities of cURL and is extensible to most situations, as the curl_multi* functions can be cumbersome.

/**
* Simple wrapper function for concurrent request processing with PHP's cURL functions (i.e., using curl_multi* functions.)
*
* @param array $requests Array containing request url, post_data, and settings.
* @param array $opts Optional array containing general options for all requests.
* @return array Array containing keys from requests array and values of arrays each containing data (response, null if response empty or error), info (curl info, null if error), and error (error string if there was an error, otherwise null).
*/
function multi(array $requests, array $opts = [])
{
    // create array for curl handles
    $chs = [];
    // merge general curl options args with defaults
    $opts += [CURLOPT_CONNECTTIMEOUT => 3, CURLOPT_TIMEOUT => 3, CURLOPT_RETURNTRANSFER => 1];
    // create array for responses
    $responses = [];
    // init curl multi handle
    $mh = curl_multi_init();
    // create running flag
    $running = null;
    // cycle through requests and set up
    foreach ($requests as $key => $request) {
        // init individual curl handle
        $chs[$key] = curl_init();
        // set url
        curl_setopt($chs[$key], CURLOPT_URL, $request['url']);
        // check for post data and handle if present
        if ($request['post_data']) {
            curl_setopt($chs[$key], CURLOPT_POST, 1);
            curl_setopt($chs[$key], CURLOPT_POSTFIELDS, $request['post_array']);
        }
        // set opts 
        curl_setopt_array($chs[$key], (isset($request['opts']) ? $request['opts'] + $opts : $opts));
        curl_multi_add_handle($mh, $chs[$key]);
    }
    do {
        // execute curl requests
        curl_multi_exec($mh, $running);
        // block to avoid needless cycling until change in status
        curl_multi_select($mh);
    // check flag to see if we're done
    } while($running > 0);
    // cycle through requests
    foreach ($chs as $key => $ch) {
        // handle error
        if (curl_errno($ch)) {
            $responses[$key] = ['data' => null, 'info' => null, 'error' => curl_error($ch)];
        } else {
            // save successful response
            $responses[$key] = ['data' => curl_multi_getcontent($ch), 'info' => curl_getinfo($ch), 'error' => null];
        }
        // close individual handle
        curl_multi_remove_handle($mh, $ch);
    }
    // close multi handle
    curl_multi_close($mh);
    // return respones
    return $responses;
}

To use this function, you can call it like so:

$responses = multi([
    'google' => ['url' => 'http://google.com', 'opts' => [CURLOPT_TIMEOUT => 2]],
    'msu' => ['url'=> 'http://msu.edu']
]);

And, then you can cycle through the responses:

foreach ($responses as $response) {
    if ($response['error']) {
        // handle error
        continue;
    }
    // check for empty response
    if ($response['data'] === null) {
        // examine $response['info']
        continue;
    }
    // handle data
    $data = $response['data'];
    // do something extraordinary
}

While the above function is helpful for a few requests, if you need to make a large number of requests (perhaps more than 5), then instead you should have a look at the rolling curl library, which makes better use of resources.

And your significant other said you couldn’t multitask 🙂

5 thoughts on “Making Concurrent cURL Requests Using PHP’s curl_multi* Functions”

    1. This particular task isn’t CPU heavy. Rather, without the parallel processing of requests, the requests would occur in sequence (i.e., one after the other), and the script would spend the vast majority of time waiting for responses over the network. By using the parallel version, the script waits for all of the responses at once, so less CPU cycles are wasted blocking for the responses.

Leave a Reply

Your email address will not be published. Required fields are marked *