I recently had the problem described in this post:
Since that thread is too old to post a solution in, I am opening a new one with mine.
PHP functions like fopen(), fread() and file_get_contents() have no built-in timeout protection. That means that if the requested file or page exists but the server is slow, or fails to send the 200 OK header, your script will hang.
The set_time_limit() function, which changes the value of the max_execution_time configuration variable, has no effect on these streams. The only way to stop the frozen script is to hit the browser's stop button or to wait until the web server reaches its own time limit and sends an HTTP failure header.
I see this as a shortcoming of these PHP functions.
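(As an aside: on more recent PHP versions the http stream wrapper does accept a 'timeout' context option, which limits how long a plain file_get_contents() call waits on the socket. Whether it covers every stall scenario described above is worth testing yourself; the URL below is just the RFC file from my example.)

```php
// The http wrapper's 'timeout' context option, in seconds, bounds the
// socket wait of a plain file_get_contents() call.
$ctx = stream_context_create(array(
    'http' => array('timeout' => 5.0),
));
// $page = file_get_contents('http://www.rfc-editor.org/rfc/rfc2606.txt', false, $ctx);
$opts = stream_context_get_options($ctx);
```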
I have recently written a little function that simulates file_get_contents() with a timeout using cURL (I also tried and succeeded with fsockopen(), but it is not as fast as cURL).
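For reference, here is a rough sketch of the fsockopen() variant mentioned above: the connect timeout is fsockopen()'s fifth argument, and stream_set_timeout() covers reads once the connection is up. The function name and the HTTP/1.0 request are only illustrative; they are not part of the cURL function below.

```php
// Sketch of a socket-based fetch with timeouts (illustrative only).
function urlGetContentsSocket($host, $path, $timeout = 5) {
    // Fifth argument: seconds to wait for the connection itself
    $fp = @fsockopen($host, 80, $errno, $errstr, $timeout);
    if (!$fp) {
        return false;                  // could not connect in time
    }
    stream_set_timeout($fp, $timeout); // read timeout on the open socket
    fwrite($fp, "GET $path HTTP/1.0\r\nHost: $host\r\nConnection: close\r\n\r\n");
    $response = stream_get_contents($fp);
    fclose($fp);
    // The body starts after the first blank line of the HTTP response.
    $pos = strpos($response, "\r\n\r\n");
    return $pos === false ? $response : substr($response, $pos + 4);
}
```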
The function
/*
UrlGetContentsCurl ( string url [, int timeout [, bool content [, int offset [, int maxlen]]]] )

Arguments:
string url: URL with its protocol. For example:
    http://www.rfc-editor.org/rfc/rfc2606.txt
    ftp://ftp.rfc-editor.org/in-notes/rfc2606.txt
int timeout: time limit in seconds.
    Optional. Default: current value of max_execution_time.
bool content: true to get the content. If false, the function returns the response time
    of the page or file.
    Optional. Default: true.
int offset: offset applied to start the content capture.
    Optional. Default: 0 (start of the page)
int maxlen: number of bytes to capture starting at offset.
    Optional. Default: null (all the file/page)

Returns:
False: failure to connect to the server or to receive a 200 OK status from the server in due time.
String: the actual content of the page/file if [content] is set to true.
Float: time taken to establish the connection and to receive the 200 OK status
    for the requested file/page ([content] set to false).
*/
function UrlGetContentsCurl(){
    // Parse the arguments passed and set default values
    $arg_names = array('url', 'timeout', 'getContent', 'offset', 'maxLen');
    $arg_passed = func_get_args();
    $arg_nb = count($arg_passed);
    if (!$arg_nb){
        echo 'At least one argument is needed for this function';
        return false;
    }
    $arg = array(
        'url'        => null,
        'timeout'    => ini_get('max_execution_time'),
        'getContent' => true,
        'offset'     => 0,
        'maxLen'     => null
    );
    foreach ($arg_passed as $k => $v){
        $arg[$arg_names[$k]] = $v;
    }
    // cURL connection and result
    $ch = curl_init($arg['url']);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Treat HTTP error statuses (404, 500...) as failures, so the function
    // really returns false when no 200 OK arrives, as documented above
    curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
    curl_setopt($ch, CURLOPT_RESUME_FROM, $arg['offset']);
    curl_setopt($ch, CURLOPT_TIMEOUT, $arg['timeout']);
    $result  = curl_exec($ch);
    $elapsed = curl_getinfo($ch, CURLINFO_TOTAL_TIME);
    $CurlErr = curl_error($ch);
    curl_close($ch);
    if ($CurlErr){
        echo $CurlErr;
        return false;
    } elseif ($arg['getContent']){
        // Strict null check so maxLen = 0 means "capture 0 bytes", not "all"
        return $arg['maxLen'] !== null
            ? substr($result, 0, $arg['maxLen'])
            : $result;
    }
    return $elapsed;
}
How to use it?
$url = 'http://www.rfc-editor.org/rfc/rfc2606.txt';
$timeout = 5;
$getContent = true;
$offset = 0;
$maxLen = 50;
echo UrlGetContentsCurl($url, $timeout, $getContent, $offset, $maxLen);
// or
echo UrlGetContentsCurl($url, $timeout, false);
// or
echo UrlGetContentsCurl($url, $timeout);
// or
echo UrlGetContentsCurl($url);
I benchmarked my function against PHP's fopen() and file_get_contents() and noticed no difference in speed.
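For anyone who wants to reproduce the comparison, a minimal timing sketch along these lines works; the helper name is only illustrative, and microtime(true) brackets each call to return the elapsed seconds.

```php
// Illustrative micro-benchmark helper: returns elapsed seconds for one call.
function timeCall($fn) {
    $start = microtime(true);
    $fn();
    return microtime(true) - $start;
}

// Example (network access required), comparing two fetches of the same URL:
// $t1 = timeCall(function () { file_get_contents('http://www.rfc-editor.org/rfc/rfc2606.txt'); });
// $t2 = timeCall(function () { UrlGetContentsCurl('http://www.rfc-editor.org/rfc/rfc2606.txt'); });
```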
Edit: added a User-Agent header, as some sites reject requests that do not send one.