"Unable to fork" on a UNIX environment

I have an application which has performed fine in two test environments but only works intermittently on a live environment. In order to debug I created the script below to test the area of the main application that fails.

Basically it has to call a shell script on the UNIX environment which sets some environment variables, and then it has to call an executable which performs a password reset on an application called Siebel Analytics. The problem is that it sometimes runs and sometimes comes back with an “Unable to fork” error. I have spent 3 days tearing my hair out trying to find out what causes it, but no luck. The live environment and one of the test environments are identical, but the issue only seems to occur on live. That makes me think it may be down to live having more concurrent users, although the web server only ever shows around 5% CPU usage.

You’ll be able to see a couple of functions I’ve tried to use instead of system(), but they aren’t any more successful.

Any help much appreciated.


<?php
	// set the error level to show everything
	error_reporting(E_ALL);
	$runShell = 'yes';
	$runChangePassword = 'yes';
	$useBasicSystem = 'yes';
	$shellResult = '';
	$shellResult2 = '';
	$shellResult3 = '';
	$shellResult4 = '';

	// print header and page title
	echo '<html><head><title>';
	if ($_SERVER['HTTP_HOST'] == 'xxx.xxx.xxx.xxx')
	{
		echo 'REF';
		$shellScriptExe = '/websrvr_iplanet/analytics753110/Web/Servlet/php/sa-cli.sh';
		$passwordChangeLocation = '/websrvr_iplanet/analytics753110/Bin/nqschangepassword.exe';
	} else {
		echo 'LIVE';
		$shellScriptExe = '/appserver_iplanet/analytics753110/Web/Servlet/php/sa-cli.sh';
		$passwordChangeLocation = '/appserver_iplanet/analytics753110/Bin/nqschangepassword.exe';
	}
	echo '</title>
	<style type="text/css">
		body, body td {
			font: normal 12px Verdana;
			color: #333333;}
		h1 {
			color: #ff6600;
			font: bold 14px Verdana;
			margin: 20px 0 5px 0;
			padding: 3px;
			background: #ededed;
			border-bottom: 1px #aaaaaa solid;}
		h2 {
			color: #003366;
			font: bold 12px Verdana;
			margin: 10px 0 0 0;}
	</style>
	</head>';
	
	
	function mysystem($command)
	{
 		if (!($p=popen($command,"r")))
		{
   			return 126;
 		}
		$out = '';
		while (!feof($p))
		{
			$line=fgets($p,1000);
			$out .= $line;
		}
		pclose($p);
		return $out;
	}

	function mysystem2($command, $hide=false)
	{		
		if ( !($p=popen("($command)2>&1","r")) )
		{
			return 126;
		}
		while (!feof($p))
		{
			$l=fgets($p,1000);
			if (!$hide)
			{
				print $l;
			}
		}
		return pclose($p);
	}
	
	if ($runShell == 'yes')
	{
		// define shell script location
		$shellResult =  '<h1>SHELL SCRIPT ['.date('d-m-Y H:i:s').']</h1><strong>Shell Script Command</strong><br />'.$shellScriptExe.'<br />
		<h2>Shell Script Results</h2><pre>';
		echo $shellResult;
		if ($useBasicSystem == 'no')
		{
			$shellScript = mysystem ($shellScriptExe);
			$shellResult2 = '</pre><strong>Result: </strong>'.$shellScript;
		} else {
			$shellScript = system ($shellScriptExe, $ret);
			$shellResult2 = '</pre><strong>Command: </strong>'.$shellScript.'<br /><strong>Return Value: </strong>'.$ret;
		}
		echo $shellResult2;
	}
	
	if ($runChangePassword == 'yes')
	{
		$shellResult3 = '<h1>PASSWORD CHANGE ['.date('d-m-Y H:i:s').']</h1>';
		// define password change location
		$params['dsnName'] = 'AnalyticsWeb';
		$params['dbUser'] = 'USER1';
		$params['oldPassword'] = 'PASSWORD';
		$params['newPassword'] = 'PASSWORD2';
		$shellResult3 .=  '<strong>Parameters being passed to the password change executable</strong>
		<table>';
		foreach ($params as $key=>$value)
		{
			$shellResult3 .= '<tr><td>'.$key.'</td><td>'.$value.'</td></tr>';
		}
		$shellResult3 .= '</table><br />';
		$passwordChangeExe = $passwordChangeLocation.' -d '.$params['dsnName'].' -u '.$params['dbUser'].' -p '.$params['oldPassword'].' -n '.$params['newPassword'];
		$shellResult3 .= '<strong>Password Change Command</strong><br />'.$passwordChangeExe.'<br />
		<h2>Password Change Results</h2><pre>';
		echo $shellResult3;
		if ($useBasicSystem == 'no')
		{
			$passwordChange = mysystem ($passwordChangeExe);
			$shellResult4 = '</pre><strong>Result: </strong>'.$passwordChange;
		} else {
			$passwordChange = system ($passwordChangeExe,$ret2);
			// $shellResult4 = `$passwordChangeExe; echo $?`;
			$shellResult4 = '</pre><strong>Command: </strong>'.$passwordChange.'<br /><strong>Return Value: </strong>'.$ret2;
		}
		
		echo $shellResult4;
	}
	
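	// append the results to the log; check fopen() before writing
	$file = fopen('shell_log.htm','a+');
	if ($file === false || fwrite ($file,$shellResult.$shellResult2.$shellResult3.$shellResult4) === false)
	{
		echo 'Could not write result to file';
	}
	if ($file !== false)
	{
		fclose($file);
	}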
	
echo '<body></body></html>';

?>

An example of the output when it fails is shown below:


<html><head><title>LIVE</title>
	<style type="text/css">
		body, body td {
			font: normal 12px Verdana;
			color: #333333;}
		h1 {
			color: #ff6600;
			font: bold 14px Verdana;
			margin: 20px 0 5px 0;
			padding: 3px;
			background: #ededed;
			border-bottom: 1px #aaaaaa solid;}
		h2 {
			color: #003366;
			font: bold 12px Verdana;
			margin: 10px 0 0 0;}
	</style>
	</head><h1>SHELL SCRIPT [21-05-2004 09:44:45]</h1><strong>Shell Script Command</strong><br />/appserver_iplanet/analytics753110/Web/Servlet/php/sa-cli.sh<br />
		<h2>Shell Script Results</h2><pre><br />
<b>Warning</b>:  system(): Unable to fork [/appserver_iplanet/analytics753110/Web/Servlet/php/sa-cli.sh] in <b>/appserver_iplanet/analytics753110/Web/Servlet/php/shell.php</b> on line <b>88</b><br />
</pre><strong>Command: </strong><br /><strong>Return Value: </strong>-1<h1>PASSWORD CHANGE [21-05-2004 09:44:45]</h1><strong>Parameters being passed to the password change executable</strong>
		<table><tr><td>dsnName</td><td>AnalyticsWeb</td></tr><tr><td>dbUser</td><td>USER1</td></tr><tr><td>oldPassword</td><td>PASSWORD</td></tr><tr><td>newPassword</td><td>PASSWORD2</td></tr></table><br /><strong>Password Change Command</strong><br />/appserver_iplanet/analytics753110/Bin/nqschangepassword.exe -d AnalyticsWeb -u USER1 -p PASSWORD -n PASSWORD2<br />
		<h2>Password Change Results</h2><pre><br />
<b>Warning</b>:  system(): Unable to fork [/appserver_iplanet/analytics753110/Bin/nqschangepassword.exe -d AnalyticsWeb -u USER1 -p PASSWORD -n PASSWORD2] in <b>/appserver_iplanet/analytics753110/Web/Servlet/php/shell.php</b> on line <b>118</b><br />
</pre><strong>Command: </strong><br /><strong>Return Value: </strong>-1<body></body></html>

Az,

Two possibilities: 1- Resource exhaustion. When you get the “Unable to fork” error, try logging in to that box directly to see whether you can run a bunch of commands at the same time, like shells or xterms that create multiple ptys.

2- Security. Your live site may have chroot’ed php, or may be running in safe mode, or may have some other thing going on that is prohibiting you from doing what you want. Bring this up with your web admin and/or system admin.

=Austin

Thanks for the reply. We’ve ruled out resource exhaustion as the web server is never more than 5% utilised and the system resources are never really challenged by anything that’s happening on the box.

PHP isn’t running in safe mode and we have chmodded all the relevant directories and files to 777 to try and debug. I’m not UNIX literate though, and I don’t understand the chroot you mention, although I’ll have a google now.

I think we’ve narrowed it down to the file descriptors on the box. For some reason there are bugs in the Solaris libc which restrict the use of fopen()/fdopen() to 255 files, which is too low a limit for this type of server. (see http://www.thetaphi.de/php-ressources/)

I am now trying to get PHP recompiled to start scripts using CGI as apparently that gets round the file descriptor issues. I also have Sun analysing the box to see if they can uncover the issue.

I’ll look up the chroot thing and keep you posted on our attempts at solving this. Any more suggestions are welcome though.

Thanks

Looking at chroot I don’t think that can be the issue as the application works some of the time and not others. If chroot had been applied then it should return consistent failures or successes (unless my understanding is wrong).

Az,

The 255 files thing is pretty much common knowledge. Sorry, I didn’t realize you had that much going on.

The solution there will depend on how you’ve got your server configured, but under SunOS there was a kernel parameter you could tweak to increase the max handles/process (as well as max handles/total, which doesn’t seem to be your issue).

If you can’t control the handles issue via the Solaris kernel, there’s probably a library you can run that will swap these things for you – you’d have to link apache against that lib, which means a rebuild, but it’s not that big a deal.

=Austin

We raised the max handles from 12228 to 65536, which I believe is the maximum, and it made no difference. Well, I think it was the max handles; it was certainly one of the parameters shown by the ulimit command.
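
For anyone who wants to confirm which limit was actually changed, here is a minimal sketch in C that reads the per-process open-file limit (the value `ulimit -n` reports) through the POSIX `getrlimit()` call. The function name `nofile_limits` is made up for illustration and nothing here is specific to iPlanet or PHP.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Fetch the per-process open-file limits (what `ulimit -n` reports).
 * Returns 0 on success, -1 on failure. */
int nofile_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    *soft = rl.rlim_cur;  /* limit currently enforced */
    *hard = rl.rlim_max;  /* ceiling the soft limit can be raised to */
    return 0;
}
```

Calling `nofile_limits(&soft, &hard)` from inside the affected process (rather than from a login shell) shows the values that process really runs with, which may differ from what an interactive `ulimit` prints.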

It’s not Apache, it’s iPlanet, which is again something I know nothing about, but I’ll ask the sys admins if there is anything they can do about linking it to a lib in a similar manner to Apache. The server is a Sun F15K, one of the biggest boxes Sun make, and the application I have written runs quite happily on a P3 500MHz Windows 2000 box, so it must be a bug somewhere in the Solaris setup rather than a resources issue.

The guy from Sun has suggested 6 patches which should be installed to solve an issue where gethostbyname fails if more than 255 file descriptors are already open. I’m not sure that will help as I don’t think we’re using gethostbyname anywhere, but I’m following their advice.
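
To see whether a process really has that many descriptors open, one rough approach is to probe each descriptor number with `fcntl()`. The sketch below is C, the helper name `count_open_fds` is hypothetical, and it only counts descriptors in the process that calls it, so it would have to run inside the web server process to say anything about that process.

```c
#include <fcntl.h>

/* Count descriptors currently open in this process by probing 0..max_fd-1.
 * fcntl(fd, F_GETFD) returns -1 (errno EBADF) for a descriptor that is not open. */
int count_open_fds(int max_fd)
{
    int fd, n = 0;

    for (fd = 0; fd < max_fd; fd++) {
        if (fcntl(fd, F_GETFD) != -1)
            n++;
    }
    return n;
}
```

For example, `count_open_fds(1024)` returns how many of the first 1024 descriptor numbers are in use; a result near or above 255 would be consistent with the limit discussed in this thread.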

thanks again

Here’s the highly detailed explanation from Sun:

We have downloaded the PHP source code and this is what we have discovered.

First, we need to know what generates the error ‘Unable to fork’. It’s here:

php-4.3.6/ext/standard/exec.c

#ifdef PHP_WIN32
    fp = VCWD_POPEN(d, "rb");
#else
    fp = VCWD_POPEN(d, "r");
#endif
    if (!fp) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to fork [%s]", d);
        efree(d);
        efree(buf);
#if PHP_SIGCHILD
        signal (SIGCHLD, sig_handler);
#endif
        return -1;
    }

Now we know that when fp is NULL, you get the error.

fp = VCWD_POPEN(d, "r");

We need to know what VCWD_POPEN is. It is defined here:

php-4.3.6/TSRM/tsrm_virtual_cwd.h

#define VCWD_POPEN(command, type) virtual_popen(command, type TSRMLS_CC)

Then, what is virtual_popen?

php-4.3.6/TSRM/tsrm_virtual_cwd.c

#else /* Unix */

CWD_API FILE *virtual_popen(const char *command, const char *type TSRMLS_DC)
{
int command_length;
char *command_line;
char *ptr;
FILE *retval;

    command_length = strlen(command);

    ptr = command_line = (char *) malloc(command_length + sizeof("cd  ; ") + CWDG(cwd).cwd_length+1);
    if (!command_line) {
            return NULL;
    }
    memcpy(ptr, "cd ", sizeof("cd ")-1);
    ptr += sizeof("cd ")-1;

    if (CWDG(cwd).cwd_length == 0) {
            *ptr++ = DEFAULT_SLASH;
    } else {
            memcpy(ptr, CWDG(cwd).cwd, CWDG(cwd).cwd_length);
            ptr += CWDG(cwd).cwd_length;
    }

    *ptr++ = ' ';
    *ptr++ = ';';
    *ptr++ = ' ';

    memcpy(ptr, command, command_length+1);
    retval = popen(command_line, type);    /* <------------ */

    free(command_line);
    return retval;

}

Now we know PHP uses popen(). When popen() returns NULL, you get the problem.
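
To make that failure path concrete, here is a minimal C sketch of the same pattern outside PHP: call `popen()`, treat a NULL return as the error case, and otherwise read the command's output. The function name `run_and_capture` is made up for illustration, and the `strerror(errno)` message is only a hint, since what `errno` holds after a failed `popen()` depends on the platform.

```c
#define _POSIX_C_SOURCE 200809L  /* expose popen()/pclose() under strict -std modes */
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Run `cmd` through popen(), copy its output into buf (NUL-terminated,
 * truncated to fit), and return pclose()'s status, or -1 if popen() fails. */
int run_and_capture(const char *cmd, char *buf, size_t buflen)
{
    FILE *fp;
    size_t used = 0;

    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    fp = popen(cmd, "r");
    if (fp == NULL) {
        /* This is the branch PHP reports as "Unable to fork". */
        fprintf(stderr, "popen(%s) failed: %s\n", cmd, strerror(errno));
        return -1;
    }

    /* Append line by line until the buffer is full or the output ends. */
    while (used + 1 < buflen &&
           fgets(buf + used, (int)(buflen - used), fp) != NULL)
        used += strlen(buf + used);

    /* Drain anything that did not fit so the child is not blocked on a full pipe. */
    while (fgetc(fp) != EOF)
        ;

    return pclose(fp);
}
```

A call such as `run_and_capture("echo hello", out, sizeof out)` fills `out` with the command's output; a return of -1 with the message on stderr corresponds to the situation described above.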

Just like fopen(), popen is a standard C IO function.

 #include <stdio.h>

 FILE *popen(const char *command, const char *mode);

fopen() and popen() both return a pointer to the data type (structure) FILE, which holds the file descriptor in an unsigned char, i.e. it has a range of 0-255. That’s why you get the problem once descriptors 0-255 are all in use and the next one would be 256 or higher.
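
The effect of a one-byte descriptor field can be shown with a toy example in C. The struct `toy_file` and the helper `fd_fits` below are hypothetical stand-ins, not the real Solaris FILE layout; the sketch only illustrates that a value above 255 cannot survive being stored in an unsigned char (assuming the usual 8-bit char).

```c
/* Hypothetical stand-in for a stdio stream record; NOT the real Solaris FILE. */
struct toy_file {
    unsigned char fd;  /* one byte: can only hold 0..255 */
};

/* Returns 1 if fd survives being stored in the one-byte field, 0 otherwise. */
int fd_fits(int fd)
{
    struct toy_file f;

    f.fd = (unsigned char)fd;  /* values above 255 wrap modulo 256 */
    return (int)f.fd == fd;
}
```

Here `fd_fits(255)` returns 1 while `fd_fits(256)` returns 0, because 256 wraps to 0 when squeezed into the byte.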

This problem is mentioned on this web page:

http://www.thetaphi.de/php-ressources/

which you already found last week.

Despite what it says, it is not a bug in Solaris. Solaris just complies with the standard.

The PHP application should have used the UNIX system call function - open() instead.

 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>

 int open(const char *path, int oflag, /* mode_t mode */...);

As you can see the open() function returns an int rather than char, so the 256 limit does not apply to it.

Unless someone is going to change the code in PHP, I think you need to use the CGI workaround provided on that web page.

Wow, that’s what I call support, man. I’m kinda impressed.

Sike

Yes, it really is a very detailed explanation, something you don’t see very often. Awesome!

Hi. I apologize for bringing up this old thread but I’m having the same problem and unfortunately, the page that Sun’s response links to is no longer valid:

“Unless someone is going to change the code in PHP, I think you need to use the CGI workaround provided on that web page.”

This is the page that’s supposed to have the workaround but no longer exists: http://www.thetaphi.de/php-ressources/

Az or anyone else: has this problem been fixed and if so, how?

I’m also trying to execute some server-side code from PHP by using either the system() function or the backtick operator, and I get the same “Unable to fork” error message. My server is a Sun box running Solaris 8 and the web server is iPlanet 6.0.

I’ve googled this problem for 2 days in a row with no luck. Most of the people who have had this problem and fixed it are running on Windows, so their fixes don’t apply here.

Any suggestions will be greatly appreciated.

Wayback machine to the rescue.

Thanks kyber. I’ll forward that info to our sys admins and we’ll see how that goes.