PHP 7 character set issues


#1

Hi. I’ve encountered a couple of issues with the handling of ASCII and EBCDIC in PHP 7 (these examples work correctly in PHP 5). I have PHP 7 CGIs working successfully using the required environment variables and IHS definitions, but these specific situations don’t seem to handle the conversion of character sets correctly.

The first is the exec() function. This can return an array of output generated by the externally called program. I have a PHP script calling a REXX script with the REXX returning multiple lines of output via the SAY statement. When the output arrives back to PHP it appears to be ASCII encoded EBCDIC and the array is not correctly populated.

Here is the PHP script called exec1:

#!/ported/php/bin/php-cgi
<?php
if($_GET["conv"]=="yes")
 $p=" | iconv -f IBM-1047 -t ISO8859-1";
else
 $p="";
$output=array();
exec("./exec2 input one".$p,$output,$rc);
echo "RC=$rc";
echo "<br />output[0]=".$output[0];
echo "<br />output[1]=".$output[1];
echo "<br />output[2]=".$output[2];
echo "<br />output[0]=".iconv("IBM-1047","ISO8859-1",$output[0]);
echo "<br />output[1]=".iconv("IBM-1047","ISO8859-1",$output[1]);
echo "<br />output[2]=".iconv("IBM-1047","ISO8859-1",$output[2]);
?>

Here is the external REXX called exec2:

/*rexx*/
parse arg in
say "hello world"
say in
say "last line"
exit 4

Navigating to exec1 in the browser shows that all of the exec2 output lines are placed as a single string in the first element of array $output and the string is EBCDIC so doesn’t display very well. Adding query string exec1?conv=yes passes exec2 output through iconv and $output is then populated correctly (this is not a very good solution). It seems as though the exec() function isn’t handling character conversion correctly as the output generated by REXX exec2 is EBCDIC.
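As a stopgap that avoids piping through the iconv command, the conversion and line splitting can be done in PHP after exec() returns. This is a minimal sketch, assuming the whole EBCDIC output arrives in a single array element (as observed above) and that iconv() maps the EBCDIC newline to "\n" on this platform. The function name convert_exec_output is hypothetical; it is not part of PHP or of the port.

```php
<?php
// Hypothetical helper (not a PHP built-in): convert each element that exec()
// returned from one character set to another, then split it into real lines.
function convert_exec_output(array $output, string $from, string $to): array
{
    $lines = [];
    foreach ($output as $chunk) {
        $converted = iconv($from, $to, $chunk);
        if ($converted === false) {
            // Conversion failed: keep the original bytes rather than lose data.
            $converted = $chunk;
        }
        // Drop trailing newlines so they do not produce an empty last line.
        foreach (explode("\n", rtrim($converted, "\n")) as $line) {
            $lines[] = $line;
        }
    }
    return $lines;
}

// Possible use in exec1 (assumption: same charset names as the script above):
// exec("./exec2 input one", $output, $rc);
// $output = convert_exec_output($output, "IBM-1047", "ISO8859-1");
```

Unlike the shell pipe, this keeps $rc as the return code of exec2 itself instead of that of iconv. It only masks the symptom and does not explain why exec() receives unconverted output.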

The second situation involves POSTing form data back to the PHP script. It is also coming back as EBCDIC (probably due to IHS processing) and consequently the PHP $_POST superglobal is not populated correctly. Here is an example PHP script called testpost:

#!/ported/php/bin/php-cgi
<?php
$cginame=$_SERVER["PHP_SELF"];
echo "<form action=\"$cginame\" method=\"post\" name=\"form1\">\n";
echo "<br /><strong>Edit1:</strong>";
echo "<input type=\"text\" value=\"".$_POST["edit1"]."\" ".
     "name=\"edit1\" size=\"20\">\n";
echo "<br /><strong>Edit2:</strong>";
echo "<input type=\"text\" value=\"".$_POST["edit2"]."\" ".
     "name=\"edit2\" size=\"20\">\n";
echo "<br /><input type=\"submit\" value=\"Refresh\" name=\"refresh\">\n";
echo "</form>\n";
echo "<br />Edit1: ".$_POST["edit1"];
echo "<br />Edit2: ".$_POST["edit2"];
echo "<br />";
print_r($_POST);
foreach($_POST as $n=>$v)
 echo "<br />".iconv("IBM-1047","ISO8859-1",$n).",$v";
?>

It’s not beautiful to look at but when run in the browser it shows that $_POST contains all of the variable information as a single string in a single array element. The script also passes $_POST through iconv to show that the data is actually EBCDIC by converting it to ASCII.
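A possible workaround while the cause is unknown is to decode the raw request body by hand instead of relying on $_POST. This is only a sketch under assumptions: that php://input still exposes the raw body under php-cgi here, that the entire body (including the & and = separators) is in IBM-1047, and that the form is sent as application/x-www-form-urlencoded. The function name decode_post_body is hypothetical.

```php
<?php
// Hypothetical helper (not a PHP built-in): convert a raw urlencoded POST body
// to the target character set, then parse it into name => value pairs.
function decode_post_body(string $raw, string $from, string $to): array
{
    $converted = iconv($from, $to, $raw);
    if ($converted === false) {
        // Conversion failed: report no fields instead of garbage.
        return [];
    }
    $fields = [];
    parse_str($converted, $fields); // handles '+' and %XX decoding
    return $fields;
}

// Possible use in testpost (assumption: the raw body is readable here):
// $post = decode_post_body(file_get_contents("php://input"),
//                          "IBM-1047", "ISO8859-1");
// echo "<br />Edit1: ".$post["edit1"];
```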

It’s most likely that something is misconfigured somewhere but it’s not clear what that might be so any guidance is appreciated.

Richard.


#2

I’m not able to easily test your second case. I was able to run your first case, though, from the USS command line (not by navigating to the file in a browser). Here’s the output I received (with newlines added for readability):

 X-Powered-By: PHP/7.0.5
 Content-type: text/html; charset=UTF-8

 RC=4<br />
 output[0]=hello world<br />
 output[1]=input one<br />
 output[2]=last line<br />
 output[0]=\150\145%%?\040\167?\162%\144<br />
 output[1]=\151>\160\165\164\040?>\145<br />
 output[2]=%/\163\164\040%\151>\145

Is this what you get if you run this from the command line?

Can you please provide the output of php --version? Here’s what I get:

 bash-4.3$ php --version
 PHP 7.0.5 (cli) (built: Jun 19 2017 12:14:11) ( NTS )
 Copyright (c) 1997-2016 The PHP Group
 Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies
 bash-4.3$

#3

Jerry, I’m a colleague of Richard.

$ php --version
PHP 7.0.5 (cli) (built: Jun 13 2017 21:46:49) ( NTS )
Copyright (c) 1997-2016 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies

VERSION.ZOS shows
Tool: php
Version: 7.0.5
Build Number: 012

Did I miss a newer version to download?

– Manfred


#4

Hi,
Build 12 is the current build.
Could you provide the output when you run exec1 from the USS command line?


#5

It looks like my test was run against an “in-progress” version, not the latest released version. My apologies.


#6

Just a suggestion:
is exec1 ASCII-encoded and exec2 EBCDIC?


#7

An internal ticket has been created: USSP-855


#8

When I execute exec1 from an OMVS command line I get similar output to what I see in the browser so this is consistent. Note that the CGI_ environment variables are configured the same in both environments so I would expect this (both are set to EBCDIC). Both file exec1 and file exec2 contain EBCDIC and neither file is tagged. I tried changing the tagging but this didn’t affect the output.

This is the OMVS command line output I see when invoking exec1 without arguments:

X-Powered-By: PHP/7.0.5                                                                                     
Content-type: text/html; charset=UTF-8                                                                      
                                                                                                            
RC=4<br />output[0]=-                                                                                       
                     ---@¦--------¤£@--                                                                     
                                       ---¢£@---                                                            
                                                -<br />output[1]=<br />output[2]=<br />output[0]=hello world
input one                                                                                                   
last line                                                                                                   
<br />output[1]=<br />output[2]=

If I invoke as “exec1 conv=yes” then the following is displayed:

X-Powered-By: PHP/7.0.5                                                                                                             
Content-type: text/html; charset=UTF-8                                                                                              
                                                                                                                                    
RC=0<br />output[0]=hello world<br />output[1]=input one<br />output[2]=last line<br />output[0]=ÇÁ%%?-Ï?Ê%À<br />output[1]=Ñ>øÍÈ-?>
Á<br />output[2]=%/ËÈ-%Ñ>Á

This result seems to indicate that the PHP script cannot handle the EBCDIC output generated by the externally called REXX. This scenario worked correctly in PHP 5.

The second example displays a simple form in the browser and POSTs the query string back to itself. It’s very basic and yet it seems that PHP is unable to translate the returned data correctly. Most likely this is due to translation performed by Apache. Once again this scenario worked correctly in PHP 5.

The output for php --version is the same as documented by Manfred.

Thanks for raising the internal ticket, Tatyana. I can’t log in to see its content, so is there some way I can track it?

Richard.


#9

Hi Richard,

Could you try adding this directive to httpd.conf:

SetEnv _BPXK_AUTOCVT ON

Then restart the server.
The internal ticket is for our internal use only.


#10

Tatyana, the SetEnv _BPXK_AUTOCVT ON statement is already present in the httpd.conf file for the IHS. This was required to enable PHP to read the php.ini file as it is tagged as EBCDIC. Richard.


#11

Hi Tatyana, in addition to what Richard said:

If the php.ini (whose content is IBM-1047) were not tagged as IBM-1047 then php would assume that the php.ini is ASCII. So it seems that the tagging of php.ini (in case it is IBM-1047) is required, and thus:

SetEnv _BPXK_AUTOCVT ON

is required as well.


#12

Could you provide your httpd.conf?


#13

Our httpd.conf is split over multiple files and includes variables defined in envvars files. Probably not too convenient to post them here so can I send them to you some other way?

By the way, were you able to reproduce the issues using the examples that I provided? If you can reproduce then no need to provide any files. If you aren’t encountering the issues then would it be possible for you to provide a minimal recommended httpd.conf and maybe php.ini so that we can verify settings? It might be that we have something misconfigured.

Thanks, Richard.


#14

Tatyana, as Richard said, his httpd.conf configuration is pretty complicated. I have a test system where the httpd.conf is considerably simpler.

Let’s do it as follows: I will put together a minimal httpd.conf on my test httpd server and provide it. I’ll try to do it some time this week. Then you will have something to feed the httpd server on your system.

– Manfred


#15

Hi Tatyana,
Here is the stuff.

envvars: https://paste.fedoraproject.org/paste/7tLMMoxQ54m9AjFeHcURLw

httpd.conf: https://paste.fedoraproject.org/paste/AxSyPaV0hhde7q5OjrdO7A

You have to adjust some things. At least server name and presumably the directory names.

When I run exec1 I get:

RC=4
output[0]=�����@����������@�������@����
output[1]=
output[2]=
output[0]=hello world input one last line
output[1]=
output[2]=

Hope this helps to track things down.

Thanks a lot for your support.

– Manfred


#16

Hi,

Yes, we reproduced this issue, and setting _BPXK_AUTOCVT solved the problem.
I looked at your config and found several differences. From README.ZOS:

Recommended settings for IBM HTTP Server are:

CharsetSourceEnc ISO8859-1
CharsetDefault ISO8859-1

These environment variables can be set in Apache configuration file, e.g.

SetEnv CGI_HEADER_ENCODING EBCDIC
SetEnv CGI_BODY_ENCODING NONE

Could you please test with exactly these settings (including SetEnv _BPXK_AUTOCVT ON, of course)?


#17

Hm, I did as suggested, i.e. httpd.conf now contains:

SetEnv _BPXK_AUTOCVT ON
SetEnv CGI_HEADER_ENCODING EBCDIC
SetEnv CGI_BODY_ENCODING NONE

CharsetSourceEnc ISO8859-1
CharsetDefault ISO8859-1

When running exec1 I get:

����ʀa�?�����$�)hello world input one last line �ʀa�?�����$�)�ʀa�?�����$)�ʀa�?�����$�)��%%?��?�%���>��Ȁ?>��%/�Ȁ%�>���ʀa�?�����$�)�ʀa�?�����$)

which looks bad.

Anything I missed or mixed up?


#18

Yes,

export _CEE_RUNOPTS="FILETAG(AUTOCVT,AUTOTAG) POSIX(ON)"


#19

I added it to envvars: no change at all. Then I even added it to httpd.conf. Also no change.

Does the setting work differently for you?


#20

I’m not sure if it’s related to this particular issue, but I’ve also found that file uploads don’t appear to work in PHP 7. This can be observed by adding enctype="multipart/form-data" to my POST example above, along with an input field of type="file". Adding var_dump($_FILES) shows an empty array, as does var_dump($_POST), so not even ASCII characters are getting back to the PHP script. This same scenario works correctly with Rocket PHP 5.4.4, so it seems something in PHP 7 is flushing the multipart payload.

None of these specific issues were encountered in Rocket PHP 5.4.4.
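
For reference, the upload test described in this post can be reduced to something like the following. It is a sketch only: the helper name upload_test_form and the field names file1 and upload are made up for illustration, and the original test script was not posted.

```php
<?php
// Hypothetical helper: build a minimal multipart upload form that posts
// back to the given script. Field names are illustrative only.
function upload_test_form(string $action): string
{
    $a = htmlspecialchars($action, ENT_QUOTES);
    return "<form action=\"$a\" method=\"post\" enctype=\"multipart/form-data\">\n"
         . "<input type=\"file\" name=\"file1\">\n"
         . "<input type=\"submit\" value=\"Upload\" name=\"upload\">\n"
         . "</form>\n";
}

// Possible use in a CGI script: show the form, then dump what came back.
// echo upload_test_form($_SERVER["PHP_SELF"]);
// var_dump($_FILES);
// var_dump($_POST);
```

If uploads were working, $_FILES would be expected to contain a file1 entry after submitting; in the failing case reported here both dumps are empty.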