If you use PHP and are familiar with its error handling system, you know that by default there's an entire class of errors, E_NOTICE, that you don't see (PHP's default error handling settings are equivalent to error_reporting(E_ALL & ~E_NOTICE)). But you're really supposed to at least develop, if not deploy, with notice errors on.
Notice-level errors come up most often when you index into an array that doesn't have that index defined.
So if I were to say:
<?php
$arr = array();
$foo = $arr['bar'];
?>
that would give me a notice error. So, you have to have things like:
<?php
$foo = (isset($arr['bar']) ? $arr['bar'] : NULL);
?>
all over the place. Very irritating. Perl has a special operator for this very reason: defined-or, written //, with //= as its assignment form (Perl 6 has it, and so does Perl 5 from version 5.10 on).
Anyway, PHP doesn't have that, so typing the line above all the time gets a little tedious. But what I discovered today is more sinister.
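Before getting to the sinister part, a quick aside on the tedium. One way to cut down on the typing is to wrap the isset() check in a small helper. This is only a sketch: arrayGet() is a made-up name, not a built-in and not something from my project.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns $arr[$key] when it is
// set and non-NULL, otherwise $default, without raising a notice.
function arrayGet($arr, $key, $default = NULL) {
    return isset($arr[$key]) ? $arr[$key] : $default;
}

$arr = array('bar' => 'hello');

$foo      = arrayGet($arr, 'bar');     // key exists, so we get its value
$missing  = arrayGet($arr, 'nope');    // key missing, falls back to NULL
$fallback = arrayGet($arr, 'nope', 0); // key missing, falls back to 0
```

Note that this hands back a copy of the value, so it only helps with reading, not with assigning into the array.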
I have a function called getPath() that will take a string expression that looks like "foo|bar|baz" and index into an array. So if you make the following getPath() call:
<?php
$foo = &getPath($data, 'foo|bar|baz');
?>
that would be equivalent to saying:
<?php
$foo = &$data['foo']['bar']['baz'];
?>
Don't ask me why I have this function, I can't tell you. It's part of a secret larger project. Just believe me that it makes a lot of sense for me to have it.
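For the curious, here is roughly how a path string turns into a list of array keys. It is a minimal sketch that assumes keys are separated by pipes and never contain one; pathToKeys() is a name invented for this example.

```php
<?php
// Hypothetical helper for illustration: split "foo|bar|baz" into its keys.
function pathToKeys($path) {
    // trim() drops stray leading/trailing pipes so explode() does not
    // produce empty keys at either end
    return explode('|', trim($path, '|'));
}

$keys = pathToKeys('|foo|bar|baz|'); // array('foo', 'bar', 'baz')
```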
Now, the important thing to notice is that I'm taking a reference in the code up there. First of all, if the expression I give getPath() results in a large array, I'd rather take a reference to it than copy the whole thing and waste space. More significantly, I want to be able to get a reference to a piece of a data structure, and then assign to that. So, for instance, I have something like the following setPath() function that I want to behave like this:
<?php
function setPath(&$data, $path, $value){
    $foo = &getPath($data, $path);
    $foo = $value;
}
?>
The actual specifics (function names, details, etc.) are different than that, but that gives you the idea. So now you understand the second reason why I wanted to be able to get a reference out of getPath().
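If the difference between copying and taking a reference is fuzzy, this small sketch shows it with plain array indexing and no helper functions. The variable names and values are made up for the example.

```php
<?php
$data = array('foo' => array('bar' => array('baz' => 'old')));

// Plain assignment copies the value: changing the copy leaves $data alone.
$copy = $data['foo']['bar']['baz'];
$copy = 'changed copy';
$afterCopy = $data['foo']['bar']['baz']; // still 'old'

// Reference assignment aliases the element: writing to $ref writes to $data.
$ref = &$data['foo']['bar']['baz'];
$ref = 'new';
$afterRef = $data['foo']['bar']['baz'];  // now 'new'

unset($ref); // drop the alias so later code cannot write through it by accident
```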
Now we're going to get to the sinister part.
Take a look at the previous version of my getPath() function:
1: <?php
2: function & getPath(&$data, $path){
3:     if(is_array($data)){
4:         $path_parts = explode('|', trim($path, '|'));
5:         $c = count($path_parts);
6:         $temp = &$data;
7:         for($n=0;$n<$c;$n++){
8:             if(isset($temp[$path_parts[$n]])){
9:                 $temp = &$temp[$path_parts[$n]];
10:             }else{
11:                 return NULL;
12:             }
13:         }
14:         return $temp;
15:     }else{
16:         return NULL;
17:     }
18: }
19: ?>
You can see I'm being really careful: the isset() check in line 8 makes sure the piece of $temp I take a reference to in line 9 is actually set, so that I don't get NOTICE errors if the $path I give points to somewhere that's undefined. The problem is twofold. First, isset() returns false if the value is NULL, even if you actually set it to NULL. So if I index into a piece of the array that holds NULL, I don't get a reference to that element; because I was being careful, I get back a "new" NULL from line 11, and assigning to that changes nothing in the original data structure. Second, the same thing happens for any path that doesn't exist yet, so a function like setPath() can never add a new entry to the array.
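The isset() behavior is easy to check in isolation. The sketch below compares it with array_key_exists(), a standard PHP function that only asks whether the key is present; the array contents are made up for the example.

```php
<?php
$arr = array('bar' => NULL);

$viaIsset     = isset($arr['bar']);             // false: the value is NULL
$viaKeyExists = array_key_exists('bar', $arr);  // true: the key is there
$missingKey   = array_key_exists('nope', $arr); // false: no such key
```

So isset() cannot tell "set to NULL" apart from "not there at all", which is exactly what trips up the function above.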
At that point I thought I was stuck. But I made sure error_reporting(E_ALL) was on and did some testing. It turns out that this code gives a NOTICE error:
<?php
$arr = array();
$foo = $arr['bar'];
?>
But this code doesn't!
<?php
$bar = &$blah['bazooooooo']['foo']['roar'];
?>
even if the array doesn't exist yet (yes, that last bit was actual code I used to test it). It turns out that if I then assign the string "foo" to $bar, the following array is created (this is the var_dump() output):
array(1) {
  ["bazooooooo"]=>
  array(1) {
    ["foo"]=>
    array(1) {
      ["roar"]=>
      &string(3) "foo"
    }
  }
}
So the point is that trying to take the value of a non-existent array index is a NOTICE error, but taking a reference to something that doesn't exist is just fine: PHP quietly creates it for you.
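One side effect worth knowing about: the missing path gets created the moment the reference is taken, even if nothing is ever assigned to it. A small sketch with made-up keys:

```php
<?php
$blah = array();
$existsBefore = array_key_exists('foo', $blah); // false: nothing there yet

// Taking the reference raises no notice, but it creates the whole path.
$bar = &$blah['foo']['roar'];

$existsAfter = array_key_exists('foo', $blah);  // true: 'foo' now exists
$leaf = $blah['foo']['roar'];                   // NULL until something is assigned

$bar = 'value';                                 // writes through to $blah
```

In other words, merely looking something up by reference can leave NULL entries behind in the array, which may or may not matter for your data.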
So that's great! I was able to change my getPath() function into this, and it worked fine:
<?php
function & getPath(&$data, $path){
    if(!is_array($data)){
        // a by-reference function should return a variable, not a literal
        $null = NULL;
        return $null;
    }else{
        $path_parts = explode('|', trim($path, '|'));
        $c = count($path_parts);
        $temp = &$data;
        for($n=0;$n<$c;$n++){
            // taking a reference creates any missing index instead of raising a notice
            $temp = &$temp[$path_parts[$n]];
        }
        return $temp;
    }
}
?>
Problem solved. Note that I changed a little bit stylistically as well. If you have a condition in which one branch corresponds to a check for an error condition that is only a few lines long, and the other branch has many more lines, it's typically better to put the quick error clause closer to the condition, which is what I did above.
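To see the whole thing working end to end, here is a self-contained sketch that pairs a getPath() like the one above with the setPath() idea from earlier. The sample data is invented, and the guard clause returns a variable holding NULL (my own tweak for this sketch) because a function that returns by reference is meant to return a variable rather than a literal.

```php
<?php
// Walk $data along a "foo|bar|baz" style path and return a reference
// to the element found there, creating missing levels along the way.
function &getPath(&$data, $path) {
    if (!is_array($data)) {
        $null = NULL; // return a variable, not a literal, from a by-reference function
        return $null;
    }
    $temp = &$data;
    foreach (explode('|', trim($path, '|')) as $part) {
        $temp = &$temp[$part]; // no isset() check needed when taking a reference
    }
    return $temp;
}

// Assign $value to whatever element $path points at.
function setPath(&$data, $path, $value) {
    $target = &getPath($data, $path);
    $target = $value;
}

$data = array('foo' => array('bar' => NULL));

setPath($data, 'foo|bar', 'replaced'); // overwrites an element that held NULL
setPath($data, 'foo|new|leaf', 42);    // creates a path that did not exist
```

Both cases that broke the previous version, an existing NULL and a brand new path, now end up in $data.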
So, hopefully we all learned a little bit about PHP we didn't know before through my little exercise.