Recently on reddit.com/r/perl, pmz posted a question like this:
XML::Twig, get value (or not) without dying
Currently I do:
if (defined $elt->first_child('addr')->first_child('postalCode')) {
    $patient{patient_postal_code} = $elt->first_child('addr')->first_child('postalCode')->text;
}
because if I don't check for "defined" and the resulting value is null, it dies.
Link to the original post: https://www.reddit.com/r/perl/comments/1492sc1/xmltwig_get_value_or_not_without_dying/
On one hand, the question is about how to use XML::Twig. On the other hand, the obvious inconvenience is that when first_child('addr') returns undef, meaning there is no <addr> element underneath, the following call to first_child('postalCode') makes the program die. Generally speaking: in a chain of calls we expect an object at every position, but sometimes there is an undef instead. Given that, is there a way to keep the program from dying and let the entire call chain return undef as soon as an undef is encountered anywhere in the chain?
To formalize the question a bit more generically: assume a class with instance methods a(), b(), and c(). These methods may return an instance of the same class or, occasionally, undef. Consider the following chain of calls starting from $o:
$res = $o->a()->b()->c();
If a() or b() returns undef, the next call in the chain has nothing to be invoked on, and the program dies with a message like this:
Can't call method "c" on an undefined value
This particular message tells us that b() returned undef: since undef is not an object, we cannot call methods on it.
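The failure described above is easy to reproduce. The following is a minimal sketch, assuming a hypothetical toy class named Node whose a(), b(), and c() simply return whatever was stored under that key (or undef). The eval is there only to capture the message for display.

```perl
use v5.36;

# Hypothetical toy class (not from the original post): each method
# returns the stored child node, or undef when there is none.
package Node {
    sub new ($class, %kids) { bless {%kids}, $class }
    sub a ($self) { $self->{a} }
    sub b ($self) { $self->{b} }
    sub c ($self) { $self->{c} }
}

# a() returns a node, but that node has no "b", so b() returns undef.
my $o = Node->new(a => Node->new());

# eval is used here only to capture the error message.
my $res = eval { $o->a()->b()->c() };
my $err = $@;

print "died with: $err";
```

The captured message names c, the method that could not be called, which is how we know the undef came from the call just before it.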
Now, could we rewrite the same program to prevent the above error, so that $res is undef if any of a(), b(), or c() returns undef, and is otherwise the return value of c()?
In some other programming languages, this can be done with a safe-navigation operator, such as ?. in JavaScript or Kotlin:
res = o.a()?.b()?.c();
Or .? in Raku:
$res = $o.a().?b().?c();
However, Perl has nothing similar yet, as of perl 5.38.
A rather intuitive way to rewrite would be something like this:
$res_a = $o->a();
$res_b = $res_a && $res_a->b();
$res = $res_b && $res_b->c();
However, besides making the program much longer and harder to grasp, this rewrite is not generic: it has to be redone for every similar statement with different method names. Not a good strategy.
Meanwhile, here's a super simple and generic way:
$res = eval { $o->a()->b()->c() };
However, eval has a powerful side effect: all exceptions are ignored, while we are only interested in ignoring undefined values. That is a lot more than we want. Even though it looks simple, it is probably not applicable.
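To make the problem with eval concrete, here is a small sketch, assuming a hypothetical class named Flaky whose b() throws a real exception instead of returning undef. The caller ends up with undef in $res either way and cannot tell the two situations apart unless it also inspects $@.

```perl
use v5.36;

# Hypothetical class (an assumption for illustration): b() fails with a
# genuine error that we would normally want to see.
package Flaky {
    sub new ($class) { bless {}, $class }
    sub a ($self) { $self }
    sub b ($self) { die "database connection lost\n" }
    sub c ($self) { 'value' }
}

my $o = Flaky->new;

# The eval-based rewrite from the article.
my $res = eval { $o->a()->b()->c() };
my $swallowed = $@;

# $res is undef, exactly as if b() had merely returned undef.
say defined $res ? "got: $res" : "got: undef";
print "error hidden by eval: $swallowed";
```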
Here is a solution using the Monad design pattern.
The rewritten version looks like this:
$res = SafeNav->wrap($o) ->a()->b()->c() ->unwrap();
SafeNav is defined as follows.
use v5.36;
package SafeNav {
    sub wrap ($class, $o) { bless \$o, $class }
    sub unwrap ($self) { $$self }
    sub AUTOLOAD {
        our $AUTOLOAD;
        my $method = substr $AUTOLOAD, 2 + rindex($AUTOLOAD, '::');
        my ($self, @args) = @_;
        # [a]
        (defined $$self) ?
            __PACKAGE__->wrap( $$self -> $method(@args) ) : # [a.1]
            $self;                                          # [a.2]
    }
    sub DESTROY {}
};
SafeNav is a class that wraps any scalar value and uses AUTOLOAD to respond to all method calls.
The core of the logic is inside AUTOLOAD, at [a]: if we are not wrapping an undef value, we call the original method on it and re-wrap the return value ([a.1]). If we are wrapping an undef, we ignore the method call and simply return ourselves ([a.2]).
Thanks to the AUTOLOAD mechanism, the original form ->a()->b()->c() is kept exactly the same after the rewrite. Let's put both versions side by side for a quick comparison:
$res = $o ->a()->b()->c();
$res = SafeNav->wrap($o) ->a()->b()->c() ->unwrap();
The wrap() at the front, together with unwrap() at the back, forms a clear boundary within which SafeNav is effective. Method calls after unwrap() are not guarded by SafeNav.
With that, we ignore undef values and nothing more. If other kinds of exceptions are thrown from a(), b(), or c(), the program correctly aborts. Of the three ways to rewrite the program proposed in this article, the SafeNav monad is both generic and adds little verbosity to the original program.
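Both properties can be checked with a short script. This is a sketch under assumptions: SafeNav is repeated from above so the script is self-contained, while the Node class and its boom() method are hypothetical and exist only for this demonstration.

```perl
use v5.36;

# SafeNav as defined in the article, repeated for self-containment.
package SafeNav {
    sub wrap   ($class, $o) { bless \$o, $class }
    sub unwrap ($self)      { $$self }
    sub AUTOLOAD {
        our $AUTOLOAD;
        my $method = substr $AUTOLOAD, 2 + rindex($AUTOLOAD, '::');
        my ($self, @args) = @_;
        defined $$self ? __PACKAGE__->wrap( $$self->$method(@args) ) : $self;
    }
    sub DESTROY {}
}

# Hypothetical toy class: a(), b(), c() return the stored child or undef;
# boom() always throws.
package Node {
    sub new ($class, %f) { bless {%f}, $class }
    sub a ($self) { $self->{a} }
    sub b ($self) { $self->{b} }
    sub c ($self) { $self->{c} }
    sub boom ($self) { die "boom\n" }
}

my $full  = Node->new(a => Node->new(b => Node->new(c => 'leaf')));
my $short = Node->new(a => Node->new());    # a() works, b() returns undef

# Every link is present: we get the return value of c().
my $found = SafeNav->wrap($full)->a()->b()->c()->unwrap();

# b() returns undef: c() is skipped and the whole chain yields undef.
my $missing = SafeNav->wrap($short)->a()->b()->c()->unwrap();

# A real exception is not swallowed by SafeNav; the eval here only
# captures it so the script can report it.
my $propagated = eval { SafeNav->wrap($full)->a()->boom()->c()->unwrap(); 1 } ? '' : $@;

say "found: $found";
say "missing: ", defined $missing ? $missing : 'undef';
print "propagated: $propagated";
```

Note that AUTOLOAD calls the wrapped method in list context while wrap() takes exactly one value, so this sketch assumes each method returns a single scalar.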
Original post: https://gugod.org/2023/06/perl-safe-navigation-monad-en/
Top comments (3)
Hmm, I'd argue that something like this would work better:
Advantages:
- It still works if $o has its own methods called wrap and unwrap.
- $o stays at the beginning of the chain, so it's more visible.
- $_wrap and $_unwrap can be imported into lexical (my) variables instead of package (our) variables.

I like this version better, especially since $o remains the "subject".
If I make a CPAN module out of this, I'd probably name those two scalar variables $safenav_begin and $safenav_end instead; otherwise it may not be clear what they are.

Okay, this catches a very small annoying edge case:
People might want to do $o->$_wrap->import(123)->$_unwrap in the rare case that $o has a method called import that does something useful.
Putting the SafeNav import stuff in a separate package to the AUTOLOAD stuff allows that.
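The variant discussed in these comments, where wrapping and unwrapping are lexical code references invoked as $o->$_wrap ... ->$_unwrap, might look roughly like the sketch below. It is a reconstruction under assumptions, not the commenter's code: the SafeNav::Proxy package name and the toy Node class are hypothetical, and the coderefs are defined in place rather than imported from a module.

```perl
use v5.36;

# Hypothetical proxy package: it has no named methods of its own besides
# DESTROY, so calls such as wrap() or unwrap() are forwarded to the
# wrapped object like any other method.
package SafeNav::Proxy {
    sub AUTOLOAD {
        our $AUTOLOAD;
        my $method = substr $AUTOLOAD, 2 + rindex($AUTOLOAD, '::');
        my ($self, @args) = @_;
        return $self unless defined $$self;          # wrapping undef: skip the call
        my $result = $$self->$method(@args);         # called in scalar context
        return bless \$result, __PACKAGE__;          # re-wrap the result
    }
    sub DESTROY {}
}

# Lexical code references; $x->$coderef(...) calls the coderef with $x
# as its first argument, without any method lookup.
my $_wrap   = sub ($o)     { bless \$o, 'SafeNav::Proxy' };
my $_unwrap = sub ($proxy) { $$proxy };

# Hypothetical toy class that happens to have its own wrap() method.
package Node {
    sub new ($class, %f) { bless {%f}, $class }
    sub a ($self) { $self->{a} }
    sub b ($self) { $self->{b} }
    sub c ($self) { $self->{c} }
    sub wrap ($self) { 'gift-wrapped' }
}

my $o = Node->new(a => Node->new());    # a() works, b() returns undef

my $res = $o->$_wrap->a()->b()->c()->$_unwrap;    # chain hits undef at b()
my $own = $o->$_wrap->wrap()->$_unwrap;           # reaches Node's own wrap()

say "res: ", defined $res ? $res : 'undef';
say "own: $own";
```

Because $_wrap and $_unwrap are ordinary lexical variables, they cannot collide with method names on $o, and $o stays at the start of the chain.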