Perl variable scope confusion

I was having a look at some Perl code that used our quite a lot. I’ve never needed to use our and it made me question my understanding of Perl variables and scope. So I thought I’d read up on it a bit. This is how I think it all works. Corrections appreciated.

Perl has two completely separate sets of variables: Package Variables and Lexical Variables.

Package Variables

Perl packages are basically just namespaces. Each package (including the implicit main package) has a symbol table that holds the names defined in that package namespace. You can change the current namespace with a package declaration, such as package Package::Name;

Package names can contain “::” and this is often used to group conceptually related packages. For example, DBD::Pg is a Postgres driver for DBI and DBD::mysql is the MySQL driver for DBI. This is just a useful naming convention, though: as far as Perl is concerned it doesn’t imply any inheritance or other relationship between the packages.

Package Variables are stored in the package symbol table. They are also what you get by default in Perl 5 when you just create a variable.

#!/usr/bin/perl

#define a Package variable (in package "main")
$foo = 10;

# can be accessed by its full name
print "$main::foo\n";
#=>10

# but, as we're in package "main", we don't need to qualify it:
print "$foo\n";
#=>10

# switch to namespace Foo
package Foo;

# Now $foo refers to Foo::foo
$foo = 20;

print "$foo\n";
# =>20

# switch back to namespace main
package main;

#and $foo refers to main::foo
print "$foo\n";
# => 10

#but we can still get at the foo we defined in
#the Foo package:
print "$Foo::foo\n";
# => 20
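
If you want to see the symbol tables themselves, here is a minimal sketch. It relies on the fact that a package's symbol table is available as a hash named after the package plus "::" (see perlmod). The variable names are the ones from the example above, and the Bar package is made up purely for contrast.

```perl
#!/usr/bin/perl
use strict;

# Two package variables with the same short name in different packages
$main::foo = 10;
$Foo::foo  = 20;

# %main:: and %Foo:: are the symbol tables; the keys are the short names.
# Bar is a hypothetical package that never defines a foo.
my $in_main = exists $main::{foo} ? "yes" : "no";
my $in_foo  = exists $Foo::{foo}  ? "yes" : "no";
my $in_bar  = exists $Bar::{foo}  ? "yes" : "no";

print "foo in main: $in_main\n";   # => foo in main: yes
print "foo in Foo:  $in_foo\n";    # => foo in Foo:  yes
print "foo in Bar:  $in_bar\n";    # => foo in Bar:  no
```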

Global variables are generally considered a BadThing™. Although Perl’s Package Variables are namespaced to avoid the collision problems of global variables, they are globally accessible. Allowing any bit of your (or someone else’s) code to change the state of a variable at any point is generally a recipe for an unmaintainable, buggy mess. There may well be situations in which a Package Variable is the right thing to use, but for the most part you’re better off with lexicals.

While we’re on the subject of Globals and Package Variables, it is worth noting that most of Perl’s Special Variables are defined as Package Variables in main. Unlike regular Package Variables they are truly global in the sense that you can refer to them anywhere without a qualifying namespace and you’re still referring to the main variable.

#!/usr/bin/perl

$_ = "test";

print "$_\n";
# => test

print "$main::_\n";
# => test

package Foo;

# still referring to $_ in main
print "$_\n";
print "$main::_\n";

#Foo::_ is a different variable
print "$Foo::_\n";

$Foo::_ = "another test";

# still referring to $_ in main
print "$_\n";

print "$Foo::_\n";

The same arguments about messing with global variables apply to these special variables too, so if you want to change the value of one of these variables in your code without wreaking havoc, you can use local. All this does is allow you to temporarily change the value of a package variable and have it automatically reset to its original value at the end of the current scope. This means other code calling yours won’t have to deal with unexpected side effects to global variables. It’s something you should use sparingly, but there are occasions when it’s useful, for example slurping the contents of a file:

#!/usr/bin/perl

#The <> reads a line, where "line" is defined by the record separator $/
print "Record separator is a newline by default: $/ see?\n";

{
    #undefine $/ in the scope of this block
    local $/ = undef;
    print "Record sep isn't a newline anymore: $/ see? \n";

    #and now you can read a whole file into a string in one go
    #(three-argument open with a lexical filehandle is the safer modern form)
    open my $fh, '<', "myfile" or die "Couldn't open file: $!";
    $string = <$fh>;
    close $fh;
}

print "Record sep is back to a newline: $/ see?\n";
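
One thing worth spelling out is that local is dynamically scoped: the temporary value is seen by any code called from inside the block, not just by the lines written inside it. Here is a small sketch of that, using a made-up package variable $main::level and two made-up subs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hypothetical package variable, fully qualified to keep strict happy
$main::level = "outer";

# Reads whatever value the package variable has at the time of the call
sub show_level { return $main::level }

sub with_local {
    local $main::level = "inner";   # temporary value until this sub returns
    return show_level();            # the callee sees the temporary value
}

my $during = with_local();
my $after  = show_level();          # the original value has been restored

print "during: $during\n";   # => during: inner
print "after:  $after\n";    # => after:  outer
```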

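There are also special variables that aren’t global in this way and don’t require the use of local. Some, like $|, are per-filehandle. The capture variables $1..$9 used in pattern matches are dynamically scoped to the enclosing block. $a and $b used in sort are a slightly different case: they are package variables in the current package (not main) that sort sets for you, and they don’t need to be declared even under strict. Check perlvar for details.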
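
A quick sketch of two of those, assuming nothing beyond core Perl. The strings and numbers are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# $a and $b need no declaration, even under strict
my @sorted = sort { $a <=> $b } (10, 9, 100);

# $1 belongs to the last successful match in the current block
"key=value" =~ /(\w+)=/ or die "no match";
my $before = $1;

my $inner;
{
    "abc" =~ /(b)/ or die "no match";
    $inner = $1;      # the inner match's capture
}

# Once the inner block has ended, $1 is the outer capture again
my $after = $1;

print "@sorted\n";                  # => 9 10 100
print "$before $inner $after\n";    # => key b key
```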

Lexical Variables

Perl 5 introduced lexically scoped variables to avoid the problems caused by using global variables. You can declare a lexical variable using my and this variable will only be visible within the scope in which it is defined.

A scope in Perl can be the whole file, a block (delimited by {}) or the code passed to an eval.

#!/usr/bin/perl

#this is scoped to the file
my $foo = 100;

{
    #we can see it in this block
    print "$foo\n";
    # => 100

    #this is scoped to this block
    my $bar = 50;

}

# you can't see the lexical $bar outside the
# block, so this is referring to $main::bar
print "$bar\n";
# => (an empty line, because $main::bar is undefined)

In the above code, when we refer to $bar in main the lexical variable $bar from the block is no longer in scope. If you refer to a variable that doesn’t exist, Perl assumes you want a package variable with that name and will create it for you. In fact, this is rarely what you want and can lead to difficult-to-detect bugs when, for example, you misspell a variable name. Unless you have a good reason not to, you should always use the strict pragma to avoid this problem. strict will generate a compile-time error if you try to access a variable that hasn’t been declared in this scope using my, unless it is a fully qualified package variable:

#!/usr/bin/perl
use strict;

# This will create a lexical variable under strict
my $foo = 10;

# This will create a package variable under strict
$main::bar = 10;

# But this isn't a lexical and it isn't a fully qualified package variable, so you'll get an error:
$bar = 10;
#=> Global symbol "$bar" requires explicit package name...
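
Going back to block scoping for a moment, here is a small sketch of what happens when an inner block declares a lexical with the same name as an outer one. As far as I understand it, the inner my creates a brand new variable that hides the outer one until the block ends. The @seen array is just a made-up way of recording what each scope sees.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $foo = 100;      # scoped to the file
my @seen;           # records the value visible in each scope

{
    my $foo = 50;   # a new variable that hides the outer $foo
    push @seen, $foo;
}

# the inner $foo is gone, so this is the file-scoped one again
push @seen, $foo;

print "@seen\n";    # => 50 100
```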

Lexical scoping has nothing to do with packages. So:

#!/usr/bin/perl
use strict;

my $message = "hello!";

package Foo;

#this is lexically scoped to the file and is oblivious to the change of package
print "$message\n";
# => hello!

Which, I guess, is a good argument for writing packages in their own files – then your file scoped lexical variables happen to coincide with your package scope.
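
If you do end up with several packages in one file, wrapping each one in its own block seems to get you the same effect. This is only a sketch: the Counter and Greeter packages and their subs are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Counter;
    my $count = 0;                      # visible only inside this block
    sub increment { return ++$count }
}

{
    package Greeter;
    my $greeting = "hello!";            # can't see $count, and vice versa
    sub greet { return $greeting }
}

# The package declarations ended with their blocks, so this is main again
Counter::increment();
my $n = Counter::increment();
my $g = Greeter::greet();

print "$n $g\n";    # => 2 hello!
```

Referring to $count or $greeting out here would be a compile-time error under strict, which is the point.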

Which brings me to our. If you really want to use a package variable under strict and you can’t be arsed typing the fully qualified name, you can use our to declare a lexically scoped alias for it. The alias refers to the variable of that name in the package that is current at the point of the declaration. The older (and now discouraged) vars pragma did a similar thing, but it was package scoped.

#!/usr/bin/perl

use strict;

{
    # access the package variable main::foo inside the current scope without having to fully qualify it
    our $foo;
    $foo = "test";
    print "$foo\n";
    # => test

    # you can assign a value to the package variable at the same time as declaring the alias:
    our $bar = 100;
}

# strict will raise an error if we try to refer to a variable by an alias that is no longer in scope
print "$foo\n";
# => Global symbol "$foo" requires explicit package name

# but "our" just defines a locally scoped *alias* to a package variable to keep strict happy.
# It doesn't magically confer lexical scope to the package variable itself.
# We can still get at the package variable here if we use its fully qualified name:
print "$main::foo\n";
# => test

$main::bar+=10;
print "$main::bar\n";
# => 110
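
Because the alias is lexical, it also ignores later package declarations, in the same way that my variables do. A small sketch of that, with made-up package names Foo and Bar:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Foo;
our $name = "set in Foo";       # alias for $Foo::name, scoped to the file

package Bar;

# Still in the same lexical scope, so $name is still $Foo::name
my $via_alias = $name;

# Assigning through the alias changes the variable in Foo, not in Bar
$name = "changed from Bar";
my $via_full_name = $Foo::name;

print "$via_alias\n";        # => set in Foo
print "$via_full_name\n";    # => changed from Bar
```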

I’m not entirely sure I see what you gain from the use of our. If just using Package Variables in the first place can cause enough problems, what benefit do you get from creating aliases that make them look as though they’re lexically scoped when they aren’t really? Even if there are situations in which a Package Variable would be more suitable than a Lexical Variable, surely sticking with strict and explicitly giving it its fully qualified name would make the code clearer?
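
For what it's worth, one thing our does seem to buy you is typo detection: strict checks unqualified names but never complains about a fully qualified one. Here is a rough sketch that uses string eval so both cases can be compared at runtime. The variable names, including the deliberately misspelled $conuter, are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# With an "our" alias, the misspelled name fails to compile under strict
my $alias_ok = eval q{
    use strict;
    our $counter = 1;
    $conuter = 2;           # typo: not declared anywhere
    1;
};
my $alias_error = $@;

# With fully qualified names, the same typo goes unnoticed
my $qualified_ok = eval q{
    use strict;
    $main::counter = 1;
    $main::conuter = 2;     # typo: quietly creates a new package variable
    1;
};

print "alias version compiled: ",     ($alias_ok     ? "yes" : "no"), "\n";
print "qualified version compiled: ", ($qualified_ok ? "yes" : "no"), "\n";
print "strict said: $alias_error";
```

The first eval fails with the usual Global symbol "$conuter" requires explicit package name error, while the second compiles and runs.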


2 responses to “Perl variable scope confusion”

  1. hey, a related question that i can’t find an answer to:

    i generally use strict; but strict does not complain about a variable prefixed with a package name, EVER. even if that variable totally does not exist. i guess the assumption is that it might be defined/used later in the file, and perl can’t be sure what order you’ll define things in.

    it would be nice if, at least at runtime, perl would complain if you use a variable that has never been defined with any keyword like my or our. any idea how i could get error checking like that working? the only thing i can think of is to put defined() tests around all of my package variables, always.

  2. Note to self – modern perl book (http://modernperlbooks.com/books/modern_perl_2016/05-perl-functions.html#U2NvcGU) notes that ‘our is most useful with package global variables such as $VERSION and $AUTOLOAD. You get a little bit of typo detection’
