Ruby Blocks, Procs, Lambdas

I’m trying to get my head around how Ruby’s various closure-like things work. This is what I’ve concluded from a bit of playing around and with some help from Google. Not sure how accurate it is though. I’ve tried the following code with Ruby 1.9.1 and 1.9.2 with the same results.

You can grab all the code in one go from https://gist.github.com/706651 if you want to play with it.

All Ruby methods can take a block as their final argument. yield will run the code in that block and block_given? will test the existence of that block.

The block can be anonymous:

def anonymous_block_test()
  if block_given?
    yield
  end
end

anonymous_block_test do
  puts "yield runs an anonymous block param"
end
# => yield runs an anonymous block param
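
yield can also pass arguments to the block and hands back whatever the block evaluates to. A minimal sketch, where transform_each is a made-up helper name and not part of the original gist:

```ruby
# Hypothetical helper: yields each number to the block and collects the results.
def transform_each(numbers)
  return numbers unless block_given?  # no block given: hand the input back untouched
  results = []
  numbers.each do |n|
    results << yield(n)  # yield passes n to the block and returns the block's value
  end
  results
end

p transform_each([1, 2, 3]) { |n| n * 10 }
# => [10, 20, 30]
p transform_each([1, 2, 3])
# => [1, 2, 3]
```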

Or the block can be given a name, using the & prefix. yield works the same, regardless of whether the block is named:

def named_block_test(&ablock)
  if block_given?
    yield
  end
end

named_block_test do 
  puts "yield also runs a named block param"
end 
# => yield also runs a named block param

A named block parameter is implicitly converted to a Proc object, so it can also be run with .call:

def named_block_proc_test(&ablock)
  ablock.call
  puts ablock.class
end

named_block_proc_test do 
  puts "named block params can be run with .call"
end 
# => named block params can be run with .call
# => Proc
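
The conversion seems to work in the other direction too: prefixing a Proc with & at the call site hands it to the method as its block. A small sketch, where run_block and double_all are names invented for illustration:

```ruby
# Hypothetical method that only knows how to yield.
def run_block
  yield(21)
end

# Hypothetical method: passes an existing Proc on to Array#map as its block.
def double_all(numbers)
  doubler = Proc.new { |n| n * 2 }
  numbers.map(&doubler)
end

doubler = Proc.new { |n| n * 2 }
puts run_block(&doubler)  # & turns the Proc back into the method's block
# => 42
p double_all([1, 2, 3])
# => [2, 4, 6]
```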

Methods will only accept a single block argument, whether named or anonymous, and it must be the last argument given. However, Proc objects can be explicitly created and passed to methods as regular arguments.
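
As a rough illustration of passing Procs as ordinary arguments, here is a method that takes two of them. apply_both and its parameter names are invented for this sketch:

```ruby
# Hypothetical method: first and second are plain arguments that happen to be Procs.
def apply_both(value, first, second)
  second.call(first.call(value))
end

add_one = Proc.new { |n| n + 1 }
double  = lambda { |n| n * 2 }

puts apply_both(5, add_one, double)
# => 12
puts apply_both(5, double, add_one)
# => 11
```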

There are multiple ways to create Proc objects:

Using Proc.new:

doubler = Proc.new {|n| n * 2 }
puts doubler.class
# => Proc
puts doubler.call(10)
# => 20

Using the kernel method lambda:

doubler_lambda = lambda {|n| n * 2 }
puts doubler_lambda.class
# => Proc
puts doubler_lambda.call(10)
# => 20 

Using the kernel method proc:

doubler_proc = proc {|n| n * 2 }
puts doubler_proc.class
# => Proc
puts doubler_proc.call(10)
# => 20 

Despite the above examples, Proc.new and lambda are not functionally identical. To further confuse things, in Ruby 1.8 proc was equivalent to lambda, but in Ruby 1.9+ proc is equivalent to Proc.new.
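
Since all of them report Proc as their class, one way to tell them apart in Ruby 1.9+ is Proc#lambda?. The sketch below also includes the 1.9 "stabby" -> literal, which is another way to write a lambda; lambda_flags is just a helper name made up here:

```ruby
# Hypothetical helper: asks each kind of Proc whether it has lambda semantics.
def lambda_flags
  [
    Proc.new { |n| n }.lambda?,  # created with Proc.new
    proc { |n| n }.lambda?,      # created with Kernel#proc (1.9+ behaviour)
    lambda { |n| n }.lambda?,    # created with Kernel#lambda
    ->(n) { n }.lambda?          # created with the -> literal
  ]
end

p lambda_flags
# => [false, false, true, true]
```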

Proc objects generated by lambda will check the number of arguments they are given when their .call method is called. Proc objects generated by Proc.new will not.

When given the correct number of arguments, they behave the same:

test_proc = proc {|n| puts "I can haz #{n}?"}
test_lambda = lambda {|n| puts "I can haz #{n}?"}

test_proc.call("proc")
# => I can haz proc?

test_lambda.call("lambda")
# => I can haz lambda?

But only lambda-defined Proc objects will complain if given too few arguments:


begin
  test_proc.call()
rescue => e
  puts "Can't haz proc!"
  puts e.message
end
# => I can haz ?

begin
  test_lambda.call()
rescue => e
  puts "Can't haz lambda"
  puts e.message
end
# => Can't haz lambda
# => wrong number of arguments (0 for 1)

Or too many arguments:

begin
  test_proc.call("one","two")
rescue => e
  puts "Can't haz proc!"
  puts e.message
end
# => I can haz one?

begin
  test_lambda.call("one", "two")
rescue => e
  puts "Can't haz lambda"
  puts e.message
end
# => Can't haz lambda
# => wrong number of arguments (2 for 1)
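
The same leniency shows up with more than one parameter. As far as I can tell, a non-lambda Proc fills missing parameters with nil, drops extras, and unpacks a single array argument across its parameters, while a lambda does none of that. The method names below are invented for this sketch, and the exact error text differs between Ruby versions:

```ruby
# Hypothetical helper: calls a two-parameter proc with whatever it is given.
def pair_from_proc(*args)
  pair = proc { |a, b| [a, b] }
  pair.call(*args)
end

# Hypothetical helper: same thing with a lambda, reporting the error as a string.
def pair_from_lambda(*args)
  pair = lambda { |a, b| [a, b] }
  pair.call(*args)
rescue ArgumentError => e
  "ArgumentError: #{e.message}"
end

p pair_from_proc(1)        # => [1, nil]  missing parameter becomes nil
p pair_from_proc(1, 2, 3)  # => [1, 2]    extra argument is dropped
p pair_from_proc([1, 2])   # => [1, 2]    single array is unpacked
p pair_from_lambda(1, 2)   # => [1, 2]
puts pair_from_lambda([1, 2])  # prints an ArgumentError message (wording varies by version)
```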

Blocks have access to the values of variables in the context in which they were created:

def block_bind
  yield
end
x = 10
puts block_bind {x+=10}
# => 20

def named_block_bind(&prc)
  prc.call()
end
puts named_block_bind {x +=10 }
# => 30

As a corollary to the above, they don’t have access to the local variable values in the context in which they are called:

def test_with_var 
  x = 100
  yield
  puts "inner x:  #{x}"
  # still 100: the block changed the x from the scope where it was created
end

x = 10
test_with_var { x += 10}
puts "outer x: #{x}"
# => inner x:  100
# => outer x: 20

As the same variable binding process happens with Proc objects, we can make something that looks pretty much like a closure: a Proc object containing a bit of code, with bound values, that can be assigned to a variable, passed around between methods and executed when needed.
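
A counter is probably the smallest example of this. In the sketch below (make_counter is a made-up name), each returned lambda keeps its own count variable alive after the method has finished:

```ruby
# Hypothetical factory: returns a lambda that closes over the local variable count.
def make_counter
  count = 0
  lambda { count += 1 }
end

counter = make_counter
counter.call
counter.call
puts counter.call
# => 3

other = make_counter  # a second call creates a fresh, independent count
puts other.call
# => 1
```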

A Proc object carries a Binding object, which can be accessed via the .binding method. Unfortunately, in the Ruby 1.9 versions tested here, the Binding class doesn’t provide any methods for examining its contents, but the kernel method eval will evaluate a string in the context of a Binding object, so you can determine the bound values of variables:

def proc_bind(param)
  return Proc.new {}
end 

param = "param value in calling context"
pb = proc_bind("param value in creation context")

puts eval("param", pb.binding)
# => param value in creation context

The above works just the same for lambda or Proc.new.
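
Newer Rubies add a direct accessor: Binding#local_variable_get arrived in Ruby 2.1, which is later than the versions tested above. Assuming a Ruby of at least that version, the eval call can be replaced like this (proc_bind_modern and bound_param are names invented for the sketch):

```ruby
# Hypothetical copy of proc_bind: the empty Proc captures the method's local param.
def proc_bind_modern(param)
  Proc.new {}
end

# Hypothetical helper: reads the captured local without eval (needs Ruby 2.1+).
def bound_param(prc)
  prc.binding.local_variable_get(:param)
end

pb = proc_bind_modern("param value in creation context")
puts bound_param(pb)
# => param value in creation context
```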

There’s another fundamental difference between Proc objects created by Proc.new and those created by lambda, which becomes apparent if you try to call return inside a Proc block.

Calling return in a Proc.new-generated Proc object is like calling return in a block. A block doesn’t have its own scope from which to return so, as you might expect, this fails because there’s nowhere to return from:

proc_returner = Proc.new {return}
begin
  proc_returner.call()
  puts "after calling return in a proc"
rescue => e
  puts "Fails"
  puts e.message
  puts e.backtrace
end
# => Fails
# => unexpected return
# => ruby_proc_test.rb:209:in `block in <main>' 
# => ruby_proc_test.rb:212:in `call'
# => ruby_proc_test.rb:212:in `<main>'

But when you are in a method, you can call return from a Proc.new Proc object and it will return from that method:

def return_test
  proc_returner = Proc.new{
    puts caller() # just to show where we are
    return()}
  begin
    proc_returner.call()
    puts "Never gets here, the return in the Proc returns from the method"
  rescue => e
    puts "Fails"
    puts e.message
  end
end
return_test
# => ruby_proc_test.rb:234:in `call'
# => ruby_proc_test.rb:234:in `return_test'  
# => ruby_proc_test.rb:241:in `<main>'
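
The same rule applies to ordinary blocks, and it is often useful: a return inside a block passed to each leaves the whole method early. A minimal sketch, with first_even as an invented method name:

```ruby
# Hypothetical method: returns the first even number, or nil if there is none.
def first_even(numbers)
  numbers.each do |n|
    return n if n.even?  # returns from first_even, not just from the block
  end
  nil  # only reached when no even number was found
end

p first_even([1, 3, 4, 6])
# => 4
p first_even([1, 3])
# => nil
```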

Perhaps more surprisingly, if a Proc.new object is created outside a method and passed in as a parameter, calling return in that Proc doesn’t return from the method that calls it. It tries to return from where it was created, and fails:

def return_test_two(prc) 
  begin 
    prc.call()
    puts "After calling proc"
  rescue => e
    puts "Fails"
    puts e.message
    puts e.backtrace
  end
end
return_test_two(proc_returner)
# => Fails
# => unexpected return
# => ruby_proc_test.rb:210:in `block in <main>'  #block is in <main> not in return_test_two
# => ruby_proc_test.rb:255:in `call'
# => ruby_proc_test.rb:255:in `return_test_two'
# => ruby_proc_test.rb:263:in `<main>'

And if you create a Proc.new Proc inside a method, return it and try to run it, you might expect it to fail because you’re not in a method to return from. In fact, it fails because it is trying to return from the method in which it was created and you’re not in that method any more:

def make_a_returner
  return Proc.new {return()}
end
prc = make_a_returner
begin
  prc.call
rescue => e
  puts "Fails"
  puts e.message
  puts e.backtrace
end
# => Fails
# => unexpected return
# => ruby_proc_test.rb:277:in `block in make_a_returner' #can't return from here, we're in <main>
# => ruby_proc_test.rb:281:in `call'
# => ruby_proc_test.rb:281:in `<main>'

Blocks in lambda-generated Proc objects don’t actually behave like blocks; they behave more like functions. If you call return in the block in a lambda then you will return from the lambda, not from the surrounding method:

def lambda_return_test
  returner = lambda{return()}
  begin
    returner.call()
    puts "returns from lambda and continues in method"
  rescue => e
    puts "Fails"
    puts e.message
  end
end
lambda_return_test
# => returns from lambda and continues in method

So you can safely pass a lambda with a return in it between functions and nothing will blow up when you call it:

lam = lambda {return()}
def lambda_return_test_two(lam) 
  begin 
    lam.call()
    puts "returns from lambda and continues in method"
  rescue => e
    puts "Fails"
    puts e.message
    puts e.backtrace
  end
end
lambda_return_test_two(lam)
# => returns from lambda and continues in method
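
Because return only leaves the lambda, it can also be used to hand a value back early. For a non-lambda Proc, the closest equivalent I know of is next, which leaves just the block. A sketch with invented method names:

```ruby
# Hypothetical method: return inside the lambda hands a value back to the caller of .call.
def size_label_lambda(n)
  labeller = lambda do |value|
    return "big" if value > 10  # leaves only the lambda
    "small"
  end
  "lambda says #{labeller.call(n)}"
end

# Hypothetical method: next does the same job inside a non-lambda Proc.
def size_label_proc(n)
  labeller = proc do |value|
    next "big" if value > 10    # leaves only the block, handing back a value
    "small"
  end
  "proc says #{labeller.call(n)}"
end

puts size_label_lambda(50)
# => lambda says big
puts size_label_lambda(5)
# => lambda says small
puts size_label_proc(50)
# => proc says big
```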

Blocks share the local variables of the scope in which they are written, but from Ruby 1.9 block parameters are always local to the block, even when an outer variable has the same name. So:

x = 1
y = 1
loop {|x| 
  x = 10
  y = 10
  break}
p x    #=> 1  x is a block param and so is local to the block
p y    #=> 10 y isn't a block param, so is the y from the enclosing scope
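
Ruby 1.9 also lets you declare extra block-local variables after a semicolon in the parameter list, and a variable first assigned inside a block is not visible outside it. A short sketch of both, using method names made up for the example:

```ruby
# Hypothetical demo: x is declared block-local after the semicolon, y is not.
def block_local_demo
  x = 1
  y = 1
  [10].each do |n; x|
    x = n  # changes only the block-local x
    y = n  # changes the y from the enclosing scope
  end
  [x, y]
end

# Hypothetical demo: z is first assigned inside the block, so it stays in the block.
def inner_only_demo
  [1].each { z = 5 }
  defined?(z).nil?  # true: no z exists out here
end

p block_local_demo
# => [1, 10]
p inner_only_demo
# => true
```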

As previously mentioned, Proc objects have access to the variables in the scope in which they were created:

x = 10
prc = Proc.new {puts x+5}
lam = lambda {puts x+5} 

prc.call
# => 15
lam.call
# => 15
p x
# => 10

This access is by-reference – if you change the value of a variable inside a Proc, it has effects outside the Proc:

prc = Proc.new {puts x+=5}
lam = lambda {puts x+=5} 

prc.call
# => 15
lam.call
# => 20
p x
# => 20

But if you pass the variable into the Proc explicitly as a parameter, it becomes a block parameter, and reassigning it inside the Proc has no effect on the variable outside:

prc = Proc.new {|x| puts x+=5}
lam = lambda {|x| puts x+=5} 
prc.call(x)
# => 25
lam.call(x)
# => 25
p x
# => 20
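
One caveat worth sketching: the parameter is a separate variable, but it still refers to the same object. Reassigning the parameter leaves the caller's variable alone, whereas mutating the object it points to is visible outside. The method name below is invented for the example:

```ruby
# Hypothetical demo: compares reassigning a block parameter with mutating its object.
def rebinding_vs_mutation
  list = [1, 2]
  rebind = lambda { |l| l = l + [3]; l }  # points the block-local l at a new array
  mutate = lambda { |l| l << 3 }          # changes the array that both names refer to

  rebind.call(list)
  after_rebind = list.dup  # snapshot: list is still [1, 2] here
  mutate.call(list)
  [after_rebind, list]
end

p rebinding_vs_mutation
# => [[1, 2], [1, 2, 3]]
```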