Gangadhar NPK | 3 Sep 15:28

Question regarding upValues in closures

All,
I am trying to understand the implementation of Lua. I am referring to
this document for details
[http://www.tecgraf.puc-rio.br/~lhf/ftp/doc/jucs05.pdf]. I have
a question regarding the implementation of upValues as done for
function closures.
Taking the example given in the document:
function add(x)
  return function(y)
    return x+y
  end
end

add2=add(2)
print(add2(5))

As explained further in the document, the reference to the variable x
is stored in an upValue which can either point to the stack (when
referenced from within the inner function) or can point to itself
(when referenced from the creator of the closure).

Consider the following flow of actions:
1. Take the creation of the closure add2.
2. Function add is called with the parameter 2.
3. Within the inner function of the add function, the reference to the
   parameter x is on the stack (x is actually 2 on the stack now).
4. An inner function is returned to the caller which contains a
   reference to the x on the stack. Of course, this reference to the x on
   the stack will no longer exist, as we have returned from the function
   call and hence the stack frame will not be valid.
[...]


Re: Question regarding upValues in closures

> when does the value x change from being a variable on the stack to a
> value within the upValue.

See the output of luac -l and the CLOSURE and GETUPVAL instructions.

Alex Davies | 3 Sep 16:58

Re: Question regarding upValues in closures

Gangadhar NPK wrote:
> In this scenario, the question I have is - when does the value x
> change from being a variable on the stack to a value within the
> upValue. How is this transition done transparent to the caller. If you
> think this is too basic a question and I ought to figure this out,
> please let me know the source file and I shall try to figure it out.

Lua has a close opcode (OP_CLOSE). It is given a stack position, and any
"open" upvalues, i.e. upvalues still pointing into the stack, above (or equal
to, I believe) that stack position are closed. Closing an upvalue involves
copying the current stack value into the space reserved for a TValue within
the UpVal struct, and setting the upvalue's pointer to point to that TValue.
Thus future GETUPVAL/SETUPVAL instructions won't touch the stack.
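
As a rough illustration of the closing step described above, here is a
minimal C sketch. It is not Lua's actual source: Value and UpVal are cut-down
stand-ins (numbers only), and is_open, close_upval and add2_example are
hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a TValue: numbers only (assumption). */
typedef struct { double n; } Value;

/* Simplified upvalue: v points into the stack while open,
   and at the upvalue's own storage once closed. */
typedef struct {
  Value *v;
  Value own;
} UpVal;

static int is_open(const UpVal *uv) { return uv->v != &uv->own; }

/* Close one upvalue: copy the stack slot, then redirect the pointer. */
static void close_upval(UpVal *uv) {
  assert(uv != NULL);
  if (is_open(uv)) {
    uv->own = *uv->v;
    uv->v = &uv->own;
  }
}

/* Models add2 = add(2); add2(y): x starts in a stack slot, is closed
   when add returns, and survives the slot being overwritten. */
static double add2_example(double y) {
  Value stack[4] = {{0}};
  UpVal x;
  stack[1].n = 2;        /* parameter x of add */
  x.v = &stack[1];       /* open: refers to the stack slot */
  x.own.n = 0;
  close_upval(&x);       /* add returns */
  stack[1].n = 99;       /* slot reused by a later call */
  return x.v->n + y;     /* the inner function still sees x == 2 */
}
```

In this model add2_example(5) yields 7: after the close, reads go through
the same pointer as before, so the inner function never needs to know whether
x lives on the stack or inside the upvalue.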

Close is inserted automatically by the compiler when variables go out of
scope. For example, if you're creating closures inside a for loop it'll
insert a close at the bottom to ensure that the closures created access the
correct iterator variable, but variables declared above that block will not
be closed (as all closures created inside the for loop should share access to
those).

Also, whenever a function returns it closes all upvalues down to reg 0. So
there won't be an OP_CLOSE in your example, it's implicit. This is only done
if Lua has any open upvalues, but one thing I've noticed is that performance
can be further improved (quite noticeably in synthetic compiler shootout
benchmarks) by only doing it if the function returning has sub-protos. (As
there's nearly always an open upvalue somewhere, yet few functions actually
create closures).
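
To make the implicit close at return concrete, here is a small C sketch
under stated assumptions. It is a simplified model rather than the real VM:
State, close_from and on_return are hypothetical names, values are numbers
only, and the open upvalues are assumed to be kept in a list ordered from the
highest stack slot to the lowest.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double n; } Value;   /* numbers only (assumption) */

typedef struct UpVal {
  Value *v;            /* stack slot while open, &own once closed */
  Value own;
  struct UpVal *next;  /* next open upvalue, at a lower stack slot */
} UpVal;

/* Hypothetical VM state: a stack plus the list of open upvalues,
   sorted from highest stack slot to lowest. */
typedef struct {
  Value stack[8];
  UpVal *openupval;
} State;

/* Close every open upvalue at or above `level`; return how many. */
static int close_from(State *L, Value *level) {
  int closed = 0;
  assert(L != NULL);
  while (L->openupval != NULL && L->openupval->v >= level) {
    UpVal *uv = L->openupval;
    L->openupval = uv->next;  /* unlink from the open list */
    uv->own = *uv->v;         /* copy the stack value */
    uv->v = &uv->own;         /* redirect the pointer */
    uv->next = NULL;
    closed++;
  }
  return closed;
}

/* What a return does in this model: close down to the frame base,
   skipping the call entirely when nothing is open. */
static int on_return(State *L, Value *base) {
  return L->openupval ? close_from(L, base) : 0;
}
```

Because the list is sorted, the loop can stop at the first upvalue below the
given level, so upvalues belonging to callers further down the stack stay
open.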

- Alex 

Re: Question regarding upValues in closures

> Also, whenever a function returns it closes all upvalues down to reg 0. 
> So there won't be a OP_CLOSE on your example, it's implicit. This is only 
> done if lua has any open upvalues, but one thing I've noticed is that 
> performance can be further improved (quite noticeably in synthetic 
> compiler shootout benchmarks) by only doing it if the function returning 
> has sub-protos. (As there's nearly always an open upvalue somewhere, yet 
> few functions actually create closures).

Unfortunately, with a one-pass compiler, by the time a return is compiled
we cannot be sure that there won't be a sub-proto later. For instance,
consider a code like this:

function foo (x)
  while true do
    if something then return end   --<<< 1
    b = function () return x end
  end
end

When the return at (1) is compiled, Lua still thinks that the function
does not have a sub-proto. But that return may need to close the upvalues
used by 'b' (if 'something' is true only after some iterations).

-- Roberto

Alex Davies | 5 Sep 07:55

Re: Question regarding upValues in closures

Roberto Ierusalimschy wrote:
> Unfortunately, with a one-pass compiler, by the time a return is compiled
> we cannot be sure that there won't be a sub-proto later. For instance,
> consider a code like this:

It can be implemented in the VM though. Try benchmarking the current
implementation:

if (L->openupval) luaF_close(L, base);
vs
if (L->openupval && cl->p->p) luaF_close(L, base);
or
if (cl->p->p) luaF_close(L, base);

Although probably a minor point in real-world use, it is quite measurable on
the recursion compiler shootout benchmarks. (And yes, it is probably a bad
idea to optimize a VM based on synthetic benchmarks, but interesting
nonetheless?) (It may make more sense to simply rewrite those pieces of code
so that they close all upvalues before running the functions, which results
in a similar speedup.)

- Alex 

Mike Pall | 5 Sep 12:38

Re: Question regarding upValues in closures

Roberto Ierusalimschy wrote:
> Unfortunately, with a one-pass compiler, by the time a return is compiled
> we cannot be sure that there won't be a sub-proto later. For instance,
> consider a code like this:
> 
> function foo (x)
>   while true do
>     if something then return end   --<<< 1
>     b = function () return x end
>   end
> end
> 
> When the return at (1) is compiled, Lua still thinks that the function
> does not have a sub-proto. But that return may need to close the upvalues
> used by 'b' (if 'something' is true only after some iterations).

I've done the following for the LJ2 bytecode:
- The RET* and CALLT* opcodes do not close upvalues.
- The UCLO opcode has a base register plus a jump target. It
  closes all upvalues at or above the base register and
  then jumps to the target. By default this is the next opcode.
- UCLO are often followed by jumps, so these are merged in.
- When a return or tail call is emitted, the HAS_RETURN flag is set.
- When a sub-function is emitted, the HAS_CLOSURE flag is set. If
  the HAS_RETURN flag was set before, the FIXUP_RETURN flag is set.
- Returns and tail calls get an additional UCLO if the HAS_CLOSURE
  flag is set.
- In the rare case that the FIXUP_RETURN flag is set at the end of
  a function definition, the bytecode is scanned for all RET* and
  CALLT*. They are then moved to the end of the bytecode and get
[...]

