Kenneth Graunke | 5 Aug 2012 09:04

i965 texturing bugfix

While working on

https://bugs.freedesktop.org/show_bug.cgi?id=52129

I discovered that we have a nasty bug in our texturing code, dating all
the way back to the advent of the new FS backend.  If you generate a
sampler message whose subexpressions (like LOD or shadow comparitor)
require SENDs to generate, we end up clobbering the MRFs containing the
texcoords, causing all kinds of nonsense.

This showed up with my textureGrad rework on Gen4-5, where I made the
LOD log2(sqrt(...)), both of which involve math operations, aka SENDs.
We would load the U-coordinate in m2, clobber it to do math, then go
ahead and do the sampler send.

This could probably also happen on Gen6+ with, say, nested texture
calls, or perhaps pull constants/UBO loads.  Worth fixing regardless.

Piglit-tested on G965, G45, Ironlake, and Sandybridge.  *Not* tested
on Ivybridge (I moved this weekend and can't find my power brick).
Also not oglconform tested.
Kenneth Graunke | 5 Aug 2012 09:04

[PATCH 1/4] i965/fs: Move message header and texture offset setup to generate_tex().

Setting the texture offset bits in the message header involves very
specific hardware register descriptions.  As such, I feel it's better
suited for the lower level "generate" layer that has direct access to
the weird register layouts, rather than at the fs_inst abstraction layer.

This also parallels the approach I took in the VS backend.

Signed-off-by: Kenneth Graunke <kenneth <at> whitecape.org>
---
 src/mesa/drivers/dri/i965/brw_fs.h           |  1 +
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp    | 21 +++++++++++++++++++++
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26 +++++---------------------
 3 files changed, 27 insertions(+), 21 deletions(-)

Not a bugfix, but I've been meaning to do this forever, and since I had to
rework fs_visitor::visit(ir_texture *), I figured I'd move some of the
complexity out of it.  I believe Eric had a similar patch in his Ivybridge
texture-from-GRFs branch.

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 031d541..246dcf0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
 <at>  <at>  -164,6 +164,7  <at>  <at>  public:

    int mlen; /**≤ SEND message length */
    int base_mrf; /**≤ First MRF in the SEND message, if mlen is nonzero. */
+   uint32_t texture_offset; /**≤ Texture offset bitfield */
    int sampler;
    int target; /**≤ MRT target. */
(Continue reading)

Kenneth Graunke | 5 Aug 2012 09:04

[PATCH 2/4] i965/fs: Factor out texcoord setup into a helper function.

With the textureRect support and GL_CLAMP workarounds, it's grown
sufficiently that it deserves its own function.  Separating it out
makes the original function much more readable.

While we're refactoring it, tidy up a conditional.  Instead of:

   if (gen < 6 && rect)
      ...
   else if (rect)
      ...

We can do the more readable:

   if (rect) {
      if (gen < 6)
         ...
      else
         ...
   }

Signed-off-by: Kenneth Graunke <kenneth <at> whitecape.org>
---
 src/mesa/drivers/dri/i965/brw_fs.h           |  1 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 38 ++++++++++++++++++++--------
 2 files changed, 28 insertions(+), 11 deletions(-)

Also purely for sanity.  Simplify the function before adding more complexity.

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 246dcf0..1d9c686 100644
(Continue reading)

Eric Anholt | 6 Aug 2012 18:50
Gravatar

Re: [PATCH 2/4] i965/fs: Factor out texcoord setup into a helper function.

Kenneth Graunke <kenneth <at> whitecape.org> writes:

> With the textureRect support and GL_CLAMP workarounds, it's grown
> sufficiently that it deserves its own function.  Separating it out
> makes the original function much more readable.

This looks good.

> While we're refactoring it, tidy up a conditional.  Instead of:

>
>    if (gen < 6 && rect)
>       ...
>    else if (rect)
>       ...
>
> We can do the more readable:
>
>    if (rect) {
>       if (gen < 6)
>          ...
>       else
>          ...
>    }

I don't see this having actually happened in this commit.  Which I'm
fine with, just as long as the commit message and code are consistent :)
_______________________________________________
(Continue reading)

Kenneth Graunke | 5 Aug 2012 09:04

[PATCH 3/4] i965/fs: Don't clobber sampler message MRFs with subexpressions.

Consider a texture call such as:

   textureLod(s, coordinate, log2(...))

First, we begin setting up the sampler message by loading the texture
coordinates into MRFs, starting with m2.  Then, we realize we need the
LOD, and go to compute it with:

   ir->lod_info.lod->accept(this);

On Gen4-5, this will generate a SEND instruction to compute log2(),
loading the operand into m2, and clobbering our texcoord.

Similar issues exist on Gen6+.  For example, nested texture calls:

  textureLod(s1, c1, texture(s2, c2).x)

Any texturing call where evaluating the subexpression trees for LOD or
shadow comparitor would generate SEND instructions could potentially
break.  In some cases (like register spilling), we get lucky and avoid
the issue by using non-overlapping MRF regions.  But we shouldn't count
on that.

Fixes four Piglit test regressions on Gen4-5:
- glsl-fs-shadow2DGradARB-{01,04,07,cumulative}

NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
Signed-off-by: Kenneth Graunke <kenneth <at> whitecape.org>
(Continue reading)

Kenneth Graunke | 5 Aug 2012 09:04

[PATCH 4/4] i965/vs: Don't clobber sampler message MRFs with subexpressions.

See the preceding commit for a description of the problem.

NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
Signed-off-by: Kenneth Graunke <kenneth <at> whitecape.org>
---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 60 ++++++++++++++++++--------
 1 file changed, 43 insertions(+), 17 deletions(-)

No observed benefit to this patch, at least in Piglit, but it certainly
is worth doing.

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index c77dc91..03089d0 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 <at>  <at>  -1786,6 +1786,42  <at>  <at>  vec4_visitor::visit(ir_texture *ir)
    /* Should be lowered by do_lower_texture_projection */
    assert(!ir->projector);

+   /* Generate code to compute all the subexpression trees.  This has to be
+    * done before loading any values into MRFs for the sampler message since
+    * generating these values may involve SEND messages that need the MRFs.
+    */
+   src_reg coordinate;
+   if (ir->coordinate) {
+      ir->coordinate->accept(this);
+      coordinate = this->result;
+   }
(Continue reading)

Eric Anholt | 6 Aug 2012 19:00
Gravatar

Re: [PATCH 4/4] i965/vs: Don't clobber sampler message MRFs with subexpressions.

Kenneth Graunke <kenneth <at> whitecape.org> writes:

> See the preceding commit for a description of the problem.
>
> NOTE: This is a candidate for stable release branches.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
> Signed-off-by: Kenneth Graunke <kenneth <at> whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 60 ++++++++++++++++++--------
>  1 file changed, 43 insertions(+), 17 deletions(-)
>
> No observed benefit to this patch, at least in Piglit, but it certainly
> is worth doing.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index c77dc91..03089d0 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp

> +   src_reg lod, dPdy;
> +   switch (ir->op) {
> +   case ir_txf:
> +   case ir_txl:
> +   case ir_txs:
> +      ir->lod_info.lod->accept(this);
> +      lod = this->result;
> +      break;
> +   case ir_txd:
> +      ir->lod_info.grad.dPdx->accept(this);
(Continue reading)


Gmane