Mathias Herberts (JIRA | 18 Jun 2012 15:24

[Created] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Mathias Herberts created PIG-2760:
-------------------------------------

             Summary: resources added with a relative path are added to the JobXXXX jar file under their absolute path
                 Key: PIG-2760
                 URL: https://issues.apache.org/jira/browse/PIG-2760
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.10.0
            Reporter: Mathias Herberts

When registering a local resource using a relative path, the resource is added to the JobXXXX jar under its
absolute path.

If a pig script contains the following:

REGISTER etc/foo;

and is executed from a directory /PATH/TO/DIR, the JobXXXX jar file will contain the following:

/PATH/TO/DIR/etc/foo

instead of

etc/foo

which was the previous behavior.

--
This message is automatically generated by JIRA.

Cheolsoo Park (JIRA | 18 Jun 2012 18:41

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396016#comment-13396016
] 

Cheolsoo Park commented on PIG-2760:
------------------------------------

This is a regression of PIG-2623:

{code}
-        File f = new File(path);
+        File f = FileLocalizer.fetchFile(pigContext.getProperties(), path).file;
{code}

where fetchFile() converts a relative path to an absolute path.

In fact, converting a relative path to an absolute path isn't an issue in itself, but the leading "/" causes registered
files not to be found. That is fixed in PIG-2745.
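
To make the path conversion concrete, here is a minimal Java sketch that uses only java.io.File. It does not call Pig's FileLocalizer. The class name RelativeVsAbsoluteSketch and the helper entryNameFor are hypothetical, introduced for illustration only, and the helper merely assumes that a file's path is used as the jar entry name once any leading "/" is stripped.

```java
import java.io.File;

final class RelativeVsAbsoluteSketch {
    /**
     * Hypothetical helper: the entry name a file would get if its path were
     * used as-is, with separators normalized and a leading "/" stripped.
     */
    static String entryNameFor(File f) {
        String p = f.getPath().replace(File.separatorChar, '/');
        return p.startsWith("/") ? p.substring(1) : p;
    }

    public static void main(String[] args) {
        File asWritten = new File("etc/foo");         // path as written in REGISTER
        File resolved = asWritten.getAbsoluteFile();  // resolved against the working directory

        // The first name is independent of where the script runs;
        // the second one embeds the working directory.
        System.out.println(entryNameFor(asWritten));
        System.out.println(entryNameFor(resolved));
    }
}
```

Run from /PATH/TO/DIR, the second name should begin with PATH/TO/DIR, which matches the entry described in the report once the leading "/" is removed.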

Thanks!


Mathias Herberts (JIRA | 19 Jun 2012 10:14

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396596#comment-13396596
] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

Converting a relative path to an absolute one may be an issue when accessing a resource in a UDF using getResourceAsStream.

Previously, if you used 'REGISTER foo/bar;' in a script, you could access 'bar' by calling
this.getClass().getClassLoader().getResourceAsStream("foo/bar"); in your UDF, and this would
work regardless of the directory the pig script is run from.

If the relative path is converted to an absolute one (with no leading '/'), the argument to
getResourceAsStream will depend on the directory from which the pig script is run, which
rather defeats usability.
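
The lookup problem can be reproduced outside Pig. The sketch below is a standalone illustration, not Pig code: it writes a throwaway jar containing a single entry, opens it with a URLClassLoader, and checks whether getResourceAsStream finds a given name. The class JarLookupSketch and the method resolves are hypothetical names, and the temporary jar merely stands in for the JobXXXX jar.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

final class JarLookupSketch {
    /** Builds a one-entry jar and reports whether a class loader resolves lookupName in it. */
    static boolean resolves(String entryName, String lookupName) {
        try {
            Path jar = Files.createTempFile("job", ".jar");
            try {
                // Write a single entry under the requested name.
                try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
                    out.putNextEntry(new JarEntry(entryName));
                    out.write("payload".getBytes(StandardCharsets.UTF_8));
                    out.closeEntry();
                }
                // Look the resource up the same way a UDF would.
                try (URLClassLoader loader =
                         new URLClassLoader(new URL[] { jar.toUri().toURL() }, null);
                     InputStream in = loader.getResourceAsStream(lookupName)) {
                    return in != null;
                }
            } finally {
                Files.deleteIfExists(jar);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Entry stored under the name written in REGISTER.
        System.out.println(resolves("foo/bar", "foo/bar"));
        // Entry stored under the working-directory-dependent name.
        System.out.println(resolves("PATH/TO/DIR/foo/bar", "foo/bar"));
    }
}
```

The first lookup should succeed and the second should not, because the class loader matches the full entry name rather than a suffix of it.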


Mathias Herberts (JIRA | 19 Jun 2012 11:30

[Updated] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


     [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mathias Herberts updated PIG-2760:
----------------------------------

    Attachment: PIG-2760.patch

This patch changes the name used in the job jar to a relative path if the added resource lies under the current
working directory.

It also strips the leading '/', as PIG-2745 does.


Cheolsoo Park (JIRA | 19 Jun 2012 21:38

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397007#comment-13397007
] 

Cheolsoo Park commented on PIG-2760:
------------------------------------

Hi Mathias,

Agreed. I hadn't thought about the use case that you're describing. :-) Thanks for explaining!

I like your patch because it solves all the cases that I can think of. Just a minor comment. Can't you collapse
the following lines of code into a single line?

{code}
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp;
// Strip leading path.sep
if (nameInJar.startsWith("/")) {
    nameInJar = nameInJar.substring(1);
}
{code}

=>

{code}
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
{code}

Given that cp is always going to be an absolute path (as a relative path is converted to an absolute one by [...]

Mathias Herberts (JIRA | 19 Jun 2012 22:04

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397030#comment-13397030
] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

I guess that, with an appropriate comment, the two code chunks could be collapsed into one, yes.


Rohini Palaniswamy (JIRA | 20 Jun 2012 00:17

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397111#comment-13397111
] 

Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------

*This will cause getScriptAsStream() to error out on the backend as f.getPath() minus leading / will not be
in the jar.


Rohini Palaniswamy (JIRA | 20 Jun 2012 00:17

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397110#comment-13397110
] 

Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------

<code>
se.registerFunctions(f.getPath(), namespace, pigContext);
String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
String cp = f.getCanonicalPath();
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
pigContext.addScriptFile(nameInJar,f.getPath());
</code>

The function is still registered with f.getPath() even though nameInJar is going to be relative to the current
directory. This will cause getScriptAsStream() to error out on the backend, as f.getPath() minus the leading / will not be
in the jar.


Cheolsoo Park (JIRA | 20 Jun 2012 01:07

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397142#comment-13397142
] 

Cheolsoo Park commented on PIG-2760:
------------------------------------

Indeed. Good catch!


Mathias Herberts (JIRA | 20 Jun 2012 11:19

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397368#comment-13397368
] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

So this means we should do the following:

<code>
String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
String cp = f.getCanonicalPath();
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
pigContext.addScriptFile(nameInJar,f.getPath());
se.registerFunctions(nameInJar, namespace, pigContext);
</code>

right?


Rohini Palaniswamy (JIRA | 21 Jun 2012 04:05

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398072#comment-13398072
] 

Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------

Mathias,
It requires one more minor change. If we just do cp.startsWith(cwd), then even if an absolute path was
specified, a script in a subdirectory under the current directory gets a jar entry with only the
relative path instead of the absolute path. We need to do cp.equals(cwd + "/" + path) instead.
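
The difference between the two conditions can be seen with plain strings. This is a sketch under stated assumptions, not the committed PIG-2761 code: byPrefix mirrors the one-line expression quoted earlier in the thread, byEquality swaps in only the condition suggested here, and path is assumed to be the string exactly as written in the REGISTER statement. NameInJarSketch and both method names are hypothetical.

```java
final class NameInJarSketch {
    /** Prefix test, as in the one-line version quoted earlier in the thread. */
    static String byPrefix(String cwd, String cp) {
        return cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
    }

    /**
     * Equality test suggested in this comment. 'path' is assumed to be the
     * string as written in the script; only the condition differs from byPrefix.
     */
    static String byEquality(String cwd, String path, String cp) {
        return cp.equals(cwd + "/" + path) ? cp.substring(cwd.length() + 1) : cp.substring(1);
    }

    public static void main(String[] args) {
        String cwd = "/PATH/TO/DIR";
        String cp = "/PATH/TO/DIR/etc/foo"; // canonical path of the registered file

        // Registered with a relative path: both conditions give the relative name.
        System.out.println(byPrefix(cwd, cp));
        System.out.println(byEquality(cwd, "etc/foo", cp));

        // Registered with an absolute path under cwd: the prefix test still
        // shortens the name, the equality test keeps the full path minus "/".
        System.out.println(byPrefix(cwd, cp));
        System.out.println(byEquality(cwd, "/PATH/TO/DIR/etc/foo", cp));
    }
}
```

With the equality test, the entry name follows what the user wrote: a relative REGISTER gives a relative entry, and an absolute REGISTER keeps the absolute path minus its leading "/".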

I was addressing a script loading issue in PIG-2761 and had some modifications to the same line of code, so
I added your fix to it, tested it, and made it part of the PIG-2761 patch. Hope you don't mind. If you
could, I would appreciate you reviewing PIG-2761.


Mathias Herberts (JIRA | 25 Jun 2012 15:21

[Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


    [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400459#comment-13400459
] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

Your patch attached to PIG-2761 looks good to me as far as PIG-2760 is concerned.


Daniel Dai (JIRA | 26 Jun 2012 22:36

[Resolved] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path


     [
https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2760.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.10.1
                   0.11
         Assignee: Rohini Palaniswamy
     Hadoop Flags: Reviewed

This is fixed along with PIG-2761. Thanks folks!


