Tyson Whitehead | 23 Oct 17:43 2013
Picon

Better versions of this code

Hey All,

I wrote some quick code to change mediawiki preformatted text (adjacent lines begining with a ' ') to nowiki
blocks (first line with a ' ' and rest enclosed in <nowiki>...</nowiki> quotes).

http://lpaste.net/94688

I frequently find myself pretty unsatisfied with this type of code though.  Anyone have any rewrites that
would help teach me how to do this more elegantly in the future.

Thanks!  -Tyson

PS:  The mediawiki text I was using the above on has no existing <nowiki>...</nowiki> blocks.
Joey Adams | 24 Oct 04:43 2013
Picon

Re: Better versions of this code

On Wed, Oct 23, 2013 at 11:43 AM, Tyson Whitehead <twhitehead <at> gmail.com> wrote:
Hey All,

I wrote some quick code to change mediawiki preformatted text (adjacent lines begining with a ' ') to nowiki blocks (first line with a ' ' and rest enclosed in <nowiki>...</nowiki> quotes).

http://lpaste.net/94688

I frequently find myself pretty unsatisfied with this type of code though.  Anyone have any rewrites that would help teach me how to do this more elegantly in the future.

My first reaction was: use function composition to get rid of all those variables.  But this code is interesting because some intermediate results are used multiple times.  Here is the dependency graph: http://i.imgur.com/Smid5VZ.png (source file: http://lpaste.net/94718).

Before tackling the organization, I did some trivial refactoring: http://lpaste.net/94717.  Namely:

 * import Data.Function (on) instead of redefining 'on'.

 * import Data.Text (Text) to avoid qualified import for T.Text everywhere.

 * Use the OverloadedStrings extension to say "hello" instead of T.pack "hello".

 * Back off the indentation in 'update' by putting the first guard on a new line.

 * Move 'update' to a separate definition so we know it doesn't need any extra variables from 'replace'.

 * Use T.concat instead of multiple T.append calls, to make update much more readable.  There's also the <> operator in Data.Monoid (added in base 4.6).

 * Use map instead of fmap so it's clear we're working with a list.  Not everyone will agree with me on this.

 * Use newlines to group the steps of the process.

Also, be wary of partial functions (functions that fail for some inputs) like 'head' and 'fromJust'.  They're fine for quick scripts, but not for code other people will be using.  When a program crashes with "Prelude.head: empty list", it makes all of us look bad.  Instead, favor pattern matching or functions like 'fromMaybe' and 'maybe'.  On the other hand, 'head' and 'groupBy' are okay to use together because 'groupBy' returns lists that are all non-empty.

I'll say it again: partial functions are *fine* for quick scripts.  When you are the only one running the program, you can save a lot of time by making assumptions about the input.

All that's left is the fun part: making 'replace' easier to understand.  I think the key is to move the "Group lines according to classification" logic to a separate function.  The other steps of 'replace' have simple code, so this should help a lot.

Hope this helps,
-Joey
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Martijn Schrage | 24 Oct 12:23 2013

Re: Better versions of this code

By using a list comprehension you can keep everything together and avoid the zipping and unzipping. It also lets you get rid of the partiality introduced by head. See http://lpaste.net/94731 for a new version.

There's a couple more changes, some of which might be a matter of taste:
  - variable renames: update ~> updateLine, target ~> isIndented
  - if expression instead of boolean guard
  - introduced a couple of $'s to reduce parentheses
  - no partial fromJust functions in updateLine
  - case for pattern matching in main
 
Cheers,
Martijn

On 24-10-13 04:43, Joey Adams wrote:
On Wed, Oct 23, 2013 at 11:43 AM, Tyson Whitehead <twhitehead <at> gmail.com> wrote:
Hey All,

I wrote some quick code to change mediawiki preformatted text (adjacent lines begining with a ' ') to nowiki blocks (first line with a ' ' and rest enclosed in <nowiki>...</nowiki> quotes).

http://lpaste.net/94688

I frequently find myself pretty unsatisfied with this type of code though.  Anyone have any rewrites that would help teach me how to do this more elegantly in the future.

My first reaction was: use function composition to get rid of all those variables.  But this code is interesting because some intermediate results are used multiple times.  Here is the dependency graph: http://i.imgur.com/Smid5VZ.png (source file: http://lpaste.net/94718).

Before tackling the organization, I did some trivial refactoring: http://lpaste.net/94717.  Namely:

 * import Data.Function (on) instead of redefining 'on'.

 * import Data.Text (Text) to avoid qualified import for T.Text everywhere.

 * Use the OverloadedStrings extension to say "hello" instead of T.pack "hello".

 * Back off the indentation in 'update' by putting the first guard on a new line.

 * Move 'update' to a separate definition so we know it doesn't need any extra variables from 'replace'.

 * Use T.concat instead of multiple T.append calls, to make update much more readable.  There's also the <> operator in Data.Monoid (added in base 4.6).

 * Use map instead of fmap so it's clear we're working with a list.  Not everyone will agree with me on this.

 * Use newlines to group the steps of the process.

Also, be wary of partial functions (functions that fail for some inputs) like 'head' and 'fromJust'.  They're fine for quick scripts, but not for code other people will be using.  When a program crashes with "Prelude.head: empty list", it makes all of us look bad.  Instead, favor pattern matching or functions like 'fromMaybe' and 'maybe'.  On the other hand, 'head' and 'groupBy' are okay to use together because 'groupBy' returns lists that are all non-empty.

I'll say it again: partial functions are *fine* for quick scripts.  When you are the only one running the program, you can save a lot of time by making assumptions about the input.

All that's left is the fun part: making 'replace' easier to understand.  I think the key is to move the "Group lines according to classification" logic to a separate function.  The other steps of 'replace' have simple code, so this should help a lot.

Hope this helps,
-Joey


_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe <at> haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe


-- Martijn Schrage, Oblomov Systems 06-31920188 | martijn <at> oblomov.com | http://www.oblomov.com LinkedIn: http://www.linkedin.com/pub/martijn-schrage/6/677/505
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Tyson Whitehead | 24 Oct 17:32 2013
Picon

Re: Better versions of this code

On October 23, 2013 22:43:03 Joey Adams wrote:
> On Wed, Oct 23, 2013 at 11:43 AM, Tyson Whitehead <twhitehead <at> gmail.com>wrote:
> > I wrote some quick code to change mediawiki preformatted text (adjacent
> > lines begining with a ' ') to nowiki blocks (first line with a ' ' and rest
> > enclosed in <nowiki>...</nowiki> quotes).
> 
> ...
> 
> Before tackling the organization, I did some trivial refactoring:
> http://lpaste.net/94717.  Namely:
> 
> ...

Hi Joey,

I was thinking more about it last night and came up with another version.  With the help of your formatting
advice, this one actually feels pretty good

http://lpaste.net/94745

The prev and next input_class shifts could be more efficient, but I think I like having the symmetry better
than the performance.

Thanks!  -Tyson

Gmane