Annual Black & White Play

The famous Black and White plays… well, there’s just no way to describe it.

Here is Tao and me before the play, with my brother-in-law in the frame as well. It is traditional to dress in black and white, especially for the New Year performance. But that’s not why it’s called a black and white play.

before-show

After the show, we met with the cast.  Here I am with Nick and Nora Charles straight off the screen from the 1934 film.
after-show-thinman

Well, actually it’s the world famous detective and aspiring actor Harry Bunsnacker and his paid by the hour assistant and close personal friend Nigel Grouse, in a clever disguise.  Below I’m with Lt. Foster, the show’s straight man.

after-show-foster

My family has been attending the Pegasus Theater’s productions since 1986.  And as I stated at the beginning, it’s indescribable.

A trip to the symphony

Dressed up to go out

Since Itzhak Perlman was going to be in town around the time of Tao’s birthday, I bought tickets for that show — six months ago when I saw the schedule.

Jaap van Zweden conducts
Itzhak Perlman, violin

RAVEL Daphnis et Chloé Suite No. 2
BRUCH Violin Concerto No. 1
Itzhak Perlman, violin
TCHAIKOVSKY Capriccio italien

It turns out this was the hottest ticket in town, being a special 25th anniversary gala at the Meyerson.  As well as starting a little late, people kept making speeches and I wondered how much music there would be given the times stated!  Among the “honorary organizers” were the mayor of Dallas and Ross Perot.

There was a special award for the architect I. M. Pei which was accepted by one of his sons.  Since it consisted of a “piece of original limestone” and a chunk of crystal, it looked rather heavy to the people who did handle it.  The 97 year old architect has won every award remotely connected to his field, so it makes sense that he would just send someone to pick it up [pun intended] for him.

Another speaker was Sarah, Duchess of York which surprised me.  I didn’t think that people outside of the area were involved, but apparently the building of the Meyerson was world-wide news and interested people in the Arts from all over.

As for the music, I needn’t have worried.  They simply played until they were done, never mind the printed times for the after party.  In fact, there were two encores: Perlman played the theme from Schindler’s List, and the orchestra continued with a waltz which name I did not catch.

As usual, the live performance at the Meyerson was richer and more powerful than any recording.  That’s the exact opposite of the situation with “pop” music in venues with acoustics so bad that it’s not about the sound at all but the experience of gathering.  I think I will find a good recording of Bruch, though, which is missing from my collection.

The Joy of Reading (21st century style)

I decided to read a science fiction novel.  Not a unique occurrence, as I have thousands of them in boxes and storage and shelves.  And that’s the problem, really.  Hard drive space is already increasing faster than my reading pace, so I could store books electronically and add to them indefinitely and keep the same physical volume.

Now I’m already well practiced with using my original Nook reader.  After downloading the file, I would next import it into calibre, not just to keep track of it but to massage the data.  But first it tells me that my version is woefully out of date and I go download and install the latest, which offers many improvements and rearranges the controls.

The book’s file is a bit strange, it seems, as it was not importing right.  Easiest thing is just to load it in Sigil instead (though I suppose I could figure out the import options or use new features to edit the files and not need Sigil anymore for its use in later steps) and save it again.

A first look at the formatted ebook is fair; I’ve seen much worse from some publishers.  Why can’t they do as well as, say, Project Gutenberg, and just put the text in a file?  But I digress.  I fire up calibre’s “heuristic” processing to clean up all the junk, and use its formatting features to optimize the file for my device’s liking and my reading preferences.  Ah, but that’s set for Nook.  Well, a new software reader probably doesn’t have as many peculiar issues as an old dedicated reader, so I probably don’t need that anymore.  Generic output profile to start with, but still specify traditional print-style rendering where paragraphs are indented on the first line as opposed to having double vertical space between them.  Margins and other stuff should be taken care of by the reader software.

Normally this is where I then load the resulting file into Sigil and see if there are any bizzare features that can be fixed with a simple global search-and-replace on the HTML source, if that is still necessary.  At the least I’ll manually retouch the css file to delete stuff that ought to be unspecified so the reader doesn’t feel it’s being bossed around, and get rid of the text-align: justify line since that doesn’t work as well on the old low-resolution e-paper display.  It looks better if the horizontal spacing is optimized for letterform appearance and not also trying to get a specific length too.

On the Nook, I then plugged in the USB cable (which was charging anyway) and had calibre export to it.  But how do I read it on the Android tablet?  USB filesystem hasn’t worked for a few years now and it’s futile to try.  It doesn’t have SMB file networking built in, but there are apps for that.  I know I’ve tried a fancy file manager that includes network access, and it doesn’t work.  I use the network plug-in for the well-regarded Midnight Commander port, and it doesn’t work.  I tried a few more programs, and nothing could get past the names of the file shares, if it got that far at all.  Must be some “security” thing?

Next I try a couple features in calibre.  One is wireless device access, and I’m not sure what that does, but a couple readers and stand-alone programs allow the Android device to use it, it seems.  Well, I can’t get anything to do anything with that.  The other feature is better:  a web server interface.  It tells me the local IP address and port, so I make that into a URL and feed it to Firefox.  Success!  It lets me browse the book collection on the Android tablet, and download files via HTTP.  So, now I have the book file on the tablet.

Next question:  which reader software?  A Google search turns up a few reviews.  Mostly they don’t address the features I’m looking for, or any real features pertaining to the core function of reading stuff presented on the screen.  I don’t care which stores they are integrated with, or how pretty the book chooser screen looks and all the skeuomorphisms present.  A shame that “able to load files on local storage” is a feature that needs to be checked for!  The supplied Google Play Read for example, has its collection of things you bought from them, and no way to point to an actual file.

I end up trying two, and spend the rest of the afternoon figuring out how to make it dance with the song I sing for it.  I’m glad to say that I had success in setting font appearance and size, getting the line spacing to look right, having it show margins rather than printing all the way to the edge of the screen, and so on.

The page is looking quite presentable.  I do mean “looks”, since I haven’t actually read the first page yet.  That’s a chore for next weekend.  It does seem like a lot of effort for a book I’m not going to like anyway, but that’s why I wanted to save five bucks for a remaindered copy plus shipping.

HOTEL alfa sierra kilo echo lima×2

From earlier posts, you know that I’ve been learning Haskell. Now it’s one thing to memorize some syntax rules, and be able to discuss intelligently the semantics and ramifications of some particular language feature, even typing a line or two at an interactive prompt. It’s another to face an empty page in a text editor with the intent of writing a program.

For whatever reason, it occurred to me to do something with the spoken (phonetic) alphabet.  It’s good to pick something you easily know how to do, as to concentrate on the mechanical details of the tools instead of the problem.

The Main Loop

To that end, I started with looking up how to fetch command-line arguments (simple output was already covered from hello world) and spent a long time pondering just how to structure the loop over the arguments.

Pretty basic, right?  With pure functions, given something to do on one input value (say, foo) and an ordinary list of input values, you can process all of them using any of several ways of saying “map”.

map foo values
foo <$> values
mapF foo values
mapM foo values
forM values foo
ff

The problem is that both the thing I want to do on each value and the function that obtains the list of values is “in” the IO monad. So instead of one token to put these two things together (whether a word like map or a infix operator like <$>), I need two separate tokens for monad composition and the mapping, and had further trouble combining the two things.

Separate steps for the monad and the map isn’t too unreasonable, but is just a little bit. There are different levels of wrapping, like having different levels of indirection. f $ g is one way to write simple function application, and the entire thing g is the argument of one call to f; f <$> g uses a different symbol to mean apply f to what’s inside g treated as a container, so f is called multiple times with values taken from inside g.  There are any number of different wrapping layers and different things can be wrapped in different numbers of layers, so having unique marks for each possibility is prohibitive.  There are ways to bump things up in general, so f g is plain function application and liftM f g tells f to look inside its argument.

So, given that the values I have are double wrapped and I’m targeting the middle layer (neither the innermost nor the outermost) it seems reasonable that an extra mark of some kind is needed to specify, in addition to a mark that says “put these two things together”, making two in all.

The other combining problem is harder to explain:  Given one function (foo) that produced something that you wanted to feed to another function (bar), simply combining them is a no-brainer, you would think.  In a C-like language, bar(foo()) is pretty obvious.  In Haskell you don’t even use the parenthesis, so it’s just bar foo.

You could use an intermediate variable name, like

s= foo();
bar (s);

but you don’t have to.  In fact, that’s one of the things wrong with using so-called OUT parameters, is that it ruins the expressiveness and you can’t simply chain them together.  And this kind of chaining is very much embraced by functional style and Haskell in particular.

If foo is actually getArgs, and bar is a function that takes a list of strings (don’t even worry about the mapping at this point—just write a function that explicitly takes a list), you can’t do it!

main = do
    s <- getArgs
    mapM_ body s

badmain = mapM_ body getArgs

Writing that without the named intermediate value and do-block, the “obvious” main = mapM_ body getArgs doesn’t work.

When I was pondering this as I was working on it, I concluded that the only thing that can be done to getArgs is the bind operation (>>=).  Now, maybe that’s not quite true if common helpers exist that themselves use bind, such as the liftM mentioned earlier.  Ideally I’d somehow mark the argument that is wrapped too much for the function, as the first argument (the body) is fine and there may be variations like liftM2 but not one that skips lifting some arguments and then lifts others.  Would ap or <*> mixed with $ work here?  That’s something to try next time.

Meanwhile, my first concern was with writing the for-each construct in a clear manner, without separating body into its own named function since it isn’t that complicated.  I wondered what the usual common idiom might be?

No matter how you arrange the pieces though, I needed to name the parameter for the individual value/iteration being processed.  In common imperative “structured” languages, naming the loop control variable is a basic part of the syntax of a for-each looping construct.  Using a higher-order function instead, writing a lambda expression for the body was a bit clunky.  In any case, the layout started giving me trouble, so I gave up on that and just made body a separate named function.

What I ended up with is (starts with, anyway)

main =
    getArgs >>= mapM_ body

and with the body being a single word, the arrangement of the components (looping construct, list of inputs, body) is not nearly so important.

 Look-up Table

Haskell’s go-to data structure is a list.  There is no simple way to make a O(1) random-access array.  I mean, there are in fact such arrays as libraries, but to populate it you need a list anyway.  Such an array would help random-access speed, but only makes creating the table more complicated.

For key-value lookup, the basic feature is a function called lookup.  It operates on a list of (key,value) pairs, and I really didn’t want to write out the keys a through z and pairs, but rather just list the values (the alfa bravo keywords) in order.  The zip function took care of that.

Since it is most definitely not taking any advantage of contiguous range of keys, I decided to use that as a feature and add keywords for some odd punctuation marks and such.

phon_table= nato_table ++ digit_table ++ 
    [('.',"Period"), ('-',"Dash"), ('#',"TicTacToe"), ('_',"Underscore"), (' ',"Space"), ('@',"Snail") ]

nato_table= zip ['a'..'z'] ["alfa", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel",
                "india", "juliett", "kilo", "lima", "mike", "november", "oscar", "papa", "quebec",
                "romeo", "sierra", "tango", "uniform", "victor", "whiskey", "xray", "yankee", "zulu" ]

digit_table=  -- FAA style 
    zip ['0'..'9'] ["Zero","One","Two","Three","Four","Five","Six","Seven","Eight","Niner"]

Thinking ahead, variations can be accommodated by building the master list out of different pieces.  For example, use ITU instead of FAA code words for the digits, and add application-specific words.

Function Decomposition

Functions need to be written in small pieces.  In real-world code (using common imperitive languages) I’ve seen people write what I call meandering functions, which is a lack of decomposition: one defined unit (procedure or function in the high-level language) does stuff, then does more stuff, then goes on to work on something else, etc.  Often they are much much too long, running hundreds of lines of source code.

In Haskell, pure functions don’t have multiple statements.  That makes it hard to meander, or, without being adept at functional programming yet, even writing more than one line!

I also see lots of code that is not meandering in that the function does “one thing”, but yet is not decomposed well in that the thing it’s doing is obscured (even overwhelmed) with lower-level details that should be unimportant to “the thing” being expressed in this function.  Many programmers have a hard time with that, apparently not recognizing or even understanding what that is.

Those same programmers would more naturally break things up more in Haskell, since it is awkward to stuff more detail inside the argument of a function.  Writing a list of statements, it’s all to easy to stop what you’re really doing and issue lower-level detail work, and then put in more higher level work.  When everything is an argument to a single function, it’s more in-your-face that you are interrupting your expression with details that should be elsewhere.

Haskell facilitates writing very small functions with features such as writing multiple branches as separate top-level functions, and nested functions which close over the variables in their parent scope.

So is it “smaller is better”, period?  How much logic should go into one function as subexpressions rather than separate named functions?  Since the stuff in a where block have the same meaning as being in-line, thanks to the parent variables still being available, naming many individual pieces of a function seems to be the way to go.  What to not break up would be logical constructs that are read as a unit, and idiomatic uses that are recognized as one phrase but would be disguised if broken up.

When I first started to formulate some code, I ran into trouble with multiple nestings of local definitions, or trying to combine them in ways that either were not supported or couldn’t work in the layout rules.

A harder thing to know how to get just right is when to make local nested functions vs top-level functions.  My instinct is to nest everything and not expose internal details, since there isn’t any level of organization and name hiding smaller than the module.

Once I got past the initial main-loop panic, most of my time was spent fiddling with just how to decompose the functions.  As an anchor point, it made sense that the principle function solving this problem would be a named function and the driver (getting the args and displaying the results) would be a sample usage of that function.  In a real project, that is what I’d be interested in, and what I have here as main would just be test code, as the function would be called by other parts of the program.

One interesting detail is that my function phon processes one character, producing one code word as a result.  It doesn’t provide a function for processing an entire string, which is something you’d probably add in typical programming languages.

Given the function phon, you can call it on a character using ordinary function application, but it’s just as easy to process an entire string so there is no need to provide another function for that!

codeword = phon ch -- mundane call
codeword = phon $ ch -- means the same thing
codewords = phon <$> s -- process whole string of input
codewords = map phon s -- same thing

It’s also especially important in Haskell to keep the physical realization of the output separate from the computation. In most common programming languages, it would be just as easy to have the principle function print the result, and multiple calls would print successive items. In Haskell it is not just as easy — The function phon in this case cannot call putStr or anything like that.

In this toy program, where phon is called right from main and nothing else is happening, writing phon as an IO function would not seem awkward.  But it’s still a deliberate act, and you’d certainly notice when making a recursive call (or trying to).

Of course, that’s good in general.  In my toy program printing the results was part of the specification, but in real projects such a function will be consumed by other code, so the result needs to be in a form that’s still internalized and can be handled however the caller wants.  Unfortunately, it also makes it difficult to add debugging trace statements to existing code, but that’s another story.

Here is the complete program, as I left it.

module Main (main) where

import System.Environment (getArgs)
import Control.Applicative
import Data.Char


main =
    getArgs >>= mapM_ body
    where
        body s = do 
            putStrLn $ announce s 
            putStr $ format $ phon <$> s
        announce s = "string \"" ++ s ++ "\" spoken as:"

format =
    concatMap decorate
    where decorate w =  "\t" ++ w ++ "\n"


phon :: Char -> String
phon c | isUpper c = toUpper <$> (phon $ toLower c)
phon c = 
    maybe (quoted c) (id) result
    where result= lookup c phon_table
          quoted c= ['"',c,'"']
    
    
phon_table= nato_table ++ digit_table ++ 
    [('.',"Period"), ('-',"Dash"), ('#',"TicTacToe"), ('_',"Underscore"), (' ',"Space"), ('@',"Snail") ]

nato_table= zip ['a'..'z'] ["alfa", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel",
                "india", "juliett", "kilo", "lima", "mike", "november", "oscar", "papa", "quebec",
                "romeo", "sierra", "tango", "uniform", "victor", "whiskey", "xray", "yankee", "zulu" ]
                
digit_table=  -- FAA style 
    zip ['0'..'9'] ["Zero","One","Two","Three","Four","Five","Six","Seven","Eight","Niner"]

I welcome comments from experienced Haskell programmers on these issues, and overall style and idioms in general.

A Spriral Approach

I’ve heard of spiral approaches to learning, meaning that material is covered without much detail and making successive passes with more detail.

My recent experience makes me think of a slightly different spiral:

You understand, then you learn more, and then you are more confused than ever.

zeta

Maybe that’s why I like this particular representation of the Riemann ζ function?

−7/4 and Interactive Play

I saw this video concerning iterative function sequences

and thought that would be a good example to play around with in Haskell.  Haskell has arbitrary precision integers, and also has rational numbers, standard.  (I hesitate to say “built in” because, like many languages, the true built-in core is minimal with layers of library code on top of that.  But it was available out of the box.)

It also has an iterate function. What’s interesting is that the result is naturally presented as an infinite list. Other languages with abstract iteration or generators will provide some kind of interface to make your custom thing handleable in a common way. Perl even has a tie feature to make your custom thing seem exactly like the built-in collections, which pushes the interface contract down to include native datatypes, rather than being layered on top of them.

But in Haskell, the generator is the native list. Rather than make some custom generating code support the iteration mechanism, the same old list holds what appears to be the complete result, as if done as an earlier step.

In some purely imperative language, you might ordinarily do things like:

  1. Select the common, standard, container type with the appropriate properties (e.g. linear, sequential).
  2. Run a computation loop and fill the container with the results.
  3. Pass the container to another procedure, which can use the contents for other work (e.g. print them to a file).

Suppose that was unsuitable for some reason and you wanted to delay the computations of step 2, running the producer in parallel with the consumer, or not bothering until/unless a value was actually used by step 3, or not knowing how many items will really be needed.  You would have to explicitly program in a more complex sequence of commands to achieve that.

Instead of a plain common collection (e.g. a C++ vector) you’d need something else.  You need to arrange producer/consumer threads with a work queue between them, or callbacks, or something.  Step 2 would be preparing the whatsit, and step 3 would still be easy enough using standard iterators to abstract the actual details of the collection/stream/whatever.

Now look at it in Haskell.  Steps 1 is not much to think about as the list is king.  Step 2 is:

lz = iterate (λx→2*x+1) 0

Later, other code can get values from this list, e.g.

take 10 lz

produces a result of

[0,1,3,7,15,31,63,127,255,511]

The value lz is just another list of Integer, as far as the language is concerned. But clearly it could not have been filled in step 2 because it is as long as I need it to be, and never specified a size when filling it. It is actually an infinite list!

This certainly looks like the simple approach of: create the values, then use them. However, it actually doesn’t do any work when lz was defined, and only computes values when subsequent code reads them. Here I asked for the first 10 elements, and the list then went ahead and realized those values. If I ask for the value at index 30 (lz!!30) it needs to run the computation all the way to that element before telling me it’s 1,073,741,823.

This illustrates that the program here is not imperative. I didn’t program in “do this step, do the next step, etc.” Rather, I just specified that I wanted the 30th item of the list, and it did the work of figuring out just how to do that based on the definitions. It’s more like a spreadsheet: cells are not computed in order, but based on what they depend on. Haskell uses a top-down approach: when a specific value is really needed, it looks at the formula for it. Gathering the values needed for that function will set off other chains of downward dependent calculations.

Interactive Play

Using the GHCi interactive shell, it’s easy to play with little programs like this.  A more difficult problem featured in the video is

zz = abs <$> numerator <$> iterate (λx→x^2−7%4) 0

Trying a line of code, then recalling it to adjust and submit again, reminded me of a time long ago when microcomputers ran BASIC (or assembly language) and were attached to the household television.

Interactive play is the way to really learn something, and with a more complex build environment or larger programs you still try and do that.  It’s the combination of a direct Read-eval-print-loop and the ability to write interesting things in a single line of text that makes this experience especially efficient.

Want to try it yourself without installing anything?  Try Haskell! is a browser-based experience that invites you to “type these out and see what happens”  (psst: copy/paste is faster if you really can’t wait to find out).

Back to the Sequence

The actual sequence generated by iteration is rational numbers, but we want to ignore the denominator.  Then, I saw that the sign of the number was preserved with numerator, so the abs stripped that off.

The result starts off as [0,7,21,7,114639] and then the numbers get very large. Index 7 is 595096023596888261906480183623645954687, and it quickly gets overwhelming and impractical to display.

  • n=4 has 6 digits (is exactly 114639)
  • n=7 has 39 digits
  • n=28 has 80,807,125 digits
  • n=29 has 161,614,249 digits
  • n=30 has 323,228,496 digits
  • n=31 has 646,456,993 digits

That’s the largest I could compute with this program.  The Windows version compiles to a 32-bit program, but I used editbin to set the 3G flag, so it had 3G of user address space instead of 2.  That allowed it to computer n=30.  Running on a Mac, which is 64-bit, only increased that by one more before getting Segmentation fault.  I suspect there are things in the implementation that top out at 4G, even when more memory is available.

Foods for Thought

Here is a little problem that is isomorphic to something in real life that I was discussing with a co-worker.  Given a list of items (of a simple enumeration type), and knowing implicitly that each item has an associated priority, return the best item of the list.  Further refinements:  an empty list produces a default 0-priority item, as would a list that contains only negative-priority items.  Oh, and there might be unexpected items present whose priority is not known to the code.

To give some concrete examples, let me define the item type Snack as the set { Nothing, Gorp, Doughnut, Broccoli, Cheese_puffs, Apple, Cookie }, and a definition of the priorities, you can give the sequence (Gorp, Cheese_puffs, Apple) and the function returns Cheese_puffs.  Given the empty sequence () it returns Nothing.  Given the sequence containing only (Broccoli), it still returns Nothing.

Since I’ve been reading about and watching videos on the subject of functional programming, I recognize this an example of a foldl operation.  In particular, see C9 Lectures: Dr. Erik Meijer – Functional Programming Fundamentals Chapter 7 of 13.

Even if we’re not using higher-order functions, it still qualifies as a design pattern even if the language doesn’t do all the work for us.  In C++ pseudo-code, the solution ought to look like this:

result= initialvalue;
for (item : collection) result= f(result,item);

The logic of interest ought to be a succinct expression, something like result= std::max(result,item, comp); where comp is a comparison function that knows to compare the corresponding priorities rather than the numbers themselves.

There is a max_element algorithm in std (and in boost::range) that will go through a sequence and apply the max comparison to each element, as in the previous listing.  But, it doesn’t like empty lists.  It’s hard to beat a foreach loop for ease and clarity!  In C++11, the range-based for loop is the easy way to go through a collection.  You put other statements before, after, and inside it to control the details.  It’s not a function as a first-class object, but that’s still an easy way to specify such things.

I’ve been a fan of the “range” idea for a long time, and think it’s a nice way to make STL more friendly.  Boost has range (whole container) versions of std algorithms, but where is the humble map algorithm?  In Boost::range, you don’t use a range algorithm for this at all, but rather the concept of map is filled via range adaptors, with a generic map higher-order function called transformed.

Applying ideas from functional programming, the function (lets call it g) that takes a Snack value and returns the corresponding priority can return a pair, containing the original argument as well as the result.  That is, it decorates the value but does not destroy it.  Something like this:

pair<Snack,int> x = g(Nothing);
for (item : input | boost::transformed(g))  x= max(x,item,compare_second);
result= x.first;

compare_second should do the normal operator< on the .second, but it doesn’t appear as a predefined standard function.  It could be done using a lambda expression, but (in C++11) you would have to declare all the types and parameter arguments.

But remember the earlier musing on the utility of for loops — there’s nothing wrong with using them.  Instead of putting details inside a lambda expression, don’t bother.  max and compare_second are so simple to express naturally that it’s not worth any fuss to compose canned implementations of them.  (Now if we were working with parts that had more complexity, it would be another story)

pair<Snack,int> x = g(Nothing);
for (auto item : input | boost::transformed(g))   {
    if (item.second >= x.second)  x= item;
}
result= x.first;

Note that this naturally handles empty lists and lists containing only undesired items, and if g defaults to 0 for items not known at compile time, this handles all the desired cases without any special-case code.

Giving up on using a single function expression as the body of the loop, there is no need to have g return pairs and then extract first and second.  But, it still has an advantage.  If you kept two distinct variables instead:

Snack x = Nothing;
int pri= 0;
for (auto item : input)   {
    int newpri= g(item);
    if (newpri >= pri)  {
        x= item;
        pri= newpri;
    }
}
result= x;

It turns out longer!  The two variables x and pri, kept separately, need to be updated as a unit.  In more complex code, that is not nice as mistakes can be made.  We still want both the item from the input list and the associated priority, so yet another variable is created.

It is more natural in C++ to have g just return the result, though.  So let’s take the best features of both and split the difference:

pair<Snack,int> x  (Nothing, g(Nothing));
for (auto item : input)   {
    pair<Snack,int> temp (item, g(item));
    if (temp.second >= x.second)  x= temp;
}
result= x.first;

Now remembering the associated priority with the best-so-far assumes that g has some overhead.  E.g. it’s a configuration file read a run-time and stored in a fancy data structure where items with priority defined are sparsely represented.

But if that’s not the case, and there’s no good reason not to make redundant calls to g, you don’t have to remember it.

Snack x = Nothing;
for (auto item : input)   {
    if (g(item) >= g(x))  x= item;
}
result= x;

It’s just so annoying to me that g is called for again for the same value it just used on the previous iteration!  If the priorities are known at compile time, you might suppose the compiler to inline and optimize out the calls to g.

I will continue this later with an exploration of how it works in a functional language.

Haskell Exploration 2, or “who needs loops?”

The previous post used a one-liner to solve the problem, using a list comprehension. Trying different limits meant editing the number in the middle of the line and entering again. That’s really the only thing you can easily do in the GHCi interactive environment—any functions or more complex code needs to be in an actual source file. At least GHCi can load your source file and then let you interact with the functions defined in it.

So, here’s a function which takes the max in the problem statement. To completely generalize, it also takes the list of factors as a parameter. After all, the original code was starting to look repetitive on the handling of the two hard-coded factors.

 

divisors ∷ [Int]IntInt
divisors factors max =
sum [x | x ← [1‥max], factor factors x]
 
factor factors n =
any (λx → n `mod` x ≡ 0) factors
-- swap out these alternatives by leaving exactly one un-commented
-- or [b | x ← factors, let b= n `mod` x ≡ 0]
-- or [True | x ← factors, n `mod` x ≡ 0]

 

As well as using recursion, the list comprehension syntax offers a way to write what would be a loop in imperative languages.  The generator portion of the construct, x ← [1‥max], is like a foreach loop, though it is not limited to using one generator.  The things after the vertical bar can include generators, constraint tests, and named values to use in other things.  However, it doesn’t seem directly suited to using the factors list directly as a generator with other tests — the built-in meaning is an AND between constraints, not an OR.  So, push that down to another function, top-down style.  That is, sum all the numbers up to max where the number passes the test of function factor.

I knew that there are logical functions that apply across all elements in a list, but didn’t recall how they were named.  Looking through the library, I found any first, which is even more suitable.  It reads more clearly to indicate the intent, rather than having to know that a multi-input OR can be used in that manner.  But, it needs the test packaged as a separate function.  Here I did that in-line as a lambda.

Since the point of this exercise is to work with list comprehensions, I tried it using the or function, too.  Mapping the list of factors to a list of True/False is done within the list comprehension syntax, doesn’t need extra parens, but isn’t really clearer: it trades the lambda for a let.

But, there is no reason to populate all the False entries.  The natural use of the filtering clause is to reject elements that don’t meet the criteria.  So, accept the ones that would give True, rather than mapping to True.  A form similar to this would be useful if I wanted to list the ones that fit, rather than just finding out if any fit.

Interestingly, the first form can be written without using a lambda expression.  Nice normal function composition doesn’t work out because the argument is deep in the center rather than last, but section syntax lets you use either the left or right argument in an operator.  The section syntax uses parenthesis, and you end up needing parens around the whole thing because the infix ∘ has a lower precedence than function application.

factor factors n = any ((0≡)∘(n `mod`)) factors

First attempt at Haskell

Haskell is what’s called a functional programming language.  Compared to BASIC, Pascal, batch-file, C, C++, Java, JavaScript, C♯, Ada, FORTRAN, assembly language, and any any other procedural (whether structured or OO) language, Haskell is very different.

And that’s really the point:  I don’t need to pick up on a slightly different syntax and somewhat different environment for representing the same kinds of constructs I’m used to.  There might be a small number of new and interesting concepts to inspire you, but for the most part they are “me too” on the current engineering paradigms and if anything leaning back to simplicity and leaving out features that could separate the top coders from the mundane (since the latter will need to maintain the code later).

But brushing up on techniques and ways of thinking that are truly different helps improve your engineering skills in general.

For example, a number of years ago I re-implemented a piece of code for laying out images on a page.  The old code had been extended, fixed, patched up, and “maintained” to the point where it was impossible to understand what was going on.  Cleaning it off would be a temporary measure if new requirements and changes continued as they always have.  So, I approached it using concepts from functional programming, even though written in (an early version) of C♯.  Defining the constraints and properties of the desired layout, rather than a spaghetti flowchart of patched procedures that manipulate all the state variables, means at the very least that it could be extended without breaking it.  The logic was unfathomable because of state dependance—what happened before affects what some block of code will do to it next.  So I had no state variables at all.  The layout rectangle of an item was expressed as a pure function, and the various bottom level functions could easily be reviewed for correctness.

Functional programming has been making inroads into mainstream programming, with otherwise procedural and OO languages acquiring features friendly to techniques gained from FP.

In particular, template metaprogramming in C++ (which I’ve seen described as “a slow descent into madness”) exists because the template system is a Turing-complete, albeit crudely primitive, functional programming language.

Meanwhile, I’ve read that monads have a strange property: anyone who comes to understand it loses any ability to explain it to others.  That reminds me of the science fiction novel Babel-17.  In fact as in the story, language influences thought and perception, which is what I was getting at earlier in this essay.  Being a writer on programming topics, I thought I’d take that as a challenge.  Maybe I’ll write a truly good explanation of monads; or maybe it will end up joining the hundreds of others that are are either indecipherable or lack proper deep meaning.  (See also monad tutorial fallacy)

chart

A lot of what I see of beginners’ Haskell examples remind me of Prolog.

Anyway, I just tried crafting my first Haskell snippet, other than things like 2+2 or copying lines from the book.  Project Euler, problem 1,

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

Here is the session from the interactive environment:

Prelude> [x | x <- [1..10], x `mod` 3 == 0]
[3,6,9]
Prelude> [x | x <- [1..10], x `mod` 3 == 0 || x `mod` 5 == 0]
[3,5,6,9,10]
Prelude> sum [x | x <- [1..10], x `mod` 3 == 0 || x `mod` 5 == 0]
33
Prelude> sum [x | x <- [1..9], x `mod` 3 == 0 || x `mod` 5 == 0]
23
Prelude> sum [x | x <- [1..999], x `mod` 3 == 0 || x `mod` 5 == 0]
233168
Prelude>  [x | x <- [1..999], x `mod` 3 == 0 || x `mod` 5 == 0]
[3,5,6,9,10,12,15,18,20,21,24,25,27,30,33,35,36,39,40,42,45,48,50,51,54,55,57,60,63,65,66,69,70,72,75,78,80,81,84,85,87,90,93,95,96,99,100,102,105,108,110,111,114,115,117,120,123,125,126,129,130,132,135,138,140,141,144,145,147,150,153,155,156,159,160,162,165,168,170,171,174,175,177,180,183,185,186,189,190,192,195,198,200,201,204,205,207,210,213,215,216,219,220,222,225,228,230,231,234,235,237,240,243,245,246,249,250,252,255,258,260,261,264,265,267,270,273,275,276,279,280,282,285,288,290,291,294,295,297,300,303,305,306,309,310,312,315,318,320,321,324,325,327,330,333,335,336,339,340,342,345,348,350,351,354,355,357,360,363,365,366,369,370,372,375,378,380,381,384,385,387,390,393,395,396,399,400,402,405,408,410,411,414,415,417,420,423,425,426,429,430,432,435,438,440,441,444,445,447,450,453,455,456,459,460,462,465,468,470,471,474,475,477,480,483,485,486,489,490,492,495,498,500,501,504,505,507,510,513,515,516,519,520,522,525,528,530,531,534,535,537,540,543,545,546,549,550,552,555,558,560,561,564,565,567,570,573,575,576,579,580,582,585,588,590,591,594,595,597,600,603,605,606,609,610,612,615,618,620,621,624,625,627,630,633,635,636,639,640,642,645,648,650,651,654,655,657,660,663,665,666,669,670,672,675,678,680,681,684,685,687,690,693,695,696,699,700,702,705,708,710,711,714,715,717,720,723,725,726,729,730,732,735,738,740,741,744,745,747,750,753,755,756,759,760,762,765,768,770,771,774,775,777,780,783,785,786,789,790,792,795,798,800,801,804,805,807,810,813,815,816,819,820,822,825,828,830,831,834,835,837,840,843,845,846,849,850,852,855,858,860,861,864,865,867,870,873,875,876,879,880,882,885,888,890,891,894,895,897,900,903,905,906,909,910,912,915,918,920,921,924,925,927,930,933,935,936,939,940,942,945,948,950,951,954,955,957,960,963,965,966,969,970,972,975,978,980,981,984,985,987,990,993,995,996,999]
Prelude>

After expressing the shorter example, first to get the list of terms to see if that part is right thus far, and then summing them, I changed the bound to the larger value and got an answer of 233168. Just to see what it was, I then backed off the final sum to get the list of terms.

So far, so good.