Archive for the ‘Erlang’ Category

Problem 127

Thursday, April 10th, 2008

It was over a year ago now… Project Euler - I was in the top 10, then they posted Problem 127. All of the problems there should execute in under the minute, and I thought I had optimized my code, but it was running for many minutes when I stopped it to see what was going on. I instrumented my code to see and saw that it was just chugging away. It would have taken 10-ish hours to complete! What was the problem? Python.

Quite a few of the problems at Project Euler require numbers to be larger than 32 bits. I originally used Project Euler to teach myself Python, and Python deals with big numbers fairly easily in code. My language of choice at the time, C/C++/Asm, didn’t. Even in Python, an interpretted language, I was able to get most of my solutions to run in under a minute - and that was the goal. Every so often, like times where the code iterated over large arrays, or basically anything that had a large number of real operations that had to be performed, I would have to switch back to C/C++. If the solution was taking minutes to run under Python, it would complete in a second or two under C/C++. But the time I saved writing the code in Python for the cases it could solve was worth it.

Now I have F# - my new favorite language. It also has big number support built in (so does Erlang). I was able to recode my old solution from memory from last year, and suddenly, I had a solution in under a minute! Well, it took a bit of work, since I had a bug in my code - I missed an overflow, so it was doing way to many calculations - it was taking minutes to generate the wrong answer. (comparing numbers to see if a calculation needed to be done, but the numbers being compared overflowed, so it was always doing the calculation, which was very slow as well as incorrect)

Some important things I learned from this year-ish journey:

1) People always tell me that only the order of the algorithm matter, linear speed ups will be swamped by faster machines. If the linear is hundreds, thousands, or millions, that is false. Rewriting critical code in a compiled or assembly language will always be of use.

2) Numbers shouldn’t have ranges. The overflow bug that cost me a day wouldn’t have existed if the language could handle numbers cleanly, not just integers. I had written the code in Erlang last night, but it was too slow to complete (linked lists as were the killer there). Erlang transparently works with all numbers, so that code couldn’t have that bug. Couldn’t. Bug-free code is the goal, so the languages need to handle these types of things transparently.

3) Getting things done is important. I used Python because it was trivial to write most of the code needed for Project Euler. The code only had to execute once (well, I reused a lot of it, so it had to execute once per problem), so optimizing my time speeds up the time from start to answer-in-hand. And that is what programming is about, getting the correct answer as quickly as possible.

-Edward

GCD

Wednesday, April 9th, 2008

I was working on a new problem from the Euler Project - they post new problems every week-ish.  I decided to try this one in Erlang, to give myself more exposure to the language to see what I think of it.  I realized that something was missing, and actually, I think it is missing from most languages:  GCD.

 Alex Stepanov has given many talks about the importance of GCD - I think there are a few four hour talks out there that have been recorded and are very interesting to watch.  I realize more and more how right he is about this.  (His paper on GCD is linked to at Stepanov Papers, which I have linked to on the right - some great reading on that site.)

It seems to be a useful function, one that should be built in.  Given that it is not, I’ve included one in this blog post.  Code like this should only be written once, programmers should be able to focus on the problem at hand, not worry about missing common functions - it takes away from focus.  Sure, everyone could write their own, but the same is true for sin, cos, etc.  GCD should be a standard, it is in J the programming language.

Before I wrote this, I did some searching, and found that a few people had written a GCD and posted their code.  It was after looking at their code that I decided to open a dialog about this, since some of the ones I found were wrong, or longer, or more complex than the actual GCD calculation.

-Edward

gcd(M, 0) -> M;
gcd(M, N) -> gcd(N, M rem N).

More on Erlang Factoring

Wednesday, April 9th, 2008

I tried something new.  I was curious as to why the times varied so much, so I installed VMWare on my MacPro and ran the previously mentioned code under a few OSes.  I have Ubuntu 64 bit installed on a separate HD for my Mac, so I installed one under VMWare to figure out how much the VM software would skew the other results.  Looks like not by much.

MacPro, Ubuntu 64, direct boot -> 3.7 seconds
MacPro, Ubuntu 64, VMWare -> 3.9 seconds

With this out of the way, I can tell that the numbers I get from under VMWare would match up pretty closely to what I would see if I installed the OSes/Erlang directly.  Here is what I found under the VMs:

MacPro, Windows XP 32bit -> 6.9 seconds
MacPro, Ubuntu 32 -> 18.9 seconds

What does this tell me?  That the test I was using for Erlang speed is heavily based on the underlying bit-ness.  Erlang on a 64 bit system is faster than the 32 bit version when doing large number arithmetic.  And the AMD 4600+ is half the speed of the Xeon chip in the MacPro.

It looks like this is a win for 64 bit.  These new results help explain something else I had seen last week - the speeds of F# and Haskell on the same problem.  It took 32 seconds under Haskell and minutes under F# on a 32 bit machine to do the same factoring.  I’m guessing that they would be faster on a 64 bit version, though I know that F# is only 32 bit right now, but I hope it becomes 64 bit soon!

-Edward

Factor in Erlang

Sunday, April 6th, 2008

 The first piece of code that I wrote in Erlang was to factor numbers.  I thought it would be a good test to learn the basics of Erlang (very basics).  I have appended the code to this post.  I am not sure if it the fastest way to factor a number (vs building up the list, but it would have to pass an accumulator around), I will have to investigate that in the future, but it served its purpose for now. 

I then used this code to test which machines and implementations of Erlang were the fastest and got some interesting results.  The code factored 380312393432894324523423103, which I discovered by typing in random numbers to factor.  It has a large factor, 9517107199440611, which makes it a reasonable benchmark for a simple test.  Here are my results:

AMD 64, 4600+, Windows XP -> 15.4 seconds
MacPro, 2.66 Xeon, Linux -> 3.7 seconds
MacPro, 2.66 Xeon, MacOSX -> 7.0 seconds
MacPro, 2.66 Xeon, Vista64 -> 7.3 seconds

I only have one OS on the AMD, but it would be interesting to see if Linux is 2x faster, as it was on the MacPro.  I was surprised that the Xeon chipset in the MacPro was so much faster than the AMD64, but I guess they are more than a year apart in age.

-Edward

fac(N) -> fac(N,2).
fac(N, F) when F*F > N -> [N];
fac(N, F) when N rem F =:= 0 -> [F | fac(N div F, F)];
fac(N, 2) -> fac( N, 3 );
fac(N, F) -> fac(N, F+2).