I have posted an update to my trigonometry-intense floating point benchmark which adds Raku to the list of languages in which the benchmark is implemented. A new release of the benchmark collection including Raku is now available for downloading.
Raku has had a gestation period which may be longer than that of any other programming language ever actually released. Following the release of version 5 of Perl, Larry Wall began the process of designing its successor, dubbed, not surprisingly, “Perl 6”. This was around the year 2000, and design documents began to appear. Perl 6 was intended to be a major overhaul of the language, in which upward compatibility with existing code would be sacrificed where necessary to clean up some of the infelicitous syntax and semantics that Perl had accreted over the years, which had given it a reputation of being, while powerful, expressive, and concise, ugly to read and confusing and error-prone to write.
Over the years, it became apparent that Perl 6 was a moving target, as many of the design documents superseded aspects of those which preceded them. Further, the landscape of programming languages was changing over time, with techniques such as functional programming, asynchronous concurrent processing, optimisation for vector and parallel architectures, and strong type checking to guard against common programming errors, coming into use and expected to be present in any new language to enter the arena.
Perl version 5, while once dominating the system administration tool application space, began to look increasingly long in the tooth, with newer entrants such as Ruby and especially Python being the tools of choice for new projects and programmers entering the job market. By 2019, it was judged that the language, while clearly descended from Perl, had evolved into something sufficiently different that a new name was called for, and Perl 6 was henceforth called “Raku”. Unlike Perl, whose operation was essentially defined by what the implementation did, Raku has a formal specification, allowing multiple implementers to build their own compilers and libraries for the language. At the moment, the most widely used and actively developed compiler is Rakudo, which is available for a variety of machines and operating systems.
I developed the Raku implementation of the floating point benchmark
with Rakudo v2021.09, which implements Raku language specification v6.d,
running on Xubuntu Linux version 20.04 LTS on a 64-bit Intel
x86 architecture machine. I developed and benchmarked
two separate versions of the program. The first was a minimal port of
the existing Perl implementation of fbench. I simply fixed
the code to accommodate changes in the language, but used none
of the new features or program structuring tools introduced in Raku—I
call this the "port" version. The second was a clean sheet Raku implementation
based upon the object oriented architecture used by the C++ version and
other modern language implementations such as Haskell, Scala, Erlang, Rust, and Go.
This I call the "native" version, which uses Raku's object oriented features,
strong typing, enumeration and constant types, and improved control
structures to structure the code.
For timing comparisons, I used the C version, compiled with GCC version 9.3.0, which executed the C benchmark with a timing of 0.795 microseconds per iteration, and Perl version v5.30.0, which ran the Perl implementation at 32.1619 microseconds/iteration, or 40.46 times slower than C.
So, how does Raku stack up against these two mainstays of the systems programming world? Well, the good news is that it got identical answers to the eleven decimal places we validate. The bad news is that it is hideously, nay, appallingly slow. How slow? Well, the native version, where I used Raku the way I understand it is supposed to be used, and with the benefit of more than a day's experimenting, tweaking the code, and trying to understand what was going on, produced a timing of 163.42 microseconds per iteration, which is five times slower than Perl and two hundred and six times slower than C. And the minimal port, representative of what you get if you take an existing Perl program performing numerical calculations and migrate it to Raku by simply fixing the changes in language? It runs at a rate of 584.6 microseconds per iteration, which is eighteen times slower than Perl and seven hundred and thirty-five times slower than C.
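The ratios quoted above follow directly from the four measured timings. As a quick check of the arithmetic, here is a minimal Python sketch; the dictionary keys and the `slowdown` helper are names invented for this illustration, and only the timing values come from the text.

```python
# Timings in microseconds per iteration, as reported in the text.
# The labels are illustrative names chosen for this sketch.
TIMINGS_US = {
    "C (GCC 9.3.0)": 0.795,
    "Perl v5.30.0": 32.1619,
    "Raku, native": 163.42,
    "Raku, minimal port": 584.6,
}


def slowdown(timings, baseline):
    """Return each timing divided by the timing of the named baseline."""
    base = timings[baseline]
    return {name: t / base for name, t in timings.items()}


vs_c = slowdown(TIMINGS_US, "C (GCC 9.3.0)")
vs_perl = slowdown(TIMINGS_US, "Perl v5.30.0")

# Print one line per implementation: slowdown relative to C and to Perl.
for name in TIMINGS_US:
    print(f"{name:20s} {vs_c[name]:8.2f} x C   {vs_perl[name]:6.2f} x Perl")
```

The "relative to C" column computed this way is the same figure used for Raku in the comparison table (C taken as 1).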
What is going on here? First of all, I suspect that in developing a compiler and support libraries for such an ambitious and, until recently, rapidly evolving language, priority has rightly been given to a complete and correct implementation of the language specification rather than optimisation. This is to be expected, and performance should improve over time, especially since language features in Raku such as strong typing and immutable and private variables should permit better compiler optimisation than is possible for Perl, whose heritage was a typeless, interpreted language.
But, for numerical programming (which, to be fair, was never Perl's strong
point or a major application area), Raku's type system is distinctly odd and
full of pitfalls for the incautious programmer unaware of what is going on
under the hood. Raku has base numeric types of Int (arbitrary
precision integers), Rat (arbitrary precision rational numbers [fractions]),
Num (IEEE 754 double precision floating point), and Complex
(complex numbers made up of two Nums). These, in turn, form
“roles” such as Real, which encompasses the Int, Rat, and Num
types. Now, this seems eminently reasonable.
But now consider the following innocent statement:
my Num $f = 0;
which declares a floating point variable and sets it to zero. What happens?
Why, you get a fatal error message because you've tried to
initialise a variable of type Num to an Int value, the constant
zero. All right, you say, this seems a bit reminiscent of 1950s programming
languages where every floating point number had to have a decimal point or
an exponent, so you replace “0” with “0.0”
and try again. (At least you don't have to repunch the card and hand your deck in
across the counter then wait six hours to see what happens.) And the result
is…blooie! Now you've tried to assign a Rat (rational number) to a Num,
because everybody knows that “0.0”
is just shorthand for “0/10”, a rational number if anybody's ever seen one.
The only way to get this statement past the compiler is to write the right hand
side as “0e0” or “0.0e0”, which it
deems to be a Num, or else explicitly convert the type with
0.Num or Num(0).
This may seem to be runaway pedantry which will probably get fixed in a
future release, but it's actually more pernicious than you might think. Suppose
you declare all of your floating point variables as Real, which encompasses
all the (non-complex) numeric types. Now you can assign integers, decimal
numbers, and floating point numbers with exponents to them without error. But if you assign
a value like, say, 5895.944 (the wavelength, in angstroms, of the “D”
spectral line used in evaluating optical designs) to your variable, it takes on a type
of Rat, and calculations with it will be performed in library-implemented
rational arithmetic, which is much slower than hardware floating point. And when you
have a mixed-type expression involving a Num, it has to promote the
value from rational to floating point, another slow operation. Note that this will
happen if you so much as use a decimal constant in an expression involving
floating point values. If you fail to tack an exponent onto it, everything slows down
like molasses in mid-winter. Before I figured this out and explicitly typed all
of the constants in my program, the “native” version ran more than
three times slower than the results I report here.
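The effect is not peculiar to Raku: any language that offers exact rationals alongside hardware floating point shows the same split. The following Python sketch is only an analogy, not Raku code. It assumes Python's `fractions.Fraction` as a stand-in for Rat and `float` as a stand-in for Num; the `average_with` helper and the step count are invented for the illustration, and the timings it prints say nothing about Rakudo itself.

```python
from fractions import Fraction
import timeit

# Exact rational value, analogous to what the text says Raku does with 5895.944.
d_line_rat = Fraction("5895.944")
# Hardware double, analogous to writing the constant with an exponent in Raku.
d_line_num = 5895.944


def average_with(x, steps=1000):
    """Repeatedly average an accumulator with x, staying in x's numeric type."""
    acc = x
    for _ in range(steps):
        acc = (acc + x) / 2  # dividing by an int keeps Fraction or float as is
    return acc


# Mixing the two kinds of number forces a conversion to floating point.
mixed = d_line_rat + 1.5
print(type(d_line_rat).__name__, type(d_line_num).__name__, type(mixed).__name__)

# Time the same loop in rational and in floating point arithmetic.
t_rat = timeit.timeit(lambda: average_with(d_line_rat), number=20)
t_num = timeit.timeit(lambda: average_with(d_line_num), number=20)
print(f"rational: {t_rat:.4f} s   float: {t_num:.4f} s")
```

In this analogy the rational loop is software arithmetic on integer numerator/denominator pairs, while the float loop maps onto hardware operations, which is the same distinction the text draws between Rat and Num.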
A suitably smart compiler should be able to analyse the code and do much of
this conversion at compile time, but the existence of polymorphic types such as
Real may render this impossible in some cases. In any case, a programming
language which requires such extreme fussiness to avoid painful and non-obvious
speed penalties will have a steep hill to climb in competition with others that impose no such
burden on their users.
The relative performance of the various language implementations (with C taken as 1) is as follows. All language implementations of the benchmark listed below produced identical results to the last (11th) decimal place. In the table below, I show Perl as 23.6 times slower than C, not the 40.46 times I measured as a part of these tests. I suspect this is because the value in the table was measured on a 32 bit machine, where the advantage of C-generated machine code is less than on the 64 bit machine on which I ran this test.
Language | Relative Time | Details |
---|---|---|
C | 1 | GCC 3.2.3 -O3, Linux |
JavaScript | 0.372 | Mozilla Firefox 55.0.2, Linux |
JavaScript | 0.424 | Safari 11.0, MacOS X |
JavaScript | 1.334 | Brave 0.18.36, Linux |
JavaScript | 1.378 | Google Chrome 61.0.3163.91, Linux |
JavaScript | 1.386 | Chromium 60.0.3112.113, Linux |
JavaScript | 1.495 | Node.js v6.11.3, Linux |
Chapel | 0.528 | Chapel 1.16.0, -fast, Linux |
Chapel | 0.0314 | Parallel, 64 threads |
Visual Basic .NET | 0.866 | All optimisations, Windows XP |
C++ | 0.939 | G++ 5.4.0, -O3, Linux, double |
C++ | 0.964 | long double (80 bit) |
C++ | 31.00 | __float128 (128 bit) |
C++ | 189.7 | MPFR (128 bit) |
C++ | 499.9 | MPFR (512 bit) |
Modula-2 | 0.941 | GNU Modula-2 gm2-1.6.4 -O3, Linux |
FORTRAN | 1.008 | GNU Fortran (g77) 3.2.3 -O3, Linux |
Pascal | 1.027 | Free Pascal 2.2.0 -O3, Linux |
Pascal | 1.077 | GNU Pascal 2.1 (GCC 2.95.2) -O3, Linux |
Swift | 1.054 | Swift 3.0.1, -O, Linux |
Rust | 1.077 | Rust 0.13.0, --release, Linux |
Java | 1.121 | Sun JDK 1.5.0_04-b05, Linux |
Visual Basic 6 | 1.132 | All optimisations, Windows XP |
Haskell | 1.223 | GHC 7.4.1 -O2 -funbox-strict-fields, Linux |
Scala | 1.263 | Scala 2.12.3, OpenJDK 9, Linux |
FreeBASIC | 1.306 | FreeBASIC 1.05.0, Linux |
Ada | 1.401 | GNAT/GCC 3.4.4 -O3, Linux |
Go | 1.481 | Go version go1.1.1 linux/amd64, Linux |
Julia | 1.501 | Julia version 0.6.1 64-bit -O2 --check-bounds=no, Linux |
Simula | 2.099 | GNU Cim 5.1, GCC 4.8.1 -O2, Linux |
Lua | 2.515 | LuaJIT 2.0.3, Linux |
Lua | 22.7 | Lua 5.2.3, Linux |
Python | 2.633 | PyPy 2.2.1 (Python 2.7.3), Linux |
Python | 30.0 | Python 2.7.6, Linux |
Erlang | 3.663 | Erlang/OTP 17, emulator 6.0, HiPE [native, {hipe, [o3]}] |
Erlang | 9.335 | Byte code (BEAM), Linux |
ALGOL 60 | 3.951 | MARST 2.7, GCC 4.8.1 -O3, Linux |
PHP | 5.033 | PHP (cli) 7.0.22, Linux |
PL/I | 5.667 | Iron Spring PL/I 0.9.9b beta, Linux |
Lisp | 7.41 | GNU Common Lisp 2.6.7, Compiled, Linux |
Lisp | 19.8 | GNU Common Lisp 2.6.7, Interpreted |
Smalltalk | 7.59 | GNU Smalltalk 2.3.5, Linux |
Ruby | 7.832 | Ruby 2.4.2p198, Linux |
Forth | 9.92 | Gforth 0.7.0, Linux |
Prolog | 11.72 | SWI-Prolog 7.6.0-rc2, Linux |
Prolog | 5.747 | GNU Prolog 1.4.4, Linux, (limited iterations) |
COBOL | 12.5 | Micro Focus Visual COBOL 2010, Windows 7 |
COBOL | 46.3 | Fixed decimal instead of computational-2 |
Algol 68 | 15.2 | Algol 68 Genie 2.4.1 -O3, Linux |
Perl | 23.6 | Perl v5.8.0, Linux |
BASICA/GW-BASIC | 53.42 | Bas 2.4, Linux |
QBasic | 148.3 | MS-DOS QBasic 1.1, Windows XP Console |
Raku | 205.6 | Rakudo v2021.09/v6.d, Linux, object-oriented rewrite |
Raku | 735.3 | Minimal port of Perl version |
Mathematica | 391.6 | Mathematica 10.3.1.0, Raspberry Pi 3, Raspbian |