Programming languages
Fortran was once the language for scientific computing. However, there are now hundreds of languages to choose from, each with its own jungle of free source code and mathematical libraries. Comparing programming languages is the stuff of holy wars, so one needs to specify the rules of the game carefully when doing so. There is obviously no best overall language, and probably no clear winner for even the most specific tasks. The purpose of these pages is to provide a practical guide to several commonly used programming languages from the point of view of an average scientific researcher who needs to carry out a real computational task. Each language will be compared in terms of how it performs off-the-shelf or with commonly used libraries, not how it performs in the hands of a highly skilled programmer who understands the inner workings of the compiler and can leverage obscure aspects of the language to generate highly optimized code. The latter topic is certainly worthy of discussion, but it is of secondary concern to most working researchers, who would be better served by first learning a few best practices and becoming familiar with a few free numerical libraries.
The first major division in the world of programming languages is between compiled and interpreted languages. Compiled languages generally produce faster code, but typically involve longer programs and require a little more knowledge about the internals of the system. C, C++, and Fortran are by far the most widely used compiled languages for scientific computation. Generally speaking, Fortran is faster for pure number-crunching tasks, followed by C, while C++ is somewhat slower off-the-shelf but more flexible, allowing one to create complex data structures in full object-oriented elegance. Periodically, someone asks about Fortran vs C++ on comp.lang.fortran or comp.lang.c++. Probably the most appropriate response is that both languages are very good choices for scientific work and can be very fast with a bit of optimization. Having said that, each language is taking steps in the other's direction. Since the Fortran 90 specification, Fortran has continued to add support for object-oriented concepts, and with the development of expression templates, C++ has made strides in the direction of computational efficiency.
Interpreted languages are typically easier to use and come with more full-featured built-in computational routines than compiled languages, but they are slower. Matrix-based environments such as Matlab and Octave come complete with built-in functions for most linear algebra routines, optimization routines, etc. Statistical packages such as R provide a huge toolkit of common statistical functions that one would otherwise have to either write by hand in C++ or Fortran or obtain from an external library, neither of which is trivial. Scripting languages such as Perl can be excellent for tasks such as cutting datasets or downloading and processing files from the internet. The Ruby language is very elegant, resulting in programs that are quite small but still very readable; however, it cannot rival the performance of a compiled language for computational tasks.
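As a rough illustration of the "write it by hand or use a built-in routine" trade-off, the following minimal sketch computes a sample standard deviation both ways. It uses Python, which is not one of the languages discussed on this page and was chosen only because the example runs without any external library. The function name `sample_stdev` and the data values are made up for illustration.

```python
import math
import statistics


def sample_stdev(values):
    """Hand-written sample standard deviation (n - 1 denominator)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_dev / (n - 1))


# Made-up sample, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

by_hand = sample_stdev(data)          # the routine written above
built_in = statistics.stdev(data)     # the routine shipped with the language

print("by hand: ", by_hand)
print("built in:", built_in)
```

The two results should agree to within floating-point rounding. The hand-written version is short here, but every such routine has to be written, tested, and maintained, which is the cost the paragraph above refers to.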
For more information about each language, such as best practices, commonly used mathematical libraries, and useful external resources, see the following pages: