Language Productivity

Programmer Efficiency

We have talked a lot about running times: how fast is the language/compiler output/runtime environment? This is easy to measure, empirical, and sounds cool.

But very little code is CPU limited. Most code spends almost all of its time waiting for memory, disk, database, network, user response, etc. So who cares about speed?

Programmer Efficiency

Programmers are expensive: a language should make it fast and easy for a programmer to write correct code.

This is much harder to measure, and I can't find much concrete to say here…

Programmer Efficiency

L. Prechelt, An Empirical Comparison of Seven Programming Languages, in Computer, vol. 33, pp. 23–29, 2000.

Compares how programmers completed various tasks in seven languages. Note Figures 15, 16 in the draft or figure 6 in the paper, which are about time to complete the tasks.

Programmer Efficiency

S. Nanz, C. A. Furia, A Comparative Study of Programming Languages in Rosetta Code, arXiv:1409.0252v4, 2015.

Note results RQ1: lines of code to complete tasks.

Programmer Efficiency

L. Lavazza, S. Morasca, D. Tosi, An Empirical Study on the Factors Affecting Software Development Productivity, e-Informatica Software Engineering Journal, 2018.

Note Figures 3 and 4 about productivity per language.

Programmer Efficiency

Another approach: Programming languages ranked by expressiveness, 2013. This evaluates the size of source control commits, with the assumption that one commit == one piece of functionality. How much code is required to get one piece of functionality completed?

Programmer Efficiency

A related question: A Large Scale Study of Programming Languages and Code Quality in Github, 2014. Presumably writing bug-free code is related to how effective a programmer can be?

Note RQ1: Are some languages more defect prone than others? Spoiler: yes. Also RQ2: Which language properties relate to defects?

Programmer Efficiency

Another survey of results: Static v. dynamic languages literature review.

The conclusion? I don't know.

Higher-level is probably good. Dynamic typing might be a net win.

Library Availability

One thing that definitely lets programmers get things done faster is library support. [citation needed]

Most languages have some kind of tool to help find and install 3rd party packages. Of course, these vary in quality and maintenance.

Library Availability

Some examples:

JavaScript npm: npm install encodeurl
Python PyPI: pip install Jinja2
Haskell Cabal: cabal install aeson
Scala sbt.
Rust Cargo.

Library Availability

A language that has a good ecosystem of 3rd party packages that do useful things can make programmers much more effective… if they know how to find them and read the docs.

High- vs Low-Level

Where a language is on the high-level vs low-level scale is almost certainly a factor in how much a programmer has to do to get work done.

Higher-level languages have constructs that let a programmer do more with less code. That should mean that code gets written faster.
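As a small illustration (mine, not taken from the studies above), here is the same word count written twice in Python: once with explicit loops and bookkeeping, and once by handing the job to the standard library's collections.Counter. The function names and the sample text are made up for the example.

```python
from collections import Counter


def word_counts_low_level(text):
    """Count words with an explicit loop and manual dictionary updates."""
    counts = {}
    for word in text.split():
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def word_counts_high_level(text):
    """Count words by letting collections.Counter do the bookkeeping."""
    return Counter(text.split())


sample = "the quick fox jumps over the lazy dog the end"
# Both versions agree; the second one simply has less to get wrong.
assert word_counts_low_level(sample) == dict(word_counts_high_level(sample))
```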

High- vs Low-Level

Sometimes less code is actually faster code.

Example problem: take an array of values and create a new array by calculating \(\sin(x-1)+1\) of each element.

I have implemented this in C and several ways in Python.

High- vs Low-Level

The C implementation is predictable enough:

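#include <math.h>
#include <stdlib.h>

/* N (the array length) is assumed to be defined elsewhere. */
double *do_calc(double *arr) {
    double *result = malloc(N * sizeof(double));
    for( int i=0; i<N; i++ ) {
        result[i] = sin(arr[i] - 1) + 1;
    }
    return result;  /* the caller is responsible for free() */
}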

In typical C style, it spells out every detail of what should happen (including the sequential order of the element calculations, which the compiler has to see past in order to produce SIMD instructions).

High- vs Low-Level

In Python, we can do the same thing:

import math
import numpy as np

def do_calc_python(a):
    result = np.empty(a.shape)
    for i in range(a.size):
        result[i] = math.sin(a[i] - 1) + 1
    return result

… or use the NumPy library to do it in one line:

def do_calc_numpy(a):
    return np.sin(a - 1) + 1

High- vs Low-Level

Or there's a Python library NumExpr that avoids allocating memory for each intermediate result (a - 1, np.sin(a - 1), np.sin(a - 1) + 1) as NumPy must (because it strictly evaluates each operation).

import numexpr

def do_calc_numexpr(a):
    # evaluate() looks up `a` in the calling function's local variables
    expr = 'sin(a - 1) + 1'
    return numexpr.evaluate(expr)
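The difference between evaluating one operation at a time and evaluating the whole expression in a single pass can be sketched in plain Python. This is only an analogy using lists, not how NumPy or NumExpr are actually implemented, and the function names are hypothetical.

```python
import math


def calc_stepwise(values):
    """Evaluate one operation at a time, building a full list at each step."""
    shifted = [x - 1 for x in values]        # temporary for (a - 1)
    sines = [math.sin(x) for x in shifted]   # temporary for sin(a - 1)
    return [s + 1 for s in sines]            # final result


def calc_fused(values):
    """Evaluate the whole expression per element, with no intermediate lists."""
    return [math.sin(x - 1) + 1 for x in values]


data = [0.0, 1.0, 2.5]
# Same arithmetic in the same order, so the results match exactly.
assert calc_stepwise(data) == calc_fused(data)
```

The stepwise version touches memory three times per element; the fused version does it once. On large arrays that difference in memory traffic can matter.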

High- vs Low-Level

How do their speeds compare?

Implementation	Name	Time
C with `gcc -O0`	`do_calc`	1.53 s
C with `gcc -O3`	`do_calc`	1.29 s
Python loop	`do_calc_python`	31.97 s
Python NumPy ops	`do_calc_numpy`	2.20 s
Python NumExpr	`do_calc_numexpr`	0.32 s
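The harness behind these numbers isn't shown here. One way to collect comparable timings, as a minimal sketch using only the standard library, is below. The helper time_call, the function do_calc_list, and the input size are all assumptions for illustration, so the printed time will not match the table.

```python
import math
import timeit


def do_calc_list(values):
    """Pure-Python version of the calculation, on a plain list."""
    return [math.sin(x - 1) + 1 for x in values]


def time_call(func, arg, repeats=3):
    """Return the best wall-clock time (in seconds) over several single runs."""
    return min(timeit.repeat(lambda: func(arg), number=1, repeat=repeats))


values = [i / 1000 for i in range(100_000)]  # small made-up input
elapsed = time_call(do_calc_list, values)
print(f"do_calc_list: {elapsed:.4f} s")
```

Taking the minimum of several runs is a common choice because slower runs are usually caused by other activity on the machine, not by the code being measured.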

High- vs Low-Level

How does NumExpr take 1/4 the time of raw C code?

It knows what's happening and automatically multi-threads its calculations. It was using all of my processor cores.

I didn't have to do anything, just express what I wanted at a high-enough level, to a smart-enough library.

[When restricted to one core, it was between gcc -O0 and the standard NumPy implementation: 1.96 s.]
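The general idea of splitting an element-wise calculation into chunks and handing them to a pool of workers can be sketched with the standard library. This is not NumExpr's implementation: the names and the chunking scheme are made up, and in standard CPython the global interpreter lock means these pure-Python threads will not actually run the math in parallel. It only shows the shape of the technique.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def calc_chunk(chunk):
    """Apply sin(x - 1) + 1 to one slice of the input."""
    return [math.sin(x - 1) + 1 for x in chunk]


def calc_chunked(values, workers=4):
    """Split the input into chunks, process them in a pool, rejoin in order."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(calc_chunk, chunks)  # map preserves chunk order
        return [y for part in results for y in part]


data = [i / 10 for i in range(25)]
assert calc_chunked(data) == calc_chunk(data)
```

Because each element is independent of the others, the chunks can be computed in any order and stitched back together. A high-level expression makes that independence obvious to the library.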

High- vs Low-Level

The lesson: sometimes library authors (along with language implementers) have put a lot of work into optimizing what they do. Sometimes it's best to write a high-level description of what should be done, and then optimize from that if necessary.

Programming Paradigm

We talked about paradigm at the start of the course:

Imperative programming: programmer gives statements that execute in order and change the program's states. Focus is on how a program should operate.

Declarative programming: programmer expresses the computation that needs to be done without explicitly saying how the calculation is to be done.
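A small contrast, sketched in Python (which is itself an imperative language, so the second version is only closer to the declarative style). Both function names are made up for the example.

```python
def total_even_squares_imperative(numbers):
    """Imperative: statements run in order and update state step by step."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total = total + n * n
    return total


def total_even_squares_declarative(numbers):
    """More declarative: describe the result as a single expression."""
    return sum(n * n for n in numbers if n % 2 == 0)


# 0 + 4 + 16 + 36 + 64
assert total_even_squares_imperative(range(10)) == 120
assert total_even_squares_declarative(range(10)) == 120
```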

Programming Paradigm

We saw Haskell: a functional language. Functional programming is a subset of declarative.

Other declarative (programming) subcategories: logic and constraint programming (Prolog), query languages (SQL), hardware description (VHDL, Verilog).

Other examples of declarative (programming?) languages: Makefiles, HTML, CSS, XSLT, spreadsheets.

Programming Paradigm

Object-oriented programming is a subcategory of imperative: you control the behaviour explicitly, but logic is organized into classes/methods instead of standalone functions (procedural programming).
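The same logic organized both ways, as a minimal Python sketch (the names are illustrative only):

```python
import math


# Procedural: the data is a plain value, the logic is a standalone function.
def circle_area(radius):
    return math.pi * radius ** 2


# Object-oriented: the same logic is a method on a class that owns the data.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


assert circle_area(2.0) == Circle(2.0).area()
```

Either way, the programmer still says explicitly how the result is computed; only the organization of the code changes.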

Many languages take features from different paradigms and are difficult to categorize: OCaml, Scala, Erlang, etc.

Language Features

Hopefully that was interesting. Some things I hope are takeaway lessons…

You can do the same calculations in any language (as long as it's Turing complete, which essentially every general-purpose programming language is), but the language can make things easier or harder.

Language Features

There are some things that are really hard for people and the language can help us get it right: memory safety, concurrency.

Different programming languages express algorithms very differently. Some of the language design choices affect how efficient the programmer can be.

Language Features

The way languages are designed (with static types, with safe/easy concurrency) and implemented (optimizing compilers, JITs) has a huge influence on their performance.

The language can make it easier or harder to use all of the functionality of modern processors.

Language Features

Programming languages have features that are unique or unusual. Using them can help you become a better programmer. In other languages, similar functionality may be available from libraries, or by structuring your code in the right way.

Don't keep writing the first language you learned in the syntax of a new language.
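For example, here is Python written the way someone arriving from C might write it, next to a version that uses what the language offers. Both functions are made up for illustration and assume the two lists have the same length.

```python
def pair_sums_c_style(xs, ys):
    """Index-based loop, as someone coming from C might write it."""
    result = []
    i = 0
    while i < len(xs):
        result.append(xs[i] + ys[i])
        i += 1
    return result


def pair_sums_idiomatic(xs, ys):
    """The same computation using zip and a list comprehension."""
    return [x + y for x, y in zip(xs, ys)]


assert pair_sums_c_style([1, 2, 3], [10, 20, 30]) == [11, 22, 33]
assert pair_sums_idiomatic([1, 2, 3], [10, 20, 30]) == [11, 22, 33]
```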