Languages Matter

My overall thesis for the course: the choice of (or design of) programming languages matters in more than a superficial way.

Different languages are designed with different problems in mind: they are (of course) often very good at those things. Good languages are good at other things too.

Languages Matter

Hopefully, learning a couple of new languages helped you look at programming problems differently. The overall approach to solving problems in different languages will force you to approach a problem from another perspective.

When encountering a new language in the future, try to learn the ways that language likes to solve problems. Use them: don't just keep writing C++/​Java/​C# with a different language's syntax.

Domain-Specific Languages

The languages we have discussed so far have been general-purpose programming languages:

  • Programming language: end result is a program.
  • General purpose: designed to be used for many/​all programming problems. Not specialized to one problem domain.

Domain-Specific Languages

Domain-Specific Languages (or DSLs) are languages designed to solve a particular kind of problem, and likely not others.

When doing a lot of work in a specific problem domain, a general purpose language can be too general. If you are always doing a few very specific things, it would be nice to have a language designed just for that.

Domain-Specific Languages

Some examples you may have encountered:

  • LaTeX for document creation. A markup language, like HTML. Designed to create beautiful layouts and formulas.
  • Shell scripts to automate command-line processes.
  • SQL for working with relational databases.
  • Puppet configuration language for describing system configurations.

Domain-Specific Languages

  • R for statistical analysis. Built-in functions do things like read a CSV file, create a histogram, calculate standard deviation.
  • CSS for document style/​appearance information.
  • VHDL, Verilog for describing/​designing digital circuits.
  • POV-Ray files for 3D models.
  • Markdown, YAML, XML, INI files, etc. for documents and configuration information.

Domain-Specific Languages

DSLs may or may not be programming languages. Other possible results: documents, 3D models, animations, computer chips, ….

Using DSLs can be wonderful (if they do the whole job you need) or a mess (if they do only most of it).

Domain-Specific Libraries

Problems that can be solved with a DSL can also be solved with a general-purpose language. Maybe you should use one instead?

Good: A general-purpose language is more flexible and can (nicely) solve other problems you encounter.

Bad: It might not be as good at solving the specific problem you're working on most of the time, or it might be harder to approach those problems.

Domain-Specific Libraries

One solution is to use a general-purpose language and a library that is good at solving the problem you have.

A big library can start to feel like a DSL: the code you write in your (general-purpose) programming language can start to use the library almost everywhere and look very different.

Domain-Specific Libraries

Some examples:

  • Pandas (and friends) for data analysis (in Python), instead of R.
  • Rake for build automation (in Ruby), instead of Makefiles or Ant.
  • A PDF library to generate documents, instead of LaTeX.
  • Chef cookbooks are Ruby code, instead of a DSL.
  • A file with code (in your main general purpose language) that builds a configuration object, instead of a separate configuration file.
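The last item might look something like the sketch below. Everything in it is hypothetical (the ServerConfig class, its fields, the build_config function, and the example.com host are invented for illustration), but it shows the idea: the configuration is an ordinary object built by ordinary code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical settings for an imaginary web service."""
    host: str = "localhost"
    port: int = 8000
    debug: bool = False
    allowed_hosts: tuple = ()


def build_config(environment: str) -> ServerConfig:
    # Ordinary conditionals replace whatever templating or override
    # mechanism a separate configuration format would need.
    if environment == "production":
        return ServerConfig(host="0.0.0.0", port=80,
                            allowed_hosts=("example.com",))
    return ServerConfig(debug=True)


config = build_config("production")
print(config.port)
```

The tradeoff is the usual one: the configuration gets the full power of the language, including the power to be as complicated as any other code.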

Domain-Specific Libraries

Some example code in R vs Python+Pandas that both read a CSV file and do a least-squares best linear fit:

R:

data <- read.csv(file="data.csv", header=TRUE)
summary(data)
fit <- lm(data$y ~ data$x)
fit

Python + Pandas:

import pandas as pd
from scipy.stats import linregress

data = pd.read_csv('data.csv')
print(data.describe())
fit = linregress(data['x'], data['y'])
print(fit)
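For contrast, here is a rough sketch of the same fit in Python with only the standard library, to suggest how much the domain-specific library (or language) is doing for you. The data is made up and inlined in place of data.csv, and the least_squares helper is a name invented for this example.

```python
import csv
import io

# Made-up data standing in for data.csv; the points lie on y = 2x + 1.
CSV_TEXT = "x,y\n1,3\n2,5\n3,7\n4,9\n"


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Parse the CSV by hand: every value arrives as a string.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
xs = [float(row["x"]) for row in rows]
ys = [float(row["y"]) for row in rows]

slope, intercept = least_squares(xs, ys)
print(slope, intercept)
```

It works, but the parsing, type conversion, and statistics are all yours to write and get right.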

Domain-Specific Libraries

In some cases, the DSL is simply the better tool for the job.

In others, a general-purpose language and good library will be much better.

Opinion: I'm rarely completely happy when working with a DSL. The limitations are eventually going to be a problem. Sometimes, the DSL is the least-worst option.

Languages Don't Matter

Maybe choice of programming language doesn't matter?

All general-purpose programming languages are Turing complete: anything that can be computed in one can be computed in any other. Clever compiler writers can make code in any language execute fast (except when they can't).

Languages Don't Matter

We have seen that it's possible to write imperative code in Haskell (sort of, with monads). It's possible to write very functional-looking code in any language (e.g. functools and itertools from Python).

In either case, you have to have some idea what's actually happening when your code runs, so you don't write inefficient code.
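As a small illustration of the Python side, here is the same calculation (the sum of the squares of the first n even numbers) written imperatively and then in a functional-looking style with functools and itertools. The function names are made up for this sketch.

```python
from functools import reduce
from itertools import count, islice


def sum_even_squares_loop(n):
    # Imperative version: explicit loop and mutable accumulators.
    total = 0
    found = 0
    i = 0
    while found < n:
        if i % 2 == 0:
            total += i * i
            found += 1
        i += 1
    return total


def sum_even_squares_functional(n):
    # Functional-looking version: a lazy pipeline folded with reduce.
    evens = filter(lambda i: i % 2 == 0, count())  # infinite, but lazy
    squares = map(lambda i: i * i, evens)
    return reduce(lambda acc, x: acc + x, islice(squares, n), 0)


print(sum_even_squares_loop(3), sum_even_squares_functional(3))
```

Knowing what actually happens matters here: filter and map are lazy, so the infinite count() is harmless, but wrapping evens in list() would never finish.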

Languages Don't Matter

Library writers are starting to realize that the same problems can be solved in any language. A library (especially one that starts to feel like a DSL) could implement the same API in multiple languages.

Then the programmer's choice of language really doesn't matter very much.

There are several multi-language APIs…

Languages Don't Matter

For example, Document Object Model for working with HTML/​XML documents from any language (but most common in JavaScript).

e.g. I wrote this in Python, but it's equally valid JavaScript and, with type declarations and semicolons added, Java, etc.

pars = document.getElementsByTagName("p")
new_em = document.createElement("em")
new_em.appendChild(document.createTextNode("some text"))
pars.item(0).appendChild(new_em)
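To actually run those lines in Python, one option is the standard library's xml.dom.minidom, which implements the DOM API. This is a sketch with a made-up two-paragraph document; in a browser, document would already exist.

```python
from xml.dom.minidom import parseString

# A made-up document to operate on.
document = parseString("<html><body><p>first</p><p>second</p></body></html>")

# The same four DOM calls as above, unchanged.
pars = document.getElementsByTagName("p")
new_em = document.createElement("em")
new_em.appendChild(document.createTextNode("some text"))
pars.item(0).appendChild(new_em)

# Serialize the root element to see the modified tree.
result = document.documentElement.toxml()
print(result)
```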

Languages Don't Matter

e.g. Spark for big data analysis. This is Scala, but the Python version is nearly identical (drop the var, write True), and Java or R need only some syntax changes:

var df = spark.read.option("header", true).csv("books.csv")
df = df.select("title", "author")
df = df.groupBy("author").count()
df.write.csv("author_counts.csv")

Languages Don't Matter

A few more…

  • LINQ to query data (arrays, databases, etc) in .NET languages.
  • Apache Arrow defines a standard memory representation for data across languages, so code in different languages can share in-memory data without conversion.

Languages Don't Matter

These cross-language APIs are much more the exception than the rule, but they show that library design can sometimes be more important to a programmer than the underlying language.

Conclusion

The choice of language is (in my mind) basically about three things: writing correct code; writing it quickly; having it execute quickly.

Conclusion

We have seen various ways that tradeoffs could be made:

  • imperative/​functional
  • static/​dynamic types/​binding
  • high-/​low-level
  • better/​worse compiler
  • language features to make code more efficient, more explicit, more readable, etc.

Conclusion

If there were a single right way to choose a language, there wouldn't be so many options.

There are projects where the language choice matters a lot. There are ones where the choice of library matters more.

The way you make the various tradeoffs will depend on the project and the programmer(s).

Conclusion

Sometimes programming languages are important.

Sometimes they aren't.