Consider syntax with significant indentation

I have been playing for a while now with ways to make Scala’s syntax indentation-based. I always admired the neatness of Python syntax and also found that F# has benefited greatly from its optional indentation-based syntax, so much so that nobody seems to use the original syntax anymore. I had some good conversations with @lihaoyi at Scala Exchange in 2015 about this. At the time, there were some issues I was not yet happy with, notably how to elide braces of arguments to user-defined functions. I now have a proposal that addresses these issues.

Proposal in a Nutshell

  • If certain keywords are followed by an end-of-line and an indented code block, assume block structure as if braces were inserted around the indented block. Example:

    def f(x: Int) =
      val y = x * x
      y + 1
    

    is treated as equivalent to

    def f(x: Int) = {
      val y = x * x
      y + 1
    }
    

    Or, for match expressions:

    xs match
      case x :: xs1 => ...
      case Nil => ...
    

    Or, using the new syntax for if-then-else:

    if condition then
      println("taken")
      x
    else
      println("not taken")
      y
    

    Or for for-expressions:

    for
      x <- xs
      y <- ys
    yield
      f(x, y)
    
  • Use with + indentation as an alternative way to delimit statement sequences in objects and classes. Example:

    object Obj with
      class C(x: Int) with
        def f = x + 3
      def apply(x: Int) = new C(x)
    
  • Also use with + indentation as an alternative way to pass arguments that were formerly in braces to functions. Examples:

    xs.map with x =>
      x + 2
    
    xs.collect with
      case P1 => E1
      case P2 => E2
    

Motivation

Why use indentation-based syntax?

  • Cleaner typography: We are all used to writing

    def f() = {
      ...
    }
    

    But if we look at it with fresh eyes it’s really quite weird how the braces embrace nothing but empty space. Geometrically the braces point away from the enclosed space .... One could argue that other brace schemes are better. But there are good reasons why the scheme above is the most popular, and in a sense arguments about how to make braces look less awkward are themselves an indication that braces are fundamentally problematic. It’s much simpler and cleaner to get rid of them:

    def f() =
      ...
    
  • Regain control of vertical white space. Most of us are very particular about how we organize horizontal whitespace, with strict rules for indentation and alignment. We are much less demanding on vertical whitespace. With braces, we cannot be, because vertical whitespace is fully determined by the number of closing braces. So two definitions might be separated by a single blank line, or by many (almost) blank lines if there are closing braces. One could avoid this by putting several closing braces on one line, but this looks weird and therefore has not caught on.

  • Ease of learning. There are some powerful arguments why indentation based syntax is easier to learn.

  • Less prone to errors. Braces are a weak signal, much weaker than indentation. So when brace structure and indentation differ, we misunderstand what was written. The code below exhibits a common problem:

    if (condition)
      println("something")
      action()
    

    Indentation fools us into believing that action is executed only if condition is true. But of course that’s not the case, because we forgot to add braces.

  • Easier to change. A situation like the one above happens particularly often when one adds the first println statement after the fact. To protect against modification problems like this, some people suggest always writing braces even if they only enclose a single statement or expression. But that sort of boilerplatey defensive programming is generally not considered good practice in Scala.

  • Better vertical alignment. In the most commonly used brace scheme, an if-then-else reads like this:

    if (cond1) {
      ...
    } else if (cond2) {
      ...
    } else {
      ...
    }
    

    Instead of nicely aligned keywords, we find weird looking closing braces } at the beginning of the most important lines in this code. We are all used to this, but that does not make it good. I have recently changed my preferred style to:

    if (cond1) {
      ...
    }
    else if (cond2) {
      ...
    }
    else {
      ...
    }
    

This solves the alignment issue: The if and the elses are now vertically aligned. But it gives up even more control over vertical whitespace.

Impediments

What are the reasons for preferring braces over indentations?

  • Provide visual clues where constructs end. With pure indentation based syntax it is sometimes hard to tell how many levels of indentation are terminated at some point. In the code below, it is still easy to see that j is on the same nesting level as g. But if there were many more lines between the definitions, it might not be.

    def f =
      def g =
        def h =
          def i = 1
          i
      def j = 2
    

    The proposal includes a way to mitigate this issue: interpreted end comments.

  • Susceptibility to off-by-one indentation. It’s easy to make a mistake so that indentation is off by a space to the left or the right. Without proper guards, indentation based syntax would interpret misalignment as nesting, which can lead to errors that are very hard to diagnose.

    In the proposal, an indented block is always preceded by a keyword that must be followed by a nested expression or statement sequence. So it is impossible to accidentally introduce nesting by veering off to the right. I tried to experiment with the following additional rule, which would make it unlikely to accidentally terminate nesting by veering off to the left:

    • When terminating an indented block by a new statement that starts further to the left than the block, it is checked that the new statement aligns exactly with previous statements at the same indentation level.

    That rule proves to be quite constraining (for instance it would outlaw the chained filter and map operations in the example below), so it is currently not implemented.

  • Editor support. Editors typically have ways to highlight matching pairs of braces. Without that support, it becomes harder to understand the nesting structure of a program. On the other hand, it’s also straightforward to provide navigation help for indentation-based syntax, for instance by providing a command to go to the start of the previous or next definition at the indentation level of the cursor. According to @lihaoyi’s comment, major editors do something like this already.

But none of these points is the strongest argument against indentation. The strongest argument is clearly

  • Cost of change. It would be expensive in many dimensions to change to indentation based syntax. To be sure, the present proposal for indentation based syntax still allows braces, so existing programs would still compile. But there are other costs as well. If the new indentation syntax is not universally adopted, we incur the cost that there will be two visually distinct ways to structure Scala code. People used to one way will be less comfortable reading the other. If the new indentation syntax does take over as a universal standard (which I would expect), we have rendered outmoded all blogs, books, StackOverflow answers and other technical information that used the old syntax. It will take a long time to change all that, and the transition will be awkward.

Proposal in Detail

Expanded use of with

While we are about to phase out with as a connective for types, we propose to add it in two new roles for definitions and terms. For definitions, we allow with as an optional prefix of (so far brace-enclosed) statement sequences in templates, packages, and enums. For terms, we allow with as another way to express function application. f with { e } is the same as f{e}. This second rule looks redundant at first, but will become important once significant indentation is added. The proposed syntax changes are as described in this diff.

Significant Indentation

In code outside of braces, parentheses or brackets we maintain a stack of indentation levels. At the start of the program, the stack consists of the indentation level zero.

If a line ends in one of the keywords =, if, then, else, match, for, yield, while, do, try, catch, finally or with, and the next token starts in a column greater than the topmost indentation level of the stack, an open brace { is implicitly inserted and the starting column of the token is pushed as new top entry on the stack.

If a line starts in a column smaller than the current topmost indentation level, it is checked that there is an entry in the stack whose indentation level precisely matches the start column. The stack is popped until that entry is at the top and for each popped entry a closing brace } is implicitly inserted. If there is no entry in the stack whose indentation level precisely matches the start column an error is issued.

None of these steps is taken in code that is enclosed in braces, parentheses or brackets.
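
As a rough illustration of the stack discipline described above, here is a minimal line-based sketch in brace-style Scala. It is not the implementation from #2488. The names IndentSketch, insertBraces and endsWithOpener are hypothetical, and the sketch makes simplifying assumptions: it works on whole lines rather than tokens, it expects the opening keyword to be separated from the preceding text by whitespace, it assumes indentation uses spaces only, and it does not implement the exemption for code that is already enclosed in braces, parentheses or brackets.

```scala
import scala.collection.mutable.ArrayBuffer

object IndentSketch {
  // Tokens after which an indented block may start (list taken from the rule above).
  private val openers = Set(
    "=", "if", "then", "else", "match", "for", "yield",
    "while", "do", "try", "catch", "finally", "with")

  // Assumption: the opener is the last whitespace-separated word on the line.
  private def endsWithOpener(line: String): Boolean =
    line.trim.split("\\s+").lastOption.exists(w => openers.contains(w))

  /** Returns the source with explicit braces, or an error message if a line
    * starts in a column that matches no enclosing indentation level. */
  def insertBraces(src: String): Either[String, String] = {
    val out = ArrayBuffer.empty[String]
    var stack = List(0)        // indentation levels, innermost first
    var prevOpens = false      // did the previous line end in an opener?
    var error: Option[String] = None
    val lines = src.linesIterator.filter(_.trim.nonEmpty)
    while (error.isEmpty && lines.hasNext) {
      val line = lines.next()
      val col = line.indexWhere(c => !c.isWhitespace)
      if (prevOpens && col > stack.head) {
        // Push the new level and open a brace at the end of the previous line.
        out(out.length - 1) = out.last + " {"
        stack = col :: stack
      } else if (col < stack.head) {
        if (!stack.contains(col))
          error = Some(s"no indentation level matches column $col: ${line.trim}")
        else
          // Pop until the matching level is on top, closing one brace per pop.
          while (stack.head > col) {
            stack = stack.tail
            out += (" " * stack.head) + "}"
          }
      }
      out += line
      prevOpens = endsWithOpener(line)
    }
    // Close every block that is still open at the end of the input.
    while (stack.head > 0) {
      stack = stack.tail
      out += (" " * stack.head) + "}"
    }
    error.toLeft(out.mkString("\n"))
  }
}
```

Applied to the first example of the proposal, IndentSketch.insertBraces("def f(x: Int) =\n  val y = x * x\n  y + 1") returns a Right containing the same definition with { appended to the first line and a closing } on a new last line. A line that dedents to a column not on the stack produces a Left, mirroring the error case in the rule.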

Lambdas with with

A special convention allows the common layout of lambda arguments without braces, as in:

xs.map with x =>
  ...

The rule is as follows: If a line contains an occurrence of the with keyword, and that same line ends in a => and is followed by an indented block, and neither the with nor the => is enclosed by braces, parentheses or brackets, an open brace { is assumed directly following the with and a matching closing brace is assumed at the end of the indented block.

If there are several occurrences of with on the same line that match the condition above, the last one is chosen as the start of the indented block.
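
The lookup this rule requires can be sketched as a scan over a single line. This is only an approximation under stated assumptions: WithLambdaSketch and bracePosition are hypothetical names, bracket nesting is tracked by naively counting characters (string literals and comments are not recognised), and the "followed by an indented block" condition is reduced to comparing the indentation of the next line.

```scala
object WithLambdaSketch {
  // True if the word `w` occurs at index `i` and is not part of a longer identifier.
  private def isWordAt(s: String, i: Int, w: String): Boolean = {
    def idChar(j: Int): Boolean =
      j >= 0 && j < s.length && (s.charAt(j).isLetterOrDigit || s.charAt(j) == '_')
    s.startsWith(w, i) && !idChar(i - 1) && !idChar(i + w.length)
  }

  private def indent(s: String): Int = s.indexWhere(c => !c.isWhitespace)

  /** Index directly after the last unenclosed `with` on `line`, provided the
    * line ends in an unenclosed `=>` and `next` is indented more deeply.
    * Returns None if the rule does not apply. */
  def bracePosition(line: String, next: String): Option[Int] = {
    val trimmed = line.replaceAll("\\s+$", "")
    var depth = 0        // naive nesting depth of (), [] and {}
    var lastWith = -1    // index of the last `with` seen at depth 0
    var i = 0
    while (i < trimmed.length) {
      trimmed.charAt(i) match {
        case '(' | '[' | '{' => depth += 1
        case ')' | ']' | '}' => depth -= 1
        case 'w' if depth == 0 && isWordAt(trimmed, i, "with") => lastWith = i
        case _ =>
      }
      i += 1
    }
    // The line's final `=>` is unenclosed exactly when the depth is back at 0.
    val applies = lastWith >= 0 && depth == 0 && trimmed.endsWith("=>") &&
      next.trim.nonEmpty && indent(next) > indent(line)
    if (applies) Some(lastWith + "with".length) else None
  }
}
```

For the line xs.map with x => followed by an indented body, the sketch reports index 11, the position right after with where the implicit { would go. When several unenclosed occurrences of with appear on the line, the scan keeps the last one, as the rule asks.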

Interpreted End-Comments

If a statement follows a long indented code block, it is sometimes difficult as a writer to ensure that the statement is correctly indented, or as a reader to find out to what indentation level the new statement belongs. Braces help because they show that something ends here, even though they do not say by themselves what. We can improve code understanding by adding comments when a long definition ends, as in the following code:

    def f =
       def g =
          ...
          (long code sequence)
          ...
    // end f

    def h

The proposal is to make comments like this one more useful by checking that the indentation of the // end comment matches the indentation of the structure it refers to. In case of discrepancy, the compiler should issue a warning like:

// end f
~~~~~~
misaligned // end, corresponds to nothing

More precisely, let an “end-comment” be a line comment of the form

// end <id>

where <id> is a consecutive sequence of identifier and/or operator characters and <id> either ends the comment or is followed by a punctuation character ., ;, or ,. If <id> is one of the strings def, val, type, class, object, enum, package, if, match, try, while, do, or for, the compiler checks that the comment is immediately preceded by a syntactic construct described by a keyword matching <id> and starting in the same column as the end comment. If <id> is an identifier or operator name, the compiler checks that the comment is immediately preceded by a definition of that identifier or operator that starts in the same column as the end comment. If a check fails, a warning is issued.
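
A line-based approximation of this check might look as follows. It is a sketch, not the compiler's logic: EndCommentSketch, check and defines are hypothetical names, the "immediately preceding construct" is approximated by walking back to the nearest non-blank line that is not indented more deeply than the comment, and the set of continuation keywords skipped on the way (then, else, catch, finally, yield, do) is an assumption. One known gap: because do is treated as a continuation, an end comment for a do-while loop is not handled properly.

```scala
object EndCommentSketch {
  // "// end <id>", where <id> ends the comment or is followed by '.', ';' or ','.
  private val EndComment = """^(\s*)// end ([^\s.;,]+)\s*([.;,].*)?$""".r

  private val keywords = Set("def", "val", "type", "class", "object", "enum",
    "package", "if", "match", "try", "while", "do", "for")

  // Assumed list of keywords that continue a construct begun on an earlier line.
  private val continuations = Set("then", "else", "catch", "finally", "yield", "do")

  private def indent(s: String): Int = s.indexWhere(c => !c.isWhitespace)
  private def words(s: String): List[String] = s.trim.split("\\s+").toList

  // Does `header` contain a definition of `id`? (definition keywords are an assumption)
  private def defines(header: String, id: String): Boolean = {
    val quoted = java.util.regex.Pattern.quote(id)
    ("""\b(def|val|var|type|class|object|enum|package)\s+""" + quoted + """(?!\w)""").r
      .findFirstIn(header).isDefined
  }

  /** Warning for the end-comment on line `i`, or None if the comment checks
    * out or line `i` is not an end-comment at all. */
  def check(lines: Vector[String], i: Int): Option[String] = lines(i) match {
    case EndComment(lead, id, _) =>
      val col = lead.length
      // Walk back past the body (deeper lines) and past continuation keywords.
      val header = lines.take(i).reverse.find { l =>
        l.trim.nonEmpty && indent(l) <= col &&
          !(indent(l) == col && continuations.contains(words(l).head))
      }
      val ok = header.exists { h =>
        indent(h) == col &&
          (if (keywords.contains(id)) words(h).contains(id) else defines(h, id))
      }
      if (ok) None else Some("misaligned // end, corresponds to nothing")
    case _ => None
  }
}
```

With the lines def f =, an indented def g = and its body, followed by // end f in column 0, check returns None. Shifting the comment one column to the right makes it return the warning text shown earlier.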

Implementation

The proposal has been implemented in #2488. The implementation is quite similar to the way optional semicolons are supported. The bulk of the implementation can be done in the lexical analyzer, looking only at the current token and line indentation. The rule for “lambdas with with” requires some lookahead in the lexical analyzer to check the status at the end of the current line. The parser needs to be modified in a straightforward way to support the new syntax with the generalized use of with.

Example

Here’s some example code, which has been compiled with the implementation in #2488.

object Test with

  val xs = List(1, 2, 3)

// Plain indentation

  xs.map with
       x => x + 2
    .filter with
       x => x % 2 == 0
    .foldLeft(0) with
       _ + _

// Using lambdas with `with`

  xs.map with x =>
      x + 2
    .filter with x =>
      x % 2 == 0
    .foldLeft(0) with
      _ + _

// for expressions

  for
    x <- List(1, 2, 3)
    y <- List(x + 1)
  yield
    x + y

  for
    x <- List(1, 2, 3)
    y <- List(x + 1)
  do
    println(x + y)


// Try expressions

  try
    val x = 3
    1.0 / x
  catch
    case ex: Exception =>
      0
  finally
    println("done")

// Match expressions

  xs match
    case Nil =>
      println()
      0
    case x :: Nil =>
      1
    case _ => 2

// While and Do

  do
    println("x")
    println("y")
  while
    println("z")
    true

  while
    println("z")
    true
  do
    println("x")
    println("y")

  // end while

// end Test

package p with

  object o with

    class C extends Object
               with Serializable with

      val x = new C with
          def y = 3

      val result =
        if x == x then
          println("yes")
          true
        else
          println("no")
          false

    // end C
  // end o

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 664
  • Comments: 95 (33 by maintainers)

Top GitHub Comments

bmjsmith commented, May 21, 2017 (111 reactions)

This offers no objective improvement to the language at a cost that is not insignificant. More overloaded keywords is the last thing that helps newbies and supporting two styles or switching between them is burdensome. Developers in general will not reach a consensus on indentation vs delimiters any more than they will on tabs vs spaces or which line your curly brackets go on. Please don’t facilitate wasting effort debating this (or having to switch) in every project and leave it as it is.

odersky commented, May 21, 2017 (65 reactions)

Wow, this proposal has generated a lot of heat (should have expected that!) I think for now my proposed strategy will be:

  • have this or a variant of it as an optional feature in early versions of Dotty, controlled by a command-line flag.

  • get automatic reformatters that can switch between braces and indentation.

  • experiment with the feature and get feedback from users.

Once the experiments are in, decide on whether we want to keep this.
