Sequence expressions and side-effects in F#

September 6, 2013 at 4:11 PM, Michele Mottini

I discovered an interesting problem when using a sequence expression to read text. The original case was more complex, but here is a simple example that reproduces it: a function that reads all the lines from a text stream and returns them as a sequence of strings, using a sequence expression:

let readlines (reader: System.IO.TextReader) = 
  seq {
    let line = ref (reader.ReadLine())
    while !line <> null do 
      yield !line
      line := reader.ReadLine()
  }

Reading a two-line text:

readlines (new System.IO.StringReader("A\nB"))

produces as expected:

val it : seq<string> = seq ["A"; "B"]

Now define a simple function that checks whether a sequence is empty and returns either the string “Empty” or the string “N items”, where N is the number of elements:

let test s = 
  if Seq.isEmpty s then 
    "Empty"
  else
    sprintf "%d items" (Seq.length s)

Apply it to the same sequence as above:

readlines (new System.IO.StringReader("A\nB")) |> test

and the result is

val it : string = "1 items"

which is clearly wrong: the sequence contains 2 lines, not 1.

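The problem is that a sequence expression is evaluated lazily, and its body runs again each time the sequence is enumerated. The call to Seq.isEmpty starts one enumeration and reads the first line, moving the text reader to the second line (i.e. causing a side-effect). Seq.length then starts a new enumeration against the same reader, which is already past the first line, so it sees only the remaining line.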

In this case the problem is with a text reader, but any call with side-effects inside a sequence expression can cause the same unexpected behavior when the sequence is enumerated more than once.
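
The re-evaluation can be made visible without any text reader. The following is a minimal sketch, not part of the original case: the counter `evaluations` and the sequence `numbers` are hypothetical names introduced only for illustration.

```fsharp
// Hypothetical counter, used only to observe how often the body starts running.
let evaluations = ref 0

let numbers =
  seq {
    // Side effect: executed once per enumeration, not once per sequence.
    evaluations.Value <- evaluations.Value + 1
    yield 1
    yield 2
  }

// Seq.isEmpty starts one enumeration and stops after the first element.
let isEmptyResult = Seq.isEmpty numbers

// Seq.length starts a second, independent enumeration from the top of the body.
let count = Seq.length numbers

// At this point evaluations.Value is 2: the body ran once for each enumeration.
printfn "empty: %b, count: %d, evaluations: %d" isEmptyResult count evaluations.Value
```

Here the side effect is harmless because it does not change what the sequence yields. With the text reader it does, since every enumeration consumes lines from the same underlying stream.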

One possible solution is to cache the resulting sequence:

let readlines (reader: System.IO.TextReader) = 
  seq {
    let line = ref (reader.ReadLine())
    while !line <> null do 
      yield !line
      line := reader.ReadLine()
  } |> Seq.cache

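This prevents the same element from being read twice: Seq.cache stores each element the first time it is produced and replays the stored elements on later enumerations, which fixes the problem.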
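
As a quick check, the cached version can be run through the same test function. This is a sketch under a few assumptions: the function is renamed `readlinesCached` here (a hypothetical name, to avoid clashing with the earlier definition), and `line.Value` is used in place of `!line` and `:=`, which is equivalent.

```fsharp
open System.IO

// Cached variant of the reader function; line.Value is equivalent to !line.
let readlinesCached (reader: TextReader) =
  seq {
    let line = ref (reader.ReadLine())
    while line.Value <> null do
      yield line.Value
      line.Value <- reader.ReadLine()
  } |> Seq.cache

// Same test function as in the article.
let test s =
  if Seq.isEmpty s then
    "Empty"
  else
    sprintf "%d items" (Seq.length s)

let lines = readlinesCached (new StringReader("A\nB"))

// Seq.isEmpty reads "A" into the cache; Seq.length replays "A" from the
// cache and then continues reading "B" from the reader.
let result = test lines
printfn "%s" result   // 2 items
```

Two things to keep in mind with this approach: the cache keeps every element it has read in memory, and it only protects enumerations of that one cached sequence value. Calling the function again on the same reader would still start from wherever the reader currently is.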

Another solution is to create the reader within the sequence expression:

let readlinesStr str = 
  seq {
    use reader = new System.IO.StringReader(str)
    let line = ref (reader.ReadLine())
    while !line <> null do 
      yield !line
      line := reader.ReadLine()
  }

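This causes a new reader to be created each time the sequence is enumerated, so every enumeration starts from the beginning of the text and side-effects no longer carry over from one enumeration to the next.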
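
To illustrate, the same sequence value can be enumerated twice and should yield the same lines both times. This is a small sketch; `firstPass` and `secondPass` are hypothetical names, and `line.Value` stands in for `!line` and `:=`.

```fsharp
// Reader created inside the sequence expression, so each enumeration gets its own.
let readlinesStr (str: string) =
  seq {
    // 'use' disposes the reader when the enumeration ends or is abandoned.
    use reader = new System.IO.StringReader(str)
    let line = ref (reader.ReadLine())
    while line.Value <> null do
      yield line.Value
      line.Value <- reader.ReadLine()
  }

let lines = readlinesStr "A\nB"

// Each call to Seq.toList enumerates the sequence from scratch with a fresh reader.
let firstPass = Seq.toList lines
let secondPass = Seq.toList lines

printfn "%A %A" firstPass secondPass   // ["A"; "B"] ["A"; "B"]
```

The trade-off compared with Seq.cache is that the text is read again on every enumeration, which may matter if the source is large or expensive to open.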

Posted in: Programming
