Skip to content

Conversation

lihaoyi
Copy link
Contributor

@lihaoyi lihaoyi commented Feb 28, 2025

No description provided.

```

### Javascript
Javascript's expression `...` syntax works identically to this proposal. In Python, you can mix
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python should be Javascript

@som-snytt
Copy link

On the pattern side, the old gotcha is scala/bug#7623 (which has a lint on Scala 2). That is where two pat vars are satisfied by a single Seq, i.e., there is a kind of misalignment. (The example may temper ambition for patterns.)

@Atry
Copy link
Contributor

Atry commented Mar 6, 2025

Do you want to cite #41 ? There are a lot of previous discussions on this feature.

This SIP is a renamed version of #41 , and also adds an intermediate step to create a temporary seq.

  • applyBegin -> newBuilder
  • applyNext -> addOne
  • applyNextSeq -> addAll
  • applyEnd -> result

@kyouko-taiga kyouko-taiga changed the title SIP-XX - Flexible Varargs SIP-70 - Flexible Varargs Mar 21, 2025
@prolativ
Copy link
Contributor

I started to wander if this SIP could help resolve this issue: scala/scala3#18009
In short, the problem there is that an invocation of a curried method added by a refinement on a subtype of Selectable, e.g.

mySelectable.foo(args1*)(args2*)

should get desugared to something like

mySelectable.applyDynamic("foo")(args1*, args2*).asInstanceOf[Foo]

which is now illegal, but would be when this SIP gets implemented

@lihaoyi
Copy link
Contributor Author

lihaoyi commented Jun 29, 2025

@odersky @sjrd @lrytz have any of you had a chance to look at this? It's been a month since reviewers were assigned so I'm hoping to get some feedback

Copy link
Contributor

@odersky odersky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's proposed:

  1. Constructing varargs with multiple splices xs*, interspersed with normal values
  2. Deconstructing sequences with a single vararg pattern that can be somewhere in the middle of the sequence, not just at the end.

These are clear quality of life improvements. The construction part is the more important one, and it should be straightforward to implement. The pattern matching part is probably also easy to implement, but it might need some care to get good performance for linear sequences. E.g you have a long list and match with List(x, y, ys*, 2, 3) you want to avoid multiple trips to the end of the list to match the 2 and 3. Nevertheless, it looks quite doable.

@lrytz
Copy link
Member

lrytz commented Jul 2, 2025

I also think this SIP is a very nice improvement.

You mention / suggest to use IArray, which is an opaque type alias and not part of the collection hierarchy. sum(IArray.newBuilder[Int].addOne(1).result()*) wraps the resulting array using IArray.wrapIntArray, which converts it to an immutable.ArraySeq. So it's probably better to use ArraySeq directly. Or maybe Vector?

  • ArraySeq: can wrap a primitive array, minimal memory overhead
  • Vector: better if the collections is later used for appending / prepending, concatenation etc. Memory and computational overhead is small. Primitives are boxed.

I'm not sure if the newBuilder call needs an explicit type argument, or if we can have that inferred by the typer. In Scala 3:

scala> collection.immutable.ArraySeq.newBuilder.addOne(1).result()
-- [E172] Type Error: ----------------------------------------------------------
1 |collection.immutable.ArraySeq.newBuilder.addOne(1).result()
  |                                        ^
  |                                        No ClassTag available for Any
1 error found

Interestingly, the same line works on Scala 2, the compiler infers newBuilder[Int]. Is that a known limitation @odersky?

If an explicit type argument is needed, the SIP should specify how that type is obtained.


For patterns, we'll need to update the spec to say how case E(p1, ..., pN, s*, u1, ..., uM) translates. I think the section is https://scala-lang.org/files/archive/spec/3.4/08-pattern-matching.html#pattern-sequences, the spec seems not to be fully up to date though, it doesn't reflect scala/scala3#11240.

This SIP doesn't change repeated parameters (of case classes), a repeated parameter will still be last in a case classs definition. An unapplySeq declaration should also continue to have the corresponding return type (T1, ..., Tm, Seq[S]).

So the spec needs to map the new pattern shape onto to current case class parameter list shape (or unapplySeq return type shape). Note that T1, ..., Tm can be arbitrary types, all different from each other and different from S. So I'm not sure the proposal of translating case E(p1, ..., pN, s*, u1, ..., uM) to case VarargsMatcher(Seq(p1, ..., pN), s, Seq(u1, ..., uM)) would work.

@lihaoyi
Copy link
Contributor Author

lihaoyi commented Jul 7, 2025

You mention / suggest to use IArray, which is an opaque type alias and not part of the collection hierarchy. sum(IArray.newBuilder[Int].addOne(1).result()*) wraps the resulting array using IArray.wrapIntArray, which converts it to an immutable.ArraySeq. So it's probably better to use ArraySeq directly. Or maybe Vector?

The motivation for IArray was to try and decouple the language feature from the concrete implementation of the Scala collections library: IArray seems like a relatively simple collection with relatively good performance characteristics. In particular, I wanted to:

  • Avoid using Seq which defaults to List which is from the start a lot less cache-friendly than IArray.
  • Vector has the downside of being a pretty specific collection in the standard library, with a very-non-trivial implementation, and AFAIK is something the Scala language was not aware of up to this point.

This SIP doesn't change repeated parameters (of case classes), a repeated parameter will still be last in a case classs definition.

Yes, definitions are unchanged. Not just for case classes, but for method defs as well, they all can only have one vararg* parameter at the end

An unapplySeq declaration should also continue to have the corresponding return type (T1, ..., Tm, Seq[S]).

Yes that is right. The trailing Seq[S] can be broken up into a smaller Seq in the middle and loose elements before/after it during destructuring, but these elements all continue to be homogeneously typed as S, and the signature of the unapplySeq does not need to change

@odersky
Copy link
Contributor

odersky commented Jul 7, 2025

Interestingly, the same line works on Scala 2, the compiler infers newBuilder[Int]. Is that a known limitation @odersky?

No, I did not know that.

Generally, I think it's better of the SIP does not specify a precise implementation scheme. The scheme might change in the future, and we should not have to change the SIP because of that. The IArray code can be considered for illustration, of course.

In terms of what we have, it looks like basing the whole thing on ArrayBuilder might work.

@lrytz
Copy link
Member

lrytz commented Jul 14, 2025

IArray seems like a relatively simple collection with relatively good performance characteristics

It's ArraySeq that ends up being used – which seems a practical choice to me. Mentioning IArray is just a bit confusing, it requires the presence of an implicit conversion.

The spec says when using a sequence argument xs*, the repeated parameter is considered to be of type scala.Seq[S]. I agree using the default scala.Seq factory (List) would not be great. Maybe the SIP/spec can leave it as an implementation detail; f(x, xs*, y) is translated to Seq.newBuilder.addOne(x).addAll(xs).addOne(y).result(), but a compiler may use a different factory such that the result conforms to scala.Seq[S].

The trailing Seq[S] can be broken up into a smaller Seq in the middle and loose elements before/after it during destructuring, but these elements all continue to be homogeneously typed as S, and the signature of the unapplySeq does not need to change

Right. I was thinking what happens when using subpatterns and then got a bit lost in spec. The VarargsMatchHelper seems to work fine for that.

def f(x: Any): String = x match {
  case Seq(Some(_), y: String, xs*, 1) => y
}

// translates to

def f(x: Any): String = {
  val VarargsMatcher = new VarargsMatchHelper(2, 1)
  x match {
    case VarargsMatcher(Seq(Some(_), y: String), xs, Seq(1)) => y
  }
}

@odersky
Copy link
Contributor

odersky commented Sep 2, 2025

There is now an implementation of most of the proposal.

One doubt I am having is spreads of Option. We do not allow an option to be passed individually to a vararg parameter. I.e. the following would not work:

  def foo(x: Int*) = x.sum
  val xo = Some(2)
  f(xo*)

That's an error since the argument of a spread operator must be a Seq or an Array. Do we really want to change that? That would affect normal spreads not just spreads embedded in a longer list. There might be use cases for embedded options, but for a single spread passed to a vararg it feels a bit weird. Alternatively, we can demand an explicit toList, i.e.

  f(xo.toList*)

would work today. I'd be interested in getting the committee member's opinion on this aspect before I try to implement it.

@lihaoyi
Copy link
Contributor Author

lihaoyi commented Sep 2, 2025

My motivation for including a special case for Option is that empirically it's super common to have "optional elements" in a sequence or sequence literal. Libraries like ScalaTags' Frag, OS-Lib's Shellable all allow this because it's such a common need. Other Iterables like Set or Map are not nearly as commonly spliced due to their nondeterministic ordering, a problem that Options don't have.

The use of a naked foo(opt*) is certainly a bit unusual, but it doesn't seem wrong. It doesn't seem worth it to make users jump through hoops in other scenarios by adding .toList everywhere just to prohibit the foo(opt*) scenario which is actually fine, and nobody really cares about either way anyway.

A third option is to allow splicing options in longer sequences such as foo(opt1*, opt2*), but prohibiting it the alone e.g. foo(opt*). Can be done, but again it seems like we're just adding irregularity just for the sake of it without any real benefit.

So IMO might as well allow foo(opt*) for the sake of (a) regularity and (b) convenience in other cases where it actually does matter

@lihaoyi
Copy link
Contributor Author

lihaoyi commented Sep 3, 2025

Perhaps a fourth alternative is that we add a new marker trait trait VarargUnpackable[T] that Option and Seq extend, and make foo(bar*) work on any VarargUnpackable. VarargUnpackable could have whatever abstract methods are necessary to efficiently participate in the flexible varargs desugaring. e.g. one possible definition is

trait VarargUnpackable[+T]{
  // Used when unpacking a single value, can  
  // efficiently return `this` for `Seq`s
  def toSeq: Seq[T]

  // Used when unpacking flexible varargs to 
  // efficiently feed the elements of this object
  // into a `Seq.builder`
  def copyToArray(dest: Array[T], startIndex: Int): Unit

  // Used when unpacking flexible varargs to allow 
  // the builder to expand its capacity precisely
  // when the size of this is known. Otherwise 
  // returns `-1` indicating the size it not known
  // and the builder would have to handle expansion
  // using its own heuristics
  def sizeHint: Int

  // When sizeHint is -1, we cannot pre-size the 
  // builder array, and so we append elements one
  // at a time to give the builder a chance to resize
  // as necessary
  def foreach(f: T => Unit): Unit
}

object VarargUnpackable{
  class Builder[T]{
    def addAll(v: VarargUnpackable[T])
    def add(v: T)
    def result(): Seq[T]
  }
}
// Desugaring
foo(bar*)
foo(VarargUnpackable.Builder().addAll(bar).build())

foo(bar*, qux*)
foo(VarargUnpackable.Builder().addAll(bar).addAll(qux).build())

foo(value1, bar*, value2, qux*, value3)
foo(VarargUnpackable.Builder().add(value1).addAll(bar).add(value2).addAll(qux).add(value3).build())

The VarargUnpackable trait above should be the minimal API necessary to efficiently build the Seq[T] we need to provide to the body of a vararg-taking method. This included allowing use of high-performance batch copy methods like System.arraycopy, for implementers of VarargUnpackable that can support them (e.g. ArraySeq, ArrayBuffer, ArrayDeque) while other collections can implement copyToArray in a more naive foreach/iterator-based fashion. The builder desugaring above is similar to the builder approach taken in Java's collection literals discussions. We could simplify VarargUnpackable's members to just def foreach if we didn't care about performance, but it may be worth the slight additional complexity to ensure we're doing things efficiently

That would avoid hardcoding Option and Seq, and have the upside that user-specified types will be able to hook into this syntax without having to extend Seq[T] themselves which we all know is very error prone and discouraged.

A fifth alternative would be to use a name-based desugaring so foo(bar*, qux*) calls methods on bar and qux without any requirement of them being a particular type. Similar to the trait-based approach above, but perhaps a bit more flexible.

@Atry
Copy link
Contributor

Atry commented Sep 3, 2025

Why not just let Option and Seq extend a newly introduced trait OrderedIterable[+A] extends Iterable[A] if the reason to exclude Set is ordering.

@lihaoyi
Copy link
Contributor Author

lihaoyi commented Sep 3, 2025

@Atry I think it would be good to have the language feature depend on as minimal an interface as possible, so a minimal VarargUnpackable[+T] trait would be preferable since it doesn't extend Iterable and force implementers to support the rich API of the Iterable trait

@odersky
Copy link
Contributor

odersky commented Sep 3, 2025

I think all this goes to far. Varargs are Seq's. I see no reason to complicate things further here. Arrays can be passed to Seq's since there is an implicit conversion. The point of SIP 70 is to stay within this framework and allow multiple spreads. My PR provides that. Options unfortunately are an irregular case since there is no conversion from option to Seq (and for good reason!). I think it is much cleaner to keep that principle even if some use cases require an explicit .toList.

Here's another way to put it: I can have a vararg function and a spread argument to it:

def f(x: Int*)
f(xs*)

I can convert the vararg to a Seq and drop that the spread:

def f(x: Seq[Int])
f(xs)

That works, always. I have done that refactoring many times. But allowing options in spreads breaks it.

Generally varargs are tricky to get right since they need to be treated early during type inference when types are not yet known completely. That compounds complexity. So we need to keep it simple.

@sjrd
Copy link
Member

sjrd commented Sep 3, 2025

I agree with @odersky. I don't question that there are use cases for spreading Options (I can see a few spots in my code where I would definitely use it). The irregularity is a bit too much in this case, though. Having to add a .toList in order to be able to spread it seems like an acceptable tradeoff in order to retain the relative simplicity of varargs.

@odersky
Copy link
Contributor

odersky commented Sep 3, 2025

I think it would indeed be interesting if we had a minimal interface for varargs and spreads instead of the Seq we have now. That would go in the general direction to minimize the language mandated footprint of the standard library. But it would raise a lot of issues, including how to organize migration and how to keep backwards compatibility. So I think this would have to be a different SIP.

@arturopala
Copy link

As a long-term Scala user, I strongly support @lihaoyi's arguments. Optiontype is so essential to everyday Scala programmer life ergonomics that I would love to see it accepted friction-free whenever possible, not less.

@Ichoran
Copy link

Ichoran commented Sep 4, 2025

Maybe we can add .cc as a method that turns a non-collection type into a collection. name.cc* doesn't seem too bad. Even x.cc* doesn't seem too bad to me. x.toList* is a lot more effort.

If we allowed a spread type then we could use a unary * suffix operator to simply allow x.* while the ordinary collection would be x*. As it is, the vararg type isn't first-class, though, so you can't use it this way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.