avoid extra function allocations in ZIO.succeed (scala 3) #8866

fwbrasil · 2024-05-17T15:49:01Z

The Unsafe encoding in Scala 3 produces extra function allocations to adapt the function taking Unsafe to Function0 in ZIO.Sync. This PR changes ZIO.Sync to be an abstract class and provide the thunk as an overriden method. With this approach, only the Sync object is allocated in ZIO.succeed.

This isn't an issue in Scala 2 because the function is a by-name parameter, which Scala already encodes as Function0 in the bytecode and doesn't require a new allocation.

fwbrasil · 2024-05-17T15:49:37Z

core/shared/src/main/scala/zio/ZIO.scala

+      n match {
+        case 0 => trace
+        case 1 => () => eval()
+      }


I tried to reproduce the behavior of case class. I wonder if there's a better way to do this.

fwbrasil · 2024-05-17T15:51:48Z

core/shared/src/main/scala-3/zio/ZIOCompanionVersionSpecific.scala

@@ -178,7 +178,9 @@ trait ZIOCompanionVersionSpecific {
   * Returns an effect that models success with the specified value.
   */
  def succeed[A](a: Unsafe ?=> A)(implicit trace: Trace): ZIO[Any, Nothing, A] =


In Kyo, we'd make this method and its a parameter inline. It'd avoid an extra field in the anonymous class to capture a.

In Kyo, we'd make this method and its a parameter inline

If I'm understanding this correctly, would the equivalent here be this?

inline def succeed[A](inline a: Unsafe ?=> A)(implicit trace: Trace): ZIO[Any, Nothing, A] = new ZIO.Sync[A](trace) { def eval() = Unsafe.unsafe(a) }

If yes, this can be quite dangerous, as a new anonymous class will be created on each callsite, effectively causing a huge increase in compiled code size and causing issues with the JVMs code cache

Similar to this: scala/scala3#16723

If yes, this can be quite dangerous, as a new anonymous class will be created on each callsite, effectively causing a huge increase in compiled code size and causing issues with the JVMs code cache

This isn't accurate. Every lambda becomes a class at runtime via LambdaMetafactory, so code cache use is similar. The tradeoff is moving the class generation from runtime to compile time, which increases compile times and jar size but provides better performance. It's also important to keep in mind that code cache contains the result of JIT-optimized compilation, not the original bytecode.

Every lambda becomes a class at runtime

I'm not sure that's true, at least not for Scala 3.

Compiling the code below with the -Xprint:genBCode compiler flag, it seems that the lambdas are converted into static methods in the object that they were created

sealed trait Unsafe object Unsafe { val unsafe: Unsafe = new Unsafe {} } abstract class Sync[A] { def eval(): A } object Sync { def succeed[A](f: Unsafe ?=> A): Sync[A] = new Sync[A] { def eval(): A = f(using Unsafe.unsafe) } } object Foo { def foo1: Sync[String] = Sync.succeed("foo") def foo2: Sync[String] = Sync.succeed("foo") def foo3: Sync[String] = Sync.succeed("foo") }

Output:

package <empty> { private final class Sync$$anon$2 extends Sync { def <init>(f$2: Function1, $outer: Sync): Unit = { this.f$1 = f$2 if $outer.eq(null) then throw new NullPointerException() else () super() () } private val f$1: Function1 def eval(): Object = this.f$1.apply(using Unsafe.unsafe()) } final module class Foo extends Object { def <init>(): Unit = { super() () } private def writeReplace(): Object = new scala.runtime.ModuleSerializationProxy(classOf[Foo]) def foo1(): Sync = Sync.succeed( { closure(<empty>.this.foo1$$anonfun$1) } ) def foo2(): Sync = Sync.succeed( { closure(<empty>.this.foo2$$anonfun$1) } ) def foo3(): Sync = Sync.succeed( { closure(<empty>.this.foo3$$anonfun$1) } ) private final <static> def foo1$$anonfun$1(using contextual$1: Unsafe): String = "foo" private final <static> def foo2$$anonfun$1(using contextual$2: Unsafe): String = "foo" private final <static> def foo3$$anonfun$1(using contextual$3: Unsafe): String = "foo" } private final class Unsafe$$anon$1 extends Object, Unsafe { def <init>(): Unit = { super() () } } @Child[Unsafe] sealed trait Unsafe() extends Object {} final module class Unsafe extends Object { def <init>(): Unit = { super() Unsafe.unsafe = { new Object with Unsafe {...}():Unsafe } () } private def writeReplace(): Object = new scala.runtime.ModuleSerializationProxy(classOf[Unsafe]) private <static> val unsafe: Unsafe def unsafe(): Unsafe = Unsafe.unsafe } abstract class Sync extends Object { def <init>(): Unit = { super() () } def eval(): Object } final module class Sync extends Object { def <init>(): Unit = { super() () } private def writeReplace(): Object = new scala.runtime.ModuleSerializationProxy(classOf[Sync]) def succeed(f: Function1): Sync = { new Sync {...}(f, this):Sync } } final lazy module val Unsafe: Unsafe = new Unsafe() final lazy module val Sync: Sync = new Sync() final lazy module val Foo: Foo = new Foo() }

Compiling the same code with succeed and param as inlined:

package <empty> { final module class Foo extends Object { def <init>(): Unit = { super() () } private def writeReplace(): Object = new scala.runtime.ModuleSerializationProxy(classOf[Foo]) def foo1(): Sync = { new Sync {...}(this):Sync }:Sync def foo2(): Sync = { new Sync {...}(this):Sync }:Sync def foo3(): Sync = { new Sync {...}(this):Sync }:Sync } private final class Foo$$anon$4 extends Sync { def <init>($outer: Foo): Unit = { if $outer.eq(null) then throw new NullPointerException() else () super() () } def eval(): String = { val contextual$3: Unsafe = Unsafe.unsafe() "foo" } def eval(): Object = this.eval() } abstract class Sync extends Object { def <init>(): Unit = { super() () } def eval(): Object } final module class Sync extends Object { def <init>(): Unit = { super() () } private def writeReplace(): Object = new scala.runtime.ModuleSerializationProxy(classOf[Sync]) } private final class Foo$$anon$3 extends Sync { def <init>($outer: Foo): Unit = { if $outer.eq(null) then throw new NullPointerException() else () super() () } def eval(): String = { val contextual$2: Unsafe = Unsafe.unsafe() "foo" } def eval(): Object = this.eval() } private final class Unsafe$$anon$1 extends Object, Unsafe { def <init>(): Unit = { super() () } } private final class Foo$$anon$2 extends Sync { def <init>($outer: Foo): Unit = { if $outer.eq(null) then throw new NullPointerException() else () super() () } def eval(): String = { val contextual$1: Unsafe = Unsafe.unsafe() "foo" } def eval(): Object = this.eval() } @Child[Unsafe] sealed trait Unsafe() extends Object {} final module class Unsafe extends Object { def <init>(): Unit = { super() Unsafe.unsafe = { new Object with Unsafe {...}():Unsafe } () } private def writeReplace(): Object = new scala.runtime.ModuleSerializationProxy(classOf[Unsafe]) private <static> val unsafe: Unsafe def unsafe(): Unsafe = Unsafe.unsafe } final lazy module val Unsafe: Unsafe = new Unsafe() final lazy module val Sync: Sync = new Sync() final lazy module val Foo: Foo = new Foo() }

Good point! Yes, lambdas that don't capture values can be encoded statically. Using inline is definitely not trivial but if you're able to reason about how the code ends up JIT compiled, it's possible to use it safely. A good way to guide it is via JFR's code cache metrics. For example, I'm not sure you'd be able to measure a significant difference in code cache use between these two examples.

@fwbrasil do you happen to have an example in mind where a lambda becomes a new anonymous class? I'm interested because I want to know what might trigger and it and see if I can avoid it.

PS, I tried the following and it seems that it still doesn't generate an anonymous class:

object Foo { def foo1: Sync[String] = Sync.succeed(Bar.bar("123") + "123") def foo2: Sync[String] = Sync.succeed(Bar.bar("012") + "123") def foo3: Sync[String] = Sync.succeed(Bar.bar("123") + "123") } object Bar { def bar(s: String) = s + "456" }

output:

private final <static> def foo1$$anonfun$1(using contextual$1: Unsafe): String = Bar.bar("123").+("123") private final <static> def foo2$$anonfun$1(using contextual$2: Unsafe): String = Bar.bar("012").+("123") private final <static> def foo3$$anonfun$1(using contextual$3: Unsafe): String = Bar.bar("123").+("123")

I'm not sure if the output of the compiler is a good representation of the final bytecode. I normally use CFR. It seems that Scala 3 actually doesn't compile non-capturing lambdas statically:

class example: def plusOne(f: () => Int): Int = f() + 1 def test1() = plusOne(() => 1) def test2(i: Int) = plusOne(() => i) end example

The lambda in the test1 method doesn't capture values so the compiler should be able to encode it as a local method but that doesn't seem to be the case.

Note the -> in the generated bytecode. It represents a lambda that will be expanded into a class by LambdaMetafactory at runtime.

public class example { public int plusOne(Function0<Object> f) { return f.apply$mcI$sp() + 1; } public int test1() { return this.plusOne((Function0<Object>)(Function0 & Serializable)() -> 1); } public int test2(int i) { return this.plusOne((Function0<Object>)(Function0 & Serializable)() -> i); } }

kyri-petrou

We have an issue with this change, it seems that zio-profiling uses ZIO.Sync in pattern matching here, which means we'll break it with this change. I think we need to update zio-profiling before continuing with this.

I also want to run some of the benchmarks to check whether there are any performance issues with this approach. My biggest concern is that now that Sync is an abstract class, the pattern matching on Sync might be slower since we're type-checking on the abstract subclass type.

kyri-petrou

I'm marking this as Request changes so that it doesn't get accidentally merged in before we have a release of zio-profiling that doesn't pattern match on ZIO.Sync(_, _)

fwbrasil · 2024-05-17T17:00:26Z

Thanks for the quick review! :) In case this approach is problematic, I think we can explore fixing this issue by changing the encoding of Unsafe instead.

fwbrasil commented May 17, 2024

View reviewed changes

avoid extra function allocations in ZIO.succeed (scala 3)

1af0028

fwbrasil force-pushed the sync-thunk branch from 87d7f33 to 1af0028 Compare May 17, 2024 15:52

kyri-petrou reviewed May 17, 2024

View reviewed changes

kyri-petrou requested changes May 17, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

avoid extra function allocations in ZIO.succeed (scala 3) #8866

avoid extra function allocations in ZIO.succeed (scala 3) #8866

fwbrasil commented May 17, 2024

fwbrasil May 17, 2024

fwbrasil May 17, 2024

kyri-petrou May 17, 2024

kyri-petrou May 17, 2024

fwbrasil May 17, 2024 •

edited

kyri-petrou May 17, 2024

kyri-petrou May 17, 2024 •

edited

fwbrasil May 17, 2024

kyri-petrou May 17, 2024

fwbrasil May 17, 2024

kyri-petrou left a comment

kyri-petrou left a comment •

edited

fwbrasil commented May 17, 2024

avoid extra function allocations in ZIO.succeed (scala 3) #8866

Are you sure you want to change the base?

avoid extra function allocations in ZIO.succeed (scala 3) #8866

Conversation

fwbrasil commented May 17, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fwbrasil May 17, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kyri-petrou May 17, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kyri-petrou left a comment

Choose a reason for hiding this comment

kyri-petrou left a comment • edited

Choose a reason for hiding this comment

fwbrasil commented May 17, 2024

fwbrasil May 17, 2024 •

edited

kyri-petrou May 17, 2024 •

edited

kyri-petrou left a comment •

edited