move atomic primitives to Memory #1135
Conversation
andreaTP left a comment
@andreas-karlsson The changes look good, it’s a nice cleanup, and I agree that lock is a bit of a leaky abstraction.
That said, without the “atomic operations based on VarHandle” portion (which I think is a great idea!), the overall value of this PR feels reduced.
Would you be up to opening another PR in parallel to show the direction we’re aiming for? Even with CI broken, it would give us a sense of how far we are from the desired goal.
I honestly assumed the VarHandle API was more stable, but given the incompatibilities you mentioned, I’m now wondering how well this will work with native-image. I think we’re missing a CI step to verify that compatibility.
var replacement = (int) stack.pop(); // c3
var expected = (int) stack.pop(); // c2
var ptr = readMemPtr(stack, operands); // i
var replacement = (int) stack.pop();
nit: I think the names in the comments reflect the spec; I'd leave them in for the future reader.
synchronized (memory.lock(ptr)) {
    return memory.read(ptr);
}
return Byte.toUnsignedInt(memory.atomicReadByte(ptr));
Let me reason this out loud:
- assume we release those changes in a 1.7.0 release
- a user compiles using the build-time compiler with 1.7.0
- attempting to use the generated classes with 1.6.0 of the runtime is going to break

correct?
We discussed this in #883 and we are aiming for forward, not backward, compatibility of generated code.
Yes, calling new runtime methods in Shaded will break on an older runtime... what's the rationale behind this requirement? Setups where one dynamically pushes newly compiled classes into an otherwise static installation? I don't see any way around it. What does that mean? Would this PR require a new major?
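To make the scenario concrete, here is a minimal sketch of the failure mode; the class and method names (GeneratedModuleExample, atomicReadInt) are hypothetical and not taken from this PR.

```java
// Sketch only: class and method names are hypothetical.
public final class GeneratedModuleExample {

    // Imagine this method was emitted by the 1.7.0 build-time compiler.
    public static int loadAtomic(Memory memory, int ptr) {
        // This links and runs fine against a 1.7.0 runtime. On a 1.6.0 runtime,
        // whose Memory interface has no atomicReadInt, the call site fails with
        // a NoSuchMethodError as soon as the generated class is used.
        return memory.atomicReadInt(ptr);
    }
}
```

A default implementation on the new interface method keeps existing Memory implementations compiling, but it cannot help here: the method simply does not exist on the older runtime.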
@andreaTP I did more investigation into the Java 25 issue. It turns out atomic VarHandle operations still work on ByteBuffers but not on regular byte arrays. This test confirms it.

It would be great to pair this with a performance test running on all platforms/versions in CI. I've noticed JMH in the repo and will look into adding a benchmark. That said, I still believe the refactoring warrants inclusion independent of the optimization.

I hadn't thought about the forward compatibility issue though! Will need to wrap my head around that.
I'll follow up and ask around with the relevant people 👍
Thanks, looks like a reasonable path forward.
I was going to propose it, sounds great!
The issue with Java 25 most likely stems from this issue (and its fix): https://bugs.openjdk.org/browse/JDK-8318966

Basically, you can't (generally speaking) perform multi-byte atomic operations on byte[]. The best fix is to detect Java 22 or later, and use MemorySegment.
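For reference, a minimal sketch of the kind of feature gate this implies; the class and constant names are made up, and the Java 25 behaviour is as reported in this thread rather than verified here.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Sketch only: names are hypothetical.
final class AtomicCapabilities {
    private AtomicCapabilities() {}

    // View of a byte[] as little-endian ints (WASM linear memory is little-endian).
    static final VarHandle INT_VIEW =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Per the discussion above, this reportedly returns false on Java 25 for byte[]
    // views, while the equivalent ByteBuffer view still supports atomic access modes.
    // When false, a Memory implementation can fall back to lock-based atomics.
    static final boolean BYTE_ARRAY_ATOMICS_SUPPORTED =
            INT_VIEW.isAccessModeSupported(VarHandle.AccessMode.COMPARE_AND_SET);
}
```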
Force-pushed from 12b678f to 6178674
@dmlloyd Thanks! That's most likely it. I'll take a proper look at switching the implementation to MemorySegment.

@andreaTP I took the liberty of squashing and rebasing the branch, then fired off two hopeful commits in quick succession, but both turned very red. I'll put this PR back in draft and work out something more solid.
I only tried the …
Force-pushed from ef94562 to 71141f0
@andreaTP I've just pushed a new benchmark based on a parallel sieve-of-primes computation. It uses a lot of atomics and also some locking. I also pushed an optimized ByteArrayMemory, but it's basically the same as before. Using MemorySegment as suggested isn't a clean fit in ByteArrayMemory; I think it would rather be a separate Memory impl.

These are the scores I'm getting when ByteArrayMemory is optimized and ByteBufferMemory is not. I guess it'd be better to compare just ByteArrayMemory optimized vs unoptimized, but to do that in CI I guess it would need separate PRs?

Note: the reason it works on Java 25 now is because it falls back to locking.
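For readers unfamiliar with the JMH setup, here is a self-contained micro-benchmark in the same spirit; it is not the sieve benchmark from this PR, and the names and workload are purely illustrative.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

// Sketch only: compares a VarHandle-based atomic add on a byte[] against a
// lock-guarded read-modify-write, roughly the trade-off being measured here.
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class AtomicAddBenchmark {

    private static final VarHandle INT_VIEW =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    private final byte[] memory = new byte[4096];
    private final Object lock = new Object();

    @Benchmark
    public int varHandleAdd() {
        // 4-byte aligned offset; atomic on Java < 25 per the discussion above
        return (int) INT_VIEW.getAndAdd(memory, 0, 1);
    }

    @Benchmark
    public int lockedAdd() {
        synchronized (lock) {
            int old = (int) INT_VIEW.get(memory, 0);
            INT_VIEW.set(memory, 0, old + 1);
            return old;
        }
    }
}
```

Comparing the two @Benchmark methods across Java versions gives a rough feel for the VarHandle-vs-lock trade-off being discussed.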
This is still very flaky 😞 will need to do more testing in CI |
Force-pushed from b3bc8aa to 451b781
@andreas-karlsson Hope this helps. To fight flaky tests I usually start with something like this: andreaTP@33ef658, and run the checks on my fork. The numbers are great!
@andreaTP Ah! Smart!
@andreaTP It looks like it was the unguarded grow #1137 that caused the flakiness*. On that branch I'm not using any var-handle atomics but guard … Fixing the …

*Edit: The 2/100 failures are unrelated.
@andreas-karlsson Thanks a lot for the help here! Much appreciated!
If we manage to have the right deprecation in place and we verify forward compatibility, we should be able to release a …
@andreaTP Does this mean creating something like a memory adapter in the compiler? That could be done, but I think the bugs in the current impl. make it unusable for anything non-trivial, so I don't know if it's worth the effort to salvage forward compatibility in this case?
Fair, feel free to move on 👍 |
Force-pushed from 451b781 to 4668dfd
@andreas-karlsson Hi and happy new year!
@andreaTP Happy new year to you as well! I've just reacquainted myself with the PR and the state is as follows:
I got a bit bogged down before the holidays in how to reason about the two memory implementations, and how to introduce a third one(?!) based on the new MemorySegment. There are many options. Should we have a separate implementation of ByteArrayMemory that gets selected by a Multi-Release JAR? But why would a MemorySegment impl. masquerade as a byte-array impl.? And what's actually the rationale for having even two memory implementations?

An alternative would be to instead deprecate ByteArrayMemory and focus on ByteBufferMemory, which (if using direct buffers) works well with var-handle atomics across all Java versions. We could just make direct buffers an option and fall back to locks for atomics otherwise. This would consolidate things and lessen the maintenance burden.

Another option would be to scale this PR back to just the performance benchmark and the changes to the Memory interface and its usage in the compiler/interpreter. I slightly favour this because it'd be nice to have the benchmark giving comparable numbers when working on changes to the memory implementations.

Sorry for the long post, but I need some guidance 😅
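To illustrate the direct-buffer option mentioned above, here is a small sketch; it assumes the buffer's base address is suitably aligned (which direct buffers generally are) and is not code from this PR.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch only: shows that atomic access modes work on an int view of a direct
// ByteBuffer, provided the effective address of the access is 4-byte aligned.
final class DirectBufferAtomicsDemo {
    public static void main(String[] args) {
        VarHandle intView =
                MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);
        ByteBuffer memory = ByteBuffer.allocateDirect(64 * 1024);

        // An i32.atomic.rmw.cmpxchg-style operation at byte offset 0.
        boolean swapped = (boolean) intView.compareAndSet(memory, 0, 0, 42);
        System.out.println("swapped=" + swapped
                + " value=" + (int) intView.getVolatile(memory, 0));
    }
}
```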
Thanks a lot for keeping up the engagement, @andreas-karlsson!
Looks fair!
Agree 👍 thanks!
I think this is acceptable; IIRC there is an open issue for Java 25, so the problem will eventually get solved.
Currently, we are using …
That's correct, at the moment the situation is: …
I'd be interested in looking at performance numbers to see if another implementation (in the …)
In principle, I'd refrain from adding additional complexity if not well justified.
I'm afraid this is not a desirable option; my gut feeling is that we would end up spending time chasing Android vs. JVM differences and support instead of focusing on more meaningful targets like perf on the JVM.
This is all good! 🙂 Whatever flow works best for you!
Very welcome, you are doing a great job in this project and I'm grateful for the time you are spending.
This PR proposes adding distinct methods for each atomic memory instruction to the Memory interface, with the following benefits:
- Memory.lock(), which is a somewhat leaky abstraction, is no longer used outside of the Memory implementations and has been deprecated.

The PR initially also contained changes to make ByteArrayMemory implement atomic operations based on VarHandle. This showed very promising performance improvements, but oddly broke on Java 25, where atomic operations on byteArrayViewVarHandle are no longer supported. That might be a bug, as it's not mentioned in the release notes. It could also safely be feature-gated, as VarHandle.isAccessModeSupported(...) correctly returned false on Java 25. But I felt it better to leave it out for a follow-up PR when more is known. (The feature-gated implementations still exist on a branch here.)
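For readers skimming the thread, here is a rough sketch of the kind of interface additions being described; the method names, signatures, and the lock-based default bodies are illustrative guesses, not the actual API introduced by this PR.

```java
// Sketch only: names, signatures and defaults are illustrative, not the PR's actual API.
public interface Memory {

    // Existing primitives (simplified).
    byte read(int addr);

    int readInt(int addr);

    void writeI32(int addr, int value);

    /** @deprecated no longer used outside of Memory implementations. */
    @Deprecated
    Object lock(int addr);

    // One method per atomic memory instruction; a lock-based default keeps
    // existing Memory implementations source-compatible, while optimized
    // implementations can override with VarHandle-based atomics.
    default byte atomicReadByte(int addr) {
        synchronized (lock(addr)) {
            return read(addr);
        }
    }

    default int atomicRmwAddInt(int addr, int operand) {
        synchronized (lock(addr)) {
            int previous = readInt(addr);
            writeI32(addr, previous + operand);
            return previous;
        }
    }

    default int atomicRmwCmpxchgInt(int addr, int expected, int replacement) {
        synchronized (lock(addr)) {
            int previous = readInt(addr);
            if (previous == expected) {
                writeI32(addr, replacement);
            }
            return previous;
        }
    }
}
```

In this shape, deprecating lock() for external callers while still using it internally for the fallback path matches the diff shown earlier, where call sites move from synchronized (memory.lock(ptr)) blocks to a single atomicReadByte(ptr) call.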