Bug 1389

Summary: MacOS 10.14.6 + OpenJDK11U 11.0.3.7 SIGSEGV on [NSApplicationAWT sendEvent:]
Product: [JogAmp] Jogl Reporter: Sven Gothel <sgothel>
Component: macosx Assignee: Sven Gothel <sgothel>
Status: RESOLVED FIXED    
Severity: critical    
Priority: P1    
Version: 2.4.0   
Hardware: All   
OS: macosx   
Type: DEFECT SCM Refs:
534d764474cacf8bc380123cbfd164c7c55f236a db843e65c6b93d720438c7e751413c0556f51a6e
Workaround: ---
Attachments: AdoptOpenJDK OSX Issues 2019-08-23 (Tested a few builds)
AdoptOpenJDK OSX Issues 2019-08-23 (update)
hs_err_pid10557.log crash report w/ OpenJDK Runtime Environment (11.0.3+7)

Description Sven Gothel 2019-08-23 13:59:08 CEST
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000011acc9b08, pid=8684, tid=775
#
# JRE version: OpenJDK Runtime Environment (11.0.3+7) (build 11.0.3+7)
# Java VM: OpenJDK 64-Bit Server VM (11.0.3+7, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# C  [libosxapp.dylib+0x2b08]  -[NSApplicationAWT sendEvent:]+0x17c
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   https://github.com/AdoptOpenJDK/openjdk-build/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---------------  S U M M A R Y ------------

Command Line: -DummyArg -Djava.awt.headless=false -Djogamp.debug=all -Dnativewindow.debug=all -Djogl.debug=all -Dnewt.debug=all com.jogamp.opengl.test.junit.jogl.demos.gl2.awt.TestGearsAWT -time 6000

Host: Macmini7,1 x86_64 2600 MHz, 4 cores, 8G, Darwin 18.7.0
Time: Fri Aug 23 13:48:14 2019 CEST elapsed time: 9 seconds (0d 0h 0m 9s)

---------------  T H R E A D  ---------------

Current thread (0x00007f983c200000):  JavaThread "AppKit Thread" daemon [_thread_in_native, id=775, stack(0x00007ffee0a1d000,0x00007ffee121d000)]

MacOS 10.14.6 + OpenJDK11U 11.0.3.7 SIGSEGV on most AWT tests

AdoptOpenJDK: OpenJDK11U-jdk_x64_mac_hotspot_11.0.3_7.pkg

Does not occur with Java8.

Stack: [0x00007ffee0a1d000,0x00007ffee121d000],  sp=0x00007ffee12185f0,  free space=8173k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libosxapp.dylib+0x2b08]  -[NSApplicationAWT sendEvent:]+0x17c
C  [AppKit+0x135e0]  -[NSApplication run]+0x2f3
C  [libosxapp.dylib+0x2773]  +[NSApplicationAWT runAWTLoopWithApp:]+0x9d
C  [libawt_lwawt.dylib+0x39af7]  +[AWTStarter starter:headless:]+0x342
C  [JavaNativeFoundation+0x6fce]  +[JNFRunLoop _performCopiedBlock:]+0x11
C  [Foundation+0xb0742]  __NSThreadPerformPerform+0x148
C  [CoreFoundation+0x57683]  __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__+0x11
C  [CoreFoundation+0x57629]  __CFRunLoopDoSource0+0x6c
C  [CoreFoundation+0x3afeb]  __CFRunLoopDoSources0+0xc3
C  [CoreFoundation+0x3a5b5]  __CFRunLoopRun+0x4a5
C  [CoreFoundation+0x39ebe]  CFRunLoopRunSpecific+0x1c7
C  [java+0x681c]  CreateExecutionEnvironment+0x191
C  [java+0x28ea]  JLI_Launch+0x5ea
C  [java+0x16f8]  main+0x198
C  [libdyld.dylib+0x163d5]  start+0x1


siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000010
Comment 1 Sven Gothel 2019-08-23 14:30:48 CEST
Was: MacOS 10.14.6 + OpenJDK11U 11.0.3.7 SIGSEGV on most AWT tests

AdoptOpenJDK: OpenJDK11U-jdk_x64_mac_hotspot_11.0.3_7.pkg

Does not occur with Java8.

+++

Now MacOS 10.14.6 + AdoptOpenJDK 11U Issues

Tested multiple AdoptOpenJDK builds, so far none was satisfactory,
see attachment.
Comment 2 Sven Gothel 2019-08-23 14:31:41 CEST
Created attachment 820
AdoptOpenJDK OSX Issues 2019-08-23 (Tested a few builds)
Comment 3 Sven Gothel 2019-08-23 14:39:05 CEST
The content of attachment 820 has been deleted for the following reason:

to be replaced
Comment 4 Sven Gothel 2019-08-23 14:39:35 CEST
Created attachment 821
AdoptOpenJDK OSX Issues 2019-08-23 (update)
Comment 5 Sven Gothel 2019-08-23 14:40:53 CEST
Bottom line, we have to resolve the crash issue here.
Comment 6 Sven Gothel 2019-08-23 14:49:43 CEST
Adding one more test .. 

jdk-12.0.2+10.2 - 6 August 2019
=================================================
openjdk version "12.0.2" 2019-07-16
OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.2+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 12.0.2+10, mixed mode)

1) com.jogamp.opengl.test.junit.jogl.demos.gl2.awt.TestGearsAWT
java.lang.UnsatisfiedLinkError: /private/var/folders/m3/36qvs18n7vx5q33l3wc0slw80000gp/T/jogamp_0000/file_cache/jln17159475570397999035/jln14738818103097697267/natives/macosx-universal/libgluegen_rt.dylib: dlopen(/private/var/folders/m3/36qvs18n7vx5q33l3wc0slw80000gp/T/jogamp_0000/file_cache/jln17159475570397999035/jln14738818103097697267/natives/macosx-universal/libgluegen_rt.dylib, 1): no suitable image found.  Did find:
        /private/var/folders/m3/36qvs18n7vx5q33l3wc0slw80000gp/T/jogamp_0000/file_cache/jln17159475570397999035/jln14738818103097697267/natives/macosx-universal/libgluegen_rt.dylib: code signature in (/private/var/folders/m3/36qvs18n7vx5q33l3wc0slw80000gp/T/jogamp_0000/file_cache/jln17159475570397999035/jln14738818103097697267/natives/macosx-universal/libgluegen_rt.dylib) not valid for use in process using Library Validation: mapped file has no cdhash, completely unsigned? Code has to be at least ad-hoc signed.
        at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method)
        at java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2430)
        at java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2487)
        at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2684)
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2617)
        at java.base/java.lang.Runtime.load0(Runtime.java:765)
        at java.base/java.lang.System.load(System.java:1866)
        at com.jogamp.common.jvm.JNILibLoaderBase.loadLibraryInternal(JNILibLoaderBase.java:604)
        at com.jogamp.common.jvm.JNILibLoaderBase.access$000(JNILibLoaderBase.java:64)
        at com.jogamp.common.jvm.JNILibLoaderBase$DefaultAction.loadLibrary(JNILibLoaderBase.java:107)
        at com.jogamp.common.jvm.JNILibLoaderBase.loadLibrary(JNILibLoaderBase.java:488)
        at com.jogamp.common.os.DynamicLibraryBundle$GlueJNILibLoader.loadLibrary(DynamicLibraryBundle.java:427)
        at com.jogamp.common.os.Platform$1.run(Platform.java:321)

(Same as: jdk-11.0.4+11.2 - 6 August 2019 jdk-11.0.4+11.2)
Comment 7 Sven Gothel 2019-08-23 14:57:19 CEST
openjdk13-ea-20190809-build33
=================================================
openjdk version "13" 2019-09-17
OpenJDK Runtime Environment (build 13+33)
OpenJDK 64-Bit Server VM (build 13+33, mixed mode, sharing)

(Same as with jdk-11.0.4+11.)
^^^ Only due to ssh remote access!
When run locally, the same crash as in this Bug 1389 occurs: the demo runs, but crashes while tearing down the test!
Comment 8 Sven Gothel 2019-08-23 15:00:01 CEST
Created attachment 822
hs_err_pid10557.log crash report  w/ OpenJDK Runtime Environment (11.0.3+7)
Comment 9 Sven Gothel 2019-09-08 08:27:30 CEST
This crash does not occur with Java 8.
It does occur on Java 11 (various versions).

The reason is that the NSOpenGLLayer detachment command
issued at MacOSXCGLContext's drawable disassociation never gets
through on the macOS main thread.

MacOSXCGLContext's drawable disassociation runnable is sent to the main thread
via performSelectorOnMainThread:@selector(jRun) in native code.
Thereafter we even try to kick the NSApp thread via an empty message.

However, the runnable is never executed and the NSOpenGLLayer stays attached
to the CALayer, which probably causes the issue at hand.

Further investigation ..
Comment 10 Sven Gothel 2019-09-08 09:38:30 CEST
(In reply to Sven Gothel from comment #9)

If we simply issue the disassociation action on the current thread of the caller (MacOSXCGLContext), there is no crash.

The reason we used to delegate the CALayer action to the main thread
is macOS's own policy .. well.

Need to figure out why the task does not get executed on the main thread.
Comment 11 Sven Gothel 2019-09-08 12:50:49 CEST
The culprit of both the crash and the action not propagating to the NSApp main thread
was _simply_ our OSXUtil_KickNSApp() 'kick alive' event,
an NSApplicationDefined NSEvent sent to the NSApp.

Java 11's NSApp code overrides sendEvent and handles
  NSApplicationDefined + subtype=ExecuteBlockEvent
using the given data1 as a function pointer. 8-O

ExecuteBlockEvent is defined as 0, which is exactly the subtype we sent.

Simply passing subtype=8888 avoids this side effect.
Whether KickNSApp() is still required is another question.
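
The subtype collision can be illustrated with a small Java model. This is only a sketch of the dispatch logic as described in this comment, not the JDK's or JOGL's actual code: the class KickEventModel, its Event type, and the numeric value used for APPLICATION_DEFINED are hypothetical stand-ins, and a null Runnable stands in for the empty data1 that the kick event is assumed to have carried. The fault address 0x10 in the crash report would be consistent with calling through a null pointer of that kind, though that reading is an inference.

```java
/** Toy model of the subtype collision; names and values are illustrative only. */
final class KickEventModel {
    /** Stand-in for the NSApplicationDefined event type (value is arbitrary here). */
    static final int APPLICATION_DEFINED = 1;
    /** Subtype that, per this report, Java 11 reserves for ExecuteBlockEvent. */
    static final int EXECUTE_BLOCK_EVENT = 0;
    /** Subtype the fix passes for the 'kick alive' event instead. */
    static final int KICK_SUBTYPE = 8888;

    /** Minimal event: type, subtype, and a payload standing in for data1. */
    static final class Event {
        final int type;
        final int subtype;
        final Runnable data1;

        Event(int type, int subtype, Runnable data1) {
            this.type = type;
            this.subtype = subtype;
            this.data1 = data1;
        }
    }

    /** Models the overridden sendEvent: a matching subtype causes data1 to be invoked. */
    static String sendEvent(Event e) {
        if (e.type == APPLICATION_DEFINED && e.subtype == EXECUTE_BLOCK_EVENT) {
            // With a null payload this throws NullPointerException,
            // the Java analogue of the native SIGSEGV.
            e.data1.run();
            return "executed-block";
        }
        return "passed-through";
    }

    public static void main(String[] args) {
        // Kick event with the distinct subtype: ignored by the block handler.
        System.out.println(sendEvent(new Event(APPLICATION_DEFINED, KICK_SUBTYPE, null)));
        try {
            // Kick event with subtype 0: mistaken for an ExecuteBlockEvent.
            sendEvent(new Event(APPLICATION_DEFINED, EXECUTE_BLOCK_EVENT, null));
        } catch (NullPointerException ex) {
            System.out.println("crash analogue: empty payload was invoked");
        }
    }
}
```

Any subtype the receiving application does not interpret would serve equally well in this model; 8888 is simply the value chosen in the fix.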

+++

Further, make the code a bit more robust regarding the offscreenSurfaceLayer
at JAWTWindow invalidate, i.e. if it is still not detached, do the late cleanup there.
This is just in case the OSX context callback to disassociate the drawable
has been missed.
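
The late-cleanup idea can be sketched as a detach that is safe to call from both the normal callback and the invalidate path. This is a minimal illustration under assumptions, not the code from the commit: SurfaceLayerHolder and its methods are hypothetical names, and the real detachment work is replaced by a counter.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical holder showing a detach-once guard with a late fallback. */
final class SurfaceLayerHolder {
    private final AtomicBoolean attached = new AtomicBoolean(true);
    private final AtomicInteger detachCalls = new AtomicInteger();

    /** Detaches at most once; returns true only for the call that did the work. */
    boolean detachIfAttached() {
        if (attached.compareAndSet(true, false)) {
            detachCalls.incrementAndGet(); // stand-in for the real layer detachment
            return true;
        }
        return false;
    }

    /** Late cleanup: only acts if the normal disassociation callback was missed. */
    boolean invalidate() {
        return detachIfAttached();
    }

    int detachCalls() {
        return detachCalls.get();
    }

    public static void main(String[] args) {
        SurfaceLayerHolder missed = new SurfaceLayerHolder();
        System.out.println("callback missed, late cleanup ran: " + missed.invalidate());

        SurfaceLayerHolder normal = new SurfaceLayerHolder();
        normal.detachIfAttached(); // normal disassociation callback
        System.out.println("callback ran, late cleanup ran: " + normal.invalidate());
    }
}
```

The compare-and-set keeps the detachment from running twice if both paths fire, which is the property the late cleanup relies on.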
Comment 12 Sven Gothel 2019-09-08 13:00:16 CEST
fixed by commit 534d764474cacf8bc380123cbfd164c7c55f236a

+++

sneaked in commit db843e65c6b93d720438c7e751413c0556f51a6e

OSXUtil::IsMainThread() Utilize ThreadLocal storage flag avoiding unnecessary JNI calls
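
One way a ThreadLocal flag can avoid repeated JNI calls is to cache the answer per thread after the first query. The sketch below is an assumption about the general technique, not the content of the commit: MainThreadFlag, queryNative() and the nativeQueries counter are hypothetical, and the native query is simulated by comparing against the thread that loaded the class.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-thread cache for an expensive "is this the main thread?" query. */
final class MainThreadFlag {
    /** Counts calls to the stand-in for the native (JNI) query. */
    static final AtomicInteger nativeQueries = new AtomicInteger();

    /** Assumption for this sketch: the thread loading the class acts as the main thread. */
    private static final Thread MAIN = Thread.currentThread();

    /** Stand-in for the JNI call; a real implementation would ask the OS. */
    private static boolean queryNative() {
        nativeQueries.incrementAndGet();
        return Thread.currentThread() == MAIN;
    }

    /** Each thread asks once; later lookups are served from thread-local storage. */
    private static final ThreadLocal<Boolean> IS_MAIN =
            ThreadLocal.withInitial(MainThreadFlag::queryNative);

    static boolean isMainThread() {
        return IS_MAIN.get();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println("isMainThread: " + isMainThread());
        }
        System.out.println("simulated native queries: " + nativeQueries.get());
    }
}
```

Caching is sound here because a thread's identity as the main thread does not change over its lifetime.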