Summary: | WGL.wglMakeCurrent() crashes in native code | ||
---|---|---|---|
Product: | [JogAmp] Jogl | Reporter: | Wade Walker <wwalker3> |
Component: | windows | Assignee: | Sven Gothel <sgothel> |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | bcutler |
Priority: | --- | ||
Version: | 2 | ||
Hardware: | pc_x86_32 | ||
OS: | windows | ||
Type: | --- | SCM Refs: | |
Workaround: | --- | ||
Attachments: |
Log file
Error file
Log file with new WGL/WGLExt instrumentation -- "gold" output
Log file - extra output
Error file - extra output
Log file - threaded optimization = off
Error file - threaded optimization = off
Log file - CS_OWNDC
Error file - CS_OWNDC
Log file from CPP version
Log file - GDI/WGL-only
Error file - GDI/WGL-only
Log file - super minimal GDI/WGL
Error file - super minimal GDI/WGL
Log file - minimal 10/13
Error file - minimal 10/13
Log file - Statically linked WGL
Error file - statically linked WGL
Log file - jogl 1.1.1a
Error file - jogl 1.1.1a
Log file - 10/27 version |
Description
Wade Walker
2011-08-30 15:34:14 CEST
I ran the Geeks3D GPU Caps Viewer as Wade suggested, and all of the OpenGL demos ran perfectly, except for the GL 4.x Tessellation demo, which popped up a dialog saying it wasn't supported. So it does appear to be a JOGL problem.
Emailed test case to bug reporter to gather initial logs.
This is the contents of test.log after running Wade's test program. I hope it is helpful:
java.lang.UnsupportedClassVersionError: javax/media/opengl/GLCapabilitiesImmutable : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(Unknown Source)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$000(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
Could not find the main class: name.wadewalker.drivertest.OneTriangleAWT. Program will exit.
Exception in thread "main"
Oops, my fault -- I compiled the test with Java 7, which is class version 51.0. I'll redo it with Java 6 to match your environment, and re-send you the test.
When you post the log, it'll be large (> 100K) so please post it and the HotSpot log as attachments rather than as comments :)
Also, please set the attachment types to plaintext instead of auto-detect -- this helps readability on some platforms.
Created attachment 266 [details]
Log file
Created attachment 267 [details]
Error file
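The UnsupportedClassVersionError earlier in this thread ("major.minor version 51.0") comes from loading a class compiled for Java 7 on a Java 6 runtime. As a minimal sketch, the major version stored in a class file header can be read and mapped to a Java release as shown here. The class and method names are illustrative only and are not part of the test program or of JOGL.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of JOGL or the driver test.
class ClassVersionCheck {
    // A class file starts with the magic 0xCAFEBABE, then a u2 minor and a u2 major version.
    static int majorVersion(byte[] classFileHeader) {
        ByteBuffer buf = ByteBuffer.wrap(classFileHeader); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor version, ignored here
        return buf.getShort() & 0xFFFF; // major version as an unsigned value
    }

    // For major >= 49 the mapping is: 49 = Java 5, 50 = Java 6, 51 = Java 7, ...
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // First eight bytes of a class file compiled for Java 7.
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51 };
        int major = majorVersion(header);
        System.out.println("major " + major + " -> Java " + javaRelease(major));
    }
}
```

Compiling the test with `javac -source 1.6 -target 1.6` on a JDK that still supports those values is one way to produce class files a Java 6 runtime will load.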
I've identified which WGL and WGLExt methods JOGL uses to interact with the OpenGL driver, and I'm in the process of instrumenting them to dump out extra logging information and check for violations of API contracts. Last time we had a bug like this (e.g. https://jogamp.org/bugzilla/show_bug.cgi?id=480) I had to do an assembly-level debugging session with the machine in my possession to find the problem. That's why I'm attempting this strategy of adding more logging and checking -- to try to improve our ability to find these sorts of driver interaction bugs without requiring the reporter to give us login (or physical access) to their machine.
Re comment 7: Of course, we have used this technique for a long time already, hence all the DEBUG code. This is good for some situations, ofc. However - reaching the goal of catching all unknown circumstances is IMHO impossible. We already catch and notify a lot of states and check results. Adding missed API checks can be a good thing, sure. IMHO these things shall be in some balance and I doubt we can catch unknown situations w/o access to the machine. Adding another 40% of DEBUG / INFO statements cannot be the solution. I never really needed to disassemble code (the other extreme) .. Most of the time it's just about reading the API, digesting the feedback (debug on machine) and learning whether it's our code or the GL driver's bug. For the latest cases - I would never be able to do this with static DEBUG code alone even though it gives me a hint already. Just my 2 cents ..
+++
Since we already have a few WinXP bug reports (when will this platform die again ?) maybe I should set one up here .. hmm
But if it's not reproducible .. well, we should get access to the specific machine.
Of course, if you're available to do a remote debug on this one, that would be the quickest solution :) I was assuming you'd be too busy with Android and Mac OS X, and I wasn't sure if Beth could make her machine available for remote login.
I'm currently running JOGL 2 on two different Windows XP machines, so it's definitely this particular machine/driver/OS combination and not Windows XP in general that's the problem.
For this particular bug, we know that JOGL 1 and JOGL 2 both fail, but C OpenGL programs run fine. This implies to me that JOGL is using WGL in a way that may work for many cards/drivers, but is somehow not fully "correct". I was thinking of temporarily wrapping JOGL's WGL/WGLExt calls with a class WGLCheck that would print logging info and perform strict consistency and error checking. For example (in pseudo-Java):
    class WGLCheck {
        Map<HGLRC, HDC> mapGLContextsToHDCs;
        Map<HGLRC, ThreadID> mapGLContextsToThreads;
        public HGLRC createContext( HDC hdc ) {
            log( createContext: thread ID, hdc );
            HGLRC hglrc = WGL.createContext( hdc );
            mapGLContextsToHDCs.put( hglrc, hdc );
            return( hglrc );
        }
        public void makeCurrent( HDC hdc, HGLRC hglrc ) {
            log( makeCurrent: thread ID, hdc, hglrc );
            if( mapGLContextsToThreads.get( hglrc ) != null )
                log( error: context can only be current on one thread at a time );
            WGL.makeCurrent( hdc, hglrc );
            mapGLContextsToThreads.put( hglrc, Thread.getID() );
            if( !mapGLContextsToHDCs.contains( hglrc ) )
                log( error: GL context not created );
            else if( mapGLContextsToHDCs.get( hglrc ) == null )
                log( error: GL context deleted );
            else if( mapGLContextsToHDCs.get( hglrc ) != hdc )
                log( error: GL context not associated with hdc );
        }
        ... // wrappers for the other 20 or so WGL/WGLExt functions JOGL uses
    }
Then once I know the exact sequence of WGL calls that fails, I could examine that sequence closely and see what we're doing wrong. Plus, the extra consistency checking might find some problem in the way we're using the WGL API. It would be nice to have code checks to ensure that we're properly satisfying the complex WGL API contract. Does this sound crazy?
You're the maintainer, so of course you have the final say -- all this is just my humble suggestion :)
It would be difficult, though probably not impossible, to get you access to the machine. If you have exhausted all other avenues, I will investigate how we could make that work.
Re: comment 9
ofc, something like DebugWGL and TraceWGL (like we produce those for DebugGL* TraceGL*) is a great idea. You can easily use our pipeline generator for this task. See build-jogl.xml target 'java.generate.composable.pipeline.es2' for example .. Don't know if we need to patch gluegen/jogl for this task - it should just work. Indeed, this will give us a nice clue about what passes through incl. the return values (-> TraceWGL*). The use of DebugWGL* might be tricky, since they are allowed to fail here and there .. and there are known (documented) WGL driver cases where it returns a failure value but actually works (or vice versa). Hence DebugWGL* might need to continue .. but dump a marker/warning in the stderr stream.
Great idea - go for it! Even if this doesn't pinpoint the problem right away, it will be a great help, as DebugGL* and TraceGL* are already - at least for me.
I've looked at com.jogamp.gluegen.opengl.BuildComposablePipeline and the code it emits, and it seems to be custom to GL only, so it can't produce DebugWGL without being modified. For example, the prefixes to GL calls and the code to check for current GL contexts is hard-coded into this class.
Also, for WGL it's not so easy to wrap one object inside WGLDebug at the point where it's instantiated. WGL methods are all static, and are called all over the code without using a wrappable instance :(
Let me first try manually creating a WGLDebug and WGLExtDebug that only implements the 20 or so methods that JOGL actually uses. I can use this to help find Beth's problem. Then afterwards I can use this experience to figure out how to make a general solution like BuildComposablePipeline.
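The WGLCheck idea proposed in pseudo-Java earlier in the thread can be sketched as runnable Java. This is only an illustration under stated assumptions: handles are modelled as plain `long` values, and `WglCalls` is a hypothetical interface standing in for the static WGL entry points, not an actual JOGL type.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the static WGL entry points; handles are plain longs.
interface WglCalls {
    long createContext(long hdc);
    boolean makeCurrent(long hdc, long hglrc);
}

// Checking wrapper: logs every call and records suspected contract violations.
class WglCheck implements WglCalls {
    private final WglCalls delegate;
    private final Map<Long, Long> contextToHdc = new HashMap<>();
    private final Map<Long, Long> contextToThread = new HashMap<>();
    final List<String> problems = new ArrayList<>();

    WglCheck(WglCalls delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized long createContext(long hdc) {
        long hglrc = delegate.createContext(hdc);
        log(String.format("createContext hdc=0x%x -> hglrc=0x%x", hdc, hglrc));
        if (hglrc != 0) {
            contextToHdc.put(hglrc, hdc);
        }
        return hglrc;
    }

    @Override
    public synchronized boolean makeCurrent(long hdc, long hglrc) {
        long tid = Thread.currentThread().getId();
        log(String.format("makeCurrent hdc=0x%x hglrc=0x%x", hdc, hglrc));
        Long owner = contextToThread.get(hglrc);
        if (owner != null && owner != tid) {
            problems.add("context is already current on another thread");
        }
        Long createdFor = contextToHdc.get(hglrc);
        if (createdFor == null) {
            problems.add("context was never created through this wrapper");
        } else if (createdFor != hdc) {
            // Mirrors the pseudo-code; WGL itself only requires the same device and pixel format.
            problems.add("context was created for a different hdc");
        }
        boolean ok = delegate.makeCurrent(hdc, hglrc);
        if (ok) {
            contextToThread.put(hglrc, tid);
        }
        return ok;
    }

    private static void log(String message) {
        System.err.println("[thread " + Thread.currentThread().getId() + "] " + message);
    }

    public static void main(String[] args) {
        // Fake driver that always succeeds, so the sketch runs without OpenGL.
        WglCalls fake = new WglCalls() {
            public long createContext(long hdc) { return 0x10000L; }
            public boolean makeCurrent(long hdc, long hglrc) { return true; }
        };
        WglCheck check = new WglCheck(fake);
        long ctx = check.createContext(0x1234L);
        check.makeCurrent(0x1234L, ctx);
        check.makeCurrent(0x9999L, ctx); // flagged: different hdc
        System.out.println(check.problems);
    }
}
```

Recording problems instead of throwing follows the suggestion in the thread that a debug wrapper may need to continue after a reported failure and only emit a warning.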
I finished up the trace code for WGL this evening. Tomorrow I'll do WGLExt, and then I should be able to give Beth a new version of the test program that will dump out much more information about our driver calls.
(In reply to comment #12)
> I've looked at com.jogamp.gluegen.opengl.BuildComposablePipeline and the code
> it emits, and it seems to be custom to GL only, so it can't produce DebugWGL
> without being modified. For example, the prefixes to GL calls and the code to
> check for current GL contexts is hard-coded into this class.
ok, so we would need to make them more generic
>
> Also, for WGL it's not so easy to wrap one object inside WGLDebug at the point
> where it's instantiated. WGL methods are all static, and are called all over
> the code without using a wrappable instance :(
ofc .. we would need to make them use interfaces and using dynamic lookup code, but that should not be too hard. ie like GLXExt, WGLExt ..
>
> Let me first try manually creating a WGLDebug and WGLExtDebug that only
> implements the 20 or so methods that JOGL actually uses. I can use this to help
> find Beth's problem. Then afterwards I can use this experience to figure out
> how to make a general solution like BuildComposablePipeline.
yup .. ofc
(In reply to comment #13)
> I finished up the trace code for WGL this evening. Tomorrow I'll do WGLExt, and
> then I should be able to give Beth a new version of the test program that will
> dump out much more information about our driver calls.
sounds great, kudos if this helps in this case, we definitely should make WGL and GLX interfaces/dynamic allowing pipelining with the debug implementation automatically created.
I finished the logging code for both WGL and WGLExt last night, and emailed a new version of the test program to Beth.
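If the WGL calls were exposed through an interface, as suggested in the replies above, a tracing layer would not have to be written by hand for each method. The following is a rough sketch using `java.lang.reflect.Proxy`. The `WglApi` interface and its two methods are assumptions made for illustration, not the generated JOGL classes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical instance-based view of two WGL calls (the real WGL class is static).
interface WglApi {
    long wglCreateContext(long hdc);
    boolean wglMakeCurrent(long hdc, long hglrc);
}

class TraceWgl {
    // Wraps any WglApi in a proxy that records the name, arguments and result of each call.
    static WglApi trace(WglApi target, List<String> sink) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result = method.invoke(target, args);
            sink.add(method.getName() + Arrays.toString(args) + " -> " + result);
            return result;
        };
        return (WglApi) Proxy.newProxyInstance(
                WglApi.class.getClassLoader(), new Class<?>[] { WglApi.class }, handler);
    }

    public static void main(String[] args) {
        // Fake implementation so the sketch runs without a driver.
        WglApi fake = new WglApi() {
            public long wglCreateContext(long hdc) { return 7L; }
            public boolean wglMakeCurrent(long hdc, long hglrc) { return hglrc == 7L; }
        };
        List<String> log = new ArrayList<>();
        WglApi traced = trace(fake, log);
        long ctx = traced.wglCreateContext(5L);
        traced.wglMakeCurrent(5L, ctx);
        for (String line : log) {
            System.out.println(line);
        }
    }
}
```

A generated pipeline class would avoid the reflection overhead, so this approach is better suited to short diagnostic runs than to production use.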
Once she attaches the new log file here, we should be able to see the arguments, return code, and stack trace of every WGL and WGLExt call -- hopefully comparing that to the correct output will tell us what's going on.
Created attachment 268 [details]
Log file with new WGL/WGLExt instrumentation -- "gold" output
This is the log output when the test works correctly on a Windows XP system. When Beth uploads her log we should be able to compare directly to this and see where they diverge.
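Comparing a failing trace against the "gold" trace comes down to finding the first line where the two diverge. Below is a minimal sketch of that comparison. The log line format shown is invented for the example, and handle values are normalized first, on the assumption that they differ between machines even when the call sequence is the same.

```java
import java.util.Arrays;
import java.util.List;

class LogDiff {
    // Replaces hexadecimal values (handles, addresses) with a placeholder so that
    // machine-specific numbers do not count as differences.
    static String normalize(String line) {
        return line.replaceAll("0x[0-9a-fA-F]+", "0x?");
    }

    // Returns the index of the first differing line, or -1 if the logs match.
    static int firstDivergence(List<String> gold, List<String> actual) {
        int common = Math.min(gold.size(), actual.size());
        for (int i = 0; i < common; i++) {
            if (!normalize(gold.get(i)).equals(normalize(actual.get(i)))) {
                return i;
            }
        }
        // One log is a prefix of the other: they diverge where the shorter one ends.
        return gold.size() == actual.size() ? -1 : common;
    }

    public static void main(String[] args) {
        // Hypothetical trace lines, not copied from the attached logs.
        List<String> gold = Arrays.asList(
                "wglCreateContext(0x1a2b) -> 0x10000",
                "wglMakeCurrent(0x1a2b, 0x10000) -> true");
        List<String> failing = Arrays.asList(
                "wglCreateContext(0x9b01) -> 0x20000",
                "wglMakeCurrent(0x9b01, 0x20000) -> false");
        System.out.println("first divergence at line " + firstDivergence(gold, failing));
    }
}
```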
Created attachment 269 [details]
Log file - extra output
Created attachment 270 [details]
Error file - extra output
Sorry I've taken so long to look at this -- I'm traveling on business right now and don't have much computer access. I should be able to get to this on Friday once I'm back home.
(In reply to comment #19)
> Created attachment 270 [details]
> Error file - extra output
Good logs, at least they show me that all 'should' be fine, but still you get a native SIGSEGV.
Reminds me of the NV bug I have fixed here: http://jogamp.org/git/?p=jogl.git;a=commit;h=5166d6a6b617ccb15c40fcb8d4eac2800527aa7b
Maybe this workaround doesn't work on your system, so you could try to set your NV driver to: 'Threaded Optimization' := 'off'
Just an idea .. otherwise I don't have a clue why MakeCurrent with a valid HDC and context doesn't work.
Created attachment 271 [details]
Log file - threaded optimization = off
Using Sven's suggestion of turning off the threaded optimization in the Nvidia settings. My extremely simple jogl test program still failed, but it failed in a slightly different way, so I thought I'd post the log file for this case as well. The log looks just slightly different.
Created attachment 272 [details]
Error file - threaded optimization = off
Looking through this log, the crash happens when trying to make an OpenGL context current for the dummy window that JOGL creates to get access to the WGL functions. The WGL trace first calls these functions, which return null since there's no current OpenGL context:
    getWGLProcAddressTable
    wglGetProcAddress (45 times)
Then it calls these two, using the HDC of the dummy window created by GDI.CreateDummyWindow0():
    wglCreateContext
    wglMakeCurrent (fails)
There are at least two things I can think of that might be causing this failure:
- The dummy window isn't created with CS_OWNDC in its window class (supposedly some older drivers/cards/OSes need this)
- Maybe the pixel format choosing/setting doesn't work right for dual-card configurations
I'll create some more test versions to send to Beth that try these ideas out.
I've emailed Beth a new version of my test program with CS_OWNDC set to see if that makes a difference. I've also written a C program that calls the same WGL functions that JOGL calls in this trace. I'll send that to her next if we need to look at the pixel format code. That way we can determine why it works in C but not in Java.
Created attachment 273 [details]
Log file - CS_OWNDC
Created attachment 274 [details]
Error file - CS_OWNDC
Looking at the results of the CS_OWNDC test, it doesn't make any difference to this crash. So moving to theory #2, I've sent Beth a C++ program I wrote last night that enumerates the pixel formats, opens a window, and draws a single triangle, using similar C++ code to what's inside JOGL. If this program runs, it confirms (again) that OpenGL works on Beth's machine from C++, just not from Java. It also prints the list of pixel formats so I can compare that to the one we see in JOGL.
The idea is to establish working code (the C++ test) and failing code (the Java test), then figure out what the crucial difference between the two is. Fortunately this failure is early in the program's execution, so we should be able to pin it down pretty quickly :)
The cpp test failed with this error: "This application has failed to start because libgcc_s_dw2-1.dll was not found. Re-installing the application may fix this problem."
Oops, forgot to statically link :) I'll fix it and resend.
Created attachment 275 [details]
Log file from CPP version
The new CPP version (with proper dlls) seemed to run without errors. A window popped up showing a color triangle.
Yep, that looks perfect. The C++ code behaves totally as expected, so there's definitely nothing wrong with your OpenGL drivers. My next step is to write a Java test that invokes the exact same GDI and WGL functions as the C++ test, but using JOGL's GDI and WGL wrappers. That way I can rule out a bunch of minor differences in parameters JOGL uses to set up the dummy window. I got some of it done tonight, so it should be ready for you by Monday. Thanks again for helping me track this bug down. I'm very interested to see what the cause turns out to be :)
Quick status report: getting my next test case working has proved to be a challenge. I wrote a Java program to try to duplicate Beth's bug using only JOGL's WGL and GDI functions (therefore narrowing down the scope of the code that has to be debugged), but I haven't quite got it to run properly on my own test platform yet. It may take me a while longer to debug this before I can submit it to Beth for testing -- I'll keep you posted.
Status report: I finally got a simple Java program working that creates and makes current an OpenGL context using nothing but GDI and WGL calls. I'll polish it up a bit over the weekend, then send it out to Beth for testing. This program should demonstrate the absolute minimum number of lines of Java needed to make an OpenGL context current, so it will tell us whether the problem is in our basic GDI/WGL wrappers, or somewhere in the JOGL framework above that.
Just sent the latest GDI/WGL-only test case to Beth. This test calls the exact same GDI and WGL calls in the exact same order as the C++ test that we've confirmed works correctly on Beth's machine. The only other code in it is the bare minimum initialization (maybe 4 lines) so GDI and WGL can be called. If this test still fails in wglMakeCurrent like the full-size test, then the bug has to be in the very small amount of setup code that's in this test.
If this test doesn't fail, then the bug is caused by the rest of the JOGL code that's normally called when setting up a GL context (which this test is not calling). Either way, this should narrow things down.
KUDOS to your hard work Wade - thx a lot Wade & Beth. Even though I cannot help w/ this issue at the moment for sure it is very much appreciated.
Created attachment 277 [details]
Log file - GDI/WGL-only
Created attachment 278 [details]
Error file - GDI/WGL-only
This looks like good news. Since the test still fails, the problem must be in one of the very few remaining differences between the Java and C++ versions.
My prime suspect now is wglGetProcAddress. JOGL calls it a bunch of times in the process of setting up its WGL object, relying on the driver to return null since there's no current OpenGL context yet. But there could easily be a driver bug where calling this function without a current context messes up the internal driver state.
This would explain why JOGL works fine on most drivers, but not quite all -- it's relying on the driver to be robust in the face of unnecessary calls that aren't supposed to have any effect. If the driver writers haven't carefully tested these sorts of cases, JOGL could easily hit bad driver behavior that other programs don't see.
I'll create a new, even more minimal test case that doesn't call wglGetProcAddress unnecessarily and see if that fixes our problem.
Yes, this definitely narrows it down.
In regards to comment 39, the premature wglGetProcAddress(..) calls are returning NULL as far as I see in the log files and the crashing call is to the statically linked wglMakeCurrent(..). Hence I don't see how this could be the culprit ?
Also, only a few 'manual' wglGetProcAddress(..) calls are being issued, not the whole ProcAddressTable - since, as you pointed out correctly, you shall do this only after a context is current.
However .. it definitely is great progress. Maybe we can double check the handle values on the Java side and the native JNI side ?
comment 39:
>> But there could easily be a driver
>> bug where calling this function without a current context messes up the
>> internal driver state.
Yes, maybe .. even though I would guess it's nothing but a TLS function table fetch. But .. well, sometimes pigs do fly :)
Created attachment 279 [details]
Log file - super minimal GDI/WGL
Created attachment 280 [details]
Error file - super minimal GDI/WGL
Ok, so pigs do not fly :)
Another idea, besides double checking the native DC ..
    Created dummy window hwnd=0x440396.
    Got DC of dummy window dc=0xffffffff9b010aae.
would be to evaluate if the native function lookup (win's dlsym/..) ..
    Lookup-Native: <wglMakeCurrent> 0x5ed19bd5 in lib NativeLibrary[OpenGL32.dll, 0x5ed00000]
    Got address of wglMakeCurrent 0x5ed19bd5.
may cause havoc ..
A way to verify would be to link the WGL part statically against OpenGL32.dll, not using function pointers.
Below I walked through the GlueGen Win32 lookup code .. no result.
However, maybe there is a problem w/ the function lookup, ie. using the wrong OpenGL32.dll library or something like that ? How to test this ?
You could try installing and using GLIntercept's OpenGL32.dll http://code.google.com/p/glintercept/wiki/Readme .. and use it to debug/trace even the wgl commands .. Maybe that gives some additional clues ?
I have also looked for Mesa3D's DLL .. but I don't have it anymore and I couldn't find a precompiled one - however, that would be worth a try as well.
Cheers, Sven
+++
Recap WindowsWGLDynamicLibraryBundleInfo, which uses GLDynamicLibraryBundleInfo default setting:
    shallLinkGlobal() { return false; }
    shallLookupGlobal() { return false; }
linkGlobal==true leads to -> dynLink.openLibraryGlobal(path...); and makes no diff on the Win32 code, since both local/global 'open' are equal.
lookupGlobal==true just leads to NOP, and hence lookupLocal is being used.
Result: it's using the 'local' codepath regardless of the settings ..
+++
I sent Beth one more test case that removes one lookup of wglGetProcAddress that I hadn't noticed last time. This probably makes no difference, I just wanted to rule it out.
I'll also try sending Beth a test that launches using "java -Xss4096k" as a sanity check. Since we're getting an EXCEPTION_STACK_OVERFLOW, I guess it's possible that her video driver needs more stack than the default 512KB that the JVM uses.
But the binary drivertestcpp.exe that I gave Beth only had a 200K stack size (apparently that's the default for gcc & ld under MinGW), so I'm not sure this really makes sense.
I verified that the WGL function pointers in Beth's log file have the exact same addresses that they do on my Windows XP system, so the function pointer lookup seems correct. If this latest test still fails, next I'll try statically linking to WGL to avoid all function pointer lookup and use.
Created attachment 281 [details]
Log file - minimal 10/13
Created attachment 282 [details]
Error file - minimal 10/13
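The DC value quoted from the log above, dc=0xffffffff9b010aae, looks like a 32-bit handle that was widened to a 64-bit `long` with sign extension. This reading is an assumption, not something the thread confirms, and it does not by itself indicate a bug since the low 32 bits are unchanged. The sketch below only shows how such a value arises in Java and how masking recovers the 32-bit form for comparison with native-side logs.

```java
class HandleWidth {
    // Widening an int to long copies the sign bit into the upper 32 bits.
    static long signExtended(int handle32) {
        return handle32;
    }

    // Masking keeps only the low 32 bits, which is how a 32-bit native side sees the handle.
    static long zeroExtended(int handle32) {
        return handle32 & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        int dc = 0x9b010aae; // low 32 bits of the DC value printed in the log
        System.out.println("sign-extended: 0x" + Long.toHexString(signExtended(dc)));
        System.out.println("zero-extended: 0x" + Long.toHexString(zeroExtended(dc)));
    }
}
```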
Did a bit more analysis last night:
- Verified that we're loading opengl32.dll from the correct location (C:\WINDOWS\system32\OpenGL32.dll)
- Checked the amount of stack used when the exception happens, and it's very small (the stack grows towards low addresses from 0x00910000, and it's only at 0x0090fab8 when the stack overflow occurs). Supposedly the stack should be able to go all the way down to 0x008c0000, about 327 KB, so the program is nowhere near that.
- Disassembled the code around the exception, which happens at 0x69e39e87
    69e39e68: 080B OR [BP+DI],CL
    69e39e6A: C1 DB C1
    69e39e6B: 5F POP DI
    69e39e6C: 5E POP SI
    69e39e6D: 5B POP BX
    69e39e6E: C9 DB C9
    69e39e6F: C3 RET
    69e39e70: 51 PUSH CX
    69e39e71: 3D0010 CMP AX,1000
    69e39e74: 0000 ADD [BX+SI],AL
    69e39e76: 8D4C24 LEA CX,[SI+24]
    69e39e79: 087214 OR [BP+SI+14],DH
    69e39e7C: 81E90010 SUB CX,1000
    69e39e80: 0000 ADD [BX+SI],AL
    69e39e82: 2D0010 SUB AX,1000
    69e39e85: 0000 ADD [BX+SI],AL
    69e39e87: 8501 TEST AX,[BX+DI]
    69e39e89: 3D0010 CMP AX,1000
    69e39e8C: 0000 ADD [BX+SI],AL
    69e39e8E: 73EC JNB 007C
    69e39e90: 2BC8 SUB CX,AX
    69e39e92: 8BC4 MOV AX,SP
    69e39e94: 8501 TEST AX,[BX+DI]
    69e39e96: 8BE1 MOV SP,CX
    69e39e98: 8B08 MOV CX,[BX+SI]
    69e39e9A: 8B4004 MOV AX,[BX+SI+04]
    69e39e9D: 50 PUSH AX
    69e39e9E: C3 RET
    69e39e9F: 8B4424 MOV AX,[SI+24]
    69e39eA2: 0483 ADD AL,83
    69e39eA4: C0 DB C0
    69e39eA5: E0C3 LOOPNZ 006A
    69e39eA7: 65 DB 65
    69e39eA8: 3A0D CMP CL,[DI]
It's unclear how "TEST AX,[BX+DI]" could be causing a stack overflow, since it doesn't modify SP. So it must be that stack overflow is not a precise exception, and SP is hitting an address protected with PAGE_GUARD sometime after 0x69e39e87. The instruction at 0x69e39e96, "MOV SP,CX" must be the culprit. This makes sense since ECX = 0x008c0ac0, which is close enough to the bottom of the stack to hit the guard page.
It doesn't look like the most common cause of stack overflow (namely infinite recursion).
It might be some calling convention problem that results in a too-large value being added to the stack pointer. So it sounds like trying static linkage to WGL next is the way to go.
Status report: Still working on getting static linkage set up for WGL with GlueGen. It looks like I just need to use JavaEmitter instead of GLEmitter, but I need to resolve some GlueGen errors in the code emission process.
I did figure out a possible theory of this bug. When the NVIDIA driver DLL loads, it could be hooking/patching some of the entry points of opengl32.dll, which would cause our saved function pointers to become stale. I'll try a version of the test program that re-checks these pointers after the NVIDIA driver load (which happens during SetPixelFormat()) just to be sure.
Just emailed Beth a new test I finished up last night. It uses JNI to link directly to opengl32.lib/.dll for the WGL functions, instead of querying function pointers and calling through them like JOGL does. I created a WGLStatic that sits beside WGL to do this, so I can use them both at once and compare the results if needed.
If this test passes, it will mean something is wrong with JOGL's treatment of WGL function pointers (either the NVIDIA driver is hacking or changing them, or we're somehow querying/using/storing them incorrectly on some systems).
Created attachment 284 [details]
Log file - Statically linked WGL
Created attachment 285 [details]
Error file - statically linked WGL
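The stack figures quoted in the analysis above can be checked with a few subtractions. The sketch below uses only the addresses given in that comment. Treating the ECX value as the stack pointer the driver was about to install is the comment's own interpretation, and the constant and method names are made up for this example.

```java
class StackMath {
    static final long STACK_TOP = 0x00910000L;    // stack grows down from here
    static final long STACK_LIMIT = 0x008c0000L;  // lowest usable address per the analysis
    static final long SP_AT_FAULT = 0x0090fab8L;  // stack pointer when the exception is raised
    static final long ECX_AT_FAULT = 0x008c0ac0L; // value about to be moved into the stack pointer

    // Total stack available to the thread, in bytes.
    static long reserved() {
        return STACK_TOP - STACK_LIMIT;
    }

    // Stack already in use when the exception is raised.
    static long usedAtFault() {
        return STACK_TOP - SP_AT_FAULT;
    }

    // Size of the single downward jump from the current stack pointer to ECX.
    static long requestedInOneStep() {
        return SP_AT_FAULT - ECX_AT_FAULT;
    }

    // Distance between the new stack pointer and the bottom of the stack.
    static long headroomAfterJump() {
        return ECX_AT_FAULT - STACK_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println("reserved:          " + reserved() + " bytes");
        System.out.println("used at fault:     " + usedAtFault() + " bytes");
        System.out.println("requested at once: " + requestedInOneStep() + " bytes");
        System.out.println("headroom left:     " + headroomAfterJump() + " bytes");
    }
}
```

With these inputs the single jump is 323576 bytes out of 327680 available, which fits the observation that very little stack was in use just before the overflow.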
OK, the latest log shows that the test still fails, even when I link WGL with the DLL import library (instead of using function pointers and LoadLibrary).
There are many more things I can try with the JOGL 2 test, but first I'd like to see if JOGL 1.1.1a works on Beth's machine. I wrote a quick test for this last night, and I'll email it to her shortly. If JOGL 1.1.1a works, this will let us know that the problem is something that's been changed in JOGL 2 (so it should be relatively easy to find).
Looking back through this bug report again, I remember now that Beth reported that JOGL 1.1.1a fails too, but we never got any logs of it failing. So hopefully this new JOGL 1.1.1a test will give us a different failure message than JOGL 2 that will help narrow things down.
Created attachment 286 [details]
Log file - jogl 1.1.1a
Created attachment 287 [details]
Error file - jogl 1.1.1a
Looks like JOGL 1.1.1a fails in exactly the same way as JOGL 2 - at the same address inside nvoglnt.dll, of the same stack overflow.
I'll go back to stripping down the statically linked JOGL 2 test to resemble the C++ test more. Since we know that the C++ test works, there's got to be a way to get the JOGL test to duplicate its results. I can also add some code to the C++ test to verify that it's seeing the same opengl32.dll and nvoglnt.dll that the Java version sees.
I made a few more changes to the test case last night and emailed it to Beth just now. This version manually loads the JOGL libraries with System.loadLibrary() instead of invoking any JOGL code at all. This avoids trying and failing to load nativewindow_x11, and also avoids loading opengl32 redundantly now that it's linked via import library to jogl_desktop.dll.
This version also sets the C stack ridiculously large with -Xss4096k just to see if this NVIDIA driver has strangely large stack requirements. At the least, increasing stack size should change where the error happens.
Created attachment 288 [details]
Log file - 10/27 version
This test did not produce an error file, so I think it succeeded. No window was made visible though. Hopefully that's a good sign?
(Hmm, my emailed comment doesn't seem to have made it -- retyping in the web form)
Yep, it finally worked! This test doesn't create a window, it just writes a log, so this is the expected behavior.
I changed two things though, so we'll need to check which one fixed the problem. Could you edit the run.bat file of that latest test to remove the -Xss4096k and then rerun it? If the failure comes back, this means the -Xss4096k fixed the problem. In that case, try adding the same -Xss4096k to the JVM options of your own JOGL program and see if that makes it work.
Hey, it's fixed! The -Xss4096k argument did the trick, on all of my tests! Thank you so much, Wade, for sticking it out and tracking down the problem!
Glad to hear it works! Sorry I took such a long way around to finding this -- when hearing hoofbeats, I need to think "horses", not "zebras" :)
One other thing you might want to do: try reducing the -Xss4096k by powers of two until you find the smallest size that works. This will keep your threads' memory footprint from being needlessly large.
This problem appears to be an NVIDIA driver mistake (though not strictly a driver bug, since it can still work). The driver code is doing something like this:
    void somefunc( bigstruct b );
when it should have done this:
    void somefunc( bigstruct *b );
So they're pushing a huge amount of data (hundreds of KB) on the stack in just one function call, instead of passing a pointer like they ought to (which would just use 4B). The JVM sets a smaller stack size for its threads than Windows does for executables, so that's why it works from C++ but not from Java (unless you manually set -Xss). |
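Besides the global -Xss option, Java also lets a single thread request a larger stack through the four-argument `Thread` constructor. The sketch below shows that API only. Whether it would have helped here depends on which thread makes the driver call, which this thread does not establish, and the JVM documents the size as a hint that it may round or ignore. The class and method names are illustrative.

```java
class BigStackThread {
    // Runs the task on a new thread that requests stackBytes of stack.
    // The JVM treats the size as a hint and may round it or ignore it.
    static void runWithStack(Runnable task, long stackBytes) {
        Thread t = new Thread(null, task, "big-stack", stackBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // Simple recursion used only to consume some stack in the demo.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    public static void main(String[] args) {
        long fourMiB = 4096L * 1024; // same size as -Xss4096k
        runWithStack(() -> System.out.println("reached depth " + depth(5000)), fourMiB);
    }
}
```

Requesting the larger stack for one thread keeps the footprint of all other threads at the default, which is the same concern behind the advice to shrink -Xss to the smallest value that works.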