Crash on load from within protobuf in precompiled tensorflow 1.1 #26

admsyn · 2017-06-14T16:24:57Z

Hey Memo! This is with a fresh Ubuntu 17.04 and OF master (2c7b719).

fl@mallet:~/workspace/openFrameworks/addons/ofxMSATensorFlow/example-pix2pix/bin$ gdb -q ./example-pix2pix
Reading symbols from ./example-pix2pix...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/fl/workspace/openFrameworks/addons/ofxMSATensorFlow/example-pix2pix/bin/example-pix2pix
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
0x00007ffff113404d in google::protobuf::FileOptions::MergePartialFromCodedStream(google::protobuf::io::CodedInputStream*) ()
   from /home/fl/workspace/openFrameworks/addons/ofxMSATensorFlow/libs/tensorflow/lib/linux64/libtensorflow_cc.so
(gdb)

I get this from example-basic as well
The build config is the "default" from running make
libtensorflow_cc.so is from lib_TF1.1_linux64_OPT_CUDA8.0_CUDNN5.1_2017_05_17.tar.gz
I also have tensorflow-gpu, the magenta repo, and fast-style-transfer installed and working fine separately (virtual envs, not used in the context of OF or ofxMSATensorFlow)

Full linux version:

Linux mallet 4.10.0-22-generic #24-Ubuntu SMP Mon May 22 17:43:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

More on this story as it develops..

EDIT: This does not happen with lib_TF1.0_linux64_NOPT_CUDA8.0_CUDNN5.1_2017_02_22.tar.gz, which works

memo · 2017-06-14T21:19:22Z

Hey, I'm still on 16.04 so I don't know if that is related in any way.

Interesting that there's no issue with the older lib, and that one is a NOPT build (with debug info). I just realised that I didn't provide a NOPT build for TF1.1. I'll try to do this, but I'm travelling these days so it could be difficult. Was the crash with a debug build or a release build of the app? Could you try a release build? Previously I was encountering crashes (segmentation faults) when running a debug build of the app against the release build of the lib. However, this hasn't been an issue for me lately.

admsyn · 2017-06-15T15:06:09Z

This comment implies it could be an SSE support issue, and given that the non-optimized version is the one that works on my setup it seems reasonable that it has something to do with that..

I'll continue digging into it.

This processor is an old-ish i7-3820. Here's my grep sse < /proc/cpuinfo :

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts
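To make the SSE/AVX hypothesis easier to check, here is a minimal Python sketch that compares a `flags` line against a list of instruction-set extensions. The flags line above lists `sse4_1`, `sse4_2` and `avx`, but neither `avx2` nor `fma`. Which extensions the OPT build was actually compiled with is not stated in this thread, so `ASSUMED_REQUIRED` is only a guess for illustration, and the function names are hypothetical helpers, not part of any library.

```python
def parse_cpu_flags(flags_line):
    """Return the set of CPU flags from a /proc/cpuinfo-style 'flags : ...' line."""
    head, sep, tail = flags_line.partition(":")
    values = tail if sep else head  # tolerate a bare list without the 'flags :' prefix
    return set(values.split())


def missing_features(flags_line, required):
    """Return the entries of `required` that the CPU does not advertise, in order."""
    have = parse_cpu_flags(flags_line)
    return [name for name in required if name not in have]


# Abbreviated copy of the flags line posted above (SIMD-related entries only).
SAMPLE_FLAGS = (
    "flags : fpu mmx fxsr sse sse2 pni pclmulqdq ssse3 cx16 "
    "sse4_1 sse4_2 popcnt aes xsave avx"
)

# ASSUMPTION: extensions an optimized build might use; not confirmed for this library.
ASSUMED_REQUIRED = ["sse4_1", "sse4_2", "avx", "avx2", "fma"]

if __name__ == "__main__":
    print(missing_features(SAMPLE_FLAGS, ASSUMED_REQUIRED))  # ['avx2', 'fma']
```

If the optimized library does use any extension reported as missing, a SIGILL on the first such instruction would be the expected symptom; if it does not, the cause lies elsewhere.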

somewacko · 2017-11-25T06:19:37Z

(Haven't used this project, but chiming in from a TF thread)

@admsyn Are you dynamically linking another version of protobuf (either in your program, or possibly in another library you're using)? One incredibly sneaky thing about TF's C++ API is that it requires you to include all of its core internal code, which includes TF's protobuf objects, in its headers. This makes your program implicitly dependent on protobuf. Because of this, if you dynamically load a version of protobuf that is different from the one TF was compiled with, you get mysterious crashes like these, since there is a mismatch between the protobuf your program is using and the protobuf that is statically linked inside the TF library. It's possible that this was causing the crash you're experiencing.

I've run into this with a different project, and the workaround seems to be one of:

Make sure you are using the same version of protobuf everywhere, including downstream dependencies (can be difficult/impossible for complex projects)
Use a script in TensorFlow's source code that converts your protobuf source so that the namespace is named proto3 instead of proto.
Use the C API, which provides an actual interface layer that separates your code from TF's internals (not applicable for this project though)

The core problem is with how TF's C++ API is designed, so there's no real solution for this, but it is something to be aware of when integrating TF in other C++ projects.
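One rough way to check whether a second, dynamically linked protobuf is in play is to scan the `ldd` output of the executable for a `libprotobuf` entry. The Python sketch below does only that text scan. The sample output, its paths and its addresses are made up for illustration, and `find_protobuf_libs` is a hypothetical helper. The copy statically linked inside `libtensorflow_cc.so` will not show up as a separate entry, so any hit would indicate an additional protobuf being loaded.

```python
import re

# Matches ldd lines of the form "<name> => <path> (<address>)".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def find_protobuf_libs(ldd_output):
    """Return (name, path) pairs for shared libraries whose name contains 'protobuf'."""
    hits = []
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and "protobuf" in match.group(1):
            hits.append((match.group(1), match.group(2)))
    return hits


# HYPOTHETICAL ldd output, not taken from the reporter's machine.
SAMPLE_LDD_OUTPUT = """\
\tlinux-vdso.so.1 (0x00007ffc00000000)
\tlibtensorflow_cc.so => /opt/example/libtensorflow_cc.so (0x00007f0000000000)
\tlibprotobuf.so.10 => /usr/lib/x86_64-linux-gnu/libprotobuf.so.10 (0x00007f0000100000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000200000)
"""

if __name__ == "__main__":
    for name, path in find_protobuf_libs(SAMPLE_LDD_OUTPUT):
        print(name, "->", path)
```

To run it against a real binary, the output of `ldd ./example-pix2pix` could be passed in as the string. An empty result does not rule the mismatch out, since a library loaded later at runtime would not appear in `ldd` output.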

admsyn changed the title ~~Crash on load from within protobuf lib in precompiled tensorflow~~ Crash on load from within protobuf in precompiled tensorflow 1.1 Jun 14, 2017
