Updated Sylvan to version 1.9.1#1882
Conversation
|
In Sylvan 1.9.1. the |
|
Ideally you should not need to make any changes to Sylvan/Lace so that future upgrades are seamless. |
There was a problem hiding this comment.
Probably cleaner to do this in CMakeLists.txt of mCRL2.
There was a problem hiding this comment.
I already include this in the v2.x.x branch of Lace, I can backport the -fPIC stuff to the lace-1 branch if that makes this cleaner.
There was a problem hiding this comment.
I don't remember why -fPIC was required, but this would probably be helpful indeed. The set(LACE_ENABLE_PIC ON CACHE BOOL "" FORCE) trick does not seem to be sufficient to resolve the linking errors:
: && /home/mlaveaux/.cargo/bin/sccache clang++ -fPIC -O2 -g -DNDEBUG -shared -Wl,-soname,libmcrl2_symbolic.so -o stage/lib/libmcrl2_symbolic.so libraries/symbolic/CMakeFiles/mcrl2_symbolic.dir/source/ldd_stream.cpp.o -Wl,-rpath,/home/mlaveaux/mCRL2/build/stage/lib: stage/lib/libmcrl2_data.so 3rd-party/sylvan/src/lib/libsylvan.a stage/lib/libmcrl2_core.so stage/lib/libmcrl2_atermpp.so stage/lib/libdparser.so stage/lib/libmcrl2_utilities.so -ldl 3rd-party/lace/lib/liblace.a -lpthread -lm && :
/usr/bin/ld: 3rd-party/lace/lib/liblace.a(lace.c.o): relocation R_X86_64_TPOFF32 against `current_worker' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: failed to set dynamic section sizes: bad value
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
There was a problem hiding this comment.
That's because Lace doesn't have it yet, it's part of the changes in Lace v2. I can backport it to Lace v1.
There was a problem hiding this comment.
Ah my bad, I misunderstood. It seems that we can work around it from the mCRL2 side with these two lines, so perhaps backporting is not necessary:
set_target_properties(sylvan PROPERTIES POSITION_INDEPENDENT_CODE TRUE)
set_target_properties(lace PROPERTIES POSITION_INDEPENDENT_CODE TRUE)
|
Is the current design intended to use the C++ wrappers from different threads, or are ltsreach and pbessolvesymbolic by themselves single-threaded? If every call is from the same thread, then this is fine. Otherwise, some work needs to be done to ensure that garbage collection doesn't erase BDDs that are only referenced by a result that is currently only stored as a return value from an external task. For example by wrapping each (!) call in something like this: VOID_TASK_3(bdd_and_wrapped, mtbdd*, result, mtbdd, lhs, mtbdd, rhs)
void bdd_and_wrapped(mtbdd* result, mtbdd lhs, mtbdd rhs) {
*result = CALL(bdd_and, lhs, rhs);
} |
|
@trolando. These tools are all single threaded and only call Sylvan operations from the main thread. However, from what I remember we call some functions that are not guarded by the RUN macro. I suppose the easiest solution would be to wrap the whole tool implementation inside a lace task. |
|
Yes, that would work. Wrap the whole program in a Lace task if this doesn't cause issues with thread-specific variables you are using and things like that. It will always be an improvement if you can wrap a single-threaded program into a Lace task because then there's less overhead from the context switches. In Sylvan 2.0 I am very likely going to make some changes here though. One of the goals is easier support for Java, Python bridges etc. Most likely all methods will move to the "bdd_op(result_ptr, param1, param2, ...)" format for any operation which computes a BDD. I also intend to change it so if there is an out of memory, it sets the result to a magic value SYLVAN_OUTOFMEM or something like that, instead of crashing out. This also enables a potential future option to use the return value for something else. |
|
Let me know if there's anything I should do on the Sylvan/Lace side. The current roadmap is that I'm adding "proper" Windows (MSVC) support to Lace and Sylvan. I have completed this for Lace 2.0 and am working on the Sylvan version. How much urgency is there for supporting Windows natively? I could try to backport these changes to the Lace 1.x branch and the Sylvan 1.x (master) branch, which would then allow mCRL2 to use Sylvan in Windows (MSVC). Currently Windows is supported via MSYS2 (POSIX). |
|
@trolando I think most of our users know how to use WSL by now since also our compiling rewriter also does not work on Windows/MSVC. So I think backporting might not be that urgent, but having MSVC supports would be nice. The lace macros contain code that is not valid in C++ code: Before this was not so much of an issue since it was all in Sylvan itself. We can probably ignore this for the time being, but I think given the fact that it is somewhat necessary to use the lace tasks in the C++ side as well it would be nice if the macros only produced compliant C++ code. |
1aa9f4e to
feaadc7
Compare
|
Well, that's a pretty silly review, considering that the availability of Lace ensures that the I'm fairly sure the compound literals warning comes from something like this: ts = (TailSplitNA){{head,head+1}};which could probably be replaced by ts.ts.head = head;
ts.ts.split = head + 1; I just don't know if the generated assembly will be different. |
|
By the way you can if you want do a |
|
By the way which compiler gave the error |
|
I get these warnings on clang version |
|
It seems that thread sanitizer reports some data race, although it could be a false positive I thought it would at least be good to check with @trolando. There are also some reports of lace_start leaking a thread even though I do use |
|
Yes, there are quite a few intentional data races in Lace. The one reported here is on the There are more such data races, e.g., Regarding the leaking thread: if you inspect |
|
I think it should indeed be a relaxed atomic since data races are instantly undefined behaviour in C++. The relaxed atomics should be free on x86_64, but I don't know about ARM processors. The other issue is probably indeed not joining the thread, although I don't know what this would matter in C++. |
|
Is dealing with these data races and the threading stuff a show stopped? I can fix |
|
I think we can technically work around it for the time being by maintaining a suppression file for the thread sanitizer warnings. |
|
Okay, this was a fair bit of work. I think I managed to address the problem for now. Lace v1.5.2 should not trigger the sanitizer anymore. Not sure about Sylvan, though. Some of these are fairly annoying to deal with. For example, after allocating memory with |
|
Most of the data races seem to be resolved @trolando, there is still one being reported within Sylvan itself: EDIT: I forgot to mention, but this is with the latest version of Sylvan and lace 1.5.2. Furthermore, it seems that the undefined behaviour sanitizer reports on some misaligned read. I don't know if this is something you intend to resolve, but I figured I would mention it any way. |
|
The misaligned read is something that's been there for as long as LDD has been in the package. mddnode_getvalue(mddnode_t n)
{
return *(uint32_t*)((uint8_t*)n+6);
}In Sylvan 2.0 this node structure will change so the misaligned read might also go away then. I could backport this change already to the 1.9.x versions however it also changes the binary file format so I'd rather wait until Sylvan 2.0 with that change. The error regarding The data race: one thread is doing a pretty standard table lookup, the other one is in the middle of garbage collection. That would be an error. Are you sure you are not calling |
|
You could update to Sylvan v1.9.4 and Lace v1.5.3 which should resolve some issues. |
Updated Sylvan to the latest version, this pull request is simply to run the CI. This should resolve thread safety issues for ARM processors.