ARTEMIS-5376 Include all messages in queue management operations#5593
ARTEMIS-5376 Include all messages in queue management operations#5593AntonRoskvist wants to merge 1 commit intoapache:mainfrom
Conversation
4f00574 to
7522389
Compare
|
|
|
I saw Justin started a run of your PR in one of our CIs.. Just out of coincidence I saw that my Jenkins was stuck, I logged into the POD and I saw this deadlock while running org.apache.activemq.artemis.tests.integration.management.ManagementWithPagingServerTest at first I thought it was non related to your PR (as I am working hard on paging right now to make sure there are no deadlocks), but as i read through your changes you changed some Iterators to use iterQueue.. and iterQueue itself will have a different locking than the previous one that was using synchronized. I think this introduces a real issue / deadlock.. you can probably flush this out by running ManagementWithPagingServerTest in a loop with your IDE (Idea has some options for that) I also suggest rebasing, but the issue I saw it wouldn't be fixed by any rebasing.. just to be on the tip of the branch. |
7522389 to
4bc3529
Compare
|
Thanks for reviewing @jbertram and @clebertsuconic. I believe I have found the issue for both points and I'm pushing my changes now, though I will make a second check tomorrow as well to make sure I didn't miss anything (running the full test suite takes some time on my work laptop). @jbertram I believe the addition of a short wait was all that's needed for the LVQ-test, please let me know if you think this is incorrect. @clebertsuconic Nice catch, with moving the Again, I will run additional tests on this and update with what I find. Thanks |
3cbc925 to
59a603d
Compare
|
Sorry for taking so long to get back to you on this. I believe these changes should cover the cases you noted. I don't really like the change on how to handle the LVQ case, though at this moment I don't know how to handle it differently while also making use of the Regardless, this should be working as intended and without locking... if you have any suggestions on a better approach for the LVQ case just let me know. |
|
I rebased this and ran CI on it and there are failures related to LVQ semantics in:
The problem appears to be that invoking |
e303b4d to
a1322e1
Compare
|
Okay, I'll take a look at that. Disregard the push I made earlier, it was caused by attempting to rebase my branch through github, won't be trying that again... |
a1322e1 to
600a8c5
Compare
|
This should be looking better. Wish I could say the full test suite passes with these changes but I'm having some issues running the suite in it's entirety at the moment (against both main and this PR) where many tests are failing intermittently (mostly around AMQP and clustering/bridging). It's probably something on my end but I have yet to figure out what... in any case, it's making proper validation of the changes difficult. I will look into that during next week but until then, from all I've been able to test and validate, this is looking good. |
Currently, some queue management operations act on all messages (including paged and scheduled messages) and some don't. Some operations use their own iterator logic (internally) and some use a common method.
This change aims to improve this by adding some functionality into the 'QueueIterateAction' class and then have additional management operations use it (so that their behavior is consistent with regards to what messages they target and how)
As of now I have left expire and delete reference/references out as I wasn't 100% sure it wont have any adverse effects. If desired, they should be trivial to add as well.