So far I have found differences in MTU and using (or not using) TIER1 network, which would influence bandwidth. I am preparing a size32 study directory anticipating testing this. Signed-off-by: vsoch <vsoch@users.noreply.github.com>
This likely won't be merged, but I'll add the results (from when I ran them) for transparency. This thread is from December 15th, 2024. Some notes on retesting Compute Engine, using the notes I made above. First, we still can't get COMPACT. Even for 10 nodes, the nodes spin indefinitely: it reaches some timeout around 15-16 minutes and then starts again, and I think this would go on forever. These are c2d-standard-112. I'm going to restart without COMPACT. It's not looking any faster, but I'll wait until I've done 3 iterations at both sizes to say that for sure. Results! This is for size 32.
My early conclusions:
TLDR: I would not blame the difference between GKE and Compute Engine on MTU or TIER-1, at least for LAMMPS. I don't know what else we could look at that we did "wrong," because we can't get COMPACT or better resources. Anyway, I guess I burned a few hundred dollars and a big chunk of today, but it was worth a try. I had tiny hopes.





So far I have found differences in MTU and in using (or not using) the TIER1 network, both of which would influence bandwidth. I am preparing a size32 study directory in anticipation of testing this. We were never able to get COMPACT mode on Compute Engine, so I'm thinking that will still be the case. I haven't found other differences yet but am still looking.
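
For anyone wanting to double check the MTU difference from inside a node, here is a minimal sketch, assuming Linux nodes that expose interface settings under `/sys/class/net`. The function name `read_mtus` is hypothetical and not part of the study scripts; running it on a GKE node and on a Compute Engine VM and comparing the output would show whether the two environments actually differ.

```python
from pathlib import Path


def read_mtus(sys_net="/sys/class/net"):
    """Return {interface_name: mtu} for each interface found under sys_net.

    Assumption: a Linux host where each interface directory contains an
    "mtu" file. Entries without a readable integer "mtu" file are skipped.
    """
    mtus = {}
    root = Path(sys_net)
    if not root.is_dir():
        # Not Linux, or sysfs is not mounted: nothing to report.
        return mtus
    for iface in sorted(root.iterdir()):
        try:
            mtus[iface.name] = int((iface / "mtu").read_text().strip())
        except (OSError, ValueError):
            # Not an interface directory, or the value could not be parsed.
            continue
    return mtus


if __name__ == "__main__":
    for name, mtu in read_mtus().items():
        print(f"{name}\t{mtu}")
```

This only reports what the guest sees; it does not say anything about the TIER1 setting, which is configured on the instance and has to be checked on the cloud side.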