b4b: Move the changes to initialize the task decomposition from mpi_scan to main development #3666
ekluzek wants to merge 67 commits into ESCOMP:b4b-dev
Conflicts: src/cpl/share_esmf/lnd_set_decomp_and_domain.F90
…_and_domain_from_readmesh
…_readmesh subroutine
…e destroyed so remove the destroy for the distgrid, and the two meshes, this runs but doesn't seem to lower memory
…e about leaving the distgrid around, and also delete the meshes as it seems to work with this in place
Conflicts: src/main/decompInitMod.F90
Conflicts: src/cpl/share_esmf/lnd_set_decomp_and_domain.F90
…ted error messaging Conflicts: src/main/decompInitMod.F90
…he subname, create new internal subroutines in decompInit_lnd for allocate, clean, and check errors, move the check errors part to the first thing done Conflicts: src/main/decompInitMod.F90
…array sizes are set before allocates, initialize some decompMod values to invalid for error checking, add error checking to get_proc_bounds/get_proc_clumps, separate out allocate for gindex to own allocate method, as it has to be done later after decomp is done, these are all improvements in ESCOMP#3448 Conflicts: src/main/decompInitMod.F90
… add error handling of nsegspc, don't check endCohort in get_proc_bounds and get_clump_bounds as doesn't seem to be set Conflicts: src/main/decompInitMod.F90
…etup/clean for each DecompInit test, move the decomp_mod_clean to decompMod and use it for the decompInit tests Conflicts: src/main/decompInitMod.F90 src/self_tests/TestDecompInit.F90 --- removed
…re to the regular operation Conflicts: src/main/decompInitMod.F90
…mpi-serial Conflicts: src/main/decompInitMod.F90
Conflicts: src/main/decompInitMod.F90
…sor_type structure, and start adding a couple methods to help get them set
…for mpiscan and verify it, allocate the new procinfo gi and gj indices, make sure they are set, compiles but fails at run Conflicts: src/main/decompInitMod.F90
…the call expects all subgrid levels to be set
Conflicts: src/main/decompInitMod.F90
…ocate for the local task, this works for the serial case Conflicts: src/main/decompInitMod.F90
… clump_pproc for the setting of ggidx Conflicts: src/main/decompInitMod.F90
…moved for the final version Conflicts: src/main/decompInitMod.F90
…s for serial mode Conflicts: src/main/decompInitMod.F90
…global clumps Conflicts: src/self_tests/TestDecompInit.F90
… the log that aren't needed anymore Conflicts: src/main/decompInitMod.F90
…f there is also an endrun in it, so calc_globalxy_indices will need to be changed to a pure function that returns values that can be checked at the call level
…urrent unit tests now work
… remove pure from the calc_ routines in decompMod, and write out information, the unit test does work now
…r returns as nglob_x/nglob_y should be set before used
…en't covered in the more public calc_globalxy_indices
…in each test, and add some tests that notes suggested
…e set_decomp_info subroutine and make it public for unit testing, this doesn't work as it fails in clm_ptrs_compdown with an endrun, as a bad index of -9999 is given to LONDEG
I remember where I got stuck with this. I was making changes to comply with what we decided in #3476, and got stuck in some of the unit testing for that.
   !write(iulog,*) 'WARNING: Global gi index is out of bounds'
   return
end if
if ( (this%gj(g) < 1) .or. (this%gj(g) > nglob_x) ) then
@briandobbins noticed a correction here. The upper bound in the gj check above should be nglob_y, not nglob_x.
Add a unit-test that would detect this problem
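A minimal sketch of the kind of test that would catch this, written as plain Fortran rather than pFUnit so it stands alone. The module name, the function `indices_in_bounds`, and the grid sizes are hypothetical and are not taken from decompMod. The sketch also assumes that gi is bounded by nglob_x and gj by nglob_y, as the review comment implies. The point is that the test grid must be non-square, since a square grid hides an x/y mix-up.

```fortran
! Hypothetical, simplified stand-in for the bounds check under review.
! Only gi, gj, nglob_x and nglob_y come from the PR; all other names are illustrative.
module bounds_check_sketch
  implicit none
  private
  public :: indices_in_bounds
contains
  ! Returns .true. when (gi, gj) lies inside an nglob_x by nglob_y grid.
  pure logical function indices_in_bounds(gi, gj, nglob_x, nglob_y) result(ok)
    integer, intent(in) :: gi, gj
    integer, intent(in) :: nglob_x, nglob_y
    ok = (gi >= 1 .and. gi <= nglob_x) .and. &
         (gj >= 1 .and. gj <= nglob_y)   ! gj is bounded by nglob_y, not nglob_x
  end function indices_in_bounds
end module bounds_check_sketch

program test_bounds_check
  use bounds_check_sketch, only : indices_in_bounds
  implicit none
  ! Non-square grids in both orientations, so either direction of mix-up fails.
  integer, parameter :: nx_wide = 10, ny_wide = 4
  integer, parameter :: nx_tall = 4,  ny_tall = 10

  ! Wide grid: gj = 5 exceeds ny = 4; a check against nglob_x would accept it.
  if (indices_in_bounds(3, 5, nx_wide, ny_wide)) &
       error stop "gj above nglob_y was accepted"

  ! Tall grid: gj = 10 is valid; a check against nglob_x would reject it.
  if (.not. indices_in_bounds(1, ny_tall, nx_tall, ny_tall)) &
       error stop "valid gj was rejected"

  ! Corner cells are valid, and gi one past nglob_x is not.
  if (.not. indices_in_bounds(1, 1, nx_wide, ny_wide)) &
       error stop "lower corner was rejected"
  if (.not. indices_in_bounds(nx_wide, ny_wide, nx_wide, ny_wide)) &
       error stop "upper corner was rejected"
  if (indices_in_bounds(nx_wide + 1, 1, nx_wide, ny_wide)) &
       error stop "gi above nglob_x was accepted"

  print *, "all bounds checks passed"
end program test_bounds_check
```

In the real unit tests the same idea would presumably be expressed through the existing decompInit test setup, with a decomposition whose nglob_x and nglob_y differ.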
   call endrun(msg="nclumps is NOT set before allocation", file=sourcefile, line=__LINE__)
   return
end if
! TODO: This will be moved to the other allocate and for a smaller size ----
Add the issue number.
Notes from meeting today with @ekluzek and @briandobbins:
We later realized that this isn't critical to do. The number of clumps is basically the number of processors, so even though it is stored on every task, the memory for it isn't as big as, for example, the memory that scales with the number of land grid cells. So we can let #3466 happen when there's time for it.
…bscripts are set properly and write them out in a shared section, rather than for each subgrid level, also do returns after some of the endrun calls which is needed for the PF unit tests
#3931 had an impact here and is the reason behind the unit tests not working. Taking this into account should allow me to make progress again.
Description of changes
This moves the core code changes to initialize the processor decomposition from the mpi_scan branch in #3469 to b4b-dev. This removes some of the changes for memory checking and additional self testing as well as some of the additional timers that don't look useful now.
I created two earlier branches before settling on the process in #3665, where I worked out how to keep this from having too many commits and being hard to do. I also figured out how to remove merge commits, as they need special handling and usually aren't wanted in a case like this. Another way would have been to do this outside of git, which might have been of similar length to the final version, but could have missed some important changes.
Specific notes
Contributors other than yourself, if any: John Dennis
CTSM Issues Fixed (include github issue #):
Fixes #3370
Fixes #3368
Fixes #3672
Some work on #3448
Some work on #3476
Are answers expected to change (and if so in what way)? No (the determination of the decomposition is identical as well)
Any User Interface Changes (namelist or namelist defaults changes)? No
Does this create a need to change or add documentation? Did you do so? No No
Testing performed, if any: Will run standard testing
The mpi_scan testing branch has had all the test lists run for it: aux_clm, ctsm_sci, decomp_init, decomp_init_uhr, and fates