Rich:

Ok, here goes. Attached are the following files:

  calldynamo            # fragment from advance.F in which the dynamo is called
  defs.h                # header file with grid parameters
  dynamo.F              # main dynamo module source file
  mp_updatephi.sub      # sub used to scatter the electric potential to non-root tasks
  params.F              # main model parameters (see also defs.h)
  perf.ps               # perf test results for 1, 4, 8, 16, 24, and 32 PEs (secs/step)
  perf_dyn.ps           # effect of the dynamo on the perf test (% of elapsed time in dynamo)
  tiegcm1.8_eldynamo.ps # description of the dynamo implementation by Astrid Maute

The gather of fields on the geographic grid to the root task, as input to the dynamo code, is done in sub mp_dynamo_gather (called by sub prep_dynamo) in dynamo.F. The output of the dynamo module is the 3d electric potential (phi), which is scattered back out to the non-root tasks by sub mp_updatephi. I just noticed that I should probably be using non-blocking isend and irecv in mp_updatephi, although we have not had problems with it as is. The calls to prep_dynamo, dynamo, and mp_updatephi are in the calldynamo file, which is a fragment taken from a driver inside the main time loop. Please ignore subs ending in "_dyn0" in dynamo.F -- these are called only if the model is being run without electrodynamics (i.e., almost never, or for debugging only).

The simple performance tests were made last spring with the 2.5-deg resolution TIEGCM, for 1-day runs on lightning, bluesky, and bluevista. We generally run this code with 16 processors. Recent similar tests were performed with our larger model (TIMEGCM) at the same resolution, and it scaled to 48 processors. The dynamo code is essentially the same in both models, so I think this reflects the fact that the dynamo is a smaller percentage of the total code in TIMEGCM than in TIEGCM. I chose to give you source from TIEGCM because that's the model described in Astrid's doc, but the source and method are basically the same in both models. The entire tiegcm1.8 source can be found in /fis/hao/tgcm/tiegcm1.8/src.

The PDE solver is called by sub dynamo (dynamo.F). The solver source files are mud.F, mudcom.F, and mudmod.F in the above directory. These are slightly modified versions of John Adams' MUDPACK multigrid solver; see http://www.cisl.ucar.edu/css/software/mudpack/

The apex coordinate code is called by all MPI tasks only once per run, so I have not included it here. The source is apex.F and apex_subs.F, also in the above source code directory.

I have not done timing in the dynamo itself for a long time, but as I recall, much of the CPU time is spent doing the field-line integrations (sub fieldline_integrals), called from sub transf (dynamo.F) inside a magnetic latitude loop labelled "maglat_loop". It is this loop that I've considered threading with OpenMP as a first step. As I mentioned, I think the field lines are mostly independent of each other, so an option might be to decompose into field-line "bundles" rather than across the horizontal magnetic grid.

I think that's about it. I am cc'ing Astrid Maute (maute@ucar.edu). Rich, I really appreciate your time and help on this problem. Please let me know if there is anything else you need right now... maybe after the guys at Parallel Software Products take a look, if we are still go, I can write up a small code to run the dynamo as a stand-alone, without the bulk of the rest of the model.

--Ben
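
For what it's worth, below is a minimal sketch of the kind of non-blocking isend/irecv pattern I mean for the phi scatter. This is not the actual mp_updatephi code: the routine name, array dimensions, message tag, and communicator are placeholders, and it assumes root simply sends the full phi array to every non-root task.

  ! Sketch only -- names and shapes are placeholders, not the mp_updatephi interface.
  ! MPI_REAL assumes default real; a real*8 phi would need MPI_REAL8 instead.
  subroutine scatter_phi_nonblocking(phi, nlonp, nlat, nlev)
    use mpi
    implicit none
    integer, intent(in) :: nlonp, nlat, nlev
    real, intent(inout) :: phi(nlonp,nlat,nlev)   ! 3d electric potential
    integer :: mytid, ntask, n, nreq, ier
    integer, allocatable :: ireq(:), istat(:,:)

    call mpi_comm_rank(MPI_COMM_WORLD, mytid, ier)
    call mpi_comm_size(MPI_COMM_WORLD, ntask, ier)
    allocate(ireq(ntask), istat(MPI_STATUS_SIZE, ntask))
    nreq = 0

    if (mytid == 0) then
      ! Root posts a non-blocking send of phi to each non-root task...
      do n = 1, ntask-1
        nreq = nreq + 1
        call mpi_isend(phi, nlonp*nlat*nlev, MPI_REAL, n, 1, &
                       MPI_COMM_WORLD, ireq(nreq), ier)
      enddo
    else
      ! ...and each non-root task posts a matching non-blocking receive.
      nreq = 1
      call mpi_irecv(phi, nlonp*nlat*nlev, MPI_REAL, 0, 1, &
                     MPI_COMM_WORLD, ireq(1), ier)
    endif

    ! Wait for all outstanding requests before phi is used again.
    call mpi_waitall(nreq, ireq, istat, ier)
    deallocate(ireq, istat)
  end subroutine scatter_phi_nonblocking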
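
And a rough sketch of the OpenMP directive I have in mind for the maglat_loop in sub transf, assuming the field-line integrations really are independent across magnetic latitudes. The loop bound, loop variable, and the call signature of fieldline_integrals are placeholders; the real private/shared lists would have to be worked out from the variables actually used inside the loop.

  ! Sketch only -- nmlat0 and the fieldline_integrals argument list are placeholders.
  subroutine maglat_loop_sketch(nmlat0)
    implicit none
    integer, intent(in) :: nmlat0
    integer :: lat
    external :: fieldline_integrals   ! existing sub in dynamo.F; signature assumed here

  !$omp parallel do private(lat) schedule(dynamic)
    do lat = 1, nmlat0                ! maglat_loop: one magnetic latitude (field line) per iteration
      call fieldline_integrals(lat)   ! assumed independent across lat (no shared writes)
    enddo
  !$omp end parallel do
  end subroutine maglat_loop_sketch

Dynamic scheduling is there only as a guess, in case the cost per field line is uneven; static would do if the work is roughly uniform.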