Rich:

Ok, here goes. Attached are the following files:

  calldynamo            # fragment from advance.F in which the dynamo is called
  defs.h                # header file with grid parameters
  dynamo.F              # main dynamo module source file
  mp_updatephi.sub      # sub used to scatter the electric potential to non-root tasks
  params.F              # main model parameters (see also defs.h)
  perf.ps               # perf test results for 1, 4, 8, 16, 24, and 32 PEs (secs/step)
  perf_dyn.ps           # effect of the dynamo on the perf test (% of elapsed time in dynamo)
  tiegcm1.8_eldynamo.ps # description of the dynamo implementation by Astrid Maute

The gather of fields on the geographic grid to the root task, as input to the dynamo code, is done in sub mp_dynamo_gather (called by sub prep_dynamo) in dynamo.F. The output of the dynamo module is the 3d electric potential (phi), which is scattered back out to the non-root tasks by sub mp_updatephi. I just noticed that I should probably be using non-blocking isend and irecv in mp_updatephi, although we have not had problems with it as is. The calls to prep_dynamo, dynamo, and mp_updatephi are in the calldynamo file, which is a fragment taken from a driver inside the main time loop. Please ignore subs ending in "_dyn0" in dynamo.F -- these are called only if the model is being run without electrodynamics (i.e., almost never, or for debugging only).

The simple performance tests were made last spring with the 2.5-deg resolution TIEGCM, for 1-day runs on lightning, bluesky, and bluevista. We generally run this code with 16 processors. Recent similar tests were performed with our larger model (TIMEGCM) at the same resolution, and it scaled to 48 processors. The dynamo code is essentially the same in both models, so I think this reflects the fact that the dynamo is a smaller percentage of the total code in TIMEGCM than in TIEGCM. I chose to give you source from TIEGCM because that's the model described in Astrid's doc, but the source and method are basically the same in both models. The entire tiegcm1.8 source can be found in /fis/hao/tgcm/tiegcm1.8/src.

The PDE solver is called by sub dynamo (dynamo.F). The solver source files are mud.F, mudcom.F, and mudmod.F in the above directory. These are slightly modified versions of John Adams' MUDPACK multigrid solver; see http://www.cisl.ucar.edu/css/software/mudpack/

The apex coordinate code is called by all MPI tasks only once per run, so I have not included it here. The source is apex.F and apex_subs.F, also in the above source code directory.

I have not done timing in the dynamo itself for a long time, but as I recall, much of the CPU time is spent doing the field-line integrations (sub fieldline_integrals), called from sub transf (dynamo.F) inside a magnetic latitude loop labelled "maglat_loop". It is this loop that I've considered threading with OpenMP as a first step. As I mentioned, I think the field lines are mostly independent of each other, so an option might be to decompose into field-line "bundles" rather than across the horizontal magnetic grid.

I think that's about it. I am cc'ing Astrid Maute (maute@ucar.edu). Rich, I really appreciate your time and help on this problem. Please let me know if there is anything else you need right now... maybe after the guys at Parallel Software Products take a look, if we are still go, I can write up a small code to run the dynamo as a stand-alone, without the bulk of the rest of the model.

--Ben
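
For what it's worth, below is a minimal sketch of the kind of non-blocking isend/irecv pattern I mean for the phi scatter. This is not the actual mp_updatephi code: the routine name, array dimensions, message tag, and communicator are placeholders, and it assumes root simply sends the full phi array to every non-root task.

  ! Sketch only -- names and shapes are placeholders, not the mp_updatephi interface.
  ! MPI_REAL assumes default real; a real*8 phi would need MPI_REAL8 instead.
  subroutine scatter_phi_nonblocking(phi, nlonp, nlat, nlev)
    use mpi
    implicit none
    integer, intent(in) :: nlonp, nlat, nlev
    real, intent(inout) :: phi(nlonp,nlat,nlev)   ! 3d electric potential
    integer :: mytid, ntask, n, nreq, ier
    integer, allocatable :: ireq(:), istat(:,:)

    call mpi_comm_rank(MPI_COMM_WORLD, mytid, ier)
    call mpi_comm_size(MPI_COMM_WORLD, ntask, ier)
    allocate(ireq(ntask), istat(MPI_STATUS_SIZE, ntask))
    nreq = 0

    if (mytid == 0) then
      ! Root posts a non-blocking send of phi to each non-root task...
      do n = 1, ntask-1
        nreq = nreq + 1
        call mpi_isend(phi, nlonp*nlat*nlev, MPI_REAL, n, 1, &
                       MPI_COMM_WORLD, ireq(nreq), ier)
      enddo
    else
      ! ...and each non-root task posts a matching non-blocking receive.
      nreq = 1
      call mpi_irecv(phi, nlonp*nlat*nlev, MPI_REAL, 0, 1, &
                     MPI_COMM_WORLD, ireq(1), ier)
    endif

    ! Wait for all outstanding requests before phi is used again.
    call mpi_waitall(nreq, ireq, istat, ier)
    deallocate(ireq, istat)
  end subroutine scatter_phi_nonblocking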
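
And a rough sketch of the OpenMP directive I have in mind for the maglat_loop in sub transf, assuming the field-line integrations really are independent across magnetic latitudes. The loop bound, loop variable, and the call signature of fieldline_integrals are placeholders; the real private/shared lists would have to be worked out from the variables actually used inside the loop.

  ! Sketch only -- nmlat0 and the fieldline_integrals argument list are placeholders.
  subroutine maglat_loop_sketch(nmlat0)
    implicit none
    integer, intent(in) :: nmlat0
    integer :: lat
    external :: fieldline_integrals   ! existing sub in dynamo.F; signature assumed here

  !$omp parallel do private(lat) schedule(dynamic)
    do lat = 1, nmlat0                ! maglat_loop: one magnetic latitude (field line) per iteration
      call fieldline_integrals(lat)   ! assumed independent across lat (no shared writes)
    enddo
  !$omp end parallel do
  end subroutine maglat_loop_sketch

Dynamic scheduling is there only as a guess, in case the cost per field line is uneven; static would do if the work is roughly uniform.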