Warning: This post may anger, perhaps for good reason, computer scientists and/or actual developers.
Recently I was trying to evolve models of low-mass stars (near the fully convective transition mass or ~0.3Msolar) using the Dartmouth Stellar Evolution Program (DSEP). DSEP is an older piece of FORTRAN 77 code (with some of the 9x variants thrown in there for good measure). Because I don’t generally consider myself a digital masochist, I’ve spent the better part of the last year developing a python wrapper (pysep) for DSEP. I think pysep is great, it allows me to much more quickly spin up models, to programmatically enforce consistency between input variables, and to package up models so they can be transparently run across any computer with pysep installed. However, I realized in running these low mass models another way in which pysep makes using DSEP dramatically easier. It let’s me paste python’s dynamic programming tools on top of a program that doesn’t have anything remotely similar. Let me explain….
Dynamic Programming
I want to be clear what I mean when I say dynamic programming, because some quick googling makes it seem that there is not a single clear name for what I am gonna talk about. When I say dynamic programming I mean writing code which changes itself at run time. I’ve heard people in talks refer to this as dynamic programming, and for the rest of this article I will refer to it as such; however, apologies if this is simply a colloquialism and not the appropriate term.
For those who hear the phrase “the code rewrites itself at run time” and may be confused that’s understandable. Dynamic programming is a powerful tool. It’s a tool that’s so powerful it should almost never be used and consequently is, in my experience, very rarely taught or even talked about with scientists. In Python, dynamic programming takes the form of the interpreter being able to execute strings into runnable python code (with the exec) or evaluate a string with a single valid python statement within (with eval). For example:
# Note how this is a string
dynamicFunc = """def helloWorld():
print("Hello, World!")
"""
# if we call helloWorld() here it will return an error as helloWorld() is not a defined function
# This line will evaluate the entire string as python code
# thus because helloWorld is defined as a valid function within the string dynamicFunc once exec(dynamicFunc) is run helloWorld() will be defined in the global scope
# Note that exec does not return anything, so exec(4+6) would run 4+6 but not store the output anywhere
exec(dynamicFunc)
helloWorld()
# eval, unlike exec, can be used to evaluate a single statement and will return the result
r = eval("4+6")
print(r)
Out[1]: Hello, World! Out[2]: 10
The above is a simple example, and functionally equivalent to just defining the function normally and then adding 4 to 6. Note however, how the actual function and addition statement are all within strings. It is the line using exec that takes that first string and evaluates it in such a way that subsequent lines can call the function, and the line with eval that takes 4+6 and does the math.
Where dynamic programming gets its power is when you realize that you can modify strings at run time very easily. You can change the behavior of functions just by modifying strings and running exec on that string again. Once you start using dynamic programming it can seem like a panacea. However, this style of programming leads to so many knock-on-issues (primarily related to code readability, debugability, and security) that you should avoid it almost any time it might seem helpful. This post is not about when you should not use dynamic programming tho; rather, its about one limited set of cases where it actually is very helpful.
Debugging
So let’s return to the models of low-mass stars I was running. I wanted to run these models at a very high resolutions. I have a function which takes in a some initial setup dictionary (containing the input physics for the stellar model), and returns a dictionary based on that setup one but with certain parameters tweaked so that the model runs at a higher resolution.
def set_high_resolution(pnml):
hr_pnml = pnml.copy()
hr_pnml['lenvg'] = True
hr_pnml['atmstp'] = 0.002
hr_pnml['envstp'] = 0.002
hr_pnml['tridt'] = 1.0e-4
hr_pnml['tridl'] = 8.0e-3
hr_pnml['atmerr'] = 3.0e-5
hr_pnml['atmmax'] = 0.05
hr_pnml['atmmin'] = 0.015
hr_pnml['atmbeg'] = 0.015
hr_pnml['enverr'] = 3.0e-5
hr_pnml['envmax'] = 0.05
hr_pnml['envmin'] = 0.015
hr_pnml['envbeg'] = 0.015
hr_pnml['htoler'][0][0] = 6.0e-7
hr_pnml['htoler'][0][1] = 4.5e-7
hr_pnml['htoler'][0][2] = 3.0e-7
hr_pnml['htoler'][0][3] = 9.0e-7
hr_pnml['htoler'][0][4] = 3.0e-7
hr_pnml['htoler'][1][4] = 2.5e-8
hr_pnml['shelltol'][1] = 0.0575
hr_pnml['shelltol'][7] = 0.01
hr_pnml['shelltol'][8] = 0.01
hr_pnml['shelltol'][9] = 0.01
hr_pnml['shelltol'][10] = 0.01
hr_pnml['niter2'] = 40
return hr_pnml
The details of what each parameter is are not important; what is important, is that when I would run models without passing them to this set_high_resolution function first they would evolve just fine. However, when I would pass them to this function they would immediately crash. It was clear that something about this high resolution mode was leading to a numeric instability in the models (this is a not uncommon occurrence with DSEP when pushing it to mass ranges much outside of its main focus). The question was then what parameter was causing this instability. If I knew this I could then lower that parameters resolution until models evolved successfully.
I certainly could have written code to go through and test each one of the dict items; however, that would have been somewhat clunky as I would need something like another dict formatted to hold both the low resolution and high resolution elements for each key and then would have to iterate over that dict trying both until something crashes. That would work, but I already had these high resolution parameters types out and did not want to spend the time copying them over to a dict. Enter a perfect use for dynamic programming.
Brandon Rhodes has a great talk where he expressed how dynamic programming’s greatest strength is allowing meta examination of the language itself. His primary example is how dynamic evaluation has made tools like the IPython kernel and subsequently Jupyter much easier to develop. Here, what would be great is a quick way to “turn off” lines of the set_high_resolution function one by one until it doesn’t crash (or in the other direction turn lines on one by one till it does). The meta examination we wish to do here is to exam which line of the function is causing DSEP to crash. Dynamic programming makes this trivial.
FDEF="""def set_high_resolution(pnml):
hr_pnml = pnml.copy()
hr_pnml['lenvg'] = True
hr_pnml['atmstp'] = 0.002
hr_pnml['envstp'] = 0.002
hr_pnml['tridt'] = 1.0e-4
hr_pnml['tridl'] = 8.0e-3
hr_pnml['atmerr'] = 3.0e-5
hr_pnml['atmmax'] = 0.05
hr_pnml['atmmin'] = 0.015
hr_pnml['atmbeg'] = 0.015
hr_pnml['enverr'] = 3.0e-5
hr_pnml['envmax'] = 0.05
hr_pnml['envmin'] = 0.015
hr_pnml['envbeg'] = 0.015
hr_pnml['htoler'][0][0] = 6.0e-7
hr_pnml['htoler'][0][1] = 4.5e-7
hr_pnml['htoler'][0][2] = 3.0e-7
hr_pnml['htoler'][0][3] = 9.0e-7
hr_pnml['htoler'][0][4] = 3.0e-7
hr_pnml['htoler'][1][4] = 2.5e-8
hr_pnml['shelltol'][1] = 0.0575
hr_pnml['shelltol'][7] = 0.01
hr_pnml['shelltol'][8] = 0.01
hr_pnml['shelltol'][9] = 0.01
hr_pnml['shelltol'][10] = 0.01
hr_pnml['niter2'] = 40
return hr_pnml"""
def comment_after_line(linen):
lines = FDEF.split('\n')
newlines = list()
for lineID, line in enumerate(lines):
if lineID == 0 or lineID == 1 or lineID == 27:
newlines.append(line)
elif lineID <= 2+linen:
newlines.append(line)
return '\n'.join(newlines)
print(comment_after_line(3))
Out[3]: def set_high_resolution(pnml):
hr_pnml = pnml.copy()
hr_pnml['lenvg'] = True
hr_pnml['atmstp'] = 0.002
hr_pnml['envstp'] = 0.002
return hr_pnml
What I have done here is take the set_high_resolution function and put it within a string, FDEF. I then have another function that takes in some number of lines, linen, and returns a new string with just linen number of high resolution elements set (counted from top to bottom) from FDEF. Basically, this lets me ask for a a version which only sets the first, first two, first three, first four, and so on number of options. This is still just a string however, we can’t yet use the dynamically modified function.
checkRange = range(25)
for disableNum in checkRange:
func2eval = comment_after_line(disableNum)
exec(func2eval)
usePHY = set_high_resolution(phys1Low)
... # Do stuff with the usePHYs dict such as run a model
Now on each iteration through the loop we rebuild the set_high_resolution function with one additional line. Then, whenever the code crashes we simply print what iteration the loop was on and we know which setting was causing the numeric instability.
This is certainly not a problem that had to be solved with dynamic programming; however, this took me about 3 minutes of typing to get it working and it solved my issue immediately. I would wager that if I had went the non-dynamic route, I would have spent a good bit more time getting everything set up without any ultimate difference in the results. Once the problem setting was identified (it was shelltol[1]) I was able to adjust its value upwards. Note, that I only used the dynamic version of set_high_resolution while identifying which line was the issue. After that, there was every reason to return to more standard, and safe, programming practices.
So the conclusion here is relatively simple, dynamic programming is great in certain situations, such as the one I showed here where the data you are interested in investigating is the code itself. In my experience this essentially boils down to: it can be useful when debugging. However, in most other cases it should be avoided as it can lead to very hard to read and debug code (and that’s not even getting into the security concerns…but those aren’t too relevant to astronomers). I encourage anyone who is looking for another tool in their debugging tool belt to at least try out dynamic programming.