I’ve spent the last week or so trying to think of the right name for what I want to do here; astroBugs seems to land just on the right side of the cheeky-serious gradient.
I’ve been privileged enough in my education and career thus far to be exposed to more computer science topics than, I think, the average astronomer. Part of this is just where my past and current advisors’ research interests lie, and part is innate interest; whatever the reason, that exposure has been very influential in how I approach astronomy research going forward.
Anyone who’s studied astronomy or astrophysics in the last 50 years has undoubtedly spent time bending computers to their will; they are as fundamental to the modern trade as the lens has been since 1609. Despite the computer’s importance, it has been my experience that astronomers tend to learn how to program from other astronomers, who themselves learned from other astronomers. As the saying goes…it’s astronomers all the way down. The effect is multifaceted; however, some major downsides are:
- Archaic programming techniques stick around longer than they need to (N.B. I do not say languages here…Fortran is a great tool, but FORTRAN 77 should no longer be the style of Fortran being written).
- Younger students bring in new languages without yet having mastered either those languages’ styles or programming in general. This leads to programming practices from older languages (more often than not, the language their advisor wrote in) getting merged into new ones…eventually resulting in something of a perpetual stew of code.
- Powerful, and often easy-to-use, tools are left on the table due to a lack of exposure within the astronomy community.
There are plenty of astronomers who aren’t affected in any major way by these issues. Certainly, the people behind the likes of Astropy, specutils, the IRAF community, and many other wonderful astronomy packages are incredible developers. However, plenty of astronomers, at least anecdotally, feel like they could make better use of their time at the computer.
The Golden Rule
First, we need to defend the claim that the points made above are actually downsides. FORTRAN 77 and IDL have given their proverbial blood and sweat to the astronomical community. Why should we use anything else? What’s the problem with having a mish-mash of programming styles if it works, and why use new powerful tools when older, simpler ones can get the job done? Of course, answers that hold in general are very hard to give. In fact, there are times when FORTRAN 77 is the best choice, and it’s okay to throw some C-style syntax into a Python program. The Golden Rule is that you should do what maximizes reproducibility and minimizes digital pain. If you are doing something quick and dirty, which only needs to be done once, it matters very little how you go about it. It’s when you start writing scripts that will be run many times, or that must be re-run every time the analysis is repeated, that it becomes important to start thinking about the software development side rather than just the programming.
This Golden Rule will inform the target of astroBugs. While all are certainly always welcome, this blog will be aimed pretty squarely at the astronomers who may feel comfortable parsing RA and Dec from a FITS header but may back away from the idea of using a bash script to do that automatically and recursively through a directory structure. This is aimed at the astronomer who has used Pandas but not SQL, and at the astronomer who writes Python code and wants it to be faster but hasn’t dived into pdb or used snakeviz. In brief, this is aimed at astronomers who use computers daily but feel that there is some power behind the screen they have yet to tap into. Every so often I will post something a bit more complicated; sometimes these posts will be little more than a diary as I learn a particular skill (a recent example that comes to mind is a short foray I took into writing compiled C extensions for Python to speed up parsing of large isochrone files). More often than not, however, computer scientists, and those astronomers who are closer to the computer science end of things, will find little of practical interest here.
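To give a flavor of the "wants it to be faster" case, here is a minimal sketch of profiling a function with Python’s built-in cProfile and pstats modules. The function slow_sum_of_squares and the helper profile_call are hypothetical names invented for this illustration; they stand in for whatever analysis routine you suspect is slow.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop, standing in for a slow analysis routine (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, text report).

    This is an illustrative helper, not part of any library.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Collect the report into a string instead of printing it directly.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The report lists where the time went, function by function. Saving the raw profile with profiler.dump_stats to a file gives you something that a viewer such as snakeviz can open, if you prefer a picture to a table.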
Many of the posts here will focus on speeding up development of small tools (e.g. using command line utilities like sed and awk to automate simple tasks). Other posts will discuss oft-forgotten elements of the standard library (zip and enumerate always come to mind when thinking about Python). Finally, I want to explicitly recognize that large codebases often cannot and, in fact, should not have drastically different programming styles forced upon them. Rather, most of what I will present here may be helpful as you write new code.
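As a small taste of the zip and enumerate mention above, here is a minimal sketch that walks three parallel lists in step while keeping a running row number, with no manual index bookkeeping. The catalogue values and the function name format_catalogue are made up purely for illustration.

```python
# Hypothetical catalogue columns; the names and coordinates are invented.
names = ["star_a", "star_b", "star_c"]
ra_deg = [10.5, 120.25, 250.0]
dec_deg = [-5.25, 30.0, 61.75]


def format_catalogue(names, ra, dec):
    """Return one formatted line per object, numbered from 1."""
    lines = []
    # zip pairs up the i-th element of each list;
    # enumerate supplies the counter, starting at 1 instead of 0.
    for index, (name, r, d) in enumerate(zip(names, ra, dec), start=1):
        lines.append(f"{index}: {name} RA={r:.2f} Dec={d:+.2f}")
    return lines


for line in format_catalogue(names, ra_deg, dec_deg):
    print(line)
```

The loop this replaces usually looks like for i in range(len(names)) followed by names[i], ra_deg[i], and so on. That version works, but it is easier to get wrong and harder to read. Note that zip stops at the shortest input, so lists of unequal length are silently truncated.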
About Me
I’m (currently) a third-year graduate student studying stellar evolution and the chemical composition of globular clusters at Dartmouth College. I help maintain and develop the Dartmouth Stellar Evolution Program (DSEP; I’d provide a link, but it is closed source), a very efficient stellar evolution program (for stars below a mass of roughly 5 solar masses). My work with DSEP builds on experience I’ve gained from writing Python, C, and Fortran programs for research since I was a freshman in college at High Point University. I still use Python, C, and Fortran on a daily basis. I try to keep all of my work under version control on GitHub; however, a few of my favorite projects are necessarily private due to DSEP’s licensing. The work for this blog will be stored on GitHub so you can follow along at home if you’d like.