Why is climate modelling stuck?
Why is climate modelling stuck? Well, I would argue it's not stuck, so a better question might be: "Why is climate modelling so hard?". Michael Tobis is arguing that a modern programming language and new tools will make a big difference. Me, I'm not so sure. I'm with Gavin. So here is my perspective on why it's hard. It is of necessity a bit of an abstract argument ...
We need to start with the modelling process itself. We have a physical system with components within it. Each physical component needs to be developed independently, checked independently ... This is a scientific, then a computational, then a diagnostic problem.
Each component needs to talk to other components, so there needs to be a communication infrastructure which couples components. Michael has criticised ESMF (and by implication PRISM and OASIS etc), but regardless of how you do it, you need a coupling framework. This is a computational problem. I think it's harder than Michael thinks it is. Those ESMF and PRISM folks are not stupid ...
All those independently checked components may behave in different ways when coupled to other components (their interactions are nonlinear). Understanding those interactions takes time. This is a scientific and diagnostic problem.
We need a dynamical core. It needs to be fast, efficient, mass preserving, and stable in a computational sense. Stability is a big problem, given that the various parameterisations will perturb the core in ways that tend to induce instability. This is both a mathematical and a computational problem.
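To make "mass preserving" concrete, here is a minimal sketch of one advection step written in flux form on a periodic 1-D grid. It is a toy, not anything taken from a real dynamical core: the function name `advect_upwind`, the first-order upwind scheme and the `courant` parameter are all assumptions chosen for illustration. The point is only that when every cell's loss is exactly its neighbour's gain, the total is conserved by construction.

```python
def advect_upwind(q, courant):
    """One first-order upwind step in flux form on a periodic 1-D grid.

    q       : list of cell-mean values (hypothetical tracer)
    courant : u*dt/dx, assumed to satisfy 0 <= courant <= 1 for stability
    """
    n = len(q)
    # Flux through the left face of cell i, carried in from cell i-1
    # (the wind is assumed to blow from low index to high index).
    flux = [courant * q[i - 1] for i in range(n)]
    # Each cell gains the flux at its left face and loses the flux at its right face.
    return [q[i] + flux[i] - flux[(i + 1) % n] for i in range(n)]


q = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
mass_before = sum(q)
for _ in range(50):
    q = advect_upwind(q, 0.5)
mass_after = sum(q)
print(mass_before, mass_after)
```

Because the same flux value is added to one cell and subtracted from its neighbour, the sum over the domain changes only by floating-point rounding. A parameterisation that adds or removes tracer outside this bookkeeping would break that guarantee, which is one way the coupling problems described above show up.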
We need to worry about memory. We need to worry a lot about memory actually. If in our discussion we're going to get excited about scalability in multi-core environments, then yes, I can have 80 (pick a number) cores on my chip, but can I have enough memory and memory bandwidth to exploit them? How do we distribute our memory around our cores?
What about I/O bandwidth? Without great care, the really big, memory-hungry climate models can end up stalled, spinning through empty CPU cycles while they wait for I/O. This is a computational problem.
Every time we add a new process, we require more memory. The pinch points change and are very architecture dependent. Every time we change the resolution, nearly every component needs to be re-evaluated. This takes time.
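A back-of-the-envelope sketch shows how quickly the memory question bites. Every number here is hypothetical (grid sizes, field counts, 8-byte values, and an even split across cores with no halo regions); only the "80 cores" figure echoes the pick-a-number above. The helper name `state_bytes` is an assumption for illustration.

```python
def state_bytes(nlon, nlat, nlev, nfields, bytes_per_value=8):
    """Bytes needed to hold nfields 3-D fields on one lon/lat/level grid."""
    return nlon * nlat * nlev * nfields * bytes_per_value


GIB = 1024 ** 3
NCORES = 80  # the "80 (pick a number)" cores

# Hypothetical configurations: (label, nlon, nlat, nlev, nfields)
configs = [
    ("baseline", 360, 180, 50, 100),
    ("more processes (more fields)", 360, 180, 50, 150),
    ("doubled horizontal resolution", 720, 360, 50, 100),
]

for label, nlon, nlat, nlev, nfields in configs:
    total = state_bytes(nlon, nlat, nlev, nfields)
    # Even split across cores; a real decomposition also needs halos and buffers.
    per_core = total / NCORES
    print(f"{label}: {total / GIB:.1f} GiB total, {per_core / GIB:.3f} GiB per core")
```

Adding fields scales the footprint linearly, while doubling the horizontal resolution quadruples it, and that is before the shorter timestep a finer grid usually demands. How that footprint maps onto cache and memory bandwidth is the architecture-dependent part this arithmetic cannot capture.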
At this point, we've not really talked about code per se. All that said, the concepts of software engineering do map onto much of what is (or should be) going on. Yes, scientists should build unit tests for their parameterisations. Yes, there should be system/model wide tests. Yes, task tracking and code control would help. But every time we change some code there may be ramifications we don't understand: not only logical consequences (accessible in computer science terms) but, from a scientific point of view, non-linear (and inherently unpredictable) ones. Distinguishing the two takes time. I totally agree that better use of code maintenance tools would improve things, but sadly I think it would be a few percent improvement ... since most of the things I've listed above are not about code per se, they're about the science and the systems.
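As an illustration of what a unit test for a parameterisation might look like, here is a sketch built around a deliberately trivial toy scheme. The function `condense_excess`, its arguments and the numbers are all hypothetical; no real model's scheme is this simple. The tests check properties (conservation, bounds) rather than particular answers.

```python
import unittest


def condense_excess(q, q_sat):
    """Toy parameterisation: rain out any moisture above saturation.

    q, q_sat : specific humidity and its saturation value (kg/kg), both assumed >= 0
    Returns (new humidity, precipitated amount).
    """
    precip = max(q - q_sat, 0.0)
    return q - precip, precip


class TestCondenseExcess(unittest.TestCase):
    def test_water_is_conserved(self):
        # Whatever leaves the vapour must turn up as precipitation.
        for q, q_sat in [(0.012, 0.010), (0.004, 0.010), (0.010, 0.010)]:
            q_new, precip = condense_excess(q, q_sat)
            self.assertAlmostEqual(q_new + precip, q)

    def test_subsaturated_air_is_unchanged(self):
        q_new, precip = condense_excess(0.004, 0.010)
        self.assertEqual((q_new, precip), (0.004, 0.0))

    def test_result_is_never_supersaturated(self):
        q_new, precip = condense_excess(0.012, 0.010)
        self.assertLessEqual(q_new, 0.010 + 1e-12)
        self.assertGreaterEqual(precip, 0.0)
```

Saved to a file, this can be run with `python -m unittest`. Tests like these catch the logical consequences of a code change. They say nothing about how the scheme behaves once it is coupled to everything else, which is the scientific half of the problem described above.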
So, personally I don't think it's the time taken to write lines of code that makes modelling so hard. Good programmers are productive in anything. I suspect changing to Python wouldn't make a huge difference to the model development cycle. That said, anyone who writes diagnostic code in Fortran really ought to go on a time management course: yes, learning a high-level language (Python) takes time, but it'll save you more ... and the reason for that is that we write diagnostic code over and over. Core model code isn't written over and over ... even if it's agonised over and over :-)
Someone in one of the threads on this subject mentioned XML. Given that there might be a climate modeller or two reading this, let me assure you: XML solves nothing in this space. XML provides a syntax for encoding something; the hard part of this problem is deciding what to encode. That is, the hard part of the problem is the semantic description of whatever it is you want to encode (and developing an XML language to encapsulate your model of the model: remember, XML is only a toolkit, it's not a solution). If you want to use XML in the coupler, what do you need to describe to couple two (arbitrary) components? If it's the code itself, and you plan to write a code generator, then what is it you want to describe? Is it really that much easier to write a parameterisation for gravity wave drag in a new code generation language? What would you get from having done so?
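A small sketch makes the syntax-versus-semantics point visible. The XML vocabulary below (`coupling`, `field`, and the `name`, `source`, `target` and `units` attributes) is invented for this example and is not taken from ESMF, PRISM, OASIS or any other real framework. Reading it with Python's standard `xml.etree.ElementTree` module takes a couple of lines.

```python
import xml.etree.ElementTree as ET

# Hypothetical coupling description; the element and attribute names are made up.
DOC = """
<coupling>
  <field name="sst" source="ocean" target="atmosphere" units="K"/>
  <field name="heat_flux" source="atmosphere" target="ocean" units="W m-2"/>
</coupling>
"""


def exchanged_fields(xml_text):
    """Return (name, source, target, units) for every <field> element."""
    root = ET.fromstring(xml_text.strip())
    return [
        (f.get("name"), f.get("source"), f.get("target"), f.get("units"))
        for f in root.iter("field")
    ]


for name, source, target, units in exchanged_fields(DOC):
    print(f"{name}: {source} -> {target} [{units}]")
```

The parsing is the easy part. Nothing in the file says which grid each field lives on, whether values are instantaneous or time-averaged, how often they are exchanged, or which sign convention `heat_flux` follows. Deciding on those descriptions, and getting two independently developed components to agree on them, is the work that XML does not do for you.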
So what is the way forward? Kaizen: small continuous improvements. Taking small steps we can go a long way ... Better coupling strategies. Better diagnostic systems. Yes: Better coding standards. Yes: more use of code maintenance tools. Yes: Better understanding of software engineering, but even more importantly: better understanding of the science (more good people)! Yes: Couple code changes to task/bug trackers. Yes: formal unit tests. No: Let's not try the cathedral approach. The bazaar has got us a long way ...
(Disclosure: I was an excellent Fortran programmer, and a climate modeller. I guess I'm a more than competent Python programmer, and I'm sadly expert with XML too. I hope to be a modeller again one day.)
Using more computer power, revisited. (from "Bryan's Blog" on Wednesday 23 January, 2008)