Bryan's Blog 2007/02
service orchestration needs data models
Service description languages need to address three classes of identifiers:
- the identifiers (or handles) of the objects,
- the identifiers (or handles) of the services, and
- the identifiers of the object descriptions (we have to know what type of objects they are).
I'm sitting here in Frascati in the ESA Heterogeneous Mission Accessibility (HMA) workshop, and the last of those identifiers is the elephant in the room. Lots of talk about web service consumption, and lots of talk about GML application schemas, but little or none about data models.
(Never mind service choreography)
by Bryan Lawrence : 2007/02/28 : 0 trackbacks : 0 comments (permalink)
I don't care about the flops, I care about the PB
It's good to see the UK has commissioned a new supercomputer: HECToR. The press release is all excited about how fast it will go (theoretically, initially 60 Tflop/s, going to 250 Tflop/s in 2009, with another upgrade in 2011). You have to go to the new HECToR site itself to discover the important detail from my point of view:
The systems will be connected by a common infrastructure (Rainer) which will also support 576 Tb of directly attached storage, rising to 1 Pb at the end of the phase.
(it's not clear what "the phase" is, given the multitude of timescales involved, but I'll find out eventually).
This is good news for us, as we're currently providing support for atmospheric modellers using HPCx since the disk configuration there was simply inadequate!
by Bryan Lawrence : 2007/02/25 : 0 trackbacks : 0 comments (permalink)
My personal event horizon is receding too quickly
I feel obliged to know about the various technical things that could impact both on our services and our service developments, which means I live within a little black hole into which I want to aggregate information. (It's a black hole because I don't have enough time to communicate much back out again).
The thing about black holes is that as they grow, the space enveloped by the event horizon grows quickly too. At the risk of pushing this analogy too far, the problem is that as my personal information entropy increases, the surface area of that event horizon grows faster than I can keep track of it. There are just too many things I want to know about. (I've been here before.)
Anyway, today's feeling of knowledge impotence is associated with it being unlikely that I'll have time in the near future to follow up on two things that today's blog trails have led to:
(This one via Stephen Pascoe!)
TGWebServices looks like a fascinating toolkit for developing interfaces to python code which simultaneously support REST, SOAP and JSON clients. The readme gives this example:
```python
class InnerService(WebServicesRoot):
    @wsexpose(int)
    def times4(self, num):
        return num * 4

class ServiceRoot(WebServicesRoot):
    inner = InnerService()

    @wsexpose(int)
    def times2(self, num):
        return num * 2
```
Those decorators give you the following for free. Assume that ServiceRoot is instantiated with a baseURL of "http://foo.bar.baz/"; these URLs are then available:
| URL | What it gives you |
| --- | --- |
| http://foo.bar.baz/times2 | HTTP access to the times2 method |
| http://foo.bar.baz/inner/times4 | HTTP access to the times4 method on InnerService |
| http://foo.bar.baz/soap/ | URL to POST SOAP requests to |
| http://foo.bar.baz/soap/api.wsdl | URL to get the WSDL file from |
This alone is enough for me to revisit turbogears (and maybe cherrypy)!
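To see why those decorators appeal to me, here's a toy sketch of the idea: a decorator marks methods as exposed, and a walker maps a nested service tree onto URL paths. This is emphatically not tgwebservices itself (the WebServicesRoot, wsexpose, and exposed_urls names here are my own stand-ins), just an illustration of the registration-by-decoration pattern:

```python
class WebServicesRoot:
    """Minimal stand-in for a service base class (not the real tgwebservices one)."""

def wsexpose(return_type):
    """Mark a method as web-exposed, recording its declared return type."""
    def decorator(func):
        func._ws_exposed = True
        func._ws_return_type = return_type
        return func
    return decorator

def exposed_urls(service, base_url, prefix=""):
    """Walk a service tree, mapping each exposed method to a URL."""
    urls = {}
    for name in dir(service):
        if name.startswith("_"):
            continue
        attr = getattr(service, name)
        if callable(attr) and getattr(attr, "_ws_exposed", False):
            urls[base_url + prefix + name] = attr
        elif isinstance(attr, WebServicesRoot):
            # Nested services contribute their methods under a sub-path.
            urls.update(exposed_urls(attr, base_url, prefix + name + "/"))
    return urls

class InnerService(WebServicesRoot):
    @wsexpose(int)
    def times4(self, num):
        return num * 4

class ServiceRoot(WebServicesRoot):
    inner = InnerService()

    @wsexpose(int)
    def times2(self, num):
        return num * 2

urls = exposed_urls(ServiceRoot(), "http://foo.bar.baz/")
```

The nice design point is that the service author writes only plain methods; the URL layout (and, in the real toolkit, the SOAP/WSDL/JSON plumbing) falls out of the decorations.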
The second trail started with an argument I can agree with: that web services are actually more tightly coupled than the web paradigm requires, though its author draws different conclusions than I might. Anyway, in a very thin argument that "the web" (as opposed to "web services") supports security well (I say thin, but some of his points about message-level versus transport-level security are very fair), he mentioned HTTPsec. And that's what got me interested.
HTTPsec is a strong authentication scheme for HTTP transactions. It defines an HTTP extension for mutual authentication and message origin authentication, via the integrity protection of a defined set of HTTP message headers. It offers message sequence integrity, forward secrecy, and optionally content integrity and content ciphering.
HTTPsec can authenticate any web traffic between any identities or peers that can provide certificates or RSA public keys. HTTPsec is designed for scenarios where credential-based schemes are inappropriate for architectural reasons or are simply considered too weak. It is also appropriate where message-layer security requirements are not otherwise satisfied by transport-layer or network-layer security protocols ...
That looks rather like a "normal web" WS-Security to me. If HTTPsec is a drop-in replacement for WS-Security, then I start to be pretty interested ... as WS-Security is one very strong reason for being interested in SOAP (because I can pass encrypted or otherwise protected messages through various recipients and still know they've not been tampered with).
(Incidentally, I fail to believe that HTTPsec could be considered a RESTful paradigm, as the whole concept of signed messages blows away one of the pillars on which REST rests: that you want to be able to use transparent caching for multiple clients. That doesn't mean it's not desirable, it just means this is not ammunition for another one of those REST versus SOAP arguments).
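I haven't digested the HTTPsec spec itself, but the core mechanism it describes (message-origin authentication via integrity protection of a defined set of headers) boils down to a canonicalise-then-sign pattern, which can be sketched like so. This toy uses a shared-secret HMAC purely for illustration, whereas HTTPsec proper uses RSA keys or certificates and adds sequence integrity and forward secrecy; all names here are mine, not the spec's:

```python
import hashlib
import hmac

# The defined set of headers covered by the signature (illustrative choice).
PROTECTED_HEADERS = ("host", "date", "content-type")

def canonicalise(method, path, headers):
    """Serialise the request line plus protected headers into signable bytes."""
    lines = ["%s %s" % (method, path)]
    for name in PROTECTED_HEADERS:
        lines.append("%s: %s" % (name, headers.get(name, "")))
    return "\n".join(lines).encode("utf-8")

def sign(secret, method, path, headers):
    """Produce a signature over the canonical form of the message."""
    return hmac.new(secret, canonicalise(method, path, headers),
                    hashlib.sha256).hexdigest()

def verify(secret, method, path, headers, signature):
    """Check that neither the request line nor a protected header was altered."""
    expected = sign(secret, method, path, headers)
    return hmac.compare_digest(expected, signature)
```

The point of the sketch: any intermediary can see (and cache-unfriendlily, per the aside above, cannot transparently rewrite) the protected headers, because tampering with any of them breaks verification at the far end.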
... and together?
Of course, it doesn't look like there is a python implementation of HTTPsec; indeed, there is only just (November 2006) a java one. And either way (HTTPsec or WS-Security), if tgwebservices is to be really useful, it'd have to be able to roll with a message-level security paradigm.
Regular readers will know that I haven't much time for software patents. Their stupidity is all too clear to see, and we've had another example this last week. IBM have apparently provided a blanket IPR disclosure on their involvement in the development of the Atom specs. Well done them! Atom is important and that's a good thing!
But the sad part of this is that they felt they had to do this, even though they have:
NO KNOWN patents or applications for patents that read on the Atom specs.
James Snell goes on to say:
However, as is well known, IBM has a massive IPR portfolio and it would take a very long time and cost a lot of money for us to dig through 'em all to know for certain. Rather than spend the time and energy doing that, IBM has agreed to a blanket commitment to Royalty Free terms for any IPR that reads on the Standards Track specifications produced by the atompub Working Group.
So, IBM have so many patents that they don't know if any of them are relevant! How the hell is anyone supposed to develop software that doesn't violate patents if even the patent-holders can't (afford to) search them in any given context?
Don't get me wrong. I'm not whinging at IBM, I'm whinging at the stupidity of the situation wrt software patents!
Contemplating a move from Leonardo at home
You might think this blog has been quiet lately, but it's nothing compared to my home blog, which has been in stasis for a year or so: mainly lack of time, plus problems running Leonardo in a simple hosting environment (too hard to configure the way I want it: cruftless URLs etc.). So:
I could do some major surgery on Leonardo, but given this is for my home blog, it'd have to be done in my time, and there isn't much of that. I've done all the surgery I need for Leonardo at work, and don't even have time to do the surgery I want for Leonardo at work ... besides, although I liked (and like) Leonardo, I think there are better frameworks nowadays ...
I've considered moving to one of the "major" blog providers or software packages, but I want my own material to be on my own site (call me a Luddite), and I want to be able to tinker.
I've considered writing my own blogging software (Joe Gregorio has pointed out how easy it is to get started), but I'm often reminded how easy it is to produce a prototype and how hard it is to write a production system, and for home use it needs to be "production-mode"; and although I want to tinker, I don't want to be fiddling incessantly with it (as I have little time). So I figure starting from scratch isn't going to be the way to go.
The same Joe Gregorio has ditched his old software and moved to 1812, a significantly improved version of his "throw-away" Robaccia, and a few folk have even started playing with it. What I like most about it is the nice simple backend store, and the use of an Atom Publishing Protocol client. I started work on something similar for Leonardo, but of course it was bespoke, and I never got around to finishing it, but always missed it (I would like to be able both to write blog entries and to have an archive of my posts on my laptop for the many times when I'm not connected, e.g. trains).

If I wanted 1812 for work, to be comfortable with it, I'd want to add support for trackback, openid, and implicit versioning (all previous versions available and editable). I'd probably have to put the latter in both the client and the server; obviously it'd need my wiki format too, and I'd want to make sure the server could support direct editing (for those times when I have access to someone's browser but can't get my laptop online). So, lots of work then. But fortunately I don't want it for my work blog: Leonardo is safe there for the foreseeable future.
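Part of the appeal of the Atom Publishing Protocol client is how little there is to it: publishing is just an HTTP POST of an Atom entry document (Content-Type application/atom+xml) to a collection URI, and the server replies 201 Created with the new entry's edit URI. A minimal sketch of building such an entry, using only the standard library (the title and content here are made up, and a real client would of course add the HTTP side):

```python
from xml.etree import ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def make_entry(title, content):
    """Build a minimal Atom entry document of the kind an APP client POSTs."""
    ET.register_namespace("", ATOM_NS)  # serialise Atom as the default namespace
    entry = ET.Element("{%s}entry" % ATOM_NS)
    ET.SubElement(entry, "{%s}title" % ATOM_NS).text = title
    body = ET.SubElement(entry, "{%s}content" % ATOM_NS)
    body.set("type", "text")
    body.text = content
    return ET.tostring(entry, encoding="unicode")
```

Writing entries offline is then just accumulating these documents locally and POSTing them when connected, which is exactly the laptop-on-a-train workflow I'm after.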
So, I plan to potter along with 1812 for home. I've made this intention public today ... but I wouldn't enter a sweepstake on when/if I get my new home blogsite out if I was you ...
The BADC has recently won a contract from Defra to deliver some services in support of distributing climate data to the UK and global community. One of those services is the IPCC data distribution centre. The contract was formally signed for a start date of February the 1st, and the eagle-eyed in the community will have noticed that a new website appeared on the 2nd of February (the day IPCC Working Group I released their summary statement) to replace the previous site at the Climatic Research Unit in East Anglia.
Quite clearly we started ahead of the contract signing date, and we kept most of the previous site, but a lot of work was done on cleaning it up, removing broken links and improving performance. We've got a lot more planned too ... but anyway, I want to publicly acknowledge the team who did the hard work, particularly Charlotte Pascoe, Stephen Pascoe and Martin Juckes!
The IPCC data distribution centre is actually three websites: one in Germany, one in the U.S., and ours; so we're looking forward to further integration, and to learning from our new colleagues.
If only Beagle, or Google Desktop search or any of them, could find a document on my desktop ... no, I mean the wooden thing on which my keyboard resides ...
by Bryan Lawrence : 2007/02/07 (permalink)