Archive for the 'complexity' Category

Appliances – Good, bad or virtual ?

So, in another prime example of “Why do Analysts blogs make it so hard to have a conversation?”, Gordon Haff of Illuminata today tweeted a link to a new blog post of his on appliances. No comments allowed, no trackbacks provided.

He takes Chuck Hollis’ (EMC) post and opines on various positions in it. It’s not clear what the notion of “big appliance” is as Chuck uses it. Personally, I think he’s talking about solutions. Yes, I know it’s a fine line, but a large all-purpose data mining solution with its own storage, own server, own console, etc. is no more an appliance than a kitchen is. The kitchen will contain appliances but it is not one itself. If that’s not what Chuck is describing, then his post has some confusion, since very few organizations will have a large number of these “solutions”.

On the generally accepted view of appliances, I think both Gordon and Chuck are being a little naive when they think that all compute appliances can be made virtual and run on shared resource machines.

While at IBM I spent a lot of time on appliances, and learned some valuable lessons about them. I was looking at the potential for the first generation of IBM-designed WebSphere DataPower appliances. At first, even three years ago, it seemed to me that turning them into a virtual appliance would be a good idea. However, I’d made the same mistake that Hollis and Haff make. They assume that the type of processing done in an appliance can be transparently replaced by the onward march of Moore’s Law on Intel and IBM Power processors.

The same can be said for most appliances I’ve looked at. They have unique hardware design, which often includes numerous specialized processing functions, such as encryption, key management and even environmental monitoring. The real value-add of appliances, though, is that they are designed with a very specific market opportunity in mind. That design will require complex workload analysis, reviewing the balance between general purpose compute, graphics, security, I/O and much more, and producing a balanced design and, most importantly, a complete user experience to support it. That’s often the key.

Some appliances offer the sort of hardware based security and tamper protection that can never be replaced by general purpose machines.

Yes, Hollis and Haff make a fair point that these appliances need separate management, but the real point is that many of these appliances need NO management at all. You set them up, then run them. Because the workload is tested and integrated, the software rarely, if ever, fails. Since the hardware isn’t generally extensible (or, as Chuck would have it, you are locked into what you buy), updating drivers and introducing incompatibility isn’t an issue as it is with most general purpose servers.

As for trading one headache for another, while it’s a valid point, my experience so far with live migration and pools of virtual servers, network switches, SAN setup etc. is that you are once again trading one headache for another. In a limited fashion it’s fairly straightforward to do live migration of a virtual workload from one system to another. Doing it at scale, which is what is required if you’ve reached the “headache” point that Chuck is positing, is far from simple.

Chuck closes his blog entry with:

Will we see a best-of-both-worlds approach in the future?

Well I’d say that was more than likely, in fact it’s happening and has been for a while. The beauty of an appliance is that the end user is not exposed to the internal workings. They don’t have to worry about most configuration options and setup, management is often minimised or eliminated, and many appliances today offer “phone home” like features for upgrade and maintenance. I know, we build many of them here at Dell for our customers, including EMC, Google etc.

One direction that we are likely to see is that, in the same current form factor, an appliance will become fault tolerant by replicating key parts of the h/w, virtualizing the appliance and running multiple copies of the appliance workload within a single physical appliance, all once again delivering those workload- and deployment-specific features and functions. This in turn reduces the number of physical appliances a customer will need. So, the best of both worlds, although I suspect that’s not what Chuck was hinting at.

While there is definitely a market for virtual software stacks, complete application and OS instances, presuming that you can move all h/w appliances to this model, is missing the point.

Let’s not forget, SANs are often just another form of appliance, as are TOR/EOR network switches, and things like the Cisco Nexus. Haff says that appliances have been around since the late 1990s. Well, at least as far as I can recall, in the category of “big appliances”, the IBM Parallel Query Server, which ran a customized mainframe DB2 workload and attached to an IBM S/390 Enterprise Server, was around in the early 1990s.

Before that, many devices were in fact sold as appliances. They were just not called that, but by today’s definition, that’s exactly what they were. My all-time favorite was the IBM 3704, part of the IBM 3705 communications controller family. The 3704 was all about integrated function and a unique user experience, with, at the time (1976), an almost space-age touch panel user interface.

Physicalization at work – software pricing at bay

This is an unashamed take from an Ars Technica article, and I certainly can’t take credit for the term. I’m just back from a week of touring around Silicon Valley talking about our thinking for Dell 12G servers to Dell customers, and especially to those that take our products and integrate them into their own product offerings. It was a great learning experience, and if you took time to see me and the team, thank you!

One of the more interesting discussions both amongst the Dell team, and with the customers and integrators, was around the concept of physicalization. Instead of building bigger and faster servers, based around more and more cores and sockets, why not have a general purpose, low power, low complexity physical server that is boxed up, aggregated and multiplexed into a physicalization offering?

For example, as discussed in the Ars Technica article, a very simplified, Atom-based server can eliminate many of the older software and hardware additions that make motherboards more complex and more expensive to build, which in turn, with the reduced power and heat, makes them even more reliable. Putting twelve or more in a single 2U server makes a lot of sense.

They also typically don’t need a lot of complex virtualization software to make full use of the servers. That might sound like heresy in these days when virtualization is assumed and is the major driver behind much of the marketing spend, and much of the technology spend.

So what’s driving this? Well mostly, if you think about it, the complexity in the x86 marketplace these days, and also in the mainframe and Power/UNIX marketplaces, comes through complex software and systems management. That complexity is driven by two needs.

  1. Server utilization – in order to utilize the increasing processor power, sockets and cores, you need to virtualize the server and split it into consumable, useful chunks. This would normally require a complex discussion about multi-threaded programming and complexity, but I’ll ignore that this time. Net, net, there are very few workloads and applications that can use the effective capacity offered by current top-end Intel and AMD x86 processors.
  2. Software Pricing – Since the hardware vendors, including Dell, sell these larger virtualized servers as great business opportunities to simplify IT and server deployment by consolidating disparate, and often distributed, server workloads into a single, larger, more manageable server, the software vendors want in on the act. Otherwise they lose out on revenue as the customer deploys fewer and fewer servers. One ploy to combat this is to charge by core or socket. Ultimately their software does little and sometimes nothing to exploit these features; they just charge, well, because they can. In a virtualized server environment, the same is true. The software vendors don’t exploit the virtualization layer; heck, in some cases they are even reluctant to support their software running in this environment and require customers to recreate any problems in a non-virtualized environment before looking at them.
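To make the second point concrete, here is a minimal sketch of why charging per core protects the software vendor’s revenue when workloads are consolidated. The prices and server counts are made-up assumptions for illustration only, and `licence_cost` is a hypothetical helper, not any vendor’s actual pricing model.

```python
# Hypothetical figures for illustration only; none of these prices come from a real vendor.
PRICE_PER_SERVER = 1_000   # assumed licence price when charging per server
PRICE_PER_CORE = 1_000     # assumed licence price when charging per core


def licence_cost(servers: int, cores_per_server: int, per_core: bool) -> int:
    """Return the total licence bill under per-core or per-server pricing."""
    if per_core:
        return servers * cores_per_server * PRICE_PER_CORE
    return servers * PRICE_PER_SERVER


# Twelve small single-core servers, licensed per server.
before = licence_cost(servers=12, cores_per_server=1, per_core=False)

# The same workloads consolidated onto one 12-core virtualized server.
after_per_server = licence_cost(servers=1, cores_per_server=12, per_core=False)
after_per_core = licence_cost(servers=1, cores_per_server=12, per_core=True)

print(before, after_per_server, after_per_core)
```

Under these assumed numbers, consolidation would cut a per-server bill to a twelfth, while the per-core bill stays exactly where it was before consolidation. That is the incentive described above.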

And so it is that physicalization is starting to become attractive. I’ve discussed both the software pricing and virtualization topics many times in the past. In fact, I’ve expressed my frustration that software pricing still seems to drive our industry and, more importantly, our customers to do things that they otherwise wouldn’t. Does your company make radical changes to your IT infrastructure just to get around uncompetitive and often restrictive software pricing practices? Is physicalization interesting or just another dead-end IT trend?

IBM Big Box quandary

In another follow-up from EMC World, the last session I went to was “EMC System z, z/OS, z/Linux and z/VM”. I thought it might be useful to hear what people were doing in the mainframe space, although it’s largely unrelated to my current job. It was almost 10 years to the day since I was at IBM, where we were writing the z/Linux strategy, hearing about early successes etc. and, strangely, current EMC CTO Jeff Nick and I were engaged in vigorous debate about implementation details of z/Linux the night before we went and told SAP about IBM’s plans.

The EMC World session demonstrated that the more things change, the more they stay the same. It also reminded me how borked the IT industry is, in that we mostly force customers to choose by pricing rather than function. 10-12 years ago z/Linux on the mainframe was all about giving customers new function, a new way to exploit the technology that they’d already invested in. It was of course also to further establish the mainframe’s role as a server consolidation platform through virtualization and high levels of utilization. (1)

What I heard were two conflicting and confusing stories, at least they should be for IBM. The first was a customer who was moving all his Oracle workloads from a large IBM Power Systems server to z/Linux on the mainframe. Why? Because the licensing on the IBM Power server was too expensive. Using z/Linux and the Integrated Facility for Linux (IFL) allows organizations to do a cost avoidance exercise. Processor capacity on the IFL doesn’t count towards the total installed, general processor capacity and hence doesn’t bump up the overall software licensing costs for all the other users. It’s a complex discussion and that wasn’t the purpose of this post, so I’ll leave it at that.

This might be considered a win for IBM, but actually it was a loss. It’s also a loss for the customer. IBM lost because the processing was being moved from its growth platform, IBM Power Systems, to the legacy System z. It’s good for z since it consolidates its hold in that organization, or probably does. Once the customer has done the migration and conversion, it will be interesting to see how they feel the performance compares. IBM often refers to the IFL and its close relatives, the zIIP and zAAP, as specialty engines, giving the impression that they perform faster than the normal System z processors. It’s largely an urban myth though, since these “specialty” engines really only deliver the same performance; they are just measured, monitored and priced differently.

The customer lost because they’ve spent time and effort to move from one architecture to another, really only to avoid software and server pricing issues. While the System z folks will argue the benefits of their platform, and I’m not about to “dis” them, the IBM Power server can actually deliver a good enough implementation to make the difference largely irrelevant.

The second conflicting story I heard was from EMC themselves. The second main topic of the session was a discussion about moving some of the EMC Symmetrix products off the mainframe, as customers have reported that they use too much mainframe capacity to run. The guys from EMC were thinking of moving the function of the products to commodity x86 processors and then linking those via high speed networking into the mainframe. This would move the function out of band and save mainframe processor cycles, which in turn would avoid an upgrade, which in turn would avoid bumping the software costs up for all users.

I was surprised how quickly I interjected and started talking about WLM SRM Enclaves and moving the EMC apps to run on z/Linux etc. This surely makes much more sense.

I was left, though, with a definite impression that there are still hard times ahead for IBM in large non-x86 virtualized servers. Not that they are not great pieces of engineering; they are. But getting to grips with software pricing once and for all should really be their prime focus, not a secondary or tertiary one. We were working towards pay per use once before; time to revisit, methinks.

(1) Spot the irony of this statement given the preceding “Nano, Nano” post!

Nano, Nano – Serving you on 15-Watts

The Dell XS11-VX8 Server

This is something I was asked about a few times at EMC World, power and form factors for servers. Here is the latest server from the Dell Data Center Solutions group (DCS). It’s only a little bigger than a disk drive, and you can get 252 servers in a 42U rack. While the form factor is interesting, very interesting, you need to think outside the “box” to get the true value.

It uses the Via Nano CPU to deliver an incredibly low-power solution of 20-29 Watts/server when fully busy, and 15 Watts/server when the OS is idle. It includes enterprise features like 64-bit operating systems, 1-to-1 virtualization, and remote management via IPMI. What this does is turn the current server paradigm on its head. Instead of using more and more power-hungry server chips that deliver more performance than you really need, which opens the gate for someone to tell you about server virtualization and consolidation to make the most of the performance per watt, or cost of the server, the Dell XS11-VX8 just gives you “enough” performance, at a good price, and an effective price per watt. For those sensitive to cross-charging, billing out IT services etc., it has another side effect of simplifying software licensing and allocation.
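As a quick back-of-the-envelope check, the per-server figures quoted above can be rolled up to a full rack. This sketch uses only the numbers in this post (252 servers per 42U rack, 15 W idle, 20-29 W busy); it assumes every slot is populated and ignores switches, power supply losses and cooling, so treat the totals as rough lower bounds rather than a sizing guide.

```python
# Figures taken from the post; everything else (full rack, no overheads) is an assumption.
SERVERS_PER_RACK = 252
RACK_UNITS = 42
IDLE_WATTS = 15
BUSY_WATTS_LOW, BUSY_WATTS_HIGH = 20, 29


def rack_watts(watts_per_server: int) -> int:
    """Total server power draw for a fully populated rack, in watts."""
    return SERVERS_PER_RACK * watts_per_server


servers_per_u = SERVERS_PER_RACK // RACK_UNITS

print(f"{servers_per_u} servers per rack unit")
print(f"idle: {rack_watts(IDLE_WATTS)} W")
print(f"busy: {rack_watts(BUSY_WATTS_LOW)} W to {rack_watts(BUSY_WATTS_HIGH)} W")
```

That works out to six servers per rack unit, roughly 3.8 kW for an idle rack, and between about 5 kW and 7.3 kW when every server is fully busy.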

Over on the Direct2Dell blog community, Todd has written a post with some useful additional detail.

Dell Management Console and 11G Server Launch

I spent Friday afternoon in a wet Round Rock parking lot where we held the launch thank you party for the team that put together the 11th Generation of Dell servers and the associated management software. We don’t complain about rain in Austin, it feeds some of the best things about town, namely Barton Springs, Lake Travis, which feeds Town Lake where I run, and the lake at Pure Austin North where I swim, in perfect conditions, twice per week. The celebration was sponsored by our partner Broadcom.

The event was hosted by our executives, including Michael Dell, and they made some important observations on the process to design the servers, market acceptance and customer feedback. While I was waiting in the food line, one of the other folks and I got talking, and he said “I looked at your blog the other day and you didn’t write anything on the Dell Management Console”. And he’s right.

It’s a significant step forward for Dell customers and for Dell. The DMC is based on the modular Symantec Management Platform architecture and offers a comprehensive set of features at no additional cost. While I was in IBM Power Systems, one of the fights I had with them was over their console and management strategy. While I’m sure they had good reasons for doing what they did, the way they did it, their ongoing strategy couldn’t follow the same path of fragmented consoles for this, consoles for that, different interfaces, different terminology for the same things etc. I’m still hopeful that when they introduce their next generation of servers, they’ll have learned the lessons that Dell already has.

DMC replaces the existing Dell hardware management console, Dell OpenManage IT Assistant. DMC has a plug-in architecture that allows the console to be extended with additional function and to be used as a manager for other scenarios, devices etc. However, true to the Dell mission to simplify IT and reduce TCO, one way we are doing that is to include a significant amount of function in the base, rather than as chargeable plugins. Here’s a summary of the major functions and improvements over prior offerings:

  • Hardware - multiple choices on how to explore, report and understand hardware configs plus export as tables; many pre-configured reports as well as the ability to create your own.

    Proactive heartbeat monitoring is also supported, based on a user-defined schedule; event subscription is also supported for Dell servers and MIBs can be imported for non-Dell hardware.

    You can push config changes and agent, BIOS, driver and firmware patches to many servers simultaneously without scripting.

  • Security – you can group devices and servers by geographical, logical or organizational grouping, by type, or create your own groups. These can then be managed using role-based security. You can create your own roles, or import them from Microsoft Active Directory.
  • Software – Support for hypervisors such as VMware(r) ESXi as well as Microsoft and Citrix. Health monitoring, discovery of virtual machines, associate to physical host server etc. Also included is the normal OS monitoring of utilization for memory, processors, free space and I/O.
  • Networking – The console includes support for a broad range of devices, but also includes support for Fibre Channel switches.

That’s an outline of the support in the new Dell Management Console, powered by Altiris from Symantec. I went to look for a couple of white papers to include links for: one with a more detailed list of device support, and a second with a more comprehensive strategy that showed the plug-in architecture and the other function available for DMC. I came across this great resource, the Dell POWER Solutions magazine (just a hint of irony).

Here is a link where you can download the entire magazine, as a 21MB PDF file. Alternatively, here is a link for an index into the articles where you can review each article separately.

IT T&C’S

I’ve been able to spend an interesting few weeks examining how Dell goes about procuring technology and building its systems, especially within the Enterprise Product Group and to some degree storage.

Some good things, some bad things, some just are what they are, out of time to market and business necessity. One of the early things I think I want to drive is an effort to create a standard set of IT Management T&C’s. Think about it: no major company would deal with another major company without understanding and agreeing T&C’s for things like payments, legal, disclosure, IP and so on.

While small companies find the level of detail in these T&C’s an unfair burden, they do help in so much as they establish a baseline for how the company acts by default. There are always special cases.

I’m thinking that it is increasingly important from an in-band and out-of-band management perspective that we have the same. If you want to bid for business from Dell to build a device, server, storage, etc. then you ought to be able to find out what our baseline operational requirements are. In most cases these ought to stand alone from a given server build, from the baseboard design for the next server, the management console for storage etc.

So that’s what I think I’m going to tackle first: a framework of API’s, protocols, transports etc. that we can support. I’ll classify each of the major initiatives we have underway in one of three ways: either they are tactically important and we’ll support them for the foreseeable future; or they are deprecated and we’ll stop using/supporting them at a given point, at which time they’ll be superseded or replaced by xx; or they are no longer supported or being developed, and no new funding or projects will be undertaken using them.
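As a rough sketch of how that three-way classification might be recorded, here is one possible shape for a catalogue. Every name in it (`SupportStatus`, `Technology`, `names_by_status`, and the placeholder entries such as `protocol-a`) is hypothetical; it is not a description of any real Dell list or tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class SupportStatus(Enum):
    """The three classes described above."""
    SUPPORTED = "tactically important, supported for the foreseeable future"
    DEPRECATED = "will stop being used/supported at a given point"
    UNSUPPORTED = "no longer developed, no new funding or projects"


@dataclass(frozen=True)
class Technology:
    name: str                           # an API, protocol or transport
    status: SupportStatus
    replaced_by: Optional[str] = None   # only meaningful for deprecated entries


def names_by_status(catalogue: List[Technology], status: SupportStatus) -> List[str]:
    """List the technologies that fall into one class, in catalogue order."""
    return [tech.name for tech in catalogue if tech.status is status]


# Placeholder entries; these are not real classifications.
CATALOGUE = [
    Technology("protocol-a", SupportStatus.SUPPORTED),
    Technology("api-b", SupportStatus.DEPRECATED, replaced_by="protocol-a"),
    Technology("transport-c", SupportStatus.UNSUPPORTED),
]

print(names_by_status(CATALOGUE, SupportStatus.DEPRECATED))
```

Keeping the replacement alongside a deprecated entry is one way to give partners a roadmap and a status in the same place.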

Declaring how we’ll support the various technology platforms will be good for our customers. They’ll have a clear roadmap and be able to see where we are on, for example, standards implementations. Hopefully it will also reduce the number of protocols etc. in use and standardise around a smaller subset. It will also be good for the OEMs and Partners we work with. They’ll know what we are going to ask for in RFQ’s, will be able to influence our thinking ahead of time, and will be able to skill and tool up before we ask them to bid/build for us. Finally, it will be good for Dell: we’ll be able to build libraries of re-useable assets to handle the specific API’s, protocols, transports etc. and re-use these as much as possible across different products. Also, it will put us in a better position with respect to testing and tooling.

Of course, as far as possible the T&C’s, as it were, will be industry standard(s). Some of these will have to be de facto; they are what’s being built and used today. One of the things I’ll be giving some serious consideration to will be Intel’s Active Management Technology or AMT. It appears to address a number of the key areas you’d want to tackle, but depending on it would put us in a difficult position with respect to AMD processors, which don’t have the same function, implemented the same way.

Interesting times, am definitely enjoying the new job. Thanks again for all the best wishes.

Robin Bloor asks what is dynamic infrastructure

Over on his have mac will blog blog, Robin Bloor asks What Does IBM Mean By Dynamic Infrastructure?

Rather than burden his comments section with a long trail of corrections, based on my suppositions, I thought I’d post my answer here and correct it as appropriate.

Robin, you might want to google for “IBM Dynamic Infrastructure for mySAP” or similar, or go look at this redbook. There is also a useful overview PowerPoint from Gerd Breiter, one of the architects and development leads, here.

I’d guess the architects/development team for IDI have been moved internally from Systems Group to Tivoli. IDI was an early implementation of on demand and was developed in Boeblingen. As initially envisaged, IDI was a Systems Group initiative, and the bulk of the early implementation was done before on demand, and then carried over and modified as and when possible.

Of course, I’m sure now that this mission is over in Tivoli the thinking and delivery will have evolved. Obviously cloud computing has become a major focus area in the industry since then, and would have to be factored in.

Unless you know better ;-)

The Windows Legacy

My good friend and fellow Brit’ Nigel Dessau posted his thoughts, and to some degree, frustrations with Windows Vista and potentially Windows 7 today on his personal blog, here.

The problem is of course they are stuck in their own legacy. If I were Microsoft, I’d declare Windows 8 would only support Windows 7 and earlier apps and drivers in a virtual machine.

They’d declare a bunch of their lower-level interfaces deprecated with Windows 7, and those interfaces wouldn’t be accessible in Windows 8 except in a Windows 7 VM.

Then they’d make their Windows virtual machine technology abstract all physical devices, so that Windows could handle them how they thought best, and wouldn’t let applications talk to devices directly, only via the abstraction. They would have generic storage, generic network, and generic graphics interfaces that applications could write to and Microsoft would deal with everything else.

This would initially limit the number of devices that would be supported, but that’s really status quo anyway. They would declare how devices that want to play in the Windows space would behave, declare the specs, and Microsoft would own the testing and to a degree validation of almost all drivers, or they could farm this out to a separate organization who would independently certify the device, not write the code. Once they stabilised the generic interfaces though, the whole Windows system itself would become more stable.

This would be a big step for Microsoft. When you look at the Windows ecosystem, there are hundreds of thousands of Windows applications and utilities. Way too many of them, though, exist to deal with the inadequacies of Windows itself, or missing function. Cut out the ability to write these sorts of applications and there will be, at the least, an infrastructure developer backlash. It might even provoke more antitrust claims. While I know nothing about the iPhone, this would likely put Windows 8 in the same position with respect to developers.

For all I know, this could be what they have in mind; it’s an area I need to get up to speed on with them, along with, obviously, the processor roadmaps for AMD and Intel, as well as understanding where Linux is headed.

Oh, Now it’s legacy IT that’s dead. Huh?

I got a pingback from Dana Gardner’s ZDNet blog for my “Is SOA dead?” post. Dana, rather than addressing the issue I raised yesterday, just moved the goalposts, claiming “Legacy IT is dead”.

I agree with many of his comments, especially after my post “Life is visceral”, which Dana so ably goes on to prove with his post. I liked some of the fine flowing language, some of it almost prosaic, especially this: “We need to stop thinking of IT as an attached appendage of each and every specific and isolated enterprise. Yep, 2000 fully operational and massive appendages for the Global 2000. All costly, hugely redundant, unique largely only in how complex and costly they are all on their own.” – whatever that means?

However, I’ve been thinking about a reasonable challenge for anyone considering jumping to a complete services or cloud/services model: not migrating, not having a roadmap or architecture to get there, but, as Dana suggests, grasping the nettle and just doing it.

One of the simplest and easiest examples I’ve given before for why I suspect, as Dana would have it, “legacy systems” exist, is because there are some problems that just can NOT be split apart a thousand times, whose data can NOT be replicated into a million pieces.

Let’s agree. Google handles millions of queries per second, as do eBay and Amazon, well, probably. However, in the case of the odd Google query not returning anything, as opposed to returning no results, no one really cares or understands; they just click the browser refresh button and wait. It’s pretty much the same for Amazon: the product is there, you click buy, and if every now and again there was only one item of a product left at an Amazon store front and someone else bought it between the time you looked for it and decided to buy, you just suffer through the email saying that the item will be back in stock in 5 days. After all, it would take longer than that to track down someone to discuss it with.

If you ran your banking or credit card systems this way, no one would much care when it came to queries. Sure, your partner is out shopping, you are home working on your investments. Your partner goes to get some cash, checks the balance and the money is there. You want to transfer a large amount of money into a money market account, you check and the amount is there; you’ll transfer some more into the account overnight from your savings and loan, and you know your partner only ever uses credit, right? You both proceed. A real transactional system lets one of you proceed and the other fails, even if there is only 1 second, and possibly less, difference between your transactions coming in.

In the Google model, this doesn’t matter; it’s all only queries. If your partner does a balance check a second or more after you’ve done a transfer, and sees the wrong balance, it will only matter when they are embarrassed 20 seconds later trying to use that balance, which isn’t there anymore.
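The difference can be sketched in a few lines. This is a toy, in-memory illustration, not how any real banking system is built: the `Account` class, the `run_race` helper and the amounts are all assumptions. The point it shows is that the balance check and the debit happen as one atomic step, so when two requests that each looked fine on their own arrive together, exactly one succeeds.

```python
import threading


class Account:
    """Toy account where check-and-debit is a single atomic step."""

    def __init__(self, balance: int) -> None:
        self._balance = balance
        self._lock = threading.Lock()

    def balance(self) -> int:
        # A plain query: harmless on its own, but stale the moment it returns.
        with self._lock:
            return self._balance

    def withdraw(self, amount: int) -> bool:
        # The check and the update sit under one lock, so two competing
        # requests can never both pass the check against the same balance.
        with self._lock:
            if amount > self._balance:
                return False
            self._balance -= amount
            return True


def run_race(opening: int = 1000, amount: int = 800):
    """Two parties each try to take `amount` at (nearly) the same time."""
    account = Account(opening)
    outcomes = []
    outcomes_lock = threading.Lock()

    def attempt() -> None:
        ok = account.withdraw(amount)
        with outcomes_lock:
            outcomes.append(ok)

    threads = [threading.Thread(target=attempt) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(outcomes), account.balance()


print(run_race())
```

Whichever thread gets there first wins; the other is refused, and the closing balance is never negative. A query-only system has no equivalent step, which is why it can be split and replicated so freely.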

Of course, you can argue banks don’t work like that; they reconcile balances at the end of the day. You will care, though, when that exceptional balance charge kicks in if both transactions work. Most banks’ systems are legacy systems from a different perspective, and should be dead. We, as customers, have been pushing for straight through processing for years; why should I wait 3 days for a check to clear?

So you can’t have it both ways. Out of genuine professional understanding and interest, I’d like to see any genuine transaction-based systems that are largely or wholly services based or that run in the cloud.

In order to do what Dana advocates, move off ALL legacy systems, those transaction systems need to cope with 1000, and up to 2000, transactions per second. Oh yeah, it’s not just banks that use “legacy IT”; there are airlines, travel companies, anywhere where there is finite product and an almost infinite number of customers.

Remember, Amazon and eBay and PayPal don’t do their own credit card processing as far as I’m aware; they are just merchants who pass the transaction on to a, err, legacy system.

Some background reading should include a paper that I used early in my career. Around the time I was advocating moving Chemical Bank, NY’s larger transaction systems to virtual machines, which we did, I was attending VM Internals education at Amdahl in Columbia, MD. One of the instructors thought I might find the paper useful.

It was written by a team at Tandem Computer and Amdahl, including the late, great Jim Gray. It was written in 1984. Early on in this paper they describe environments that support 800 transactions per second in 1984. Yes, 1984. These days, even in the current economic environment, 1000tps are common, and 2000tps are table stakes.

Their paper is preserved and online here on microsoft.com

And finally, since I’m all about context: I’m an employee of Dell, I started work there today. What is written here is my opinion, based on 34 years of IT experience, much of it garnered from the sharp end, designing an I/O subsystem to support a large NY bank’s transactional, inter-bank transfer system, as well as being responsible for the world’s first virtualized credit card authorization system etc., but I didn’t work for Dell, or for that matter IBM, then.

Speakers corner anyone?

Is SOA dead?

There has been a lot of fuss since the start of the new year around the theme “SOA is dead”. Much of this has been attributed to Anne Thomas Manes’ blog entry on the Burton Group’s blog, here.

InfoWorld’s Paul Krill jumped on the bandwagon with a SOA obituary, quoting Anne’s work and saying “SOA is dead but services will live on”. A quick-fire response came on a number of fronts, like this one from Duane Nickull at Adobe, and then this from James Governor at Redmonk, where he charismatically claims, “everything is dead”.

First up, many times in my career, and James touches on a few of the key ones, since we were there together, or rather, I took advantage of his newness and thirst for knowledge as a junior reporter, to explain to him how mainframes worked, and what the software could be made to do. I knew from 10-years before I met James that evangelists and those with an agenda, would often claim something was “dead”. It came from the early 1980’s mainframe “wars” – yes, before there was a PC, we were having our own internal battles, this was dead, that was dead, etc.

What I learned from that experience, is that technical people form crowds. Just like the public hangings in the middle ages, they are all too quick to stand around and shout “hang-him”. These days it’s a bit more complex, first off there’s Slashdot, then we have the modern equivalent of speakers corner, aka blogs, where often those who shout loudest and most frequently, get heard more often. However, what most people want is not a one sided rant, but to understand the issues. Claiming anything is dead often gives the claimer the right not to understand the thing that is supposedly “dead” but to just give reasons why that must be so and move on to give advice on what you should do instead. It was similar debate last year that motivated me to document my “evangelsim” years on the about page on my blog.

The first time I heard “SOA is dead” wasn’t on Anne’s blog; it wasn’t even, as John Willis (aka botchagalupe on twitter) claims in his cloud drop #38, from him and Michael Cote of Redmonk last year. No sir, it was back in June 2007, when theregister.co.uk reprinted a piece by Clive Longbottom, Head of Research at Quocirca, under the headline SOA – Dead or Alive?

Clive got closest to the real reasons why SOA came about, in my opinion, and thus why SOA will prevail, despite rumours of its demise. It is not just about services; from my perspective, it is about truly transactional services, which are often part of a workflow process.

Not that I’m about to claim that IBM invented SOA, or that my role in either the IBM SWG SOA initiative or the IBM STG services initiative was anything other than as a team player rather than as a lead. However, I did spend much of 2003/4 working across both divisions, trying to explain the differences and similarities between the two, and why one needed the other, or at least how they related. And then IBM marketed the heck out of SOA.

One of the things we wanted to do was to unite the different server types around a common messaging and event architecture. There was almost no requirement for this to be synchronous and a lot of reasons for it to be services based. Many of us had just come from the evolution of object technology inside IBM and from working on making Java efficient within our servers. Thus, a services-based approach seemed, for many reasons, the best one.

However, when you looked at the types of messages and events that would be sent between systems, many of them could be crucial to the effective and efficient running of the infrastructure; they had, in effect, transactional characteristics. That is, a given event could initiate action a, then b, then c and finally d. While action d could be started before action c, it couldn’t be started until action b was completed, and that in turn was dependent on action a. Importantly, none of these actions should be performed more than once for each instance of an event.

Think failure of a database or transactional server. Create new virtual server, boot os, start application/database server, rollback incomplete transactions, take over network etc. Or similar.
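The ordering and once-only constraints described above can be sketched in a few lines of Python. This is only an illustration, not how any IBM product implemented it: the dependency map, the `EventProcessor` class and the event name are all hypothetical, and I am assuming that action c depends on b, as the "a, then b, then c" ordering suggests.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each action lists the actions that must
# complete before it may start.
DEPENDS_ON = {
    "a": set(),
    "b": {"a"},
    "c": {"b"},  # assumption: c follows b
    "d": {"b"},  # d may start before c, but not before b completes
}


class EventProcessor:
    """Runs each action at most once per event instance, in dependency order."""

    def __init__(self, depends_on):
        self.depends_on = depends_on
        self.completed = set()  # (event_id, action) pairs already performed
        self.log = []           # order in which actions were actually run

    def handle(self, event_id):
        # static_order() yields every action after all of its prerequisites.
        for action in TopologicalSorter(self.depends_on).static_order():
            key = (event_id, action)
            if key in self.completed:
                continue  # already done for this event instance; never repeat
            self.log.append(key)  # stand-in for doing the real work
            self.completed.add(key)


processor = EventProcessor(DEPENDS_ON)
processor.handle("event-1")
processor.handle("event-1")  # a redelivered event triggers no further actions
```

Recording completed (event, action) pairs is the simplest way to get the once-only behaviour; a real system would need that record to survive a crash, which is exactly where the transactional characteristics come in.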

Around the same time, inside IBM, Beth Hutchison and others at IBM Hursley, along with smart people like Steve Graham, now at EMC, and Mandy Chessell, also of IBM Hursley, were trying to solve similar transactional-type problems over HTTP, using web services.

While the Server Group folks headed down the path of Grid, Grid Services and ultimately the Web Services Resource Framework, inside IBM we came to the same conclusion: incompatible messages, incompatible systems, different architectures, legacy systems, etc. need to interoperate, and for that you need a framework and a set of guidelines. Build this out from an infrastructure layer to an application level; add in customer applications and that framework; then scale it in any meaningful way, so that it needs more than a few programmers working concurrently on the same code or on the same set of services, and what you need is a services oriented architecture.

Now, I completely get the REST style of implementation and programming. There is no doubt that it could take over the world. From the perspective of those frantically building web mashups and cloud designs, it already has. But in none of the “SOA is dead” articles has anyone effectively discussed synchronous transactions; in fact, apart from Clive Longbottom’s piece, no real discussion was given to workflow, let alone the atomic transaction.

I’m not in denial here about what Amazon and Google are doing. Sure, both do transactions, and both were built from the ground up around a services-based architecture. Now, many of those who argue that “SOA is dead” are often those who want to move on to the emperor’s new clothes. However, as fast as applications are being moved to the cloud, many businesses are nowhere in sight of moving to or exploiting the cloud. To help them get there, they’ll need to know how to do it, and for that they’ll need a roadmap, a framework and a set of guidelines, including, if it covers their legacy applications and systems, how those get there. For that, they’ll likely need more than a strategy; they’ll need a services “oriented” architecture.

So, I guess we’ve arrived at the end, the same conclusion that many others have come to. But for me it is always about context.

I have to run now, literally. My weekly long run is Sunday afternoon and my running buddy @mstoonces will show up any minute. Also, given I’m starting my new job, I’m not sure how much time I’ll have to respond to your comments, but I welcome the discussion!

Back in the day – way back

I suggested to @adamclyde we take a twitter conversation about the gray area between personal and corporate blogging offline, into email. In my response to him, like some “grumpy old man“, I started by recalling the good old days when my URLs were emea.ibm.com/(something), then ibm.com/s390/corner and later ibm.com/servers/corner.

Later I went looking and found some of my webpages from 2000 on the Internet Archive. I was even more delighted to find they had some of my old presentations. I didn’t check through all of them, but my V2 Corner is here. I’ve taken one of my better presentations from the Internet Archive and posted it on slideshare.

Enterprise Workstation Management - From Chaos to Order

The PDF version doesn’t have all the overlay colors right, and some of the embedded graphics are missing, but it’s still worth looking through for both content and style.

If Google can celebrate its 10th anniversary by reporting its 2001 index, well, how about letting me get away with reposting a presentation from 1996 that originated in 1989! The presentation has its origins in 1989 as a Lotus Freelance presentation printed on real overheads via a plotter. It covers the management of workstations and PCs in corporate environments.

This version dates from June 1996 and was recovered from the Internet Archive. Some of the colored overlays are the wrong colors and some of the graphics are missing. I still think it’s worth taking a look through for both style and content. I got the summary slide wrong, but not by much, as we move to what some are calling Cloud Clients.

Spaghetti cabling

Racks and Racks of Spaghetti, photo by: Andrew McKaskill

As always, I’ve been focusing on the positive and forward looking aspects of unified fabrics for data centers. I got a few interesting emails after my last blog entry on the problems people have now that need solving, not least the cost, reliability and sheer complexity caused by the current situation.

One of the emails included a link to this blog entry on vibrant.com, which has a great collection of cable mess pictures. In his email to me Chris wrote “most of our racks are carefully organised and formally tied off in bundles. Server replacement is relatively easy if you just want to exchange one with another that fits in the same space. The problem comes when you need to reconfigure a few servers, add some appliances, maybe remove an email backup or firewall appliance and relocate in a different rack, you undo the cable ties and everything starts falling apart. While none of our rows looks this bad, many of the racks within the rows end up looking like this.”

He goes on to discuss some of the related problems this causes and often the complete lack of momentum in solving the problem because of how labor intensive and expensive it can be, both in cost and downtime to deal with these issues. Another email included a link to this discussion forum on techrepublic.

I have to admit, this is an area I have little or no experience with. Nancy has yet to take me on a tour of the test bed we use across the hall for the high-end Power systems, and as I’ve been locked away working mostly on design for the past five years, I’ve not witnessed the explosive growth in large-scale data centers. Since I’m doing customer validation sessions now, don’t be surprised if, when I come to your office, I ask to see the machine room.

Finally on the same page

Thanks to James Governor at Redmonk, or @monkchips on twitter, for the pointer. In this pretty direct interview, Linus Torvalds says something I got into trouble for about seven years ago. Linus says “An OS should never have been something that people really care about… it should be completely invisible”.

I gave the keynote presentation at the OS/390 Expo and Performance Conference, I think in either 2000 or 2001. During the presentation I made exactly the same point, only about Linux. Yes, Linux was great; yes, we were going to do some pretty innovative things with it; but if in five years’ time we were still worrying about scalability, driver compatibility, etc., then we’d have missed the point: we shouldn’t really have to care about the OS.

Unfortunately, in the audience were the IBM Account Executive for a large multi-national company and the CTO from that company. Rather than come and ask questions afterwards, they took one point from a 75-minute presentation and complained about me to my then VP, Carol Stafford. Carol “invited” me into her office and I had to explain the remark and its context. Carol understood and took care of things.

So it’s with a wry smile I read the Torvalds article, and then sat up and wondered, do we still care about the OS, or has the stack become more important?

Is it so hard to work with IBM?

Via James Governor’s del.icio.us feed, I read a pretty disappointing blog post on SmugBlog, the blog of Don MacAskill, CEO and Chief Geek of SmugMug.

Reading some of his past posts it’s easy to accuse Don of being a bit of a Sun fanboy. He recently ran a server bake-off to get some new servers for SmugMug, and decided to go with Sun, fair enough. I can’t dispute that Sun have a good offering in the web space, and being based out on the West Coast, get a lot of face time and word of mouth endorsements with many of the startups, especially in the web 2.0 space.

I know how important this can be. Back in 2000, I hung out a lot at the Silicon Valley World Internet Center and then, late in 2005, early 2006 worked out of the IBM office in Palo Alto with a number of key virtualization partners to get some direction and development work started.

What troubled me was not that Don had chosen Sun for his servers, but his comment “One of the attendees, who spends obscene, ungodly amounts of money with IBM, can’t even get engineering staff on the phone. Apparently, IBM has a big sales force who’s trained to buffer customers away from the engineers.”

“shurley shome mishtake”? My whole career before I joined IBM was littered with experiences with sage and helpful IBM engineers, either through the numerous user groups or, best of all, when the products broke and, after talking to level-1 and level-2, you’d get through to the level-3 folks. I remember to this day having a discussion with someone called, I think, Linda Iannella, who worked up in Kingston in the mid-1980s. She knew code like no one I’d ever spoken to. When I asked how she was so sage, she replied, “I work on this code 10-hours a day, every day”. Adrian Walmsley, a former IBM UK employee, was probably the most influential, and I’m delighted to have worked with him later on.

Have things changed so much that you can’t get in touch with the technical team at IBM anymore, or was Don’s experience atypical? Is it a west coast versus east coast thing, or just because IBM is so big these days?

I have to admit, getting things actually done these days can be difficult at times if you approach a novice IBMer, but we have an excellent kStart team out in the valley working with some startups, and most of the Blue Cloud work is being done in Almaden and SVL.

Let me know what your experience has been. Can I help?

More on complexity, configurability

One of my first posts in this blog was on the subject of complexity. James Governor of Redmonk weighed in today on complexity with a trackback post called “What SOA needs to learn from Ruby On Rails“.

I noted that while our software, and often our systems, are complex, that is because our customers are, not because we design them to be complex. Our customers run a vast array of machines, in widely different environments, supporting a broad range of applications. Of course, this is chicken and egg, and it is a difficult tightrope for established solutions to walk. We could just remove most of the configuration options and in a generation or two the complexity would have gone, but what about the customers?

Forced into a straitjacket of “our way or the highway”, would you take the latter?

It’s easy for the new kid, in this case Ruby on Rails, to come out and offer little or no configuration options, side files, etc. It doesn’t have to; it has never made a significant change in what or how it does things. The same isn’t true for the old-timers. Comparing SOA to Ruby is like comparing a transport system to a footpath.

It is a subject important to me though. At the moment I’m carefully trying to marshal the merger of the function in the System p Hardware Management Console with that of IBM Systems Director and Director console. My desire is to make one simple management platform that acts both as the local platform director, managing configuration, hardware and service management etc. and at the same time providing a set of programmable, function services based interfaces to provide both remote access, and remote management.

So, I’m all for simplicity but it has to be thought through. We are doing this with the System p Configurations for SOA Entry Points. The original SOA Entry points were pure software plays divided into five categories, People, Process, Information, Connectivity, and Reuse. We are taking the entry points one step further and mapping the software onto System p removing another layer of complexity by showing how they work, how you can configure them and testing them as a total solution.

You can read the System p Configurations for SOA Entry Points overview here, via FTP

John Lennon once sang “It’s been too long since we took the time, No-one’s to blame, I know time flies so quickly” … “It’ll be just like starting over, starting over”.

About & Contact

I'm Mark Cathcart, Director of Systems Engineering and a Distinguished Engineer at Dell. I was formerly an IBM Distinguished Engineer and member of the IBM Academy of Technology. I'm an information technology optimist.
