# Tuesday, December 23, 2003

Apologies for the lack of technical content of late (and in this post).  Today I finished my engagement on the project that involved a 4-hour commute, and tomorrow I'm flying to Gifhorn, a small town in Germany, to spend the Christmas week with my wife and her family.  I'm hoping to have more time and energy to blog again in the New Year (before our first child is born in April).

Happy Christmas to everyone in the blogosphere.

posted on Tuesday, December 23, 2003 12:42:59 AM (GMT Standard Time, UTC+00:00)  #   

Mary and Tom Poppendieck have a book on Lean Software Development that has really excited me of late.  I was reading it while thinking about how the project I just finished could be improved in future iterations.

Lean Software Development takes ideas from Lean Manufacturing that can be applied to software development.  Their website has an excellent range of presentations and publications.  The one I first read, which I think is as good as the book, is 'The Predictability Paradox'.  It goes through their eight keys to lean thinking:

  • Start early. Focus on establishing good communication within the group and about the problem.
  • Learn constantly.  Look for end-to-end slices of functionality that you can build, test and deliver to learn more about the problem.
  • Delay commitment.  Use encapsulation, loose coupling, refactoring and automated testing to make it easier to change code.
  • Deliver fast. Focus on being able to deliver fast (e.g. automated build and deployment) through excellent operational discipline (source code control, build tool) in order to get quicker feedback.
  • Eliminate waste.  Focus on customer value and remove wasted processes.
  • Empower the team.  Avoid central control, focus on self-directed workers.
  • Build integrity in.  Write the tests at the start.
  • Avoid sub-optimization. Sometimes breaking a complex task into small parts leads to sub-optimum solutions.  Make sure you measure the right things and set the correct goals.

For some reason this just seemed to speak to me in ways that some of the other writing on Agile/Extreme hasn't in the past.

posted on Tuesday, December 23, 2003 12:40:05 AM (GMT Standard Time, UTC+00:00)  #   
# Monday, December 15, 2003

Here's a Public Service announcement for fellow UK bloggers. The London .NET User Group will be hosting Dave Sussman speaking on ASP.NET 2.0 'Whidbey' on January 20th at the Cafe Royal (not the MS swimming pool room in Soho).  For those who don't know, Dave is a co-author of the Addison-Wesley .NET Series titles 'A First Look at ASP.NET v 2.0' and 'A First Look at ADO.NET and System.Xml V2.0' (sample chapters are available at both links).

The event is free but book early as it is likely to book out.  If you're interested let Ian Cooper know at meetings@dnug.org.uk 

I'm going to be presenting on Service Oriented Architecture with WSE and Indigo in February.

posted on Monday, December 15, 2003 11:12:51 PM (GMT Standard Time, UTC+00:00)  #   
# Sunday, December 14, 2003

Doing some performance testing on my current project showed me the evils of premature optimization and the need to measure performance before 'optimizing' the code.

A month ago when I started on the project there was no working deployed version of the application in an environment I could run tests against, so I did some code reviews and eyeballed areas of the architecture that might be slow.  The project uses BizTalk 2002 Orchestrations to process a SOAP message that is passed around as a string.  All of this manipulation was based around the XmlDocument DOM model, which had been slow and memory-hungry in previous projects, so I wrote some code to test the speed of different approaches to XML processing.  Using the XPathDocument and XPathNavigator turned out to be 70% faster. 

However, when I analysed the entire web service method only 6% of the call time was spent in these methods.  So even if we implemented the change it would only improve the overall performance of the web service call by about 4%.  This may or may not be worth it, but it highlighted to me the need to get real performance figures to test my assumptions about where the bottlenecks were in the code. As Donald Knuth says:

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
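
Going back to the numbers above, the arithmetic behind the 4% estimate can be sketched in a few lines. This is only an illustration: it assumes "70% faster" means the XML code takes 70% less time, and the class and method names are made up for the example.

```csharp
using System;

// Hypothetical helper, not part of the project described in the post.
public static class OptimizationEstimate
{
    // Fraction of the total call time saved when a part of the call
    // (fractionOfCall) has its own running time cut by timeReduction.
    public static double OverallSaving(double fractionOfCall, double timeReduction)
    {
        return fractionOfCall * timeReduction;
    }

    public static void Main()
    {
        // Figures from the post: XML handling is 6% of the call, and the
        // XPathDocument version is assumed to take 70% less time.
        Console.WriteLine("Overall saving: {0:P1}", OverallSaving(0.06, 0.70));

        // Upper bound: the saving if the XML handling cost nothing at all.
        Console.WriteLine("Best case:      {0:P1}", OverallSaving(0.06, 1.00));
    }
}
```

The saving works out to 0.06 × 0.70 = 0.042, or roughly 4% of the call, and it can never exceed the 6% that the XML code accounts for, however fast that code becomes.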

posted on Sunday, December 14, 2003 5:41:59 PM (GMT Standard Time, UTC+00:00)  #   
# Tuesday, December 09, 2003
I love that Microsoft has put the PDC presentations online and, unlike the US TechEd presentations, has left the questions in at the end.  They've been very useful in helping to sort out the Microsoft Messaging message.  Here's one more piece of the puzzle that I found at the end of Steve Swartz's excellent WSV 403 Indigo Coming Attractions presentation when he was asked 'How do you implement something like MSMQ with Indigo?'

MSMQ kind of semantics can be implemented [in Indigo] in two ways:  They can be implemented inside a channel, or out in an application serving as an intermediary.

Indigo comes with a channel called the reliable channel that implements point-to-point buffering, point-to-point queuing - so that you can be on a laptop sending messages to me.  The messages stay in a buffer. Later your app goes away, I connect, the messages get sent to me, I reply, the messages sit in a buffer, later your app comes out and drains them.

Or I can build a queue intermediary.  Indigo V1 will ship with a queue class that you can put in an intermediary sitting at an address.  You can put messages to that and you can pull messages from it.

So those are the two options depending on whether you think of a transport kind of thing - so the semantics of me talking to you - or whether it’s a real thing that sits between your app and the user of the app, where many people may be writing to the same queue and many apps might be reading from it.

Over time - Indigo is a long plan - in Indigo v1 we will be releasing classes, so that you can build queues.  Over time we will be implementing 'queue services' - full bore services that know about clusters and the whole thing.  Parts of MSMQ won't be in v1.  The programmer part will be, but the big old configurable server side won't be.

So, the programming model is there for V1, but a replacement for the enterprise aspects of MSMQ will have to wait until the future.  This would cover the apps where I've used message queues in the past (basically a private queue that one app can post to independently of another that reads the messages off), though it's not (yet) the replacement for Tibco I was hoping for.

posted on Tuesday, December 09, 2003 8:28:44 PM (GMT Standard Time, UTC+00:00)  #   
# Monday, December 08, 2003
Last week I mentioned that the Microsoft Messaging message wasn't being fully ACK'd by the community and posed some questions.  In the interest of clarifying the message and checking my own understanding, here are some answers I've come up with based on watching the PDC sessions on my daily train commute and the comments that BWill from the Indigo team left on my last post.

What's the relationship between Indigo, MSMQ, BizTalk and SQL Server "Yukon" Service Broker?  When and where is each technology appropriate?
Basically it depends on what type of application you are trying to build and what environment it needs to run in.  Here's a chart adapted from the DAT406 session:

| Technology | Indigo | MSMQ | "Yukon" Service Broker |
|---|---|---|---|
| Environment | Any WS-* compliant endpoint | Windows NT | Yukon on both ends |
| Application | Any distributed application | Any Async NT Application | Database applications |
| Message Store | In memory or DB persistent store (Yukon/SQL Server 2000) | MSMQ Message Store (NT File System) | Built into Yukon |
| Type of Message | Persistent and non-persistent dialogs | Reliable, Express and Transactional Messaging | Transactional messaging only |
| Protocol | Various | Various | TCP only |

Where does BizTalk come into it?
BizTalk is a product that builds upon other technologies in the Microsoft platform.  As BWill says in comments on a previous post, choosing BizTalk or Indigo will be a question of how much of the infrastructure you want to build yourself.  BizTalk has connectors for MSMQ currently; in future it may connect to Indigo or possibly to Yukon Service Broker.  BizTalk 2004 is currently in Beta and is on a different release schedule from the other products.

Where does Yukon Service Broker fit?
Here's Roger Wolter's answer from DAT406: Building Reliable Asynchronous Database Applications with Yukon:

The key to keep in mind is that Service Broker is a database application framework, not a messaging system.  Yeah we send messages to other databases, but if you're building a messaging infrastructure there's not enough in Service Broker to satisfy your needs.  If you are building a queued, asynchronous database application and you wanted to use reliable messaging to scale out that application, Service Broker is the answer.

There were some hostile questions in the DAT406 session about why it was necessary to put a messaging layer in the database.  John Cavnar-Johnson (who definitely needs a blog) calls it an 'abomination' in the DevelopMentor Indigo discussion list.  Personally, I think it's part of a Dr. Evil style plan from the SQL team - if they were to add a spreadsheet to the product then a great majority of the world's applications wouldn't need an OS, letting the SQL team achieve world domination.  Seriously though, there seem to be several good reasons:

  • Developers that know SQL can now  develop queued, asynchronous database apps.  The ability to have asynchronous queues is a very nice architectural feature.  Being able to achieve this with SQL syntax like BEGIN DIALOG ... FROM SERVICE ... TO SERVICE is pretty cool.

  • It's all in the one box.  Everything happens within the database.  Backup, restore, installation, configuration, monitoring and security are all there in the one location.  So deployment of the database is deployment of the messaging system etc. (no need to hassle with MSMQ installs).

  • The  message broker is the database.  It's easy to query the status of messages, processing the queues is as simple as writing a stored procedure, the database can efficiently throttle the queue processing resources and it's possible to farm out message processing work to another machine since all that is required to process a queue is a DB connection string.

  • It's fast. The Service Broker is fast because there's no need for two-phase commits for transactional messaging, there's no need to cross processes to get to the messaging platform and if the send and receive queues are in the same database then it's very fast.

Niels Berglund from DevelopMentor has been teaching Yukon for a while to Microsoft employees (such as Tim Sneath) and has an excellent sample chapter on Yukon Service Broker that's available for free download.

How does MSMQ fit into the longer term picture?
BWill says in my comments that the Indigo team has shared the love and embraced the MSMQ team into its building.  John Cavnar-Johnson did some research at the PDC:

I completely disagree with the idea that Indigo offers all of the functionality of MSMQ. I discussed this exact question at the PDC with Steve Swartz, Mike Vernal, and Anand (whose last name I don't recall, but he's an MSMQ program manager).  Indigo will offer reliable messaging (which is a huge improvement over current web service technology), but it will not be a full-fledged message queuing system.

As I reported from the PDC, the message was:

"We are not building the uber queuing system - we are not a replacement yet for MSMQ - we have support for routing, but we aren't replacing CISCO, we support eventing but we aren't a replacement for Tibco."

I think there's more to be said in this space.  It's likely that this is about achieving an Indigo V1 release (primarily about unifying the three different programming models and baking WS-* specification support into the platform) and then targeting more ambitious goals with future releases.

Which parts of Indigo will ship in Whidbey and which bits will ship before or with Longhorn?
Basically, System.Transaction will be in Whidbey, the rest later.  I'm still digesting Don's WSV302 Indigo Part2: Secure, Reliable, Transacted Services and Jim Johnson's Transactional Programming on the Windows Platform presentations to understand this more deeply.

Given that the last version of WSE will be wire-level compatible with Indigo and that a future version of WSE is likely to support WS-ReliableMessaging, what are the benefits of Indigo other than the simplified programming model?
Even though I love WSE I'm following the words of Hervey and accepting that WSE is V.Last++.  This question was me fishing for what features Indigo will provide me with as an architect/developer that I can't get from WSE.  BWill mentions:

Advantages of Indigo over WSE: three off the top of my head are performance, integration, and support [it's part of the platform rather than WSE's 2+1 support policy]. I'm sure there are others. Also, note that there is no guarantee that every feature in Indigo will also be in WSE.

So, no bites as to what the extra functionality of Indigo might be, so I'm still fishing (e.g. digging deeper into the Longhorn SDK Indigo Samples).  Of course, Indigo has learnt from WSE, so the Indigo programming model will also be nicer (though the WSE programming model is already small and well refactored).

Indigo is committed to supporting WS-* standards and interoperability, but what extra functionality will be available if the whole environment is made up of Indigo boxes?
I'm still trying to get a feel for what features and functionality might be available in Indigo.  BWill mentions the fact that Indigo will likely run faster in an all-Indigo environment. 

Indigo does offer Peer to Peer functionality
Robert Scoble tells us:

I stopped in on Don Box yesterday and he gave me a demo. Indigo is going to radically change how we think of Internet technologies. Imagine something that looks like a website, but that doesn't require a centralized server. Now you're getting your mind around what Indigo could do. Indigo is designed to take advantage of our always on, always connected computers.

It's not difficult to spot an Evangelist with a marketing strength, is it :)? It sounds like a simple programming model on top of existing network stacks opens up opportunities to use the Internet for more than just web browsers talking to a central web server.  I think I'll review WSV306 Indigo and Peer to Peer apps on the train this week.

posted on Monday, December 08, 2003 10:02:46 AM (GMT Standard Time, UTC+00:00)  #   
# Sunday, December 07, 2003
Sam Gentile has been doing some useful questioning and thinking about Service Oriented Architecture (SOA) that led me to think about the scope of SOAs.  Certainly SOAs are a useful architectural concept, but they are not an answer for all projects.  Currently SOAs are complex and difficult to implement, and until the technology improves and it gets easier for developers to build, deploy and manage these systems, SOAs are likely to be most appropriate for a small number of enterprise projects. [Update: I've edited this post slightly to clarify some points after comments from John Cavnar-Johnson]

On the 'death of objects' and the SOA 'paradigm shift'
Sam started off by saying:

SOA [is] one of those paradigm shifts - it really does mean the death of objects at least as we know them.

Before I could suggest that we fine any blogger who uses worn-out expressions like 'it's the death of [technology X]' and 'paradigm shift', Sam clarified that he didn't mean objects but OOA/OOD:

I really meant OOA/OOD - how do you now design/decompose system design/requirements into architectures and I think it's less now about classic OOA/OOD and tightly coupled object design and more about a loosely coupled collection of components under a service.

I agree that proclaiming the death of objects was a misstatement. It may be more accurate to say that components, which are based around sharing types between parts of the system, are likely to become less of a primary focus with the rise of SOAs.  I agree with what Brian Noyes says:

I don't think SOA means the death of OOP or Components at all. Just like most people build components using OOP, I think most people will build SOAs using OOP and Components. They are not competing concepts but complementary.

SOAs are important where there is a need to share messages and interoperate with unknown others
To my mind, SOAs provide the most benefits when there is a need to share data/information and interoperate with other groups, possibly on unknown platforms, that you have no control over. There will still be plenty of applications that are built in the current n-tier component Enterprise Architectures. Talking with Jim Johnson at the PDC, he made the point that the benefits of SOA with external partners may also be benefits within an organisation or service boundary. I agree, but I think the technology platform (e.g. Indigo) and management tools (e.g. whatever Microsoft are planning here) have a long way to go before these benefits outweigh those of using existing component-based technologies.

Interoperating with others often means going for a lowest-common-denominator approach, which is always going to perform slower than when you can go with a binary format and control both ends of the wire (as Sam mentioned, using ASMX just doesn't give the same performance as current 'binary typed' systems, which is why Indigo will do special things if it knows it is working in an all-Indigo environment).

SOAs are currently still complex and difficult to build and manage
SOAs are currently still complex and difficult to build and manage. They are complex because the standards are still being implemented (on a recent project I was on it took a major international bank nearly 3 months to convince a market-leading J2EE vendor to adopt SOAP headers and honour the 'mustUnderstand' attribute).  Newer standards like WS-Addressing are still being worked through and implemented.  They are difficult because of the layers of technology that must be understood in order to build the systems.  Just read Clemens' description of his latest FABRIQ project to see the level of technology understanding, skill set and experience you need to build an SOA project today. As the tools develop and experience and awareness of SOAs grow, they are likely to get easier and simpler to build, in the same way that client server and then n-tier were once considered complex and difficult but are now considered mainstream.

Services are about outside, Objects/COM+/Enterprise Services/MSMQ are for inside
It's useful to be aware of boundaries, as Don Box demonstrated at the PDC.  There are parts of SOA that are designed to be used on public organisational boundaries and some that are better deployed within an organisational boundary or even behind a service boundary.  Clemens Vasters makes the distinction between 'near and far'.

When using services outside an organisational boundary there are benefits to using open standards and working with contracts and schema rather than type, since it's difficult to control what is at the other endpoint.  Enterprise Services and MSMQ provide useful functionality that isn't yet covered in WS-* standards, but the problem with these two approaches is that they often share binary type information or require a Microsoft box or adapter at the other end.  This doesn't mean they shouldn't be used in SOA, just that they are better used inside the organisation where it's possible to have more control over the communication and the endpoints.  Within the service boundary there's still a service to provide, and it's here that technologies like components/COM+/ES/MSMQ are likely to be just as useful as they are today.

Gregor Hohpe's Enterprise Integration Patterns provides useful SOA guidance
I think that Sam Gentile is right: some architectural changes are needed to move towards SOA and message-oriented systems.  Luckily Gregor Hohpe has written Enterprise Integration Patterns - Designing, Building, and Deploying Messaging Solutions, a great book distilling his experience of these systems.  Hervey Wilson's reading it and it's on Ingo Rammer's list of recommended books as well!

posted on Sunday, December 07, 2003 6:11:50 PM (GMT Standard Time, UTC+00:00)  #   
# Saturday, December 06, 2003

Steve Maine writes about the importance of getting the API right when designing software:

Naming the abstractions present in a software system is an important design consideration, especially when designing a consumable API. The names you choose will influence the way consumers of your API will think about your software – you can either choose to make it easy for them, or hard. Convincing ourselves that we've made the right choice is proving difficult

Rather than trying to convince yourselves about the design I'd suggest trying it out or doing some usability testing.

In terms of trying it out, I'd suggest taking the Test Driven Development idea (or the Microsoft idea of Dogfooding) and writing the sample code first to see how the code looks when you use it yourself. Often design discussions are 'tainted' by people knowing the implementation rather than thinking about the intention in a user's mind.

For the usability testing you could start with the idea of Personas to describe the users of your software (e.g. is it for a 'Mort', an 'Einstein' or an 'Elvis'). Think about what goals the users of your API are trying to achieve and then write those down. Get around 5 people who might write code against your API and ask them how they might achieve these tasks using it. Give them the object model and ask them to write out the pseudo-code. If it's just the naming that you're concerned with, try describing the job it does and asking potential users what they think it might be called.

What would Jakob do?  Here I'm trying to find out. You can go all psycho-serious and see what value you can extract from this Microsoft Research presentation "Describing and evaluating API usability at Microsoft", but I'd go with Jakob Nielsen and start with the discount usability approach. Get a small number of potential users, think of open questions you can ask or tasks that they could do, then follow the golden rule of sitting back and listening/watching without saying anything (the photo on the right is me with Jakob Nielsen.  I'm smiling even though I paid £400 of my own money just to hear Jakob tell me this simple rule).

posted on Saturday, December 06, 2003 9:30:50 PM (GMT Standard Time, UTC+00:00)  #   

Recently I've been reviewing the most efficient way of retrieving values from an XML document in .NET.  As Scott Hanselman mentions, the best way to find this out is to write some code and measure the performance, so I wrote some code that can be downloaded to test the performance of three different approaches:

  • an XmlDocument using XPath queries with SelectSingleNode
  • an XPathDocument with an XPathNavigator
  • using the XmlSerializer to deserialize into a custom class

The results show that the XmlSerializer is fastest once the initial cost of creating temporary assemblies has been overcome.  In situations where the initial performance is most important then an XPathNavigator over an XPathDocument is the fastest.

Approach
Here's more detail about each approach:

| Object Type | XmlDocument | XPathDocument | XmlSerializer |
|---|---|---|---|
| Retrieval Method | XPath query using XmlDocument.SelectSingleNode | XPath queries using an XPathDocument | Object properties |
| Advantages | Familiar to many developers. XPath queries allow for quick evaluation of complex expressions. | Optimized for XPath and XSLT transformations. XPath queries allow for quick evaluation of complex expressions. Likely to become more important in future. | Turns XML into objects. |
| Disadvantages | Slow, requires the whole document to be in memory. | Slightly more complex for developers to write than XmlDocument. | Requires familiarity with XSD and is more complex to set up. Can't match XPath's complex expressions. Slow due to generation of dynamic assemblies on first use. |

Example (XmlDocument):

    // Load the whole document into the DOM, then query it with XPath.
    XmlDocument doc =
       new XmlDocument();
    doc.Load(filePath);
    XmlNodeList selection =
       doc.SelectNodes(XPath);
    result = selection.Item(0).InnerText;

Example (XPathDocument):

    // Read-only store optimized for XPath; take the first match.
    XPathDocument doc =
       new XPathDocument(filePath);
    XPathNavigator nav =
       doc.CreateNavigator();
    XPathNodeIterator it =
       nav.Select(XPath);
    it.MoveNext();
    result = it.Current.Value;

Example (XmlSerializer):

    // Deserialize into the custom 'message' class and read a property.
    XmlTextReader reader =
       new XmlTextReader(filePath);
    XmlSerializer ser =
       new XmlSerializer(typeof(message));
    message mymsg =
       (message)ser.Deserialize(reader);
    result = mymsg.MessageID;

Aaron Skonnard provides some excellent background on the benefits of the XPathDocument over the XmlDocument in his article .NET XML Best Practices: Part I: Choosing an XML API.

For the timing tests I used the code available in the EggHead Cafe article "High-Precision Code Timing in .NET".  Unfortunately the Ticks property of the System.DateTime class is only accurate to around 16ms even though it displays values to 100 nanoseconds.
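
On .NET 2.0 and later the System.Diagnostics.Stopwatch class gives a high-resolution timer without any extra plumbing. It is not available in .NET 1.x, so the following is a minimal sketch of how a timing helper might look on a newer runtime rather than the code used for these tests; the CodeTimer class and TimeSeconds method are hypothetical names.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper; not the code from the article mentioned above.
public static class CodeTimer
{
    // Runs the action once and returns the elapsed wall-clock time in seconds.
    public static double TimeSeconds(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        // Reports whether the Stopwatch is backed by a high-resolution counter.
        Console.WriteLine("High resolution: {0}", Stopwatch.IsHighResolution);

        // Time the same piece of work twice to compare a first and a repeat run.
        Action work = () => string.Join(",", new string[1000]);
        Console.WriteLine("First run:  {0:F5}", TimeSeconds(work));
        Console.WriteLine("Repeat run: {0:F5}", TimeSeconds(work));
    }
}
```

Timing a first run and a repeat run separately, as above, is what exposes one-off costs such as the XmlSerializer's temporary assemblies.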

Results
In this app I take a small XML document with no namespace and retrieve four values from it.  Here are the figures I got when running the console application for the first run and then a repeat run:

| Runs | XmlDocument | XPathDocument/Navigator | XML Serialization |
|---|---|---|---|
| 1 (first run) | 0.00543 | 0.00129 | 0.09020 |
| 1 (repeat run) | 0.00051 | 0.00035 | 0.00028 |

The first time the application is run the XML Serializer has to create temporary dynamic assemblies, so it performs the worst; however, on subsequent runs within the application it performs the fastest since it can use a cached copy of the temporary assemblies.  (As Scott found out, these assemblies are not cached in .NET 1.0 if you specify a namespace.  Daniel Cazzulino has some good background on how the XmlSerializer works.)  In both the first and the second runs the XPathDocument/XPathNavigator approach is faster than the XmlDocument.

In the sample code I've also presented an ASP.NET web application that can host the same tests.  ASP.NET creates a new AppDomain for each website it hosts, which is maintained across requests to the server until it is shut down or recycled. Since the XmlSerializer caches its temporary assemblies based on the AppDomain, using the XmlSerializer in an ASP.NET web application or web service application is actually the fastest technique for retrieving values from an XML document.
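
One way to make that reuse explicit is to create the serializer once and keep it in a static field, so every request in the AppDomain shares it. The sketch below is illustrative only: the Message class, its element names and the MessageReader helper are assumptions, not the types from the sample code.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical message type; element and property names are assumptions.
[XmlRoot("message")]
public class Message
{
    [XmlElement("MessageID")]
    public string MessageID { get; set; }
}

public static class MessageReader
{
    // Created once per AppDomain, so the cost of generating the temporary
    // serialization assembly is paid only on first use.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(Message));

    // Deserializes an XML string into a Message using the shared serializer.
    public static Message Read(string xml)
    {
        using (StringReader reader = new StringReader(xml))
        {
            return (Message)Serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml = "<message><MessageID>abc-123</MessageID></message>";

        // The first call triggers creation of the serializer; later calls reuse it.
        Console.WriteLine(Read(xml).MessageID);
        Console.WriteLine(Read(xml).MessageID);
    }
}
```

The XmlSerializer(Type) constructor already reuses its generated assembly within an AppDomain, so the static field mainly avoids repeated construction and makes the intent obvious to the next reader of the code.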

In situations where the AppDomain is created per request, using the XPathDocument/XPathNavigator will be more efficient than either an XmlDocument or the XmlSerializer.

Discussion
The XmlSerializer is the fastest way to retrieve values from a small XML file if it is possible to overcome the cost of the creation of the temporary assemblies.  In situations where the first retrieval is the most important, or where more complex XPath queries are used, using an XPathNavigator over an XPathDocument provides better performance than the XmlDocument.

Thankfully it will be possible with .NET 2.0/Whidbey to use a tool (sgen.exe) to pre-create and compile Serializers.  Doug Purdy covered this in his PDC talk, see Scott's notes.  I believe this will make XmlSerializer the fastest approach to retrieving values from XML, but testing will tell.

Click here to download the sample code.

posted on Saturday, December 06, 2003 7:12:06 PM (GMT Standard Time, UTC+00:00)  #   
# Wednesday, December 03, 2003

A month after the Indigo 'Kimono opening' at the PDC there's still a lack of clarity about what Indigo is, how it relates to other messaging technologies and what's the best way to start developing applications today.  While a lot of this was covered at the PDC, my perception is that some of the message hasn't been ack'd successfully by the audience.  [Update: See my more recent post 'More on the Microsoft Messaging message' for some answers to these questions]

The Longhorn DevelopMentor mailing list had an excellent exchange yesterday on Indigo, which has led me to highlight some areas where I'd like a clearer message:

  • What's the relationship between Indigo, MSMQ, BizTalk and SQL Server Service Broker?  When and where is each technology appropriate? (Thanks to John Cavnar-Johnson for highlighting this)
  • How does MSMQ fit into the Indigo longer term picture? Is MSMQ for within the Enterprise or will it focus more on outside the Enterprise messaging?
  • In terms of EAI technologies, where should we look for this functionality - Indigo or BizTalk?
  • Which parts of Indigo are going to ship in Whidbey and which bits will come in Longhorn?
  • Given that the last version of WSE will be wire-level compatible with Indigo and that a future version of WSE is likely to support WS-ReliableMessaging, what are the benefits of Indigo other than the simplified programming model?
  • Indigo is committed to supporting WS-* standards and interoperability, but what extra functionality will be available if the whole environment is made up of Indigo boxes?

I know some of these were addressed at the PDC.  I'm still digesting some of it (for example, the PDC DAT406 presentation on the "Yukon" Service Broker shows how MSMQ, Indigo and Service Broker are positioned, though not BizTalk).  Other questions I'm still researching.

posted on Wednesday, December 03, 2003 9:29:39 AM (GMT Standard Time, UTC+00:00)  #