James Newkirk lead a Birds of a Feather session on Test Driven Development. The room was packed to the rafters, showing that Unit Testing is starting to reach a critical mass. Here are my notes on the discussion from the session which covered how to write tests, how to use tests against legacy systems, how to test against the database and many other topics.
Should we write test code against interfaces or something more abstract than the implementation?
James mentioned that MBUnit is a tool that allows you to test against an interface. The question was whether you should create interfaces that enable tests to be written against them in case further implementations were created in future. James' attitude was that this might result in wasted work ('you aint gonna need it') since you may not need it, or may not need it now. Instead, abstract things out when you need them - don't create an interface just to test it.
James also said that an interface is not a good example of the contract of what is being done - it is the name of the method with input and outputs, but does not reflect how the method reacts to the input. James writes tests that show the real interaction between someone that calls the code and what it produces.
How much of the class should we test? Public methods only or protected and private methods as well?
Around a third of the group thought you should test protected and private methods, about another third thought that we should only test public methods.
The arguments for testing internal and protected methods:
- To ensure that the internals work. One delegate mentioned writing a test before each internal method. Someone else said the start with the public method then refactor and hide the method by turning it private.
- Another argument was An argument is that if you write a granular class with lots of private methods, then the tests should be just as granular as the thing being tested.
- The internals are the most important as they contain the biggest areas of logic, so they should be tested directly.
The argument for testing public methods:
- Private method tests inhibit refactoring - you have to refactor your tests with a change, increasing the burden and making it less likely that the tests are updaed.
- You may need to test all of the scenarios from the real world against the public method. If your class is written well then the private methods should be tested.
- If you separate the tests from the production code then you must test the public methods.
- Using public methods gives you a clear limit to the number of unit tests that you need. Public tests will get most of the issues, there's no payback from more investment by testing private and internals.
James says that there is no winning this argument. He favors testing the public interface because it decouples the test from the implementation which will discourage refactoring. However, there are cases where it doesn't make sense to expose something just to test it. He thinks it is 80/20 or 90/10 favouring testing through the public interface.
Whidbey will have friend assemblies that allow developer's to split two assemblies, this other assembly is akin to being inside the same assembly. You can do this today with a multi-module assembly built through the command line, not through Visual Studio.
How do we introduce Unit Tests to a 'legacy system' without tests?
No one in the session had seen legacy code that is easy to test if the testing hasn't been thought of upfront. One person mentioned that they created tests for each issue logged through the help desk and then used these as a regression suite. This also demonstrated the value of unit testing to their organisation.
James suggested drawing a line around a piece of the code that you need to change. Test it's external behaviour, make changes and ensure the tests still run. They aren't unit tests, but who cares? Create a boundary, create tests and then change. The key point was to conserve effort and only write tests on things that you are going to change.
James also mentioned a book to be published in September by Michael Feathers called 'Working Effectively With Legacy Code' that describes how to handle this difficult situation.
How to convince people to do test first? Argue against the concept that it will take too long to write the tests first?
One delegate mentioned that this is hard because the best way to convince people is to show results, which requires a practical example, which means you have to know how to do it (what classes, how to unit testing). It takes time and experience but it will work eventually.
Someone else mentioned the green bar of success (my favourite) on the screen is a big part of demonstrating it.
You have to make sure the person has the right mindset. Not everyone has a zero-defect mindset. They want to write tests more than write code.
James says the story is not about the individual developer who wrote the code and knows it works. It is about the team and the ongoing evolution and maintenance is where the tests matter. How can you be confident of your result without them? With unit tests I know they works, before I relied on something else, but now I know.
Why should we write the tests first?
James says that no one will take my word for it. What convinces people is having to test something that hasn't been created with testing as a priority. If a setup method has 20 lines of code and many objects being created and only contains a single assertion - it wasn't written with testing in mind. Someone will say it is really hard to test it. So the next question is 'What would you do differently?' The answer to this question is what people should do in Test First.
James said that he believes that testing has to be seen as a primary part of development. We have to incorporate test inside development. There may be QA, but they start at a different level. You have to incorporate testing as part of development.
In the PAG group they have 2,000 unit tests that get run against the library every time someone modifies the code. When James talked to the testing and QA group who do integration testing -they didn't know what to do - the unit tests did the easy part of the testing. They have to start at a higher, more complex level to start thinking about critiquing what is going on rather than I just pass 32,000 characters in a web app and it breaks. These are necessary tests, but it doesn't take a lot of skill to do these types of tests.
Someone made the point that software development lags behind electronic engineering where hardware must be tested first.
James mentioned that studies have show that Test First is 16% longer, but the quality was much higher.
How do you test a function that is dependant on other objects or libraries?
One delegate mentioned that using Dynamic Mock objects relied on finding a sweet spot. They are useful with objects that have relationships, but the problem is that if you refactor your code then it is highly likely that a whole slew of tests that will break. The problem is compounded with dynamic mocks since you only see this at run time rather than compile. It works well at mocking the database, but not the more dynamic
James described mock objects as the situation where object A interacts with objects B and you need some way of 'switching out' Object B to create a 'dummy response' from the calls of Object A.
James believed that if you are interacting with something complex, building the simulation of something that is complex is a worthless activity - you spend more time writing the simulator than the test and it doesn't tell you anything? Just because my mocks work, does my real system function? James uses a rule that if mock objects have an if statement inside them then that's too far - the behavior is too complicated and there's no value (it says there are multiple situations). They do have value, but we need to separate a simulator and a mock object. For example, writing a mock JDBC implementation - that is way too complicated. Be skeptical of spending a lot of time spending time building MockObjects.
Someone asked - 'Don't you think that it depends on the situation?' James: "I could answer everything that way"
James mentioned the Inversion of Control pattern - this is the a situation where object A depends on object B - the developer wants to be able to switch object B out as a mock or a stub - how can we do that? Inversion of control says you create the dependent object B, or a mock or stub external to object A and then pass it into object A as a interface parameter in a constructor. That's a lot of complexity to add to the initialization, James was interested in opinions about whether the complexity is worth it? He mentioned that some people have pushed back on the idea because of the complexity of the construction - if all you are doing this for testability (it also decouples the design) then it is too much work? He believes that what has to happen is that we need different language structures and we will never get there if we dismiss these ideas now. In the Java world there are lots of work being done on using containers in another way.
How do you we test the database? Where do you get your data for testing? How do you unit test things that involve the database?
James thought that the problem with the database is similar to the MockObjects. It is a lot of effort to Mock out the database. Sometime it is easier to use the database than mock, you need a database to run the unit tests and you have to fill the database, read in a dump, which takes longer and is more effort.
James' experience is that at some point you write tests against the database because he wants to test that the Data Access Layer works. When he write DAL tests he uses the database. But he don't use the DB for anything that uses the DAL - he mocks them out. A technique he use to ensure he's written good unit testing is to deliberately break something and see the results. If he finds a cascade of errors, it means he hasn't isolated the tests correctly. He also mentioned that if he comes back to an a feature afterwards and spend a lot of time in QA it means he didn't do well enough with the tests.
With databases, it is a good idea to use a transaction and rollback the transaction at the end of it to ensure that. There was some discussion about how to construct database tests. The problem with building the database in scripts the unit tests take too long and they wont be run. How long is too long?
How long should the tests take?
There were a range of experiences including:
- Four or five machines run it each product, it takes 6 hours across machines.
- Someone else has developer tests and then smoke and build tests, it can take a long time in the nightly build.
- Another project had two CruiseControl systems that built at different intervals, running a short and long set of tests.
James thinks an hour is too long - the reason is that ideally you would be able to get very quick work around. This is how it works successfully. Waiting an hour for the feedback means that a developer could only make 8 or 10 changes a day. In that hour a number of things are accumulated. James mentioned his experience on an early project where it took 12 hours to recompile that application. In this project if someone made a change to a header file in C++ they had to recompile, so what they'd do is create global static variables instead of the header to ensure it works before 'doing it correctly' (which was never done). Sorting out problems was very complex because it was effectively an integration nightmare..
Should all developers run all tests? It depends on the architecture - if there are subsystems you could do that. In XP it says all of the tests all of the time. James' group run 2000 tests in 340 seconds against 3 databases.
How do you manage dependence tests with the database? How to restore state with persistent storage.
It is a good idea to stub out the system so that you can test against something that you depend upon. Then when the implementation is delivered you can run the tests and it becomes clear where there is an integration problem.
One delegate spoke of how he stubs out the system, then write tests and implementation, then the stubs are removed and the real implementation of the rest of the tests are done. The platform should support this - James mentioned a project where he used 'linker polymorphism' - relied on the linker to substitute. Another delegate said they had written something that as part of the the daily build checks out the project file, it changes the XML structure of the C# project to switch references and changes them to the actual assemblies.
How do you test concurrency? Multithreading?
These are just really hard. Testing timing problems with unit testing are hard. One delegate spoke about how he had written unit tests for a multithreaded apps and said it was one of the best examples to convince others of the value of unit tests. It took 2 months to create, but then it only took 1 day to shift it from Java to C#.
The xUnit tools should support this but don't make it any easier.
Another delegate used a timer library and extended the Nunit assertion.
Roadmap for MS
VS Team System will include unit testing support some time next year. It will have a number of integrated testing tools. The keynote yesterday showed a tool 'very much like' and 'subtly' different tool to Nunit. James will be talking about the difference.
There are load testing tools, web testing tools in there. It is a team system that is extensible by many different partners. Many partners, such as CompuWare (did a functional UI testing), doing stuff inside this 'platform' that can be extended.
Someone will write something that will allow Nunit to execute in this environment.
How do you generate tests?
Team system will create a stub test from the code. These are just boiler plate - it doesn't do analysis of the running code and propose a 'good test'
One delegate had a poor experience with a tool that created tests automatically because the tests didn't take into account the intention of the class which resulted in the tests failing. Writing the test manually is more useful as these can act as a specification for the class.
James thinks looking at the implementation to drive the tests is looking at it the wrong way. If you wrote the implementation wrong and then create tests off these, what's the value? The tests should say 'this is what the implementation should do'
People often propose this if they haven't done unit testing before.
How do you test distributed applications?
It relies on subsystems - just write tests that test the local machine, when you have to integrate it all together you might need to run tests, but these are not unit tests in this situation.
How do you unit test GUIs?
One delegate used Rational Robot - costs a fortune. If you are careful and script it then you can get away for a few builds without it breaking. SendKeys was also used. Another person just avoided the problem by trying to get all of the functionality out, recognising that it is hard to test the UI. Someone said that it is hard to test drag-and-drop operations.
James menioned that Robot is not good to drive development because they require the UI to be done, but it is hard to do. NUnit Forms and NUnit ASP were also mentioned.
What frameworks can you recommend?
Nunit, csUnit, mbUnit, CLRUnit
HarnessIt (commercial)
Xunity (Commercial)
How do you do Web unit testing?
HttpUnit works, but it is Java, these can be used successfully to drive HTTP requests and do some testing on the HTML that is returned.
They suffer from many of the brittleness problems that the robot testing does - you have to use id's on the elements - but it requires a lot of thinking about how you output.