Performance Testing
What
"Testing conducted to evaluate the compliance of a system or component with specified performance requirements." [IEEE]
BS 7925-1, British Computer Society Specialist Interest Group in Software Testing (BCS SIGIST)
Performance testing is a subset of non-functional requirements testing. Other techniques in this area include conformance testing, conversion testing, documentation testing, penetration testing, recovery testing, serviceability testing and storage testing.
However, as with most things, one man's performance test is another man's stress test.
Test techniques that I would include in performance testing are load testing and stress testing. The reason I include these as part of performance testing is that they evaluate how the system's behaviour changes under various levels of usage.
Why?
To establish that the system or component under test will behave as we expect, under the conditions we expect. Note that we are not setting out to blindly "hammer" the system into submission or destruction. The end result should be that management can have confidence in the software, and enough information on which to base business-related decisions.
Who?
Performance testing is usually done within the developing organisation, although it may be contracted out to a specialist test lab or consultancy. Contracting out is popular because of the heavy investment required in tools and training.
In terms of testing roles, performance testing is typically split between test analysis and test automation. The test analyst decides which tests are to be run, while test automators or engineers create the scripts and configure the tools. More junior staff may act as technicians who execute the tests; once the suite is written, running it repeatedly can become a fairly mundane task.
Where?
At the developing organisation's site or at a specialist facility.
When?
The timing of performance testing depends on the model or methodology being used. In the case of the Waterfall, performance testing will take place at the very end of the development lifecycle. In the worst-case scenario, imagine a system that has been tested without fault in functional testing with a single user. When a load of three concurrent users is simulated, it crashes due to continuous data locks. Huge amounts of time are then required to fix high-profile defects at or near release time.
In incremental development, the system is built up by adding components or sub-systems one at a time. As each component is added, the system can be tested to see if it meets its requirements. Of course, the individual components or sub-systems can also be performance tested by themselves.
In the case of the Rational Unified Process (RUP), performance testing may take place as early as the Inception phase. It is in Inception that business risk is mitigated and a candidate core architecture is put forward. Simple performance testing might be done to confirm, in a very coarse-grained manner, that the architecture is suitable. This is particularly relevant for systems that will generate large numbers of transactions.
How?
Performance testing requires huge numbers of users, large volumes of data and high transaction rates. Such requirements make manual testing unfeasible; automated testing is really the only way to go. Of course there are many tools available on the market, for instance LoadRunner.
Tool types include data preparation and test execution tools. They are generally known as Computer Aided Software Testing (CAST) tools.
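As a rough, minimal sketch of what such tools automate (the checkout_transaction function below is an invented placeholder for a real call to the system under test), the following Python script spins up a number of concurrent virtual users, each firing a stream of transactions, and summarises the response times:

```python
import time
import random
import statistics
from concurrent.futures import ThreadPoolExecutor

def checkout_transaction():
    """Placeholder for a real call to the system under test (e.g. an HTTP request)."""
    time.sleep(random.uniform(0.05, 0.2))  # simulated server response time

def virtual_user(transactions_per_user):
    """One simulated user firing a stream of transactions, timing each one."""
    timings = []
    for _ in range(transactions_per_user):
        start = time.perf_counter()
        checkout_transaction()
        timings.append(time.perf_counter() - start)
    return timings

if __name__ == "__main__":
    concurrent_users = 50
    transactions_per_user = 20
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        per_user = pool.map(virtual_user, [transactions_per_user] * concurrent_users)
        all_timings = [t for timings in per_user for t in timings]
    print(f"transactions executed: {len(all_timings)}")
    print(f"mean response time:    {statistics.mean(all_timings):.3f}s")
    print(f"worst response time:   {max(all_timings):.3f}s")
```

Commercial CAST tools do essentially this at far greater scale, adding scripting, scheduling, data preparation and reporting.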
The difficulty with performance testing is replicating the real world environment, especially the randomness of individual events, whilst maintaining the predictability of the big picture. Various statistical methods are available to help in this process. My particular favourite is Monte Carlo scenario modelling. In a Monte Carlo simulation, we follow the rules of the casino: if we roll a die once, we do not know whether it will show a 1 or a 6, but if we roll it six million times, we will get roughly a million ones, a million twos, and so on.
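A minimal sketch of that casino rule in Python: a single roll is unpredictable, but over six million rolls each face turns up roughly a million times.

```python
import random
from collections import Counter

rolls = 6_000_000
counts = Counter(random.randint(1, 6) for _ in range(rolls))

# Each face should turn up roughly a million times (one sixth of the rolls).
for face in range(1, 7):
    print(f"face {face}: {counts[face]:,} rolls ({counts[face] / rolls:.2%})")
```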
Load Testing
In load testing, the software under test is subjected to various levels of "load" to observe its behaviour.
Load testing is not defined in BS 7925-1, and it can mean different things depending on to whom you are speaking. It is most definitely a non-functional test, and the aspect of behaviour concerned is performance.
The testing has to be conducted within a structured framework, otherwise its results are open to question.
Various elements need to be considered before engaging in Load Testing.
Planning Load testing is like any other activity: good quality is the result of planning and foresight, enabling the test to be run over and over again.
Objectives What do we hope to achieve? This question has to be answered in relation to the specified requirements of the Software Under Test (SUT). The tester needs to know how the SUT should behave under a given load.
For example, for a large site it may be accepted as a business risk that the SUT will run slower at high loads. However, this needs confirming, otherwise the business runs the risk of the SUT failing because the effect of higher loads was underestimated.
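As a hedged sketch of how such an objective can be turned into a pass/fail check, assume the (invented) requirement that 95% of transactions complete within two seconds while the agreed peak load is applied:

```python
def meets_objective(response_times_s, percentile=0.95, limit_s=2.0):
    """True if the given percentile of measured response times is within the limit.

    response_times_s: response times (seconds) measured while the agreed peak
    load was being applied; the requirement only makes sense at a stated load.
    """
    ordered = sorted(response_times_s)
    index = min(int(percentile * len(ordered)), len(ordered) - 1)
    return ordered[index] <= limit_s

# Example: timings gathered from a load test run at the agreed peak load.
measured = [0.4, 0.6, 0.7, 0.9, 1.1, 1.3, 1.6, 1.8, 2.4, 0.5]
print("objective met" if meets_objective(measured) else "objective NOT met")
```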
On what? Which part of the system is to bear the specified load? The answer ranges from a single component to a Web Services system made up of many third-party applications. As the SUT becomes larger and more complex, so does the load testing suite.
For example, suppose it is decided that a website checkout facility needs testing. This can be tested in isolation as a component, or as part of the system. If it can only be tested once integration has taken place and a fault occurs, was the problem with the checkout or somewhere else in the system?
Iterative development styles suit load testing, as they successively build up a full system. The RUP, with its phased approach, gives many opportunities: during the Inception and Elaboration phases the core architecture can be assessed, and in Construction smaller components can be tested as they are added incrementally and then as an integrated whole.
The sequential or Waterfall method is less suited, due to its single big-bang approach to integration. Only after months or even years of development can the system as a whole be load tested, and by then it is difficult to establish exactly what you are testing and what the causes of failure are.
Which load? How is the load on the SUT to be generated? Many users, or a small number of users generating large numbers of transactions? The important factor is to replicate the situation the SUT will meet in the real world. For example, a bank system processing payments will have one user processing a lot of data, whereas a consumer website may have millions of users each conducting one or two transactions a day.
Should the load be increased as the test progresses, or held at a steady, sustained level for a long period?
Should the load be of a single type or a mixture? In our checkout example, a certain percentage of the transactions should be users dropping out part-way through. Again, this mirrors the real world.
Testers looking at load testing may want to consider using Monte Carlo simulations, where data is randomly generated according to a set of rules: we do not know which transaction is coming next, but it has, say, a 1 in 5 chance of being a drop-out from the checkout.
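A minimal sketch of such a rule set in Python (the transaction types and weights are invented for illustration):

```python
import random
from collections import Counter

# Hypothetical workload mix for the checkout example: weights express the
# real-world likelihood of each transaction type, including the drop-out.
TRANSACTION_MIX = {
    "browse": 0.45,
    "add_to_cart": 0.25,
    "complete_checkout": 0.10,
    "drop_out_of_checkout": 0.20,  # the 1-in-5 abandonment mentioned above
}

def next_transaction():
    """Pick the next transaction type at random, according to the mix."""
    types, weights = zip(*TRANSACTION_MIX.items())
    return random.choices(types, weights=weights, k=1)[0]

# Over a long run the generated load converges on the intended mix,
# even though the next individual transaction is unpredictable.
sample = Counter(next_transaction() for _ in range(100_000))
for name, count in sample.most_common():
    print(f"{name}: {count / 100_000:.1%}")
```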
An important point, often overlooked, is that the load may spike: advertising may make the SUT suddenly very attractive to users. The classic example of failing to realise this was the Victoria's Secret affair, where a lingerie fashion show was heavily promoted during the Super Bowl half-time break. Millions of Americans logged on and, lo and behold, the servers crashed under the strain.
Expected outcomes The tester has to be aware of what to expect; if this is not the case, it should be raised with the analyst. Just because load testing is non-functional does not mean we can do without knowing what to expect.
Logging As the test progresses we need to know exactly a) the load being applied and b) the behaviour exhibited, so that any deterioration in performance can be tracked as the load increases.
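A minimal sketch of such a log, assuming one CSV record is appended per transaction, capturing both the load applied and the behaviour observed:

```python
import csv
import time

def log_result(writer, concurrent_users, transaction, response_time_s, outcome):
    """Record a) the load being applied and b) the behaviour exhibited for one transaction."""
    writer.writerow({
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "concurrent_users": concurrent_users,          # the load being applied
        "transaction": transaction,
        "response_time_s": f"{response_time_s:.3f}",   # the behaviour exhibited
        "outcome": outcome,
    })

with open("load_test_log.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[
        "timestamp", "concurrent_users", "transaction", "response_time_s", "outcome"])
    writer.writeheader()
    # In a real run these values would come from the load generator.
    log_result(writer, 50, "complete_checkout", 1.234, "ok")
    log_result(writer, 100, "complete_checkout", 3.501, "timeout")
```

Plotting response time against concurrent users from such a log makes any deterioration immediately visible.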
Tools Due to the large number of transactions, load testing can only realistically be undertaken with Computer Aided Software Testing (CAST) tools. LoadRunner and Rational Robot are the front runners for this type of testing.
Security Testing
What
"Testing whether the system meets its specified security objectives."
BS 7925-1, British Computer Society Specialist Interest Group in Software Testing (BCS SIGIST)
The latest fashionable technique is penetration testing, where the tester simulates attempts to break system security as a real intruder would.
In most cases security is a non-functional requirement. The exception of course is where the purpose of the software is security itself.
Why?
We all hope that unfriendly people or organisations will not be able to abuse our software or the data that it holds or generates. The type of people or organisations our customers are, and the uses to which they will put the software, should dictate the amount of security testing we do.
Three developments, I believe, have put security testing into the spotlight: September 11th, the move of business online and large-scale virus attacks. Notwithstanding these events, every system will have some level of security requirement, and will therefore need testing.
Thus a military communications system will require stringent security testing. At a more personal level, any system that holds sensitive personal information should, for privacy reasons, be secure.
Even fairly innocuous software can be exploited to do something it should not, or be used to break into another system. A good example is the humble Excel product: many viruses take advantage of buffer overruns to spread themselves, or use it as a gateway to the operating system and ultimately take over the user's machine.
Who?
The definition above only mentions the system. However, I believe awareness of security should begin even earlier, in component testing. It is at this stage that many of the chink-in-the-armour defects, such as buffer overruns, will be found. Websites may find the potential for SQL injection intrusions. The growing use of web services, which relies on opening individual components or sub-systems up for all to use, will make this level of testing even more crucial. Thus we can start our list with developers, or whoever is conducting unit testing.
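As a hedged sketch of what such a component-level check might look like (using Python's built-in sqlite3 module and an invented user-lookup function), the probe below fires a classic injection payload at two versions of the lookup and flags the one that leaks rows:

```python
import sqlite3

def find_user_unsafe(conn, name):
    """Deliberately vulnerable: builds SQL by string concatenation."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = '" + name + "'").fetchall()

def find_user_safe(conn, name):
    """Parameterised query: the payload is treated as data, not SQL."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)).fetchall()

def injection_probe(lookup):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "alice"), (2, "bob"), (3, "carol")])
    payload = "x' OR '1'='1"   # classic injection attempt
    rows = lookup(conn, payload)
    # A secure lookup should return no rows for this nonsense name.
    return "VULNERABLE" if rows else "ok"

print("unsafe lookup:", injection_probe(find_user_unsafe))  # VULNERABLE
print("safe lookup:  ", injection_probe(find_user_safe))    # ok
```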
Especially in high-risk systems, analysts need to ensure that security is built into the system design and processes. Additionally, testability needs to be high in this particular area.
Ideally, though, the software is tested by an independent team of system testers. In the case of penetration testing, an outside consultancy is brought in to simulate an attack. However, constraints on resources mean that independence in many cases suffers.
Where?
Testing that software is secure can take place anywhere, including the developer's own site. At the other extreme, the penetration tester or "intruder" may be sitting on a different continent, using the telecoms network and the internet to try to break into an online transaction site.
When?
Throughout the whole software development lifecycle, for both the developing organisation and the accepting customer. In addition, regular security testing should be undertaken to make sure the software remains secure.
How
Perhaps more than any other form of testing, security testing is associated with risk. If security is of particular importance, for example to the police or military, then awareness and practice have to be pervasive amongst the stakeholders. If the organisation has a mature development culture and is at level 3 or above on the CMMI, it should already have a strategy for risk mitigation.
Data Flow Testing
"Testing in which test cases are designed based on variable usage within the code."
BS 7925-1, British Computer Society Specialist Interest Group in Software Testing (BCS SIGIST)
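A minimal sketch of the idea, using an invented function: each variable has points where it is defined and points where it is used, and data flow test cases are chosen so that every definition is exercised together with each of its uses.

```python
def apply_discount(total, is_member):
    discount = 0.0                   # definition of `discount`
    if is_member:
        discount = 0.1               # re-definition of `discount`
    final = total * (1 - discount)   # use of `discount`, definition of `final`
    return round(final, 2)           # use of `final`

# Definition-use pairs for `discount`:
#   (discount = 0.0) -> use in `final = ...`   exercised when is_member is False
#   (discount = 0.1) -> use in `final = ...`   exercised when is_member is True
# So a minimal data flow test set needs both cases:
assert apply_discount(100.0, False) == 100.0
assert apply_discount(100.0, True) == 90.0
```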
Complete Path Testing
What
A test case design technique in which test cases are designed to execute all the paths of a component.
Testing paths through a component is largely a white-box test technique, because the tester needs access to the code. Typically path testing would be conducted as part of component testing. In addition, if complete path testing is done, this will contribute towards exhaustive testing.
Why?
Components are the building blocks of software. If we cannot be sure about their internal workings, how can we expect to trust a system that is built from them?
Only by identifying and executing the different paths or routes through the component can we be sure that all the behaviour the component will exhibit has been tested. For some software, failure is simply not an option; typically these are safety-critical systems such as medical software. So, unlike normal path testing, all the paths through the software are tested.
Who?
Ideally, someone independent of the person who designed and/or coded the component. However, this is quite rare, and it usually falls to the developer who did the coding. Not satisfactory, but this is the real world.
Where?
Invariably at the developing organisation's home site.
When?
Execution of the tests should be as close as possible to the completion of the code.
How?
Dynamic analysis, where the code is actually exercised, is the method here.
Tests can be either manual or automated. Various tools are available, including static analysers and run-time analysis tools.
The tester needs to be aware of the various paths through the component. From this he can decide which paths are to be tested, and that figure gives the denominator against which path coverage is calculated, with a target of 100%.
All the paths required to ensure the component's behaviour is functionally correct should be tested, plus as many alternative paths as possible. Commonly, the first paths to be dropped are those useful in negative testing. This, however, runs the risk of not exposing serious defects when the user takes a path that he should not.
The more life- or business-critical the component or software, the more paths will be tested. Thus a component for the Space Shuttle will probably have path coverage of 100%.
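A minimal sketch using an invented component with four distinct paths: complete path testing executes every one of them and reports the coverage achieved.

```python
def classify_order(value, is_priority):
    """Toy component: the decisions below give four distinct paths through the code."""
    if value <= 0:
        return "rejected"         # path A
    if is_priority:
        return "expedited"        # path B
    if value > 1000:
        return "needs approval"   # path C
    return "standard"             # path D

# All paths through the component, with a test case that exercises each.
PATHS = {
    "A: value <= 0 -> rejected":         lambda: classify_order(-5, False) == "rejected",
    "B: priority order -> expedited":    lambda: classify_order(50, True) == "expedited",
    "C: value > 1000 -> needs approval": lambda: classify_order(2000, False) == "needs approval",
    "D: ordinary order -> standard":     lambda: classify_order(50, False) == "standard",
}

executed = sum(1 for check in PATHS.values() if check())
print(f"path coverage: {executed}/{len(PATHS)} = {executed / len(PATHS):.0%}")
```

Add a few more nested decisions or a loop and the number of paths explodes, which is exactly the complexity concern raised in the next paragraph.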
In the long run, though, path testing has to be part of a wider culture of testability. Analysts and designers need to be aware of complexity: if they continue to demand complex objects, path testing becomes ineffective due to the sheer number of paths a tester is required to traverse.