Test design covers understanding the sources of test cases and test coverage, developing and documenting test cases, and building and maintaining test data. There are two primary methods by which tests can be designed:
- BLACK BOX
- WHITE BOX
Black-box test design treats the system as a "black box", so it doesn't explicitly use knowledge of the internal structure. It is usually described as focusing on testing functional requirements. Synonyms for black-box include: behavioral, functional, opaque-box, and closed-box.
White-box test design allows one to peek inside the "box", and it focuses specifically on using internal knowledge of the software to guide the selection of test data. It is used to detect errors by means of execution-oriented test cases. Synonyms for white-box include: structural, glass-box and clear-box.
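As a minimal sketch of the white-box idea, assume a hypothetical function `shipping_fee` whose source the tester can read. The test inputs are chosen by looking at its branches, one input per branch, rather than from a requirements document. The function, its name, and its thresholds are illustrative assumptions only.

```python
def shipping_fee(weight_kg: float, express: bool) -> float:
    """Hypothetical function under test; the tester can read this source."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    fee = 5.0 if weight_kg <= 2 else 5.0 + 1.5 * (weight_kg - 2)
    if express:
        fee *= 2
    return fee


# One test input per branch visible in the source above.
branch_cases = [
    ((1.0, False), 5.0),   # light parcel, standard delivery
    ((4.0, False), 8.0),   # heavy parcel: 5.0 + 1.5 * 2
    ((1.0, True), 10.0),   # express branch doubles the fee
]


def run_branch_cases() -> int:
    """Run every branch case and return the number of cases exercised."""
    for args, expected in branch_cases:
        assert shipping_fee(*args) == expected, (args, expected)
    # The error branch is exercised separately.
    try:
        shipping_fee(0, False)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for non-positive weight")
    return len(branch_cases) + 1


cases_run = run_branch_cases()
```

Without access to the source, the tester would have no particular reason to pick 2 kg as the interesting threshold; that choice is what makes this a structural test.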
While black-box and white-box are terms that are still in popular use, many people prefer the terms "behavioral" and "structural". Behavioral test design is slightly different from black-box test design because the use of internal knowledge isn't strictly forbidden, but it's still discouraged. In practice, relying on a single test design method hasn't proven useful. One has to use a mixture of methods so that one isn't hindered by the limitations of any particular one. Some call this "gray-box" or "translucent-box" test design, but others wish we'd stop talking about boxes altogether!
Black box testing is testing without knowledge of the internal workings of the item being tested. For example, when black box testing is applied to software engineering, the tester knows only the "legal" inputs and what the expected outputs should be, but not how the program actually arrives at those outputs. Because of this, black box testing can be considered testing with respect to the specifications; no other knowledge of the program is necessary. For this reason, the tester and the programmer can be independent of one another, avoiding programmer bias toward their own work. For this testing, test groups are often used.
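A minimal sketch of testing "with respect to the specifications" might look like the following. The specification assumed here is the usual leap-year rule, and the implementation under test is passed in as an opaque callable. Python's standard `calendar.isleap` is used as a stand-in for the program being tested; the table of cases and the helper name `check_against_spec` are illustrative assumptions.

```python
import calendar
from typing import Callable

# Specification-derived table: (legal input, expected output).
# Nothing here depends on how the implementation works internally.
SPEC_CASES = [
    (1996, True),   # divisible by 4
    (1900, False),  # century year not divisible by 400
    (2000, True),   # century year divisible by 400
    (2023, False),  # not divisible by 4
]


def check_against_spec(is_leap_year: Callable[[int], bool]) -> list[int]:
    """Return the inputs for which the opaque implementation deviates from the spec."""
    return [year for year, expected in SPEC_CASES if is_leap_year(year) != expected]


# Stand-in for the implementation under test (an assumption for this sketch).
failures = check_against_spec(calendar.isleap)
```

Because the checker only sees inputs and outputs, the same table can be reused unchanged against any other implementation of the same specification.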
Though centered on knowledge of user requirements, black box tests do not necessarily involve the participation of users. Among the most important black box tests that do not involve users are functionality testing, volume tests, stress tests, recovery testing, and benchmarks. Additionally, there are two types of black box tests that involve users: field tests and laboratory tests. The most important aspects of these black box tests are described briefly below.
Black box testing - without user involvement
So-called "functionality testing" is central to most testing exercises. Its primary objective is to assess whether the program does what it is supposed to do, i.e. what is specified in the requirements. There are different approaches to functionality testing. One is to test each program feature or function in sequence. The other is to test module by module, i.e. to test each function where it is first called.
The objective of volume tests is to find the limitations of the software by processing a huge amount of data. A volume test can uncover problems related to the efficiency of a system, e.g. incorrect buffer sizes or excessive memory consumption, or it may simply show that an error message is needed to tell the user that the system cannot process the given amount of data.
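The following is a small sketch of a volume test, assuming a hypothetical `RecordStore` with a fixed buffer size. The test feeds in far more data than the buffer holds and records whether the system fails with a clear error rather than silently misbehaving. All class and function names are invented for illustration.

```python
class CapacityError(Exception):
    """Raised when the hypothetical system cannot accept more data."""


class RecordStore:
    """Hypothetical system under test with a fixed internal buffer size."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.records: list[str] = []

    def add(self, record: str) -> None:
        if len(self.records) >= self.capacity:
            raise CapacityError(f"cannot store more than {self.capacity} records")
        self.records.append(record)


def volume_test(store: RecordStore, n_records: int) -> tuple[int, bool]:
    """Feed n_records into the store; return (records accepted, clean error seen)."""
    accepted = 0
    try:
        for i in range(n_records):
            store.add(f"record-{i}")
            accepted += 1
    except CapacityError:
        return accepted, True
    return accepted, False


accepted, hit_limit = volume_test(RecordStore(capacity=10_000), n_records=50_000)
```

Here the interesting results are the point at which the limit is reached and the fact that the user would see an explicit message at that point.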
During a stress test, the system has to process a huge amount of data or perform many function calls within a short period of time. A typical example is performing the same function from all workstations connected in a LAN within a short period of time (e.g. sending e-mails or, in the NLP area, modifying a term bank via different terminals simultaneously).
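The term bank scenario can be approximated in a sketch where threads play the role of workstations. The `TermBank` class below is a hypothetical, lock-protected stand-in for the shared system; a real stress test would drive the actual system over the network instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TermBank:
    """Hypothetical shared term bank; a lock guards concurrent updates."""

    def __init__(self) -> None:
        self._terms: dict[str, str] = {}
        self._lock = threading.Lock()

    def add_term(self, term: str, translation: str) -> None:
        with self._lock:
            self._terms[term] = translation

    def size(self) -> int:
        with self._lock:
            return len(self._terms)


def stress_test(bank: TermBank, workstations: int, edits_each: int) -> int:
    """Simulate many workstations editing the bank at the same time."""

    def workstation(ws_id: int) -> None:
        for i in range(edits_each):
            bank.add_term(f"ws{ws_id}-term{i}", f"translation-{i}")

    with ThreadPoolExecutor(max_workers=workstations) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(workstation, range(workstations)))
    return bank.size()


final_size = stress_test(TermBank(), workstations=20, edits_each=500)
```

Every simulated workstation writes distinct terms, so a final size below `workstations * edits_each` would indicate lost updates under load.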
The aim of recovery testing is to determine to what extent data can be recovered after a system breakdown. Does the system provide a way to recover all of the data or only part of it? How much can be recovered, and how? Is the recovered data still correct and consistent? Recovery testing is particularly important for software that must meet high reliability standards.
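A recovery test can be sketched with a hypothetical `JournaledStore` that appends each write to a journal file. The breakdown is simulated by leaving a half-written final entry, after which the store is restarted and the recovered data inspected. The class, the journal format, and the crash simulation are all assumptions made for this illustration.

```python
import json
import os
import tempfile


class JournaledStore:
    """Hypothetical store that appends every write to a journal file."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.data: dict[str, int] = {}
        if os.path.exists(path):
            self._replay()

    def _replay(self) -> None:
        with open(self.path, encoding="utf-8") as journal:
            for line in journal:
                try:
                    key, value = json.loads(line)
                except ValueError:
                    break  # torn final entry from the crash: stop replaying
                self.data[key] = value

    def put(self, key: str, value: int) -> None:
        with open(self.path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps([key, value]) + "\n")
        self.data[key] = value


def recovery_test() -> dict[str, int]:
    """Write data, simulate a crash mid-write, restart, and return what survived."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal.log")
        store = JournaledStore(path)
        store.put("a", 1)
        store.put("b", 2)
        # Simulate a breakdown in the middle of a write: a truncated entry.
        with open(path, "a", encoding="utf-8") as journal:
            journal.write('["c", ')
        del store
        # Restart and see how much was recovered.
        return JournaledStore(path).data


recovered = recovery_test()
```

The result answers the questions posed above for this toy system: the two completed writes are recovered intact, and the interrupted third write is discarded rather than restored in a corrupt state.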
Benchmark tests measure program efficiency. The efficiency of a piece of software depends strongly on the hardware environment, so benchmark tests always consider the software/hardware combination. Whereas for most software engineers benchmark tests are concerned with the quantitative measurement of specific operations, some also count user tests that compare the efficiency of different software systems as benchmark tests. In the context of this document, however, benchmark tests only denote operations that are independent of personal variables.
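A minimal sketch of such a quantitative measurement, using Python's standard `timeit` module, is shown below. The measured operation is a hypothetical placeholder, and the result is reported together with basic platform information because the timing is only meaningful for a given software/hardware combination.

```python
import platform
import timeit


def operation_under_test() -> int:
    """Hypothetical operation whose speed is being measured."""
    return sum(i * i for i in range(10_000))


def benchmark(repeats: int = 5, number: int = 20) -> dict[str, object]:
    """Time the operation and return the result with environment details."""
    # timeit.repeat returns one total time per repeat; the minimum is the
    # value least disturbed by other activity on the machine.
    timings = timeit.repeat(operation_under_test, repeat=repeats, number=number)
    return {
        "seconds_per_call": min(timings) / number,
        "machine": platform.machine(),
        "python": platform.python_version(),
    }


result = benchmark()
```

The absolute number will differ from machine to machine, which is why the environment fields are kept alongside it.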
Black box testing - with user involvement
For tests involving users, methodological considerations are rare in the SE literature. Rather, one finds practical test reports that distinguish roughly between field and laboratory tests, and only a rough description of these is given here. One example is the scenario test. The term "scenario" entered software evaluation in the early 1990s. A scenario test is a test case which aims at a realistic user background for the evaluation of software as it was defined and performed. It is an instance of black box testing where the major objective is to assess the suitability of a software product for everyday routines. In short, it involves putting the system to its intended use by its envisaged type of user, performing a standardised task.
In field tests, users are observed while using the software system at their normal workplace. Apart from general usability-related aspects, field tests are particularly useful for assessing the interoperability of the software system, i.e. how well the technical integration of the system works. Moreover, field tests are the only real means of uncovering problems with the organisational integration of the software system into existing procedures. Particularly in the NLP environment, this problem has frequently been underestimated. A typical example of the organisational problem of implementing a translation memory is the language service of a big automobile manufacturer, where the major implementation problem is not the technical environment but the fact that many clients still submit their orders as printouts, that neither source texts nor target texts are properly organised and stored, and, last but not least, that individual translators are not very motivated to change their working habits.
Laboratory tests are mostly performed to assess the general usability of the system. Because laboratory equipment is expensive, laboratory tests are mostly performed only at big software houses such as IBM or Microsoft. Since laboratory tests provide testers with many technical possibilities, data collection and analysis are easier than for field tests.
- Black box testing should make use of randomly generated inputs (only a test range should be specified by the tester), to eliminate any guesswork by the tester as to the methods of the function
- Data outside of the specified input range should be tested to check the robustness of the program
- Boundary cases should be tested (top and bottom of specified range) to make sure the highest and lowest allowable inputs produce proper output
- The number zero should be tested when numerical data is to be input
- Stress testing should be performed (try to overload the program with inputs to see where it reaches its maximum capacity), especially with real-time systems
- Crash testing should be performed to see what it takes to bring the system down
- Test monitoring tools should be used whenever possible to track which tests have already been performed and what their outputs were, to avoid repetition and to aid software maintenance
- Other functional testing techniques include: transaction testing, syntax testing, domain testing, logic testing, and state testing.
- Finite state machine models can be used as a guide to design functional tests
- According to Beizer, tests should generally be designed in the following order:
1. Clean tests against requirements.
2. Additional structural tests for branch coverage, as needed.
3. Additional tests for data-flow coverage, as needed.
4. Domain tests not covered by the above.
5. Special techniques as appropriate--syntax, loop, state, etc.
6. Any dirty tests not covered by the above.
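The input-selection guidelines in the bullet list above (random inputs within a specified range, out-of-range data, boundary cases, and zero) can be sketched as follows. The helper `select_inputs` and the function under test, `percent_to_ratio`, are hypothetical names introduced for illustration, and the fixed random seed is an added assumption to keep runs reproducible.

```python
import random


def select_inputs(low: int, high: int, n_random: int = 5, seed: int = 0) -> dict[str, list[int]]:
    """Choose test inputs for a specified integer range [low, high]."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    return {
        "boundary": [low, high],              # top and bottom of the range
        "out_of_range": [low - 1, high + 1],  # robustness checks
        "zero": [0],                          # zero for numerical input
        "random": [rng.randint(low, high) for _ in range(n_random)],
    }


def percent_to_ratio(percent: int) -> float:
    """Hypothetical function under test: accepts 0..100 only."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return percent / 100


def run_selected_inputs() -> dict[str, int]:
    """Run every selected input and count accepted and rejected values."""
    inputs = select_inputs(0, 100)
    outcome = {"accepted": 0, "rejected": 0}
    for group in inputs.values():
        for value in group:
            try:
                ratio = percent_to_ratio(value)
                assert 0.0 <= ratio <= 1.0
                outcome["accepted"] += 1
            except ValueError:
                outcome["rejected"] += 1
    return outcome


summary = run_selected_inputs()
```

Only the range is specified by the tester; the random values inside it are generated, which matches the first guideline, while the boundary, zero, and out-of-range groups are always included explicitly.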