Testing

2017-01-22  綿綿_

6.1 Test As You Write The Code

*Test code at its boundaries.*

One technique is **boundary condition** testing: as each small piece of code is written—a loop or a conditional statement, for example—check right then that the condition branches the right way or that the loop goes through the proper number of times. Typical boundary conditions are *nonexistent or empty input, a single input item, an exactly full array, a buffer that just fills, EOF, and so on.* The idea is that most bugs occur at boundaries. If a piece of code is going to fail, it will likely fail at a boundary. Conversely, if it works at its boundaries, it's likely to work elsewhere too.
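
As a small sketch of the habit (`readline` here is illustrative, not from the original text), consider a routine that reads a line into a fixed-size buffer; the cases to walk through by hand, right as the loop is written, are exactly the boundaries listed above:

```c
#include <stdio.h>

/* readline: copy one input line into s, storing at most max-1
   characters plus '\0'.  Returns the number of characters stored,
   or -1 at EOF when nothing was read. */
int readline(char s[], int max)
{
    int c = 0, i = 0;

    while (i < max - 1 && (c = getchar()) != EOF && c != '\n')
        s[i++] = c;
    s[i] = '\0';
    return (c == EOF && i == 0) ? -1 : i;
}

/* Boundary cases to check right away:
   empty line             -> returns 0, s is ""
   line of max-1 chars    -> buffer exactly fills; the '\n' is left unread
   EOF with no final '\n' -> the partial line is still returned
   immediate EOF          -> returns -1, not 0 */
```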

*Test pre- and post-conditions.*

Another way to head off problems is to verify that expected or necessary properties hold before (pre-condition) and after (post-condition) some piece of code executes. Making sure that input values are within range is a common example of testing a pre-condition.
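
For example, consider a routine that averages the first n elements of an array (a minimal reconstruction; the assertion message quoted below refers to a test of this sort). The pre-condition is that n must be positive, since the routine divides by n:

```c
/* avg: average of the first n elements of a[].
   Pre-condition: n > 0, since the result divides by n. */
double avg(double a[], int n)
{
    int i;
    double sum = 0.0;

    for (i = 0; i < n; i++)
        sum += a[i];
    return sum / n;   /* divides by zero if n == 0; wrong if n < 0 */
}
```

One option is to test the pre-condition explicitly and return a neutral value such as 0 when n <= 0; whether that is the right answer for an empty array is debatable, but the check at least keeps the routine from dividing by zero.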

*Use assertions.*

C and C++ provide an assertion facility in <assert.h> that encourages adding pre- and post-condition tests. Since a failed assertion aborts the program, assertions are usually reserved for situations where a failure is really unexpected and there's no way to recover. We might augment the code above with an assertion before the loop:

assert(n > 0);

If the assertion is violated, it will cause the program to abort with a standard message:

Assertion failed: n > 0, file avgtest.c, line 7
Abort (crash)

Assertions are particularly helpful for validating properties of interfaces because they draw attention to inconsistencies between caller and callee and may even indicate who's at fault. If an interface changes but we forget to fix some routine that depends on it, an assertion may catch the mistake before it causes real trouble.
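
As an illustration (the function and its contract here are hypothetical), assertions at the top of a routine record what the caller has promised, so a caller that breaks the contract is caught at the call site rather than somewhere downstream:

```c
#include <assert.h>
#include <string.h>

/* copy_name: copy src into the caller-supplied buffer dst.
   The assertions state the interface contract explicitly. */
void copy_name(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);   /* pre-conditions */
    assert(dstsize > strlen(src));        /* room for the '\0' */
    strcpy(dst, src);
}
```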

*Program defensively.*

A useful technique is to add code to handle "can't happen" cases, situations where it is not logically possible for something to happen but (because of some failure elsewhere) it might anyway. As an example, a program processing grades might expect that there would be no negative or huge values but should check anyway. This is an example of defensive programming: making sure that a program protects itself against incorrect use or illegal data. Null pointers, out of range subscripts, division by zero, and other errors can be detected early and warned about or deflected.
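
A sketch of the grades example (names and the histogram layout are hypothetical): the check costs a couple of lines and turns silent corruption into a visible complaint:

```c
#include <stdio.h>

/* tally: count one grade defensively.  Grades outside 0..100
   "can't happen", but guard against bad data anyway.
   count[] must have 11 entries, one per decade 0..10. */
void tally(int grade, int count[])
{
    if (grade < 0 || grade > 100) {
        fprintf(stderr, "suspect grade %d ignored\n", grade);
        return;
    }
    count[grade / 10]++;
}
```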

*Check error returns.*

One often-overlooked defense is to check the error returns from library functions and system calls. Return values from input routines such as fread and fscanf should always be checked for errors, as should any file open call such as fopen. If a read or open fails, computation cannot proceed correctly. Checking the return code from output functions like fprintf or fwrite will catch the error that results from trying to write a file when there is no space left on the disk. It may be sufficient to check the return value from fclose, which returns EOF if any error occurred during any operation, and zero otherwise.

fp = fopen(outfile, "w");
if (fp == NULL) {           /* open failed? */
    /* report the error; output cannot proceed */
}
while (...)                 /* write output to outfile */
    fprintf(fp, ...);
if (fclose(fp) == EOF) {    /* any errors? */
    /* some output error occurred */
}

Output errors can be serious. If the file being written is the new version of a precious file, this check will save you from removing the old file if the new one was not written successfully.
**The effort of testing as you go is minimal and pays off handsomely. Thinking about testing as you write a program will lead to better code, because that's when you know best what the code should do.** If instead you wait until something breaks, you will probably have forgotten how the code works. Working under pressure, you will need to figure it out again, which takes time, and the fixes will be less thorough and more fragile because your refreshed understanding is likely to be incomplete.

6.2 Systematic Testing

It's important to test a program systematically so you know at each step what you are testing and what results you expect. You need to be orderly so you don't overlook anything, and you must keep records so you know how much you have done.

*Test incrementally.*

Testing should go hand in hand with program construction. Write part of a program, test it, add some more code, test that, and so on. If you have two packages that have been written and tested independently, test that they work together when you finally connect them.

*Test simple parts first.*

Tests should focus first on the simplest and most commonly executed features of a program; only when those are working properly should you move on.

*Know what output to expect.*

For all tests, it's necessary to know what the right answer is; if you don't, you're wasting your time. Most programs are more difficult to characterize—compilers (does the output properly translate the input?), numerical algorithms (is the answer within error tolerance?), graphics (are the pixels in the right places?), and so on. For these, it's especially important to validate the output by comparing it with known values.

If the program has an inverse, check that its application recovers the input. Encryption and decryption are inverses, so if you encrypt something and can't decrypt it, something is wrong. Similarly, lossless compression and expansion algorithms should be inverses. Programs that bundle files together should extract them unchanged. Sometimes there are multiple methods for inversion: check all combinations.
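
A minimal round-trip check might look like the sketch below; `encode` and `decode` are hypothetical stand-ins for any inverse pair (encrypt/decrypt, compress/expand):

```c
#include <assert.h>
#include <string.h>

/* assumed interfaces, not real library calls: each returns the
   number of bytes written to out. */
size_t encode(const char *in, size_t n, char *out, size_t outsize);
size_t decode(const char *in, size_t n, char *out, size_t outsize);

/* test_roundtrip: decode(encode(x)) must reproduce x exactly. */
void test_roundtrip(const char *input, size_t n)
{
    char enc[4096], dec[4096];
    size_t enc_n, dec_n;

    enc_n = encode(input, n, enc, sizeof(enc));
    dec_n = decode(enc, enc_n, dec, sizeof(dec));

    assert(dec_n == n);
    assert(memcmp(dec, input, n) == 0);
}
```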

*Verify conservation properties.*

Many programs preserve some property of their inputs. Tools like wc (count lines, words, and characters) and sum (compute a checksum) can verify that outputs are of the same size, have the same number of words, contain the same bytes in some order, and the like. Other programs compare files for identity (cmp) or report differences (diff). These programs or similar ones are readily available for most environments, and are well worth acquiring.

*Compare independent implementations.*

Independent implementations of a library or program should produce the same answers. For example, two compilers should produce programs that behave the same way on the same machine, at least in most situations.

*Measure test coverage.*

One goal of testing is to make sure that every statement of a program has been executed sometime during the sequence of tests; testing cannot be considered complete unless every line of the program has been exercised by at least one test.

There are commercial tools for measuring coverage. Profilers, often included as part of compiler suites, can compute a frequency count for each statement, which indicates the coverage achieved by a specific set of tests.

6.3 Test Automation

It's tedious and unreliable to do much testing by hand; proper testing involves lots of tests, lots of inputs, and lots of comparisons of outputs. Testing should therefore be done by programs, which don't get tired or careless. It's worth taking the time to write a script or trivial program that encapsulates all the tests, so a complete test suite can be run by (literally or figuratively) pushing a single button.

*Automate regression testing.*

The most basic form of automation is regression testing, which performs a sequence of tests that compare the new version of something with the previous version. When fixing problems, there's a natural tendency to check only that the fix works; it's easy to overlook the possibility that the fix broke something else. The intent of regression testing is to make sure that the behavior hasn't changed except in expected ways. A test script should usually run silently, producing output only if something unexpected occurs.
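
As a sketch (the file names are hypothetical), a driver can compare the output of the new version against the saved output of the previous one, staying silent unless they differ:

```c
#include <stdio.h>

/* same_file: return 1 if two streams contain identical bytes. */
static int same_file(FILE *a, FILE *b)
{
    int ca, cb;

    do {
        ca = getc(a);
        cb = getc(b);
        if (ca != cb)
            return 0;
    } while (ca != EOF);
    return 1;
}

int main(void)
{
    FILE *oldf = fopen("expected.out", "r");  /* previous version's output */
    FILE *newf = fopen("actual.out", "r");    /* new version's output */

    if (oldf == NULL || newf == NULL) {
        fprintf(stderr, "regress: cannot open output files\n");
        return 2;
    }
    if (!same_file(oldf, newf)) {
        fprintf(stderr, "regress: output differs from previous version\n");
        return 1;
    }
    return 0;   /* silence means nothing unexpected happened */
}
```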

There is an implicit assumption in regression testing that the previous version of the program computes the right answer. This must be carefully checked at the beginning of time, and the invariant scrupulously maintained. If an erroneous answer ever sneaks into a regression test, it's very hard to detect, and everything that depends on it will be wrong thereafter. **It's good practice to check the regression test itself periodically to make sure it is still valid.**

*Create self-contained tests.*

Self-contained tests that carry their own inputs and expected outputs provide a complement to regression tests.

An Awk program (what else?) converts each test into a complete Awk program, runs each input through it, and compares the actual output to the expected output; it reports only those cases where the answer is wrong.

Similar mechanisms are used to test the regular expression matching and substitution commands. A little language for writing tests makes it easy to create a lot of them; using a program to write a program to test a program has high leverage.

What should you do when you discover an error? If it was not found by an existing test, create a new test that does uncover the problem and verify the test by running it with the broken version of the code. The error may suggest further tests or a whole new class of things to check. Or perhaps it is possible to add defenses to the program that would catch the error internally.

**Never throw away a test.** It can help you decide whether a bug report is valid or describes something already fixed. Keep a record of bugs, changes, and fixes; it will help you identify old problems and fix new ones.

6.4 Test Scaffolds

To test a component in isolation, it's usually necessary to **create some kind of framework or scaffold** that provides enough support and interface to the rest of the system that the part under test will run.

It's easy to build scaffolds for testing mathematical functions, string functions, sort routines, and so on, since the scaffolding is likely to consist mostly of setting up input parameters, calling the functions to be tested, then checking the results. It's a bigger job to create scaffolding for testing a partly-completed program.

As in any testing method, test scaffolds need the correct answer to verify the operations they are testing. An important technique is to **compare a simple version that is believed correct against a new version that may be incorrect.** This can be done in stages, as the following example shows.
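
A sketch of such a scaffold (fast_memset is a hypothetical routine under test; simple_memset is the obvious version believed correct): exercise both over a range of sizes, offsets, and fill values, boundaries included, and insist the results agree byte for byte:

```c
#include <assert.h>
#include <string.h>

/* simple_memset: the obvious, slow version, believed correct. */
static void simple_memset(char *p, int c, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        p[i] = (char) c;
}

void fast_memset(char *p, int c, size_t n);   /* routine under test */

/* test_memset: compare the two versions over many boundary cases.
   Comparing the whole arrays also catches writes outside [off, off+n). */
void test_memset(void)
{
    char a[64], b[64];
    size_t n, off;
    int c;

    for (n = 0; n <= 16; n++)
        for (off = 0; off < 4; off++)
            for (c = -1; c <= 1; c++) {
                memset(a, 0x55, sizeof(a));   /* identical backdrops */
                memset(b, 0x55, sizeof(b));
                simple_memset(a + off, c, n);
                fast_memset(b + off, c, n);
                assert(memcmp(a, b, sizeof(a)) == 0);
            }
}
```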

6.5 Stress Tests

High volumes of machine-generated input are another effective testing technique. Higher volume in itself tends to break things, because very large inputs cause overflow of input buffers, arrays, and counters, and are effective at finding unchecked fixed-size storage within a program. People tend to avoid "impossible" cases like empty inputs or input that is out of order or out of range, and are unlikely to create very long names or huge data values. Computers, by contrast, produce output strictly according to their programs and have no idea of what to avoid.
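
A tiny generator is all it takes (a sketch; the program name and usage line are hypothetical): pipe its output into the program under test and watch for fixed-size buffers giving way:

```c
#include <stdio.h>
#include <stdlib.h>

/* gen: write one very long line of random letters to stdout.
   Usage (hypothetical): gen 10000000 | program-under-test */
int main(int argc, char *argv[])
{
    long n = (argc > 1) ? atol(argv[1]) : 1000000L;
    long i;

    srand(12345);   /* fixed seed, so failures are reproducible */
    for (i = 0; i < n; i++)
        putchar('a' + rand() % 26);
    putchar('\n');
    return 0;
}
```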

Random inputs (not necessarily legal) are another way to assault a program in the hope of breaking something. Such tests rely on detection by built-in checks and defenses in the program, since it may not be possible to verify that the program is producing the right output; the goal is more to provoke a crash or a "can't happen" than to uncover straightforward errors. Random input is also a good way to test that error-handling code works: with sensible input, most errors don't happen, and code to handle them doesn't get exercised; by nature, bugs tend to hide in such corners. At some point, though, this kind of testing reaches diminishing returns: it finds problems that are so unlikely to happen in real life they may not be worth fixing.

Some testing is based on explicitly malicious inputs. Security attacks often use big or illegal inputs that overwrite precious data; it is wise to look for such weak spots. A few standard library functions are vulnerable to this sort of attack. For instance, the standard library function **gets** provides no way to limit the size of an input line, so it should never be used; always use **fgets(buf, sizeof(buf), stdin)** instead. A bare **scanf("%s", buf)** doesn't limit the length of an input line either, so it should usually be used with an explicit length, such as **scanf("%20s", buf)**.

Any routine that might receive values from outside the program, directly or indirectly, should validate its input values before using them.

Conversion between types is another source of overflow: a value that fits comfortably in a wider type may be silently truncated when stored in a narrower one, and catching the error after the fact may not be good enough, since the bad value may already have propagated.
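
A minimal illustration (assuming a 16-bit short, where 100000 cannot be represented): check the range before converting rather than after the damage is done:

```c
#include <limits.h>
#include <stdio.h>

int main(void)
{
    long big = 100000L;
    short s;

    if (big < SHRT_MIN || big > SHRT_MAX) {   /* check before converting */
        fprintf(stderr, "value %ld won't fit in a short\n", big);
        return 1;
    }
    s = (short) big;   /* safe only because of the check above */
    printf("%d\n", s);
    return 0;
}
```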

On a more mundane level, binary inputs sometimes break programs that expect text inputs, especially if they assume that the input is in the 7-bit ASCII character set. It is instructive and sometimes sobering to pass binary input (such as a compiled program) to an unsuspecting program that expects text input.

Good test cases can often be used on a variety of programs. For example, **any program that reads files should be tested on an empty file. Any program that reads text should be tested on binary files. Any program that reads text lines should be tested on huge lines, empty lines, and input with no newlines at all. It's a good idea to keep a collection of such test files handy**, so you can test any program with them without having to recreate the tests. Or write a program to create test files on demand.

6.6 Tips for Testing

Programs should **check array bounds (if the language doesn't do it for them)**, but the checking code might not be tested if the array sizes are large compared to typical input.

Initialize arrays and variables with some distinctive value, rather than the usual default of zero; then if you access out of bounds or pick up an uninitialized variable, you are more likely to notice it. The constant 0xDEADBEEF is easy to recognize in a debugger; allocators sometimes use such values to help catch uninitialized data.
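
A sketch of the idea (the helper is hypothetical):

```c
#include <stddef.h>

#define JUNK 0xDEADBEEFu   /* distinctive, easy to spot in a debugger */

/* poison: fill an array with a recognizable pattern so that any
   read of "uninitialized" data stands out immediately. */
void poison(unsigned int a[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        a[i] = JUNK;
}
```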

Vary your test cases, especially when making small tests by hand; it's easy to get into a rut by always testing the same thing, and you may not notice that something else has broken.

Don't keep on implementing new features or even testing existing ones if there are known bugs; they could be affecting the test results.

**Test output should include all input parameter settings, so the tests can be reproduced exactly.** If your program uses random numbers, have a way to set and print the starting seed, independent of whether the tests themselves are random. Make sure that test inputs and corresponding outputs are properly identified, so they can be understood and reproduced.
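
A sketch of the seed convention (the argument handling is hypothetical): accept a seed on the command line, otherwise pick one, and always print it with the test output:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char *argv[])
{
    unsigned seed = (argc > 1) ? (unsigned) strtoul(argv[1], NULL, 10)
                               : (unsigned) time(NULL);

    printf("seed: %u\n", seed);   /* record it alongside the results */
    srand(seed);

    /* ... run the randomized tests here ... */
    return 0;
}
```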

It's also wise to provide ways to make the amount and type of output controllable when a program is run; extra output can help during testing.

Test on multiple machines, compilers, and operating systems. Each combination potentially reveals errors that won't be seen on others, such as dependencies on byte order, sizes of integers, treatment of null pointers, handling of carriage return and newline, and specific properties of libraries and header files. Testing on multiple machines also uncovers problems in gathering the components of a program for shipment, and may reveal unwitting dependencies on the development environment.

6.7 Who Does the Testing?

Testing that is done by the implementer or someone else with access to the source code is sometimes called **white box testing**. (The term is a weak analogy to black box testing, where the tester does not know how the component is implemented; "clear box" might be more evocative.)

Black box testing means that the tester has no knowledge of or access to the innards of the code. It finds different kinds of errors, because the tester has different assumptions about where to look. Boundary conditions are a good place to begin black box testing; high-volume, perverse, and illegal inputs are good follow-ons. Of course you should also test the ordinary "middle of the road" or conventional uses of the program to verify basic functionality. Real users are the next step.

Some testing can be done by scripts (whose properties depend on language, environment, and the like). Interactive programs should be controllable from scripts that simulate user behaviors so they can be tested by programs. One technique is to capture the actions of real users and replay them; another is to create a scripting language that describes sequences and timing of events.

Finally, give some thought to how to **test the tests themselves**. A regression suite infected by an error will cause trouble for the rest of time, and the results of a set of tests will not mean much if the tests themselves are flawed.

6.9 Summary

The better you write your code originally, the fewer bugs it will have and the more confident you can be that your testing has been thorough. Testing boundary conditions as you write is an effective way to eliminate a lot of silly little bugs. **Systematic testing** tries to probe at potential trouble spots in an orderly way; again, failures are most commonly found at boundaries, which can be explored by hand or by program. As much as possible, it is desirable to automate testing, since machines don't make mistakes or get tired or fool themselves into thinking that something is working when it isn't. **Regression tests** check that the program still produces the same answers as it used to. **Testing after each small change** is a good technique for localizing the source of any problem, because new bugs are most likely to occur in new code.
The single most important rule of testing is to do it.
