Watermarking and Tamper-Proofing for Software Protection


ABSTRACT
We identify three types of attack on the intellectual property contained in software and three corresponding technical defenses.
A defense against reverse engineering is obfuscation, a process that renders software unintelligible but still functional. A defense against software piracy is watermarking, a process that makes it possible to determine the origin of software. A defense against tampering is tamper-proofing, so that unauthorized modifications to software will result in nonfunctional code. We briefly survey the available technology for each type of defense.
What is watermarking?
• Originally used to identify paper quality
• Anti-counterfeiting of paper money
• Extended to other forms of hidden information.
Definitions:
Work: specific song, video, image, text, etc.
Watermarking: practice of imperceptibly altering a work to embed a message about that work.
Watermarking goals:
• Verify the owner of a digital image
• Detect forgeries of an original image
• Identify illegal copies of the image
• Prevent unauthorized distribution

1 BACKGROUND–MALICIOUS CLIENTS VS. MALICIOUS HOSTS
Until recently, most computer security research was concerned with protecting the integrity of a benign host and its data from attacks from malicious client programs (Fig. 1a). This assumption of a benign host is present in Neumann's influential taxonomy of computer-related risks, in which the job of a security expert is to design and administer computer systems that will fulfill certain stringent security requirements most of the time.
 To defend itself and its data against a malicious client, a host will typically restrict the actions that the client is allowed to perform.
A recent surge of interest in mobile agent systems has caused researchers to focus attention on a fundamentally different view of security. Fig. 1b illustrates benign client code being threatened by the host on which it has been downloaded or installed. A malicious host attack typically takes the form of an intellectual property violation. The client code may contain trade secrets or copyrighted material, and a violation of the client's integrity will cause financial losses to its owner. We will next consider three malicious-host attack scenarios.

 
1.1 Malicious Host Attacks
Piracy is a major concern for anyone who sells software. Our goal in this paper is to make piracy more difficult. We note that software piracy is socially acceptable in settings that encourage a belief in insiders' entitlement, in price discrimination, in cooperation being more important than copyright, or in traditional Confucian ethics.
A second threat is malicious reverse engineering. It has recently become more of a concern because programs are increasingly distributed in easily decompilable formats rather than native binary code.
A related threat is software tampering. Many mobile agents and e-commerce application programs must, by their very nature, contain encryption keys or other secret information. Pirates who are able to extract, modify, or otherwise tamper with this information can cause significant financial losses to the intellectual property owner.
These three types of attack (software piracy, malicious reverse engineering, and tampering) are illustrated in Fig. 2.



·    In Fig. 2a, Bob makes copies of an application he has legally purchased from Alice and illegally sells them to unsuspecting customers.
·    In Fig. 2b, Bob decompiles and reverse engineers an application he has bought from Alice in order to reuse one of her modules in his own program.
·    In Fig. 2c, finally, Bob receives a digital container (also known as Cryptolope and DigiBox) from Alice, consisting of some digital media content as well as code that transfers a certain amount of electronic money to Alice's account whenever the media is played. Bob can attempt to tamper with the digital container either to modify the amount that he has to pay or to extract the media content itself. In the latter case, Bob can continue to enjoy the content for free or even resell it to a third party.
2 WATERMARKING
Watermarking embeds a secret message into a cover message. In media watermarking, the secret is usually a copyright notice and the cover a digital image or an audio or video production. Watermarking an object discourages intellectual property theft or, when such theft has occurred, allows us to prove ownership.
The software watermarking problem can be described as follows:
Embed a structure W (the watermark) into a program P such that:
·    W can be reliably located and extracted from P, even after P has been attacked (the embedding is resilient).
·    W is large (the embedding has a high data rate).
·    Embedding W into P does not adversely affect the performance of P (the embedding is cheap).
·    Embedding W into P does not change any statistical properties of P (the embedding is stealthy).
Any software watermarking technique will exhibit a trade-off between resilience, data rate, cost, and stealth. It should be noted that there are two possible interpretations of stealth: static stealth and dynamic stealth. A watermark is statically stealthy if a static analysis reveals no statistical differences between the original and the watermarked program. Similarly, the watermark is dynamically stealthy if an execution trace of the program reveals no differences between the original and the watermarked program.



Fig. 3

Assume the following scenario: Alice watermarks a program P with watermark W and key K and then sells P to Bob. Before Bob can sell P on to Douglas, he must ensure that the watermark has been rendered useless, or else Alice will be able to prove that her program has been stolen.
Fig. 3 illustrates the kinds of de-watermarking attacks available to Bob:
·    In Fig. 3a, Bob launches an additive attack by adding his own watermark W1 to Alice's watermarked program P'. This is an effective attack if it is impossible to detect that Alice's mark temporally precedes Bob's.
·    In Fig. 3b, Bob launches a distortive attack on Alice's watermarked program P'. A distortive attack applies a sequence of semantics-preserving transformations uniformly over the entire program, in the hope that (a) the distorted watermark W' can no longer be recognized and (b) the distorted program P'' does not become so degraded (i.e., slow or large) that it no longer has any value to Bob.
·    In Fig. 3c, Bob launches a collusive attack: he buys several copies of Alice's program P, each with a different fingerprint (serial number) F. By comparing the different copies of the program, Bob is able to locate the fingerprints and can then easily remove them.
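The comparison step of the attack in Fig. 3c can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: each copy is modeled as a byte string of equal length, and the fingerprint is assumed to sit at a fixed offset. The names make_copy and locate_fingerprint are hypothetical and do not come from any real tool.

```python
def make_copy(program: bytes, fingerprint: bytes, offset: int) -> bytes:
    """Hypothetical vendor step: overwrite a fixed region with a serial number."""
    return program[:offset] + fingerprint + program[offset + len(fingerprint):]


def locate_fingerprint(copies):
    """Return the byte offsets at which the purchased copies disagree."""
    if len({len(c) for c in copies}) != 1:
        raise ValueError("this sketch assumes copies of equal length")
    # Any position where the copies differ must belong to the fingerprint,
    # because everything else is identical program code.
    return [i for i, column in enumerate(zip(*copies)) if len(set(column)) > 1]


base = bytes(range(32))                 # stand-in for Alice's program P
copy_a = make_copy(base, b"AAAA", 8)    # three copies, three serial numbers
copy_b = make_copy(base, b"BBBB", 8)
copy_c = make_copy(base, b"CCCC", 8)

suspect_offsets = locate_fingerprint([copy_a, copy_b, copy_c])
print(suspect_offsets)
```

A fingerprinting scheme that differs between copies in only a few places is therefore easy to locate; resisting this attack requires that copies differ in many places that are unrelated to the mark.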

3.2 Static Watermarking Techniques
Software watermarks come in two flavors, static and dynamic. Static watermarks are stored in the application executable itself, whereas dynamic watermarks are constructed at runtime and stored in the dynamic state of the program. While static watermarks have been around for a long time, dynamic marks were only introduced recently.
The techniques of Moskowitz and Cooperman and of Davidson and Myhrvold are representative of typical static watermarks. Moskowitz and Cooperman describe a static data watermarking method in which the watermark is embedded in an image using one of the many media watermarking algorithms. This image is then stored in the static data section of the program. Davidson and Myhrvold describe a static code watermark in which a fingerprint is encoded in the basic block sequence of a program's control flow graphs.
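The idea of encoding a fingerprint in a basic block sequence can be illustrated by mapping an integer to a permutation of blocks. The sketch below is not the Davidson and Myhrvold method itself. It assumes, for illustration, that the blocks can be reordered freely (for example, by inserting jumps to preserve semantics) and that the extractor knows the canonical order. The function names embed and extract are hypothetical.

```python
from math import factorial


def embed(blocks, fingerprint):
    """Reorder blocks (given in canonical order) so that the order encodes fingerprint."""
    n = len(blocks)
    if not 0 <= fingerprint < factorial(n):
        raise ValueError("fingerprint does not fit in this many blocks")
    remaining = list(blocks)
    ordered = []
    # Write the fingerprint in the factorial number system; each digit
    # selects which of the remaining blocks is emitted next.
    for i in range(n, 0, -1):
        index, fingerprint = divmod(fingerprint, factorial(i - 1))
        ordered.append(remaining.pop(index))
    return ordered


def extract(ordered, canonical):
    """Recover the fingerprint from the observed block order."""
    remaining = list(canonical)
    n = len(ordered)
    value = 0
    for position, block in enumerate(ordered):
        index = remaining.index(block)
        value += index * factorial(n - 1 - position)
        remaining.pop(index)
    return value


canonical = ["B0", "B1", "B2", "B3"]
marked = embed(canonical, 17)
print(marked, extract(marked, canonical))
```

With n reorderable blocks this encoding holds at most log2(n!) bits, and any attacker who reorders the blocks again (a distortive attack) destroys the mark, which illustrates why such static code marks are not very resilient.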
To detect the watermark of Venkatesan et al., the extractor needs to
A. reconstruct the control flow graph of the watermarked program,
B. identify which of the nodes of the control flow graph belong to the watermark graph (or, at least, identify most of these nodes), and
C. reconstruct the watermark graph itself.


3.3 Dynamic Watermarking Techniques
There are three kinds of dynamic watermarks. In each case, the mark is recognized by running the watermarked program with a predetermined input sequence I. This highly unusual input makes the application enter a state which represents the watermark.
The three techniques are:
Easter Egg Watermarks. The defining characteristic of an Easter Egg watermark is that, when the special input sequence is entered, it performs some action that is immediately perceptible by the user.
Execution Trace Watermarks. Execution Trace watermarks produce no special output. Instead, the watermark is embedded within the trace (either instructions or addresses, or both) of the program as it is being run with the special input I.
Data Structure Watermarks. Data Structure watermarks do not generate any output. Rather, the watermark becomes embedded within the state of the program as it is being run with the special input I.
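A data structure watermark can be sketched as a program that, on the special input I, silently builds a structure whose contents encode the mark. The example below is a toy under stated assumptions: SECRET_INPUT, WATERMARK, Application, and recognize are all hypothetical names, and the mark appears here as a literal constant only for readability. A real scheme would have to construct it in a far less obvious way, since a literal is itself a static mark.

```python
SECRET_INPUT = ["up", "up", "down", "left"]   # hypothetical special input I
WATERMARK = 0b101101                          # hypothetical mark (6 bits)


class Node:
    def __init__(self, bit, nxt=None):
        self.bit = bit
        self.next = nxt


class Application:
    def __init__(self):
        self.history = []
        self.hidden = None  # looks like ordinary program state

    def handle(self, event):
        self.history.append(event)
        if self.history[-len(SECRET_INPUT):] == SECRET_INPUT:
            self._build_mark()
        return "ok"  # the visible output is the same for every input

    def _build_mark(self):
        # Build a linked list whose bits, read from the head, spell the mark.
        head, value = None, WATERMARK
        while value:
            head = Node(value & 1, head)
            value >>= 1
        self.hidden = head


def recognize(app):
    """Recognizer: inspect the program state after the run, not its output."""
    if app.hidden is None:
        return None
    value, node = 0, app.hidden
    while node is not None:
        value = (value << 1) | node.bit
        node = node.next
    return value


app = Application()
outputs = [app.handle(event) for event in SECRET_INPUT]
print(outputs, recognize(app))
```

Unlike an Easter Egg, nothing perceptible happens when the input is entered; the mark is visible only to someone who knows where in the program state to look.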
4 TAMPER-PROOFING
There are many situations where we would like to stop anyone from executing our program if it has been altered in any way. For example, a program P should not be allowed to run if
1) P is watermarked and the code that builds the mark has been altered,
2) a virus has been attached to P, or
3) P is an e-commerce application and the security-sensitive part of its code has been modified.
To prevent such tampering attacks, we can add tamper-proofing code to our program. This code should
a) detect if the program has been altered and
b) cause the program to fail when tampering is evident.
Ideally, detection and failure should be widely dispersed in time and space to confuse a potential attacker. Simple-minded tamper-proofing code such as if (tampered_with()) i = 1/0 is unacceptable, for example, because it is easily defeated by locating the point of failure and then reversing the test of the detection code.
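One way to picture the separation of detection and failure is the following Python sketch. It is an illustration only, built on assumptions: the checksum function, the Program class, and its methods are hypothetical, and a simple additive checksum stands in for whatever detection mechanism is actually used. The detection result is folded into ordinary-looking data instead of being tested by a visible branch, and the failure occurs later, in unrelated code.

```python
def checksum(code: bytes) -> int:
    """Toy checksum standing in for a real detection mechanism."""
    return sum(code) % 65521


class Program:
    def __init__(self, code: bytes, expected: int):
        self.code = code
        # Detection site: no "if tampered" branch for an attacker to reverse.
        # slack is 0 for an unmodified program and nonzero otherwise.
        self.slack = checksum(code) ^ expected

    def step_one(self):
        return "started"  # runs normally even when the code was altered

    def step_two(self):
        buffer = [0] * 8
        # Failure site: far from the detection site in both time and space.
        # A nonzero slack pushes the index out of range.
        return buffer[7 + self.slack]


code = b"\x10\x20\x30\x40"
good = Program(code, checksum(code))
bad = Program(code + b"\x01", checksum(code))   # simulated tampering

print(good.step_one(), good.step_two())
try:
    bad.step_one()
    bad.step_two()
    failed = False
except IndexError:
    failed = True
print(failed)
```

The crash in step_two gives the attacker no direct pointer to the constructor where the check was made, which is the property the simple-minded if-test lacks.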
There are three principal ways to detect tampering:
1. We can examine the executable program itself to see if it is identical to the original one. To speed up the test, a message-digest algorithm can be used.
2. We can examine the validity of intermediate results produced by the program. This technique is known as program (or result) checking and has been touted as an alternative to program verification and testing.
3. We can encrypt the executable, thereby preventing anyone from modifying it successfully unless they are able to decrypt it. The decryption routines must be protected from reverse-engineering by hardware means, by obfuscation, or both.
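The first two detection methods can be sketched briefly in Python. This is a minimal illustration, assuming SHA-256 as the message-digest algorithm and sorting as the computation whose result is checked; the function names digest, is_untampered, and check_sort_result are hypothetical.

```python
import hashlib
from collections import Counter


def digest(code: bytes) -> str:
    """Message digest of the executable image (SHA-256 chosen for illustration)."""
    return hashlib.sha256(code).hexdigest()


def is_untampered(code: bytes, expected_digest: str) -> bool:
    """Method 1: compare the program against a digest of the original."""
    return digest(code) == expected_digest


def check_sort_result(inputs, outputs) -> bool:
    """Method 2: result checking for a sort routine.

    The output must be in order and must be a rearrangement of the input.
    Checking this is cheaper and simpler than verifying the sort itself.
    """
    in_order = all(a <= b for a, b in zip(outputs, outputs[1:]))
    same_items = Counter(inputs) == Counter(outputs)
    return in_order and same_items


original = b"\x55\x48\x89\xe5\xc3"          # stand-in for executable bytes
expected = digest(original)

print(is_untampered(original, expected))             # unmodified program
print(is_untampered(original + b"\x90", expected))   # one byte appended
print(check_sort_result([3, 1, 2], [1, 2, 3]))       # valid result
print(check_sort_result([3, 1, 2], [1, 2, 2]))       # corrupted result
```

In both cases the stored reference (the expected digest or the checking code) is itself a target, so it needs the same protection as the rest of the program.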
Tamper-proofing of type-safe distribution formats such as Java bytecode is more difficult than tamper-proofing assembly code.
4.1 Tamper-Proofing Viruses
Virus writers employ many obfuscation-like techniques to protect a virus from detection and tamper-proofing-like techniques to protect it from being easily removed from the infected host program. So-called armored viruses add extra code to a virus to make it difficult to analyze. Polymorphic viruses generate new versions of themselves as part of the infection process.