SECURITY IN DISTRIBUTED SYSTEMS
Security is an important and difficult design issue in distributed systems, and care must be taken that it holds seamlessly across the entire system. Security design usually concentrates on two things: secure communication between users or processes over a secure channel, and authorization, which uses access control mechanisms to protect resources. Security management provides the mechanisms for implementing secure channels and access control, using cryptography and other principles.
Securing a computer system means protecting it against four major security threats: interception, interruption, modification and fabrication.
Interception means gaining unauthorized access to services or data. Examples are eavesdropping on a communication between two parties, and copying data from a user's private directory.
Interruption means making a service or data unavailable or unusable. Examples are a denial-of-service attack that makes a service inaccessible, and the malicious corruption or destruction of files.
Modification means changing data or tampering with a service so that it no longer performs according to its specification. Examples are intercepting and then altering transmitted data, tampering with database entries, and changing a program so that it secretly logs its user's activities.
Fabrication means generating additional data or activity that would not normally exist. An example is adding an entry to a password file or database. Fabrication may also take the form of replaying earlier activities or messages, making it possible to break into a system.
Not every system needs protection against all of these threats. Based on the requirements, a security policy is laid out and then enforced with suitable security mechanisms. The prominent security mechanisms are encryption, authentication, authorization and auditing.
Encryption is a means of implementing confidentiality and integrity by transforming data so that an attacker cannot understand it. Authentication verifies the identity of entities such as a user, a client or a server. Authorization checks a request against the access control permissions on an asset, for example the permissions to read, write, add or remove a database record. Auditing tools trace the activities of clients by means of logs. Audit logs are useful for analyzing a security breach and for taking measures against intruders.
The following design issues are to be considered while implementing security services.
1) Focus of control: The protection of a distributed application can be approached in three different ways. The first is to focus on the integrity of the data associated with the application. The second is to focus on authentication (who is modifying the data), authorization (which operations may be carried out) and the related access control mechanisms. The third is to focus on the roles of the users (administrator, manager, user, etc.), based on which access to a resource is either granted or denied. The three approaches are shown below.
2) Layering of security mechanisms: The issue here is to decide at which layer or level the security mechanism is to be placed. The layered organization of a distributed system combined with the layered organization of the network can be shown as follows. Security mechanisms are usually placed in the middleware layer in distributed systems. These mechanisms may in turn rely on the security mechanisms at other lower layers.
3) Distribution of security mechanisms: The Trusted Computing Base (TCB) is the set of all security mechanisms required to enforce a security policy in a distributed system. To keep the TCB down to a relatively small number of hardware and software components, security services can be separated from other services and distributed among different machines according to the amount of security required. Critical services are placed separately on servers with trusted operating systems, while clients and their applications are placed on untrusted machines. By protecting the servers that host critical services from external attacks, the overall security of the distributed system can be increased. The Reduced Interfaces for Secure System Components (RISSC) approach is an example of this, and is shown below.
4) Simplicity: Simple, trusted mechanisms that can be easily understood are desirable, but they are very difficult to achieve in complex applications such as digital payments, in which multiple parties must communicate to make a payment. Simplicity also plays an important role in deciding the layer in which security mechanisms are placed.
Fundamentally, cryptography is the art and science of keeping messages secure. A message is usually a plaintext (also called cleartext). The process of disguising a message is called encryption, and the encrypted message is called ciphertext. The process of converting ciphertext back into plaintext is called decryption. A cipher is a cryptographic algorithm, usually a mathematical function used for encryption and decryption. Cryptographic methods are parameterized by keys. A secret key is one of a large number of values used for encryption and decryption, and the range of possible key values is called the keyspace. The art and science of breaking ciphertext is called cryptanalysis. The branch of mathematics that encompasses both cryptography and cryptanalysis is called cryptology. The following notation is used to represent this terminology:
E_K(P) = C and D_K(C) = P
where E_K denotes encryption using a key K, D_K denotes decryption using a key K, C denotes the ciphertext and P denotes the plaintext. The following figure illustrates the cryptographic terminology and the types of attacks against which encryption helps.
In the first type of attack, an intruder eavesdrops on the message by intercepting it, without giving any clue to either the sender or the receiver. Encryption prevents the intruder from understanding an intercepted message, although it would be safer still if the intruder could not even tell that a message is being transferred. In the second case, encryption helps to prevent the attacker from tampering with the message, because a modified ciphertext will not decrypt to anything meaningful. In the third case, the attacker tries to insert a message of his own and make the receiver believe that it came from the actual sender; without the key, he cannot produce a ciphertext that decrypts properly.
Types of cryptographic systems:
Cryptographic systems are of two basic types, depending on the way the keys are used: symmetric cryptosystems and asymmetric cryptosystems.
If the same key is used for both encryption and decryption of a message, the system is called a symmetric cryptosystem. This can be shown as:
P = D_K(E_K(P))
Both sender and receiver share the same key, which is kept secret. Symmetric cryptosystems are therefore also called shared-key or secret-key systems.
If, on the other hand, a unique pair of different keys is used, one for encryption and the other for decryption, the system is called an asymmetric cryptosystem. This can be shown as:
P = D_Kd(E_Ke(P))
One of the keys in the pair is kept private and the other is made public, which is why asymmetric cryptosystems are also called public-key systems. Which key of the pair is public depends on how the keys are used (i.e. either the encryption key or the decryption key may be the public one). For example, a sender A may use B's public key to encrypt a message, which B can then decrypt using the corresponding private key. Alternatively, A may encrypt with her own private key, and upon receiving the message B can decrypt it using A's public key. The first case ensures that only the intended receiver B can read the message; the second ensures that the message really came from A.
The mathematical function used for encryption (the cipher) and its keys should have certain properties, similar to those required of hash functions. Firstly, the function should be one-way: it should be computationally infeasible to find the plaintext P from a known ciphertext C and the encryption function alone. That is, without the key the function must not be efficiently invertible. It should also be computationally infeasible to find the key K when the plaintext P and the associated ciphertext C are known.
Another important property is collision resistance: given a plaintext P and a key K, it should be computationally infeasible to find another key K' such that E_K(P) = E_K'(P). The following three important algorithms aid in understanding the cryptography concepts.
1. Data Encryption Standard (DES): This algorithm is used for symmetric cryptosystems. It is designed to operate on 64-bit blocks of data, so the given message is divided into blocks of 64 bits. Each 64-bit block is subjected to an initial permutation and then transformed into ciphertext in 16 rounds. In each round a different 48-bit key is used for encryption. All of these round keys are generated from a 56-bit master key. An inverse permutation is applied after the 16 rounds to output the final ciphertext block. This is illustrated in the figure below.
In each encryption round i, the 64-bit block output by the previous round i-1 is taken as input. These 64 bits are split into a left part L_{i-1} and a right part R_{i-1}, each containing 32 bits. The right part is used as the left part in the next round, i.e., L_i = R_{i-1}. A mangle function f is applied to the same 32-bit R_{i-1} and the 48-bit K_i, producing a 32-bit block. An exclusive OR (XOR) is then carried out between L_{i-1} and the result of f to produce the right part for the next round, i.e., R_i = L_{i-1} XOR f(R_{i-1}, K_i). This is shown below.
The mangle function f first expands R_{i-1} to a 48-bit block and XORs it with K_i. The resulting 48-bit block is then partitioned into 8 chunks of 6 bits each. Each chunk is fed into a different S-box, which replaces the 6-bit chunk with a 4-bit chunk. The eight 4-bit chunks are combined into a 32-bit value and permuted again to give the 32-bit output of f, which is then XORed with L_{i-1} to produce R_i.
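The round structure described above (L_i = R_{i-1}, R_i = L_{i-1} XOR f(R_{i-1}, K_i)) can be tried out with a minimal Python sketch. This is not DES: the function `toy_f` and the list `ROUND_KEYS` are hypothetical stand-ins for the real mangle function and key schedule, and the initial and inverse permutations are omitted. The sketch only illustrates why such a round can be undone even though f itself is never inverted.

```python
MASK32 = 0xFFFFFFFF

def toy_f(right: int, round_key: int) -> int:
    """Hypothetical stand-in for the DES mangle function f.

    Real DES uses expansion, S-boxes and a permutation here.
    """
    return ((right * 0x9E3779B1) ^ round_key) & MASK32

def feistel_encrypt(block: int, round_keys: list[int]) -> int:
    """Apply one Feistel round per key to a 64-bit block."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in round_keys:
        # L_i = R_{i-1};  R_i = L_{i-1} XOR f(R_{i-1}, K_i)
        left, right = right, left ^ toy_f(right, k)
    return (left << 32) | right

def feistel_decrypt(block: int, round_keys: list[int]) -> int:
    """Undo the rounds by applying the round keys in reverse order."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in reversed(round_keys):
        # R_{i-1} = L_i;  L_{i-1} = R_i XOR f(L_i, K_i)
        left, right = right ^ toy_f(left, k), left
    return (left << 32) | right

# Sixteen arbitrary 48-bit values standing in for the DES round keys.
ROUND_KEYS = [(0x0123456789AB + 0x1111 * i) & 0xFFFFFFFFFFFF for i in range(16)]

plaintext = 0x0123456789ABCDEF
ciphertext = feistel_encrypt(plaintext, ROUND_KEYS)
recovered = feistel_decrypt(ciphertext, ROUND_KEYS)
```

Because each round only XORs one half with a value computed from the other half, decryption reuses f unchanged and simply walks the round keys backwards.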
To produce a 48-bit key for each round, the 56-bit master key is first permuted and divided into two 28-bit halves. For each round, each half is rotated one or two bits to the left, and 24 bits are then extracted from it. The two extracted 24-bit parts are combined to construct the required 48-bit round key. This is shown below.
Though the algorithm is difficult to break using analytical methods, it has been cracked by brute-force attack, that is, by searching the keyspace for the right key. Existing solutions to this problem include the use of 128-bit blocks and 128-bit keys, and applying DES three times (3DES) with two keys, as shown below.
C = E_DES(K1, K2, P) = E_DES(K1, D_DES(K2, E_DES(K1, P)))
2. Rivest-Shamir-Adleman (RSA) algorithm: This algorithm is very widely used for public-key cryptosystems. Its security rests on the fact that no efficient method is known for computing the prime factors of very large numbers (greater than 10^100). The following method is used to find a key pair (e, d).
   1. Choose two very large prime numbers p and q.
   2. Compute n = p × q and z = (p-1) × (q-1).
   3. Choose a number d that is relatively prime to z, that is, d has no common factors with z.
   4. Compute the number e that satisfies the equation e × d = 1 mod z. That is, e × d is the smallest element divisible by d in the series 1, z+1, 2z+1, 3z+1, ....
The given plaintext is divided into equal blocks of k bits such that the numerical value of a block is always less than n (i.e., 2^k < n). In practice the value of k is in the range of 512 to 1024. The function for encrypting a single block of plaintext M is
E(e, n, M) = M^e mod n = C
and the function for decrypting a block of ciphertext C to produce the original plaintext is
D(d, n, C) = C^d mod n = M
E and D are mutual inverses for all values of M in the range 0 ≤ M < n. That is,
E(D(M)) = D(E(M)) = M
The key for the encryption function is K_e = <e, n> and the key for the decryption function is K_d = <d, n>.
3. Message-Digest algorithm 5 (MD5): MD5 is a widely used cryptographic hash function for computing a fixed-length 128-bit message digest from a binary input string of arbitrary length.
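As a brief aside on the RSA steps (item 2 above) before the MD5 details, the key-generation method can be followed with deliberately tiny numbers. The sketch below is only an illustration under stated assumptions: the primes 61 and 53 and the choice d = 2753 are example values not taken from the text, the helper names are hypothetical, and numbers this small offer no security at all.

```python
from math import gcd

def make_keypair(p: int, q: int, d: int):
    """Follow steps 1-4: return (K_e, K_d) = (<e, n>, <d, n>)."""
    n = p * q
    z = (p - 1) * (q - 1)
    if gcd(d, z) != 1:
        raise ValueError("d must be relatively prime to z")
    # Modular inverse: e * d = 1 mod z (three-argument pow with -1 needs Python 3.8+).
    e = pow(d, -1, z)
    return (e, n), (d, n)

def encrypt(public_key, m: int) -> int:
    e, n = public_key
    if not 0 <= m < n:
        raise ValueError("block value must be less than n")
    return pow(m, e, n)      # C = M^e mod n

def decrypt(private_key, c: int) -> int:
    d, n = private_key
    return pow(c, d, n)      # M = C^d mod n

# Toy example values (assumptions, far too small for real use).
public_key, private_key = make_keypair(p=61, q=53, d=2753)
message = 65
cipher = encrypt(public_key, message)
plain = decrypt(private_key, cipher)
```

With these values n = 3233 and z = 3120, and the computed e satisfies e × d = 1 mod z, so decrypting the ciphertext returns the original block.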
The input string is first padded to a total length of 448 bits modulo 512 (that is, the message is extended so that it is just 64 bits shy of being a multiple of 512 bits long), and then the length of the original bit string is appended as a 64-bit integer. Thus, the input is converted to a series of 512-bit blocks. The algorithm works in k phases, where k is the number of 512-bit blocks, and begins with a constant 128-bit value. During each phase a 128-bit digest is computed from a 512-bit block of the padded message and the 128-bit digest computed in the previous phase. A phase consists of four rounds of computation. Each round uses one of the following functions:
F(x,y,z) = (x AND y) OR ((NOT x) AND z)
G(x,y,z) = (x AND z) OR (y AND (NOT z))
H(x,y,z) = x XOR y XOR z
I(x,y,z) = y XOR (x OR (NOT z))
Each of these functions operates on 32-bit variables x, y and z. The outline of the algorithm is shown in the figure below.
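The padding rule and the four round functions can be written down directly in Python. The following is a minimal sketch rather than an MD5 implementation: `padded_length_bits` is a hypothetical helper name, and it assumes (as the MD5 specification requires, though the text above does not spell it out) that padding is always applied and starts with a single 1 bit. The standard library's `hashlib.md5` is used only to confirm the 128-bit digest size.

```python
import hashlib

MASK32 = 0xFFFFFFFF  # keep every result within 32 bits

def F(x, y, z): return ((x & y) | (~x & z)) & MASK32
def G(x, y, z): return ((x & z) | (y & ~z)) & MASK32
def H(x, y, z): return (x ^ y ^ z) & MASK32
def I(x, y, z): return (y ^ (x | ~z)) & MASK32

def padded_length_bits(message_bits: int) -> int:
    """Length after padding to 448 mod 512 and appending the 64-bit length."""
    # At least one padding bit is always added (a 1 bit, then 0 bits).
    pad = (448 - (message_bits + 1)) % 512 + 1
    return message_bits + pad + 64

def phase_count(message_bits: int) -> int:
    """Number k of 512-bit blocks, i.e. the number of phases."""
    return padded_length_bits(message_bits) // 512

digest = hashlib.md5(b"abc").digest()  # 16 bytes = 128 bits
```

For example, a 448-bit message already has the "right" length modulo 512, yet it still receives a full 512 bits of padding and therefore needs two phases. MD5 is shown here because the text describes it; it is no longer considered collision resistant.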
Security issues in distributed systems:
Securing a distributed system mainly involves dealing with two major issues:
1) Securing the communication between two parties, which requires authenticating them and ensuring message integrity and confidentiality.
2) Authorization, for controlling access to resources.
To secure the communication between two parties, a secure channel is established between them. A secure channel protects senders and receivers against interception, modification and fabrication of messages. Protection against interception is achieved by ensuring confidentiality: the secure channel ensures that its messages cannot be eavesdropped on by intruders. Protection against modification and fabrication is achieved through protocols for mutual authentication and message integrity. Authentication and message integrity are interrelated: it must be ensured both that a message was sent by the right party and that it was received in the form in which it was sent.
The following protocols exist for authentication in distributed systems. In these protocols one party challenges the other, and the response can be correct only if the other party knows the shared secret key. Such solutions are therefore called challenge-response protocols.
a) Authentication based on a shared secret key: The protocol proceeds as follows.
1. Alice sends her identity to Bob, indicating that she wants to set up a communication channel.
2. Bob sends a challenge (a random number R_B) to Alice.
3. Alice encrypts the challenge with the shared secret key K_{A,B} and returns it to Bob. If Bob, decrypting the response with K_{A,B}, recovers R_B, he has confirmation that he is talking with Alice. Alice, however, has not yet confirmed that she is talking to Bob.
4. Alice sends a challenge R_A to Bob.
5. Bob responds to the challenge by encrypting it with K_{A,B}. If Alice, decrypting the response with K_{A,B}, recovers R_A, she has confirmation that she is talking with Bob.
This protocol is illustrated in the figure below.
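The five-message exchange above can be simulated in a few lines of Python. This is a sketch under a clearly stated substitution: the standard library has no symmetric cipher, so the response K_{A,B}(R) is modelled as an HMAC-SHA256 of the challenge under the shared key instead of an encryption of it. The function names are hypothetical, and both parties run inside one function purely for illustration.

```python
import hashlib
import hmac
import secrets

def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Stand-in for K_{A,B}(R): a keyed MAC over the challenge (assumption)."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()

def check(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verify a response using the verifier's own copy of the key."""
    return hmac.compare_digest(respond(shared_key, challenge), response)

def mutual_authentication(alice_key: bytes, bob_key: bytes) -> bool:
    """Run messages 1-5; succeeds only if both sides hold the same key."""
    # 1. Alice -> Bob: her identity (not modelled further here).
    r_b = secrets.token_bytes(16)            # 2. Bob -> Alice: challenge R_B
    from_alice = respond(alice_key, r_b)     # 3. Alice -> Bob: K_{A,B}(R_B)
    if not check(bob_key, r_b, from_alice):
        return False                         # Bob rejects Alice
    r_a = secrets.token_bytes(16)            # 4. Alice -> Bob: challenge R_A
    from_bob = respond(bob_key, r_a)         # 5. Bob -> Alice: K_{A,B}(R_A)
    return check(alice_key, r_a, from_bob)   # Alice accepts or rejects Bob

shared = secrets.token_bytes(32)
ok = mutual_authentication(shared, shared)
rejected = mutual_authentication(shared, secrets.token_bytes(32))
```

Each challenge is a fresh random value, so a response recorded in one run is of no use in a later run.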
In an attempt to optimize this protocol the number of messages may be reduced from five to three as shown below.
However, this adaptation can easily be defeated by a reflection attack (illustrated below), in which a third party can impersonate either of the two parties.
The major mistake that led to the reflection attack was that the two parties used the same challenge in two different runs of the protocol. Even the use of different challenges may lead to problems such as a man-in-the-middle attack. Therefore, to set up a secure channel, the two parties should as far as possible not do identical things in sequence. Another mistake in the adapted three-message protocol is that Bob gave away valuable information, in the form of the response K_{A,B}(R_C), without any confirmation that it was being sent to Alice. This principle was not violated in the original protocol, in which Bob sent an encrypted message only after Alice had confirmed her identity.
b) Authentication using a Key Distribution Centre (KDC): The shared-secret-key approach has a scalability problem. If there are N hosts, each host must share a secret key with each of the remaining N-1 hosts, so a total of N(N-1)/2 keys must be managed by the system. The KDC approach requires only N keys in total: a centralized Key Distribution Centre shares a key with each participating host, and no pair of hosts needs to share a secret key in advance. The basic principle of the approach is:
1. Alice sends a message to the KDC indicating that she wants to talk to Bob.
2. The KDC sends a shared secret key K_{A,B} to both Alice and Bob, encrypted separately with the keys it shares with Alice and Bob respectively, i.e. K_{A,KDC}(K_{A,B}) and K_{B,KDC}(K_{A,B}).
This is shown in the figure below. An issue here is that Bob receives the ticket K_{B,KDC}(K_{A,B}) directly from the KDC, whereas delivering it to Bob is better left to Alice.
To overcome this problem the KDC passes K_{B,KDC}(K_{A,B}) back to Alice and lets her take care of connecting to Bob, as shown below. This is a variant of the Needham-Schroeder authentication protocol, which is a multiway challenge-response protocol.
The Needham-Schroeder authentication protocol is illustrated in the figure below. When Alice wants to set up a secure channel with Bob, she sends a request to the KDC containing a challenge R_{A1} and the identities A and B.
The challenge R_{A1} is called a nonce, which is a random number that is used only once. A nonce is used to indicate that two messages are related, which helps in avoiding attacks such as the replay of old messages. The reply from the KDC contains the nonce R_{A1}, the shared key K_{A,B} of Alice and Bob, Bob's identity B, and the ticket containing Alice's identity, K_{B,KDC}(A, K_{A,B}). Including Bob's identity prevents a third party from impersonating Bob. Upon receiving the reply from the KDC, Alice sends message 3 to Bob as shown, which contains the ticket and a challenge R_{A2} encrypted with K_{A,B}. Bob decrypts the ticket to find the shared key, with which he decrypts R_{A2}. He then sends the response R_{A2} - 1 to Alice along with a challenge R_B. By returning R_{A2} - 1 instead of R_{A2}, Bob proves that he knows the shared secret key and has actually decrypted the challenge. The nonce R_{A2} ties messages 3 and 4 together.
Though it looks secure, the protocol has another weak point: message 3 can be replayed by a third party who has obtained an old K_{A,B}, and who can thereby impersonate Alice. To avoid this, message 3 should be related to message 1, making the key depend on Alice's initial request to set up a secure channel with Bob. This is shown in the figure below. This version protects against the malicious reuse of a previously generated session key in the Needham-Schroeder authentication protocol.
c) Authentication using public-key cryptography: This approach does not need a KDC. Alice and Bob each possess the other's public key. The protocol proceeds as follows. Alice sends her identity and a challenge R_A to Bob, encrypted with his public key K_B^public. Bob decrypts the message using the private key associated with his public key. He is the only one who can do so, because the public and private keys form a unique pair. To authenticate Alice, Bob returns the decrypted challenge together with his own challenge R_B and a session key K_{A,B} that can be used for further communication, all encrypted using Alice's public key K_A^public. Alice decrypts that message and returns Bob's challenge R_B encrypted with the session key K_{A,B}, which proves that it really is Alice with whom Bob is communicating. The three messages of the protocol are:
1. Alice to Bob: K_B^public(A, R_A)
2. Bob to Alice: K_A^public(R_A, R_B, K_{A,B})
3. Alice to Bob: K_{A,B}(R_B)
Message integrity and confidentiality: Message integrity means protecting messages against modification. Confidentiality means protecting messages against interception and eavesdropping. Confidentiality can be maintained by encrypting the messages, using either a shared secret key or the receiver's public key. Integrity can be achieved by means of digital signatures. A digital signature, which emulates the role of a conventional signature, is a mechanism that binds a message to its signer by encrypting the message with a key that is known only to the signer. This unique association between a message and its signature allows any modification of the message to be detected, and the signer cannot repudiate the fact that he or she signed the message. Digital signatures can be implemented using public-key cryptosystems such as RSA, as shown below.
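Figure: digitally signing a message m using public-key cryptography. Alice uses her private key K_A^private and Bob's public key K_B^public; Bob uses his private key K_B^private and Alice's public key K_A^public. The message sent from Alice to Bob is K_B^public(m, K_A^private(m)).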
First Alice encrypts the message with her private key to produce the signed version. She then encrypts the original message together with the signed version using Bob's public key and sends the result to Bob. Bob first decrypts what he receives using his private key to get both the original message and the signed version. He then decrypts the signed version using Alice's public key and compares it with the original message to confirm that it came from Alice. By digitally signing the message, Alice is protected against malicious modifications to m by Bob, because Bob would always have to prove that the modified message was also signed by Alice. Similarly, Bob should keep the signed version to protect himself against repudiation by Alice. However, this scheme has the following problems.
1) Alice may still deny that she sent the message, claiming that her private key was stolen.
2) Alice may change her private key for protection against intruders, but a message she signed earlier and that Bob still holds then becomes worthless. To avoid this problem, a central authority that keeps track of key changes, and the use of timestamps when signing messages, are required.
3) Encrypting the entire message with a private key is costly in terms of processing requirements.
The use of message digests is a cheaper and more elegant scheme that addresses the third problem in particular. This is illustrated in the figure below.
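Figure: digitally signing a message using a message digest. Alice applies the hash function H to m and encrypts the result with her private key K_A^private; she sends m together with K_A^private(H(m)). Bob applies H to the received m and compares the result with the digest he obtains by decrypting with Alice's public key K_A^public.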
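The digest-based signing scheme in the figure, which the following paragraph walks through step by step, can be sketched in Python. The sketch is illustrative only and rests on several assumptions that are not part of the text: it uses SHA-256 from `hashlib` as the hash function H, a toy RSA key pair built from two small known Mersenne primes, and no padding scheme, so it must not be used for real signatures. All function and constant names are hypothetical.

```python
import hashlib

# Toy RSA key pair for Alice (assumption: far too small for real use).
P, Q = 2**31 - 1, 2**61 - 1          # two known Mersenne primes
N = P * Q
Z = (P - 1) * (Q - 1)
E_PUBLIC = 65537                      # Alice's public exponent
D_PRIVATE = pow(E_PUBLIC, -1, Z)      # Alice's private exponent (Python 3.8+)

def digest(message: bytes) -> int:
    """h = H(m), reduced below n so that the toy key can process it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Alice: compute K_A^private(H(m))."""
    return pow(digest(message), D_PRIVATE, N)

def verify(message: bytes, signature: int) -> bool:
    """Bob: decrypt the signature with K_A^public and compare with H(m)."""
    return pow(signature, E_PUBLIC, N) == digest(message)

m = b"transfer 10 units to Bob"
sig = sign(m)                         # sent together with the plaintext m
accepted = verify(m, sig)
tampered = verify(b"transfer 1000 units to Bob", sig)
```

Only the short digest passes through the expensive private-key operation, which is where the saving over signing the whole message comes from.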
As studied earlier, a message digest is a fixed-length bit string h that has been computed from an arbitrary-length message m by applying a hash function H. To sign a message digitally, Alice first computes the message digest and encrypts the digest using her private key. The encrypted digest is sent along with the message to Bob. The message itself is sent as plaintext; if confidentiality is needed, it should be encrypted with Bob's public key. Upon receiving the message and its encrypted digest, Bob decrypts the digest using Alice's public key. He separately calculates the message digest using the hash function and compares it with the decrypted digest. If the two match, the message has been signed by Alice.
Note: Using session keys along with long-lasting authentication keys is a good choice for implementing secure channels for exchanging data.
Secure group communication:
Distributed systems often involve communication between more than two parties, which raises security issues similar to those discussed above. The problem of protecting communication in a group of N users against eavesdropping, to ensure confidentiality, can be dealt with effectively using a public-key cryptosystem. Each member has its own (public key, private key) pair, and any member can use another member's public key to send that member a confidential message.
A group of replicated servers is another case where security problems arise. When a client sends a request to a group of replicated servers, the response should be trustworthy even if an intruder has corrupted one or more of the servers. The problem can be stated as follows. Let there be N servers, of which c servers have been corrupted by an intruder. The client must be protected against these c corrupted servers, which means that even if they respond incorrectly the client should be able to decide whether a response is correct or incorrect.
One solution to this problem is to combine the signatures of the servers in such a way that at least c+1 signatures are needed to construct a valid signature for the response. That is, the replicated servers are made to generate a valid signature collectively, such that c servers alone are not enough to produce it. This can be explained with the example shown in the figure below.
Let there be a group of five replicated servers that should be able to tolerate two corrupted servers. For a request from the client, each server S_i sends its response r_i along with its signature
sig(S_i, r_i) = K_i^private(md(r_i))
The client receives five responses in the form of triplets <r_i, md(r_i), sig(S_i, r_i)>, from which it must derive the correct response. The client also calculates the message digest md(r_i) for each response. In addition, the client takes a set V of three of the five received signatures as input and produces a single digest d as output using a function D. There are 5!/(3!2!) = 10 possible combinations of three signatures that the client can use as input for D. If one of these combinations produces the correct digest md(r_i) for some response r_i, the client can consider r_i to be correct and trust that the response has been produced by at least three honest servers.
To improve replication transparency, each server can be made to broadcast a message containing its response, along with the associated signature, to the other servers. When a server has received at least c+1 such messages, including its own, it attempts to compute a valid signature for one of the responses. The server then sends the response r and the set V of c+1 signatures as a single message to the client. The client verifies the correctness of r by checking its signature, i.e., by checking that md(r) = D(V).
This scheme is known as an (m,n)-threshold scheme. In our example, m = c+1 and n = N. In an (m,n)-threshold scheme, a message is divided into n pieces, known as shadows. Any m shadows can be used to reconstruct the original message, which is not possible with m-1 or fewer shadows.
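The text does not say how the shadows of an (m,n)-threshold scheme are constructed. One well-known construction is Shamir's polynomial secret sharing, and the sketch below uses it purely as an illustration of the "any m of n shadows" property; it is not claimed to be the construction used for the server signatures above. The prime modulus, the function names and the example secret are all assumptions.

```python
import secrets
from itertools import combinations

PRIME = 2**127 - 1  # a known Mersenne prime, used as the field modulus

def split(secret: int, m: int, n: int):
    """Create n shadows of which any m reconstruct the secret."""
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]

    def poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):       # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shadows) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for j, (xj, yj) in enumerate(shadows):
        num, den = 1, 1
        for k, (xk, _) in enumerate(shadows):
            if k != j:
                num = (num * -xk) % PRIME
                den = (den * (xj - xk)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

SECRET = 123456789
shadows = split(SECRET, m=3, n=5)        # the (3,5) case from the example
subsets = list(combinations(shadows, 3)) # the 10 possible choices of 3 shadows
results = [reconstruct(list(s)) for s in subsets]
```

With m = 3 and n = 5 there are exactly ten ways to pick three shadows, matching the count in the example, and each of them yields the same secret.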
Authorization is the process of granting access rights to resources in a distributed environment. Access control means verifying those access rights whenever an attempt is made to access a resource. An important method of actually controlling access to resources is to build a firewall that protects applications or an entire network. Controlling access to an object means protecting the object against invocations that are not permitted, which may include creating, renaming or deleting objects. A program called a reference monitor records the permissions granted to the different subjects or entities, and decides whether a particular subject is allowed to carry out a specific operation. It is important to protect the reference monitor itself against attacks.
An access control matrix M is used to model the access rights of subjects with respect to objects. Each subject is represented by a row in the matrix and each object by a column. An entry M[s,o] lists precisely which operations subject s can request to be carried out on object o. In an environment with millions of objects that require protection, the access control matrix cannot be implemented as a true matrix, because it would be very large and most of its entries would be empty. An efficient and widely used approach is for each object to maintain an Access Control List (ACL), which contains the access rights of the subjects that may access the object. In other words, the matrix is distributed column-wise across all objects. Another way of modelling access rights is to distribute the matrix row-wise, by giving each subject a list of the capabilities it has for each object. A capability is similar to a ticket: its holder is given certain rights associated with it. An important issue is that the ticket must be protected against modification by its holder. The use of ACLs and capabilities to protect access to an object is shown in the figure below.
(a) Use of ACLs for protecting objects
(b) Use of capabilities for protecting objects
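The relationship between the access control matrix, ACLs and capability lists can be made concrete with a small Python sketch. The subjects, objects and rights below are invented for illustration, and the sparse dictionary representation is just one possible choice: only non-empty entries M[s,o] are stored.

```python
from collections import defaultdict

# Sparse access control matrix: (subject, object) -> set of operations.
# All names here are hypothetical example data.
matrix = {
    ("alice", "report.txt"): {"read", "write"},
    ("bob", "report.txt"): {"read"},
    ("bob", "printer"): {"print"},
}

def to_acls(m):
    """Distribute the matrix column-wise: one ACL per object."""
    acls = defaultdict(dict)
    for (subject, obj), rights in m.items():
        acls[obj][subject] = set(rights)
    return dict(acls)

def to_capabilities(m):
    """Distribute the matrix row-wise: one capability list per subject."""
    caps = defaultdict(dict)
    for (subject, obj), rights in m.items():
        caps[subject][obj] = set(rights)
    return dict(caps)

def acl_allows(acls, subject, obj, operation) -> bool:
    """Check performed at the object, by looking the subject up in its ACL."""
    return operation in acls.get(obj, {}).get(subject, set())

def capability_allows(caps, subject, obj, operation) -> bool:
    """Check based on the capabilities that the subject presents."""
    return operation in caps.get(subject, {}).get(obj, set())

acls = to_acls(matrix)
caps = to_capabilities(matrix)
```

Both views answer the same question and always agree; they differ only in where the information is kept, with the object in the ACL case and with the subject in the capability case.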
Even ACLs and capability lists may still become quite large. Different approaches exist to deal with this issue; they can be categorized into 1) the use of protection domains and 2) the use of object groups. A protection domain is a set of <object, access rights> pairs. Requests for carrying out operations are always issued within a domain. A domain may be implemented either as a particular group of users or on the basis of the roles of users. For example, to make a web page accessible only to the employees of a particular organization, a protection domain in the form of an 'Employee' group can be created, which contains the details of all the employees of that organization. Whenever a user requests access to the web page, the reference monitor first looks up the Employee group to check whether that user is an employee, and allows access to the page if the credentials are valid. The Employee group list itself should be protected against unauthorized access. For more flexibility, groups can be made hierarchical, based for example on the different branches or departments of the organization.
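The group lookup performed by the reference monitor in this example can be sketched as follows. The group names, members and the dictionary layout are hypothetical; the sketch only shows how membership can be resolved through a hierarchy of subgroups.

```python
# Hypothetical group database: each group has direct members and subgroups.
groups = {
    "Employee": {"members": set(), "subgroups": {"Engineering", "Sales"}},
    "Engineering": {"members": {"alice"}, "subgroups": set()},
    "Sales": {"members": {"bob"}, "subgroups": set()},
}

def is_member(user: str, group: str, db: dict, _seen=None) -> bool:
    """Return True if user belongs to group directly or via a subgroup."""
    seen = set() if _seen is None else _seen
    if group in seen or group not in db:
        return False                  # unknown group, or a cycle in the hierarchy
    seen.add(group)
    entry = db[group]
    if user in entry["members"]:
        return True
    return any(is_member(user, sub, db, seen) for sub in entry["subgroups"])

def reference_monitor(user: str, required_group: str, db: dict) -> str:
    """Decide a request for a page restricted to required_group."""
    return "allow" if is_member(user, required_group, db) else "deny"

decision_alice = reference_monitor("alice", "Employee", groups)
decision_mallory = reference_monitor("mallory", "Employee", groups)
```

Every request triggers a walk through the group hierarchy, which hints at why lookups become costly once the membership database is large or distributed.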
Though this approach is efficient for constructing very large groups and makes the management of group membership easy, looking up a member can be quite costly if the membership database is distributed. To deal with this problem, users can be made to carry a certificate listing the groups they belong to. The certificate must, however, be protected against tampering, and its genuineness must be guaranteed by mechanisms such as a digital signature. Certificates are comparable to the capabilities discussed earlier.
In role-based access control, the role of the user in the organization (e.g. head of department, project manager) is verified, and access privileges are assigned accordingly. That is, the user's role determines the protection domain (or group) in which the user operates. If a user has several roles at the same time, provision can be made to allow the user to change roles when required.
As a second major approach, objects can be grouped hierarchically, based on the operations they provide through their interfaces. When a user requests that an operation be carried out on an object, the reference monitor looks up the interface to which that operation belongs. It then checks whether the user is allowed to perform operations belonging to that interface, rather than checking the operation against the specific object. The two approaches, protection domains and grouping of objects, can be combined for still better results.
All of the above security mechanisms work well as long as the communicating parties play by the same set of rules, and in the case of a standalone distributed system that is isolated from the rest of the world. To control external access to any part of a distributed system, a firewall is employed.
Firewall:
A firewall is a special kind of reference monitor that isolates part of a distributed system from the outside world.
All outgoing and incoming packets are routed through a special computer and inspected before they are passed. Unauthorized traffic is discarded and not allowed to continue. Also, a firewall should never fail and should be heavily protected against any kind of security threat. A common implementation of a firewall is shown in the figure below.
Firewalls essentially come in two different flavours: the packet-filtering gateway and the application-level gateway. The two are often combined for better results. A packet-filtering gateway acts as a router and decides whether to pass a network packet based on the source and destination addresses contained in the packet header. The router outside the LAN protects against incoming packets, and the router inside the LAN filters outgoing packets. As an example, consider a company network that consists of multiple local area networks connected through an SMDS network. Each LAN can be protected by a packet-filtering gateway that is configured to pass incoming traffic only if it originates from a host on one of the other LANs. In this way a virtual private network can be set up.
Unlike the packet-filtering gateway, which inspects only the headers of network packets, an application-level gateway inspects the content of an incoming or outgoing message. For example, a mail gateway may discard incoming or outgoing mail exceeding a certain size. Another example is a gateway that allows external access to a digital library server but supplies only the abstracts of documents. If an external user wants more, an electronic payment protocol is started. Users inside the firewall have direct access to the library service.
A proxy gateway is a special kind of application-level gateway that works as a front end to a specific kind of application and ensures that only messages meeting certain criteria are passed through. For example, many web pages contain scripts or applets that are to be executed inside a browser.
To prevent such code from being downloaded to the internal LAN, all web traffic could be directed through a web proxy gateway. This gateway accepts regular HTTP requests, either from inside or outside the firewall, and appears to its users as a normal web server. However, it filters all incoming and outgoing traffic, either by discarding certain requests and pages or by modifying pages when they contain executable code.
Securing mobile code: Modern distributed systems allow code to migrate between hosts, not just passive data. Two issues have to be dealt with in this case. The first is to protect the agent being migrated, and the second is to protect the target host against malicious agents. The second issue is related to the access control problem, because the program should not be allowed unauthorized access to the host’s resources.
As an example of the first case, where the agent needs protection, consider an agent that searches for the cheapest aeroplane ticket on behalf of a user, carries the user’s electronic credit card information, and is authorized to make a reservation as soon as a flight is found. No host should be able to steal the credit card information, and the agent should be protected against modifications such as changing the amount payable to a higher value. Other attacks include preventing the agent from visiting a competitor’s site that offers cheaper rates, maliciously destroying the agent, and tampering with the agent so that it attacks or steals from its owner on its return.
It is impossible to protect the agent against all kinds of attacks, because there is no hard guarantee that a host will do what it promises. Therefore, agents are organized in such a way that modifications can at least be detected. Three mechanisms are suggested in the Ajanta system for this. The first one is read-only state. It consists of a collection of data items signed by the agent’s owner.
Signing takes place when the agent is constructed and initialized, before it is sent off to other hosts. The owner first constructs a message digest, which is subsequently encrypted with the owner’s private key. When the agent arrives at a host, the host can easily detect whether the read-only state has been tampered with by verifying the state against the signed message digest of the original state.
The second mechanism is the secure append-only log, which is characterized by the fact that data can only be appended to the log. Data cannot be removed or modified without the owner being able to detect it.
The third mechanism is selective revealing of state to certain servers. It uses an array of data items in which each entry is intended for a designated server and is encrypted with that server’s public key to ensure confidentiality. The entire array is signed by the agent’s owner to ensure the integrity of the array as a whole. If any entry is modified by a malicious host, any of the designated servers will notice and can take appropriate action.
The second important issue is protecting the target, which involves protecting all of its resources against unauthorized access by the downloaded code. One approach to protection is to construct a sandbox. A sandbox is a technique by which a downloaded program is executed in such a way that each of its instructions can be fully controlled. If an attempt is made to execute an instruction that has been prohibited by the host, or if an instruction accesses registers or areas of memory that the host has not allowed, execution of the program is stopped. Implementing a sandbox is not easy. One approach is to check the executable code when it is downloaded and to insert additional instructions for situations that can be checked only at run time.
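The read-only state check described above (a digest of the data items, signed with the owner’s private key and verified by each host) can be sketched as follows. This is only a toy and not the Ajanta implementation: it uses textbook RSA without padding on a small key built from two known primes, and the names digest, sign, and verify are invented for the illustration. A real agent platform would rely on a vetted cryptographic library.

```python
import hashlib

# Toy RSA key pair built from two known Mersenne primes.
# Far too small and unpadded: for illustration only, never for real use.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (needs Python 3.8+)


def digest(items):
    """Hash the read-only data items into an integer smaller than N."""
    h = hashlib.sha256()
    for item in items:
        data = item.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))  # length prefix keeps item boundaries unambiguous
        h.update(data)
    return int.from_bytes(h.digest(), "big") % N


def sign(items):
    """Owner side: 'encrypt' the digest with the private key before the agent leaves."""
    return pow(digest(items), D, N)


def verify(items, signature):
    """Host side: recompute the digest and compare it with the signed one."""
    return pow(signature, E, N) == digest(items)


state = ["owner=alice", "max_price=300"]
signature = sign(state)
print(verify(state, signature))
print(verify(["owner=alice", "max_price=900"], signature))
```

The first check should succeed and the second should fail, because changing any data item changes the digest while the signature still encodes the original one.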
Consider the example of a Java sandbox, a constrained arena within which Java applications can be made to run, preventing, for example, access to the local hard disk or to the network. While this restricts the program’s capabilities, it provides security when Java applets are downloaded from the Internet and run from a web browser. Three components help in protecting the client system. First, the class loader takes care of the downloading. It is responsible for fetching a specified class from a server and installing it in the client’s address space so that the JVM can create objects from it. The second component is a byte code verifier, which checks the classes downloaded from the external server to ensure that they contain no illegal instructions that could corrupt the stack or memory. After a class has been securely downloaded and verified, the JVM can instantiate objects from it and execute their methods. The third component, the security manager (similar to a reference monitor), performs various checks at run time to further prevent the objects from gaining unauthorized access to the client’s resources. All downloaded Java programs are forced to make use of the security manager. The following figure shows the organization of a Java sandbox.
This Java sandbox enforces a strict security policy and can be overly restrictive. Alternatively, the playground approach provides more flexibility. A playground is a separate, designated machine exclusively reserved for running mobile code. Resources local to the playground, such as files or network connections to external servers, are available to programs executing in the playground, subject to normal protection mechanisms. Resources local to other machines, however, are physically disconnected from the playground and cannot be accessed by downloaded code. Users on these other machines can access the playground in the traditional way, by means of RPCs, but no mobile code is ever downloaded to machines outside the playground. The following figure shows the differences between a sandbox and a playground.
For more flexibility, and to accept code only from trusted servers, the code-signing approach is used as an alternative to sandboxing. Code signing means that mobile code is signed just like any other document, so that downloaded programs can be authenticated. To enforce a specific security policy on the downloaded code, three mechanisms can be used in the case of Java programs. The first approach is based on the use of object references as capabilities. To access a local resource such as a file, a program must have been given, when it was downloaded, a reference to a specific object that handles file operations. The principle is shown in the figure below.
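The idea of object references as capabilities can be illustrated with a small sketch: the downloaded program can reach a resource only through an object reference that the host chose to hand over. The class and function names below are hypothetical, and Python itself does not enforce this isolation (a real system needs an interpreter that prevents code from obtaining references by other means), so the sketch shows the principle only.

```python
class FileReader:
    """Capability object: whoever holds a reference may read this one resource."""

    def __init__(self, contents):
        self._contents = contents

    def read(self):
        return self._contents


def run_downloaded(program, granted):
    """Host side: run mobile code, handing it only the references in `granted`."""
    return program(granted)


def price_checker(caps):
    """Hypothetical downloaded program: it uses only the references it was given."""
    reader = caps.get("prices")
    if reader is None:
        return "no access"          # no reference, hence no way to reach the resource
    return reader.read()


# The host decides per program which capabilities to hand out.
print(run_downloaded(price_checker, {"prices": FileReader("LHR 120")}))
print(run_downloaded(price_checker, {}))
```

Granting or withholding access is then a matter of which references the host passes in, rather than a check performed on every operation.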
The second mechanism for enforcing a security policy is (extended) stack introspection. Here, any call to a method m of a local resource is preceded by a call to a special procedure enable_privilege, which checks whether the caller is authorized to invoke m on that resource. If the invocation is authorized, the caller is given temporary privileges for the duration of the call. When m has finished, and before control returns to the invoker, the special procedure disable_privilege is invoked to disable these privileges.
The third approach to enforcing a security policy is name space management. To gain access to local resources, programs first need to include the appropriate files that contain the classes implementing those resources. This requires that a name is given to the interpreter, which then resolves it to a class that is loaded at run time. To enforce a security policy for a specific downloaded program, the same name can be resolved to different classes, depending on where the downloaded program came from. Name resolution is handled by class loaders, which need to be adapted to implement this approach.
Overall, three important aspects are to be considered for security management: the general management of cryptographic keys, securely managing a group of servers, and authorization management. Key management involves establishing and distributing cryptographic keys. Authentication and the establishment of a secure channel require session keys, public keys, and private (secret) keys. However, the keys themselves have to be exchanged securely well before the actual authentication takes place. An elegant and widely applied scheme for establishing a shared key across an insecure channel is Diffie-Hellman key exchange. This is shown in the figure below. This protocol can be viewed as a public key cryptosystem.
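A minimal sketch of Diffie-Hellman key exchange is given here to make the steps concrete. The prime, the base, and the function names are illustrative assumptions, not vetted parameters; real deployments use standardized, much larger groups from a cryptographic library.

```python
import secrets

# Toy public parameters, known to everyone including an eavesdropper.
P = 2**127 - 1   # a known prime, used here only for illustration
G = 3            # public base, chosen arbitrarily for the sketch


def make_keypair():
    """Pick a secret exponent and derive the value that is sent over the channel."""
    private = secrets.randbelow(P - 3) + 2      # secret exponent in [2, P - 2]
    public = pow(G, private, P)                 # G^private mod P
    return private, public


def shared_key(own_private, peer_public):
    """Combine the own secret with the peer's public value: (G^y)^x mod P."""
    return pow(peer_public, own_private, P)


alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Only alice_public and bob_public cross the insecure channel.
assert shared_key(alice_private, bob_public) == shared_key(bob_private, alice_public)
```

Both sides arrive at the same value because (G^x)^y and (G^y)^x are equal modulo P, while an eavesdropper sees only G^x and G^y. The plain exchange does not authenticate either party, so it is normally combined with an authentication step.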
Public key distribution takes place by means of a public-key certificate, which consists of a public key together with a string identifying the entity (user, host, or special device) with which the key is associated. The major problem in managing a group of replicated servers is securely admitting a new group member without compromising the integrity of the group. Similarly, the major issue in authorization management is how access rights are initially granted to users or groups of users, and how they are subsequently maintained in an unforgeable way. For further details, please refer to the book by Tanenbaum and Van Steen on distributed systems.
Y GOVINDA RAMAIAH Research Scholar, CSE Dept, JNTUH, Hyderabad
Figures and notes compiled from A. S. Tanenbaum and Maarten Van Steen, Distributed Systems: Principles and Paradigms, Second Edition, 2007, Pearson Education Inc.