Tuesday, June 4, 2019
Security Issues in Peer-to-peer Networking
ACKNOWLEDGEMENTS

My interest in the field of networking drove me to take computer networking as my course in M.Sc. There are many different types of networks. Of these, the most popular and upcoming trend is peer-to-peer networks. This report, my final dissertation in partial fulfilment of my M.Sc. in computer networking, would not have been possible without the support of my supervisor, Mr. Harry Benetatos. He helped me a lot by guiding me and pinpointing the key mistakes I made during my research. My course leader, Mr. Nicholas Ioannides, also helped me a lot to complete this dissertation. His advice and suggestions gave me a lot of encouragement and support, which made me carry out this research and finish it in time. I am very thankful to my university, LONDON METROPOLITAN UNIVERSITY, which provided me with free access to the IEEE library, where I found the key papers that were very useful for my research. I also thank my parents for the support they have given me in all walks of my life.

DEDICATION

I dedicate this report to my parents and my well-wisher Sakshi for their constant support and encouragement throughout my education and life.

CHAPTER 1
PROJECT INTRODUCTION

1.1 INTRODUCTION TO THE PROJECT

This dissertation is about security issues in peer-to-peer networks. There are many security issues in peer-to-peer networks. I have chosen to do research on worm intrusions in peer-to-peer networks.
In this document I describe how a worm propagates in the network from one peer to another, how the worm can be detected, and how the detected worm can be attacked so that the network is saved from infection.

1.2 AIM

Security issues in peer-to-peer networks: securing the peer-to-peer network from worms.

1.3 OBJECTIVES

To understand how the peers communicate with each other in the peer-to-peer network.
To analyse the propagation of worms in the network.
To detect the worms near the nodes of the network.
To defend the network against the worms.

1.4 RESEARCH QUESTION

This document briefly discusses how worms propagate in the network and how they can be detected and attacked in order to save the peer-to-peer network.

1.5 APPROACH

My approach for this dissertation is as follows:
Understanding peer-to-peer networks
Defining the problem
Data collection and analysis
Studying and understanding the existing solutions for the problem
Comparing different solutions
Conclusion

1.6 METHODOLOGY

This section of my document sets out the important steps to be followed in order to achieve the objectives mentioned above. It also helps to schedule how to develop and complete the different parts of the dissertation.

In this dissertation I will first study peer-to-peer networks and how the peers communicate and share information with the remaining peers in the network. Then I research how the worm propagates in the network, how the worm can be detected, and how the detected worm can be attacked to restore the network.
In pictorial form, the different stages of my dissertation are

1.7 PREVIEW OF THE COMING CHAPTERS IN THE REPORT

The rest of the report is organised as follows. Chapter 2 gives a brief discussion of peer-to-peer networks, the different types of peer-to-peer networks, and their advantages and disadvantages. It also gives some information about worms, their nature and the different types of worms. Chapter 3 discusses the methods proposed by different authors to detect a worm in the network by matching the characteristic string of the worm. Chapter 4 presents a solution for this issue, namely a mathematical method of detecting the worm in the network and defending against it. Chapter 5 consists of a critical appraisal and suggestions for further work. Finally, I conclude in chapter 6.

CHAPTER 2
OVERVIEW OF THE GENERIC AREA AND IDENTIFICATION OF PROBLEM

2.1 NETWORK

A network is a group of electronic devices which are connected in order to communicate with each other. The devices can be computers, laptops, printers etc. Networks can be wired or wireless. Wired networks are networks in which the devices are connected with the help of wires. Wireless networks are networks in which the devices are connected without wires. There are many different types of networks, and peer-to-peer is one of the important and special types.

2.2 PEER-TO-PEER NETWORKS

Peer-to-peer networks emerged in the 1990s with the development of peer-to-peer file sharing such as Napster [1]. Peer-to-peer networks, abbreviated as p2p networks, are networks in which all the nodes or peers act as servers as well as clients on demand. This is unlike the typical client-server model, in which the clients request services and the server supplies the resources.
But in the case of peer-to-peer networks, every node requests services like a client and every node supplies resources like a server on demand. A peer-to-peer network does not need any centralized server coordination. A peer-to-peer network is scalable: adding new nodes to the network or removing existing nodes does not affect the network, so nodes can be added or removed dynamically. All the nodes connected in a peer-to-peer network run the same network protocol and software. Resources available on one node are available to the remaining nodes of the network, which can access this information easily. Peer-to-peer networks provide robustness and scalability. All wired and wireless networks can be configured as peer-to-peer networks. Home networks and small enterprise networks are well suited to a peer-to-peer configuration. Most networks are not pure peer-to-peer networks because they use some network interface devices. In the beginning, information was stored at all the nodes by making a copy of it, but this increased the flow of traffic in the network. Now a centralised system is maintained by the network, and requests are directed to the nodes which contain the relevant information. This saves time and reduces the traffic flow in the network.

2.3 WIRELESS NETWORKS

Devices connected to each other without any wires can also be configured as peer-to-peer networks. With a small number of devices it is preferable to configure the network as a wireless peer-to-peer network, because it is then easy to share data in both directions. It is also cheaper to connect the devices in a wireless peer-to-peer network because nothing needs to be spent on wires.

Peer-to-peer networks are divided into three types.
They are:
Instant messaging networks
Collaborative networks
Affinity community networks [2]

Instant messaging networks
In this type of peer-to-peer network, the users can chat with each other in real time by installing software such as MSN Messenger, AOL Instant Messenger etc.

Collaborative networks
This type of peer-to-peer network is also called distributed computing. It is widely used in science and biotechnology, where intense computer processing is needed.

Affinity community peer-to-peer networks
This is a type of p2p network in which a group of devices is connected only for the purpose of sharing data among them.

Peer-to-peer networks are basically classified into two types. They are:
Structured peer-to-peer networks
Unstructured peer-to-peer networks

2.4 STRUCTURED PEER-TO-PEER NETWORKS

In a structured peer-to-peer network, the nodes connected in the network are fixed. They use a distributed hash table (DHT) for indexing [4]. In a DHT, data is stored in the form of a hash table as (key, value) pairs. Any node wishing to retrieve the data can easily do so using the keys. The mapping of keys to values is maintained by all the nodes present in the network, so that a change in the set of participants causes very little disruption. DHT-based networks are very efficient in retrieving resources.

2.5 UNSTRUCTURED PEER-TO-PEER NETWORKS

In an unstructured p2p network, connections between nodes are established arbitrarily. There are three types of unstructured p2p networks. They are:
Pure peer-to-peer
Hybrid peer-to-peer
Centralized peer-to-peer

In pure p2p networks all the nodes in the network are equal. There is no preferred node with a special infrastructure function.

In hybrid p2p networks there are special nodes called supernodes [3]. A supernode can be any node in the network, depending on the momentary need of the network.

A centralized p2p network is a type of hybrid network in which one central system manages the network.
The network cannot work without this centralized system.

Basically, every node in a peer-to-peer network holds information about its neighbours in its routing table. The rate of propagation of worms in peer-to-peer networks is higher than in other networks, because the information about neighbouring peers can easily be obtained from the routing table of the infected node.

Different types of files are shared between the nodes in peer-to-peer networks. These can be audio files, video files, music files, text documents, books, articles etc. There is a lot of peer-to-peer file sharing software available these days. Some examples are BitTorrent, LimeWire, Shareaza, Kazaa, iMesh, BearShare Lite, eMule, KCeasy, Ares Galaxy, Soulseek, WinMX, Piolet, Gnutella, Overnet, Azureus (Vuze), FrostWire, uTorrent, Morpheus, Ants and Acquisition [5]. There are many more file sharing programs on the market, but these are the top 20 for peer-to-peer networks.

Basically, all the nodes connected together in the network should be configured with the same network protocol, and the same software should be installed on all the nodes, in order for them to communicate with each other. Nodes configured with different software or protocols cannot communicate.

2.6 ADVANTAGES OF PEER-TO-PEER NETWORKS [6]

It is most useful for a small business network comprising a very small number of computer systems or devices.
Computers in this network can be configured easily.
A full-time network administrator is not required for p2p networks.
Easy maintenance of the network.
Only a single operating system and a small number of cables are needed to get connected.
Can be installed easily.
Users can control the shared resources.
The distributed nature of the network increases its robustness.

2.7 DISADVANTAGES OF THE PEER-TO-PEER NETWORKS [12]

No centralised administration.
Back-up has to be performed on each computer individually.
Peer-to-peer networks are not secure.
Every computer in the network behaves as both server and client, which can slow down the performance of the system.
Legal controversy over copyright.

2.8 WORM

A worm is a computer malware program, or malicious code, which can replicate itself into several copies. In simple terms, a worm can be called an autonomous intrusion agent [19]. It does not actually alter the function of the system but passes through it; in this respect a worm is unlike a virus. It intrudes into the network without the intermediation of the user.

The first widely known worm was released by Robert T. Morris in 1988 [18]. Today billions of systems are connected to the internet, but in 1988 there were only 60,000 systems connected to it. During that period 10% of the internet systems, i.e. 6,000 systems, were infected and almost clogged because of the worm [8].

When a worm enters a system, it hides in the operating system where it cannot be noticed [18]. It drastically slows down the system and affects the other programs running on it. In the worst cases it could even affect the entire network and slow down the internet across the whole world.

As said earlier, it replicates itself into multiple copies, attaches itself to emails and corrupts them, and sometimes deletes files without any user interaction.
If it enters our email, it can send itself to all the contacts in our address book, and then to all the contacts of those contacts, and in this way it propagates, grows and spreads at a high rate.

Worms can even create a backdoor into the computer [11]. This makes it easy for attackers to send e-mail.

Some famous worms discovered in 2003 and 2004 are Mydoom, Sobig and Sasser [7]. The Sasser worm recently affected computers using the Windows 2000 or Windows XP operating systems. It restarts the system automatically and crashes it, and it spreads to all the nodes in the network.

Some worms are unlike the normal worms. These worms are sometimes very useful to the user, and hence are called helpful worms [9]. Sometimes they help users without any interaction with the user. But most of the known worms are harmful and will always try to infect the nodes in the network and affect its performance.

When peer-to-peer networks are attacked by worms, the efficiency of the network drops. So there is a need to protect the network from worms entering it and spreading all over it. The worms should be detected and defended against. If we delay in defending against these worms, they replicate, make many copies of themselves and spread all through the network. This is very dangerous as it affects the performance and efficiency of the network [10].

CHAPTER 3
RELEVANT WORK DONE BY OTHERS IN ORDER TO SOLVE THE PROBLEM

Many people have proposed solutions to this problem. First, Zhou L gave a solution for p2p worms and observed that the propagation of a worm in a p2p network is very fast compared to other networks [13]. Jayanthkumar performed simulations of worm propagation from an infected node to other nodes [10]. Wei Yu researched the behaviour of worms in p2p networks [14].
In my research I found one more interesting method of detecting worms in a peer-to-peer network. It is a special method: in their paper titled A Signature-behaviour-based P2P worm detection approach, the authors Yu Yao, Yong Li, Fu-xiang Gao and Ge Yu proposed a mechanism for detecting the known worms in peer-to-peer networks based on characteristic string matching. Worms make use of vulnerabilities in the network and spread [15]. The authors also proposed a detection mechanism for unknown worms based on their behaviour. Their technique mainly consists of characteristic string matching, identifying the application, and unknown worm detection. They give an algorithm for matching the characteristic string of the worm, called the suffix-array algorithm. It is efficient and simple, with very low time complexity. Because peer-to-peer networks transfer files in fragments, the characteristic string of a worm may be split across different blocks of data; during the reorganisation process the characteristic string can again be used to identify the worm. The authors validated their results by simulation and showed that their method is one of the efficient methods of p2p worm detection.

As mentioned above, this method detects known worms by characteristic string matching and unknown worms by their behaviour. The method first captures the network packets using the library called LibPcap. LibPcap is a library that captures network packets on UNIX and Linux platforms. It contains many functions that are useful for capturing network packets. After the data packets have been captured with the help of these functions, the non-P2P packets are filtered out, so that only the P2P packets remain.
In these P2P packets the known worms are detected by characteristic string matching. This is implemented by a couple of algorithms: the suffix-array algorithm and the dichotomy algorithm. These algorithms are very accurate and are capable of detecting the worms in very little time. As mentioned above, peer-to-peer networks follow a fragment transfer mechanism, so the characteristic string of the worm can be spread over different blocks of data. In this situation it is difficult to detect the worm if matching is based on a single packet. But if the characteristic string is present in the block and has been split between two packets, there is still a chance of detecting the worm: the parts of the characteristic string present in the two different data packets need to be restructured. After restructuring, the worm can be detected using the matching mechanism. In this way the known worms in the network are detected by characteristic string matching. The unknown worms in the p2p network can be detected with the help of the behaviour characteristics of the worm at the initial stage of its propagation. This can be called behaviour-based detection of unknown p2p worms. In this way both the known and the unknown worms in the network are detected.

3.1 P2P KNOWN WORM DETECTION

There are four steps in detecting known p2p worms. They are:
Deal flow
Technology of identifying the application
Characteristic string matching
Reorganising the characteristic string

3.1.1 DEAL FLOW

In the deal flow step, the flow of data is processed in four steps [16].
Step 1: Extract the p2p data stream from the actual data stream.
Step 2: Check the extracted p2p data stream for worms using characteristic string matching against the worms already existing in the library.
Step 3: The data flow is reorganised. It now contains the worm characteristic string as well.
Go to step 2.
Step 4: Check the data flow for unknown worms using unknown worm detection techniques.

After performing the four steps, the library is updated. All four steps are represented pictorially in Figure 4.

Figure 4: Flow chart representing the four steps to detect worms

3.1.2 TECHNOLOGY OF IDENTIFYING THE APPLICATION

As said earlier, this paper captures the data packets and scans them for known worms with the help of a function library called LibPcap [17]. For this, some rules must already be assigned to the network interface devices. Assigning these rules to the devices is done in a stepwise procedure:
Locate the available network interface devices
Open the network interface device
Compile the rules that we want to attach to the devices
Set up the filtering rules on the device
Activate the device
Start the process of capturing the packets

There are some rules for identifying a p2p application. They are:
(1) The characteristic information of the known p2p applications is used.
(2) If source-destination IP pairs do not use a known P2P application but use TCP and UDP at the same time, then they are p2p.
(3) At a particular time the source pairs {srcIP, srcport} [27] and the destination pairs {dstIP, dstport} [27] are checked. From these we can identify whether the traffic is p2p or not: if the number of connection ports is equal to the number of connection IPs, then we can say that it is p2p.

There are situations where these rules have been applied wrongly, so some amendments were made to them. The amendments are that rule (2) can identify even the mazes which are present, and that rule (3) is modified so that, within the detection cycle, the {srcIP, srcport} [27] pairs at the source and the {dstIP, dstport} [27] pairs at the destination are checked. From this the authors derived that if the number of connection ports is equal to the number of connection IPs, the protocols used are the same.
If they are different, then the protocols are different.

3.1.3 CHARACTERISTIC STRING MATCHING

This is the most important section of the paper. Here the authors define the terms and the algorithms used to detect the worm. Two algorithms are mentioned: the suffix-array algorithm and the dichotomy algorithm. The entire process of detecting the worm depends on the efficiency and accuracy of these algorithms.

Before using the suffix-array algorithm, we first need to understand some keywords and rules.

Suffix: a suffix is the substring which starts at a particular location and runs to the end of the string. If a suffix of the string S starts at location i and runs to the end of S, then it is written $\mathrm{Suffix}(i) = S[i, \mathrm{Len}(S)]$ [27].

Comparison: strings in this paper are compared in dictionary order. Let u and v be two strings. Comparing u and v means comparing $u_i$ and $v_i$ character by character, starting with $i = 1$. While $u_i = v_i$, i is increased by one. If $u_i > v_i$ then $u > v$, and if $u_i < v_i$ then $u < v$. If no result has been obtained by the time $i > \mathrm{len}(u)$ or $i > \mathrm{len}(v)$, the lengths decide: if $\mathrm{len}(u) > \mathrm{len}(v)$ then $u > v$, if $\mathrm{len}(u) < \mathrm{len}(v)$ then $u < v$, and if $\mathrm{len}(u) = \mathrm{len}(v)$ then $u = v$.

Suffix-array: the suffix-array is denoted by SA. It is a one-dimensional array SA[1], SA[2], SA[3], and so on, ordered such that $\mathrm{Suffix}(SA[i]) < \mathrm{Suffix}(SA[i+1])$.

Rank-array: the rank-array is nothing but $SA^{-1}$. If $SA[i] = j$, then $\mathrm{Rank}[j] = i$. In other words, Rank[i] stores the rank of Suffix(i) among all the suffixes in ascending order.

In the paper the authors take the string science as an example and explain everything clearly. The string science generates seven suffixes.
They are:
Suffix(1) = science
Suffix(2) = cience
Suffix(3) = ience
Suffix(4) = ence
Suffix(5) = nce
Suffix(6) = ce
Suffix(7) = e

Sorted in dictionary order, they are:
Suffix(6) = ce
Suffix(2) = cience
Suffix(7) = e
Suffix(4) = ence
Suffix(3) = ience
Suffix(5) = nce
Suffix(1) = science

The suffix-array algorithm follows the doubling idea. First, $SA_1$ and $\mathrm{Rank}_1$ are obtained by comparing only the first character of every suffix. Comparing strings amounts to comparing their characters sequentially, so $SA_1$ and $\mathrm{Rank}_1$ can be used to derive $SA_2$ and $\mathrm{Rank}_2$, which in turn derive $SA_4$ and $\mathrm{Rank}_4$, which again derive $SA_8$ and $\mathrm{Rank}_8$, and so on. The final suffix-array and rank-array are derived from this process. The main steps of the suffix-array algorithm are:

Calculating $SA_1$ and $\mathrm{Rank}_1$. All the suffixes are first arranged in order of their first letter; $SA_1$ is generated using the quick sort algorithm, and then $\mathrm{Rank}_1$ is generated.

Comparing the 2k-prefixes of Suffix(i) and Suffix(j) using $SA_k$ and $\mathrm{Rank}_k$. The 2k-prefixes of Suffix(i) and Suffix(j) are equal exactly when $\mathrm{Rank}_k[i] = \mathrm{Rank}_k[j]$ and $\mathrm{Rank}_k[i+k] = \mathrm{Rank}_k[j+k]$. The 2k-prefix of Suffix(i) is smaller than that of Suffix(j) exactly when $\mathrm{Rank}_k[i] < \mathrm{Rank}_k[j]$, or $\mathrm{Rank}_k[i] = \mathrm{Rank}_k[j]$ and $\mathrm{Rank}_k[i+k] < \mathrm{Rank}_k[j+k]$.

The suffix-array algorithm sorts the suffixes, so a characteristic string P can then be located with the binary search algorithm. The algorithm is as follows:
Step 1: Assign the initial values left = 1, right = n and max_match = 0.
Step 2: Compute the middle value, mid = (left + right)/2.
Step 3: Compare the characters of Suffix(SA[mid]) and P. The length r of their longest common prefix is used in the implementation and the comparison. If r > max_match, then max_match = r.
Step 4: If Suffix(SA[mid]) < P, then left = mid + 1. If Suffix(SA[mid]) > P, then right = mid - 1. If Suffix(SA[mid]) = P, then go to step 6.
Step 5: If left ≤ right, go to step 2.
Step 6: If max_match = m, where m is the length of P, then print that the match is successful.

3.1.4 REORGANISING THE CHARACTERISTIC STRING

In this step the characteristic string is reorganised.
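Before going further into reorganisation, it may help to see the matching step of section 3.1.3, to which every reorganised string is passed again. The following is a minimal Python sketch of a suffix array built by prefix doubling and searched by binary search. It is an illustration under my own assumptions and not the code of the paper: the function names build_suffix_array and contains are hypothetical, indices start at 0 instead of 1, Python's built-in sort stands in for quick sort, and the max_match bookkeeping is left out.

```python
def build_suffix_array(s):
    """Return the suffix array of s (0-based) using prefix doubling."""
    n = len(s)
    if n == 0:
        return []
    # Rank by the first character only; later rounds compare longer prefixes.
    rank = [ord(c) for c in s]
    sa = list(range(n))
    k = 1
    while True:
        # Key of suffix i: ranks of its first and second halves of length k.
        # -1 marks a second half that runs past the end of the string.
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        # Re-rank: equal keys share a rank, a larger key gets the next rank.
        new_rank = [0] * n
        for pos in range(1, n):
            prev, cur = sa[pos - 1], sa[pos]
            new_rank[cur] = new_rank[prev] + (key(cur) != key(prev))
        rank = new_rank
        if rank[sa[-1]] == n - 1:  # all ranks distinct: fully sorted
            return sa
        k *= 2


def contains(text, sa, pattern):
    """Binary search over the sorted suffixes for one starting with pattern."""
    left, right = 0, len(sa) - 1
    m = len(pattern)
    while left <= right:
        mid = (left + right) // 2
        candidate = text[sa[mid]:sa[mid] + m]  # first m characters of the suffix
        if candidate == pattern:
            return True
        if candidate < pattern:
            left = mid + 1
        else:
            right = mid - 1
    return False


text = "science"
sa = build_suffix_array(text)
print([i + 1 for i in sa])        # 1-based positions, as in the example above
print(contains(text, sa, "ien"))  # a substring of the text
print(contains(text, sa, "xyz"))  # not a substring of the text
```

For the string science the sketch yields the order 6, 2, 7, 4, 3, 5, 1 in 1-based numbering, which agrees with the sorted list of suffixes given above. Each lookup compares at most m characters per step over about log n steps.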
If the characteristic string is divided between two different data blocks, then the data block with the partial characteristic string is stored. Basically, all the information about a data block, such as its index, beginning offset, length and so on, is contained at the head of each block. Here a structure Piece is defined which consists of the index of the block, the beginning offset of the block (offset), the length of the character array head and the length of the character array end [18]. Initially every data packet is compared with the characteristic string. If it matches, a warning or alert about the worm is sent to all the users. If the tail of the characteristic string of the worm matches the head of the data block, it is stored in the character array end. If the head of the characteristic string matches the tail of the data block, it is stored in the corresponding character array head. If the neighbouring data block contains a partial characteristic string of the worm, then the neighbouring strings in the arrays head and end are reorganised. Characteristic string matching is then performed again on this reorganised string, and if any worm is detected a warning is again sent to all users saying that a worm has been found. If there is no match, no operation is performed. For the case where the characteristic string is present in the block but is divided between two different data packets, a special term called the character array is introduced. First, the matching mechanism is performed on both data packets. If the matching characteristic string is found, a warning is sent to the users that a worm is present.
But if only part of the characteristic string is found, it is enough that one of the following requirements is met: the head of the data packet matches the tail of the characteristic string, or the tail of the data packet matches the head of the characteristic string. If these conditions are not satisfied, no operation is performed. If the tail of the data packet contains the partial characteristic string, then the data packet is stored in the array. If the length of the characteristic string is m, then Array[m] is set as . And if the head of the data packet contains a part of the characteristic string, then that data packet is stored in the n consecutive units of the array. Finally, characteristic string matching is performed on this array, and if a worm is detected a warning is sent to all the users. If there is no match, nothing is done.

3.2 DETECTING UNKNOWN P2P WORM

In the section above we have seen how a known worm is detected. But that mechanism is not meant to detect unknown p2p worms. In this section we will see how unknown worms can be detected and the network protected. As we know, in p2p networks a node can send information to multiple hosts at the same time, and the same protocol is used by all the nodes in the network [27]. These characteristics of the network help a worm to propagate easily. As discussed above, only known worms can be detected by the characteristic string matching method. Unknown worms are detected based on the behaviour of the node. Some of the detection rules are:
Files with the same content are transferred to multiple hosts in a very short time.
The same protocol is used and the destination port is the same.
If these rules are satisfied by the source node, then it is allowing a p2p worm to propagate. It is then necessary to extract the characteristics of the worm near the worm propagation nodes.
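The detection rules listed above can be illustrated with a short Python sketch. It is only a rough illustration under assumptions of my own, not the implementation of the paper: the record layout (timestamp, source, destination, protocol, destination port, content hash), the function name suspicious_sources, and the thresholds min_hosts and window are all hypothetical.

```python
from collections import defaultdict


def suspicious_sources(transfers, min_hosts=3, window=10.0):
    """Flag sources that send the same content to many hosts in a short time.

    transfers: iterable of (timestamp, src, dst, protocol, dst_port, content_hash).
    min_hosts and window (seconds) are assumed thresholds, not values from the paper.
    """
    # Group by source, protocol, destination port and content, so each group
    # already satisfies "same protocol, same destination port, same content".
    groups = defaultdict(list)
    for ts, src, dst, proto, port, digest in transfers:
        groups[(src, proto, port, digest)].append((ts, dst))

    flagged = set()
    for (src, _proto, _port, _digest), events in groups.items():
        events.sort()  # order by timestamp
        start = 0
        for end in range(len(events)):
            # Shrink the window until it spans at most `window` seconds.
            while events[end][0] - events[start][0] > window:
                start += 1
            hosts = {dst for _, dst in events[start:end + 1]}
            if len(hosts) >= min_hosts:
                flagged.add(src)
                break
    return flagged


sample = [
    (0.0, "A", "B", "tcp", 6346, "h1"),
    (1.0, "A", "C", "tcp", 6346, "h1"),
    (2.0, "A", "D", "tcp", 6346, "h1"),   # A: same file to 3 hosts within 2 s
    (0.0, "E", "B", "tcp", 6346, "h2"),
    (50.0, "E", "F", "tcp", 6346, "h2"),
    (100.0, "E", "G", "tcp", 6346, "h2"),  # E: same file, but spread over 100 s
]
print(suspicious_sources(sample))
```

In the sample data only node A is flagged, because node E sends its file too slowly to fall within the assumed time window. A node flagged in this way would be a candidate propagation node from which the worm characteristics are then extracted.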
When these characteristics are extracted, they are added to the feature library. This data resemblance comparison and extracting the characteristics are done using the LCSeq algorithm. But the LCSeq algorithm based on generalized suffix tree (GST) is the more efficient. The overall idea is that all the suffixes are represented as a tree.And this tree will have some characteristics like Every node in a tree is a string and root is the empty string Every suffix can be represented as a path from the root. Every substring can be considered as a prefix of a suffix. To achieve the searching public sub sequence, every node should be set the information of its subordinate source string.3.3 EXPERIMENTWe know that the worm carcass tries to infect the other nodes in the network by sending the worm to the specific ports of p2p node. So here the author tried to prove the efficiency of his method by performing an experiment. In this experiment he prepared a multiple group worm body and sent it repeatedly at regular intervals of time. Then he captured these packets and extracted their characteristics and compared it with the one that already exist in the feature library.P2p worm is detected separately using different algorithms like BF algorithm, KMP algorithm and suffix-array algorithm and compared their results doing three experiments. In the experiment 1, worm characteristics are in the same packet.. in the experimentSecurity Issues in Peer-to-peer NetworkingSecurity Issues in Peer-to-peer NetworkingACKNOWLEDGEMENTSThe interest in the field of networking, driven me to take the computer networking as my course in M.Sc. there are many different types of networks. Out of them the more popularized and upcoming trend of networks are peer-to-peer networks. This report of my final dissertation for the partial fulfilment of my M.Sc, computer networking, would not have been possible without the support of my supervisor, Mr. Harry Benetatos. 
He helped me a lot by guiding me a nd pin-pointing the key mistakes which I have done during my research. My course leader Mr. Nicholas Ioannides also helped me a lot to complete this dissertation. His advises and suggestions gave me a lot of encouragement and support which made me do this research and finish it in time. I am very thankful to my university, LONDON METROPOLITAN UNIVERSITY which provided me the free access to the IEEE library which helped me to find the key papers which are very useful for my research. I also thank my parents for their support given to me in all walks of my life.DEDICATIONI dedicate this report to my parents and my well wisher Sakshi for their constant support and encouragement throughout my education and life.CHAPTER 1PROJECT INTRODUCTION1.1 INTRODUCTION TO THE PROJECTThis dissertation is all about the security issues in the peer-to-peer networks. There are many security issues in peer-to-peer networks. I have chosen to do research on worm intrusions in peer-to-peer networks. In this document I have mentioned how the worm propagates in the network from one peer to another peer, how the worm can be detected and how the detected worm can be attacked and save the network from getting infected.1.2 AIMSecurity issue in Peer-to-peer networksSecuring the peer-to-peer network from worms.1.3 OBJECTIVES To understand how the peers communicate with each other in the peer-to-peer network To analyse the propagation of worms in the network. 
To detect the worms near the nodes of the network To defence the worms in the network.1.4 RESEARCH QUESTIONThis document briefly discusses about how the worms propagates in the network and how can it be detected and attacked in order to save the peer-to-peer network1.5 APPROACHMy approach for this dissertation is as follows Understanding peer-to-peer networks Defining the problem Data collection and analysis Study and understanding the existing solutions for the problem Comparing different solutions conclusion1.6 METHODOLOGYThis section of my document contains what important steps to be followed in order to achieve the mentioned objectives. It also helps to schedule how to develop and complete different parts of the dissertation.In this dissertation firstly I will study and understand about the peer-to-peer networks and how the peers in the networks communicate and share information with the remaining peer in the network. Then I do research on how the worm propagates in the network, how can the worm be detected and how the detected worm can be attacked and restore the network. In the pictorial form the different stages of my dissertation are1.7 PREVIEW ABOUT THE COMING CHAPTERS IN THE REPORTThe rest of the report is organised as follows in the chapter 2, there is brief discussion about the peer-to-peer networks, different types of peer-to-peer networks, advantages and disadvantages of the peer-to-peer networks. There is also some information about the worms, its nature and different types of worms. In chapter 3, ther e is a discussion about the methods given by the different person to detect the worm in the network by the method of matching the characteristic string of the worm. In section 4, there is a solution for this issue. That is mathematical method of detecting the worm in the network and defending it. Chapter 5 consists of a critical appraisal and suggestions for the further work. 
Finally, I conclude in chapter 6.

CHAPTER 2
OVERVIEW OF THE GENERIC AREA AND IDENTIFICATION OF PROBLEM

2.1 NETWORK

A network is a group of electronic devices which are connected to each other in order to communicate with each other. The devices can be computers, laptops, printers etc. Networks can be wired or wireless. Wired networks are networks in which the devices are connected with the help of wires. Wireless networks are networks in which the devices are connected without wires. There are many different types of networks, and peer-to-peer is one of the important and special types.

2.2 PEER-TO-PEER NETWORKS

Peer-to-peer networks emerged in the 1990s with the development of peer-to-peer file sharing systems like Napster [1]. Peer-to-peer networks, abbreviated as p2p networks, are networks in which all the nodes or peers act as servers as well as clients on demand. This is unlike the typical client-server model, in which the clients request services and the server supplies the resources. In a peer-to-peer network every node requests services like a client and every node supplies resources like a server on demand. A peer-to-peer network does not need any centralized server coordination. A peer-to-peer network is scalable: adding new nodes or removing existing nodes does not affect the network, which means that nodes can be added or removed dynamically. All the nodes connected in a peer-to-peer network run the same network protocol and software. Resources available on one node are available to the remaining nodes of the network, which can access this information easily. Peer-to-peer networks provide robustness and scalability. Both wired and wireless networks can be configured as peer-to-peer networks. Home networks and small enterprise networks are well suited to a peer-to-peer configuration.
Most networks are not pure peer-to-peer networks because they use some network interface devices. In the beginning, the information was stored at all the nodes by making a copy of it, but this increased the flow of traffic in the network. Now a centralised system is maintained by the network, and requests are directed to the nodes which contain the relevant information. This saves time and reduces the traffic flow in the network.

2.3 WIRELESS NETWORKS

Devices connected to each other without any wires can also be configured as peer-to-peer networks. When the number of devices is small, it is preferable to configure the network as a wireless peer-to-peer network because it is easy to share data in both directions. It is also cheaper to connect the devices in a wireless peer-to-peer network because nothing needs to be spent on wires.

Peer-to-peer networks are divided into three types [2]. They are:

Instant messaging networks
Collaborative networks
Affinity community networks

Instant messaging networks
In this type of peer-to-peer network, the users can chat with each other in real time by installing software such as MSN Messenger, AOL Instant Messenger etc.

Collaborative networks
This type of peer-to-peer network is also called distributed computing. It is widely used in science and biotechnology, where intense computer processing is needed.

Affinity community peer-to-peer networks
This is a type of p2p network where a group of devices is connected only for the purpose of sharing data among them.

Peer-to-peer networks are basically classified into two types. They are:

Structured peer-to-peer networks
Unstructured peer-to-peer networks

2.4 STRUCTURED PEER-TO-PEER NETWORKS

In structured peer-to-peer networks the connections between the nodes are fixed. They use a distributed hash table (DHT) for indexing [4]. In a DHT, data is stored in the form of a hash table of (key, value) pairs.
Any node willing to retrieve the data can easily do so using the keys. The mapping of keys to values is maintained by all the nodes present in the network in such a way that a change in the set of participants causes very little disruption. DHT-based networks are very efficient in retrieving resources.

2.5 UNSTRUCTURED PEER-TO-PEER NETWORKS

In an unstructured p2p network the connections between nodes are established arbitrarily. There are three types of unstructured p2p networks. They are:

Pure peer-to-peer
Hybrid peer-to-peer
Centralized peer-to-peer

In pure p2p networks all the nodes in the network are equal. There is no preferred node with a special infrastructure function.

In hybrid p2p networks there is a special kind of node called a supernode [3]. The supernode can be any node in the network, depending on the momentary need of the network.

A centralized p2p network is a type of hybrid network in which one central system manages the network. The network cannot work without this centralized system.

Basically, every node in a peer-to-peer network keeps information about its neighbours in its routing table. The rate of propagation of worms in peer-to-peer networks is higher than in other networks, because the information about the neighbouring peers can easily be obtained from the routing table of the infected node.

Different types of files are shared between the nodes in peer-to-peer networks. These can be audio files, video files, music files, text documents, books, articles etc. A lot of peer-to-peer software is available these days for sharing files. Some examples are BitTorrent, LimeWire, Shareaza, Kazaa, iMesh, BearShare Lite, eMule, KCeasy, Ares Galaxy, Soulseek, WinMX, Piolet, Gnutella, Overnet, Azureus (Vuze), FrostWire, uTorrent, Morpheus, ANts and Acquisition [5].
There is a lot more file sharing software on the market, but these are the top 20 file sharing programs for peer-to-peer networks.

Basically, all the nodes connected together in the network should be configured with the same network protocol, and the same software should be installed on all the nodes, in order for them to communicate with each other. Nodes configured with different software or a different protocol cannot communicate.

2.6 ADVANTAGES OF PEER-TO-PEER NETWORKS [6]

It is most useful for a small business network comprising a very small number of computer systems or devices.
Computers in this network can be configured easily.
A full-time network administrator is not required for p2p networks.
Easy maintenance of the network.
Only a single operating system and fewer cables are needed to get connected.
Can be installed easily.
Users can control the shared resources.
The distributed nature of the network increases its robustness.

2.7 DISADVANTAGES OF THE PEER-TO-PEER NETWORKS [12]

No centralised administration.
Back-up has to be performed on each computer individually.
Peer-to-peer networks are not secure.
Every computer in the network behaves as both server and client, which can slow down the performance of the system.
Legal controversy over copyright.

2.8 WORM

A worm is a computer malware program, or mischievous code, which can duplicate itself into several copies. A worm can simply be called an autonomous intrusion agent [19]. Unlike a virus, it does not actually alter the function of the system but passes through it. It intrudes into the network without the mediation of the user.

The first widely known worm was released by Robert T. Morris in 1988 [18]. Today we have billions of systems connected to the internet, but in 1988 there were only 60,000 systems connected to it.
During that period 10% of the internet systems, i.e. 6,000 systems, were infected and almost clogged because of the worm [8].

When a worm enters a system it hides in the operating system, where it cannot be noticed [18]. It drastically slows down the system and affects the other programs on it. In the worst cases it could even affect the entire network and slow down the internet across the whole world.

As said earlier, a worm replicates itself into multiple copies. It can attach itself to emails and corrupt them, and sometimes it deletes files without any user interaction. If it enters our email, it is able to send itself to all the contacts in our address book, and then to all the contacts in their address books, and in this way it propagates, grows and spreads at a high rate.

Worms can even create a backdoor into the computer [11]. This makes it easy for attackers to send spam.

Some famous worms discovered in 2003 and 2004 are Mydoom, Sobig and Sasser [7]. The Sasser worm has recently affected computers running the Windows 2000 or Windows XP operating systems. It restarts the system automatically and crashes it, and it spreads to all the nodes in the network.

There are some worms which are unlike the normal worms. These worms are sometimes very useful to the user, and hence they are called helpful worms [9]. Sometimes they help users without any interaction with the user. But most of the known worms are harmful and will invariably try to infect the nodes in the network and affect its performance.

When peer-to-peer networks are attacked by worms, the efficiency of the network falls. So there is a need to prevent worms from entering the network and spreading all over it. The worms should be detected and defended against. If we delay in defending against these worms, they replicate, make many copies of themselves and spread all through the network.
This is very dangerous to the network as it affects the performance and efficiency of the network [10].

CHAPTER 3
RELEVANT WORK DONE BY OTHERS IN ORDER TO SOLVE THE PROBLEM

Many people have proposed solutions to this problem. Zhou L first gave a solution to the p2p worm and observed that the propagation of a worm in a p2p network is very fast compared to other networks [13]. Jayanthkumar performed some simulations on worm propagation from an infected node to other nodes [10]. Wei Yu researched the behaviour of worms in p2p networks [14]. In my research I found one more interesting method of detecting worms in a peer-to-peer network. The authors Yu Yao, Yong Li, Fu-xiang Gao and Ge Yu, in their paper titled A Signature-behaviour-based P2P worm detection approach, proposed a mechanism for detecting known worms in peer-to-peer networks based on characteristic string matching. A worm makes use of vulnerabilities in the network and spreads [15]. They also proposed a detection mechanism for unknown worms based on their behaviour. Their technique mainly consists of characteristic string matching, identification of the application, and unknown worm detection. For matching the characteristic string of the worm they give the suffix-array algorithm, which is derived from the suffix-tree algorithm. It is efficient and simple, with very low time complexity. As a peer-to-peer network follows a fragment transfer technique, there is a chance that the characteristic string of the worm is split across different blocks of data. During the reorganisation process this characteristic string can again be used to identify the worm. The authors also validated their results by simulation.
They showed that their method is one of the efficient methods of p2p worm detection.

As mentioned above, this method detects known worms based on characteristic string matching and unknown worms based on their behaviour. Initially the network packets are captured using a function library called LibPcap. LibPcap is a library that captures network packets on UNIX and Linux platforms, and it contains many functions that are useful for this purpose. After the data packets have been captured with the help of these functions, the non-P2P packets are filtered out, so that only the P2P packets remain. In these P2P packets the known worms are detected using characteristic string matching. This is implemented by a couple of algorithms: the suffix-array algorithm and the dichotomy algorithm. These algorithms are very accurate and are capable of detecting worms in very little time.

As mentioned above, peer-to-peer networks follow a fragment transfer mechanism, so the characteristic string of the worm can be split across different blocks of data. In this situation it is difficult to detect the worm if matching is done on a single packet. If the characteristic string is present in a block, it may be divided between two packets, and the parts of the characteristic string present in the two data packets then need to be restructured. After restructuring, the worm can be detected using the matching mechanism. In this way a known worm in the network is detected by characteristic string matching.

Unknown worms in the p2p network can be detected with the help of the behavioural characteristics of the worm at the initial stage of its propagation. This can be called behaviour-based detection of unknown p2p worms.
In this way all the known and unknown worms in the network are detected.

3.1 P2P KNOWN WORM DETECTION

There are four steps in detecting known p2p worms. They are:

Deal flow
Technology of identifying the application
Characteristic string matching
Reorganising the characteristic string

3.1.1 DEAL FLOW

In the deal flow, the flow of data is processed in four steps [16].

Step 1: Extract the p2p data stream from the original data stream.
Step 2: Check the extracted p2p data stream for worms using characteristic string matching against the worms already existing in the feature library.
Step 3: The data flow is reorganised. It now contains the worm characteristic string as well. Go to step 2.
Step 4: Check the data flow for unknown worms using unknown worm detection techniques.

After performing the four steps, update the feature library. The four steps are represented pictorially in Figure 4.

Figure 4: Flow chart representing the four steps to detect worms

3.1.2 TECHNOLOGY OF IDENTIFYING THE APPLICATION

As said earlier, this paper captures the data packets and scans them for known worms with the help of a function library called LibPcap [17]. For this, filtering rules must already be assigned to the network interface devices. Assigning these rules to the devices is done in a stepwise procedure:

1. Identify the available network interface devices
2. Open the network interface device
3. Compile the rules that are to be attached to the device
4. Set up the filtering rules on the device
5. Operate the equipment
6. Start the process of capturing the packets

There are some rules for identifying the p2p application.
They are:

(1) Characteristic information of the known p2p applications is used.
(2) If source-destination IP pairs do not use a known P2P application but use TCP and UDP at the same time, then they are p2p.
(3) At a particular time the source pairs {srcIP, srcport} and the destination pairs {dstIP, dstport} are checked [27]. Here we can identify whether the traffic is p2p or not: if the number of connection ports is equal to the number of connection IPs, then we can say that it is p2p.

There are situations where these rules were not applied properly, so some amendments were made to them. With the amendments, rule (2) can identify even the mazes which are present, and rule (3) is modified in such a way that within the detect cycle the {srcIP, srcport} pairs at the source and the {dstIP, dstport} pairs at the destination are checked [27]. From this the authors derived that if the number of connection ports is equal to the number of connection IPs, the protocols in use are the same. If the numbers are different, then the protocols are different.

3.1.3 CHARACTERISTIC STRING MATCHING

This is the most important section of the paper. Here the authors define the terms and describe the algorithms used to detect the worm. A couple of algorithms are mentioned: the suffix-array algorithm and the dichotomy (binary search) algorithm. The entire process of detecting the worm depends on the efficiency and accuracy of these algorithms.

Before using the suffix-array algorithm we need to understand some keywords and rules.

Suffix: a suffix is the part of a string that runs from a particular location to the end of the string. If a suffix of the string S starts at location i, it can be represented as $\mathrm{Suffix}(i) = S[i..\mathrm{Len}(S)]$ [27].

Let us understand how strings are compared. The comparison in this paper follows dictionary order. Let u and v be two different strings.
Comparing the strings u and v means comparing $u_i$ and $v_i$ character by character, where i starts with the value 1:

If $u_i = v_i$, the comparison continues with the next value of i.
If $u_i > v_i$, string u is greater than string v, i.e. $u > v$.
If $u_i < v_i$, string u is less than string v, i.e. $u < v$.

If no result has been obtained by the time $i > \mathrm{len}(u)$ or $i > \mathrm{len}(v)$, the lengths decide: if $\mathrm{len}(u) > \mathrm{len}(v)$ then $u > v$, if $\mathrm{len}(u) < \mathrm{len}(v)$ then $u < v$, and if the lengths are equal then $u = v$.

Suffix-array: the suffix array is denoted by SA. It is a one-dimensional array SA[1], SA[2], SA[3], and so on. Here $\mathrm{Suffix}(SA[i]) < \mathrm{Suffix}(SA[i+1])$, that is, SA holds the starting positions of the suffixes in ascending order.

Rank-array: the rank array is nothing but $SA^{-1}$. If $SA[i] = j$, then $Rank[j] = i$. We can say that Rank[i] saves the rank of Suffix(i) in ascending order among all the suffixes.

In this paper the authors take the string science as an example and explain everything clearly. The string science generates seven suffixes. They are:

Suffix(1) = science
Suffix(2) = cience
Suffix(3) = ience
Suffix(4) = ence
Suffix(5) = nce
Suffix(6) = ce
Suffix(7) = e

When we sort them in dictionary order, the order is as follows:

Suffix(6) = ce
Suffix(2) = cience
Suffix(7) = e
Suffix(4) = ence
Suffix(3) = ience
Suffix(5) = nce
Suffix(1) = science

The suffix-array algorithm follows the idea of doubling. First $SA_1$ and $Rank_1$ are obtained by comparing the characters of the string; comparing strings amounts to comparing their characters sequentially. $SA_1$ and $Rank_1$ are then used to derive $SA_2$ and $Rank_2$, which derive $SA_4$ and $Rank_4$, which in turn derive $SA_8$ and $Rank_8$. The suffix array and rank array are finally derived from this process. The main process of the suffix-array algorithm is:

Calculating $SA_1$ and $Rank_1$. First all the suffixes are arranged in order of their first letter, then the suffix array $SA_1$ is generated using the quick sort algorithm, and then $Rank_1$ is generated as well.
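The definitions above can be made concrete with a short sketch. It builds the suffix array and rank array for the string science by prefix doubling and, anticipating the matching step, looks up a pattern with a binary search over the sorted suffixes. This is only an illustration under stated assumptions: the function names build_suffix_array and find are hypothetical, positions are 0-based (the text counts from 1), Python's built-in sort stands in for quick sort, and the search does not track the longest matched prefix.

```python
def build_suffix_array(s):
    """Return (sa, rank) for s using prefix doubling; positions are 0-based."""
    n = len(s)
    sa = list(range(n))
    rank = [ord(c) for c in s]  # order by first character only
    k = 1
    while True:
        # Order suffixes by their first 2k characters, using the rank of the
        # first k characters and the rank of the k characters that follow.
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            # Equal keys share a rank; a larger key gets the next rank.
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (key(sa[j]) != key(sa[j - 1]))
        rank = new_rank
        if n == 0 or rank[sa[-1]] == n - 1:  # all ranks are distinct
            break
        k *= 2
    return sa, rank


def find(text, sa, pattern):
    """Return a 0-based position where pattern occurs in text, or -1."""
    left, right = 0, len(sa) - 1
    m = len(pattern)
    while left <= right:
        mid = (left + right) // 2
        prefix = text[sa[mid]:sa[mid] + m]
        if prefix == pattern:
            return sa[mid]
        if prefix < pattern:
            left = mid + 1
        else:
            right = mid - 1
    return -1


sa, rank = build_suffix_array("science")
print([i + 1 for i in sa])         # 1-based, comparable with the sorted list above
print(find("science", sa, "ien"))  # 0-based start of the match
```

Printing the suffix array in 1-based form makes it directly comparable with the sorted list of suffixes of science given in the text.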
Comparing the 2k-prefixes of two suffixes using $SA_k$ and $Rank_k$. For the suffixes starting at $SA_k[i]$ and $SA_k[j]$, the 2k-prefixes are equal exactly when $Rank_k[SA_k[i]] = Rank_k[SA_k[j]]$ and $Rank_k[SA_k[i]+k] = Rank_k[SA_k[j]+k]$. The first 2k-prefix is smaller when $Rank_k[SA_k[i]] < Rank_k[SA_k[j]]$, or when $Rank_k[SA_k[i]] = Rank_k[SA_k[j]]$ and $Rank_k[SA_k[i]+k] < Rank_k[SA_k[j]+k]$.

The suffix-array algorithm is a sorting algorithm which sorts the suffixes of the string. The characteristic string P, of length m, can therefore be looked up among the n sorted suffixes with a binary search. The algorithm is as follows:

Step 1: Assign the values left = 1, right = n and max_match = 0.
Step 2: Compute the middle value, mid = (left + right)/2.
Step 3: Compare the characters of Suffix(SA[mid]) and P. The length r of their longest common prefix is helpful in the implementation and the comparison. If r > max_match, then max_match = r.
Step 4: If Suffix(SA[mid]) < P, then left = mid + 1. If Suffix(SA[mid]) > P, then right = mid - 1. If Suffix(SA[mid]) = P, then go to step 6.
Step 5: If left <= right, go to step 2.
Step 6: If max_match = m, then print that the match is successful.

3.1.4 REORGANISING THE CHARACTERISTIC STRING

In this step the characteristic string is reorganised. If the characteristic string is divided between two different data blocks, then the data block with the partial characteristic string is stored. Basically, all the information about a data block, such as its index, beginning offset, length and so on, is contained at the head of the block. Here a structure called piece is defined, which consists of the index of the block, the beginning offset of the block (offset), the length of the character array head and the length of the character array end [18]. Initially each and every data packet is compared with the characteristic string. If it matches, a warning or alert about the worm is sent to all the users. If the tail of the characteristic string of the worm matches the head of the data block, then it is stored in the character array end. And if the head of the characteristic string of the worm matches the tail of the data block, then it is stored in the corresponding character array head.
If the neighbouring data block also contains a partial characteristic string of the worm, then the neighbouring strings in the arrays head and end are reorganised. Characteristic string matching is then performed again on the reorganised string, and if a worm is detected a warning is again sent to all users saying that the worm has been found. If there is no match, no operation is performed.

For the case in which the characteristic string is present in the block but is divided between two different data packets, a special term called the character array is introduced. First the matching mechanism is applied to both data packets. If the complete characteristic string is found, a warning is sent to the users that a worm is present. If only part of the characteristic string is found, it is enough if one of the following requirements is met: the head of the data packet matches the tail of the characteristic string, or the tail of the data packet matches the head of the characteristic string. If neither condition is satisfied, no operation is performed. If the tail of the data packet contains the partial characteristic string, then the data packet is stored in the array. If the length of the characteristic string is m, then the Arraym is set as . And if the head of the data packet contains a part of the characteristic string, then that data packet is stored in the n consecutive units of the array. Finally, characteristic string matching is performed on this array, and if the worm is detected a warning is sent to all the users. If it is not matched then nothing is done.

3.2 DETECTING UNKNOWN P2P WORM

In the above section we have seen how a known worm is detected. But that mechanism is not meant to detect unknown p2p worms. In this section we will see how unknown worms can be detected and the network protected.
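The reorganisation step of section 3.1.4 can be illustrated with a simplified sketch. Instead of the paper's piece structure with separate head and end arrays, it keeps only the last m - 1 bytes of the data seen so far and prepends them to the next block before matching, which is enough to catch a characteristic string split across consecutive blocks. The class name StreamMatcher, the method feed, and the assumption that blocks arrive in index order are all hypothetical choices made for this illustration, and Python's built-in substring search stands in for the suffix-array matching.

```python
class StreamMatcher:
    """Find a signature even when it is split across consecutive blocks."""

    def __init__(self, signature: bytes):
        self.signature = signature
        self.tail = b""  # at most len(signature) - 1 trailing bytes seen so far

    def feed(self, block: bytes) -> bool:
        """Return True if an occurrence of the signature ends inside this block."""
        # Join the stored tail with the new block so that a signature cut at
        # the block boundary becomes contiguous again.
        window = self.tail + block
        found = self.signature in window
        keep = len(self.signature) - 1
        # The tail is shorter than the signature, so a match already reported
        # can never be reported a second time.
        self.tail = window[-keep:] if keep > 0 else b""
        return found


matcher = StreamMatcher(b"WORMSIG")
for block in (b"xxxxWOR", b"MSIGyyyy", b"clean data"):
    if matcher.feed(block):
        print("warning: worm characteristic string found")
```

A real P2P transfer may deliver blocks out of order, which is why the paper stores the index and offset of each block; this sketch leaves that bookkeeping out.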
As we know, in p2p networks a node is able to send information to multiple hosts at the same time, and the same protocol is used by all the nodes in the network [27]. These characteristics of the network help a worm to propagate easily. As discussed above, only known worms can be detected using the characteristic string matching method. Unknown worms are detected based on the behaviour of the node. Some of the detection rules are:

Files with the same content are transferred to multiple hosts in a very short time.
The same protocol is used and the destination port is the same.

If a source port satisfies these rules, it is taken to be propagating a p2p worm. It is then necessary to extract the characteristics of the worm near the worm propagation nodes. When these characteristics are extracted, they are added to the feature library. The data similarity comparison and the extraction of the characteristics are done using the LCSeq algorithm; the LCSeq algorithm based on the generalized suffix tree (GST) is the more efficient variant. The overall idea is that all the suffixes are represented as a tree, and this tree has the following characteristics:

Every node in the tree is a string, and the root is the empty string.
Every suffix can be represented as a path from the root.
Every substring can be considered as a prefix of a suffix.

To search for the common subsequence, every node must record which source string it belongs to.

3.3 EXPERIMENT

We know that the worm body tries to infect the other nodes in the network by sending the worm to specific ports of a p2p node. The authors tried to prove the efficiency of their method by performing an experiment. In this experiment they prepared multiple groups of worm bodies and sent them repeatedly at regular intervals of time.
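To illustrate how characteristics might be extracted from worm bodies that are sent repeatedly, the sketch below finds a string shared by several captured payloads. It is not the GST-based LCSeq algorithm of the paper: a plain dynamic-programming longest common substring is swapped in because it is short and easy to check, at the cost of quadratic time per pair. The function names and the sample payloads are hypothetical, and folding the payloads pairwise yields a string common to all of them, though not necessarily the longest one.

```python
def longest_common_substring(a: bytes, b: bytes) -> bytes:
    """Return one longest contiguous byte string present in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def common_signature(payloads):
    """Fold the payloads pairwise into a string shared by all of them.

    Expects at least one payload.
    """
    signature = payloads[0]
    for payload in payloads[1:]:
        signature = longest_common_substring(signature, payload)
    return signature


captured = [
    b"GET /a WORMBODY123 x",
    b"zzWORMBODY123",
    b"hdr WORMBODY123 tail!",
]
print(common_signature(captured))
```

A string extracted this way could then be added to the feature library as a candidate characteristic string, subject to whatever validation the detection system applies.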
They then captured these packets, extracted their characteristics and compared them with those that already exist in the feature library.

The p2p worm was detected separately using different algorithms, namely the BF algorithm, the KMP algorithm and the suffix-array algorithm, and the results were compared across three experiments. In experiment 1, the worm characteristics are in the same packet. in the experiment