Limo, Chepchumba Soti, author
Jayasumana, Anura P., advisor
Yang, Liuqing, committee member
Papadopoulos, Christos, committee member
2015-08-28
2015-08-28
2015
http://hdl.handle.net/10217/167211

Distributed Hash Tables (DHTs) are commonly used in large Peer-to-Peer (P2P) networks to increase the efficiency of resolving queries. Minimizing the resource discovery time in P2P networks is highly desirable to improve system-wide performance. Distributed caching is an approach used to reduce the lookup time. File sharing P2P networks have shown that there exist nodes/users who share similar interests based on semantics, geography, etc., and a group of nodes that share similar interests is said to form a community. A Community Based Caching (CBC) algorithm, in which nodes make caching decisions based on personal interests, is investigated. One of CBC's major contributions is that it alleviates the issue of nodes being limited to caching resources that are popular relative to the entire network. Instead, caching decisions are primarily based on a node's community affiliations and interests. Community discovery algorithms that currently exist either need one or more centralized sources to aid in community discovery or require additional messaging and complicated computations to determine whether to join a group. In many cases, nodes are also limited to being members of only one community at a time. A dynamic and decentralized community discovery algorithm, Dynamic Group Discovery (DGD), is proposed. DGD also allows nodes to be members of multiple communities at the same time. DGD's behavior and performance are then evaluated in conjunction with the Community Based Caching algorithm. To aid in group discovery during run time (i.e., dynamically), DGD uses special keys with embedded group identification information. OverSim, a flexible overlay network simulation framework, is used to evaluate the proposed DGD algorithm.
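
The idea of keys with embedded group identification can be illustrated with a minimal sketch. The abstract does not give the actual key layout or the joining rule, so everything here is an assumption made for illustration: the 8-bit group prefix on a 160-bit Chord-style key, the names `make_key`, `group_of`, and `Node`, and the rule that a node joins a community after a threshold number of its lookups target that group.

```python
import hashlib
from collections import Counter

# Hypothetical layout: a 160-bit key whose top 8 bits hold the group ID.
GROUP_BITS = 8
HASH_BITS = 160 - GROUP_BITS


def make_key(group_id: int, resource_name: str) -> int:
    """Build a key that embeds group_id in its high-order bits (assumed format)."""
    if not 0 <= group_id < (1 << GROUP_BITS):
        raise ValueError("group_id out of range")
    digest = hashlib.sha1(resource_name.encode("utf-8")).digest()
    # Drop the lowest GROUP_BITS bits of the hash to make room for the prefix.
    low = int.from_bytes(digest, "big") >> GROUP_BITS
    return (group_id << HASH_BITS) | low


def group_of(key: int) -> int:
    """Recover the embedded group ID from a key."""
    return key >> HASH_BITS


class Node:
    """Toy node that discovers communities from the keys it looks up."""

    def __init__(self, join_threshold: int = 3):
        self.join_threshold = join_threshold  # assumed joining rule
        self.seen = Counter()   # lookups observed per group ID
        self.groups = set()     # a node may belong to several communities
        self.cache = {}         # key -> cached record

    def observe_lookup(self, key: int, record: str) -> None:
        gid = group_of(key)
        self.seen[gid] += 1
        if self.seen[gid] >= self.join_threshold:
            self.groups.add(gid)
        # Cache only records that belong to a community the node has joined.
        if gid in self.groups:
            self.cache[key] = record
```

Because the group ID travels inside the key, a node in this sketch needs no central directory and no extra messages to decide which communities it belongs to; it only inspects keys it already handles.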
Performance of DGD is compared to Chord and Static Group Allocation (SGA), in which group identification is done only once. Performance is evaluated for different network sizes, community sizes, and asymmetry among communities. Performance results are presented and analyzed when queries are resolved using cache data versus when queries are resolved using non-cache data. The analysis shows that DGD generally improves lookup performance when cache data is used to resolve queries. However, when non-cache data is used, DGD occasionally performs slightly worse than Chord and SGA. For example, in a network with 10,000 nodes, asymmetrical communities, and no group churn, DGD outperforms Chord by approximately half a hop and 0.1 seconds in latency. When churn is introduced to the same network, DGD performance drops by approximately one hop and 0.15 seconds in latency. The results also show that approximately 90% of the queries are resolved using non-cache data; therefore, even though DGD is guaranteed to reduce lookup time when asymmetrical communities are present and cache records are used to resolve queries, the reduction is often not enough to significantly improve overall system performance. The results do, however, confirm that caching resources based on personal interests reduces lookup time when queries are resolved using cache records.

born digital
masters theses
eng
Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
communities
caching
P2P networks
Decentralized and dynamic community formation in P2P networks and performance of community based caching
Text