Relating Query Popularity and File Replication in the Gnutella Peer-to-Peer Network
Authors: O. Waldhorst, A. Klemm, C. Lindemann
Source: Proc. 12th GI/ITG Conf. on Measuring, Modelling and Evaluation of Computer and Communication Systems (MMB), pp. 305-314, Dresden, Germany, September 2004
In this paper, we characterize user behavior in a peer-to-peer (P2P) file sharing network. Our characterization is based on the results of an extensive passive measurement study of the messages exchanged in the Gnutella P2P file sharing system. Using the data recorded during this measurement study, we analyze which queries a user issues and which files a user shares. The investigation of users' queries leads to a characterization of query popularity, and the analysis of the files shared by users leads to a characterization of file replication. As our major contribution, we relate query popularity and file replication by an analytical formula characterizing the matching of files to queries. The formula defines a matching probability for each pair of query and file, which depends on the rank of the query with respect to query popularity but is independent of the rank of the file with respect to file replication. We validate this model by conducting a detailed simulation study of a Gnutella-style overlay network and comparing the simulation results to the results obtained from the measurement.
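The structural property of the matching model described above can be illustrated with a minimal Python sketch. The abstract does not state the formula itself, so the power-law form, the parameters `scale` and `exponent`, and the function names below are all hypothetical assumptions chosen for illustration only. The one property taken from the abstract is that the matching probability depends on the query's popularity rank and not on the file's replication rank.

```python
def match_probability(query_rank, file_rank, scale=0.05, exponent=0.5):
    """Probability that a given file matches a given query.

    ASSUMPTION: the power-law decay in query_rank and the default
    parameter values are illustrative placeholders, not the formula
    from the paper. file_rank is accepted but deliberately unused,
    mirroring the independence from file replication rank.
    """
    if query_rank < 1 or file_rank < 1:
        raise ValueError("ranks start at 1")
    return min(1.0, scale / query_rank ** exponent)


def expected_matches(query_rank, num_files):
    """Expected number of distinct files matching a query.

    Because the probability is the same for every file, the sum over
    all files reduces to num_files times that probability.
    """
    return num_files * match_probability(query_rank, file_rank=1)


if __name__ == "__main__":
    for rank in (1, 10, 100):
        print(rank, match_probability(rank, file_rank=1),
              expected_matches(rank, num_files=1000))
```

Under these assumed parameters, a more popular query (lower rank) matches more files in expectation, while swapping in a different file rank leaves the probability unchanged.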