|Author:||O. Waldhorst, A. Klemm, C. Lindemann, M. Vernon||links:||DownloadBibtex|
|Source:||Proc. ACM Internet Measurement Conference (IMC), pp. 55-67, Taormina, Italy, October 2004|
This paper characterizes the query behavior of peers in a peer-topeer (P2P) file sharing system. In contrast to previous work, which provides various aggregate workload statistics, we characterize peer behavior in a form that can be used for constructing representative synthetic workloads for evaluating new P2P system designs. In particular, the analysis exposes heterogeneous behavior that occurs on different days, in different geographical regions (i.e., Asia, Europe, and North America) or during different periods of the day. The workload measures include the fraction of connected sessions that are passive (i.e., issue no queries), the duration of such sessions, and for each active session, the number of queries issued, time until first query, query interarrival time, time after last query, and distribution of query popularity. Moreover, the key correlations in these workload measures are captured in the form of conditional distributions, such that the correlations can be accurately reproduced in a synthetic workload. The characterization is based on trace data gathered in the Gnutella P2P system over a period of 40 days. To characterize system-independent user behavior, we eliminate queries that are specific to the Gnutella system software, such as re-queries that are automatically issued by some client implementations to improve system responsiveness.