Fitting mixtures of exponentials to long-tail distributions to analyze network performance models

https://doi.org/10.1016/S0166-5316(97)00003-5Get rights and content

Abstract

Traffic measurements from communication networks have shown that many quantities charecterizing network performance have long-tail probability distributions, i.e., with tails that decay more slowly than exponentially. File lengths, call holding times, scene lengths in MPEG video streams, and intervals between connection requests in Internet traffic all have been found to have long-tail distributions, being well described by distributions such as the Pareto and Weibull. It is known that long-tail distributions can have a dramatic effect upon performance, e.g., long-tail service-time distributions cause long-tail waiting-time distributions in queues, but it is often difficult to describe this effect in detail, because performance models with component long-tail distributions tend to be difficult to analyze. We address this problem by developing an algorithm for approximating a long-tail distribution by a hyperexponential distribution (a finite mixture of exponentials). We first prove that, in prinicple, it is possible to approximate distributions from a large class, including the Pareto and Weibull distributions, arbitrarily closely by hyperexponential distributions. Then we develop a specific fitting alogrithm. Our fitting algorithm is recursive over time scales, starting with the largest time scale. At each stage, an exponential component is fit in the largest remaining time scale and then the fitted exponential component is subtracted from the distribution. Even though a mixture of exponentials has an exponential tail, it can match a long-tail distribution in the regions of primary interest when there are enough exponential components. When a good fit is achieved, the approximating hyperexponential distribution inherits many of the difficulties of the original long-tail distribution; e.g., it is still difficult to obtain reliable estimates from simulation experiments. However, some difficulties are avoided; e.g., it is possible to solve some queueing models that could not be solved before. We give examples showing that the fitting procedure is effective, both for directly matching a long-tail distribution and for predicting the performance in a queueing model with a long-tail service-time distribution.

References (59)

  • S. Asmussen et al.

    Large claims approximations for risk processes in a Markovian environment

    Stochastic Process Appl.

    (1994)
  • G.L. Choudhury et al.

    Long-tail buffer-content distributions in broadband networks

    Performance Evaluation

    (1997)
  • J. Abate et al.

    Waiting-time tail probabilities in queues with long-tail service-time distributions

    Queueing Systems

    (1994)
  • J. Abate et al.

    The Fourier-series method for inverting transforms of probability distributions

    Queueing Systems

    (1992)
  • J. Abate et al.

    An operational calculus for probability distributions via Laplace transforms

    Adv. Appl. Probab.

    (1996)
  • A.T. Andersen et al.

    Modelling and performance study of packet-traffic with self-similar characteristics over several time-scales with Markovian arrival processes (MAP)

  • S. Asmussen
  • S. Asmussen et al.

    Marked point processes as limits of Markovian arrival streams

    J. Appl. Probab.

    (1993)
  • S. Asmussen et al.

    Fitting phase type distributions via the EM algorithm

    Scand. J. Statist.

    (1996)
  • R.E. Barlow et al.
  • S.N. Bernstein

    Sur les fonctions absolument montones

    Acta Math.

    (1928)
  • P. Billingsley
  • V.A. Bolotin

    Modelling call holding time distributions for CCS network design and performance analysis

    IEEE J. Sel. Areas Commun.

    (1994)
  • A.A. Borovkov
  • R. Cáceres et al.

    Characteristics of wide-are TCP/IP conversations

    Comput. Commun. Rev.

    (1991)
  • G.L. Choudhury et al.

    Squeezing the most out of ATM

    IEEE Trans. Commun.

    (1996)
  • J.W. Cohen

    Some results on regular variation for distributions in queueing and fluctuation theory

    J. Appl. Probab.

    (1973)
  • M.E. Crovella et al.

    Self-similarity in World Wide Web traffic — evidence and possible causes

  • N.G. Duffield, Economies of scale in queues with sources having power-law large deviations scalings, J. Appl. Probab.,...
  • N.G. Duffield et al.

    Large deviations and overflow probabilities for the general single-server queue, with applications

  • D.E. Duffy et al.

    Statistical analysis of CCSN/SST traffic data from working CCS subnetworks

    IEEE J. Sel. Areas Commun.

    (1994)
  • A. Feldmann

    On-line call admission for high-speed networks

  • A. Feldmann

    Modeling characteristics of TCP connections

    (1996)
  • W. Feller
  • D.P. Gaver et al.

    Waiting times when service times are stable laws: tamed and wild

    (1995)
  • C.M. Harris

    The Pareto distribution as a queue service distribution

    Oper. Res.

    (1968)
  • M.R. Izquierdo et al.

    Statistical characterization of MPEG VBR video at the slice layer

  • R. Jain et al.

    Packet trains: measurements and a new model for computer network traffic

    IEEE J. Sel. Areas Commun.

    (1986)
  • P.R. Jelenković and A.A. Lazar, Subexponential asymptotics of a Markov-modulated G/G/1 queue, J. Appl. Probab., to...
  • Cited by (327)

    • Adaptation to climate change: Extreme events versus gradual changes

      2021, Journal of Economic Dynamics and Control
    View all citing articles on Scopus

    An abbreviated version of this paper has been presented at IEEE INFOCOM'97, Kobe, Japan, April 1997.

    View full text