[ Home ] [ Table of Contents ] [ About Lee Goeller ] [ Search ]

Goeller on Telecom Traffic

A First Course in
Telephone Traffic Engineering

Chapter 2: The Binomial and Poisson Distributions.
Lost Calls Held

Some History

Back before the World War I, E. C. Molina at AT&T and other telephone engineers in America and Europe were developing mathematical models to represent the behavior of telephone users. During the course of Molina's work, he showed that the binomial distribution was a good model and, further, that when he wished to deal with very large numbers of callers (say, more than 100 or so), it could be simplified to another probability distribution that was easier to use (1908).

The new distribution was quite nice, and everybody applauded. However, somebody with a background in math and an eye for history noted that this very same distribution had been published in 1837, years earlier, by a French mathematician named Poisson. As the story goes, Molina taught himself French, read the work of Poisson, and saw that M. P had, indeed gotten there first. So Molina made sure that the distribution which he had derived independently was referred to by the proper name.

Throwdowns

We have all heard that a PBX should handle 6 CCS per extension. But what does this actually mean? Well, 6 CCS is 600 seconds or 10 minutes. Further, it is one sixth of an hour. Thus, we are assuming that, during the busy hour, each station user is on the phone for 10 minutes, more or less at random, and the occupancy of his line is about 17%. But the specification is not complete. We also have to specify a grade of service. Implied, but seldom actually stated, is a probability of less than one in one hundred of not finding a path through the switching matrix, the part of the PBX that connects lines to each other or to trunks. Thus the entire specification should read: During the busy hour, the switching matrix shall have a probability of blocking of less than B.01 when each station user is generating 6 CCS of traffic.

This does not say that we have a probability of 0.99 of reaching another extension when we call at random; by definition, any particular extension has only an 83% probability of being idle when we call it. Further, we don't have a 99% chance of dialing 9 and getting outside; if we engineer our CO trunk group for B.05 service, we have only a 95% chance of finding at least one trunk free. What we are talking about is the probability of a path through the matrix being available when the caller wants his connection established. But even if the matrix path is available, the called matrix terminal may well be busy on another call.

But let's look at a much simpler problem. Let's assume that we are told that each extension, whenever we take a random look at our system, has a probability of 1/6 of being in use. Suppose we have 60 extensions. How many, on the average, will be off hook, and how will that number actually vary with time?

The nice thing about 6 CCS is that it matches an ordinary die of the type used in gambling, board games, and the like. There is a probability of one in six that any given one of the six sides of the die will be up after we roll. Thus we could get 60 dice, roll them, and count the number of ones that turn up. We could roll them again and again, and thus simulate the behavior of our 60-line PBX. Each roll corresponds to one look at the system during the hour, and the number of ones corresponds to the number of phones off hook.

Today, of course, we do simulations on a computer. But the telephone industry is almost 100 years older than the practical business computer, and the telephone people couldn't wait. They used dice, and performed "throwdowns" by throwing down the dice and observing the results. In the telephone business, many people still call simulations throwdowns.

Obviously, if we have 60 dice, the most likely number of ones (or any other face) coming up is is one sixth of 60, or 10. But we will not get ten every time we throw. We will sometimes get more, sometimes less, but, after a very large number of throwdowns, the average number of ones per throw will be quite close to 10.

Simulations of this sort are fun. But, before you rush down to your neighborhood toy store and buy all the dice in stock, let's do things the easy way. Fire up your computer and load from the disk that comes with this book the program called SIMU-1. When your computer tells you that SIMU-1 is ready to go, tell it to run the program. The program will now ask you for two things: the number of lines on your PBX, and a random number. For now, let's use a 60 line PBX; the random number can be any number you like, such as 5 or 27 or 623. What it does is start a random number generator that is part of your BASIC system. BASIC will then generate a random number between 0 and 1 and, if it is less than 0.166666, it will assume a given die came up one. If it is greater than 0.166666, it will assume some other number came up. It will do this for each line on the PBX, count the number of ones, and give you a display.

After 18 looks, corresponding to 18 throwdowns of 60 dice, you will have a display on your screen showing how many lines were off hook each time, and a summary. If you then tell the program to continue, it will run again, keeping score as it goes. Note that it keeps track of the average number of lines off hook in each run, and the percent off hook. Further, it accumulates this information over all runs to date; the latter should quickly approach the theoretical values. Your screen should look like Printout 2-1 in Appendix IV, except for the difference in output between your random number generator and mine.

One more item is kept. On each run, the number of lines off hook is counted. If that number is bigger than the biggest one seen so far, the old number becomes the ex-champ and the new number becomes the biggest. Thus, after continuing through ten or twelve screens full of data, we have a pretty good idea of the most lines off-hook we are likely to find.

But now, let us pause and think for a moment. How many paths through the PBX will we need? If we just have an intercom, with no trunks to the outside world, we will need half as many paths through the switch as the peak number of lines off-hook. Why? Because it takes two to tangle. We seldom use a telephone to talk to ourselves. Thus two people off hook are needed for every conversation. If we have trunks to the outside world, things change. In the absence of measured data, we often make the assumption that half the calls on a PBX are internal, and half are external. Thus, because two internal extensions are used on each internal call, but only one is used on an external call, the other end reaching a trunk rather than an extension, two thirds of our callers are participating in inside calls, while the remaining third are in touch with the world. For example, if our 60-line PBX showed 15 lines off hook maximum after ten or fifteen runs, we might assume 5 internal calls involving 10 callers and 5 outside calls involving the remaining 5.

There is another program called SIMU-2 that lets you use other occupancies. It will ask you for the average occupancy per extension, and then run as before. If you give it 0.16666, you will get the same results as with SIMU-1. But if you give it 10% occupancy (3.6 CCS per line) or 6% (2.16 CCS), the results will be different.

The Binomial Distribution

Running a simulator, like playing the new electronic slot machines at Atlantic City, is a lot of fun (and works very much the same way), but it takes time. Further, we don't actually know the grade of service implied by runs of such simple simulators as SIMU-1 and -2. We just know that in so many runs, we never see more than X lines off hook, maximum. There ought to be a better way...and there is. This is what Molina worked on. He, like several others, applied probability theory to the problem, and showed that the answer was the standard Binomial distribution. For those who are interested in the derivation, turn to Appendix 1. The Binomial gives the probability of exactly X lines being off hook (or exactly X ones turning up in a throwdown of 60 dice), and is the first step to a complete and useful solution.

The next step was to find the probability of blocking that a telephone call would encounter; this is, of course, something a little different from knowing the largest number of lines off-hook in a finite number of simulator runs. To appreciate the development, let us consider the particular hardware involved. In those days, the Automatic Electric Company had evolved Step by Step (SXS) switching to a high art for small central offices, and Bell was working on the Panel System for very large COs. But both worked pretty much the same way for our purposes here. A customer would pick up his phone and an automatic circuit, which remained in the connection for the duration of the call, would connect him to the means for accessing other lines in the system. Ultimately, the "automatic circuit" became the "line finder;" line finders were arranged in groups to serve 200 lines in SXS, and 400 lines in Panel. Each individual line finder was a special kind of switch with either 200 or 400 inputs, and a single output. The question was, how many line finders were needed in a group to serve either 200 or 400 lines?

Note that line finders only originated calls; completing to callED lines used connectors, and they were engineered separately. Second, there was only a finite number of callers served by the line finder group; for 200 SXS callers, it was obvious that more than 200 line finders would never be needed under any circumstances. But even with 6 CCS per line (3 originating, 3 terminating), heavy use for residential customers, the probability that all 200 would be originating calls at the same time was seen be be quite small.

What was done was to assume an infinite number of line finders, and then use the Binomial distribution to calculate the probability that X or more originating calls were in the system at any one time. Then, if the number of line finders were cut off at X, we would know the probability that all X were busy and any new call originations would be blocked.

Now, suppose we hoped that 15 line finders might be enough. What we had to calculate was the probability that 15 or 16 or 17 or 18...etc...up to 200 calls were being originated from our group of 200 lines at the same time. This would give us the probability that a new call, arriving at random, would find all the line finders occupied and would thus be blocked.

Let us digress for a moment and observe the actual operation of obsolete things like line finders. Suppose you pick up your phone and all the line finders in the group are busy. What happens? You wait until one becomes available; then, it comes and finds you and connects you to dial tone. Now, we have cleverly defined the traffic offered to the system to be the time the phone is off-hook, including the time waiting in a queue for a line finder to come free. Thus the time a call is in the system (the offered traffic per line) includes both the waiting time and the occupancy of the line finder. This is the measured off-hook time at the telephone set. Thus we can say that each call is in the system for one complete holding time; if it does not find a line finder immediately, it waits until one is available and then uses it for the remainder of a holding time. Molina called this the "lost calls held" assumption, and recognized it to be somewhat artificial.

So far, we have been sitting on the outside of the system, looking at the behavior of the callers. Assume we have a SXS system, with 200 lines in the line-finder group. Further, assume we have an average occupancy of originating traffic per line of 5% or 1.8 CCS per hour. We are now observing the entire group of 200 lines, each offering 0.05 erlangs of traffic; therefore the total offered traffic is (200 x 0.05) = 10 erlangs. With this offered traffic, what is the probability that originating calls will be blocked when we have 15 line finders...or 20?

This statement of the problem is known as "the wire-chief's point of view." It describes the behavior of the system looked at from the outside. But there is another point of view, the one originally used: that of a particular caller. He perceives a grade of service based on the behavior of all the other callers. When he comes off-hook to originate a call, he interacts with the system occupied by calls from the remaining 199 people in his line group. Thus the offered traffic from the "particular caller's point of view" is 199 x 0.05, or 9.95 erlangs.

The difference in the two points of view is not large when we have a group of 200 sources. Let's get a new program in our computer and see what the numbers turn out to be. Tell your computer to run BINOM-2. It will ask you for the number of sources and the average occupancy per source. Give it 200 and 0.05 for starters. The screen will now show P, or the probability of exactly N calls in the system, and P1, or the probability that N or more calls are in the system. (Printout 2-2 in Appendix IV shows sample runs of both BINOM-1 and -2). If we have N line finders and N or more calls in the system, a new call arriving at random will find all line finders busy and will have to wait. The first screen-full of data takes us to N=10. Have the computer continue and we see P and P1 for N in the range of 11 to 20. We see at once the probability of 15 or more originating calls in system is 0.0781, and the probability of 20 or more in progress or waiting is 0.0027. Now, let us run the program again, but using 199 sources. P1 for N=15 is 0.0756, and for N=20 is 0.0025.

The difference between the wire chief's point of view and that of the particular caller is seen to be quite small when the number of callers is large. But let's try a similar problem. Let's assume we have switches with only 20 inputs to use as line finders, but we still have 5% occupancy with originating calls. With 4 and 5 line finders, the wire chief sees a blocking probability of 0.0159 and 0.0026, respectively, while the particular caller sees P1 of 0.0132 and 0.002. Obviously, in a smaller system, a difference of one source makes a somewhat bigger difference.

A very similar program, BINOM-1, works directly in the particular caller mode. That is, if we have a 20 input switch with N outputs, it calculates the probability of one caller, attempting to originate a call at random, will find the remaining 19 callers originating N or more calls and thus blocking his attempt. In system design, BINOM-1 is most often used because it estimates the blocking probability as perceived by the customers. Further, because it observes the behavior of one less caller than BINOM-2, it gives a somewhat better grade of service for the same overall traffic.

Poisson

When playing with BINOM-1 or -2, we note that we must specify a number of sources and the occupancy (or traffic) per source. For 20 or 30 inputs, which we can think of as sources or Lines, as opposed to servers, or Trunks, which will carry the traffic away, this seems quite natural. But as systems grow in size, the mathematics gets tricky; further, we may have several groups of servers such as WATS and FX facilities, tie trunks, etc. So when the number of sources gets beyond about 50 or so, it is easier to assume a very large number of sources and work with the total offered traffic rather than the product of the occupancy per source and the number of sources. This is the approach that Molina used to go from Binomial to Poisson, and later found to have been used by Poisson himself. See Appendix I.

Mathematically, Molina assumed that, as the number of sources got larger, the traffic per source got smaller so that the product of the two, or the total offered traffic, remained the same. Physically, this difference between finite and "infinite" sources is easy to understand. With a finite number of lines, say L, we can never have more than L calls in the system at any one time. With infinite sources, there is no such upper limit. There is always a small probability that any given number of calls may be present at any one time.

For instance, if we have 20 lines, running at an average occupancy of 80%, a run of BINOM-2 shows the probability of 20 or more calls in the system at any one time is 0.0115. But the probability of 21 calls in the system is 0, because there are no more lines available to originate calls.

With 20 lines at .8 Erlangs per line, the offered traffic is 20 x .8 or 16 Erlangs. Now, let's suppose we have 40 lines running at 40% occupancy. This is the same 16 Erlangs of offered traffic. But now, the probability of 20 or more calls being present in the system (run BINOM-2 again) is 0.1298, and the probability of 21 calls being present is 0.0744. The probability of 23 or more calls being present is 0.0189. Now, let's try again, but for 80 lines at 20% occupancy. Same 16 Erlangs of offered traffic, but now the probability of 20 calls or more being present is 0.1634, the probability of 21 or more is 0.1066, and we would have a probability of 0.0217 of finding 24 trunks all busy. With 160 sources at 10% occupancy, again 16 Erlangs of offered traffic, the probability of blocking with 20 trunks is 0.1765, with 21 it is 0.12, and if we have 25 trunks, the probability of blocking is 0.0167.

Suppose we had infinite sources...but we still had 16 Erlangs of offered traffic. What would the numbers be? We can get the answer easily enough. Clear the computer and load the program named POISSON. When it asks, give it 16 Erlangs and 5 minute calls; run and continue until you get a screenfull of data including N=20 trunks. There you will find P1 which, once again, is the probability of 20 or more calls in the system. P1 is 0.1878. Similarly, P1 for N = 21 is seen to be 0.1318, and 25 trunks will handle 16 Erlangs at a grade of service of 0.0223. See Printout 2-3.Let's summarize all these results in a table:

			
L \ N  20  21  22  23  24  25
20  0.0115  0.0  0.0  0.0  0.0  0.0
40  0.1298  0.0744  0.0392  0.0189  0.0083  0.0034
80  0.1634  0.1066  0.0660  0.0388  0.0217  0.0115
160  0.1765  0.1200  0.0781  0.0487  0.0291  0.0167
INF  0.1878  0.1318  0.0892  0.0582  0.0367  0.0223

Our table shows Lines L as sources, ranging from 20 to INFinite, with Trunks N ranging from 20 to 25. The numbers in the table are the probabilities (P1) of finding N (or more) calls in the system and, thus, N trunks busy when a call enters the system at random, assuming in all cases that the traffic offered the system is 16 Erlangs.

It is evident from the table that, as the number of sources gets smaller, we have a lower P1 for the same amount of offered traffic. If we have a lower blocking probability, we could carry more traffic for the same grade of service. This phenomenon is called Limited Source Gain.

What we have observed here is that Binomial and Poisson tend to become indistinguishable when N gets large, traffic per source gets small, and the product of the two remains constant. This is more than coincidence, of course. The actual formulas from which the computer gins up the numbers can use the same assumption to transform Binomial into Poisson. Indeed, there are several ways that Binomial can be taken to Poisson "in the limit," as the mathematicians say.

Carried Traffic

P1 is the probability that a call, coming in at random, finds all trunks busy and perhaps some other calls waiting. But notice that this does not imply that the caller is disappointed. By definition, the caller will simply wait until a server comes free, and then use it for the remainder of the time he is off hook. Remember our Line-finder example. What we have here is a mini-queue and, as long as the queue is relatively short, compared with the total holding time, Poisson and Binomial work quite well. The next question we have to answer is just how much of the offered traffic is actually carried.

If we had an infinite number of servers, or at least a number of servers equal to the number of sources, there would be no probability of blocking and calls would be carried away at their arrival rate. Or, to put it another way, offered and carried traffic would be equal. But when the number of servers is less than the number of sources, there is always a probability that all the servers will be busy (perhaps with calls waiting for service). But one thing we know: if all N trunks are busy, we are carrying away traffic at the rate of N Erlangs. If P1 is the probability that at least N calls are in the system, P1*N is the amount of traffic carried during the time all trunks are busy.

During the rest of the time, trunks are available to handle the traffic as it comes in, so carried traffic equals offered traffic. But the probability that trunks are available is (1-P1) for N-1 Trunks. So the carried traffic is made up of two parts: the traffic carried during ATB, and the traffic carried when 1 or more trunks are free.

When we run POISSON, we find the program calculating the carried traffic. But it does a little more. Obviously, the difference between the carried traffic and the offered traffic is the total time spent waiting in queue. Divided by the number of calls, we get the average delay per call. We find number of calls by dividing total offered traffic by average holding time (allowing for the units involved). This is called D1 in the program, and is measured in seconds. We also notice that not all calls are delayed; only those arriving during ATB have to wait. But this is P1 times the total number of calls. To find the average delay of those calls that are delayed, which we will call D2, we simply divide D1 by P1. As the program is written, this gives us the average delay in seconds of just those calls which are delayed. This explains the additional columns on your screen or in Printout 2-3.

Summary

Binomial and Poisson are useful traffic models that apply when we have finite and "infinite" sources of traffic, respectively. Both assume that offered traffic, either occupancy per source or A, the total traffic, is known, and relate offered traffic to N, the number of servers and the grade of service. The programs give P, the probability that exactly N calls are in the system, and P1, the probability that N or more are present. P1 is, of course, the Grade of Service. Fig. 2-1 shows the relation between P and P1 for a Poisson distribution with 5 Erlangs of offered traffic.

Fig. 2-1. Poisson distribution, showing relation of P and P1.

With Poisson and Binomial, it is also possible to calculate carried traffic and delay as held calls wait for service. The program POISSON lists both carried traffic and delay. D1 is average delay on all calls, expressed in seconds; a D1 of 3 or 4 seconds is often specified rather than a probability of blocking, and fits situations where very short queues can be permitted. D2, the average delay on just those calls that are delayed, is seen to be much longer; because POISSON tends to be used when grade of service is quite good, the sample size on which D2 is based may not be realistic.

[ Top ] [ Next Chapter ] [ Table of Contents ]


Copyright 2006 Lee Goeller. All Rights Reserved.