My DSL line downloads at 6 megabits per second. I just ran the test. It does this over a pair of twisted copper wires, the same POTS (Plain Old Telephone Service) twisted pair that connected your grandma’s phone to the rest of the world. In fact, if you still had that phone, you could plug it in and use it today.
I remember the old 110 bps acoustic coupler modems. Maybe some of you do too. Do you remember the upgrade to 300 bps? Wow! Triple the speed. Gradually, the speed increased through 1200 and 2400 and, finally, to 56.6k. All over the same line. Now we feel shortchanged if we don’t get multiple megabits of DSL on that same POTS line. How do we get such speeds on a system that still connects and rings your grandma’s phone? How did the engineers know these increased speeds were possible?
The answer goes back to 1948, when Dr. Claude Shannon wrote a seminal article, “A Mathematical Theory of Communication”. In this article, he laid the foundations for information theory. Shannon is also recognized for having applied Boolean algebra, developed by George Boole, to electrical circuits. Shannon recognized that switches then, and logic circuits today, follow the rules of Boolean algebra. This was his master’s thesis, written in 1937.
Shannon’s Communications Theory explains how much information you can send through a communications channel at a specified error rate. In summary, the theory says:
- There is a maximum channel capacity, C,
- If the transmission rate, R, is less than C, the information can be transferred with a selected low probability of error using intelligent coding techniques,
- Achieving a low error rate requires intelligent coding techniques that work on longer blocks of signal data.
What the theory doesn’t provide is information on the smart coding techniques themselves. The theory says you can do it, but not how.
In this article, I will describe this work without going into the mathematics of the derivations. In another article, I’ll discuss some of the smart coding techniques used to approach channel capacity. If you can follow the math, here is the first part of the article as published in the Bell System Technical Journal in July 1948, and the remainder published later that year. To walk through how so much information is squeezed onto a twisted copper pair, keep reading.
Information theory in brief
Let’s start with a diagram to understand the basic problem. We have information sent from a source by a transmitter through a channel. The channel is disturbed by a source of noise. A receiver accepts the signal plus the noise and converts it back into information. Shannon determined the maximum amount of information you can reliably move through the channel. The maximum bit rate is determined by the channel bandwidth and the amount of noise, and only these two values. We intuitively see that bandwidth and noise are limiting factors. What is amazing is that they are the only two factors.
Obviously, a channel with more bandwidth will pass more data than a narrower one. The bigger the pipe, the more you can push through it. This statement is true for all communication channels, be it radio frequency (RF), fiber optic, or a twisted pair of POTS copper wires.
The bandwidth of a channel is the difference between the highest and lowest frequencies that will pass through the channel. For example, a POTS voice channel has a low frequency of 400 Hz and a high frequency of 3400 Hz. This provides a bandwidth of 3000 Hz. (Some references indicate that the low frequency is 300 Hz, which provides a bandwidth of 3100 Hz.) In reality, channel limits are not sharp cutoffs. Shannon used a 3 decibel drop in signal strength to determine the limits.
A twisted pair has a bandwidth greater than 3000 Hz, so the wire itself is not the reason for the narrow bandwidth of POTS. Telephone companies impose this limited bandwidth so that they can frequency multiplex long-distance calls onto a single line. The bandwidth limit is acceptable because human speech is intelligible using this frequency band.
Shannon’s theory built on the work of Harry Nyquist and Ralph Hartley. Nyquist, analyzing telegraph systems, took the first steps towards determining the capacity of channels. He determined that the maximum pulse rate of a channel is double the bandwidth. This is the Nyquist rate (if you disagree, please see my note on ‘Nyquist’ terminology at the end of the article). In our 3000 Hz POTS channel, we can transmit 6000 pulses per second, which is totally counterintuitive.
Let’s send a 3000 Hz sine wave through the channel. Each cycle has a positive and a negative lobe, so there are 6000 lobes per second. Suppose we can suppress individual lobes of the sine wave. If we designate the remaining lobes as 0 and the missing lobes as 1, we are sending 6000 bits per second through the channel. Nyquist discussed pulses, but we would now call them symbols in communication work. The number of symbols per second is measured in baud, named after Émile Baudot, who created one of the first digital codes. It is incorrect to say “baud rate” because a baud is by definition a rate.
The Nyquist rate formula is:
$$C = 2B$$
where
C is the capacity of the channel in symbols per second, or baud
B is the bandwidth of the channel in hertz
Hartley’s contribution extended this to use more than two signal levels, or multi-level coding. He recognized that the receiver determines the number of levels that can be detected, independent of all other factors. In our example of the POTS channel, you can use the amplitude of the sine wave to encode multiple levels. With 4 different levels, we can send two bits for each symbol. With multilevel encoding, the bit rate for a noise-free channel is given by:
$$C = 2B \log_2 M$$
where
C is the capacity of the channel in bits per second
B is the bandwidth of the channel in hertz
M is the number of levels.
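To make the Nyquist and Hartley limits concrete, here is a minimal Python sketch that plugs the 3000 Hz POTS channel into both. The function and variable names are my own, chosen for illustration, and the channel is assumed to be noise-free.

```python
import math


def nyquist_symbol_rate(bandwidth_hz):
    """Maximum symbols per second (baud) on a noise-free channel: 2B."""
    return 2 * bandwidth_hz


def hartley_bit_rate(bandwidth_hz, levels):
    """Noise-free bit rate with M distinguishable levels: 2B * log2(M)."""
    return 2 * bandwidth_hz * math.log2(levels)


POTS_BANDWIDTH_HZ = 3000  # voice channel bandwidth used in this article

symbol_rate = nyquist_symbol_rate(POTS_BANDWIDTH_HZ)
two_level = hartley_bit_rate(POTS_BANDWIDTH_HZ, 2)
four_level = hartley_bit_rate(POTS_BANDWIDTH_HZ, 4)

print(f"Nyquist rate: {symbol_rate} baud")
print(f"2 levels: {two_level:.0f} bps")
print(f"4 levels: {four_level:.0f} bps")
```

With two levels the bit rate equals the symbol rate of 6000. Going to four levels doubles the bit rate to 12000 bps without changing the number of symbols per second.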
Obviously, noise will limit the amount of data that can be transmitted. This is analogous to the rough interior surface of a pipe causing friction and slowing the passage of material. The more noise, the slower the error-free data rate. Here is Shannon’s formula:
$$C = B \log_2\left(1 + \frac{S}{N}\right)$$
where
C is the capacity of the channel in bits per second (bps)
B is the bandwidth of the channel in hertz
S is the average power of the signal received over the bandwidth
N is the average noise or interference power over the bandwidth
S / N is the signal to noise ratio (SNR)
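As a rough illustration of Shannon’s formula, the sketch below evaluates the capacity of the 3000 Hz voice channel at a few signal to noise ratios. The SNR values are arbitrary examples expressed in decibels, not measurements of any real line, and the function name is my own.

```python
import math


def shannon_capacity(bandwidth_hz, snr_db):
    """Channel capacity in bits per second: B * log2(1 + S/N)."""
    # Convert the SNR from decibels to a plain power ratio first.
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


POTS_BANDWIDTH_HZ = 3000  # voice channel bandwidth used in this article

# Example SNR values in dB (assumed for illustration only).
for snr_db in (0, 10, 20, 30, 40):
    capacity = shannon_capacity(POTS_BANDWIDTH_HZ, snr_db)
    print(f"SNR {snr_db:2d} dB -> about {capacity:,.0f} bps")
```

At 30 dB the voice channel tops out just under 30,000 bps. Note that the formula uses the power ratio, not the decibel figure, which is why the conversion comes first.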
Shannon’s output is in bits because he defined “information” using bits. Consider a deck of 52 playing cards with 4 suits and 13 cards in each suit. It takes 2 bits (00b to 11b) to represent the four suits and 4 bits (0001b to 1101b) to represent the cards. In total, it takes 6 bits to represent any card in the deck.
A deck has more data that could be transmitted: color as 1 bit, face card or number card as 1 bit, kind of face card as another bit or so, etc. These are redundant since the complete information is already available in 6 bits. In Shannon’s terms, they carry no more information than telling someone that there are new articles on Hackaday. There are always new articles, so this is not information. A report of a day without articles? That is information.
The probability that a bit is received correctly or incorrectly is determined by the signal to noise ratio. A higher noise level means fewer error-free bit transfers. This is directly related to Hartley’s recognition that the bit rate is determined by the ability of the receiver to correctly detect multiple signal levels. Errors begin to occur when the noise level exceeds the receiver’s ability to distinguish one symbol from another. The impact of noise on a symbol can actually depend on the specific symbol. Low-amplitude symbols in an amplitude modulated signal can be overwhelmed by noise while higher-amplitude symbols come through acceptably. Other modulation methods are affected by noise in other ways.
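The card arithmetic can be checked with a few lines of Python. This is only a sketch of the counting argument, and the helper name is hypothetical.

```python
import math


def bits_needed(choices):
    """Smallest whole number of bits that can label `choices` distinct items."""
    return (choices - 1).bit_length()


suit_bits = bits_needed(4)    # four suits
rank_bits = bits_needed(13)   # thirteen cards per suit
total_bits = suit_bits + rank_bits

# Information content of one card drawn from a full deck, in bits.
exact_bits = math.log2(52)

print(suit_bits, rank_bits, total_bits)
print(f"log2(52) = {exact_bits:.2f} bits")
```

The fixed-width code uses 6 bits, while the information in a single card is a little under that, about 5.7 bits, because 4 bits can label 16 values and only 13 are used.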
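One way to put numbers on this idea is the self-information of an event, which is the negative base-2 logarithm of its probability. The probabilities below are made up for illustration. I have no statistics on how often a day passes without new articles.

```python
import math


def self_information_bits(probability):
    """Information carried by an event of the given probability, in bits."""
    return -math.log2(probability)


# Hypothetical probabilities, assumed purely for illustration.
p_new_articles = 0.999   # almost certain, so almost no information
p_no_articles = 0.001    # rare, so a lot of information

common = self_information_bits(p_new_articles)
rare = self_information_bits(p_no_articles)

print(f"'New articles today': {common:.4f} bits")
print(f"'No articles today':  {rare:.2f} bits")
```

The nearly certain message carries a tiny fraction of a bit, while the surprising one carries close to 10 bits.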
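The link between the number of levels and noise tolerance can be sketched with a toy receiver. It assumes evenly spaced amplitude levels between a fixed negative and positive peak and a receiver that picks the nearest level. Real modems are far more sophisticated, and all names and values here are hypothetical.

```python
def make_levels(count, peak=1.0):
    """Evenly spaced amplitude levels from -peak to +peak."""
    step = 2 * peak / (count - 1)
    return [-peak + i * step for i in range(count)]


def detect(received, levels):
    """Index of the level closest to the received amplitude."""
    return min(range(len(levels)), key=lambda i: abs(received - levels[i]))


def survives(sent_index, noise, levels):
    """True if the receiver still decodes the symbol that was sent."""
    return detect(levels[sent_index] + noise, levels) == sent_index


two = make_levels(2)    # [-1.0, 1.0]
four = make_levels(4)   # roughly [-1.0, -0.33, 0.33, 1.0]

# The same noise sample applied to a two-level and a four-level signal.
for noise in (0.2, 0.4):
    print(noise, survives(0, noise, two), survives(1, noise, four))
```

With the peak amplitude fixed, four levels sit closer together than two, so a noise sample of 0.4 that the two-level signal shrugs off pushes the four-level symbol into its neighbor’s territory.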
The phone company cheats
Earlier I asked how DSL works over the same line that handled your grandma’s phone. The basic answer is that the phone companies cheat. Seriously, the line between your home and the phone company’s first facility isn’t filtered, allowing DSL to use all of the twisted pair’s bandwidth. On the telephone company side, they separate the voice and DSL signals. Voice is limited to 3000 Hz and DSL is left at full bandwidth. That little filter you have at home is a low-pass filter that blocks DSL signals from reaching your handset.
I mentioned that Shannon’s theory does not answer the question of how to achieve high throughput. In the next article, we’ll look at some of these techniques, which include detecting and correcting errors on a noisy channel.
A note on Nyquist terminology
References conflict on the meaning of the Nyquist rate, which is used both for the signaling limit of a channel and in connection with the sampling theorem. The confusion underscores Nyquist’s contribution, because if he had contributed less, there would be no confusion. A Wikipedia article may not be definitive, but the one on the Nyquist rate explains the two different meanings of the term.