The branch office is on the far side of the WAN, and predictably the SCCP phones are on the branch router's EtherSwitch ports. So, how to find the actual voice packets that we want to know how much network bandwidth is utilized?
- Easy, just browse the flow exports pcap. But the number of voice flows is actually a very small fraction of the number of signaling flows. Just look at this screenshot of pcap search,
, the first actual voice flow is at packet 280. The IP source/destination is indistinguishable between small, frequent signaling flows and actual voice flow. It turns out, among the thousands of packets, only 3 are actual voice traffic flow exports. Browsing is out of the question. The real C1900's flow pcap is at
https://drive.google.com/file/d/0B2NfHoyfFf1aMzlNbnM0SkhpQTg/view?usp=sharing . You can try it yourself.
- Easy, just search the TOS on the flow exports pcap, right? The problem is that the hardware phone is subject to change without notice, so the TOS number is unknow.
- Why not search for flow packet size? The size is actually unknown as well. In above screenshot, there is a roughly 36MB sized packet every half an hour of 1800 seconds. The string "Octets: 360" seems very unique. However, how do you get this number? Intranet voice runs at the common 64kbps for u/alaw 8kps sample, 8 bits quantization. (64+16overhead)kbps * 1800 seconds / 8bit/Byte . It should be 18MB. It turns out, the counting is done twice for each packet while ingress to the router and at the egress interface. There is no certain way to foresee the search string "Octets: 360".
It turns out, ELK search is necessary to figure out what search strings we want! And that search is performed by ranking all flow size(bytes), and the 36MB stands out in the crowd of 1KB short signaling flows.
The ELK graphing is as below picture after taking out all signaling flows,
. Notice, that straight ELK can't handle niche fields like VLAN ID. When VLAN ID is present, ELK may simply discard the whole flow. That is why we need to alter the flow content in an editor in the terminal.
Now, after going through so much, are we sure that modern phone voice indeed is with 8kps sampling rate? According to Math, 8 kHz sampling rate can only transmit 4kHz sound, and that seems too low for high human hearing. The answer is that modern SD phone speech is defined as from 300 to 3400 Hz. We can hear a lot wider range of sounds than we can make. And the speech encoding is not meant to transmit Hi-Fi music, waves, etc.