The purpose of this report is to describe the Zaurus VoIP project, which runs SIPUA on the Sharp Zaurus and is based on the CINEMA platform.
1. Introduction
The Zaurus VoIP project is a voice over IP (VoIP) application for the Sharp Zaurus, a Linux-based PDA. It lets the user use the Zaurus as a phone and make voice calls through Columbia University’s CINEMA project architecture.
The project uses the Session Initiation Protocol (SIP) and the Real-time Transport Protocol (RTP). SIP, co-authored by Professor Henning Schulzrinne of Columbia University, is a signaling protocol for establishing real-time, interactive communication sessions over IP networks. It can integrate a wide variety of communication services using Internet protocols.
2. VoIP Architecture
As the convergence of voice and data becomes feasible in communication networks, Voice over Internet Protocol (VoIP) plays a significant role. With the entrance of Cisco and other data networking companies into this territory, many traditional switch vendors have restructured their product lines to focus on software-based solutions built on converged architectures.
VoIP does not carry voice over TCP, because TCP’s retransmission and ordering guarantees add too much delay for real-time applications; UDP datagrams are used instead.
The Columbia InterNet Extensible
Multimedia Architecture (CINEMA) consists of a set of SIP-based servers that
provide a pathway to a post-PBX era of communications. It provides a
comprehensive environment for creating and deploying rich Internet multimedia
services including programmable Internet telephony services, audio/video
conferencing, IP-based voice mail, and unified messaging.
In the Zaurus project, the SIPUA uses RTP to give each packet a sequence number and a timestamp.
RTP provides end-to-end real-time delivery services for data. Sample services provided by RTP are the following:
· Sequencing: RTP runs on top of a datagram service (UDP), so packets may arrive out of order, while audio data must be played in the right order. Packet sequence numbers are therefore required to reassemble the stream at the destination. This is used in the SIPUA.
· Time stamping: Timestamps in the packets keep the devices synchronized. Time stamping is also important in control and monitoring applications.
· Delivery monitoring: Allows the participating applications to collect and share statistics about the performance of the data transport service.
· Information about the connection: the statistics, the names and locations of the parties, when and why a party is leaving, etc.
Delivery monitoring and connection information gathering are provided by RTCP, the companion control protocol for RTP. In my Zaurus sipua project, only RTP was supported.
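As a rough illustration of the sequencing and time-stamping fields, the sketch below packs the 12-byte fixed RTP header in C. The helper names rtp_pack_header, rtp_read_seq and rtp_read_timestamp are hypothetical and are not taken from the sipua sources; the field layout assumes RTP version 2 with no CSRC entries and no header extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTP_HEADER_LEN 12

/* Hypothetical helper: write a fixed RTP header into buf in network byte
 * order. Returns the header length, or 0 if buf is too small. */
size_t rtp_pack_header(uint8_t *buf, size_t buflen, uint8_t payload_type,
                       uint16_t seq, uint32_t timestamp, uint32_t ssrc)
{
    assert(buf != NULL);
    if (buflen < RTP_HEADER_LEN)
        return 0;
    buf[0] = 0x80;                         /* version 2, no padding, no extension, CC = 0 */
    buf[1] = (uint8_t)(payload_type & 0x7f); /* marker bit cleared */
    buf[2] = (uint8_t)(seq >> 8);          /* sequence number, high byte first */
    buf[3] = (uint8_t)(seq & 0xff);
    buf[4] = (uint8_t)(timestamp >> 24);   /* timestamp, in sampling units */
    buf[5] = (uint8_t)(timestamp >> 16);
    buf[6] = (uint8_t)(timestamp >> 8);
    buf[7] = (uint8_t)(timestamp & 0xff);
    buf[8] = (uint8_t)(ssrc >> 24);        /* synchronization source identifier */
    buf[9] = (uint8_t)(ssrc >> 16);
    buf[10] = (uint8_t)(ssrc >> 8);
    buf[11] = (uint8_t)(ssrc & 0xff);
    return RTP_HEADER_LEN;
}

/* Read the sequence number back out of a packed header. */
uint16_t rtp_read_seq(const uint8_t *buf)
{
    return (uint16_t)((buf[2] << 8) | buf[3]);
}

/* Read the timestamp back out of a packed header. */
uint32_t rtp_read_timestamp(const uint8_t *buf)
{
    return ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
           ((uint32_t)buf[6] << 8) | (uint32_t)buf[7];
}
```

A sender would typically increase seq by one per packet and advance timestamp by the number of samples in the packet, so the receiver can reorder packets and notice gaps.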
SIPUA uses SIP to provide call setup, routing, authentication and other features to end devices. SIP is a text-based protocol that allows scalable and extensible implementation of a wide variety of applications, including audio, video, chat, instant messaging and whiteboard. SIP requires less overhead than the traditional H.323 protocol.
The Zaurus is a Linux-based PDA produced by Sharp. With a 206 MHz Intel StrongARM processor, the Zaurus SL-5500 provides users with powerful multimedia and multi-processing capabilities. In addition, MP3 and MPEG-1 capabilities allow users to listen to music or view video clips.
Sipua is a SIP user agent. It allows the end user to make and receive SIP calls.
The Zaurus sipua implementation should support the original sipua features, such as SIP version 2.0 (RFC 2543, RFC 3261), G.711 mu-law audio for Solaris on SPARC, and a rudimentary RTP implementation. In addition, it has to support AFMT_S16_LE format audio for Linux and embedded Linux. In particular, it has to support the Zaurus’s audio device, whose input is mono only. The embedded Linux operating system on the Zaurus challenges the adaptability and extensibility of sipua.
3. Project design and implementation
1. Convert analog voice to digital signals (bits).
2. Encode the bits in a format suitable for transmission: PCMU or AFMT_S16_LE.
3. Insert the voice data into data packets using a real-time protocol (RTP over UDP over IP).
4. SIP acts as the signaling protocol between users.
5. The receiver disassembles the packets, extracts the data, converts it to analog voice signals and sends them to the audio device.
6. All of this must be done in near real time, because the parties cannot wait long while talking.
The Zaurus sound card can convert audio with 16-bit samples over a band of up to 22050 Hz. For voice, a sampling frequency of 8000 Hz is used, which by the Nyquist principle covers the band up to 4000 Hz. With 2 bytes per sample, this gives a throughput of 2 * 8000 (samples per second) = 16000 bytes/s.
Each 20 ms of audio is 16000/50 = 320 bytes, so the default packet size was set to 320 bytes.
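The packet-size arithmetic above can be checked with a few lines of C. This is only a sketch; the function name payload_bytes is an assumption made for illustration and does not appear in sipua.

```c
#include <assert.h>

/* Bytes of raw audio carried by one packet:
 * sample rate (Hz) * bytes per sample * packet duration (ms) / 1000. */
unsigned payload_bytes(unsigned sample_rate_hz, unsigned bytes_per_sample,
                       unsigned frame_ms)
{
    assert(frame_ms > 0);
    return sample_rate_hz * bytes_per_sample * frame_ms / 1000u;
}
```

With the values used in this project, payload_bytes(8000, 2, 20) gives 320, matching the default packet size; one byte per sample at the same rate would give 160.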
Now that we have digital data, we can convert it to a standard format that can be transmitted quickly:
1) PCM (Pulse Code Modulation), ITU-T standard G.711
2) AFMT_S16_LE
For this Zaurus project, I implemented only the AFMT_S16_LE format, which was suggested to perform better on the Zaurus.
The next step is to encapsulate the raw data in the TCP/IP stack, with the following structure:
VoIP data packets
RTP
UDP
IP
Layers I and II
VoIP
data packets live in RTP (Real-Time Transport Protocol) packets that are inside
UDP-IP packets. RTP enables the receiver to put the packets back into the
correct order and not wait too long for packets that have either lost their way
or are taking too long to arrive.
3.4 Zaurus Networking
Unlike Unix and other Linux products, the Zaurus does not automatically associate its IP address with its hostname. Since the project is based on Wi-Fi, DHCP assigns a new IP address to the Zaurus each time it connects. This caused some confusion and delays in the project, since SIPUA uses the hostname to find its peers. The solution is to use the hostname command to set the Zaurus’s hostname to its IP address each time it obtains a new one. An improvement to SIPUA would be to determine the user’s IP address directly instead of relying on the hostname.
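One way the suggested improvement might look on a current Linux system is sketched below: the program asks the kernel for its interface addresses instead of resolving the hostname. The function name first_ipv4_address is hypothetical, and the sketch assumes getifaddrs is available in the C library, which may not hold for the Zaurus ROM used in this project.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Copy the first IPv4 address of an interface that is up and is not the
 * loopback into out. Returns 0 on success, -1 if none is found or on error. */
int first_ipv4_address(char *out, size_t outlen)
{
    struct ifaddrs *list, *ifa;
    int rc = -1;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    if (getifaddrs(&list) == -1)
        return -1;
    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;                     /* skip non-IPv4 entries */
        if ((ifa->ifa_flags & IFF_LOOPBACK) || !(ifa->ifa_flags & IFF_UP))
            continue;                     /* skip loopback and down links */
        struct sockaddr_in *sin = (struct sockaddr_in *)ifa->ifa_addr;
        if (inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen) != NULL) {
            rc = 0;
            break;
        }
    }
    freeifaddrs(list);
    return rc;
}
```

The returned string could then be used wherever the hostname lookup is used today, so a new DHCP lease would not require running the hostname command again.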
VoIP over Wi-Fi also depends on the stability of the wireless LAN. During the project, we found that if the wireless connection is disrupted, SIPUA has to be reconfigured (hostname setup) and restarted. An improvement to SIPUA would be to verify the LAN connection (for example, with a heartbeat mechanism) during a SIP call. If the connection is lost, SIPUA would go to sleep and retry the connection at intervals.
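The sleep-and-retry idea can be sketched as a small loop in C. Everything here is hypothetical: wait_for_link, the probe callback type and probe_countdown are illustration names, and a real probe would send an actual heartbeat to the peer or access point rather than count down.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* A probe returns nonzero when the link answers. */
typedef int (*probe_fn)(void *ctx);

/* Probe up to max_tries times, sleeping interval_s seconds after each
 * failed probe. Returns the number of probes used on success, or -1 if
 * the link never answered. */
int wait_for_link(probe_fn probe, void *ctx, int max_tries, unsigned interval_s)
{
    int i;

    assert(probe != NULL && max_tries > 0);
    for (i = 1; i <= max_tries; i++) {
        if (probe(ctx))
            return i;
        if (i < max_tries)
            sleep(interval_s);
    }
    return -1;
}

/* Stand-in probe for demonstration: fails until the counter in ctx
 * reaches zero, then reports the link as up. */
int probe_countdown(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}
```

During a call, the same loop could be entered whenever a heartbeat is missed, so that the session resumes without a manual restart once the wireless link returns.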
Figure 1
A call from the Zaurus to a Linux laptop, as in Figure 1, is used as an example.
1. SIPUA is started and initialized on both the Linux laptop and the Zaurus. The user logs in and is authenticated:
Set userid yg97@cs.columbia.edu
Set password xxxx
2. From the Zaurus, issue the invite command to the Linux SIPUA, using its DHCP-assigned IP address, for example:
invite from=sip:yg97@cs.Columbia.edu sip:128.59.xx.xx
3. The Linux SIPUA, already started, can now accept the invitation.
4. The audio sessions on both sides are initialized, and the play thread and record thread on each side are spawned.
5. The SIP phone session is established. The voice from each side is encoded into data and transported to the other party’s sipua using RTP over UDP (it can be configured to use TCP).
Trolltech’s Qtopia Qt environment fits the StrongARM Linux development environment well. So I integrated the Qt programming environment with the CINEMA project by modifying the Makefile in the sipua directory to include the necessary libraries from Qt’s StrongARM Linux environment.
In the radio.cpp class:
Figure 2
4.1 Set up the audio devices
Unlike the audio devices on Sun Unix and Linux machines, the Zaurus supports only mono input. That is, we have to open /dev/dsp to play and /dev/dsp1 to record. The setup details can be found in sipua’s radio.cpp file.
The ThreadParam object contains the in and out sockets that each thread uses to communicate. It also contains the audio device opened by the start_audio_tool function. In contrast to the standard Unix and Linux platforms, the Zaurus SIPUA has to open different audio devices for play and record.
Figure 3
Algorithm for the RingBuffer scheme:
1) The PlayThread is initialized and starts to put packets on the buffer.
2) The RecordThread is initialized and starts to get packets and play them.
3) Both threads work on the circular buffer to write and read packets.
Issue with the RingBuffer:
When a large number of packets is transported continuously, we observe buffer overflow. Since there is a time delay between the PlayThread and the RecordThread, the overflow enlarges the gap between them, until the PlayThread catches up with the RecordThread on the RingBuffer. Unexpected behavior then occurs, such as thread lock-up and voice data loss.
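The overflow condition can be made explicit with a fixed-size circular buffer that refuses a write when the writer has caught up with the reader. The sketch below is a single-threaded illustration only: PacketRing, ring_put, ring_get and SLOT_COUNT are assumed names, not the sipua RingBuffer, and a version shared between two threads would also need a mutex around each operation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_BYTES 320  /* one 20 ms packet of 16-bit, 8000 Hz mono audio */
#define SLOT_COUNT 8    /* illustrative size, not MAX_PLAYOUT_BUFFER_SIZE */

typedef struct {
    unsigned char slots[SLOT_COUNT][SLOT_BYTES];
    unsigned head;   /* next slot to write */
    unsigned tail;   /* next slot to read */
    unsigned count;  /* slots currently filled */
} PacketRing;

void ring_init(PacketRing *r)
{
    memset(r, 0, sizeof *r);
}

/* Store one packet. Returns 0 on success, or -1 when the buffer is full,
 * which is the point where the writer would overrun the reader. */
int ring_put(PacketRing *r, const unsigned char *pkt, size_t len)
{
    assert(len <= SLOT_BYTES);
    if (r->count == SLOT_COUNT)
        return -1;
    memset(r->slots[r->head], 0, SLOT_BYTES);  /* pad short packets with silence */
    memcpy(r->slots[r->head], pkt, len);
    r->head = (r->head + 1) % SLOT_COUNT;
    r->count++;
    return 0;
}

/* Fetch the oldest packet into out (SLOT_BYTES bytes). Returns 0 on
 * success, or -1 when the buffer is empty. */
int ring_get(PacketRing *r, unsigned char *out)
{
    if (r->count == 0)
        return -1;
    memcpy(out, r->slots[r->tail], SLOT_BYTES);
    r->tail = (r->tail + 1) % SLOT_COUNT;
    r->count--;
    return 0;
}
```

Returning an error on a full buffer lets the caller decide whether to drop the new packet or the oldest one, instead of silently overwriting audio that has not yet been consumed.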
Solution:
1) Enlarging MAX_PLAYOUT_BUFFER_SIZE delays the buffer overflow, but the overflow will eventually happen.
2) Suppressing silence reduces unnecessary traffic significantly. Because only the necessary data is sent across the network, the delay issue is resolved. The buffer does not overflow as long as there is silence in the middle of the conversation.
See the next section for performance results.
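A minimal sketch of the silence-suppression decision is shown below, assuming the frame holds signed 16-bit little-endian samples (AFMT_S16_LE). The names mean_abs_s16le and frame_is_silent are hypothetical, and the measure used here (mean absolute amplitude) is a simplification chosen for illustration, not the project's EnergyLevel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mean absolute amplitude of a buffer of signed 16-bit little-endian
 * samples. A trailing odd byte, if any, is ignored. */
unsigned long mean_abs_s16le(const unsigned char *buf, size_t len)
{
    size_t samples = len / 2, i;
    unsigned long sum = 0;

    assert(buf != NULL);
    if (samples == 0)
        return 0;
    for (i = 0; i < samples; i++) {
        int v = buf[2 * i] | (buf[2 * i + 1] << 8);  /* low byte first */
        if (v >= 32768)
            v -= 65536;                              /* restore the sign */
        sum += (unsigned long)abs(v);
    }
    return sum / samples;
}

/* Nonzero when the frame is quieter than threshold and need not be sent. */
int frame_is_silent(const unsigned char *buf, size_t len, unsigned long threshold)
{
    return mean_abs_s16le(buf, len) < threshold;
}
```

A record loop could call frame_is_silent on each 320-byte frame and skip the send when it returns nonzero, which is what keeps the playout buffer on the other side from filling during pauses.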
4.3 Performance Testing
Performance testing was conducted in both directions: from the Zaurus to Linux and from Linux to the Zaurus.
In the initial testing, the performance from Linux to the Zaurus was fairly reasonable. Throughout the whole testing session (10 minutes), the delay stayed under 100 ms. The result is consistent with sipua’s behavior on other platforms. On the other hand, we observed linear growth in voice delay on the Zaurus side, and sometimes loss of sound. After investigation, we found the problem and its solution, described in the previous section.
In the second test, conducted after applying silence suppression and a bigger playout buffer size, the issue was resolved. On the Linux side, the delay stayed the same as before. On the Zaurus side, during the testing session (10 minutes), the delay stayed below 460 ms and did not show a significant linear growth pattern.
The implementation of SIPUA on the Zaurus involves wireless networking and audio device playback, and uses protocols such as RTP and SIP. From the performance of the project, we can conclude that the Zaurus, which is based on an embedded StrongARM Linux system, can provide satisfactory results in real-time networking systems.
While the delay is acceptable, some improvements can still be made in future work on the Zaurus SIPUA implementation.
Code modified:
Set unsigned char buffer[512]; instead of 1500
For silence suppression, pad the rest of the packet with 0x00 (silence for signed 16-bit samples) instead of 0x7f:
// memset (&buffer[12+len],0x7f,size-len);
memset (&buffer[12+len],0x00,size-len);
// record using au_r instead of au_dev
if (oUsingDevice) {
    // len = read(th->au_dev, &buffer[12], size);
    len = read(au_r, &buffer[12], size);
}
In PlayThread function:
Set unsigned char buffer[512]; instead of 1500
In EnergyLevel: use the updated version of EnergyLevel from Sangho Shin
double
EnergyLevel (unsigned char *dataBuf, int dataLen)
{
    int i, n;
    short data;
    long diff;
    unsigned long sum_x, avg_x;

    /* n is the buffer length in bytes; each sample is 2 bytes, low byte first */
    n = dataLen;
    if (n < 2)
        return 0;

    // Following portion of the code is copied from HW2
    /* First pass: average of the absolute sample values */
    for (sum_x = 0, i = 0; i + 1 < n; i += 2) {
        data = dataBuf[i];
        data = data | (dataBuf[i+1] << 8);
        sum_x += abs(data);
    }
    avg_x = sum_x / n;

    /* Second pass: average squared deviation from avg_x */
    for (sum_x = 0, i = 0; i + 1 < n; i += 2) {
        data = dataBuf[i];
        data = data | (dataBuf[i+1] << 8);
        diff = (long)abs(data) - (long)avg_x;
        sum_x += diff * diff;
    }
    avg_x = sum_x / n;
    return (avg_x);
}
In the start_audio_tool function, the regular audio device setup was replaced with the following:
/* Open audio device for Z */
if (au_p >= 0) {
    close (au_p);
}
fprintf(stderr, "**************Before opening Audio ***************...\n");
if (usedevice) {
    fprintf(stderr, "opening audio device...\n");
    int speed = 8000;            /* sampling rate in Hz */
    int channels = 1;            /* mono */
    int format = AFMT_S16_LE;    /* signed 16-bit, little-endian */
    int block_size;
    int fragment = 0x00020009;   /* 2 fragments of 2^9 = 512 bytes */

    au_p = open("/dev/dsp", O_WRONLY);
    if (au_p == -1) {
        perror("open(\"/dev/dsp\")");
        return -1;
    }
    // The same format, channel count and rate are used for play and record.
    if (ioctl(au_p, SNDCTL_DSP_SETFMT, &format) == -1) {
        perror("play setup ioctl(\"SNDCTL_DSP_SETFMT\")");
    }
    if (ioctl(au_p, SNDCTL_DSP_CHANNELS, &channels) == -1) {
        perror("play setup ioctl(\"SNDCTL_DSP_CHANNELS\")");
    }
    if (ioctl(au_p, SNDCTL_DSP_SPEED, &speed) == -1) {
        perror("play setup ioctl(\"SNDCTL_DSP_SPEED\")");
    }
    if (ioctl(au_p, SNDCTL_DSP_GETBLKSIZE, &block_size) == -1) {
        perror("play setup ioctl(\"SNDCTL_DSP_GETBLKSIZE\")");
        return -1;
    }
    if (ioctl(au_p, SNDCTL_DSP_SETFRAGMENT, &fragment) == -1) {
        perror("play setup ioctl(\"SNDCTL_DSP_SETFRAGMENT\")");
        return -1;
    }
    fprintf(stderr, " Opening Audio for playing part...done\n");

    //## YG added for Recording...
    // *NOTE* the Zaurus input is ONLY mono !!
    if (au_r >= 0) {
        close (au_r);
    }
    // *NOTE* the Zaurus has a nonstandard input
    // so /dev/dsp1 must be opened when
    // recording
    au_r = open("/dev/dsp1", O_RDONLY);
    if (au_r == -1) {
        perror("open(\"/dev/dsp1\")");
        return -1;
    }
    if (ioctl(au_r, SNDCTL_DSP_SETFMT, &format) == -1) {
        perror("recording ... ioctl(\"SNDCTL_DSP_SETFMT\")");
    }
    if (ioctl(au_r, SNDCTL_DSP_CHANNELS, &channels) == -1) {
        perror("recording ... ioctl(\"SNDCTL_DSP_CHANNELS\")");
    }
    if (ioctl(au_r, SNDCTL_DSP_SPEED, &speed) == -1) {
        perror("recording ... ioctl(\"SNDCTL_DSP_SPEED\")");
    }
    if (ioctl(au_r, SNDCTL_DSP_GETBLKSIZE, &block_size) == -1) {
        perror("recording ... ioctl(\"SNDCTL_DSP_GETBLKSIZE\")");
        return -1;
    }
    fprintf(stderr, " Opening Audio for recording part...done\n");
} else {
    fprintf(stderr, " NOT GOING INTO Opening Audio for recording part...FALSE\n");
}
In the thread setup part, au_p and au_r are used for the play and record threads.
/*
 * Setup threads to listen to incoming packets and send outgoing
 * packets
 */
{
    int e;
    pthread_t tid[2];

    /* Recv thread */
    ThreadParam *th;
    th = (ThreadParam *)calloc(1, sizeof(ThreadParam));
    th->in_sock4 = in_sock4;
    th->out_sock4 = out_sock4;
    th->in_sock6 = in_sock6;
    th->out_sock6 = out_sock6;
    th->au_dev = au_p;
    th->o_send = o_send; /* Don't care */
    th->o_silence = 0; /* No silence detection for playout */
    if ((e = pthread_create_detached(&tid[0], NULL,
                                     PlayThread, (void *)th)) != 0) {
        perror("pthread_create_detached");
    }

    /* Record/send and play thread */
    th = (ThreadParam *)calloc(1, sizeof(ThreadParam));
    th->in_sock4 = in_sock4;
    th->out_sock4 = out_sock4;
    th->in_sock6 = in_sock6;
    th->out_sock6 = out_sock6;
    th->au_dev = au_r;
    th->o_send = o_send;
    th->o_silence = o_silence;
    th->threshold = threshold;
    if ((e = pthread_create_detached(&tid[1], NULL,
                                     RecordThread, (void *)th)) != 0) {
        perror("pthread_create_detached");
    }
}
return 0;
1) Columbia University CINEMA project
2) Help from Ph.D. students in the IRT lab
3) Research conducted through the embedded-Linux newsgroup
4) Sharp Zaurus website