CINEMA (Columbia InterNet Extensible Multimedia Architecture) is a set of SIP-based Internet multimedia servers for building an enterprise Internet telephony and multimedia system. Its multimedia conferencing server, sipconf, uses a number of audio and video codecs for conferencing. As part of my project, I added support for a new audio codec, Speex, to CINEMA so that sipconf can use it. In this tutorial-style article, I describe the steps required to support a new audio codec in CINEMA. While integrating Speex, I found problems in the existing conference server that prevented 16 kHz codecs from working properly. I also describe how I fixed those problems.
CINEMA is a set of SIP-based Internet multimedia servers for building an enterprise Internet telephony and multimedia system. It consists of sipd (SIP proxy, redirect, and registrar server), sipconf (SIP multimedia conferencing server), sipum (SIP voicemail/unified messaging server), sip323 (SIP-H.323 protocol translator) and rtspd (RTSP media server). It supports several audio codecs: pcmu, pcma, gsm, dvi and g722.
In this project, I added support for a new audio codec, Speex. Speex is an open-source, patent-free audio compression format designed for speech. It is based on CELP and is designed to compress voice at bitrates ranging from 2 to 44 kbps. I added support for two modes of the Speex codec: narrowband (8 kHz) and wideband (16 kHz).
All the codecs previously supported in CINEMA were 8 kHz codecs, so integrating Speex's wideband mode made it possible to hold conferences using a 16 kHz codec. Supporting the wideband mode also exposed some structural problems in CINEMA that had not shown up before, because every earlier codec was narrowband.
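To make the difference between the two modes concrete, the following minimal sketch computes how many PCM samples a single codec frame covers at each sampling rate. It assumes Speex's 20 ms frame duration and 16-bit mono PCM. The helper names SamplesPerFrame and PcmBytesPerFrame are illustrative only and are not part of CINEMA or Speex.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of CINEMA): number of PCM samples that one
// codec frame covers at the given sampling rate and frame duration.
constexpr std::size_t SamplesPerFrame(int sampling_rate_hz, int frame_ms) {
    return static_cast<std::size_t>(sampling_rate_hz) * frame_ms / 1000;
}

// Hypothetical helper: bytes of 16-bit mono linear PCM in one frame.
constexpr std::size_t PcmBytesPerFrame(int sampling_rate_hz, int frame_ms) {
    return SamplesPerFrame(sampling_rate_hz, frame_ms) * sizeof(std::int16_t);
}
```

With a 20 ms frame, SamplesPerFrame(8000, 20) is 160 and SamplesPerFrame(16000, 20) is 320, so any buffer or timing calculation that silently assumes 8000 Hz is off by a factor of two for a wideband stream.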
In the following sections, I describe the steps required to support a new audio codec by going through the steps I took to support Speex. I also describe how I fixed the problems in CINEMA so that Speex's wideband mode works properly.
In this section, I step through the process of integrating the Speex codec. A similar process can be followed to integrate any new audio codec in the future.
    CINEMA_MODULES = NT libcine \
        ...
        librtp rtplib++ libgsm libspeex \
        libsip libcanon \
        ...

We also have to modify module.mk in the libmedia directory so that CINEMA's media library can find the Speex library. To do this, I added the following fragment to the existing libmedia/module.mk file.
    LIBMEDIA_INC := -I$(topsrcdir)/libmedia $(LIBGSM_INC) $(LIBSPEEX_INC) $(LIBILBC_INC)
    LIBMEDIA_LIB := -L$(libmedia_dir) -lmedia -lm $(LIBCINE_LIB) $(LIBGSM_LIB) $(LIBSPEEX_LIB) $(LIBILBC_LIB) $(SEMLIBS)
    $(libmedia_dir)-CFLAGS := $(LIBCINE_INC) $(LIBGSM_INC) $(LIBSPEEX_INC) $(LIBSIPAPI_INC) $(LIBILBC_INC)
    ...
    $(LIBMEDIA): $(LIBMEDIA_OBJ) $(LIBMEDIA_DEP) $(LIBGSM) $(LIBSPEEX) $(LIBILBC)
    ...
    extern "C" {
    ...
    #include "speex.h"
    }

Also, at the end of the same file, we have to declare a class for our codec. For Speex's narrowband mode, I declared the class as follows:
    class SPEEX : public Codec {
    public:
        SPEEX();
        ~SPEEX();
        virtual bool Encode(const std::string& raw, std::string& encoded);
        virtual bool Decode(const std::string& encoded, std::string& raw);
    private:
        void init_encoder();
        void init_decoder();
        static int control_enc;
        static int control_dec;
        void* state;
        SpeexBits bits;
    };

I declared a similar class for Speex's wideband mode in the same file.
Next, we have to implement the methods of the declared class in libmedia/mediacodec.cpp. These methods are the actual implementation of our codec's encoding and decoding.
We also have to modify the Codec::Allocate method, at the beginning of libmedia/mediacodec.cpp, so that it can allocate an object of our codec class. For the Speex codec's narrowband mode, I added the following fragment:
    Codec * Codec::Allocate(const SessionFormat& format)
    {
        if (format == SessionFormat("pcmu", 8000, 1)) {
            return new G711MuLaw;
        }
        ...
        else if (format == SessionFormat("speex", 8000, 1)) {
            return new SPEEX;
        }
        ...
    }

I added similar code for the Speex codec's wideband mode.
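The allocation logic for both modes can be sketched in a self-contained form as follows. This is only an illustration of the dispatch pattern, not CINEMA's code: SessionFormat and Codec here are simplified stand-ins, the class names SpeexNarrowband and SpeexWideband and the SamplingRate method are hypothetical, and std::unique_ptr is used in place of the raw pointer that the real Codec::Allocate returns.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-in for CINEMA's SessionFormat (assumption: the real class
// has a richer interface than these three fields).
struct SessionFormat {
    std::string name;
    int rate;
    int channels;
    bool operator==(const SessionFormat& other) const {
        return name == other.name && rate == other.rate &&
               channels == other.channels;
    }
};

// Simplified stand-in for CINEMA's Codec base class.
struct Codec {
    virtual ~Codec() = default;
    virtual int SamplingRate() const = 0;
};

// Hypothetical class names, one per Speex mode.
struct SpeexNarrowband : Codec {
    int SamplingRate() const override { return 8000; }
};
struct SpeexWideband : Codec {
    int SamplingRate() const override { return 16000; }
};

// Map a negotiated format to a codec object; returns null if unsupported.
std::unique_ptr<Codec> Allocate(const SessionFormat& format) {
    if (format == SessionFormat{"speex", 8000, 1}) {
        return std::unique_ptr<Codec>(new SpeexNarrowband);
    }
    if (format == SessionFormat{"speex", 16000, 1}) {
        return std::unique_ptr<Codec>(new SpeexWideband);
    }
    return nullptr;
}
```

The point of the sketch is that the sampling rate is part of the key: the same encoding name with a different rate has to select a different codec object, since the narrowband and wideband modes are not interchangeable.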
    bool Member::GetSupportedFormat(SessionMedia& media) const
    {
        if (media.IsAudio()) {
            media.RemoveAllFormats();
            media.AddFormat(0); //pcmu
            ...
            media.AddFormat(SessionFormat(110, "speex", 8000, 1));
            ...
        }
        ...
    }

I made a similar modification to support Speex's wideband mode.
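Because Speex has no static RTP payload type, it is advertised with a dynamic one (110 in the fragment above), and the payload number is tied to the encoding name and clock rate through an SDP rtpmap attribute. The sketch below shows what such attribute lines look like. It is an illustration under assumptions rather than CINEMA's SDP code: AudioFormat and RtpmapLine are hypothetical names, and payload type 111 for the wideband mode is an arbitrary example because the original does not state which number was used.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of one offered audio format (not a CINEMA type).
struct AudioFormat {
    int payload_type;      // RTP payload type; 96-127 is the dynamic range
    std::string encoding;  // encoding name, e.g. "speex"
    int clock_rate;        // RTP clock rate in Hz
    int channels;          // number of audio channels
};

// Build the SDP attribute line "a=rtpmap:<pt> <encoding>/<rate>[/<channels>]".
// The channel count is optional in SDP and is omitted here for mono.
std::string RtpmapLine(const AudioFormat& format) {
    assert(format.clock_rate > 0);
    std::string line = "a=rtpmap:" + std::to_string(format.payload_type) +
                       " " + format.encoding + "/" +
                       std::to_string(format.clock_rate);
    if (format.channels > 1) {
        line += "/" + std::to_string(format.channels);
    }
    return line;
}
```

For the narrowband entry above, RtpmapLine yields "a=rtpmap:110 speex/8000". A wideband entry differs in both the payload number and the clock rate, which is how the remote client tells the two modes apart.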
    SIPCONF_DIST_FILES := \
        $(TOP_DIST) \
        ...
        $(LIBSPEEX_DIST) \
        ...
    SIPCONF_DEPENDENCIES := $(LIBSIPAPI) $(LIBCONF) $(LIBMEDIA) $(DBAPI) $(LIBGSM) $(LIBSPEEX) $(LIBCINE) $(LIBSIP) $(LIBRESPARSE)

A similar modification is necessary if the codec has to be supported in the media server or in other components of CINEMA.
While integrating the Speex codec, I found some problems in CINEMA. In libconf/confaudio.cpp, the constructor of the AudioSession class set the sampling rate to 8000. As a result, whenever a user tried to connect to the server using a 16 kHz codec, the server crashed. This problem had not occurred before because all the previous codecs were 8 kHz. To fix it, I had to make some structural modifications so that the sampling rate is read from the database table instead of being hard-coded to 8000.
I added two new methods to the Instance class in libconf/confinstance.h and declared sampling_rate as a protected member of the class.
    class Instance : public LockableObject {
    public:
        ...
        /**
         * Get the sampling rate of the conference.
         *
         * @return the sampling rate of the conference object.
         */
        int GetSamplingRate() const { return sampling_rate; }

        /**
         * Set the sampling rate of the conference.
         *
         * @param rate, the sampling rate for the conference.
         */
        void SetSamplingRate(int rate);
        ...
    protected:
        ...
        /** sampling rate of the conference */
        int sampling_rate;
        ...
    };

The definition of the SetSamplingRate method goes in libconf/confinstance.cpp. I also had to modify the constructor of the AudioSession class in libconf/confaudio.cpp so that it sets the sampling rate properly.
    AudioSession::AudioSession(Instance& c) : MediaSession(c)
    {
        ...
        sampling_rate = c.GetSamplingRate();
        ...
    }

Finally, I had to modify the SIPConference::RetrieveAttributes method in sipconf/sipconf.cpp so that, after retrieving the conference's attributes (including the sampling rate), it calls the SetSamplingRate method defined above to pass on the correct sampling rate.
    bool SIPConference::RetrieveAttributes()
    {
        ...
        SetSamplingRate(sampling);
        ...
    }
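Taken together, the three changes pass the sampling rate from the conference object to the audio session. The self-contained sketch below illustrates that flow with simplified stand-ins for Instance and AudioSession. Everything here beyond the names GetSamplingRate, SetSamplingRate and sampling_rate is an assumption: the body of SetSamplingRate, the default of 8000, and the SamplingRate accessor are illustrative, and the database lookup is left out.

```cpp
#include <cassert>

// Simplified stand-in for CINEMA's Instance (conference) class.
class Instance {
public:
    int GetSamplingRate() const { return sampling_rate; }

    // Assumed implementation: the original only says that the definition
    // lives in libconf/confinstance.cpp.
    void SetSamplingRate(int rate) {
        assert(rate > 0);
        sampling_rate = rate;
    }

protected:
    // Assumed default, mirroring the value that used to be hard-coded.
    int sampling_rate = 8000;
};

// Simplified stand-in for AudioSession: it takes the rate from the
// conference it belongs to instead of assuming 8000.
class AudioSession {
public:
    explicit AudioSession(const Instance& conference)
        : sampling_rate(conference.GetSamplingRate()) {}

    // Hypothetical accessor, added only so the result can be inspected.
    int SamplingRate() const { return sampling_rate; }

private:
    int sampling_rate;
};
```

In this sketch, a conference configured with SetSamplingRate(16000) produces audio sessions that report 16000, while an unconfigured conference still yields 8000, so existing narrowband conferences behave as before.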
I created two conferences: one with an 8 kHz sampling rate and another with a 16 kHz sampling rate.
I then ran sipconf using the following command:
./sipconf/sipconf -D sql://root@localhost/sip -X -b -S irt.cs.columbia.edu -l
The -b option makes sipconf send audio back to the participant who generated it.
I then connected to sipconf using linphone and tested both codecs with both conferences to check whether I could hear clearly. Since I ran the server with the '-b' option, hearing my own voice echoed back indicates that the codec is working properly.
First of all, I would like to thank Professor Henning Schulzrinne for giving me the opportunity to work on this project. I would like to thank Jonathan Lennox for giving me access to the CINEMA CVS repository and for his valuable advice during the starting phase of the project, which helped me become familiar with it. I would like to thank Wonsang Song for showing me how to run the sipconf server. Lastly, I would like to thank Kundan Singh for helping me understand the architecture of CINEMA, for his timely help and advice on the details of the existing CINEMA code base, and for helping me with bug fixing in CINEMA.