Integrating Speex Codec in CINEMA

Mohammad Merajul Islam Molla
Columbia University (NY)
New York, NY 10027
USA
mmm2177@columbia.edu

Abstract

CINEMA (Columbia InterNet Extensible Multimedia Architecture) is a set of SIP-based Internet multimedia servers for creating an enterprise Internet telephony and multimedia system. Its multimedia conferencing server, sipconf, uses various audio and video codecs. As part of my project, I added support for a new audio codec, Speex, so that sipconf can use it. In this tutorial-like article, I describe the steps required to support a new audio codec in CINEMA. While integrating Speex, I found problems in the existing conference server that prevented 16 kHz codecs from working properly. I also describe how I fixed those problems.

Introduction

CINEMA is a set of SIP-based Internet multimedia servers for creating an enterprise Internet telephony and multimedia system. CINEMA consists of sipd (SIP proxy, redirect, and registrar server), sipconf (SIP multimedia conferencing server), sipum (SIP voicemail/unified messaging server), sip323 (SIP-H.323 protocol translator) and rtspd (RTSP media server). It supports various audio codecs: pcmu, pcma, gsm, dvi and g722.

In this project, I added support for another audio codec, Speex. Speex is an Open Source/Free Software, patent-free audio compression format designed for speech. It is based on CELP and is designed to compress voice at bitrates ranging from 2 to 44 kbps. I added support for two modes of the Speex codec: narrowband (8 kHz) and wideband (16 kHz).

All the codecs previously supported in CINEMA were 8 kHz codecs, so integrating Speex's wideband mode made it possible to hold conferences using a 16 kHz codec. Supporting the wideband mode also exposed some interesting structural problems. Since all previous codecs were narrowband, these problems had not shown up before.
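
To make the difference between the two modes concrete, the following minimal sketch computes how many samples and bytes one frame occupies at each sampling rate. It assumes the 20 ms frame duration that Speex uses and mono 16-bit linear PCM as the raw format; the function names are hypothetical and are not part of CINEMA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Samples in one frame: rate (Hz) times duration (ms) divided by 1000.
// Both names below are hypothetical helpers, not CINEMA functions.
std::size_t SamplesPerFrame(unsigned rate_hz, unsigned frame_ms) {
  assert(rate_hz > 0 && frame_ms > 0);
  return static_cast<std::size_t>(rate_hz) * frame_ms / 1000;
}

// Bytes of mono 16-bit linear PCM in one frame (assumed raw format).
std::size_t PcmBytesPerFrame(unsigned rate_hz, unsigned frame_ms) {
  return SamplesPerFrame(rate_hz, frame_ms) * sizeof(std::int16_t);
}
```

Under these assumptions, a 20 ms frame holds 160 samples at 8 kHz and 320 samples at 16 kHz, so any code that sizes its buffers for 8000 samples per second is too small by half for the wideband mode.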

In the following sections of this article, I describe the steps required to support a new audio codec by going through the steps that I took to support Speex. I also describe how I fixed the problems in CINEMA that kept Speex's wideband mode from working properly.

Steps to Integrate New Audio Codec

In this section, I step through the process of integrating the Speex codec. A similar process can be followed to integrate any new audio codec in the future.

Step 1:

To support a new codec, we need a library that implements it. For Speex, I used the open source Speex library, version 1.0.5, which can be downloaded from the Speex project website. A good place to store the library source directory is the top directory of the CINEMA package. I extracted the sources of the Speex library into a folder named libspeex under CINEMA's top directory. I then wrote a module.mk file in the libspeex directory so that the library can be built properly using the 'make' command.

Step 2:

We have to modify the top level Makefile.in to support libspeex. To do this, I added a code fragment like this in the existing Makefile.in file.
CINEMA_MODULES = NT libcine \
                 ...
                 librtp rtplib++ libgsm libspeex \
                 libsip libcanon \
                 ...
We also have to modify module.mk in libmedia directory so that CINEMA's media library can find speex library. To do this, I added a code fragment like this in the existing libmedia/module.mk file.
LIBMEDIA_INC := -I$(topsrcdir)/libmedia $(LIBGSM_INC) $(LIBSPEEX_INC) $(LIBILBC_INC)
LIBMEDIA_LIB := -L$(libmedia_dir) -lmedia -lm $(LIBCINE_LIB) $(LIBGSM_LIB) $(LIBSPEEX_LIB) $(LIBILBC_LIB) $(SEMLIBS)
                                                                                
$(libmedia_dir)-CFLAGS := $(LIBCINE_INC) $(LIBGSM_INC) $(LIBSPEEX_INC) $(LIBSIPAPI_INC) $(LIBILBC_INC)
...
$(LIBMEDIA): $(LIBMEDIA_OBJ) $(LIBMEDIA_DEP) $(LIBGSM) $(LIBSPEEX) $(LIBILBC)
...

Step 3:

The next step is to write the code that implements the codec's operations. First we have to modify two files in the libmedia directory. At the very beginning of libmedia/mediacodec.h, I included the header file of the Speex library, speex.h.
extern "C" {
...
#include "speex.h"
}
Also at the end of the same file, we will have to declare a class for our codec. For Speex's narrowband mode, I declared a class as follows -
class SPEEX : public Codec {
  public:
    SPEEX();
    ~SPEEX();

    // Compress raw audio into Speex-encoded data.
    virtual bool Encode(const std::string& raw,
                        std::string& encoded);
    // Expand Speex-encoded data back into raw audio.
    virtual bool Decode(const std::string& encoded,
                        std::string& raw);
  private:
    void init_encoder();
    void init_decoder();

    static int control_enc;
    static int control_dec;

    void* state;
    SpeexBits bits;  // bit-packing structure from speex.h
  };
I declared another similar class for Speex's wideband mode in the same file.

Next we have to implement the methods of the declared class in libmedia/mediacodec.cpp. These methods are the actual implementation of the codec's encoding and decoding.
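
The general shape of such an Encode method can be sketched as a framing loop: the raw buffer is cut into fixed-size frames and each frame is handed to a per-frame encoder. This is only an illustration under stated assumptions, not the code in mediacodec.cpp. EncodeFrames and FrameEncoder are hypothetical names, and the per-frame callback stands in for the calls into the Speex library that the real class makes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical per-frame encoder: takes exactly one frame of raw PCM bytes
// and returns the compressed bytes for that frame. In the real codec class
// this step would be performed by the Speex library.
using FrameEncoder = std::function<std::string(const std::string& frame)>;

// Split `raw` into frames of `frame_bytes` bytes and encode each one.
// Returns false, leaving `encoded` empty, if `raw` is empty or is not a
// whole number of frames.
bool EncodeFrames(const std::string& raw, std::size_t frame_bytes,
                  const FrameEncoder& encode_frame, std::string& encoded) {
  assert(frame_bytes > 0);
  encoded.clear();
  if (raw.empty() || raw.size() % frame_bytes != 0) {
    return false;
  }
  for (std::size_t off = 0; off < raw.size(); off += frame_bytes) {
    encoded += encode_frame(raw.substr(off, frame_bytes));
  }
  return true;
}
```

The frame size passed in depends on the mode: with 20 ms frames of 16-bit samples it would be 320 bytes for narrowband and 640 bytes for wideband. Decode follows the same pattern in the opposite direction.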

We also have to modify the Codec::Allocate method at the beginning of libmedia/mediacodec.cpp so that it can allocate an object of our codec class. For the narrowband mode of Speex, I added a code fragment as follows -

Codec * Codec::Allocate(const SessionFormat& format)
{
  if (format == SessionFormat("pcmu", 8000, 1)) {
    return new G711MuLaw;
  }
  ...
  else if (format == SessionFormat("speex", 8000, 1)) {
    return new SPEEX;
  }
  ...
}
I added similar code for Speex codec's wideband mode.
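
The point of the second branch is that the two Speex modes share the encoding name "speex" and differ only in clock rate, so the rate has to take part in the comparison. The following self-contained sketch illustrates this with stand-in types. Format, CodecBase, SpeexNarrow, SpeexWide and AllocateCodec are hypothetical names invented for the example, not CINEMA classes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for CINEMA's SessionFormat: encoding name, clock rate, channels.
struct Format {
  std::string name;
  int rate;
  int channels;
};

bool operator==(const Format& a, const Format& b) {
  return a.name == b.name && a.rate == b.rate && a.channels == b.channels;
}

// Stand-in codec hierarchy (hypothetical; the real classes hold codec state).
struct CodecBase {
  virtual ~CodecBase() {}
  virtual int SamplingRate() const = 0;
};
struct SpeexNarrow : CodecBase {
  int SamplingRate() const override { return 8000; }
};
struct SpeexWide : CodecBase {
  int SamplingRate() const override { return 16000; }
};

// Same encoding name, different clock rate, different codec object.
std::unique_ptr<CodecBase> AllocateCodec(const Format& f) {
  assert(!f.name.empty());
  if (f == Format{"speex", 8000, 1}) {
    return std::unique_ptr<CodecBase>(new SpeexNarrow);
  }
  if (f == Format{"speex", 16000, 1}) {
    return std::unique_ptr<CodecBase>(new SpeexWide);
  }
  return nullptr;  // unsupported format
}
```

The sketch returns a smart pointer for safety, whereas Codec::Allocate in CINEMA returns a raw pointer that the caller owns.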

Step 4:

Next we have to add the appropriate session format for the new codec in libconf/confmember.cpp. For the narrowband mode of Speex, I used a code fragment as follows -
bool Member::GetSupportedFormat(SessionMedia& media) const
{
  if (media.IsAudio()) {
    media.RemoveAllFormats();
    media.AddFormat(0); // pcmu
    ...
    media.AddFormat(SessionFormat(110, "speex", 8000, 1));
    ...
  }
...
}
I made a similar modification to support the wideband mode of Speex.
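
Unlike pcmu, which has the static RTP payload type 0, Speex has no static assignment, so the fragment above gives it the number 110, which lies in the range 96 to 127 that RTP leaves for dynamically assigned formats. The sketch below shows one way such a list could be guarded against out-of-range or duplicate payload types. AudioFormat, IsDynamicPayloadType and AddDynamicFormat are hypothetical names, and CINEMA's SessionMedia class may handle this differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry in a session's audio format list.
struct AudioFormat {
  int payload_type;
  std::string name;
  int rate;
  int channels;
};

// RTP leaves payload types 96 to 127 for dynamically assigned formats.
bool IsDynamicPayloadType(int pt) { return pt >= 96 && pt <= 127; }

// Add a dynamically numbered format. Rejects payload types outside the
// dynamic range and payload types that are already in the list.
bool AddDynamicFormat(std::vector<AudioFormat>& formats,
                      const AudioFormat& f) {
  assert(f.rate > 0 && f.channels > 0);
  if (!IsDynamicPayloadType(f.payload_type)) {
    return false;
  }
  for (const AudioFormat& g : formats) {
    if (g.payload_type == f.payload_type) {
      return false;
    }
  }
  formats.push_back(f);
  return true;
}
```

A check like this matters once the wideband mode is added, because the two Speex modes need distinct payload types within the same session.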

Step 5:

The last step is to modify module.mk in sipconf directory so that sipconf can find speex library. I did this as follows -
SIPCONF_DIST_FILES := \
        $(TOP_DIST) \
        ... 
        $(LIBSPEEX_DIST) \
        ...

SIPCONF_DEPENDENCIES := $(LIBSIPAPI) $(LIBCONF) $(LIBMEDIA) $(DBAPI) $(LIBGSM) $(LIBSPEEX) $(LIBCINE) $(LIBSIP) $(LIBRESPARSE)
Similar modifications will be necessary if the codec has to be supported in the media server or in other components of CINEMA.

Bug-fixes

While integrating the Speex codec, I found some problems in CINEMA. In libconf/confaudio.cpp, the constructor of the AudioSession class set the sampling rate to 8000. As a result, whenever a user tried to connect to the server using a 16 kHz codec, the server crashed. This problem had not occurred before because all the previous codecs were 8 kHz codecs. To fix it, I had to make some structural modifications so that the sampling rate is read from the database table instead of being hard-coded to 8000.

I added two new methods in Instance class in libconf/confinstance.h file and declared sampling_rate as a protected member of the class.

class Instance : public LockableObject {
public:
  ...
  /**
   * Get the sampling rate of the conference.
   *
   * @return the sampling rate of the conference object.
   */
  int GetSamplingRate() const { return sampling_rate; }

  /**
   * Set the sampling rate of the conference.
   *
   * @param rate the sampling rate for the conference.
   */
  void SetSamplingRate(int rate);
  ...
protected:
  ...
  /** sampling rate of the conference */
  int sampling_rate;
  ...
};
The definition of the SetSamplingRate method goes in libconf/confinstance.cpp. I also had to modify the constructor of the AudioSession class in libconf/confaudio.cpp so that it takes the sampling rate from the conference instance.
AudioSession::AudioSession(Instance& c)
  : MediaSession(c)
{
  ...
  sampling_rate =  c.GetSamplingRate();
  ...
}
Finally, I had to modify the SIPConference::RetrieveAttributes method in sipconf/sipconf.cpp so that, after retrieving the conference's attributes (including the sampling rate), it calls the SetSamplingRate method defined above to pass on the correct sampling rate.
bool SIPConference::RetrieveAttributes()
{
    ...
    SetSamplingRate(sampling);
    ...
}
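
The effect of the three changes can be illustrated with a small self-contained sketch in which the audio session copies the rate from its conference object instead of assuming 8000. ConferenceSketch and AudioSessionSketch are hypothetical stand-ins for Instance and AudioSession. The 8000 default and the validation in SetSamplingRate (which returns bool here, unlike the void method above) are assumptions made for the example.

```cpp
#include <cassert>

// Stand-in for the conference object that owns the configured rate.
class ConferenceSketch {
 public:
  ConferenceSketch() : sampling_rate_(8000) {}  // assumed narrowband default

  int GetSamplingRate() const { return sampling_rate_; }

  // Accept only the two rates discussed in this article (an assumption);
  // any other value is rejected and the old rate is kept.
  bool SetSamplingRate(int rate) {
    if (rate != 8000 && rate != 16000) {
      return false;
    }
    sampling_rate_ = rate;
    return true;
  }

 private:
  int sampling_rate_;
};

// Stand-in for the audio session: reads the rate from its conference
// at construction time rather than hard-coding it.
class AudioSessionSketch {
 public:
  explicit AudioSessionSketch(const ConferenceSketch& c)
      : sampling_rate_(c.GetSamplingRate()) {
    assert(sampling_rate_ > 0);
  }

  int sampling_rate() const { return sampling_rate_; }

 private:
  int sampling_rate_;
};
```

In this sketch the order of operations matters: the rate must be set on the conference before the audio session is constructed, because the session copies the value once and does not observe later changes.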

Testing

I tested the integration of the Speex codec using linphone, an open source SIP phone.

I created two conferences - one with 8KHz sampling rate and another with 16KHz sampling rate.

I then ran sipconf using the command -

./sipconf/sipconf -D sql://root@localhost/sip -X -b -S irt.cs.columbia.edu -l

The -b option of sipconf sends audio back to the participant who generated it.

I then connected to sipconf using linphone and tested with both codecs and both conferences to check whether I could hear the audio clearly. Since I ran the server with the '-b' option, hearing my own voice echoed back indicates that the codec is working properly.
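
The reason an echo-back option is useful for a single-participant test can be seen from a generic mixing sketch. This is not sipconf's mixer, whose implementation is not described here. MixFor is a hypothetical function that sums one 16-bit sample per participant for a given listener and includes the listener's own sample only when echo-back is enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mix one output sample for `listener` from every participant's current
// sample. With echo_back set, the listener's own sample is included.
std::int16_t MixFor(const std::vector<std::int16_t>& samples,
                    std::size_t listener, bool echo_back) {
  assert(listener < samples.size());
  long sum = 0;
  for (std::size_t i = 0; i < samples.size(); ++i) {
    if (i == listener && !echo_back) {
      continue;  // skip the listener's own audio
    }
    sum += samples[i];
  }
  // Clamp to the 16-bit range so loud mixes saturate instead of wrapping.
  if (sum > 32767) sum = 32767;
  if (sum < -32768) sum = -32768;
  return static_cast<std::int16_t>(sum);
}
```

With a single participant and no echo-back, the mix for that participant is silence, so a lone tester would hear nothing whether or not the codec works. With echo-back, the tester hears audio that has passed through decoding and encoding on the server.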

Acknowledgment

First of all, I would like to thank Professor Henning Schulzrinne for giving me the opportunity to work on this project. I would like to thank Jonathan Lennox for giving me access to the CVS repository of CINEMA and for his valuable advice during the starting phase of the project, which helped me become familiar with it. I would like to thank Wonsang Song for showing me how to run the sipconf server. Lastly, I would like to thank Kundan Singh for helping me understand the architecture of CINEMA, for his timely help and advice on various details of the existing CINEMA code base, and for helping me with the bug fixes in CINEMA.