B. Compressing SDP Packets

Session announcements are transmitted as SDP packets. SDP is a simple protocol: a packet consists of ASCII text lines describing the announcement, separated by CR LF. These lines all have the same format:

x=value

where x is a single, case-sensitive letter that identifies the attribute, and value is the value given to the attribute. Several attributes are defined by SDP so far. The following is an SDP packet captured arbitrarily from the Internet:

v=0
o=james 3054298051 3054298223 IN IP4 129.89.142.50
s=FreeBSD Lounge
i=Channel to discuss FreeBSD related issues.  Please keep video bandwidth below 64 kbps.
e=Jim Lowe <james@cs.uwm.edu>
p=Jim Lowe (414) 229-6634
c=IN IP4 224.2.100.100/127
t=0 0
a=tool:sdr v2.2a23
a=type:test
m=audio 16400 RTP/AVP 0
c=IN IP4 224.2.100.100/127
a=ptime:40
m=video 49200 RTP/AVP 31
c=IN IP4 224.2.100.102/127
m=whiteboard 32800 udp wb
c=IN IP4 224.2.100.101/127
a=orient:portrait

This encoding is quite simple to implement, because a parser for SDP packets is very simple. But the format is very inefficient and wastes a lot of bandwidth. This is a problem on slow links such as PPP links over ordinary telephone lines, so the SDP packet should be compressed. The SAP protocol, which is normally used on the Internet to carry SDP packets, defines a simple way to compress the SDP packet using the gzip algorithm. This document describes a technique that achieves a better compression ratio than simply using the gzip algorithm. The resulting protocol will be called compressed SDP (CSDP).
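To illustrate how simple such a parser can be, here is a minimal sketch that splits a packet into (letter, value) pairs. The function name parse_sdp is hypothetical, and the sketch assumes well-formed x=value lines; it is not taken from any existing SDP implementation.

```python
def parse_sdp(text: str) -> list[tuple[str, str]]:
    """Split an SDP packet into (type letter, value) pairs, preserving order."""
    fields = []
    for line in text.splitlines():  # splitlines() also accepts CR LF line ends
        if not line:
            continue  # tolerate blank lines
        if len(line) < 2 or line[1] != "=":
            raise ValueError(f"not an SDP line: {line!r}")
        fields.append((line[0], line[2:]))
    return fields


packet = "v=0\r\ns=FreeBSD Lounge\r\nc=IN IP4 224.2.100.100/127\r\nt=0 0\r\n"
fields = parse_sdp(packet)
```

The order of the pairs is kept because the compression scheme described below depends on it.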

B.1. CSDP

The SDP packet will be compressed sequentially, one attribute at a time. Because the order of the attributes is specified in the SDP draft, it is not necessary to identify the attributes explicitly by an ID. They can simply be identified by their order. If an optional attribute is not present in an SDP packet, it will be coded as not present (see below).

There are two groups of attributes which can be repeated several times: the Time Description and the Media Description. The first attribute of each group is mandatory if the group is present, so these groups will also be repeated in the compressed SDP packet. The last (or even missing) group is indicated by the first attribute. If it is not present in the compressed packet (violation of the SDP specification), then this group is no longer repeated.

B.2. The Presence Bit

Because some attributes are optional and others can be repeated, every attribute will be preceded by a <presence-bit>. If this bit is 1, the attribute is present; if it is 0, it is not. After a present attribute (i.e. the bit was 1), another <presence-bit> follows, so the attribute can be repeated until a 0 bit ends the sequence. For example, to specify an attribute once, the following sequence is produced:

+-+-+-+-+-+ ... +-+-+-+
|1| the attribute   |0|
+-+-+-+-+ ... +-+-+-+-+

If the attribute occurs two times, then the following sequence will be generated:

+-+-+-+-+-+ ... +-+-+-+-+-+-+-+ ... +-+-+-+
|1| the attribute   |1| the attribute   |0|
+-+-+-+-+ ... +-+-+-+-+-+-+-+ ... +-+-+-+-+

According to the SDP specification, some fields are mandatory and have to be specified exactly once, so the <presence-bit> is not strictly needed for such attributes. But to simplify compression and decompression, and to make the algorithm clearer, every attribute will be encapsulated by the <presence-bit>. This also makes it possible to extend the SDP spec so that such attributes can be repeated or skipped. The overhead of two bits (1 and 0) is acceptable.

There is one exception regarding the usage of the <presence-bit>. The very first attribute, the version of the SDP packet, will NOT be encapsulated. This lets us define new versions of SDP or of this compression protocol that use a completely different encapsulation. Therefore, we need the version information in plain form.
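The presence-bit framing can be sketched as follows. Bits are modelled as strings of '0' and '1' for readability; the function names and the fixed-width decoding are assumptions made for this illustration only.

```python
def encode_repeated(values: list[str]) -> str:
    """Prefix each encoded attribute (a '0'/'1' string) with a 1 bit, then end with 0."""
    return "".join("1" + value for value in values) + "0"


def decode_repeated(bits: str, width: int, pos: int = 0) -> tuple[list[str], int]:
    """Read fixed-width attributes until a 0 presence bit is found.

    Returns the attribute bit strings and the position after the final 0 bit.
    """
    values = []
    while bits[pos] == "1":
        values.append(bits[pos + 1 : pos + 1 + width])
        pos += 1 + width
    return values, pos + 1


once = encode_repeated(["1010"])
twice = encode_repeated(["1010", "0011"])
absent = encode_repeated([])
```

An absent optional attribute costs a single 0 bit, and an attribute that occurs once costs two bits of framing, as stated above.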

B.3. Text Compression

Several attributes contain plain ASCII text intended to be read by humans. Such fields cannot be compressed like an IP address, so they will be compressed by the gzip algorithm.

Tests have shown that the gzip algorithm works very poorly when the compressed header information for each attribute (e.g. the <presence-bit> for the i= attribute) is directly followed by the text value (the textual description). Because of this, the compressed SDP packet will be split into four sections.

+----------------------+
| version              |
+----------------------+
| header-length        |
+----------------------+
| header               |
+----------------------+
| compressed text data |
+----------------------+

The version section simply defines the SDP and CSDP version. The header-length section defines the number of bytes occupied by the version, the header-length and the header. The header section contains all the compressed SDP data, excluding the compressed text data. The last section contains the sequentially appended text, compressed by gzip (the whole collected text buffer is compressed at once, not each attribute separately!).

The last section begins on a byte boundary. Because the compression is bit oriented, the last octet of the header section may be padded to the byte boundary. This can be done by appending 0 to 7 <final-bits>.

Theoretically, the length field can be omitted, because the end of the header can be detected automatically: the last Media Description group defines the end of the SDP packet, so the last byte is the one with the 0 presence-bit for the group. But while implementing the CSDP decompression algorithm, we found that the text sections are difficult to handle this way. Without the length indication, the parser has to parse the packet twice, resulting in poor performance. So we decided to include the length indication.
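The four-section layout can be sketched as below. Several points are assumptions of this sketch rather than statements of the specification: bits are packed most significant bit first, the text section is compressed with Python's zlib module (a DEFLATE stream, standing in for "the gzip algorithm"), and the names assemble_csdp and split_csdp are hypothetical.

```python
import zlib


def assemble_csdp(header_bits: str, text: bytes) -> bytes:
    """Build version + header-length + header + compressed text.

    header_bits is a '0'/'1' string holding everything after the
    header-length field.
    """
    total_bits = 2 + 8 + len(header_bits)   # 2 bit version, 8 bit header-length
    header_len = (total_bits + 7) // 8      # octets, rounded up to a byte boundary
    if header_len >= 0xFF:                  # all ones is reserved for extension
        raise ValueError("header too long for the 8 bit length field")
    bits = "00" + format(header_len, "08b") + header_bits
    bits += "0" * (header_len * 8 - len(bits))  # 0 to 7 final-bits of padding
    return int(bits, 2).to_bytes(header_len, "big") + zlib.compress(text)


def split_csdp(packet: bytes) -> tuple[int, bytes, bytes]:
    """Return (version, header octets, decompressed text)."""
    first_two = int.from_bytes(packet[:2], "big")
    version = first_two >> 14
    header_len = (first_two >> 6) & 0xFF
    return version, packet[:header_len], zlib.decompress(packet[header_len:])


packet = assemble_csdp("1" * 21, b"FreeBSD Lounge")
version, header, text = split_csdp(packet)
```

With the length field in place, the decoder can start decompressing the text section immediately instead of first walking the whole header to find its end.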

B.4. CSDP Specification

The following sections specify the format of the compressed version of every attribute defined by SDP. The order given is the same as in the SDP draft.

For every length field defined below, a value with all bits set to 1 is reserved to indicate a length extension. This allows us to compress attributes of usual length efficiently, while still making it possible to encode larger attributes. The format and usage of the extension mechanism have not been defined yet.

Each attribute except the version is preceded by a <presence-bit> as defined above. It is shown as the P bit in the diagrams.

B.4.1. Version (v=)

This field is mandatory for all SDP packets, and it is the only one that is NOT preceded by the presence-bit!

Uncompressed format
v=<version>
Compressed format
```
 0 1
+-+-+
|ver|
+-+-+
```
Ver is the binary encoded version field. Currently, the value 00 is defined for the current version 0, and the description given here applies only to this version!

B.4.2. Header-Length

This is not an SDP attribute, but it is necessary for decoding the compressed SDP packet. This field indicates the length of the header, which precedes the compressed text section.

Uncompressed format
not available
Compressed format
```
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|<header-length>|
+-+-+-+-+-+-+-+-+
```
<header-length> is the binary encoded length of the header, given in octets. The value covers the whole header, including the <version> and <header-length> fields, so it is an offset to the point where the compressed text section begins.

B.4.3. Origin (o=)

The origin field specifies the origin of the announcement.

Uncompressed format
o=<username> <session-id> <version> <net-type> <addr-type> <addr>

Compressed format

                                 3 3                   3 3
 0   0 1 2 3   0 1 2 3 4         0 1   0 1 2 3         0 1
+-+ +-+-+-+-+ +-+-+-+-+ ... +-+-+-+-+ +-+-+-+-+ ... +-+-+-+
|P| | u-len | | 32 bit <session-id> | | 32 bit <version>  |
+-+ +-+-+-+-+ +-+-+-+ ... +-+-+-+-+-+ +-+-+-+ ... +-+-+-+-+
 0 1 2 3 4 5
+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+
| <net-type> <addr-type> <addr> |
+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+

u-len specifies the length of the username in octets and is 6 bits long. Because the <username> is raw ASCII, it will be appended to the text section to be compressed. Because in practice the <session-id> and <version> fields seem to be derived from a 32 bit integer, they will be represented in binary. The address triple will be encoded as defined in B.4.17 (Combined Compression of <net-type> <addr-type> <addr>).

B.4.4. Session Name (s=)

This field gives the session a reasonable name.

Uncompressed format
s=<session-name>
Compressed format
```
 0   0 1 2 3 4 5
+-+ +-+-+-+-+-+-+
|P| | 6 bit len |
+-+ +-+-+-+-+-+-+
```
The 6 bit len field specifies the length of the session name in octets. The <session-name> will be appended to the text section to be compressed.
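The same pattern (presence bit, length field, value moved to the text section) is used by all text attributes. A minimal sketch, with the hypothetical helper encode_text_attr and with bits modelled as a '0'/'1' string; the terminating 0 presence bit that follows a non-repeated attribute is left to the caller:

```python
def encode_text_attr(value: str, len_bits: int, text_buf: bytearray) -> str:
    """Return P bit + length field; append the raw text to the shared buffer."""
    data = value.encode("ascii")
    reserved = (1 << len_bits) - 1  # all ones: reserved for the length extension
    if len(data) >= reserved:
        raise ValueError("value too long for this length field")
    text_buf.extend(data)           # text is compressed later, all at once
    return "1" + format(len(data), f"0{len_bits}b")


text_buf = bytearray()
header = encode_text_attr("FreeBSD Lounge", 6, text_buf)
```

For the session name of the example packet, the header part is 7 bits long and the 14 octets of text end up in the buffer that is later compressed as a whole.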

B.4.5. Session Description (i=) or Media Title

This optional attribute describes the session.

Uncompressed format
i=<session-description>
Compressed format
```
 0  0 1 2 3 4 5 6 7 8 9
+-+ +-+-+-+-+-+-+-+-+-+-+
|P| | 10 bit length     |
+-+ +-+-+-+-+-+-+-+-+-+-+
```
The 10 bit length field specifies the length of the description in octets. It is possible to specify a length of up to 1023 octets, which is enough because of the length restriction of an SDP packet to 1024 octets (well, the packet "should" not be larger than 1024). The <session-description> will be appended to the text section to be compressed.

B.4.6. URI (u=)

This optional attribute specifies a URI for further information about the session.

Uncompressed format
u=<uri>
Compressed format
```
 0   0 1 2 3 4 5 6 7
+-+ +-+-+-+-+-+-+-+-+
|P| | 8 bit length  |
+-+ +-+-+-+-+-+-+-+-+
```
The 8 bit length field specifies the length of the <uri> in octets. The <uri> field will be appended to the text section to be compressed.

B.4.7. E-Mail (e=)

This optional attribute defines an e-mail address for a contact person regarding the session. This field can be repeated.

Uncompressed format
e=<e-mail>
Compressed format
```
 0   0 1 2 3 4 5 6 7
+-+ +-+-+-+-+-+-+-+-+
|P| | 8 bit length  |
+-+ +-+-+-+-+-+-+-+-+
```
The 8 bit length field specifies the length of the <e-mail> address in octets. The <e-mail> address will be appended to the text section to be compressed.

B.4.8. Phone Number (p=)

This attribute defines a phone number for a contact person regarding the session. This field can be repeated and is optional.

Uncompressed format
p=<phone-number>
Compressed format
```
 0   0 1 2 3 4 5 6 7
+-+ +-+-+-+-+-+-+-+-+
|P| |     length    |
+-+ +-+-+-+-+-+-+-+-+
```
The 8 bit length field specifies the length of the <phone-number> in octets. The <phone-number> will be appended to the text section to be compressed.

B.4.9. Connection Information (c=)

This attribute defines where the session is being transmitted (i.e. on which network address). It is optional and can be repeated. This field can also be used in the Media Description group.

Uncompressed format
c=<net-type> <addr-type> <addr>[/<ttl>]

Compressed format
Format for an IN IP4 address:

 0   0 1 2 3 4 5 6 7 8                 0 1 2 3 4 5 6 7
+-+ +-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+ +-+-+-+-+-+-+-+-+
|P| | <net-type> <addr-type> <addr> | |  8 bit <ttl>  |
+-+ +-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+

The address triple (<net-type>, <addr-type> and <addr>) will be encoded as defined in B.4.17 (Combined Compression of <net-type> <addr-type> <addr>). The 8 bit <ttl> field is the binary encoded <ttl>.
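As an illustration, the c= line of the example packet could be encoded as follows. This is a sketch under assumptions: only the IN IP4 form with an explicit ttl is handled, bits are modelled as a '0'/'1' string, and encode_connection is a hypothetical name.

```python
import ipaddress


def encode_connection(value: str) -> str:
    """Encode 'IN IP4 <addr>/<ttl>' as P + 4 bit type + 32 bit address + 8 bit ttl."""
    net_type, addr_type, rest = value.split()
    if (net_type, addr_type) != ("IN", "IP4"):
        raise ValueError("only IN IP4 is handled in this sketch")
    addr, _, ttl = rest.partition("/")
    if not ttl:
        raise ValueError("this sketch expects an explicit ttl")
    ttl_value = int(ttl)
    if not 0 <= ttl_value <= 255:
        raise ValueError("ttl does not fit in 8 bits")
    addr_bits = format(int(ipaddress.IPv4Address(addr)), "032b")
    return "1" + "0000" + addr_bits + format(ttl_value, "08b")


bits = encode_connection("IN IP4 224.2.100.100/127")
```

The 24 characters of the textual value shrink to 45 bits, i.e. less than 6 octets.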

B.4.10. Bandwidth (b=)

This optional attribute specifies the bandwidth consumed by the session.

Uncompressed format
b=<modifier>:<bandwidth-value>

Compressed format

 0   0 1 2   0 1 2 3 4 5 6 7 8 9
+-+ +-+-+-+ +-+-+-+-+-+-+-+-+-+-+
|P| |<mod>| | 10 bit <bandwidth>|
+-+ +-+-+-+ +-+-+-+-+-+-+-+-+-+-+

The 3 bit <mod> is encoded as follows:

000	CT
001	AS
111	(reserved for extension)

The <bandwidth> will be encoded as a 10 bit binary value.
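A short sketch of this encoding, using the modifier table above. The name encode_bandwidth is hypothetical and bits are modelled as a '0'/'1' string.

```python
MODIFIERS = {"CT": "000", "AS": "001"}  # 111 is reserved for extension


def encode_bandwidth(value: str) -> str:
    """Encode '<modifier>:<bandwidth-value>' as P + 3 bit mod + 10 bit bandwidth."""
    modifier, _, amount = value.partition(":")
    if modifier not in MODIFIERS:
        raise ValueError(f"unknown modifier: {modifier!r}")
    bandwidth = int(amount)
    if not 0 <= bandwidth <= 1023:
        raise ValueError("bandwidth does not fit in 10 bits")
    return "1" + MODIFIERS[modifier] + format(bandwidth, "010b")


bits = encode_bandwidth("CT:64")
```

Values above 1023 cannot be represented in the 10 bit field, so the sketch rejects them rather than truncating silently.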

B.4.11. Times (t=)

This attribute defines the time when the session will be active. It also starts the Time Description group. If it is present, then the group definition (t=, r= and z=) follows. Otherwise the other attributes of the group are skipped.

Uncompressed format
t=<start-time> <stop-time>
Compressed format
```
                         6                       6
 0   0 1 2 3 4 5         2   0 1 2 3 4 5         2
+-+ +-+-+-+-+-+-+ ... +-+-+ +-+-+-+-+-+-+ ... +-+-+
|P| | 27 bit start-time   | | 27 bit stop-time    |
+-+ +-+-+-+-+-+ ... +-+-+-+ +-+-+-+-+-+ ... +-+-+-+
```
The 27 bit start-time is the binary encoded value of the <start-time>, given in minutes. The 27 bit stop-time is the binary encoded value of the <stop-time>, given in minutes.
Because these attributes are represented as NTP values, the 27 bit binary encoding in minutes should not impose any limitation.
The values may have to be rounded to whole minutes. This does not change the practical meaning. In any case, we define that the <start-time> has to be rounded down and the <stop-time> has to be rounded up.
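The rounding rule and the range argument can be checked with a small sketch. It assumes that <start-time> and <stop-time> are NTP timestamps in seconds that fit in 32 bits; under that assumption the value in minutes is always below 2^27. The name encode_times is hypothetical, and the two timestamps in the usage lines are arbitrary numbers chosen for illustration.

```python
def encode_times(start_s: int, stop_s: int) -> str:
    """Encode t= as P + 27 bit start (minutes, rounded down) + 27 bit stop (rounded up)."""
    start_min = start_s // 60        # round down
    stop_min = -(-stop_s // 60)      # round up (ceiling division)
    for minutes in (start_min, stop_min):
        if not 0 <= minutes < (1 << 27):
            raise ValueError("time does not fit in 27 bits")
    return "1" + format(start_min, "027b") + format(stop_min, "027b")


unbounded = encode_times(0, 0)                    # t=0 0 from the example packet
bits = encode_times(3054298051, 3054298223)       # illustrative timestamps
start_minutes = int(bits[1:28], 2)
stop_minutes = int(bits[28:], 2)
```

Rounding the start down and the stop up guarantees that the announced interval never becomes shorter than the original one.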

B.4.12. Repeat Interval (r=)

This optional attribute specifies when the session will be repeated. Note that this field can only occur within a Time Description group.

Uncompressed format
r=<repeat-interval> <active-duration> <list-of-offsets>
Compressed format
```
                       4                       1
 0   0 1 2 3 4 5       2   0 1 2 3 4 5         5
+-+ +-+-+-+-+-+-+ ... +-+ +-+-+-+-+-+-+ ... +-+-+
|P| | 20 bit interval   | | 11 bit  duration    |
+-+ +-+-+-+-+-+ ... +-+-+ +-+-+-+-+-+ ... +-+-+-+
```
The 20 bit interval is the binary encoded value of <repeat-interval>, given in minutes. The largest value is more than 1 year, so 20 bits should be sufficient.
The 11 bit duration is the binary encoded value of <active-duration>, given in minutes. The largest value is about 34 hours (longer than one day), which should be sufficient.
The variable length list-of-offsets is encoded like an attribute, i.e. each offset is preceded by the presence-bit. This field is repeated until a presence-bit of 0 is received.
```
                       4
 0   0 1 2 3 4 5       2
+-+ +-+-+-+-+-+-+ ... +-+
|P| |   20 bit offset   |
+-+ +-+-+-+-+-+ ... +-+-+
```
The 20 bit offset is the binary encoded value of one <list-of-offsets> element, given in minutes. The largest value is more than a year, which should be sufficient.
The values may be truncated to whole minutes, which does not change the practical meaning. In any case, the <repeat-interval> should be rounded down, the <active-duration> up and the <offset> down.
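A sketch of one r= entry, including the presence-bit framed offset list. It assumes that all values have already been converted to seconds (SDP also allows unit suffixes, which are not handled here); encode_repeat is a hypothetical name and bits are modelled as a '0'/'1' string.

```python
def encode_repeat(interval_s: int, duration_s: int, offsets_s: list[int]) -> str:
    """Encode r= as P + 20 bit interval + 11 bit duration + (1 + 20 bit offset)* + 0."""
    interval = interval_s // 60          # rounded down
    duration = -(-duration_s // 60)      # rounded up
    if interval >= (1 << 20) or duration >= (1 << 11):
        raise ValueError("interval or duration out of range")
    bits = "1" + format(interval, "020b") + format(duration, "011b")
    for offset_s in offsets_s:
        offset = offset_s // 60          # rounded down
        if offset >= (1 << 20):
            raise ValueError("offset out of range")
        bits += "1" + format(offset, "020b")
    return bits + "0"                    # 0 presence bit ends the offset list


# Weekly session of one hour, with offsets of 0 and 25 hours.
bits = encode_repeat(604800, 3600, [0, 90000])
```

The example needs 75 bits, compared to 23 octets for the text "r=604800 3600 0 90000".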

B.4.13. Timezone Adjust (z=)

This optional attribute specifies timezone adjustments. Note that this field can only occur within a Time Description group.

Uncompressed format
z=<adjust-time> <offset> ...
Compressed format
Because this attribute consists of a variable length list of the two fields, it will be repeated until the presence-bit is 0.
```
                         4
 0   0 1 2 3 4 5         2   0 1 2 3 4 5 6 7
+-+ +-+-+-+-+-+-+ ... +-+-+ +-+-+-+-+-+-+-+-+
|P| | 20 bit adjust time  | | 8 bit offset  |
+-+ +-+-+-+-+-+ ... +-+-+-+ +-+-+-+-+-+-+-+-+
```
The 20 bit adjust time is the binary encoded value of the <adjust-time> field, given in minutes. The 8 bit offset is the binary encoded value of the <offset> field, also given in minutes. This allows an adjustment of up to about 4 hours, which should be more than enough.
The values may be truncated to whole minutes, which does not change the practical meaning. In any case, the <adjust-time> should be rounded down and the <offset> up.

B.4.14. Encryption Key (k=)

Uncompressed format
Compressed format
```
(currently not defined)
```

B.4.15. Session Attribute (a=)

This attribute defines additional attributes.

Uncompressed format
a=<flag>
a=<attrib>:<value>
Compressed format
Because the values for this attribute vary widely and will be extended in the future, it makes no sense to define any special mappings. Instead, it will be encoded like text.
```
 0   0 1 2 3 4 5
+-+ +-+-+-+-+-+-+
|P| |   length  |
+-+ +-+-+-+-+-+-+
```
The 6 bit length field specifies the length of the remaining attribute definition, either the <flag> or the <attrib>:<value>. These fields will be appended to the text section to be compressed.

B.4.16. Media (m=)

Uncompressed format
m=<media> <port> <transport> <fmt-list>
Example
m=video 60450 RTP/AVP 31
Compressed format
```
                                1
 0   0 1 2   0 1                5    0 1 2
+-+ +-+-+-+ +-+-+-+-+-+-+ ... +-+-+ +-+-+-+
|P| |media| | 16-bit port         | |trans|
+-+ +-+-+-+ +-+-+-+-+-+ ... +-+-+-+ +-+-+-+
```
The <media> field will be encoded as a 3 bit binary value. The following values have been defined:
```
000	audio
001	video
010	whiteboard
011	html
100	text
111	(reserved for extensions)
```
The 16 bit port field is the binary encoded value of <port>.
NR IS MISSING!!!
The next field is the transport, which will be encoded as a 3 bit value. The following values have been defined:
```
000	RTP/AVP
001	VAT
010	UDP
111	(reserved for extensions)
```
Because the <fmt-list> is a variable length list, each entry will be encoded as a presence-bit followed by a value. If the presence-bit is 0, there are no more fmt entries. Otherwise, a media-specific 5 bit encoded fmt description follows. For the media type audio, the following values are defined:
```
00001	0	(pcm)
...
00010	pcm
00011	gsm
00100	dvi4
0
0001	pcm
0010	gsm
11111	(reserved for extension)
```
For the media type video, the following values are defined:
```
00000	h261, 31
00001	???, 96
11111	(reserved for extension)
```
For the media type whiteboard, the following values are defined:
```
00000	none
00001	wb
11111	(reserved for extension)
```
For the media type html, the following values are defined:
```
00000	Mosaic
00001	Netscape
11111	(reserved for extension)
```
For the media type text, the following values are defined:
```
00000	nt
11111	(reserved for extension)
```
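The media encoding can be sketched with the tables above. Only the video and whiteboard format tables are used here, the transport name is matched case-insensitively (an assumption, since the example packet writes "udp" while the table lists UDP), and encode_media is a hypothetical name. Bits are modelled as a '0'/'1' string.

```python
MEDIA = {"audio": "000", "video": "001", "whiteboard": "010", "html": "011", "text": "100"}
TRANSPORT = {"RTP/AVP": "000", "VAT": "001", "UDP": "010"}
FORMATS = {
    "video": {"31": "00000", "96": "00001"},
    "whiteboard": {"none": "00000", "wb": "00001"},
}


def encode_media(value: str) -> str:
    """Encode '<media> <port> <transport> <fmt-list>' (the part after 'm=')."""
    media, port, transport, *fmts = value.split()
    port_number = int(port)
    if not 0 <= port_number <= 0xFFFF:
        raise ValueError("port does not fit in 16 bits")
    bits = "1" + MEDIA[media] + format(port_number, "016b") + TRANSPORT[transport.upper()]
    for fmt in fmts:
        bits += "1" + FORMATS[media][fmt]   # presence bit + 5 bit fmt code
    return bits + "0"                       # 0 presence bit ends the fmt list


video = encode_media("video 49200 RTP/AVP 31")
whiteboard = encode_media("whiteboard 32800 udp wb")
```

Each of the two media lines from the example packet fits into 30 bits.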

B.4.17. Combined Compression of <net-type> <addr-type> <addr>

Network addresses are specified as a triple in SDP. Instead of coding each part separately, we will combine these three fields into one logical unit.

The compressed version begins with a fixed 4 bit header which defines the combined type of the address format. The following formats have been defined up to now:

IN IP4 <addr>

                     1               3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3         5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+-+
|0 0 0 0|  32 bit binary IP4 address  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+-+-+

IN IP6 <addr>

                                     1
                     1               3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3         1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+-+
|0 0 0 1|  128 bit binary IP6 address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+-+-+

IN DNS <addr> (currently not defined in SDP)

                     1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+-+-+-+-+-+-+-+-+
|0 0 1 0| length    | length octets of DNS-address  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+-+-+-+-+-+-+-+-+-+

Length (6 bits) specifies the length of the DNS-address in octets (i.e. 8 bit units). Addresses longer than 63 octets cannot be represented by this format.

The following header is reserved for further extensions.
```
 0 1 2 3
+-+-+-+-+
|1 1 1 1|
+-+-+-+-+
```
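On the decoding side, the 4 bit header selects the address format. The following sketch handles the IN IP4 and IN DNS formats only; the name decode_address is hypothetical, bits are modelled as a '0'/'1' string, and the DNS name is assumed to be ASCII.

```python
import ipaddress


def decode_address(bits: str, pos: int = 0) -> tuple[str, int]:
    """Decode one address triple starting at pos; return (text form, next position)."""
    kind = bits[pos : pos + 4]
    pos += 4
    if kind == "0000":                                   # IN IP4
        addr = ipaddress.IPv4Address(int(bits[pos : pos + 32], 2))
        return f"IN IP4 {addr}", pos + 32
    if kind == "0010":                                   # IN DNS
        length = int(bits[pos : pos + 6], 2)
        pos += 6
        end = pos + 8 * length
        raw = int(bits[pos:end], 2).to_bytes(length, "big") if length else b""
        return f"IN DNS {raw.decode('ascii')}", end
    raise ValueError(f"address format {kind} is not handled in this sketch")


ip4_bits = "0000" + "11100000000000100110010001100100"
dns_bits = "0010" + format(3, "06b") + "".join(format(b, "08b") for b in b"a.b")
ip4 = decode_address(ip4_bits)
dns = decode_address(dns_bits)
```

Returning the next bit position lets the caller continue with whatever field follows the address, for example the 8 bit <ttl> of a c= attribute.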

B.5. Comparison of SDP and CSDP

The following is a comparison of the sizes of plain SDP, gzipped SDP and CSDP. The SDP packets were captured from the Internet over several hours, so they represent a set of typically transmitted SDP packets. The packets are ordered by original size:

Nr	SDP/bytes	gzip'd	ratio/%		CSDP	ratio/%

19	178		157	11.7		86	51.6
04	274		210	23.3		158	42.3
20	336		245	27.0		194	42.2
16	356		260	26.9		195	45.2
09	371		247	33.4		201	45.8
18	378		290	23.2		217	42.5
01	381		286	24.9		228	40.1
14	381		298	21.7		230	39.6
05	406		293	27.8		213	47.5
03	408		294	27.9		231	43.3
24	416		307	26.2		249	40.1
07	458		304	33.6		248	45.8
06	459		309	32.6		255	44.4
21	466		309	33.6		238	48.9
23	505		340	32.6		245	51.4
15	505		341	32.4		245	51.4
17	505		341	32.4		245	51.4
11	518		354	31.6		291	43.8
22	531		342	35.5		250	52.9
10	565		380	32.7		316	44.0
13	644		429	33.3		351	45.4
08	647		394	39.1		305	52.8
12	665		447	32.7		356	46.4
02	744		435	41.5		372	50.0
25	834		524	38.3		399	52.1
00	837		517	38.2		360	56.9

The following graph compares the size of each compressed SDP packet, once compressed with gzip and once with the method described here:

You can see very clearly that the CSDP compression is much more effective than simply applying gzip to the SDP packet, as expected by the author. Incidentally, it is interesting that the two plots have nearly the same shape, but the CSDP plot is lowered by an offset compared to gzip.

Another interesting comparison is the compression ratio achieved by the two compression techniques:

Once again you can see that the shape is nearly the same. But the major result is that the CSDP method has a compression ratio of approximately 40% to 55%, compared to 15% to 40% when using gzip. The arithmetic mean of the compression ratio is 30.5% for the gzipped SDP packets and 46.8% for the CSDP compression.
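The mean values can be recomputed from the sizes in the table. The sketch below assumes that the ratio columns mean the saved fraction, 1 - compressed/original, which is an inference from the table values and not stated explicitly. Small deviations from the quoted means are to be expected, because the table lists the ratios with one decimal place only.

```python
# (SDP, gzip'd, CSDP) sizes in bytes, copied from the table above.
SIZES = [
    (178, 157, 86), (274, 210, 158), (336, 245, 194), (356, 260, 195),
    (371, 247, 201), (378, 290, 217), (381, 286, 228), (381, 298, 230),
    (406, 293, 213), (408, 294, 231), (416, 307, 249), (458, 304, 248),
    (459, 309, 255), (466, 309, 238), (505, 340, 245), (505, 341, 245),
    (505, 341, 245), (518, 354, 291), (531, 342, 250), (565, 380, 316),
    (644, 429, 351), (647, 394, 305), (665, 447, 356), (744, 435, 372),
    (834, 524, 399), (837, 517, 360),
]


def ratio(original: int, compressed: int) -> float:
    """Saved fraction in percent."""
    return 100.0 * (1 - compressed / original)


gzip_mean = sum(ratio(sdp, gz) for sdp, gz, _ in SIZES) / len(SIZES)
csdp_mean = sum(ratio(sdp, csdp) for sdp, _, csdp in SIZES) / len(SIZES)
gain = csdp_mean - gzip_mean  # difference in percentage points
```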

B.6. Summary

As shown in the previous sections, the method explained here for compressing SDP packets is much more efficient than simply using gzip. The average gain compared to gzip is about 16 percentage points (46.8% - 30.5% = 16.3), which is a notable improvement.

But there are also some pitfalls. First, the compression algorithm shown here relies strictly on the order of the SDP lines. While implementing the algorithm explained above, I realized that there are still a lot of SDP packets on the net that do not strictly follow this order, even ones generated by sdr. So prior to compression, a reordering of the fields was necessary. This results in correctly ordered SDP packets, so there was no packet that could not be compressed.
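One possible way to do such a reordering is a stable sort per section; this is a sketch and not necessarily the method used for the measurements above. It assumes that every r= line already follows the t= line it belongs to (both get the same rank, so their relative order is kept), and that the lines after each m= line belong to that media section. The session-level order follows the attribute order used in B.4; the media-level order (m, i, c, b, k, a) is taken from the SDP specification. The name reorder_sdp is hypothetical.

```python
SESSION_ORDER = "vosiuepcbtzka"   # r= shares the rank of t=
MEDIA_ORDER = "micbka"


def _rank(order: str, line: str) -> int:
    letter = "t" if line[0] == "r" else line[0]
    index = order.find(letter)
    return index if index >= 0 else len(order)   # unknown letters go last


def reorder_sdp(lines: list[str]) -> list[str]:
    """Return the lines in canonical order, using stable sorts per section."""
    session, media_blocks = [], []
    for line in lines:
        if line.startswith("m="):
            media_blocks.append([line])
        elif media_blocks:
            media_blocks[-1].append(line)
        else:
            session.append(line)
    result = sorted(session, key=lambda l: _rank(SESSION_ORDER, l))
    for block in media_blocks:
        result += sorted(block, key=lambda l: _rank(MEDIA_ORDER, l))
    return result


shuffled = [
    "v=0", "s=FreeBSD Lounge", "o=james 1 2 IN IP4 129.89.142.50", "t=0 0",
    "c=IN IP4 224.2.100.100/127", "a=tool:sdr",
    "m=audio 16400 RTP/AVP 0", "a=ptime:40", "c=IN IP4 224.2.100.100/127",
]
ordered = reorder_sdp(shuffled)
```

Because Python's sort is stable, repeated attributes such as several e= or a= lines keep their original relative order.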

Of course, extending the SDP (e.g. adding new tags) will cause a problem. But it should be easy to adapt such changes to CSDP as well.

File was created Wed Feb 26 17:43:11 1997 by tex2html