This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 21, 2003.
Copyright (C) The Internet Society (2003). All Rights Reserved.
One of the major applications of the Session Initiation Protocol (SIP) is in Internet telephony. In this context it is often useful or convenient for a SIP entity to request that another SIP User Agent generate some type of tone, without generating this tone as part of a session. This document describes how SIP can be used without modification to carry such tones and explores issues relating to their use. Finally, this document defines a new MIME type that is useful for conveying computer-generated tones.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119[2].
One of the major applications of SIP[1] is in Internet telephony. In this context it is often useful or convenient for a SIP entity to request that another SIP User Agent generate some type of tone, without generating this tone as part of a session. These tones are presented as MIME content in the body of a SIP message or as indirect content[7]. As such, these tones can be conveyed without any modifications to SIP.
This document makes use of the audio/telephone-event MIME type defined in RFC2833[5] and also defines a new MIME type (audio/tone-info+xml) that carries a textual, semantic description of tone information in XML, suitable for tone generators. When tones carried in the body of the message are used with the Alert-Info header, each tone is correlated to the appropriate body part using a cid: URI and the Content-ID MIME header from RFC2045[3], following the grammar used in the Referred-By Mechanism[16].
Although a variety of tones with well-defined semantics already exist, this mechanism may be especially useful for conveying special ringback tones or country-specific cadences without encouraging the proliferation of early media[17]. Of course, playing tones suggested by another party has a variety of security consequences. Also, the use of certain tones may leak potentially private information (such as the home country of the callee). Both privacy and security issues are discussed in the Security Considerations section.
Perhaps the simplest tone scenario uses the Alert-Info header in a 180 Ringing response to point to a specific sound file that renders ringback specific to the target country.
   SIP/2.0 180 Ringing
   ...
   Alert-Info: <https://server.example.net/ringback-france.wav>
Note that ringing tone or ringback tones specified in an Alert-Info header SHOULD be repeated for as long as local ringback would have been generated. In the future it may be desirable to define an explicit Alert-Info header parameter which indicates if the file should be repeated or not.
The next example requests special (call waiting) ringback using inline MIME content of type audio/telephone-event from RFC2833[5]. The body of the SIP message consists of 4 octets of binary data. A UAS MUST NOT include this type of content in a 180 response unless support for the audio/telephone-event MIME type and the appropriate event(s) is advertised in an Accept header in the corresponding INVITE request. (For example: Accept: application/sdp, audio/telephone-event;events="0-11,70,71" )
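The Accept check described above can be sketched in Python as follows. This is a minimal illustration, not a normative procedure: the function names are hypothetical, the header value is assumed to have already been extracted from the INVITE, and a media range without an events parameter is (by assumption) treated as not advertising the event.

```python
def split_outside_quotes(value, sep):
    """Split value on sep, ignoring separators inside double quotes."""
    parts, current, quoted = [], [], False
    for ch in value:
        if ch == '"':
            quoted = not quoted
        if ch == sep and not quoted:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_events(spec):
    """Expand an events list such as "0-11,70,71" into a set of integers."""
    events = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            low, high = item.split("-", 1)
            events.update(range(int(low), int(high) + 1))
        else:
            events.add(int(item))
    return events


def accepts_telephone_event(accept_value, event):
    """True if the Accept value lists audio/telephone-event with this event."""
    for media_range in split_outside_quotes(accept_value, ","):
        fields = [f.strip() for f in split_outside_quotes(media_range, ";")]
        if fields[0].lower() != "audio/telephone-event":
            continue
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "events":
                return event in parse_events(value.strip().strip('"'))
    # Assumption: no explicit events parameter means "not advertised".
    return False


ACCEPT = 'application/sdp, audio/telephone-event;events="0-11,70,71"'
print(accepts_telephone_event(ACCEPT, 71))
print(accepts_telephone_event(ACCEPT, 79))
```

A UAS following this sketch would attach the 4-octet body only when the first call returns True for the event it intends to send.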
Note that this MIME type was originally defined only for use in RTP[12], so its use here may be considered irregular. The community should carefully consider whether reuse of this MIME type is appropriate for the usage described here.
   early media ringback :

   SIP/2.0 180 Ringing
   Alert-Info: <cid:foo@bar>
   Content-Type: audio/telephone-event
   Content-ID: <foo@bar>
   Content-Length: 4

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     event     |E R|  volume   |           duration            |
   |      71       |0 0|     3     |             4000              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
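As an illustration of the 4-octet body shown above, the following Python sketch packs and unpacks the event, E and R bits, volume, and duration fields in network byte order. The function names are hypothetical, and the R bit is simply left at zero on packing.

```python
import struct


def pack_telephone_event(event, volume, duration, end=False):
    """Build the 4-octet payload: event(8) E(1) R(1) volume(6) duration(16)."""
    if not 0 <= event <= 0xFF:
        raise ValueError("event must fit in 8 bits")
    if not 0 <= volume <= 0x3F:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags_and_volume = (0x80 if end else 0x00) | volume  # R bit stays 0
    return struct.pack("!BBH", event, flags_and_volume, duration)


def unpack_telephone_event(payload):
    """Split a 4-octet payload back into its fields."""
    event, flags_and_volume, duration = struct.unpack("!BBH", payload)
    return {
        "event": event,
        "end": bool(flags_and_volume & 0x80),
        "reserved": bool(flags_and_volume & 0x40),
        "volume": flags_and_volume & 0x3F,
        "duration": duration,
    }


body = pack_telephone_event(event=71, volume=3, duration=4000)
print(body.hex())
print(unpack_telephone_event(body))
```

The resulting bytes object is what would be carried as the message body with Content-Length: 4.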
This next example conveys a US ringback tone using the XML tone description format defined in the Formal Syntax section of this document. In this example, the 180 also contains an SDP[9] offer or answer[13], so the tone description is part of a multipart MIME[4] body. Notice that the Content-ID MIME header is not a SIP header, so it is included inside the specific MIME part which carries the tone description.
   SIP/2.0 180 Ringing
   Alert-Info: <cid:foo@bar>
   Content-Type: multipart/mixed;boundary=godzilla
   Content-Length: xxx

   --godzilla
   Content-Type: audio/tone-info+xml
   Content-Disposition: render;handling=optional
   Content-ID: <foo@bar>
   Content-Length: yyy

   <?xml version="1.0" ?>
   <tones repeat="true">
     <tone>
       <modulation hz="0"/>
       <volume db="-3"/>
       <duration ms="2000"/>
       <frequency hz="440"/>
       <frequency hz="480"/>
     </tone>
     <tone>
       <modulation hz="0"/>
       <volume db="-63"/>
       <duration ms="4000"/>
       <frequency hz="0"/>
     </tone>
   </tones>

   --godzilla
   Content-Type: application/sdp
   Content-Disposition: session;handling=required
   Content-Length: zzz

   <SDP contents>
   --godzilla--
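The correlation between the cid: URI in Alert-Info and the Content-ID of a body part can be sketched with Python's standard email package. This is only an illustration under several assumptions: the function name is hypothetical, the SIP Content-Type header value and body are assumed to be available as strings, and percent-encoding in the cid: URI is not handled.

```python
from email import message_from_string


def find_part_by_cid(content_type, body, alert_info):
    """Return the MIME part whose Content-ID matches the cid: URI, or None."""
    uri = alert_info.strip().lstrip("<").split(">", 1)[0]
    if not uri.lower().startswith("cid:"):
        return None  # Alert-Info points somewhere other than the body
    wanted = uri[4:]
    # Rebuild a MIME entity from the SIP Content-Type header and the body.
    entity = message_from_string("Content-Type: " + content_type + "\n\n" + body)
    for part in entity.walk():
        content_id = part.get("Content-ID", "").strip()
        if content_id.strip("<>") == wanted:
            return part
    return None


EXAMPLE_BODY = """--godzilla
Content-Type: audio/tone-info+xml
Content-Disposition: render;handling=optional
Content-ID: <foo@bar>

<?xml version="1.0" ?>
<tones repeat="true"/>
--godzilla
Content-Type: application/sdp
Content-Disposition: session;handling=required

v=0
--godzilla--
"""

part = find_part_by_cid("multipart/mixed;boundary=godzilla",
                        EXAMPLE_BODY, "<cid:foo@bar>")
print(part.get_content_type() if part is not None else "no matching part")
```

The shortened tone description and SDP in EXAMPLE_BODY are placeholders so the sketch stays self-contained.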
Other mechanisms for suggesting country-specific ringback were proposed as far back as IETF 47 (Adelaide). Specifically, in a long-expired draft, Adam Roach proposed an extension which carried the ISO country code of the appropriate ringback tone.
In the following flow, Alice receives an invitation from a Third-Party Call Control[8] (3pcc) controller. Alice may have been invited due to a click-to-dial service or some type of scheduled callback or reminder service. Alice answers immediately, but when the Controller invites Bob, he does not answer immediately. It is desirable for Alice to receive an indication that her call to Bob is ringing. The Controller could negotiate a session with Alice just to provide a stream of RTP media for ringback, but this requires that the Controller be able to provide such media and deal with all the security and middlebox traversal issues associated with RTP, all for a simple tone which is transient in nature. Worse yet, in this flow Bob is willing to negotiate session details before answering, which ordinarily would minimize or eliminate clipping of Bob's initial "Hello". However, if the Controller negotiates a session with Alice to send early media, the Controller must set up a session with Bob and discard or relay Bob's media, which is likely to introduce clipping.
Instead, the Controller sends an UPDATE[11] request to Alice which contains the tone description file from the previous example. (It could have been a more traditional audio file instead, such as a wave file.) The UPDATE also contains a Reason header[6] which provides additional information about the cause of the UPDATE request. Alice's UA plays the tone until media is received from Bob. After the successful offer/answer exchange with Alice, the controller sends a PRACK[10] request to acknowledge Bob's reliable 180 response.
   Alice               Controller                 Bob
     |(1) INVITE offer1      |                       |
     |no media               |                       |
     |<----------------------|                       |
     |(2) 200 answer1        |                       |
     |no media               |                       |
     |---------------------->|                       |
     |(3) ACK                |                       |
     |<----------------------|                       |
     |                       |(4) INVITE no SDP      |
     |                       |---------------------->|
     |                       |(5) 180 Ringing offer2 |
     |                       |<----------------------|
     |(6) UPDATE offer2'     |                       |
     | and ringback tone     |                       |
     |<----------------------|                       |
     |(7) 200 answer2'       |                       |
     |---------------------->|(8) PRACK answer2      |
     |                       |---------------------->|
     |                       |(9) 200 OK (PRACK)     |
     |                       |<----------------------|
     |                       |                       |

   UPDATE sip:alice@a.example.com SIP/2.0
   ...
   Reason: SIP;cause=180
   Content-Type: multipart/mixed;boundary=gorilla
   Content-Length: xxx

   --gorilla
   Content-Type: audio/tone-info+xml
   Content-Disposition: render;handling=optional
   Content-Length: yyy

   (tone description file from the previous example)
   --gorilla
   Content-Type: application/sdp
   Content-Disposition: session;handling=required
   Content-Length: zzz

   (SDP contents of offer 2')
   --gorilla--
The following is a non-exhaustive list of other tones (from RFC2833[5]) which may be provided during certain services. These are events 85, 80, 84, and 79 respectively of audio/telephone-event.
Calling card service tone: The calling card service tone consists of 60 ms of the sum of 941 Hz and 1477 Hz tones (DTMF '#'), followed by 940 ms of 350 Hz and 440 Hz (U.S. dial tone), decaying exponentially with a time constant of 200 ms.
Pay tone: The caller, at a payphone, is reminded to deposit additional coins.
Intrusion tone: The call is being monitored, e.g., by an operator.
Call waiting tone: Another party wants to reach the subscriber.
Note that there does not seem to be any good reason to indicate the Call Waiting tone remotely rather than directly at the User Agent and therefore sending this tone over SIP to implement Call Waiting is inadvisable.
In the example that follows, a pre-paid calling card application sends a calling card service tone to a SIP UA, along with instructions to collect digit stimulus using the App-Info header[14] and the Keypad Markup Language[15] (KPML). It provides the tone description as indirect content[7]. These tones could be sent as an audio/telephone-event body instead.
   UPDATE sip:a.example.com SIP/2.0
   ...
   App-Info: <http://app.example.net/collect-digits.kpml>
   Content-Length: xxx
   Content-Type: message/external-body;
       access-type="URL";
       expiration="Tue, 24 July 2003 09:00:00 GMT";
       URL="http://app.example.net/calingcard.xml"

   Content-Type: audio/tone-info+xml
   Content-Disposition: render;handling=required
After authorizing the request, the UA fetches the following tone description file from the http: URL in the UPDATE, and renders this "be-dong" tone to the user.
   <?xml version="1.0" ?>
   <tones repeat="false">
     <tone>
       <modulation hz="0"/>
       <volume db="-3"/>
       <duration ms="60"/>
       <frequency hz="941"/>
       <frequency hz="1477"/>
     </tone>
     <tone>
       <modulation hz="0"/>
       <volume db="-3"/>
       <duration ms="940"/>
       <frequency hz="350"/>
       <frequency hz="440"/>
       <decay type="exponential" scale="200"/>
     </tone>
   </tones>
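One way a UA might render such a description is sketched below in Python. It is an illustration under stated assumptions rather than a defined rendering algorithm: the 8000 Hz sample rate, the averaging of simultaneous frequencies, and the reading of the decay scale as an exponential time constant are all assumptions, the function names are hypothetical, and modulation and linear decay are ignored.

```python
import math
import xml.etree.ElementTree as ET

SAMPLE_RATE = 8000  # assumed sample rate in Hz

CALLING_CARD_TONE = """<?xml version="1.0" ?>
<tones repeat="false">
  <tone>
    <volume db="-3"/><duration ms="60"/>
    <frequency hz="941"/><frequency hz="1477"/>
  </tone>
  <tone>
    <volume db="-3"/><duration ms="940"/>
    <frequency hz="350"/><frequency hz="440"/>
    <decay type="exponential" scale="200"/>
  </tone>
</tones>"""


def render_tone(tone, sample_rate=SAMPLE_RATE):
    """Turn one <tone> element into a list of float samples in [-1, 1]."""
    duration_ms = int(tone.find("duration").get("ms"))
    gain = 10 ** (int(tone.find("volume").get("db")) / 20)  # dB to amplitude
    freqs = [int(f.get("hz")) for f in tone.findall("frequency")]
    tau = None  # exponential time constant in seconds, if any
    decay = tone.find("decay")
    if decay is not None and decay.get("type") == "exponential" and decay.get("scale"):
        tau = int(decay.get("scale")) / 1000.0
    samples = []
    for n in range(duration_ms * sample_rate // 1000):
        t = n / sample_rate
        # Average the sinusoids so the sum cannot exceed full scale.
        mix = sum(math.sin(2 * math.pi * f * t) for f in freqs) / max(len(freqs), 1)
        envelope = math.exp(-t / tau) if tau else 1.0
        samples.append(gain * envelope * mix)
    return samples


def render_sequence(xml_text, sample_rate=SAMPLE_RATE):
    """Render every <tone> in document order and concatenate the samples."""
    samples = []
    for tone in ET.fromstring(xml_text).findall("tone"):
        samples.extend(render_tone(tone, sample_rate))
    return samples


samples = render_sequence(CALLING_CARD_TONE)
print(len(samples), "samples,", len(samples) / SAMPLE_RATE, "seconds")
```

A real UA would also honor the repeat attribute and convert the float samples to its audio device's format; both are left out to keep the sketch short.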
The following is an XML Schema description of the audio/tone-info+xml syntax.
repeat indicates if the tone sequence repeats continuously or plays only once.
modulation and frequency are measured in Hz. Up to 5 simultaneous frequencies are allowed.
duration and decay scale are measured in milliseconds.
volume is measured in decibels from line level. The range -3 to -63 is recommended for consistency with RFC2833.
   <?xml version="1.0" encoding="UTF-8"?>
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
     <xs:element name="tones">
       <xs:complexType>
         <xs:sequence>
           <xs:element maxOccurs="unbounded" minOccurs="1"
                       name="tone" type="toneType"/>
         </xs:sequence>
         <xs:attribute name="repeat" type="xs:boolean"
                       use="required"/>
       </xs:complexType>
     </xs:element>
     <xs:complexType name="toneType">
       <xs:sequence>
         <xs:element maxOccurs="1" minOccurs="0"
                     name="modulation" type="modulationType"/>
         <xs:element maxOccurs="1" minOccurs="1"
                     name="volume" type="volumeType"/>
         <xs:element maxOccurs="1" minOccurs="1"
                     name="duration" type="durationType"/>
         <xs:element maxOccurs="5" minOccurs="1"
                     name="frequency" type="frequencyType"/>
         <xs:element maxOccurs="1" minOccurs="0"
                     name="decay" type="decayType"/>
       </xs:sequence>
     </xs:complexType>
     <xs:complexType name="modulationType">
       <xs:attribute name="hz" type="xs:decimal" use="required"/>
     </xs:complexType>
     <xs:complexType name="volumeType">
       <xs:attribute name="db" type="xs:integer" use="required"/>
     </xs:complexType>
     <xs:complexType name="durationType">
       <xs:attribute name="ms" type="xs:integer" use="required"/>
     </xs:complexType>
     <xs:complexType name="frequencyType">
       <xs:attribute name="hz" type="xs:integer" use="required"/>
     </xs:complexType>
     <xs:complexType name="decayType">
       <xs:attribute name="type" use="required">
         <xs:simpleType>
           <xs:restriction base="xs:string">
             <xs:enumeration value="none"/>
             <xs:enumeration value="linear"/>
             <xs:enumeration value="exponential"/>
           </xs:restriction>
         </xs:simpleType>
       </xs:attribute>
       <xs:attribute name="scale" type="xs:integer" use="optional"/>
     </xs:complexType>
   </xs:schema>
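The structural constraints in the schema above (element order and occurrence counts) can be checked without an XML Schema processor. The Python sketch below is a partial, illustrative check only: the function name is hypothetical, and attribute types and values other than repeat are not verified.

```python
import xml.etree.ElementTree as ET

# (element name, minOccurs, maxOccurs), in the order of the schema's sequence.
CHILD_RULES = [
    ("modulation", 0, 1),
    ("volume", 1, 1),
    ("duration", 1, 1),
    ("frequency", 1, 5),
    ("decay", 0, 1),
]


def check_tone_info(xml_text):
    """Return a list of problems; an empty list means every check passed."""
    root = ET.fromstring(xml_text)
    if root.tag != "tones":
        return ["root element must be <tones>"]
    problems = []
    if root.get("repeat") not in ("true", "false", "1", "0"):
        problems.append("<tones> needs a boolean repeat attribute")
    if len(root) == 0:
        problems.append("<tones> needs at least one <tone>")
    order = [name for name, _, _ in CHILD_RULES]
    for number, tone in enumerate(root, start=1):
        if tone.tag != "tone":
            problems.append("child %d: unexpected <%s>" % (number, tone.tag))
            continue
        names = [child.tag for child in tone]
        for name in names:
            if name not in order:
                problems.append("tone %d: unexpected <%s>" % (number, name))
        # Children must appear in the same order as the schema's sequence.
        ranks = [order.index(name) for name in names if name in order]
        if ranks != sorted(ranks):
            problems.append("tone %d: children out of schema order" % number)
        for name, low, high in CHILD_RULES:
            count = names.count(name)
            if not low <= count <= high:
                problems.append("tone %d: %d <%s> elements, expected %d to %d"
                                % (number, count, name, low, high))
    return problems


doc = ('<tones repeat="true"><tone><volume db="-3"/><duration ms="2000"/>'
       '<frequency hz="440"/><frequency hz="480"/></tone></tones>')
print(check_tone_info(doc))
```

A receiver could run a check like this before rendering and fall back to local ringback when any problem is reported; a full validation would use an actual XML Schema validator instead.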
TODO
TODO - MIME registration for audio/tone-info+xml
[9]  Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.
[10] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional Responses in Session Initiation Protocol (SIP)", RFC 3262, June 2002.
[11] Rosenberg, J., "The Session Initiation Protocol (SIP) UPDATE Method", RFC 3311, October 2002.
[12] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.
[13] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[14] Jennings, C., "SIP Support for Application Initiation", draft-jennings-sip-app-info-00 (work in progress), October 2002.
[15] Burger, E., "Keypad Markup Language (KPML)", draft-burger-sipping-kpml-01 (work in progress), March 2003.
[16] Sparks, R., "The SIP Referred-By Mechanism", draft-ietf-sip-referredby-01 (work in progress), February 2003.
[17] Schulzrinne, H. and G. Camarillo, "Early Media and Ringback Tone Generation in the Session Initiation Protocol", draft-camarillo-sipping-early-media-01 (work in progress), February 2003.
Rohan Mahy
Cisco Systems, Inc.
101 Cooper Street
Santa Cruz, CA 95060
USA
EMail: rohan@cisco.com
The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Funding for the RFC Editor function is currently provided by the Internet Society.