The projects below are available in the context of CS E6998-03 (Advanced Internet Services) or as a 3998/4901/6901 project. Please contact Prof. Henning Schulzrinne for details.
Names under each project indicate whether the project has been assigned. Some projects may be assigned to more than one group. Projects listing "NN" as a project team member are looking for additional students.
Using our telephone gateway, build a web-by-phone server that recognizes touch-tone (DTMF) digits and then reads a web page aloud using the Bell Labs TTS text-to-speech system. The gateway requires use of on-campus hardware resources, but you can implement the project without the hardware on any system with audio I/O. You may find a DTMF generator handy; it is available for a few dollars from your local Radio Shack store. Lynx may make a good browser to pre-process web content for reading aloud, particularly in its version for blind users.
The TTS system is available at ~hgs/multimedia/tts, with some random notes.
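As a starting point for recognizing touch-tone digits on any system with audio input, the sketch below uses the Goertzel algorithm to measure the energy at each of the eight standard DTMF frequencies and picks the strongest row and column tone. The function names, the 8000 Hz sample rate and the 50 ms tone length are assumptions for illustration; a real detector would also need silence and twist checks, which are omitted here.

```python
import math

# Standard DTMF row and column frequencies in Hz.
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, sample_rate, freq):
    """Return the signal power near `freq` using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_digit(samples, sample_rate=8000):
    """Pick the strongest row and column tone and map the pair to a key."""
    row = max(range(4),
              key=lambda i: goertzel_power(samples, sample_rate, ROW_FREQS[i]))
    col = max(range(4),
              key=lambda i: goertzel_power(samples, sample_rate, COL_FREQS[i]))
    return KEYPAD[row][col]


def make_tone(digit, sample_rate=8000, duration=0.05):
    """Synthesize a DTMF tone so the detector can be tested without a phone."""
    for r, keys in enumerate(KEYPAD):
        if digit in keys:
            f1, f2 = ROW_FREQS[r], COL_FREQS[keys.index(digit)]
            break
    else:
        raise ValueError(f"not a DTMF key: {digit!r}")
    n = int(sample_rate * duration)
    return [math.sin(2 * math.pi * f1 * t / sample_rate) +
            math.sin(2 * math.pi * f2 * t / sample_rate) for t in range(n)]
```

Calling detect_digit(make_tone("5")) exercises the detector on a synthetic tone, which can stand in for the hardware DTMF generator during early development.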
Xin Jin
Similar to the previous project, allow reading of email messages through the phone. The caller should be able to navigate through messages and delete them. You can either retrieve messages into a file (using movemail, for example) or build/use a POP or IMAP client. You should be able to read subject lines and mail sources (From header). Usually, only the display name, not the user@host part, should be read.
The user should be able to step through the headers of the messages rather than reading the message.
When the message is being read, you should be able to stop, back up and cancel the reading. Included quoted text (>) should be spoken with a different voice. Possible extensions include the ability to only read messages marked as 'urgent' in the subject line or those personally addressed to the recipient (rather than to a mailing list). Additional features include the ability to respond to a message, using a MIME voice attachment recorded from the phone. Also, it should be possible to play back messages with an audio attachment.
For testing, the system should have an input mode that accepts, via telnet or command line, the same DTMF commands as issued by the phone interface.
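The header handling described above could be sketched as follows, using Python's standard email package. The function names are hypothetical, and treating lines that start with ">" as quoted text is an assumption based on the usual mail quoting convention.

```python
from email import message_from_string
from email.utils import parseaddr


def spoken_summary(raw_message):
    """Return (sender, subject) as they might be read aloud."""
    msg = message_from_string(raw_message)
    display_name, address = parseaddr(msg.get("From", ""))
    # Prefer the display name; fall back to user@host only if none is given.
    sender = display_name or address
    return sender, msg.get("Subject", "(no subject)")


def split_quoted(body):
    """Tag each body line as 'quoted' or 'plain' so that quoted text
    can be handed to a different voice."""
    return [("quoted" if line.lstrip().startswith(">") else "plain", line)
            for line in body.splitlines()]
```

Stepping through headers then amounts to calling spoken_summary on each message in turn; the body is only passed to split_quoted and the speech system when the caller asks for it.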
CoolMail and CollegeClub are examples of commercial services. Siemens has a product called Xpressions.
Serge Shamis, Jack Hsu and Jeff Stutz
We have a robotic 500-disc CD-ROM "jukebox" changer in our lab. Using existing interface utilities, write a program that catalogues the jukebox CD-ROMs into a line-per-CD text file and/or web page. With that, write a program that locates and loads CDs by title. Allow the user to associate a particular waveform signature with an audio CD, so that CDs can be found after having been added and removed. (Suitable only for on-campus students.)
Using the jukebox described in the previous project and available raw-disk reading routines, write a program that reads the data from the disk and serves it using an existing RTSP server.
Using the H.261 codec in vic and the cinepak codec in xanim, convert standard AVI files to H.261, to allow streaming across the network using the RTSP server.
Write a module that splits an MPEG file into individual frames and wraps each into the necessary RTP payload header (see RFC xxx). Integrate into the RTSP server for generating streaming MPEG.
Develop a scalable, multicast-based floor controller suitable for a distributed classroom. The floor control tool should send PMM messages to compatible multimedia tools to enable sending and receiving. By pressing a function key on the keyboard, students "raise their hand". The instructor can recognize a student by clicking on the student's name (or using a touch screen) or picking students in order by pressing a key on another computer. Students should be able to "lower their hand" and attach a note indicating the nature of the question. Consider the loss of packets in the network. Students may join class late and need to learn about any pending floor requests.
The LPC-10 low-rate audio codec by Andy Fingerhut (Washington University) is to be integrated into NeVoT, the network voice terminal. Attempts at optimization should be made (if you have some signal processing background). The processing requirements and audio quality (as a function of loss and packet size) are to be evaluated and compared to the current LPC codec.
Integrate the MPEG layer 3 high-quality audio codec into NeVoT.
Add a function to NeVoT, NeViT and/or the vic video tool that records incoming and outgoing packets into an rtpdump file. This is useful when recording events for later re-broadcast.
Using the RTP library, build an RTP translator to allow configurable translation between audio formats, using the audio encoding libraries of NeVoT. For example, an incoming high-bitrate stream using PCM encoding could be translated to GSM.
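To make the translation step concrete, here is a minimal sketch that works directly on the 12-byte RTP fixed header rather than through the RTP library mentioned above. The function names and the transcode callback are hypothetical; the callback stands in for whatever codec routine the NeVoT audio libraries provide. Payload types 0 (PCMU) and 3 (GSM) are the static assignments from the RTP audio/video profile. Since both encodings use an 8000 Hz clock, the timestamp is carried through unchanged.

```python
import struct

# Fixed header: V/P/X/CC byte, M/PT byte, sequence number, timestamp, SSRC.
RTP_HEADER = struct.Struct("!BBHII")
PT_PCMU, PT_GSM = 0, 3  # static payload types from the RTP A/V profile


def parse_rtp(packet):
    """Split a packet into header fields and payload.

    Padding, header extensions and CSRC lists are rejected to keep
    the sketch short.
    """
    b0, b1, seq, ts, ssrc = RTP_HEADER.unpack_from(packet)
    if b0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    if b0 & 0x3F:
        raise ValueError("padding, extension or CSRC entries not handled")
    fields = {"marker": b1 >> 7, "pt": b1 & 0x7F,
              "seq": seq, "ts": ts, "ssrc": ssrc}
    return fields, packet[RTP_HEADER.size:]


def build_rtp(fields, payload):
    """Serialize header fields and payload back into a packet."""
    b1 = (fields["marker"] << 7) | fields["pt"]
    header = RTP_HEADER.pack(0x80, b1, fields["seq"], fields["ts"],
                             fields["ssrc"])
    return header + payload


def translate(packet, transcode, new_pt):
    """Re-encode the payload and rewrite the payload type.

    Sequence number, timestamp and SSRC are kept as received.
    """
    fields, payload = parse_rtp(packet)
    fields["pt"] = new_pt
    return build_rtp(fields, transcode(payload))
```

A configurable translator would select the transcode function and new_pt from a table keyed by the incoming payload type.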
Build a Java or Tcl/Tk user interface that implements "VCR" functionality for on-demand access to stored audio and video using RTSP. The user interface contains the standard VCR control buttons (record, play, pause, rewind, fast forward), as well as a slider or similar element for positioning (by elapsed time) within the movie. The audio and video stream are delivered by the RTSP server being developed in my lab; streams are played back using standard audio and video tools such as NeVoT, vat, vic, rat or IP/TV.
As part of the project, you should develop a standalone library that parses RTSP responses. (Please discuss API with instructor.)
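One possible shape for such a parsing library is sketched below. The class and function names are hypothetical and are not the API to be agreed with the instructor; the sketch assumes the whole response is already available as one string with CRLF line endings, and it does not handle repeated or folded headers.

```python
from dataclasses import dataclass, field


@dataclass
class RtspResponse:
    version: str
    status: int
    reason: str
    headers: dict = field(default_factory=dict)
    body: str = ""


def parse_rtsp_response(text):
    """Parse a status line, headers and optional body into an RtspResponse."""
    head, _, body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    parts = lines[0].split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("RTSP/"):
        raise ValueError(f"bad status line: {lines[0]!r}")
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"bad header line: {line!r}")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return RtspResponse(parts[0], int(parts[1]), reason, headers, body)
```

The user interface can then match responses to requests through the "cseq" header and enable or disable the VCR buttons based on the status code.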
Shiva Bhakta
Alex Basile
Evaluate and compare the performance of audio and video codecs both through objective and subjective measurements. (Possibly several, coordinated projects.) Include the performance for random and correlated packet losses, including loss traces collected from the Internet.
Prepare and present a wide variety of standard audio and video sequences, showing different qualities and the effects of impairments, to be used for comparison (A/B tests) and teaching.
Create a tool that combines traceroute and ping to indicate where in the path packets are lost and delayed. Present results graphically and try your tool on some domestic and international routes.
Based on existing H.261 implementations, implement an H.263 video codec for vic and/or NeViT (MBONE video conferencing).
Measure clock accuracy of audio/video clocks on workstations and PCs.
Develop a module for the Columbia RTSP media-on-demand server that reads and writes Microsoft ASF audio/video files.
John Pelak
Design and develop one or more modules for creating digital effects for live video (such as title overlay, wipes, fades, insert) for use in the new CS/CVN digital video facility.
Design and develop an application that supports "raising your hand" in a distributed, Internet-based classroom. The mechanism should scale to very large classes, with hundreds of participants. Thus, instead of a central server, multicast UDP should be used. Students should be able to indicate the nature of the question so that an instructor can group or delay questions. An instructor should be able to call on students out-of-order and cancel questions. Calling on a student enables that student's audio and video transmission. Other students should, in general, be able to see the list of waiting students. Consider using the PMM (pattern-matching multicast) enabled audio and video applications NeVoT and vic. Your application has to deal with packet loss, but can assume that the computer clocks of all participants are synchronized, so that there is no ambiguity as to who raised her hand first. This application could also be useful for meetings, "Internet TV" shows with audience participation and town-hall-style gatherings.
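The state each participant keeps could look like the sketch below, which leaves out the multicast transport entirely. Because clocks are assumed to be synchronized, requests are ordered by their timestamps; because packets may be lost, senders are assumed to repeat their requests, so a repeated request must not change the order. Class and method names are hypothetical, and a stale repeat arriving after a hand was lowered is not handled here.

```python
class FloorQueue:
    """Pending floor requests as seen by one participant."""

    def __init__(self):
        self._requests = {}  # student -> (timestamp, note)

    def raise_hand(self, student, timestamp, note=""):
        # Keep the earliest timestamp so that retransmitted requests
        # do not move a student within the queue.
        current = self._requests.get(student)
        if current is None or timestamp < current[0]:
            self._requests[student] = (timestamp, note)

    def lower_hand(self, student):
        # Also used when the instructor cancels a question.
        self._requests.pop(student, None)

    def waiting(self):
        """Students in the order they raised their hands (name breaks ties)."""
        ordered = sorted(self._requests.items(),
                         key=lambda item: (item[1][0], item[0]))
        return [student for student, _ in ordered]

    def call_on(self, student=None):
        """Grant the floor to `student`, or to the first in line if None."""
        if student is None:
            order = self.waiting()
            if not order:
                return None
            student = order[0]
        self.lower_hand(student)
        return student
```

A late joiner can rebuild the same queue from the periodically repeated requests, since applying raise_hand more than once has no further effect.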
John J. Lin
Construct a software driver to connect the analog phone line interface to an IP telephony server, allowing dial-in and dial-out. A basic version connects every incoming phone call to a fixed Internet address and allows outgoing calls to any phone number.
For group projects, this gateway can be extended so that a phone caller can "dial up" an Internet address. The phone answers "Welcome to the Internet telephony gateway service. Please enter the email address or extension, using * for the 'at' sign." The user then "types" in the beginning of a name or email address using the phone's numeric keys (447 for hgs, for example), with the text-to-speech facility offering the list of ambiguous choices. You should be able to resolve both ambiguous user names (e.g., from the /etc/passwd file) and ambiguous host names (e.g., by acquiring a zone dump of the .com entries).
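The name lookup can be sketched as a mapping from names to keypad digit strings followed by a prefix match. The function names are hypothetical, and skipping characters such as "." or "-" is an assumption; the list of names would come from a source such as /etc/passwd.

```python
KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYS.items()
                   for ch in letters}


def to_digits(name):
    """Map a name to the keys a caller would press for it.

    Digits pass through and '@' becomes '*', as in the gateway prompt.
    Other characters are skipped (an assumption of this sketch).
    """
    out = []
    for ch in name.lower():
        if ch in LETTER_TO_DIGIT:
            out.append(LETTER_TO_DIGIT[ch])
        elif ch in "0123456789":
            out.append(ch)
        elif ch == "@":
            out.append("*")
    return "".join(out)


def candidates(typed, names):
    """Return all names whose digit string starts with the typed keys."""
    return sorted(n for n in names if to_digits(n).startswith(typed))
```

If candidates returns more than one name, the gateway reads the list aloud and lets the caller pick one by number.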
Consider also the restriction of outgoing phone calls.
Your system must handle the case when no outgoing phone line is available.
Ko Uchiyama and Herbert Ignacio
Preferably using the video codecs already in vic or NeViT, construct a system that transmits only a (moving) person, without the background, and allows a virtual audience to be reassembled at the receiver.
(Some image processing background is helpful for this project.)
The Canon VC-C1 camera has a motorized zoom, pan and tilt which can be controlled remotely through a serial port. Serial port interfaces can be written in C or Tcl. As part of the project, write a generic control library in C or Tcl for controlling the camera. Once a basic control interface API has been written, there are several choices for functionality:
The camera is located in my office and can be accessed remotely through serial port A of erlang.cs.columbia.edu.
Compare the existing directory services for Internet telephony (Netscape/Insoft IS411, Vocaltec, Four11, Microsoft ULS, etc.). Some are documented, some need to be reverse-engineered. Consider distributed alternatives that allow sub-grouping and graphical representations (rooms, maps, VRML?). Install an implementation of LDAP and construct an interface to SIP.
Allandel Manipon
Moshe Sambol
William Nagy
Write an interface to a calendar program (Schedule+, cm, CDE calendar, vcalendar compliant...) that automatically forwards or answers calls ("I'm in a meeting until 4:30", "I'm on vacation until August 26"). Allow user to define keywords that govern behavior, e.g., "private" may disable forwarding calls. Use existing privacy indications ("Show Time and Text", "Show Time Only", "Show Nothing") to govern detail to be provided, but allow configuration based on user groups. For example, the user might configure the application so that all callers from cs.columbia.edu are shown time only, while a select group of people identified by their email addresses are shown full details. Everybody else is shown nothing (available/not available). You could use the Netscape addressbook to look up forwarding details. (For example, "meeting at Hilton, San Francisco" would look up phone number for forwarding.)
Your program should be able to deal with overlapping appointments. For example, if the user is traveling to LA and has a meeting there, indicating the end of the meeting may not be particularly interesting, but rather the time to call back would be the time of return from the trip.
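One way to handle overlapping appointments is to follow the chain of appointments that cover the current time and report the latest end time reached. This is a minimal sketch with a hypothetical function name; it treats appointments as (start, end) pairs and counts back-to-back appointments as one continuous busy period, which is an assumption.

```python
from datetime import datetime


def busy_until(appointments, now):
    """Return when the user becomes free, or None if free at `now`.

    `appointments` is a list of (start, end) datetime pairs that may overlap.
    """
    current = now
    end = None
    changed = True
    while changed:
        changed = False
        for start, finish in appointments:
            # An appointment covering the current point extends the busy period.
            if start <= current < finish:
                current = end = finish
                changed = True
    return end
```

For a trip lasting from 9:00 to 18:00 with a meeting from 14:00 to 15:00, a call at 14:30 yields 18:00, the return from the trip, rather than the end of the meeting.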
Consider parsing the calendar entry to understand header fields, e.g., a Location: line would indicate the appropriate forwarding location.
For the Sun calendar manager (cm), the calendar files are stored in /var/spool/calendar.
The application should be called as
The application should return messages in the format of SIP cgi responses, for example:
Status: 480 Meeting with John Doe
Location: j.doe@cs.bar.edu
Retry-After: Mon, 9 Feb 1998 17:37:17 +0100
Content-Length: 0

If there is currently no meeting scheduled, return 200 OK.
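Generating such a response could be sketched as below. The function name and the dictionary keys are hypothetical, and ending the header block with a blank line is an assumption carried over from ordinary CGI practice. Note that Python's format_datetime writes the day of the month with two digits.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cgi_response(meeting=None):
    """Build a SIP cgi response for the current calendar state.

    `meeting` is None when nothing is scheduled, otherwise a dict with
    'summary', 'end' (timezone-aware datetime) and optionally 'location'.
    """
    if meeting is None:
        return "Status: 200 OK\nContent-Length: 0\n\n"
    lines = [f"Status: 480 {meeting['summary']}"]
    if meeting.get("location"):
        lines.append(f"Location: {meeting['location']}")
    lines.append(f"Retry-After: {format_datetime(meeting['end'])}")
    lines.append("Content-Length: 0")
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    cet = timezone(timedelta(hours=1))
    print(cgi_response({
        "summary": "Meeting with John Doe",
        "location": "j.doe@cs.bar.edu",
        "end": datetime(1998, 2, 9, 17, 37, 17, tzinfo=cet),
    }), end="")
```

The privacy settings described earlier would decide how much of the summary and location is filled in before this function is called.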
Enhance the Netscape addressbook to allow dialing of Internet and POTS phone calls and to log incoming phone calls by person ("contact list") in the "notes" section of the address book entry. Use the address lists to manage group conferences. Your application should also offer a non-GUI interface to register a phone call and use the database. For example, it should be possible to invoke a program as ab_lookup Smith and get back, one per line, the email addresses of possible callees. Similarly, ab_log J.Smith@foo.com subject should add a log of this call to the entry for J. Smith.
Using Java or Tcl/Tk, implement a call controller, as described in Personal Mobility for Multimedia Services in the Internet. The program should support both calling and called party using SIP for signaling. Your project should be modular, consisting of a parser and generator for SIP requests and responses, a module for parsing and generating SDP descriptions, and a user interface for initiating and receiving calls. The call controller starts the necessary applications (media agents) to receive and send audio and video. The controller also terminates these applications (by sending them an appropriate signal). You can use the media agents described on the RTP page, e.g., NeVoT, vic, rat and vat. Initially, it is sufficient to simply start these agents, with parameters supplied from the command line. For example, to set up a video connection to port 3456 at foo.bar.com, the vic video tool is started as vic foo.bar.com/3456.
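The step from an SDP description to a media agent command line might look like the sketch below. It is written in Python purely for illustration (the project itself calls for Java or Tcl/Tk), the function names are hypothetical, and only the c= and m= lines of SDP are examined. Only the vic invocation is taken from the description above; other agents would need their own entries in the table.

```python
def media_targets(sdp):
    """Return (media, host, port) tuples from a minimal SDP description.

    A c= line before the first m= line applies to the whole session;
    a c= line after an m= line overrides the host for that media only.
    """
    session_host = None
    targets = []
    for line in sdp.splitlines():
        kind, _, value = line.strip().partition("=")
        if kind == "c":
            # Connection data has the form "IN IP4 host[/ttl]".
            host = value.split()[2].split("/")[0]
            if targets:
                media, _, port = targets[-1]
                targets[-1] = (media, host, port)
            else:
                session_host = host
        elif kind == "m":
            media, port = value.split()[:2]
            targets.append((media, session_host, int(port.split("/")[0])))
    return targets


# Which tool handles which media type; only the vic entry comes from the text.
AGENTS = {"video": "vic"}


def agent_command(media, host, port):
    """Build the argument list for starting the media agent."""
    return [AGENTS[media], f"{host}/{port}"]
```

The resulting list can be passed to subprocess.Popen, whose terminate() method later sends the signal that stops the agent.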
Christopher Tse, Janet Park
Implement (a subset of) H.323, the Internet telephony standard. H.323 is supported by Intel Internet Video Phone and Microsoft NetMeeting, among others, but only on Windows 95.
The project only has to worry about the signaling aspects, since there are existing RTP tools that can exchange data with these H.323 implementations; they just cannot set up a call.
The project might start by implementing routines that parse and generate the Q.931 messages described in H.225.0 and then proceed to parse and generate the ASN.1 (PER) encoding.
Shiva Bhakta (to be completed end of summer 1998)
Design and implement a network reliability monitor. Test reachability of a number of sites periodically (e.g., via ping every ten minutes). If a site is not reachable, try to determine, using traceroute, whether this is a local problem or a problem affecting only one site or a problem affecting an Internet backbone.