COMS 4419: Internet Technology, Economics and Policy (Spring 2025)
Projects and Term Papers
The categories below are approximations - all projects and term
papers may include quantitative analysis, interviews with experts,
literature surveys and software development. In all cases, the relevant
literature should be carefully considered and cited.
Many of the projects listed as part of the Datasets
and Potential Research Questions 2013 may be suitable as a
project, although some of the datasets (e.g., Form 477) and issues are
now outdated.
Project Plan
Each team must submit a project plan
outlining their goals for their project or term paper.
Progress Report
Each project team member should submit one updated progress report as a PDF file once every
two weeks, on Thursday evenings starting two weeks after the project
proposal, clearly indicating the work accomplished in those two weeks as
well as any obstacles. There should be some verifiable signs of
progress - "80% complete" is not helpful; "wrote functions to do X (approx.
200 LoC)" is better.
Project Presentation
The presentation should be targeted to last no more than 12 minutes,
leaving 3 minutes for questions, for both one-person and group projects.
Since talks are back-to-back, we will have to cut short talks that exceed
their time allotment. For most speakers and slide styles, this
translates to (at most) 7-8 slides, including the title slide. For
group projects, you can either split the presentation or designate a
single speaker. The former is preferred, to give everyone a chance to
practice. You should consider your talk like a "pitch talk", i.e., get
the listener interested in your project. What problem did you tackle or
what area did you investigate? Why is this interesting or important?
What were your most surprising results? What approaches did not work
well? Briefly, what would be the next steps? Please be sure to practice
your talk so that you are sure of timing, content and hand-offs.
See talk hints; Writing
technical articles also links to materials related to talks.
Project Final Report (= Research Paper)
Project reports are typically 3,000 to 5,000 words per project team
member, i.e., 6,000 to 10,000 words for a two-person project.
Papers should be single-spaced, 11 or 12 pt font, and should conform
to the recommendations on writing
style and avoid
common mistakes. You can include any extensive graphs or tables as
appendices if needed. Use of the IEEE
templates is strongly suggested. The structure in this guidance
page should be followed, although it is somewhat less applicable for
analysis and review projects, where a standard "term paper" format is
called for. Please ignore the guidance about page limits and individual
project reports - the guidance is from a different university.
Experiments and Implementation
Some, but not all, of the projects below require computer networking background,
e.g., from a class like CSEE 4119.
- Privacy for the Internet of Things or smart TVs:
- What kind of data do Internet of Things devices, smart TVs or video
devices (Roku, Apple TV, Google TV) exchange with the outside world?
Who do they "talk" to? What countries does your data "visit"? The
project requires knowledge of Wireshark or scapy.
- Measuring Internet port blocking:
- Most consumer Internet services block some Internet ports, sometimes
for historical reasons, sometimes for security reasons and sometimes for
reasons that are less obvious. Develop a tool that allows a user to
test which UDP and TCP ports can be used for both incoming and outgoing
packets, and whether other IP features such as IP options and IPv6 are
usable.
- Video quality:
- How do bandwidth and packet loss affect the video quality of
streaming and interactive applications such as Zoom, WebRTC
applications, YouTube, Netflix, or TikTok? Consider using a network
emulator to simulate various network conditions.
- Finding Internet bottlenecks:
- When Internet applications suffer from performance problems, it is
often difficult to tell whether the problem is found in the home (Wi-Fi)
network, in the first-hop access network (e.g., the shared cable
network), the middle-mile network, the Internet backbone or at the
server or CDN. Develop a tool for either a desktop or mobile OS to
estimate where the performance problem is likely to be found.
- Wi-Fi performance:
- It is not uncommon that Wi-Fi is slower than LTE. Map the
performance of Wi-Fi (e.g., the Columbia Wi-Fi network) vs. LTE in a
geographic area, including indoors, e.g., using the FCC mobile
measurement application.
- Wi-Fi performance at home:
- Characterize the performance of your Wi-Fi router (single or mesh)
as a function of distance, time-of-day, interference (e.g., a second
router) and other factors. (You do not need the highest speed service -
you can use a local test server.)
- Wi-Fi congestion:
- Measure Wi-Fi spectrum usage in the 2.4, 5 and 6 GHz bands in various
locations in New York City (or wherever you may want to travel...), both
indoors and outdoors. How many stations are visible on what frequency
channels? Where are publicly accessible access points, such as "Cable
Wi-Fi", visible and accessible? Measure the impairment due to
interference, i.e., how much lower the throughput between a mobile device
and the base station is than in a "silent" radio environment.
- Internet speed tests:
- There are a large number of speed tests available, including from
the FCC (Measuring Broadband America), Ookla (speedtest.net), M-Lab,
Netflix, and Google. For the same network, they often provide very
different results. Compare these approaches and analyze how they
measure speed and latency. Can you explain the differences?
- Location measurements:
- What is the reliability of handset-provided geographic location
data? Build a tool that allows users to indicate their true location
based on a map and compare it to the location provided by GPS, Wi-Fi or
cellular tower data. Explore the reliability systematically, both
outdoors and within a building.
- Indoor positioning:
- Can you determine the room (apartment, office, ...) you are in by
comparing Wi-Fi "fingerprints"? Can you apply machine learning
techniques to the task?
- Altitude information:
- Many modern smartphones have built-in altimeters based on barometric
pressure. Altitude (elevation) information can be very useful to
dispatch first responders after a 911 call. Conduct experiments that
allow you to evaluate the accuracy of this data in various
buildings.
- Speech-to-text and AI-based meeting summary:
- Using recordings or after live participation by citizen reporters,
can we auto-transcribe local government or (FCC) regulatory meetings and
provide structured summaries (e.g., alignment to agendas), to augment
reporting by local journalists (who may no longer be able to cover every
borough or county meeting)?
- Ad tracking and cookie permissions:
- Many websites allow you to choose whether to accept cookies or
select among categories of cookies. Determine which cookies are
affected. Does the loading speed or data volume of the website
change?
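For the port-blocking project above, a possible starting point is sketched below. It only checks outgoing TCP connections, and it assumes you control a cooperating test server that listens on every port of interest; without one, a failed connection cannot distinguish a blocked port from a closed one. The function names are illustrative, and UDP, incoming connections, IP options and IPv6 would each need separate tests.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False


def scan_ports(host: str, ports, timeout: float = 2.0) -> dict:
    """Map each port to whether an outgoing TCP connection succeeded."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

A call such as `scan_ports("testserver.example.net", [25, 80, 443])` would return a dictionary of port numbers and booleans; the host name here is a placeholder for your own test server.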
Data Analysis
The projects below can use a variety of data analysis techniques,
from SQL to statistics packages to ML, often in combination. For all
projects, the instructor has pointers to data sources.
- BEAD challenge process:
- The BEAD program aims to deploy high-speed internet to every location that is
currently unserved or underserved. The BEAD challenge process aims to
find out if the public maps are accurate. Analyze the BEAD challenge
process data to find out what kind of corrections were made during the
process, who participated and how different states handled the process.
- Broadband deployment:
- The FCC collects broadband deployment data (BDC). Analyze which
technologies were deployed where and which technologies disappeared. How
does the data compare to the older Form 477 data?
- Broadband subsidies:
- In the United States, both the federal
government and states subsidize broadband and communication services
(mobile phones, mainly). Who benefits? Consider rural vs. urban,
richer vs. poorer areas, and "red" vs. "blue" areas, using data provided
by the FCC, Census data and other sources.
- Broadband pricing, broadband label:
- Using the FCC broadband labels, collect
pricing data for both promotional and long-term pricing. Does the price
change where there is more competition? What other factors (e.g.,
rurality) explain pricing differences? Does price change linearly with
speed or is there some other correlation?
- Peering:
- Using routing and peering
data, characterize peering relationships between carriers, content
providers and CDNs. Who peers with whom? Under what conditions?
- Broadband metrics:
- The FCC now gathers a range of broadband performance indicators that
are highly correlated, e.g., as part of the Measuring Broadband
America data set. What is their relationship with each other?
Which of these are independent or dependent variables?
- Consumer expenditures:
- Gather all available data on consumer expenditures for telephone,
cellular and Internet services, comparing government data, industry
analysis and corporate annual reports. (The BLS consumer expenditures survey
provides some information, but may not map cleanly into current
categories.) Is the data consistent? Can it be compared against other
major OECD economies? How have expenditures changed?
- Rural electric cooperatives:
- Analyze the service territories of rural electric cooperatives.
Using the FCC Form 477 data, how good (or bad) is broadband connectivity
in those areas? Has it changed recently?
- Network reliability:
- Can you determine network outages, both "sunny day" and "rainy day",
from the FCC Measuring Broadband America or ATLAS measurement
infrastructure data?
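For the broadband pricing project above, one simple way to approach the question of whether price changes linearly with speed is to compare how well a straight line fits price against speed versus price against the logarithm of speed. The sketch below uses only the standard library; the function names and the choice of a logarithmic alternative are assumptions for illustration, and a real analysis would likely use a statistics package and control for other factors such as competition and rurality.

```python
import math


def r_squared(xs, ys):
    """R^2 of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("need variation in both variables")
    # For a one-variable linear fit, R^2 equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)


def compare_fits(speeds_mbps, prices_usd):
    """Compare the fit of price ~ speed with the fit of price ~ log(speed)."""
    log_speeds = [math.log(s) for s in speeds_mbps]
    return {
        "linear": r_squared(speeds_mbps, prices_usd),
        "log": r_squared(log_speeds, prices_usd),
    }
```

Whichever model yields the higher R^2 on the collected label data describes the price-speed relationship better, although a high R^2 alone does not establish that the model is appropriate.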
Literature Review and Analysis
The projects below summarize key resources in the topic area. They
may involve data, but are likely to require smaller volumes of data and
less advanced statistics. They may also draw on interviews you conduct
with domain experts.
- BEAD policies:
- The Broadband
Equity, Access and Deployment (BEAD) program aims to deploy high-speed
internet to all unserved and underserved locations. Analyze how
different states and territories approached key facets, such as
the low-cost option, the challenge process and subgrantee selection.
- Mobile broadband and financial inclusion:
- Explore the impact of mobile broadband accessibility on financial
inclusion in the Global South. This project should entail a comparative
analysis of mobile vs. fixed broadband adoption amongst unbanked or
underbanked populations, with emphasis on pricing, customer experience,
and service quality. (Likely combines literature survey, data analysis,
and interviews.) Consider countries where mobile providers also serve as
"banks" (e.g., M-Pesa; see also the J-PAL study).
- Content moderation and amplification for social media:
- Survey the literature on content moderation for social media -
current approaches, tools, effectiveness, transparency, requirements
(e.g., in Europe).
- Content moderation for discussion forums:
- What kind of discussion forums, ranking and content moderation do
national and local news sites employ? Is there a way to measure the
quality of the discussion? Consider contacting newspaper staff to
gather their experiences.
- Digital "papers":
- Read Carpenter v. United States, United States v.
Jones, and Riley v. California and maybe some lower court
cases, and summarize how courts are handling search warrants for digital
"papers". How has treatment changed? How do these decisions reflect (or
not) the differences between traditional and digital letters and other
personal documents?
- Transparency report:
- Do a survey and data analysis of tech companies' transparency
reports. What do they cover? How do they differ in categories and
geographic detail? Do they indicate what they do not disclose? Can you
design a template similar to, say, a 10-Q disclosure?
- Media:
- For different TV and radio stations (e.g., in the NYC
area), determine their programming mix, e.g., children's programming,
local news, advertisements, syndicated programming, ...
- Data portability:
- For major consumer services for photos, messages, social media
posts, address books, and email (e.g., various Google services,
Facebook, Instagram, TikTok, WhatsApp, Yahoo Mail, Apple Photos and
email), can you extract your data, e.g., to move to a new service? How
long does it take? How useful is the data you can extract? Can you
import the data (e.g., email or photos) to another service? Are there
tools to help?
- Ad blocking:
- Among popular websites, e.g., for news, which function well with ad
blockers and which fail or explicitly refuse to provide content? How
effective is ad blocking?
- Rural broadband:
- Analyze the cost of deploying fiber in rural areas. What are the
cost components, such as planning, fiber, electronics and construction?
How does take-up affect cost and viability? What are financing
models?
- Cost of Internet access:
- Using bills gathered from (Facebook, LinkedIn, real-life) friends
and family, try to evaluate the typical cost structure of Internet and
phone service. How much variation is there for similar services? How
does this compare to the advertised rates?
- Communication networks during natural disasters:
- Using interviews with residents and public safety officials, as well
as various data sources, describe how well various communication
facilities held up during Hurricanes Harvey and Irma, including land mobile radio
("walkie-talkies"), cellular, landline and Internet access.
- Spectrum usage:
- Analyze what spectrum is used for, by whom and where, comparing use
for categories such as broadcast, communication and non-communication
(radar, medical, industrial) applications.
- Spectral efficiency:
- Compare the spectral efficiency of FM radio, digital over-the-air
(ATSC 3) TV, land-mobile radio and cellular systems. Consider the
encoding of information, the air interface, and how many bits of content
are delivered to users, or how much spectrum it would take to replace a
traditional service such as radio or TV with a cellular service. Note
that there is no single definition of spectral efficiency, so the
project should consider existing definitions in the literature and
justify choices.
- TV stations:
- Investigate whether one could put all TV stations on cable or
satellite, either generally or in more rural areas. How many stations
are must-carry vs. retransmission consent? What would be the costs,
potential sources of revenue and benefits?
- Cybersecurity:
- What are the principal causes of cybersecurity problems? Is there
quantitative evidence? What remedies are likely to reduce the frequency
or impact of such events? (Cite research to support your
arguments.)
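For the spectral efficiency project above, one common definition (among the several the project should weigh) is delivered bit rate divided by occupied bandwidth, in bit/s/Hz. It can be compared with the Shannon limit for a given signal-to-noise ratio, $\log_2(1 + \mathrm{SNR})$. The sketch below assumes that definition; the function names are illustrative, and any input values should come from the relevant standards or measurements rather than from this example.

```python
import math


def spectral_efficiency(bit_rate_bps: float, bandwidth_hz: float) -> float:
    """Delivered bit rate per unit of occupied bandwidth, in bit/s/Hz."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return bit_rate_bps / bandwidth_hz


def shannon_limit(snr_db: float) -> float:
    """Upper bound on spectral efficiency (bit/s/Hz) at the given SNR in dB."""
    snr_linear = 10 ** (snr_db / 10)
    return math.log2(1 + snr_linear)
```

For instance, a hypothetical service delivering 12 Mb/s in a 6 MHz channel has `spectral_efficiency(12e6, 6e6)` equal to 2 bit/s/Hz. Note that this per-link figure ignores frequency reuse and the number of users served, which matter when comparing broadcast with cellular systems.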