TAGGING AND BROWSING VIDEOS ACCORDING TO
THE PREFERENCES OF
DIFFERING AFFINITY GROUPS
John R. Kender
Columbia University
OVERVIEW:
We investigate the new task of determining which textual tags are
preferred by different affinity groups for news and related videos. We
use this knowledge to assign new group-specific tags to other videos
of the same event that have arisen elsewhere. We map visual and
multilingual textual features into a joint latent space using reliable
visual cues, and determine tag relationships through variants of
canonical correlation analysis (CCA). For human-interest international
events such as epidemics and transportation disasters, we detect
country-specific tags from US, Chinese, European, South American, and
other countries' news coverage.
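As a concrete illustration of the joint-latent-space step, the Python
sketch below projects placeholder visual and tag feature matrices into a
shared space with linear CCA (scikit-learn's CCA standing in for the
project's CCA variants) and ranks known videos' tags for a new video by
cosine similarity in that space. The feature dimensions and data are
assumptions for illustration only, not the project's pipeline.

    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)
    n_videos = 200
    visual = rng.normal(size=(n_videos, 64))   # placeholder visual features, one row per video
    tags = rng.normal(size=(n_videos, 32))     # placeholder tag-text features, one row per video

    cca = CCA(n_components=10)
    vis_latent, tag_latent = cca.fit_transform(visual, tags)   # shared 10-d latent space

    def suggest_tags(new_visual_row, k=5):
        """Rank training videos (and hence their tags) by cosine similarity
        to a new video, measured in the joint latent space."""
        v = cca.transform(new_visual_row.reshape(1, -1))       # project the visual view only
        sims = (tag_latent @ v.T).ravel() / (
            np.linalg.norm(tag_latent, axis=1) * np.linalg.norm(v) + 1e-9)
        return np.argsort(-sims)[:k]                           # indices of best-matching videos

    print(suggest_tags(rng.normal(size=64)))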
We catalog statistically significant cross-group differences in
multimedia creation and tagging, and explore variants of Deep CCA,
finding them better suited to capturing those preferences in a
three-view space (one common video dimension, two culturally determined
tag dimensions). We investigate how these non-linear methods can be
extended to the videos of multiple affinity groups, including more
subtle shadings such as the US compared to the UK, or even Canada. As
different groups are differentially sensitive to particular images, we
investigate the day-to-day spreading influence of visual memes across
countries through a novel application of the PageRank algorithm.
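The sketch below illustrates the PageRank idea on an invented
country-to-country reuse graph: each directed edge counts how often a
near-duplicate keyframe originating in one country reappears in another,
and networkx's pagerank (run on the reversed graph, so that widely
reused sources score highly) stands in for the project's specific
formulation. The counts are made up for illustration.

    import networkx as nx

    # directed edge (A, B, w): keyframes originating in A were reused w times in B
    reuse_counts = [
        ("US", "UK", 12), ("US", "CN", 5), ("CN", "US", 9),
        ("CN", "UK", 3),  ("UK", "CA", 7), ("CA", "US", 2),
    ]

    G = nx.DiGraph()
    G.add_weighted_edges_from(reuse_counts)

    # Rank on the reversed graph so that countries whose images are widely
    # reused elsewhere receive high influence scores.
    influence = nx.pagerank(G.reverse(copy=True), alpha=0.85, weight="weight")
    for country, score in sorted(influence.items(), key=lambda kv: -kv[1]):
        print(f"{country}: {score:.3f}")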
We demonstrate and evaluate a novel cross-group multimedia browser
that accesses online webpage archives of international events from two
different countries. It visualizes these results with
country-specific information on separate timelines, but with
cross-country images and tags straddling both. This system provides
an exploratory, zoomable differential view of clips and text, and
graphs their development over time. We demonstrate that this browser
expands and improves the effectiveness of video retrieval.
Some examples of different cultural viewpoints detected:
In the figures above, near-duplicate keyframes appear across cultures
with different accompanying texts. The upper pair shows near-duplicate
keyframes from the AirAsia Flight news coverage. The lower pair shows
near-duplicate keyframes from the AlphaGo vs. human event. An English
version of the Chinese text, produced by an auto-translator, is included
for comparison. Note that the translation has some slight issues, such
as rendering the Chinese word "diliutian" (day 6) as "6th", omitting the
"tian" (day) character.
Some examples of differential use of named entities:
The plot above shows the frequency counts of named entities in Chinese
(CGTN) versus U.S. (YouTube) sources covering the Chinese Lunar Rover
event. The outliers in the top-left corner (gray) are "Yutu" and
"Chang'e-4", the Chinese names of the rover and its mission,
respectively, which are generally unmentioned in the U.S. The outlier
in the bottom-right corner (green) is "NASA", generally unmentioned in
China. The outliers (red) in the top-right corner, reflecting roughly
equal frequency of use in both sources, are "first" and "moon", the
common subjects of the event.
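The comparison behind this plot can be sketched as follows, assuming an
upstream NER step has already produced one entity list per source; the
example entity lists and the min_count cutoff are invented for
illustration.

    from collections import Counter

    cn_entities = ["Yutu", "Yutu", "Chang'e-4", "Chang'e-4", "moon", "first", "first"]
    us_entities = ["NASA", "NASA", "moon", "first", "first", "moon"]

    cn_counts, us_counts = Counter(cn_entities), Counter(us_entities)

    def outliers(primary, other, min_count=2):
        """Entities prominent in `primary` but unmentioned in `other`."""
        return [e for e, c in primary.items() if c >= min_count and other[e] == 0]

    print("China-only:", outliers(cn_counts, us_counts))   # e.g., Yutu, Chang'e-4
    print("US-only:   ", outliers(us_counts, cn_counts))   # e.g., NASA
    print("Shared:    ", [e for e in cn_counts if e in us_counts])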
STUDENTS (alphabetical order):
INTERIM RESEARCH REPORTS:
- System: Deep cross-cultural system architecture and performance I [Yicun Liu]
- System: Deep cross-cultural system architecture and performance II [Yicun Liu]
- System: Deep cross-cultural system architecture and performance III [Yicun Liu]
- System: Determining significant cross-cultural news events [Xu Han]
- System: Pipeline performance enhancement [Yiyang Zeng and Guandong Liu]
- System: Non-negative Matrix Factorization to derive cultural foci [Yifei Chen and Zhengyi Chen]
- System/CV: Pipeline for collection, analysis, and extraction, India versus India [Tom Joshi]
- CV: A hybrid method for recognizing near-duplicate frames in news videos [Xu Han]
- CV: Improving clarity of key frames [Ruochen Liu]
- CV: Object-based video frame similarity in news videos [Ruochen Liu]
- CV: Autoencoder-based clustering of cross-cultural video frames [Omer Onder]
- CV: Extraction and translation of embedded text in Chinese news videos [Mehul Goel]
- CV: Frame similarity analysis via ViT encoding [Zhepeng Luo]
- CV: Deepfake face detection depends on affinity group [Rohit Gopalakrisnan]
- CV: Frame autoencoder using FC bottleneck layers [Tiancheng Shi]
- CV: Color- and setting-based cultural differences in news media [Marvin Limpijankit]
- CV: Scene differences between U.S. and China commercial advertisements [Zhe Pang]
- CV: Cultural differences in news video thumbnails via computational aesthetics, U.S. versus China [Marvin Limpijankit]
- CV: Cross-cultural exploration of visual aesthetics in magazine covers [Marvin Limpijankit]
- CV/NLP: Cultural differences in images chosen for the same textual tags [Yu-Shih Chen]
- CV/NLP: Multimodal autoencoder architecture for image and text dimension reduction [Tiancheng Shi]
- CV/NLP: LLM-assisted interpretation of cross-cultural multimodal autoencoder embeddings [Tiancheng Shi]
- NLP: Detecting named entities in English and Chinese news about AlphaGo [Xu Han and Shao-An Chien]
- NLP: Detecting culture-specific outliers in news topics, U.S. versus China [Shao-An Chien]
- NLP: Finding white-list news topic English words [Kathleen Lee]
- NLP: Finding and visualizing black-list news topic English words I [Zikun Lin]
- NLP: Finding and visualizing black-list news topic English words II, using BERT [Zikun Lin]
- NLP: BERT-based Chinese advertising filter [Zejun Lin]
- NLP: Middle-frequency Chinese word analysis [Yifei Zhang]
- NLP: Arousal and valence in cross-cultural captions of news videos [Vikki Sui]
- NLP: Tweet versus Weibo emotions in response to a world event [Hongxuan Chen]
- NLP: American versus Russian sentiment in news headlines [Nathaniel Wang]
- NLP: Sentiment analysis of news video descriptions via Graph Neural Networks [Ting Zou]
- NLP: Sentiment response to news shock, England versus China [Zihang Xu and Zheng Hui]
- NLP: Sentiment analysis of Ukrainian, American, and Russian posts in the UACommunity subreddit [Adelina Duman]
- NLP: Sentiment response to news shock, England versus China II [Zheng Hui]
- NLP: Cross-cultural sentiment in English and Chinese Reddit posts [Zhenni Xu]
- NLP: Emoji analysis between English and Spanish posters in World Cup 2022 [Yu-Hsin Huang and Jittisa Kraprayoon]
- NLP: Sentiment dynamics in climate change discourse, U.S. versus India [Mansi Singh]
- NLP/UI: Finding and visualizing English keywords for browser display [Satvik Jain]
- NLP/UI: Chinese keyword selection [Caroline Chen]
- NLP/UI: Cross-cultural emotive responses to news videos [Luvena Huo]
- NLP/UI: Color-coded valence, arousal, and culture word clouds for news video responses [Luvena Huo]
- NLP/UI: Analysis and display of emoji use in response to news videos [Angela Zhang]
- NLP/UI: Visualization of valence of cross-cultural social media responses to news [Risheng Lian]
- UI: Visualizing cross-cultural news coverage, U.S. versus China [Jiaqi Liu]
- UI: Visualizing cross-cultural news coverage II, U.S. versus China [Dave Dirnfeld]
- UI: Cross-cultural browser alpha test and improvement [Luvena Huo and Tiansheng Sun]
- UI: Shape and color for cross-cultural word cloud user interfaces [Shaun Wang]
DETAILS:
NSF IIS project award number: 1841670
Expected duration: 18 to 24 months
Award title: "Tagging and Browsing Videos According to the Preferences
of Differing Affinity Groups"
Principal investigator: John R. Kender
Acknowledgment: This material is based upon work supported by the
National Science Foundation under Grant No. 1841670.
Disclaimer: Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the authors and do not
necessarily reflect the views of the National Science Foundation.
For further information:
jrk atsign cs dot columbia dot edu
Last update: Sep. 15, 2023