Research
My research interests lie in advancing robustness and open-world generalization in machine learning, aiming to build trustworthy models that can adapt and perform consistently across varied, dynamic real-world environments. .
|
|
Diversity Helps Jailbreak Large Language Models
Weiliang Zhao,
Daniel Ben-Levi,
Junfeng Yang,
Chengzhi Mao,
arXiv, 2024
arXiv
A Generalised jailbreaking technique by encouraging higher levels of diversification and adjacent obfuscated prompting to evaluate the vulnerabilities of LLMs.
|
|
Learning to Rewrite: Generalized LLM-Generated Text Detection
Wei Hao,
Ran Li ,
Weiliang Zhao,
Junfeng Yang,
Chengzhi Mao,
arXiv, 2024
arXiv
We propose a method designed to enhance
the detection of LLM-generated text by learning
to rewrite more on LLM-generated inputs and less
on human generated inputs.
|
Feel free to steal this website's source code. Do not scrape the HTML from this page itself, as it includes analytics tags that you do not want on your own website — use the github code instead. Also, consider using Leonid Keselman's Jekyll fork of this page.
|
|