publications
2025
- ConEm: Learning Embedded Concept Representations from LLMs. Luna Peck and Alvin Po-Chun Chen. 2025.
While large language models offer impressive performance, flexibility, and ease of use via natural language interaction, slight variations in the wording of prompts can have significant impacts on model performance. To sidestep this problem, we propose ConEm, a prompt tuning–like framework for learning non-linguistic representations of general concepts. We use ConEm to train a variety of GPT-2–compatible concept embeddings across multiple datasets, and evaluate their utility on a generative classification task. We find that while GPT-2–based ConEm does not produce consistently better results than similar natural language methods, the method shows significant promise, and is worth pursuing further.
@misc{peck2025conem,
  title = {ConEm: Learning Embedded Concept Representations from LLMs},
  author = {Peck, Luna and Chen, Alvin Po-Chun},
  year = {2025},
}
- NeuroSpect: Automating Aspect Annotation for UMR. Alvin Po-Chun Chen, Claire Benét Post, Saksham Khatwani, and 5 more authors. 2025.
Uniform Meaning Representation (UMR) is the successor to Abstract Meaning Representation (AMR), a graph-based framework for representing the semantics of natural language data. Graphs from both frameworks require extensive effort by trained annotators to produce, motivating the need for automating parts of the annotation process. While current approaches struggle to produce faithful UMR graphs from natural language inputs, converting existing AMR graphs (of which there are plenty) is a more tractable task. A key part of the conversion process requires incorporating features added to the UMR framework that are not present in AMRs. One such addition is aspect, which marks the temporal structure of eventive predicates. In this paper, we introduce RuleSpect and GraphSpect, two novel methods for producing UMR aspect annotations using existing AMR graphs and purely natural language inputs, respectively.
@misc{chen2025neurospect,
  title = {NeuroSpect: Automating Aspect Annotation for UMR},
  author = {Chen, Alvin Po-Chun and Post, Claire Ben{\'e}t and Khatwani, Saksham and Sairam, Karthik and Bontempo, Paul and Milliken, August and Derby, Nicholas and Nabieva, Sumeyye},
  year = {2025},
}
2024
- Studying the Effects of Collaboration in Interactive Theme Discovery Systems. Alvin Po-Chun Chen, Dananjay Srinivas, Alexandra Barry, and 2 more authors. 2024.
NLP-assisted solutions have gained considerable traction to support qualitative data analysis. However, there does not exist a unified evaluation framework that can account for the many different settings in which qualitative researchers may employ them. In this paper, we take a first step in this direction by proposing an evaluation framework to study the way in which different tools may result in different outcomes depending on the collaboration strategy employed. Specifically, we study the impact of synchronous vs. asynchronous collaboration using two different NLP-assisted qualitative research tools and present a comprehensive analysis of significant differences in the consistency, cohesiveness, and correctness of their outputs.
@misc{chen2024studyingeffectscollaborationinteractive,
  title = {Studying the Effects of Collaboration in Interactive Theme Discovery Systems},
  author = {Chen, Alvin Po-Chun and Srinivas, Dananjay and Barry, Alexandra and Seniw, Maksim and Pacheco, Maria Leonor},
  year = {2024},
  eprint = {2408.09030},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
}
- Mothman at SemEval-2024 Task 9: An Iterative System for Chain-of-Thought Prompt Optimization. Alvin Po-Chun Chen, Ray Groshan, and Sean Von Bayern. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), Jun 2024.
Extensive research exists on the performance of large language models on logic-based tasks, whereas relatively little has been done on their ability to generate creative solutions on lateral thinking tasks. The BrainTeaser shared task tests lateral thinking and uses adversarial datasets to prevent memorization, resulting in poor performance for out-of-the-box models. We propose a system for iterative, chain-of-thought prompt engineering which optimizes prompts using human evaluation. Using this shared task, we demonstrate our system's ability to significantly improve model performance by optimizing prompts and evaluating the input dataset.
@inproceedings{chen-etal-2024-mothman,
  title = {Mothman at {S}em{E}val-2024 Task 9: An Iterative System for Chain-of-Thought Prompt Optimization},
  author = {Chen, Alvin Po-Chun and Groshan, Ray and Von Bayern, Sean},
  editor = {Ojha, Atul Kr. and Do{\u{g}}ru{\"o}z, A. Seza and Tayyar Madabushi, Harish and Da San Martino, Giovanni and Rosenthal, Sara and Ros{\'a}, Aiala},
  booktitle = {Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)},
  month = jun,
  year = {2024},
  address = {Mexico City, Mexico},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2024.semeval-1.263},
  doi = {10.18653/v1/2024.semeval-1.263},
  pages = {1876--1888},
}