Basic Information
I am seeking a Ph.D. position in 2024, preferably in America.
I am fascinated by reasoning and dedicated to studying causality as described by pre- and post-conditions (see the PROC2PDDL paper). If you are interested in this, feel free to email me. 😃
Highest Education:
MSE in Data Science at University of Pennsylvania; GPA: 3.97
Research Interests:
- Reasoning in Natural/Symbolic Language
- Interdisciplinary Research across NLP, CV, and Robotics
Expertise:
- Learning Science and Cognitive Science (6 years of experience, B.S., M.Ed)
  - Human Learning and Knowledge Storage
  - Learning Task Design
- Natural Language Processing (3 years of experience, MSE)
  - Event Extraction
  - Event Reasoning in Natural and Symbolic Language
    e.g. Schema Generation, Event and Entity Pre- and Post-condition Generation
Publications
Human-in-the-Loop Schema Induction. Tianyi Zhang*, Isaac Tham*, Zhaoyi Hou*, Jiaxuan Ren, Liyang Zhou, Hainiu Xu, Li Zhang, Lara J. Martin, Rotem Dror, Sha Li, Heng Ji, Martha Palmer, Susan Brown, Reece Suchocki, Chris Callison-Burch. ACL Demo 2023
Argument Role Labelling Question-Answer Pair Data Augmentation. Tianyi Zhang, Elior Sulem, Dan Roth. 2023, In submission
PROC2PDDL: Predicting Domain Definitions Based on Natural Language for Symbolic Planning. Tianyi Zhang*, Li Zhang*, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch, Niket Tandon. 2023, In submission
Not publicly available currently.
Education
- MSE in Data Science, Jan. 2021 - Dec. 2022
  University of Pennsylvania, Philadelphia, America
- M.Ed in Learning Science and Technology, Sept. 2018 - Dec. 2019
  University of Pennsylvania, Philadelphia, America
- B.S. in Educational Technology, Sept. 2014 - Jun. 2018
  Beijing Normal University, Beijing, China
Research Experience
- Research Assistant: NLP Group at UPenn, May 2022 - Jul. 2023
  - Event schema generation using GPT-3
    - Design scaffold prompts (cause, plan, procedure, effect, etc.) for GPT-3 to generate inclusive events
    - Apply SRL and constituency parsing to summarize and extract structured events
    - Build schema graphs by adding temporal relations to the events
    - Ground the graph nodes in the ontology via semantic inference or semantic similarity
    - Outcome: "Human-in-the-Loop Schema Induction" paper
  - Natural language to symbolic language translation for reasoning
    - Focus on open-domain natural language (wikiHow) to symbolic language (PDDL) generation with GPT-4
    - Decompose the task into three stages: extraction, inference, and translation
    - Identify strong text extraction and entity-state inference abilities on increasingly complex wikiHow text (~5000 words)
    - Identify a weak ability to translate into predefined symbolic patterns
    - Improve translation performance using chain-of-thought (CoT) prompting and instructions
    - Outcome: "PROC2PDDL" paper
- Research Assistant: Cognitive Computation Group at UPenn, Mar. 2020 - Dec. 2022
  - Event trigger identification and classification using sequence tagging
    - Build a 1-of-N (72) joint model and a BIO identification + event type classification pipeline model
    - Apply transfer learning with a target-language auxiliary dataset, e.g. OntoNotes Arabic
  - Event argument identification and classification using QA
    - Design fixed questions for each argument role and convert the argument role labeling task into a question answering task
    - Build a has-and-no-answer joint model and a has/no answer classification + has answer identification pipeline model
    - Apply transfer learning with auxiliary QA datasets, e.g. SQuAD, QAMR
  - Event data augmentation using answer extraction (AE) and question generation (QG)
    - Train an AE-QG T5 model to extract QA pairs from unlabeled event text
    - Train an AEwSRL-QG BERT-T5 model to extract QA pairs from unlabeled event text
    - Evaluate on the joint-QA and pipeline-QA models and demonstrate the effectiveness of the above data augmentation approach
    - Outcome: "Argument Role Labelling Question-Answer Pair Data Augmentation" paper