Research and Engineering

Predicting Difficulty and Discrimination of Natural Language Questions

In the spring of 2020, I began researching how to generate questions whose difficulty can be specified in advance.

As a researcher, I implemented a seq2seq question generation model, explored various ways of implementing item response theory, and performed my own research on how syntactic and semantic features can predict how difficult a question will be to solve. I also explored practical aspects of Natural Language Processing, including how to fine-tune large language models for downstream classification tasks. In particular, I have a deep interest in the application of AI in education technology, which led me to participate in a Kaggle competition focused on determining the quality of argumentative essays. This project helped me think about how to set up efficient pipelines for testing many models under different configurations, largely because of the numerous hyperparameters that come with fine-tuning large language models. My code can be found here. The project also pushed me to familiarize myself with modern mechanisms for interpreting predictions from these large language models, including integrated gradients and LIME. I even extended the most popular implementation of LIME and opened a pull request that adds the ability to use a T5 model to fill in the blanks in the perturbed data that LIME creates.
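
The pull request's own code lives in the LIME repository; the snippet below is only a rough sketch of the idea behind it, assuming the Hugging Face transformers T5 checkpoints and an illustrative [BLANK] placeholder and fill_masked_text helper. Instead of leaving the words that LIME masks out during perturbation as literal placeholder tokens, T5's span-infilling sentinels can be used to generate plausible replacements.

```python
# Hedged sketch, not the PR's actual code: use T5 span infilling to replace
# the placeholder tokens that text perturbation leaves behind.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def fill_masked_text(text_with_blanks, blank_token="[BLANK]"):
    # Convert each blank produced during perturbation into a T5 sentinel token.
    parts = text_with_blanks.split(blank_token)
    t5_input = "".join(
        part + (f"<extra_id_{i}>" if i < len(parts) - 1 else "")
        for i, part in enumerate(parts)
    )
    input_ids = tokenizer(t5_input, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=32)
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=False)

    # T5 emits "<extra_id_0> span0 <extra_id_1> span1 ..."; splice each span back in.
    filled = t5_input
    for i in range(len(parts) - 1):
        start = decoded.find(f"<extra_id_{i}>")
        if start == -1:
            span = ""
        else:
            start += len(f"<extra_id_{i}>")
            end = decoded.find(f"<extra_id_{i + 1}>")
            span = decoded[start: end if end != -1 else None]
        span = span.replace("</s>", "").replace("<pad>", "").strip()
        filled = filled.replace(f"<extra_id_{i}>", span, 1)
    return filled

print(fill_masked_text("The quick [BLANK] fox jumps over the [BLANK] dog."))
```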

I'm currently exploring ways to use natural language feedback to improve models, which partly inspired my implementation of fast weights, in which one model (the slow network) learns weight updates for another (the fast network). I've also begun studying a classification task whose goal is to determine the severity of medical errors based on a description of the event, which can be viewed here once the codebase becomes public.
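
A minimal sketch of the fast-weights setup described above, assuming a PyTorch implementation; the layer sizes and the FastWeightsModel name are illustrative, not my actual code. The slow network maps the input to a flat parameter vector, which is reshaped into the weights and biases of a small fast network that produces the prediction.

```python
# Hedged sketch: a slow network emits the parameters of a per-example fast network.
import torch
import torch.nn as nn

class FastWeightsModel(nn.Module):
    def __init__(self, in_dim=16, hidden_dim=32, out_dim=4):
        super().__init__()
        self.in_dim, self.hidden_dim, self.out_dim = in_dim, hidden_dim, out_dim
        # The slow network outputs every weight and bias of the fast network.
        n_fast_params = (in_dim * hidden_dim + hidden_dim) + (hidden_dim * out_dim + out_dim)
        self.slow_net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, n_fast_params)
        )

    def forward(self, x):
        p = self.slow_net(x)  # (batch, n_fast_params)
        i, h, o = self.in_dim, self.hidden_dim, self.out_dim
        # Slice the flat parameter vector into the fast network's weights and biases.
        w1 = p[:, : i * h].view(-1, i, h)
        b1 = p[:, i * h : i * h + h]
        off = i * h + h
        w2 = p[:, off : off + h * o].view(-1, h, o)
        b2 = p[:, off + h * o :]
        # Run the fast network: two per-example linear layers with a ReLU in between.
        hidden = torch.relu(torch.bmm(x.unsqueeze(1), w1).squeeze(1) + b1)
        return torch.bmm(hidden.unsqueeze(1), w2).squeeze(1) + b2

model = FastWeightsModel()
y = model(torch.randn(8, 16))  # (8, 4): predictions made with per-example fast weights
```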


A long time ago, I worked on a cheat-detection system for Canvas LMS. It was a desktop app that used an autoencoder to detect anomalies in testing data. The project was later remade as a website and mobile app, but was discontinued due to the complexities of a hobby project complying with FERPA. The codebase for the website can be found here and the desktop app here.
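
As a rough illustration of the approach (not the app's actual code), here is the standard autoencoder anomaly-detection recipe it was built around, assuming PyTorch and made-up session features: train the autoencoder to reconstruct normal test-taking sessions, then flag sessions whose reconstruction error falls well outside the training distribution.

```python
# Hedged sketch of autoencoder-based anomaly detection on exam-session features.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 4), nn.ReLU())
        self.decoder = nn.Linear(4, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train(model, data, epochs=50, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(data), data)  # reconstruction loss on normal sessions
        loss.backward()
        opt.step()
    return model

# Feature vectors per exam session (e.g. timing and navigation statistics).
normal_sessions = torch.randn(500, 10)  # placeholder training data
model = train(Autoencoder(), normal_sessions)

with torch.no_grad():
    errors = ((model(normal_sessions) - normal_sessions) ** 2).mean(dim=1)
    threshold = errors.mean() + 3 * errors.std()  # simple 3-sigma cutoff

def is_anomalous(session, model=model, threshold=threshold):
    # Flag a session whose reconstruction error is far above the normal range.
    with torch.no_grad():
        err = ((model(session) - session) ** 2).mean()
    return bool(err > threshold)
```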


Notice: This page is still under construction