About

Hello and welcome to my portfolio. I am currently a third-year PhD student in Computer Science at Cornell University, where I am working with Joe Halpern. Previously, I completed my M.S. in Computer Science at the University of Washington under Anna Karlin’s supervision.

My primary research interests lie at the intersection of economics and computer science. I enjoy creating compelling, realistic mathematical models of behavior and rigorously investigating their outcomes. My Master’s thesis considered a model of how agents might procrastinate when faced with a difficult task, and used competition to alleviate the harms of procrastination. Recently, I started work on pricing problems in cryptocurrencies, focusing on how best to set interest rates for lending pools. I’ve also worked more broadly in theoretical computer science, investigating strong privacy definitions that protect correlated data, as well as coordination games on large social networks. In the past, I’ve done research in Tech Policy/HCI and NLP.

In addition to my CS interests, I’m also interested in epistemology and the philosophy of statistics. I’m currently designing a new, social theory of statistical evidence, Robust Bayesianism (RB), that extends likelihood-based methodologies to broader settings. RB is mathematically interesting because it’s compatible with weak, qualitative requirements on rationality. I’m also working on connecting statistical definitions of evidence to more commonplace ethical definitions of evidence, with the goal of providing an “ethics-first” approach to evidence, and to epistemology more generally.

Interests

  • EconCS
  • Blockchains/Cryptocurrency
  • Epistemology/Philosophy of Statistics

Education

  • M.S. in Computer Science, 2020

    University of Washington

  • B.S. in Computer Engineering, 2019

    University of Washington

Projects


Pricing Crypto Lending Pools with Options

Cryptocurrency lending pools offer overcollateralized loans with one cryptocurrency held as collateral and a different currency given out as a loan. Because they are overcollateralized, they do not increase the borrower’s liquidity, and so should not be compared to standard loans. Rather, lending pools allow borrowers to increase their exposure to a particular currency, just as options allow investors to increase their exposure in the standard financial world. As a result, we build models to price the interest rates and other parameters of lending pools by replicating them as options, and using ideas from options pricing.

Lending pool loans are overcollateralized, and if the value of the collateral dips too low, the entire loan is liquidated: the borrower can no longer repay the loan to claim their collateral, which is instead sold off at a discount. Borrowers have the option to “top up” their loans at any time by depositing more collateral, which they retrieve when they repay the loan.

We first use the famous Black-Scholes model, which assumes that the price of the asset (i.e., the crypto) obeys geometric Brownian motion, that the market is frictionless, and that trading is continuous. If we simplify lending pools by ignoring the ability of borrowers to top up their loans, we can replicate lending pool loans via (down-and-out) barrier options, which are options that are called off when the price of the asset dips too low. This yields a simple model for pricing the interest rates and collateralization parameters of lending pools.
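As a rough illustration of the replication idea (not the paper’s model), here is a minimal Monte Carlo sketch that prices a down-and-out barrier call under geometric Brownian motion by discarding any path that touches the barrier. All parameter values are made up.

```python
# Minimal Monte Carlo pricing of a down-and-out barrier call under geometric
# Brownian motion. Illustrative only: parameters are hypothetical, and discrete
# monitoring slightly overprices the option relative to a continuous barrier.
import numpy as np

def down_and_out_call(s0, strike, barrier, r, sigma, T,
                      n_steps=252, n_paths=100_000, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    log_increments = (r - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    paths = s0 * np.exp(np.cumsum(log_increments, axis=1))
    alive = (paths > barrier).all(axis=1)        # knocked out if the barrier is ever hit
    payoff = np.maximum(paths[:, -1] - strike, 0.0) * alive
    return np.exp(-r * T) * payoff.mean()

if __name__ == "__main__":
    print(down_and_out_call(s0=100, strike=100, barrier=80, r=0.05, sigma=0.6, T=1.0))
```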

However, modeling top-ups requires us to deviate from the Black-Scholes model. In particular, top-ups are trivial if there is no discount factor. So, we augment the Black-Scholes model to allow for discounting, and implement simulations to price interest rates with top-ups. To do so, we consider loans that borrowers are allowed to top up a finite number of times. We show that such loans can be considered nested barrier options: being allowed to top up $k$ times is a barrier option whose rebate equals the value of being allowed to top up only $k-1$ times. We show, via simulations, that the values of these limited top-up options converge quickly.
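The recursion can be sketched on a binomial lattice: the value with $k$ top-ups remaining is a down-and-out claim whose knock-out “rebate” is the value with $k-1$ top-ups remaining. The snippet below is only a toy, not the continuous-time model from the paper; purely for illustration, it assumes each top-up lowers the liquidation barrier by a fixed factor and ignores the cost of the extra collateral.

```python
# Toy sketch of the nested top-up recursion on a binomial lattice. Hypothetical
# parameters; not the paper's continuous-time model.
import numpy as np

def topup_values(s0, strike, r, sigma, T, n_steps, barriers):
    """barriers[k] is the knock-out level with k top-ups remaining; the claim
    with k > 0 top-ups left pays, when knocked out, the value of the claim with
    k - 1 top-ups left (the "rebate"). Returns the root value for each k."""
    dt = T / n_steps
    u, d = np.exp(sigma * np.sqrt(dt)), np.exp(-sigma * np.sqrt(dt))
    p = (np.exp(r * dt) - d) / (u - d)              # risk-neutral up probability
    disc = np.exp(-r * dt)
    prices = [s0 * u ** np.arange(t + 1) * d ** (t - np.arange(t + 1))
              for t in range(n_steps + 1)]

    def backward(barrier, rebate):
        """Value lattice of a down-and-out claim that pays `rebate` when hit."""
        lattice = [None] * (n_steps + 1)
        terminal = np.maximum(prices[-1] - strike, 0.0)
        lattice[-1] = np.where(prices[-1] <= barrier, rebate[-1], terminal)
        for t in range(n_steps - 1, -1, -1):
            cont = disc * (p * lattice[t + 1][1:] + (1 - p) * lattice[t + 1][:-1])
            lattice[t] = np.where(prices[t] <= barrier, rebate[t], cont)
        return lattice

    zero_rebate = [np.zeros(t + 1) for t in range(n_steps + 1)]
    lattice = backward(barriers[0], zero_rebate)    # no top-ups left: knock-out pays nothing
    roots = [lattice[0][0]]
    for k in range(1, len(barriers)):
        lattice = backward(barriers[k], lattice)    # rebate = value with k - 1 top-ups left
        roots.append(lattice[0][0])
    return roots

if __name__ == "__main__":
    K = 4                                           # hypothetical: up to 4 top-ups,
    barriers = [80 * 0.9 ** (K - k) for k in range(K + 1)]  # each lowering the barrier by 10%
    for k, v in enumerate(topup_values(100, 90, 0.05, 0.6, 1.0, 300, barriers)):
        print(f"{k} top-ups remaining: {v:.3f}")
```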

Finally, we study lending pools in practice, and show the discount rates induced by their interest rates, under our model.

Chunking Tasks for Present-Biased Agents

A well-known technique that instructors use to combat procrastination is to break up a large project into more manageable chunks. But how is this best done? We formalize chunking within an existing model of present bias for task completion, derive algorithms for optimal chunking, and prove that a relatively small amount of chunking can ensure that biased agents behave optimally.

We build on a model originally proposed by Jon Kleinberg and Sigal Oren [KO14]. In their model, tasks are represented by directed acyclic graphs with designated start node $s$ and end node $t$. The agent traverses a shortest $s\to t$ path with one twist: when evaluating the cost of a path, they multiply the cost of the first edge by their bias parameter, $b$. They traverse the first step of this biased path, and then recompute the best path. Existing results show that these biased agents can take exponentially more costly paths through a given graph.
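A toy sketch of the biased traversal (using networkx; the graph, costs, and bias value below are made up for illustration):

```python
# Minimal sketch of the Kleinberg-Oren present-bias model on a toy task graph.
# At each node, the agent compares s -> t continuations, but multiplies the
# cost of the immediate next edge by its bias b.
import networkx as nx

def biased_traversal(G, s, t, b):
    """Return (path taken, total cost actually paid) for an agent with bias b."""
    path, cost, u = [s], 0.0, s
    while u != t:
        best_v, best_perceived = None, float("inf")
        for v in G.successors(u):
            # Perceived cost: biased first edge plus true shortest continuation.
            perceived = b * G[u][v]["weight"] + nx.shortest_path_length(
                G, v, t, weight="weight")
            if perceived < best_perceived:
                best_v, best_perceived = v, perceived
        cost += G[u][best_v]["weight"]
        path.append(best_v)
        u = best_v
    return path, cost

if __name__ == "__main__":
    G = nx.DiGraph()
    # Direct edge s -> t of cost 4, plus a tempting detour: cheap now, expensive later.
    G.add_weighted_edges_from([("s", "t", 4), ("s", "v", 1), ("v", "t", 8)])
    print(biased_traversal(G, "s", "t", b=1))  # unbiased: (['s', 't'], 4.0)
    print(biased_traversal(G, "s", "t", b=3))  # biased:   (['s', 'v', 't'], 9.0)
```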

In our model, the task designer can chunk edges into smaller pieces. We show that, for edges along the shortest path, the optimal way to chunk an edge is to make the initial pieces easier and later pieces progressively harder. For edges not along the shortest path, optimal chunking is significantly more complex, but we provide an efficient algorithm. We also show that with a linear number of chunks on each edge, the biased agent’s cost drops from exponentially larger than optimal to within a constant factor of the cost of the true cheapest path. Finally, we show how to optimally chunk task graphs for multiple types of agents simultaneously.
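Reusing `biased_traversal` and the toy graph from the sketch above (again with made-up numbers), chunking the direct edge into an easy piece followed by a harder piece already brings the $b = 3$ agent back to the cheap route:

```python
# Split the cost-4 edge s -> t into pieces of cost 1 then 3 via a new node m.
H = nx.DiGraph()
H.add_weighted_edges_from([("s", "m", 1), ("m", "t", 3), ("s", "v", 1), ("v", "t", 8)])
print(biased_traversal(H, "s", "t", b=3))  # (['s', 'm', 't'], 4.0)
```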

An “Ethics-First” Approach to Evidence

There are many ethical norms surrounding evidence. Evidence determines when researchers should halt drug trials, and what pharmaceutical companies are obliged to disclose to the public. Prosecutors and witnesses are obliged to disclose certain types of evidence in court trials, and jury members are obliged to be responsive to evidence when rendering their verdicts. But scientific evidence is often defined in purely statistical terms. We bridge this gap by deriving theories of statistical evidence from ethical premises. Specifically, we argue that several theories of statistical evidence, including likelihoodism, Bayesian confirmation theory, and Robust Bayesianism (see below), answer the ethical question, “When is a scientist obliged to disclose experimental data or analysis?” In the future, we hope to extend our “ethics-first” approach to other epistemological concepts, such as knowledge or confirmation.

Present Bias in Competitive Task Completion

Can competition eliminate the harms of procrastination? We augment an existing model of present bias in task completion with a unique competitive twist, and find that a small amount of competition alleviates the significant harms of procrastination.

We build on a model originally proposed by Jon Kleinberg and Sigal Oren [KO14]. In their model, tasks are represented by directed acyclic graphs with designated start node $s$ and end node $t$. The agent traverses a shortest $s\to t$ path with one twist: when evaluating the cost of a path, they multiply the cost of the first edge by their bias parameter, $b$. They traverse the first step of this biased path, and then recompute the best path. Existing results show that these biased agents can take exponentially more costly paths through a given graph.

We augment this model by having two agents compete to finish the task first, with the winner getting a reward. We show that, for any graph, a very small amount of reward convinces biased agents to behave optimally, even when their natural behavior would have exponentially high cost. The amount of reward needed to guarantee optimal behavior with competition is also significantly less (in general) than several non-competitive reward schemes in the existing literature.

Robust Bayesianism: A New Theory of Statistical Evidence

We investigate a new theory of statistical evidence that analyzes likelihood-based methods through a Bayesian lens. Our theory better clarifies the objectivity of evidence, and takes the view that evidence primarily serves as a tool to convince a community of the truth of some hypothesis. Our theory is technically interesting because it’s compatible with weaker forms of rationality. Specifically, we work with a mathematical framework of qualitative, conditional probability, rather than quantitative degrees of belief.

We start from the perspective that evidence is inherently social – its primary purpose is persuasion. We thus formulate principles that essentially state that $E$ is good evidence for hypothesis $H_1$ over $H_2$ if it raises one’s posterior in $H_1$ by more than one’s posterior in $H_2$, regardless of one’s prior (it thus convinces all Bayesian agents). Our theory is closely related to likelihoodism. We show that likelihoodist principles are equivalent to our Robust Bayesian principles in the standard setting, where agents are assumed to have probabilistic beliefs. However, our theory generalizes to a setting of qualitative belief functions, and in this broader setting only some likelihoodist principles have analogs, suggesting that those principles which don’t generalize may simply be modeling artifacts. In a related line of work, we argue that social norms of rationality better justify some aspects of Bayesianism, especially in scientific contexts. That paper is also listed below as the ‘Foundations Paper.’
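One way to see the connection to likelihoodism in the probabilistic setting (reading “raises the posterior by more” multiplicatively, which is a gloss for illustration rather than the paper’s exact formulation): by Bayes’ rule,

$$\frac{\Pr(H_i \mid E)}{\Pr(H_i)} \;=\; \frac{\Pr(E \mid H_i)}{\Pr(E)},
\qquad\text{so}\qquad
\frac{\Pr(H_1 \mid E)}{\Pr(H_1)} > \frac{\Pr(H_2 \mid E)}{\Pr(H_2)}
\;\Longleftrightarrow\;
\Pr(E \mid H_1) > \Pr(E \mid H_2),$$

and the right-hand condition makes no reference to the prior, which is exactly the likelihoodist comparison.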

Privacy for Correlated Data

Differential privacy (DP) is the gold standard for provable database privacy guarantees. In this work, we investigate a stronger privacy definition, Bayesian differential privacy (BDP), which is needed to adequately protect correlated data. We provide optimal mechanisms for BDP when data correlations can be modeled by a Markov chain.

Though DP is a popular choice in many settings, we argue that DP either does not apply or provides insufficient guarantees when the database is correlated (and this correlation structure is easily inferable). Unfortunately, databases almost always store correlated data (examples include location data, medical data, power grids, and social networks), and the correlation models can often be learned from historical data, so they should be assumed to be public knowledge. We thus investigate a stronger notion of privacy, BDP, which offers strong guarantees even when adversaries know the correlation structure.

We provide optimal mechanisms (in terms of noise-privacy tradeoffs) for achieving BDP with Markov chain data. Our mechanism is non-interactive, since it outputs a sanitized database, and local, since it doesn’t require a centralized data curator. We also experiment on real world heart rate data, demonstrating that our mechanism is robust to somewhat varying correlation models.

Trial and Error for Hidden Graph Properties

We investigate a novel property testing setting, where access to a graph is hidden behind an oracle. In this setting, some graphs can never be fully learned, even with infinitely many queries. Despite this, some properties are efficiently certifiable; we provide a reduction to the standard (non-hidden) setting and use matroids to give sufficient conditions for a property to be efficiently certifiable in this hidden setting.

In the standard property testing setting, researchers are interested in sublinear algorithms, since one can trivially learn the full input in linear time. In our setting, we fix the number of vertices and consider properties that are monotone increasing (in the number of edges), such as connectivity. Such properties are fully defined by their certificates: minimal subgraphs that have the property (e.g., spanning trees for connectivity). While the number of vertices is known, the graph must be accessed through an oracle, which replies to a proposed certificate with the lexicographically first proposed edge that doesn’t exist in the graph (or $\text{SAT}$ if the certificate is valid). The algorithm must proceed in a trial-and-error fashion, guessing certificates and gaining a small amount of information with each query. However, because of the lex-first nature of the oracle, there are examples where one cannot fully learn the graph, even with infinitely many queries.
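To make the oracle interaction concrete, here is a toy sketch for connectivity, where certificates are spanning trees. The naive guess-and-retry loop is only meant to illustrate the setting; it is not the algorithm from the paper, and the hidden graph is made up.

```python
# Toy illustration of the hidden-graph oracle for connectivity.
from itertools import combinations

import networkx as nx

def oracle(hidden_edges, proposal):
    """Return the lexicographically first proposed edge missing from the hidden
    graph, or 'SAT' if every proposed edge exists."""
    missing = sorted(e for e in proposal if e not in hidden_edges)
    return missing[0] if missing else "SAT"

def certify_connectivity(n, hidden_edges):
    known_absent, queries = set(), 0
    while True:
        # Candidate graph: every pair we have not yet learned to be absent.
        candidate = nx.Graph()
        candidate.add_nodes_from(range(n))
        candidate.add_edges_from(e for e in combinations(range(n), 2)
                                 if e not in known_absent)
        if not nx.is_connected(candidate):
            return None, queries                     # no spanning tree can exist
        proposal = {tuple(sorted(e))
                    for e in nx.minimum_spanning_edges(candidate, data=False)}
        queries += 1
        reply = oracle(hidden_edges, proposal)
        if reply == "SAT":
            return proposal, queries
        known_absent.add(reply)                      # learned one absent edge

if __name__ == "__main__":
    hidden = {(0, 1), (1, 2), (2, 3), (0, 3)}        # hypothetical hidden graph: a 4-cycle
    print(certify_connectivity(4, hidden))
```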

We develop a transfer theorem for this setting by providing a poly-time reduction to the problem of certificate extension (in the non-hidden setting), which asks whether a given set of edges can be extended to a valid certificate. This problem is easy for some properties (connectivity) but NP-hard for others (directed path). We then give sufficient conditions for a property to be efficiently extensible by asking whether its certificates form the bases of a matroid.

Modelling Altruism in Kidney Exchanges

We conducted preliminary (theoretical) research on the effects of altruistic donors in kidney exchanges, which were modelled as dynamic matching markets. This work is especially relevant in light of schemes to encourage altruism, such as the National Kidney Registry’s voucher program.

Single Document Summarization

For my senior capstone, we compared neural and combinatorial approaches to extractive single document summarization, where sentences from the document are concatenated into a summary. Our neural approaches were constructed from recent literature, and our combinatorial approaches reduced to MaxSAT or ILP. We also built a web-based visualizer to compare the summaries generated by different models.
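For a flavor of the combinatorial side, here is a minimal ILP sketch of extractive sentence selection under a length budget, using the PuLP solver. The scores, lengths, and budget are made up, and this is a generic illustration rather than necessarily the formulation we used.

```python
# Minimal ILP: pick sentences maximizing total relevance, subject to the
# summary fitting a length budget. All numbers are hypothetical.
import pulp

scores = [3.0, 1.5, 2.2, 0.7]    # hypothetical per-sentence relevance scores
lengths = [20, 12, 18, 9]        # sentence lengths in tokens
budget = 40                      # maximum summary length

prob = pulp.LpProblem("extractive_summary", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(len(scores))]
prob += pulp.lpSum(s * xi for s, xi in zip(scores, x))              # objective
prob += pulp.lpSum(l * xi for l, xi in zip(lengths, x)) <= budget   # length budget
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("selected sentences:", [i for i, xi in enumerate(x) if xi.value() > 0.5])
```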

A Strong Bridge Principle for Logical Normativity

This paper seeks to answer the question “In what sense does logic provide principles for what we ought to believe?” I argue in favor of a strong “bridge” principle that connects logical laws to rationality.

Tech Policy

I spent a year at UW’s Tech Policy Lab, where I synthesized security/privacy research around new and upcoming technologies for legal and policy audiences. The areas I focused on were autonomous vehicles, IoT devices, and cell-site simulators (devices of dubious legality that law enforcement officers use to monitor local cell traffic). We published a paper on the attitudes of children and parents towards internet-connected toys.

Teaching

CS 4830: Introduction to Cryptography

CS 4820: Introduction to Analysis of Algorithms

CSEP 590B/CSE 490Z: Incentives in Computer Science

CSE 490C: Cryptography

CSE 311: Foundations of Computing I

CSE 421: Introduction to Algorithms

I was a teaching assistant for this course three times: Spring ‘18, Winter ‘19, and Spring ‘19.