About

Hello and welcome to my portfolio. I am currently a fifth-year PhD student in Computer Science at Cornell Tech, where I am advised by Rafael Pass and Joe Halpern. Previously, I completed my M.S. in Computer Science at the University of Washington under Anna Karlin’s supervision.

My primary research interest is solving incentive problems in blockchain systems. For example, I’ve worked on pricing lending pools fairly and analyzing transaction fee mechanisms in terms of their social welfare. I’m comfortable both with proof-heavy approaches and with using simulations and empirical data to analyze mathematical models. Currently, much of my work blends incentive analysis with consensus, including my projects on timing games and multiple concurrent proposers (see below).

More broadly, I’m interested in the intersection of economics and computer science. I enjoy creating compelling and realistic mathematical models of behavior, and rigorously investigating the outcomes of these models. My Master’s thesis considered a model of how agents might procrastinate when faced with a difficult task, and used competition to alleviate the harms of procrastination. I’ve also worked more broadly in theoretical computer science, investigating strong privacy definitions to protect correlated data and coordination games on large social networks. In the past, I’ve worked on tech policy.

In addition to my CS interests, I’m also interested in epistemology and the philosophy of statistics. My main goal is to provide an “ethics-first” approach to epistemology, by grounding concepts like knowledge, evidence, and explanation in terms of obligations. As an example, we defined “evidence” in terms of what you have a duty to disclose. We showed that this definition is equivalent to several existing theories of statistical evidence, even in settings with only weak, qualitative requirements on rationality. This suggests that existing theories of statistical evidence may have foundational links to ethics.

Interests

  • EconCS
  • Blockchains
  • Epistemology/Philosophy of Statistics

Education

  • M.S. in Computer Science, 2020

    University of Washington

  • B.S. in Computer Engineering, 2019

    University of Washington

Projects

Throughput vs. Censorship with Multiple (Rational) Concurrent Proposers

Standard blockchain consensus protocols have a single proposer for each round. In their round, this proposer has a monopoly on block production; they alone decide which transactions are included. Thus, this “monopolist” can easily censor transactions, either because of their own preferences or because they were bribed. For this reason, many blockchains are considering protocols with multiple concurrent proposers (MCP). This allows many proposers to build a single block: each constructs a sub-block separately, and the union of the sub-blocks forms the main block. We formalize a notion of economic censorship resistance (eCR), and show that MCP can significantly increase eCR as compared to a single proposer. However, with MCP, proposers can submit the same transaction, which decreases overall throughput. We design an algorithm to compute an equilibrium for the block-building and censorship process with MCP, and use it to evaluate the throughput vs. eCR tradeoffs under a variety of transaction fee mechanisms.
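
As a toy illustration of this block structure (the transactions and overlaps below are made up; they are not from our model or data), each proposer contributes a sub-block, the main block is the union, and duplicated transactions waste capacity:

```python
# Toy sketch of MCP block-building: each proposer submits a sub-block,
# the main block is the union, and duplicated transactions waste capacity.

def mcp_block(sub_blocks):
    """Union of sub-blocks; a duplicated transaction appears only once."""
    block, seen = [], set()
    for sub in sub_blocks:
        for tx in sub:
            if tx not in seen:
                seen.add(tx)
                block.append(tx)
    return block

# Three proposers each fill 3 slots; overlapping choices shrink the useful block.
sub_blocks = [["tx1", "tx2", "tx3"], ["tx2", "tx3", "tx4"], ["tx1", "tx4", "tx5"]]
block = mcp_block(sub_blocks)
slots = sum(len(s) for s in sub_blocks)
print(block)                                        # 5 distinct transactions
print(f"{len(block)}/{slots} slots carry distinct transactions")
```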

Curbing Spam Transactions with Priority Fee Ordering

Many high-throughput chains, such as the Ethereum L2 Base or Solana, have a high volume of “speculative backrunning” transactions, which search for and exploit arbitrage dynamically (on-chain). These transactions are sometimes considered spam: the vast majority fail to find any arbitrage, but still waste a lot of gas on searching. Some believe that this represents a barrier to scaling; as block capacity grows, more of it will go to spammers, leaving less for other users. We develop a formal model that computes the expected spam volume given other parameters, such as user demand and block capacity. We find evidence that spam limits scaling, lowering user welfare by a constant fraction even as demand and capacity increase. We then show that priority fee ordering, as opposed to first-come, first-served sequencing, could dramatically lower the expected spam volume, preserving high user welfare.

The CoinAlg Bind: Profitability-Fairness Tradeoffs in Collective Investment Algorithms

We investigate trading bots and other collective investment strategies, and show that such systems face a fundamental dilemma: they can guarantee either economic fairness, by ensuring that the system is transparent, or profitability, but not both.

Collective Investment Algorithms (CoinAlgs) are increasingly popular systems that deploy shared trading strategies for investor communities. Their goal is to democratize sophisticated – often AI-based – investing tools. We identify and demonstrate a fundamental profitability-fairness tradeoff in CoinAlgs that we call the CoinAlg Bind: CoinAlgs cannot ensure economic fairness without losing profit to arbitrage. We present a formal model of CoinAlgs, with definitions of privacy (incomplete algorithm disclosure) and economic fairness (value extraction by an adversarial insider).

We prove two complementary results that together demonstrate the CoinAlg Bind. First, privacy in a CoinAlg is a precondition for insider attacks on economic fairness. Conversely, in a game-theoretic model, lack of privacy, i.e., transparency, enables arbitrageurs to erode the profitability of a CoinAlg. Using data from Uniswap, a decentralized exchange, we empirically study both sides of the CoinAlg Bind. We quantify the impact of arbitrage against transparent CoinAlgs. We show the risks posed by a private CoinAlg: even low-bandwidth covert-channel information leakage enables unfair value extraction.

Timing Games in Responsive Consensus Protocols

In timing games, block proposers intentionally delay their proposals to extract more MEV. While they cause some issues on Ethereum, timing games are much worse in responsive consensus protocols, potentially defeating responsiveness entirely. We introduce a dynamic block reward, with higher rewards going to validators with shorter round durations; these durations are measured by votes from the other validators. We show that proposing early is a Nash equilibrium in this mechanism, and prove coalition-resistance guarantees. Much of the coalition analysis hinges on the fact that validators want a higher rate of reward, so others’ delays hurt them. Thus, our work shows a surprising result: rather than responsiveness being an unattainable property due to timing games, responsiveness itself can promote faster block proposals.

Block proposers can extract more MEV by proposing later within their window, as more transactions appear and markets diverge. This gives rise to timing games, where block proposers try to wait as long as possible before proposing. On Ethereum, this results in a small number of missed blocks. But in a responsive protocol, the round durations are dynamic, with the new round beginning as soon as quorum is achieved on the current block. Timing games are much more problematic here, causing everyone to wait as long as possible to propose, thus defeating responsiveness.

We develop a model of timing games that reveals a prisoner’s dilemma structure. Cooperation (proposing promptly) is in the validators’ collective interest, as the overall system runs faster, increasing their rate of reward. But individual incentives encourage validators to delay proposals selfishly. This motivates our voting approach, as it allows validators to coordinate on the better equilibrium.
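
To make the dilemma concrete, here is a small numerical sketch with made-up parameters (the round durations, block rewards, and MEV values are illustrative only, not taken from our model or from real data): two validators alternate proposing, and each cares about its rate of reward, i.e., its own block’s reward divided by the total time the two rounds take.

```python
# Illustrative prisoner's dilemma for timing games (all numbers are made up).
# A prompt round takes 1 time unit and earns block reward + MEV = 2;
# a delayed round takes 2 time units and earns 3.5. Each validator's payoff is
# its own block's reward divided by the combined duration of both rounds.

DURATION = {"prompt": 1.0, "delay": 2.0}
REWARD = {"prompt": 2.0, "delay": 3.5}

def rate(me, other):
    return REWARD[me] / (DURATION[me] + DURATION[other])

for me in ("prompt", "delay"):
    for other in ("prompt", "delay"):
        print(f"I {me:6s}, they {other:6s}: my rate = {rate(me, other):.3f}")

# The printout shows the dilemma: delaying is the better response to either action
# (1.167 > 1.000 and 0.875 > 0.667), yet mutual delay (0.875 each) is worse for
# everyone than mutual promptness (1.000 each).
```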

We also analyze latency fairness. Ideally, everyone’s average rewards should be directly proportional to their stake. In reality, having lower latency to the rest of the network means you can potentially wait longer, or include more transactions in the same time period. This problem exists even with static block rewards, but does get worse for our dynamic block rewards. However, our analysis of real-world data suggests that the fairness degradation is very small.

Pricing Crypto Lending Pools with Options

Cryptocurrency lending pools offer overcollateralized loans, with one cryptocurrency held as collateral and a different currency given out as a loan. Because they are overcollateralized, these loans do not increase the borrower’s liquidity, and so should not be compared to standard loans. Rather, lending pools allow borrowers to increase their exposure to a particular currency, just as options allow investors to increase their exposure in the standard financial world. As a result, we build models to price the interest rates and other parameters of lending pools by replicating them as options and using ideas from options pricing.

Lending pool loans are overcollateralized, and if the value of the collateral dips too low, the entire loan is liquidated: the borrower cannot repay the loan to claim their collateral, which is instead sold off at a discount. Borrowers have the option to “top up” their loans at any time by depositing more collateral, which they retrieve when they repay the loan.

We first use the famous Black-Scholes model, which assumes that the price of the asset (i.e., the crypto) follows geometric Brownian motion, that the market is frictionless, and that trades are continuous. If we simplify lending pools by ignoring the ability of borrowers to top up their loans, we can replicate lending pool loans via (down-and-out) barrier options, which are options that are called off when the price of the asset dips too low. This gives us a simple model for pricing the interest rates and collateralization parameters of lending pools.
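
As a rough sketch of this replication, here is a simple Monte Carlo pricer for a down-and-out option under geometric Brownian motion. The parameters, and the choice to treat the (simplified, no-top-up) borrower’s position as a down-and-out call on the collateral struck at the repayment amount, are illustrative assumptions for this sketch, not our paper’s exact replication or calibration.

```python
import math, random

def down_and_out_call(s0, strike, barrier, r, sigma, T, n_paths=20_000, n_steps=250):
    """Monte Carlo price of a down-and-out call under geometric Brownian motion."""
    dt = T / n_steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    payoff_sum = 0.0
    for _ in range(n_paths):
        s, knocked_out = s0, False
        for _ in range(n_steps):
            s *= math.exp(drift + vol * random.gauss(0.0, 1.0))
            if s <= barrier:          # collateral dipped too low: the loan is liquidated
                knocked_out = True
                break
        if not knocked_out:
            payoff_sum += max(s - strike, 0.0)   # repay the loan, keep the upside
    return math.exp(-r * T) * payoff_sum / n_paths

# Illustrative numbers: collateral worth 1.5x the loan, liquidation barrier at 1.2x.
print(down_and_out_call(s0=1.5, strike=1.0, barrier=1.2, r=0.05, sigma=0.8, T=1.0))
```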

However, modeling top-ups requires us to deviate from the Black-Scholes model. In particular, top-ups are trivial if there is no discount factor. So, we augment the Black-Scholes model to allow for discounting, and implement simulations to price interest rates with top-ups. To do so, we consider loans that borrowers are allowed to top up a finite number of times. We show that such loans can be viewed as nested barrier options: a loan that can be topped up $k$ times is a barrier option whose rebate equals the value of the loan with only $k-1$ top-ups remaining (sketched below). We show, via simulations, that the values of these limited top-up options converge quickly.
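
Schematically (the notation here is ours and the discounting convention is simplified): if $V_k$ denotes the value of the position with $k$ top-ups remaining, $V_{\text{DO}}$ the plain down-and-out value with no rebate, and $\tau$ the first time the collateral hits the liquidation barrier, then

$$
V_k \;=\; V_{\text{DO}} \;+\; \mathbb{E}\!\left[\, e^{-r\tau}\, V_{k-1}(\text{state at } \tau)\, \mathbf{1}\{\tau \le T\} \,\right],
\qquad V_0 \;=\; V_{\text{DO}},
$$

so each level of top-up is a barrier option whose rebate is the value of the level below it.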

Finally, we study lending pools in practice, and show the discount rates induced by their interest rates, under our model.

Chunking Tasks for Present-Biased Agents

A well-known technique that instructors use to combat procrastination is to break up a large project into more manageable chunks. But what is the best way to do this? We formalize chunking within an existing model of present bias for task completion, derive algorithms for optimal chunking, and prove that a relatively small amount of chunking can ensure that biased agents behave optimally.

We build on a model originally proposed by Jon Kleinberg and Sigal Oren [KO14]. In their model, tasks are represented by directed, acyclic graphs with a designated start node $s$ and end node $t$. The agent traverses a shortest $s\to t$ path with one twist: when evaluating the cost of a path, they multiply the cost of the first edge by their bias parameter, $b$. They traverse the first step of this biased path, and then recompute the best path (a small sketch of this traversal rule appears below). Existing results show that these biased agents can take exponentially more costly paths through a given graph.
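
Here is a minimal sketch of this traversal rule on a made-up three-node task graph (the graph and bias value are illustrative; the code is a toy, not from either paper):

```python
# Kleinberg-Oren present bias, sketched: at each node the agent inflates the cost of
# each *immediate* edge by its bias b, adds the true shortest-path cost of the rest,
# takes one step, and then re-plans.

# Toy task graph: the cheapest s->t path is the direct edge (cost 8), but a biased
# agent is lured onto s->a (cost 2) and then must pay a->t (cost 10).
GRAPH = {
    "s": {"t": 8, "a": 2},
    "a": {"t": 10},
    "t": {},
}

def true_dist(node, target="t"):
    """Exact shortest-path cost in the DAG (fine for tiny examples)."""
    if node == target:
        return 0
    return min(c + true_dist(nxt, target) for nxt, c in GRAPH[node].items())

def biased_walk(bias, start="s", target="t"):
    """Path actually taken by an agent that inflates the immediate edge cost by `bias`."""
    path, total, node = [start], 0, start
    while node != target:
        nxt, cost = min(
            GRAPH[node].items(),
            key=lambda e: bias * e[1] + true_dist(e[0], target),
        )
        path.append(nxt)
        total += cost
        node = nxt
    return path, total

print("optimal cost:", true_dist("s"))          # 8, via the direct edge
print("biased agent:", biased_walk(bias=2.0))   # (['s', 'a', 't'], 12)
```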

In our model, the task designer can chunk edges into smaller pieces. We show that, for edges along the shortest path, the optimal way to chunk an edge is to make the initial pieces easier and later pieces progressively harder. For edges not along the shortest path, optimal chunking is significantly more complex, but we provide an efficient algorithm. We also show that, with a linear number of chunks on each edge, the biased agent’s cost drops from potentially exponentially worse to within a constant factor of the cost of the true cheapest path. Finally, we show how to optimally chunk task graphs for multiple types of agents simultaneously.

An “Ethics-First” Approach to Evidence

There are many ethical norms surrounding evidence. Evidence determines when researchers should halt drug trials, and what pharmaceutical companies are obliged to disclose to the public. Prosecutors and witnesses are obliged to disclose certain types of evidence in court trials, and jury members are obliged to be responsive to evidence when rendering their verdicts. But scientific evidence is often defined in purely statistical terms. We bridge this gap by deriving theories of statistical evidence from ethical premises. Specifically, we argue that several theories of statistical evidence, including likelihoodism, Bayesian confirmation theory, and Robust Bayesianism (see below), answer the ethical question, “When is a scientist obliged to disclose experimental data or analysis?” In the future, we hope to extend our “ethics-first” approach to other epistemological concepts, such as knowledge or confirmation.

Present Bias in Competitive Task Completion

Can competition eliminate the harms of procrastination? We augment an existing model of present bias in task completion with a unique competitive twist, and find that a small amount of competition alleviates the significant harms of procrastination.

We build on a model originally proposed by Jon Kleinberg and Sigal Oren [KO14]. In their model, tasks are represented by directed, acyclic graphs with a designated start node $s$ and end node $t$. The agent traverses a shortest $s\to t$ path with one twist: when evaluating the cost of a path, they multiply the cost of the first edge by their bias parameter, $b$. They traverse the first step of this biased path, and then recompute the best path. Existing results show that these biased agents can take exponentially more costly paths through a given graph.

We augment this model by having two agents compete to finish the task first, with the winner getting a reward. We show that, for any graph, a very small amount of reward convinces biased agents to behave optimally, even when their natural behavior would have exponentially high cost. The amount of reward needed to guarantee optimal behavior with competition is also significantly less (in general) than several non-competitive reward schemes in the existing literature.

Robust Bayesianism: A New Theory of Statistical Evidence

We investigate a new theory of statistical evidence that analyzes likelihood-based methods through a Bayesian lens. Our theory better clarifies the objectivity of evidence, and takes the view that evidence primarily serves as a tool to convince a community of the truth of some hypothesis. Our theory is technically interesting because it’s compatible with weaker forms of rationality. Specifically, we work with a mathematical framework of qualitative, conditional probability, rather than quantitative degrees of belief.

We start from the perspective that evidence is inherently social: its primary purpose is persuasion. We thus formulate principles that essentially state that $E$ is good evidence for hypothesis $H_1$ over $H_2$ if it raises one’s posterior in $H_1$ by more than one’s posterior in $H_2$, regardless of one’s prior (it thus convinces all Bayesian agents). Our theory is closely related to likelihoodism. We show that likelihoodist principles are equivalent to our Robust Bayesian principles in the standard setting, where agents are assumed to have probabilistic beliefs. However, our theory generalizes to a setting of qualitative belief functions, and in this broader setting only some likelihoodist principles have analogs, suggesting that the principles which don’t generalize may simply be modeling artifacts. In a related line of work, we argue that social norms of rationality better justify some aspects of Bayesianism, especially in scientific contexts. That paper is also listed below as the ‘Foundations Paper’.
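
One way to read the “raises by more” condition is as a ratio boost, in which case Bayes’ rule makes the connection to likelihoodism immediate (this is a paraphrase of the idea, not necessarily the paper’s exact formulation). Assuming the likelihoods $\Pr(E \mid H_i)$ are fixed by the hypotheses and all quantities are positive, $\Pr(H_i \mid E)/\Pr(H_i) = \Pr(E \mid H_i)/\Pr(E)$ for every prior $\Pr$, so

$$
\frac{\Pr(H_1 \mid E)}{\Pr(H_1)} \;>\; \frac{\Pr(H_2 \mid E)}{\Pr(H_2)} \;\;\text{for every prior } \Pr
\qquad\Longleftrightarrow\qquad
\Pr(E \mid H_1) \;>\; \Pr(E \mid H_2),
$$

which is exactly the likelihoodist “law of likelihood.” The qualitative setting replaces these numerical comparisons with comparative judgments of belief.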

Privacy for Correlated Data

Differential privacy (DP) is the gold standard for provable database privacy guarantees. In this work, we investigate a stronger privacy definition, Bayesian differential privacy (BDP), which is necessary for correlated data. We provide optimal mechanisms for BDP when data correlations can be modeled by a Markov chain.

Though DP is a popular choice in many settings, we argue that DP either does not apply or provides insufficient guarantees when the database is correlated (and this correlation structure is easily inferrable). Unfortunately, databases almost always store correlated data (examples include location data, medical data, power grids, and social networks), and the correlation models are often easily learned from historical data, and so should be assumed to be public knowledge. We thus investigate a stronger notion of privacy, BDP, which offers strong guarantees even when adversaries know the correlation structure.
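
For intuition, guarantees of this flavor are typically stated roughly as follows (a schematic rendering in our notation, following the usual formulation of BDP; the precise definition we use may differ): for every record $i$, every pair of values $x_i, x_i'$, every set $K$ of records the adversary already knows, and every output $o$ of the mechanism $\mathcal{M}$,

$$
\left| \log \frac{\Pr\!\left[\mathcal{M}(X) = o \;\middle|\; X_i = x_i,\; X_K = x_K\right]}
                 {\Pr\!\left[\mathcal{M}(X) = o \;\middle|\; X_i = x_i',\; X_K = x_K\right]} \right|
\;\le\; \varepsilon,
$$

where the records outside $K \cup \{i\}$ are marginalized out according to the (public) correlation model, here a Markov chain. Standard DP takes these probabilities only over the mechanism’s own randomness, which is why it can leak information when $X_i$ is correlated with the rest of the database.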

We provide optimal mechanisms (in terms of noise-privacy tradeoffs) for achieving BDP with Markov chain data. Our mechanism is non-interactive, since it outputs a sanitized database, and local, since it doesn’t require a centralized data curator. We also experiment on real world heart rate data, demonstrating that our mechanism is robust to somewhat varying correlation models.

Trial and Error for Hidden Graph Properties

We investigate a novel property testing setting, where access to a graph is hidden behind an oracle. In this setting, some graphs can never be fully learned, even with infinitely many queries. Despite this, some properties are efficiently certifiable; we provide a reduction to the standard setting and use matroids to give sufficient conditions for a property to be efficiently certifiable in this hidden setting.

In the standard property testing setting, researchers are interested in sublinear algorithms, since one can trivially learn the full input in linear time. In our setting, we fix the number of vertices and consider properties which are monotone increasing (in the number of edges), such as connectivity. Such properties are fully defined by their certificates, minimal subgraphs which have the property (e.g., spanning trees for connectivity). While the number of vertices is known, the graph must be accessed through an oracle, which replies to a proposed certificate with the lexicographically first edge of that certificate that doesn’t exist in the graph (or $\text{SAT}$ if the certificate is valid); a toy version of this oracle is sketched below. The algorithm must proceed in a trial-and-error fashion, guessing certificates and getting a small amount of information with each query. However, because of the lex-first nature of the oracle, there are examples where one cannot fully learn the graph, even with infinitely many queries.
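
As a toy illustration of the oracle (the hidden graph and the queries below are made up for this sketch):

```python
# Lex-first oracle, sketched: a query is a candidate certificate (a set of edges).
# The oracle answers "SAT" if every queried edge is present in the hidden graph,
# and otherwise reveals only the lexicographically first queried edge that is missing.

HIDDEN_EDGES = {(0, 1), (1, 2), (2, 3), (0, 3)}   # hidden graph on vertices 0..3

def oracle(candidate_certificate):
    missing = sorted(e for e in candidate_certificate if e not in HIDDEN_EDGES)
    return "SAT" if not missing else missing[0]

# Trial and error: guess a spanning tree (a connectivity certificate), learn at most
# one missing edge per failed query, and guess again.
print(oracle({(0, 1), (1, 2), (2, 3)}))   # SAT: this spanning tree exists
print(oracle({(0, 1), (0, 2), (0, 3)}))   # (0, 2): the lex-first missing edge
```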

We develop a transfer theorem for this setting, by providing a poly-time reduction to the problem of certificate extension (in the non-hidden setting), which asks whether one can extend a given set of edges to a valid certificate. This problem is easy for some properties (connectivity) but NP-hard for others (directed path). We provide sufficient conditions for a property to be efficiently extensible by looking at whether its certificates form the bases of a matroid.

A Strong Bridge Principle for Logical Normativity

This paper seeks to answer the question “In what sense does logic provide principles for what we ought to believe?” I argue in favor of a strong “bridge” principle that connects logical laws to rationality.

Tech Policy

I spent a year at UW’s Tech Policy Lab, where I synthesized security/privacy research around new and upcoming technologies for legal and policy audiences. The areas I focused on were autonomous vehicles, IoT devices, and cell-site simulators (devices of dubious legality that law enforcement officers use to monitor local cell traffic). We published a paper on the attitudes of children and parents towards internet-connected toys.

Teaching

High School Research Mentorship

I’ve done 1-on-1 research mentoring for over 30 high school students, on a wide range of topics, including historical economics, machine learning, philosophy, and more. Several of my students have published papers and won science fairs.

CS 4830: Introduction to Cryptography

CS 4820: Introduction to Analysis of Algorithms

CSEP 590B/CSE 490Z: Incentives in Computer Science

CSE 490C: Cryptography

CSE 311: Foundations of Computing I

CSE 421: Introduction to Algorithms

I was a teaching assistant for this course three times: Spring ‘18, Winter ‘19, and Spring ‘19.