I finally shipped a personal site. Felt overdue.
Here's what I'm working on right now.
Paper at ENASE 2026. Accepted. It's about the trust gap in AI agent pipelines: when an agent calls a tool, it blindly trusts the response. Whether that response comes from a compromised database, passed through a man-in-the-middle, or carries a prompt-injection payload, the agent has no way to tell. We propose using Authenticated Data Structures (the same cryptographic primitive that lets Bitcoin lightweight clients verify transactions) so agents can verify tool responses against a 32-byte digest. We demonstrate feasibility with ADS4All, our framework for building generic ADS in Haskell, and its Python port.
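To make the "verify against a 32-byte digest" idea concrete, here's a minimal Merkle-tree sketch of the kind of check an agent could run. This is illustrative only, not the ADS4All API; all function names here (`merkle_root`, `merkle_proof`, `verify`) are my own for this post.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Fold leaf data into a single 32-byte root digest."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node when the level is odd
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2 == 0))  # (sibling, am-I-left?)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the root from one leaf plus its proof; compare to the digest."""
    acc = h(leaf)
    for sibling, leaf_is_left in proof:
        acc = h(acc + sibling) if leaf_is_left else h(sibling + acc)
    return acc == root
```

The point is the asymmetry: the agent only needs to hold the 32-byte root; any single tool response can be checked with a logarithmic-size proof, and a tampered response fails the check.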
Building ProofTrail. This grew out of the research. ProofTrail is a decorator-based SDK that produces tamper-evident audit trails for AI agent tool calls. You annotate a tool function, and every call gets hashed (SHA-256), chained to the previous entry, and added to a Merkle tree. Any call can be independently verified against the tree root after the fact. The paper shows this adds about 210 microseconds of overhead even for large payloads. It started as a research artifact and is becoming a product.
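The hash-then-chain-then-Merkle mechanics can be sketched in a few lines. To be clear, this is not ProofTrail's actual API, just a toy version of the same idea; `AuditTrail`, `audited`, and `lookup_price` are hypothetical names I'm using for illustration.

```python
import functools
import hashlib
import json

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class AuditTrail:
    """Hash-chained log of tool calls; root() folds the entries into a Merkle root."""

    def __init__(self):
        self.entries = []            # chained entry hashes (the Merkle leaves)
        self._prev = b"\x00" * 32    # genesis link for the chain

    def record(self, name, args, kwargs, result) -> bytes:
        payload = json.dumps(
            {"tool": name, "args": args, "kwargs": kwargs, "result": result},
            sort_keys=True, default=str,
        ).encode()
        entry = sha256(self._prev + sha256(payload))  # chain to the previous entry
        self._prev = entry
        self.entries.append(entry)
        return entry

    def root(self) -> bytes:
        level = list(self.entries)
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])
            level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0] if level else b"\x00" * 32

trail = AuditTrail()

def audited(fn):
    """Decorator: record every call (args + result) in the tamper-evident trail."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        trail.record(fn.__name__, args, kwargs, result)
        return result
    return wrapper

@audited
def lookup_price(sku):
    return {"sku": sku, "price": 19.99}
```

Because each entry hashes the previous one, editing any historical call changes every later leaf and therefore the root, which is what makes the trail tamper-evident after the fact.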
Exploring lazy code synthesis. More speculative. What if LLM-generated code only materialized when it was actually called? You write a skeleton with type hints and a docstring, attach a contract (Hypothesis property-based tests), and the real implementation gets synthesized on demand. The LLM gets the actual runtime arguments as context, not just the signature. I have a working proof of concept but there's a long way to go.
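A rough sketch of the mechanism, with the LLM call mocked out: the decorator intercepts the first call, hands the function's metadata (and, in the real thing, the live arguments) to a synthesizer, checks the result against a contract, then caches the implementation. Everything here (`lazy`, `_fake_synthesize`, `slug_contract`) is a hypothetical stand-in for the actual proof of concept, and the contract is a single assertion rather than a Hypothesis suite.

```python
import functools

def _fake_synthesize(fn, args, kwargs):
    # Stand-in for the LLM call: the real version would get the signature,
    # docstring, and the actual runtime arguments as context. Here we just
    # return a canned implementation for the demo function.
    if fn.__name__ == "slugify":
        return lambda text: text.lower().replace(" ", "-")
    raise NotImplementedError(fn.__name__)

def lazy(contract=None, synthesize=_fake_synthesize):
    def decorate(fn):
        impl = None
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            nonlocal impl
            if impl is None:
                # Synthesize on first call, with real arguments in hand.
                impl = synthesize(fn, args, kwargs)
                if contract is not None:
                    contract(impl)  # in the real PoC: run property-based tests
            return impl(*args, **kwargs)
        return wrapper
    return decorate

def slug_contract(impl):
    # A single property check standing in for a Hypothesis test suite.
    assert impl("Hello World") == "hello-world"

@lazy(contract=slug_contract)
def slugify(text: str) -> str:
    """Lowercase `text` and replace spaces with hyphens."""
    ...
```

The interesting design question is the one the contract answers: since the body doesn't exist until runtime, the property tests are the only spec you can trust, so they have to run before the synthesized code ever serves a real call.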
The thread connecting all of this is cryptographic verification of computational processes. Whether it's agent tool calls, audit trails, or generated code, the question is the same: can you prove what actually happened?