Research

Our research starts from a single observation: the systems we use to find, retrieve, and organise information almost universally work by similarity. Given a query, find the most similar stored item. This assumption — that useful connections are similar connections — is so deeply embedded in modern AI that it's rarely questioned.

But it fails in exactly the situations where memory matters most. Stairs don't resemble a slip, yet one reliably evokes the other. A chord progression retrieves a person. The smell of sunscreen retrieves a holiday. In every case, the connection exists not because the two experiences are alike, but because they were experienced together. Temporal co-occurrence — simply being present in the same window of experience — is a universal associative signal that requires no annotation, no supervision, and no representational overlap between linked items.

We're building the theory, mechanisms, and systems that take this signal seriously.


Predictive Associative Memory: Retrieval Beyond Similarity Through Temporal Co-occurrence

The foundational paper. We propose a JEPA-style predictor trained on temporal co-occurrence within a continuous experience stream. The predictor learns to navigate the associative structure of an embedding space — retrieving items linked through shared experience rather than representational proximity.
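The core move can be illustrated in a few lines. The sketch below uses toy data and a closed-form linear predictor standing in for the trained JEPA-style network; every name and number here is illustrative, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy experience stream: each anchor a_i temporally co-occurs with an
# associate b_i, but the two are drawn independently, so their cosine
# similarity is near zero (no representational overlap).
d, n = 32, 20
A = rng.normal(size=(n, d))
B = rng.normal(size=(n, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)
B /= np.linalg.norm(B, axis=1, keepdims=True)

# Fit a linear predictor W mapping each anchor to its co-occurring item
# (closed-form least squares, standing in for gradient training).
W, *_ = np.linalg.lstsq(A, B, rcond=None)

# Retrieval: nearest stored item to the *predicted* embedding,
# not to the query embedding itself.
pred = A @ W
recall_at_1 = (np.argmax(pred @ B.T, axis=1) == np.arange(n)).mean()

# Direct anchor-to-associate cosine similarity stays near zero:
# similarity alone could never recover these links.
direct_cosine = (A * B).sum(axis=1).mean()
```

The predictor recovers every association even though the linked embeddings share no geometry, which is the whole point: the signal lives in co-occurrence, not in the representation.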

On a synthetic benchmark, the predictor's top retrieval is a true temporal associate 97% of the time. It recovers associations across representational boundaries where cosine similarity scores zero. A temporal shuffle control confirms the signal is genuine co-occurrence structure, not embedding geometry. A held-out evaluation confirms anchor-specific recall: the predictor remembers what it experienced, from the perspectives at which it experienced it.

The key distinction is between retrieval (finding relevant items, where generalisation is the goal) and recall (reactivating specific associations formed by specific experience, where faithful memorisation is the goal). PAM is a recall system. It doesn't generalise to associations it never formed — and this is correct behaviour, not a failure mode.

Association-Augmented Retrieval: Learning Corpus-Specific Associations for Multi-Hop Retrieval

The applied paper. We operationalise the PAM insight for a practical problem: multi-hop question answering, where a retrieval system must find passages that are associatively related through shared reasoning chains, not just similar to the query.

AAR trains a lightweight MLP (4.2M parameters) on passage co-occurrence and uses it to rerank dense retrieval results. On HotpotQA, it improves Recall@5 by 8.6 points without evaluation-set tuning, with gains concentrated on the hardest questions (+28.5 points where the dense baseline fails). On MuSiQue, it improves Recall@5 by 10.1 points.
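The reranking step itself is a simple blend of two score lists. The sketch below is a minimal illustration; the interpolation weight `lam` and the score interface are assumptions, not the paper's exact formulation.

```python
import numpy as np

def rerank(dense_scores, assoc_scores, lam=0.5, k=5):
    """Blend dense-retrieval scores with learned association scores
    and return the indices of the top-k passages under the blend."""
    combined = np.asarray(dense_scores) + lam * np.asarray(assoc_scores)
    return np.argsort(-combined)[:k].tolist()

# Passage 2 looks dissimilar to the query (lowest dense score) but is
# strongly associated with it, so it rises to the top after reranking.
dense = [0.90, 0.80, 0.10]
assoc = [0.00, 0.00, 2.00]
order = rerank(dense, assoc, lam=0.5, k=3)
```

This is exactly the failure mode the method targets: the supporting passage for the second hop of a question often shares little surface similarity with the query, so a pure dense ranking buries it.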

Four ablations tell a consistent story. Training on similar-but-not-associated pairs degrades retrieval, confirming that association and similarity are distinct signals. An inductive variant shows no significant improvement: the method captures corpus-specific co-occurrences, not transferable patterns. The method is also cheap, adding 3.7 ms per query, training in under two minutes, and requiring no LLM-based indexing.

Cross-domain biological validation on gene perturbation data (Replogle K562 + STRING protein interactions) shows the same principle works beyond text: cross-boundary AUC improves from 0.534 to 0.902 at high confidence. Inductive transfer partially works in biology (+0.152), unlike text — suggesting a spectrum of association transferability from fully contingent to physically grounded.

Confidence-Weighted Plasticity

A reliability-weighted learning mechanism where component adaptability is determined by predictive accuracy. Plasticity reactivates automatically during distribution shifts — no external schedule required.
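A minimal scalar sketch of the idea, assuming a simple error-tracking gate; the class and parameter names are hypothetical, and the real mechanism operates per component of a larger model rather than on one scalar.

```python
class CWPUnit:
    """A scalar tracker whose learning rate is gated by its own recent
    prediction error: accurate components freeze, surprised ones adapt."""

    def __init__(self, base_lr=0.5, ema=0.9):
        self.w = 0.0          # current prediction
        self.err_ema = 1.0    # running estimate of prediction error
        self.base_lr = base_lr
        self.ema = ema

    def step(self, target):
        err = target - self.w
        # Low tracked error -> high confidence -> low plasticity.
        self.err_ema = self.ema * self.err_ema + (1 - self.ema) * abs(err)
        self.w += self.base_lr * min(1.0, self.err_ema) * err

unit = CWPUnit()
for _ in range(200):                      # stationary phase: the unit
    unit.step(1.0)                        # converges and plasticity decays
frozen_plasticity = unit.err_ema
for _ in range(5):                        # distribution shift: error spikes
    unit.step(-1.0)
reactivated_plasticity = unit.err_ema     # and plasticity reactivates
```

No external schedule touches the learning rate: the shift itself produces the error signal that unfreezes the unit, which is the behaviour the mechanism is designed to exhibit.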

Confidence-Weighted Plasticity: Experimental Validation and Boundary Conditions

Experimental validation confirming that the core CWP mechanism functions as intended, together with identified boundary conditions: in tightly coupled architectures, shared parameters cause the mechanism to malfunction.

Concept Discovery Through Predictive Associative Memory (Forthcoming)

The emergence paper. We train PAM at scale on 10,000 Project Gutenberg novels (24.96 million text chunks) and show that the compression mechanism discovers hierarchical narrative concepts without supervision.

At coarse granularity (k=50), the model separates broad narrative modes — verse from prose, action from reflection. At fine granularity (k=2000), it distinguishes specific narrative techniques — "urgent private conversation" from "formal social exchange." The same concept clusters appear across authors, genres, and centuries, suggesting that narrative structure has universal regularities that temporal co-occurrence alone can recover.

Held-out novels (never seen during training) receive coherent cluster assignments through inductive inference, demonstrating that the learned concepts generalise to new material. Authorial pacing signatures emerge naturally: Tolstoy sustains single narrative modes for hundreds of chunks; Joyce cycles through modes within paragraphs.
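A pacing signature reduces to run-length statistics over the sequence of cluster assignments. A sketch, with the function name and interface chosen here for illustration:

```python
import numpy as np

def pacing_signature(labels):
    """Mean run length of consecutive identical cluster labels: long
    runs indicate sustained narrative modes, short runs rapid cycling."""
    labels = np.asarray(labels)
    change_points = np.flatnonzero(np.diff(labels) != 0)
    run_lengths = np.diff(
        np.concatenate(([0], change_points + 1, [len(labels)]))
    )
    return float(run_lengths.mean())
```

On this measure, a "Tolstoy-like" label sequence that sustains a single mode for many chunks scores high, while a "Joyce-like" sequence that switches mode at every chunk scores 1.0.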


A core prediction of the programme is that temporal co-occurrence captures useful structure in any domain where items are experienced sequentially or contextually. We've tested this in three domains.

Text

Narrative passages in novels. PAM learns which passages serve similar structural functions across thousands of books. AAR learns which passages co-occur as supporting facts for multi-hop questions.

Biology

Gene perturbation profiles. The same contrastive architecture, trained on STRING protein-protein interactions instead of passage co-occurrence, learns to identify interacting gene pairs that expression similarity misses entirely. Where expression cosine similarity is near chance (0.534), the learned association function achieves 0.902 AUC.
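The evaluation reduces to a rank statistic over pair scores. A toy sketch with synthetic score distributions; the numbers below are illustrative stand-ins, not the paper's data.

```python
import numpy as np

def auc(pos, neg):
    """Probability that a random positive pair outscores a random
    negative pair (equivalent to ROC AUC)."""
    pos = np.asarray(pos)[:, None]
    neg = np.asarray(neg)[None, :]
    return float((pos > neg).mean() + 0.5 * (pos == neg).mean())

rng = np.random.default_rng(0)
n = 500
# Expression cosine similarity: interacting and non-interacting gene
# pairs draw from the same distribution, so the score is uninformative.
cos_auc = auc(rng.normal(0.0, 1.0, n), rng.normal(0.0, 1.0, n))
# A learned association score separates the two populations.
assoc_auc = auc(rng.normal(2.0, 1.0, n), rng.normal(0.0, 1.0, n))
```

The qualitative pattern matches the reported result: similarity sits at chance while the learned association function does not, because interacting genes need not have similar perturbation profiles.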

Legal case law

Identified as the next validation domain. Citation structure in legal corpora provides a natural co-occurrence signal analogous to passage co-occurrence in QA datasets.
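How that signal might be extracted can be sketched directly; the data format and function name below are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def cocitation_pairs(opinions):
    """opinions: mapping opinion_id -> set of cited case ids.
    Counts how often two cases are cited together by the same opinion,
    the analogue of two passages co-occurring in one reasoning chain."""
    counts = Counter()
    for cited in opinions.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

pairs = cocitation_pairs({
    "op1": {"A", "B", "C"},
    "op2": {"A", "B"},
})
```

Cases co-cited many times become positive training pairs, exactly as co-occurring supporting passages do in the QA setting, with no annotation beyond the citation graph itself.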

The consistent finding across domains: association and similarity produce different — often opposite — rankings, and association captures relationships that similarity systematically misses.