Discussion about this post

Rohit Krishnan

Great essay btw, big fan of applying transformers to other domains. Meanwhile, I skimmed the code. I think your solve_policy_matrix solves the wrong linear system, so it's NK-like, not NK. It should be M = np.kron(R.T, A) + np.kron(I3, B) (+ instead of -). Also, since R is diagonal you could just solve column by column, which would be easier. The pooled VARX is also theta-agnostic btw.
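To make the column-by-column point concrete, here is a rough sketch. It assumes the system has the generic form A X R + B X = C, which is what the suggested M corresponds to under column-major vec; I haven't checked that this is exactly what solve_policy_matrix is meant to solve. The function names, the random A, B, C, and the values in r_diag are all made up for illustration.

```python
import numpy as np

def solve_kron(A, B, R, C):
    """Solve A X R + B X = C via vec(A X R) = (R.T kron A) vec(X), column-major vec."""
    n, k = C.shape
    M = np.kron(R.T, A) + np.kron(np.eye(k), B)
    x = np.linalg.solve(M, C.flatten(order="F"))
    return x.reshape((n, k), order="F")

def solve_columnwise(A, B, r_diag, C):
    """Same system when R = diag(r_diag): column j satisfies (r_j A + B) x_j = c_j."""
    X = np.empty_like(C, dtype=float)
    for j, r in enumerate(r_diag):
        X[:, j] = np.linalg.solve(r * A + B, C[:, j])
    return X

# Hypothetical inputs, only to check that the two routes agree.
rng = np.random.default_rng(0)
n, k = 3, 3
A = 0.5 * rng.normal(size=(n, n))
B = 3.0 * np.eye(n) + 0.5 * rng.normal(size=(n, n))
r_diag = np.array([0.9, 0.8, 0.7])
R = np.diag(r_diag)
C = rng.normal(size=(n, k))

X_kron = solve_kron(A, B, R, C)
X_col = solve_columnwise(A, B, r_diag, C)
```

The column-wise version does k small n-by-n solves instead of one (n·k)-by-(n·k) solve, and it sidesteps the vec ordering convention, which is an easy place for a sign or transpose slip.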

Also, thinking out loud: since the transformer takes theta and the history of innovations as input and produces yt as output, the mapping is linear. I wonder whether we should make it more complex, since we know transformers can learn linear-ish relations quite well anyway? The tests also check that the model runs btw, I assume that's intentional.
