05/02/2026 1:45 PM
Where does Jason keep up on papers? New and exciting ones, not ones that are becoming classics or just in our subfield?
Lab Meeting:
Andy Presentation
Neural Tool Invocation via Learned Compression
How does the classification work? Is it just output in the response?
Are the tags actual tokens you put in or just outputs?
What is the prompt to enable tool use? Is it standard?
Are models fine-tuned to use a specific format for tool inputs and outputs?
Vigil only executes tools after the response ends. He doesn't know for sure whether production systems do this, but probably?
What LLMs have the ability to produce continuous tokens? How do they do it? A separate head?
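Not sure how this is done in practice, but one simple possibility is a separate linear head next to the standard LM head. Toy NumPy sketch, with all shapes and weights made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab, d_cont = 16, 100, 8  # illustrative sizes, not from any real model

# Hypothetical final hidden state for one output position.
h = rng.standard_normal(d_model)

# Standard LM head: logits over discrete tokens.
W_lm = rng.standard_normal((vocab, d_model)) / np.sqrt(d_model)
logits = W_lm @ h
next_token = int(np.argmax(logits))

# One conceivable "continuous token" head: a separate linear projection
# that regresses a real-valued vector instead of sampling from a softmax.
W_cont = rng.standard_normal((d_cont, d_model)) / np.sqrt(d_model)
cont_token = W_cont @ h  # continuous output, no discretization step
```

Whether any specific model actually does this, or trains the two heads jointly, is an open question from the talk.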
Is this like attending to a "tool token"? Contrastive learning resembles what happens inside attention. Perhaps you insert tokens into the last layer of the transformer, and if they are highly attended to, you run the tool and feed its value back as an output somehow? Probably not. How would you embed the entire output into a single token? Perhaps you don't need to? Insert new tokens partway through the transformer?
He clustered similar tools. Are they closer in the embedding space than dissimilar ones?
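One way to check that would be to compare mean within-cluster vs. cross-cluster cosine similarity. Toy sketch with made-up embeddings (the real tool embeddings would come from his model):

```python
import numpy as np

def mean_cosine(A, B):
    """Mean pairwise cosine similarity between rows of A and rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return float((A @ B.T).mean())

rng = np.random.default_rng(1)
center_a, center_b = rng.standard_normal((2, 32))
# Toy stand-ins for two tool clusters: points around different centers.
cluster_a = center_a + 0.1 * rng.standard_normal((5, 32))
cluster_b = center_b + 0.1 * rng.standard_normal((5, 32))

within = mean_cosine(cluster_a, cluster_a)
across = mean_cosine(cluster_a, cluster_b)
# If similar tools really cluster, within should exceed across.
```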
What loss is here?
He's using circle loss.
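For reference, a minimal NumPy sketch of circle loss as defined by Sun et al. (2020), over positive and negative similarity scores. The margin and scale values are illustrative defaults; how Andy applies it to tool embeddings wasn't specified:

```python
import numpy as np

def logsumexp(x):
    """Numerically stable log(sum(exp(x)))."""
    mx = np.max(x)
    return mx + np.log(np.sum(np.exp(x - mx)))

def circle_loss(sp, sn, m=0.25, gamma=256.0):
    """Circle loss over positive similarities sp and negative similarities sn.

    Each similarity gets its own adaptive weight, so far-from-optimum pairs
    are penalized more: optima are 1 + m (positives) and -m (negatives),
    decision boundaries are 1 - m and m.
    """
    sp, sn = np.asarray(sp, float), np.asarray(sn, float)
    ap = np.maximum(1 + m - sp, 0.0)        # adaptive positive weights
    an = np.maximum(sn + m, 0.0)            # adaptive negative weights
    logit_p = -gamma * ap * (sp - (1 - m))  # pull positives above 1 - m
    logit_n = gamma * an * (sn - m)         # push negatives below m
    # loss = log(1 + sum_n exp(logit_n) * sum_p exp(logit_p))
    return float(np.logaddexp(0.0, logsumexp(logit_n) + logsumexp(logit_p)))

well_separated = circle_loss([0.9, 0.95], [-0.8, -0.9])  # near zero
overlapping = circle_loss([0.1], [0.9])                  # large
```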
Do past works train on ToolBench, or is it used exclusively for evaluation?