20/03/2026 12:09 PM
Advice from Porsche researchers:
Use customer support dataset for ambiguous question answering.
Sometimes you can get the gaming crowd to build crowdsourced datasets, because you can get companies to give them skins if they participate.
Worked for them. A lot cheaper than getting annotators.
Also just get people to play the game by offering a PS5 or something. Need to have moderation tools to catch cheaters in real time so that you don't just get a bunch of people submitting meaningless responses.
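Rough sketch of what the cheater check could look like, assuming each submission comes with a player ID, the response text, and the time taken. The `Submission` fields, the `flag_low_effort` name, and all thresholds are made up for illustration and would need tuning on real data. It runs over a batch; for real-time use it could be called on a sliding window of recent submissions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Submission:
    player_id: str        # hypothetical field names, not from any real tool
    text: str
    seconds_taken: float


def flag_low_effort(submissions, min_words=3, min_seconds=2.0, max_repeats=3):
    """Return {index: [reasons]} for submissions that look meaningless.

    All thresholds are placeholder assumptions.
    """
    # Normalize case and whitespace so trivial variations count as repeats.
    normalized = [" ".join(s.text.lower().split()) for s in submissions]
    # Count how often each player submits the exact same normalized text.
    repeat_counts = Counter(
        (s.player_id, norm) for s, norm in zip(submissions, normalized)
    )

    flags = {}
    for i, (s, norm) in enumerate(zip(submissions, normalized)):
        reasons = []
        if len(norm.split()) < min_words:
            reasons.append("too_short")
        if s.seconds_taken < min_seconds:
            reasons.append("too_fast")
        if repeat_counts[(s.player_id, norm)] > max_repeats:
            reasons.append("repeated")
        if reasons:
            flags[i] = reasons
    return flags
```

Flagged submissions would go to a human moderator rather than being dropped automatically, since short questions can still be legitimate.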
Maybe if it is multiplayer, players can judge each other as well. Maybe Turing test networks like Brent was presenting on yesterday?
They like the idea of using entailment to create a baseline. I think it would make sense to have human data as well as LLM-generated data, using LLMs fine-tuned on the human data.
Should have many tasks represented so that it is an interesting dataset.
How do we get the ambiguous questions in the first place?
https://scholar.google.com/citations?user=jee2Dy0AAAAJ&hl=en
Put people in situations where there will be ambiguity and have them wear glasses. Get the real distribution of questions people ask.
Even questions like "who is this in front of me".
Put vigil in the wild.
Have a guide and an executor. The executor has basically no idea what is going on and needs to rely on the guide to answer questions.
Need to have situations where multiple questions are asked?
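One possible way to log guide/executor sessions so that multi-question situations are easy to pull out later. This is only a sketch: the `Turn` and `Episode` names and their fields are invented here, not an agreed schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    question: str                  # asked by the executor
    answer: Optional[str] = None   # given by the guide; None until answered


@dataclass
class Episode:
    task: str                      # e.g. "Where is my phone"
    turns: List[Turn] = field(default_factory=list)

    def ask(self, question: str) -> Turn:
        """Record a new executor question."""
        turn = Turn(question=question)
        self.turns.append(turn)
        return turn

    def answer(self, text: str) -> bool:
        """Attach the guide's answer to the oldest unanswered question.

        Returns False if there was nothing left to answer.
        """
        for turn in self.turns:
            if turn.answer is None:
                turn.answer = text
                return True
        return False

    def is_multi_question(self) -> bool:
        """True if the executor needed more than one question."""
        return len(self.turns) > 1
```

Filtering episodes with `is_multi_question()` would give the subset where a single question was not enough.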
Tasks:
1. Tell me the name of this person
2. Social tasks in general
3. Go to a conference. Who is this?
4. Where is my phone?
Personalized LLM. Can we teach our LLM to speak like pre-K kids?
Mingqian Feng - Email about whether we can work together.
- [ ] @TODO Send this email and cc