Multi-Turn Multi-Modal Question Clarification for Enhanced Conversational Understanding
⬅️ [Recursive Language Models](<./Recursive Langauge Models.md>) | ⬆️ [Reading List](<./README.md>) | [Linguistic structure from a bottleneck on sequential information processing](<./Linguistic structure from a bottleneck on sequential information processing.md>) ➡️
https://arxiv.org/abs/2502.11442
Look further into the Melon (Yuan et al., 2024) dataset. The multi-turn dataset is constructed by concatenating multiple single-turn dialogs, so that each conversation contains several pieces of information that need to be clarified.
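A minimal sketch of what that concatenation might look like, assuming each single-turn dialog is an (ambiguous question, clarifying question, clarification) triple. The field names and dialog contents here are my assumptions, not the paper's actual schema:

```python
# Hypothetical sketch: build one multi-turn clarification conversation by
# concatenating single-turn dialogs. Field names are assumed, not taken
# from the paper's (unreleased) ClariMM data.

def build_multi_turn(single_turn_dialogs):
    """Flatten single-turn (question, clarifying question, clarification)
    triples into one role-tagged multi-turn conversation."""
    turns = []
    for d in single_turn_dialogs:
        turns.append({"role": "user", "content": d["question"]})
        turns.append({"role": "assistant", "content": d["clarifying_question"]})
        turns.append({"role": "user", "content": d["clarification"]})
    return turns

# Two made-up single-turn dialogs, each carrying one ambiguity.
dialogs = [
    {"question": "What is shown in the image?",
     "clarifying_question": "Which part of the image do you mean?",
     "clarification": "The object in the foreground."},
    {"question": "How big is it?",
     "clarifying_question": "In which unit should I answer?",
     "clarification": "In centimeters."},
]

conversation = build_multi_turn(dialogs)
# Each single-turn dialog contributes three turns, so two dialogs → six turns.
assert len(conversation) == 6
```

The point of the construction is just that a single training example now contains more than one clarification exchange, forcing the model to track several unresolved ambiguities across turns.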
They don't give examples of their dataset. I wish I knew what it actually consisted of. What goes into the model? They named the dataset ClariMM, but I can't find it anywhere.
They don't use RL. They just construct a multi-turn dataset from scratch and train on it.
They find a limited performance boost from asking clarification questions beyond the first one.
- [ ] @TODO Read this paper
- [ ] @TODO Check if the dataset from this paper is useful
⬅️ [Recursive Language Models](<./Recursive Langauge Models.md>) | ⬆️ [Reading List](<./README.md>) | [Linguistic structure from a bottleneck on sequential information processing](<./Linguistic structure from a bottleneck on sequential information processing.md>) ➡️