Meet MMToM-QA: A Multimodal Theory of Mind Question Answering Benchmark
Understanding the Theory of Mind (ToM), the ability to grasp the thoughts and intentions of others, is crucial for developing machines with human-like social intelligence. Recent advancements in machine learning, especially with large language models, show some capability in ToM understanding. However, current ToM benchmarks primarily rely on either video or text datasets, neglecting the…