We are a compact and hardcore research team focused on harnessing the power of Multimodal Agents. Our goal is to develop these agents into tools that are helpful, reliable, and fully automated. Achieving this requires a multifaceted approach spanning a broad range of technologies, including:
  1. Trustworthy AI - Ensuring our models behave responsibly.
  2. Self-Evolving Machines - Allowing models to improve on their own.
  3. Compositional Agents - Creating systems that can dynamically assemble multiple models to collectively tackle complex tasks.
  4. Controllable Image and Video Generation - Developing methods to generate visual content that is both high-quality and easily directed by user input.


Is Your Multimodal Large Language Model Oversensitive to Safe Queries?

Humans are prone to cognitive distortions: biased thinking patterns that lead to exaggerated responses to specific stimuli, albeit in very different contexts. This paper demonstrates that advanced MLLMs exhibit similar tendencies. While these models are designed to respond to queries under safety mechanisms, they sometimes reject harmless queries in the presence of certain visual stimuli, disregarding the benign nature of their contexts.

To systematically evaluate MLLMs' oversensitivity to these stimuli, we propose the Multimodal OverSenSitivity Benchmark (MOSSBench).


At TurningPoint AI, there are no ranks or limits - only opportunities. We prioritize transparency, fairness, and integrity. Active contribution and team spirit are fundamental to our team culture. For research assistants and collaboration opportunities:

