03-03-2023 17:40 via tech.slashdot.org

Microsoft Unveils AI Model That Understands Image Content, Solves Visual Puzzles

Researchers from Microsoft have introduced Kosmos-1, a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions. From a report: The researchers believe multimodal AI -- which integrates different modes of input such as text, audio, images, and video -- is a key step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human.