MM1 is a multimodal large language model, or MLLM, meaning it is trained on images as well as text. This allows the model to respond to text prompts and also answer complex questions about particular images.