
Read graphs, diagrams, tables, and scanned pages using multimodal prompts in Amazon Bedrock


Large language models (LLMs) have evolved from reading only text to reading and understanding graphs, diagrams, tables, and images. In this post, we discuss how to use LLMs from Amazon Bedrock to not only extract text, but also understand the information contained in images.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API. It also provides a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI.

Solution overview

In this post, we demonstrate how to use models on Amazon Bedrock to retrieve information from images, tables, and scanned documents. We provide the following examples:

  • Performing object classification and object detection tasks
  • Reading and querying graphs
  • Reading flowcharts and ...
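As a minimal sketch of how a multimodal prompt for these tasks might be assembled, the helper below builds a Converse API request that pairs an image with a text question, following the message shape the Amazon Bedrock Converse API expects. The model ID, region, and file name in the usage comment are illustrative assumptions, not values from the original post.

```python
def build_image_request(image_bytes: bytes, question: str,
                        image_format: str = "png") -> dict:
    """Build a Bedrock Converse API request body that pairs an
    image with a text question in a single user message."""
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    # Raw image bytes plus the question about the image
                    {"image": {"format": image_format,
                               "source": {"bytes": image_bytes}}},
                    {"text": question},
                ],
            }
        ]
    }

# With AWS credentials configured, the request could be sent like this
# (the model ID and file name below are examples, not from the post):
#
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   with open("chart.png", "rb") as f:
#       req = build_image_request(f.read(), "What trend does this graph show?")
#   resp = client.converse(
#       modelId="anthropic.claude-3-sonnet-20240229-v1:0", **req)
#   print(resp["output"]["message"]["content"][0]["text"])
```

The same message structure works for object classification, graph reading, and flowchart questions; only the question text and image bytes change between tasks.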

Copyright of this story solely belongs to aws.amazon.com (machine-learning).