Version: 2024.1

Hugging Face Image To Text

This action runs at the asset level. It sends the selected assets to a configurable Hugging Face endpoint that extracts text from the images, then stores the extracted text in each asset's metadata.
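To make the round trip concrete, here is a minimal Python sketch of what such a call might look like. It is not the action's actual implementation, which this page does not show. It assumes the endpoint accepts raw image bytes in a POST request with a bearer token and answers with a JSON list such as `[{"generated_text": "..."}]`. The function names and the `api_token` parameter are hypothetical.

```python
import json
import urllib.request

# Example endpoint taken from the configuration on this page.
MODEL_ENDPOINT = (
    "https://api-inference.huggingface.co/models/"
    "Salesforce/blip-image-captioning-base"
)


def build_request(image_bytes: bytes, endpoint: str, api_token: str) -> urllib.request.Request:
    """Build a POST request that carries the raw image bytes."""
    return urllib.request.Request(
        endpoint,
        data=image_bytes,
        headers={
            # Assumption: the endpoint expects a bearer token.
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


def parse_generated_text(response_body: bytes) -> str:
    """Extract the text from a response shaped like [{"generated_text": "..."}]."""
    payload = json.loads(response_body)
    if (
        not isinstance(payload, list)
        or not payload
        or not isinstance(payload[0], dict)
        or "generated_text" not in payload[0]
    ):
        raise ValueError(f"Unexpected response shape: {payload!r}")
    return payload[0]["generated_text"].strip()


def extract_text(image_bytes: bytes, endpoint: str, api_token: str) -> str:
    """Send the image to the endpoint and return the extracted text."""
    request = build_request(image_bytes, endpoint, api_token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_generated_text(response.read())
```

Building the request and parsing the response are kept in separate functions, so both can be checked without network access. The value returned by `extract_text` corresponds to what the action would write into the configured metadata field.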

Configuration Options

# Name of the metadata field in which the extracted text will be stored.
meta_data_field_name: 'extracted_text'

# Language of the metadata field.
meta_data_field_language: 'en'

# Name of a thumbnail configuration to be used.
asset_thumbnail_configuration_name: 'content'

# Endpoint URL of the Hugging Face model to use.
model_endpoint: 'https://api-inference.huggingface.co/models/Salesforce/blip-image-captioning-base'

Detailed Configuration Options

  • meta_data_field_name: Required string. Name of the metadata field in which the extracted text will be stored. If the field does not exist, it will be created.
  • meta_data_field_language: Optional string. Language of the metadata field in which the extracted text will be stored.
  • asset_thumbnail_configuration_name: Optional string. Name of a thumbnail configuration to be used. If set, the thumbnail is sent to the endpoint instead of the original image.
  • model_endpoint: Required string. Endpoint URL of the Hugging Face model to use.
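
The required and optional rules above can be checked before the action is run. The following Python sketch is a hypothetical helper, not part of the product. It assumes the configuration has already been parsed into a dictionary, and it treats an empty string in a required field as missing.

```python
# Option names taken from the list above.
REQUIRED_OPTIONS = ("meta_data_field_name", "model_endpoint")
OPTIONAL_OPTIONS = ("meta_data_field_language", "asset_thumbnail_configuration_name")


def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means the config is valid."""
    errors = []
    for key in REQUIRED_OPTIONS:
        value = config.get(key)
        # Assumption: an empty string counts as missing.
        if not isinstance(value, str) or not value:
            errors.append(f"{key}: required non-empty string")
    for key in OPTIONAL_OPTIONS:
        # Optional options may be absent, but must be strings when present.
        if key in config and not isinstance(config[key], str):
            errors.append(f"{key}: must be a string when set")
    return errors
```

For example, a configuration that sets only `meta_data_field_name` and `model_endpoint` passes the check, because both optional options may be left out.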