Version: 2023.3

Indexing Details

Index Structure

All data delivered by Simple REST endpoints is indexed in Elasticsearch indices. Queries and data delivery run directly against Elasticsearch, not against the Pimcore database.

For each Datahub configuration, separate Elasticsearch indices are created and updated.

Indexing takes place asynchronously via an update queue and the queue processing command datahub:simple-rest:process-queue. This command needs to be executed on a regular basis, e.g. every 5 minutes.
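One common way to run the command regularly is a cron job. The following crontab entry is a minimal sketch; the installation path /var/www/pimcore and the use of cron itself are assumptions, not part of the bundle, and should be adapted to the actual environment.

```crontab
# Assumption: Pimcore is installed in /var/www/pimcore and php is on the PATH.
# Process the Simple REST update queue every 5 minutes.
*/5 * * * * cd /var/www/pimcore && php bin/console datahub:simple-rest:process-queue
```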

Index mapping and queue filling take place automatically when Datahub configurations are created or updated. In addition, commands for index management are available (datahub:simple-rest:create-or-update-mapping, datahub:simple-rest:init-index).

Per endpoint, multiple indices are created - one for each dataobject class, one for dataobject folders, one for assets and one for asset folders.

For the embedded asset metadata (EXIF, XMP and IPTC), indexed dynamic objects are used. It might be necessary to turn off indexing for these objects in order to avoid indexing conflicts, or when the limit on the maximum number of fields is reached.

Indexing can be turned off in the bundle configuration, in the indexing_options area. The data is then still stored in the index and delivered via the endpoint, but it is not indexed.

Tree Hierarchy Management

The indexing process tries to keep a valid folder structure in the index. Based on the workspace settings, a combined parent folder is calculated. This combined parent folder might be a subfolder of the parent folder in the Pimcore folder structure, and all element paths are rewritten to it.

It is also possible that, due to workspace and data schema settings, gaps occur in the folder structure. In this case, the indexing process creates virtual folders to fill these gaps.

To update the whole index structure after changes, multiple runs of datahub:simple-rest:process-queue might be necessary, since additional items might be added to the queue during queue processing.
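Running the command several times in a row can be scripted. The helper below is a minimal sketch: run_passes is a hypothetical function name, and the number of passes is a guess that has to be chosen per project, since this page does not describe a way to ask the command whether the queue is empty.

```shell
#!/bin/sh
# Hypothetical helper (not part of the bundle): run a command a fixed
# number of times and stop early if one run fails.
# Usage: run_passes <number-of-passes> <command> [args...]
run_passes() {
    passes="$1"
    shift
    i=1
    while [ "$i" -le "$passes" ]; do
        echo "pass $i of $passes"
        # Run the given command; abort on the first failing pass.
        "$@" || return 1
        i=$((i + 1))
    done
}
```

From the Pimcore project root, a call could look like run_passes 3 php bin/console datahub:simple-rest:process-queue, where 3 is an arbitrary choice.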

Indexing Options

Detailed indexing options can be configured via Symfony configuration.

Global Options

Define some global options like automatic data type detection for dynamic objects.

pimcore_data_hub_simple_rest:
    indexing_options:
        global_options:

            # Enable numeric detection for dynamic objects (like embedded asset meta data, etc.)
            numeric_detection: false

            # Enable date detection for dynamic objects (like embedded asset meta data, etc.)
            date_detection: true

Assets

Define whether embedded metadata of assets should be indexed. If so, it is indexed as dynamic objects.

pimcore_data_hub_simple_rest:
    indexing_options:
        assets:

            # Enable indexing for exif data
            enable_exif: true

            # Enable indexing for xmp data
            enable_xmp: true

            # Enable indexing for iptc data
            enable_iptc: true

Number of Shards

By default, the number of shards per index is set to 1. The configuration allows changing this for all created indices via default_number and, if needed, per index via the index_specific setting.

# Configure number of shards for created indices
pimcore_data_hub_simple_rest:
    indexing_options:
        number_of_shards_config:

            # default number is picked if no index specific setting is set
            default_number: 3

            # Define number of shards for certain indices. Define index name (without -odd/-even postfix) as key, and number of shards as value.
            index_specific:
                enterprise_simple_rest__pt_rest__asset: 5
                enterprise_simple_rest__<ENDPOINTNAME>__<asset|DataObjectClass>: 3
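Since the index_specific keys are index names without the -odd/-even postfix, the key can be derived from a concrete index name as it appears in Elasticsearch. The helper below is a small sketch of that string handling; shard_config_key is a hypothetical name and not part of the bundle.

```shell
#!/bin/sh
# Hypothetical helper: turn a concrete index name into the key expected
# under index_specific by dropping a trailing -odd or -even postfix.
shard_config_key() {
    case "$1" in
        *-odd)  printf '%s\n' "${1%-odd}" ;;
        *-even) printf '%s\n' "${1%-even}" ;;
        *)      printf '%s\n' "$1" ;;
    esac
}
```

For example, shard_config_key enterprise_simple_rest__pt_rest__asset-odd yields enterprise_simple_rest__pt_rest__asset, which matches the key used in the configuration above.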