Metadata Ingestion - Incremental Extraction

The default Metadata Ingestion roughly follows these steps:
  1. Fetch all the information from the Source.
  2. Compare it with the information already in OpenMetadata and update the entities that changed.
  3. Compare it with the information already in OpenMetadata and delete the entities that no longer exist in the Source.
This approach is simple and works for most use cases, because every ingestion pipeline run captures the whole Source state. However, it fetches and compares a lot of data unnecessarily: if there were no structural changes, we already know there is nothing to update in OpenMetadata. The Incremental Extraction feature improves performance by reducing the extraction and comparison of unneeded data. How this is done depends heavily on the Source itself, but the general idea is to follow these steps:
  1. Fetch the last successful pipeline run.
  2. Add a small safety margin.
  3. Get all the structural changes since then.
  4. Flag deleted entities.
  5. Fetch/Compare only the entities with structural changes.
  6. Delete entities flagged for deletion.
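
The steps above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the connector implementation: the `Change` record, the `incremental_plan` function, and the one-day safety margin are all hypothetical, and each Source reports structural changes in its own way.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Change:
    """Hypothetical record of one structural change reported by the Source."""
    entity: str
    changed_at: datetime
    deleted: bool = False


def incremental_plan(last_success, changes, safety_margin=timedelta(days=1)):
    """Return (entities to fetch/compare, entities to delete)."""
    # Steps 1-2: start from the last successful run and step back by a safety margin.
    cutoff = last_success - safety_margin
    # Step 3: keep only the structural changes at or after the cutoff.
    recent = [c for c in changes if c.changed_at >= cutoff]
    # Keep the most recent change per entity, so a drop followed by a
    # re-create is treated as an update rather than a deletion.
    latest = {}
    for change in sorted(recent, key=lambda c: c.changed_at):
        latest[change.entity] = change
    # Step 4: flag deleted entities.
    to_delete = sorted(e for e, c in latest.items() if c.deleted)
    # Step 5: only the remaining changed entities need to be fetched and compared.
    to_process = sorted(e for e, c in latest.items() if not c.deleted)
    # Step 6 (the actual deletion) is left to the caller.
    return to_process, to_delete


utc = timezone.utc
last_run = datetime(2024, 1, 10, 12, 0, tzinfo=utc)
changes = [
    # Changed before the last run but inside the safety margin: picked up again.
    Change("db.sales.orders", datetime(2024, 1, 9, 18, 0, tzinfo=utc)),
    # Dropped after the last run: flagged for deletion.
    Change("db.sales.old", datetime(2024, 1, 10, 13, 0, tzinfo=utc), deleted=True),
    # Changed well before the cutoff: skipped.
    Change("db.sales.customers", datetime(2024, 1, 8, 8, 0, tzinfo=utc)),
]
to_process, to_delete = incremental_plan(last_run, changes)
print(to_process)  # ['db.sales.orders']
print(to_delete)   # ['db.sales.old']
```

The safety margin deliberately re-processes a few entities that were already ingested. That costs a little extra work but avoids missing changes that landed while the previous run was still in progress.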

External Ingestion

When using the Incremental Extraction feature with External Ingestions (ingesting using YAML files instead of setting it up from the UI), you must pass the ingestion pipeline fully qualified name to the configuration. This should be {service_name}.{pipeline_name}. Example:
source:
  serviceName: my_service
# ...
# Other configurations
# ...
ingestionPipelineFQN: my_service.my_pipeline
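
If the configuration is assembled programmatically before being handed to the ingestion workflow, the fully qualified name can be derived from the service name already present in it. The sketch below is a minimal illustration using plain dictionaries. `pipeline_fqn` and `with_pipeline_fqn` are hypothetical helpers, not part of OpenMetadata, and they assume that neither name contains a dot.

```python
def pipeline_fqn(service_name: str, pipeline_name: str) -> str:
    """Join service and pipeline names with a dot, as in the example above."""
    # Assumption: neither name contains a dot, so no quoting is needed.
    return f"{service_name}.{pipeline_name}"


def with_pipeline_fqn(config: dict, pipeline_name: str) -> dict:
    """Return a copy of the config with a top-level ingestionPipelineFQN key."""
    service_name = config["source"]["serviceName"]
    return {**config, "ingestionPipelineFQN": pipeline_fqn(service_name, pipeline_name)}


config = {"source": {"serviceName": "my_service"}}
updated = with_pipeline_fqn(config, "my_pipeline")
print(updated["ingestionPipelineFQN"])  # my_service.my_pipeline
```

Note that the key sits at the top level of the configuration, next to `source`, not inside it.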

Feature available for

Databases

BigQuery

BETA | OpenMetadata

Redshift

BETA | OpenMetadata

Snowflake

BETA | OpenMetadata