Metamapper

Asynchronous Workers

Metamapper uses a queue-based architecture to process certain load-intensive tasks asynchronously. For example, schema inspection tasks are queued periodically and processed via background workers so that user experience is not negatively impacted.
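To make the queue-based idea concrete, here is a minimal in-process sketch that uses only the Python standard library. It is not how Celery or Metamapper is implemented: `inspect_schema`, `worker`, and `run_demo` are hypothetical names, and threads stand in for separate worker processes reading from a real message broker.

```python
import queue
import threading


def inspect_schema(datastore_name):
    # Hypothetical stand-in for a load-intensive job such as a schema inspection.
    return f"inspected {datastore_name}"


def worker(task_queue, results):
    # Pull tasks off the queue until a None sentinel signals shutdown.
    while True:
        datastore_name = task_queue.get()
        if datastore_name is None:
            task_queue.task_done()
            break
        results.append(inspect_schema(datastore_name))
        task_queue.task_done()


def run_demo(datastores, worker_count=2):
    task_queue = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(task_queue, results))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    # The producer only enqueues work; it never runs the slow job itself.
    for name in datastores:
        task_queue.put(name)
    # One sentinel per worker so that every worker exits.
    for _ in threads:
        task_queue.put(None)
    task_queue.join()
    for thread in threads:
        thread.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_demo(["warehouse", "analytics", "billing"]))
```

The point of the pattern is that the code enqueuing a task returns immediately, while the workers absorb the slow part in the background.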

Metamapper uses Celery to schedule periodic tasks and manage workers.

Running a Worker

Workers can be started like any other Celery application:

$ celery worker --app metamapper -l info

If you installed Metamapper using the suggested Docker setup, this is equivalent to running the following docker-entrypoint command:

$ docker run --env-file .env metamapper/metamapper:latest worker

We recommend that you run multiple workers to ensure timely processing of tasks.

Configuring the Broker

Metamapper uses config_from_envvar to configure different aspects of Celery. Refer to the Advanced Configuration section to see how to override the default settings.
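As a rough illustration, a Celery configuration module is an ordinary Python module whose module-level names are Celery settings. The sketch below assumes a hypothetical module name (`custom_celeryconfig.py`) and a hypothetical environment variable (`EXAMPLE_BROKER_URL`); neither is defined by Metamapper. The environment variable that Metamapper actually passes to config_from_envvar is described in the Advanced Configuration section and is not assumed here.

```python
# custom_celeryconfig.py (hypothetical module name, for illustration only)
import os

# EXAMPLE_BROKER_URL is a hypothetical variable; it is not read by Metamapper.
# The fallback is a local Redis URL.
broker_url = os.environ.get("EXAMPLE_BROKER_URL", "redis://localhost:6379/0")

# Standard Celery settings (lowercase names, Celery 4 and later).
worker_concurrency = 4          # number of child processes per worker
worker_prefetch_multiplier = 1  # messages reserved per child process at a time
```

The values for `worker_concurrency` and `worker_prefetch_multiplier` are placeholders; suitable numbers depend on your hardware and workload.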

Metamapper currently supports all three brokers that the Celery documentation lists as stable: Redis, RabbitMQ, and Amazon SQS.

Redis

Redis should work well as a broker for the vast majority of deployments, which is why we recommend it as the default message broker.

broker_url = "redis://localhost:6379/0"

RabbitMQ

For handling high traffic workloads, we recommend using RabbitMQ as the broker.

broker_url = "amqp://guest:guest@localhost:5672/metamapper"
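If the shape of a broker URL is unfamiliar, the following sketch breaks one into its parts with the standard library. `describe_broker_url` is a hypothetical helper written for this page; Celery does its own URL parsing internally, so this is only a way to check what each segment means.

```python
from urllib.parse import urlsplit


def describe_broker_url(url):
    # Split a broker URL into the pieces a transport typically cares about.
    parts = urlsplit(url)
    return {
        "transport": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # RabbitMQ treats the path as a virtual host; Redis treats it as a database number.
        "vhost_or_db": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(describe_broker_url("amqp://guest:guest@localhost:5672/metamapper"))
    print(describe_broker_url("redis://localhost:6379/0"))
```

In the RabbitMQ URL, `metamapper` is the virtual host, which must exist on the RabbitMQ server and be accessible to the given user.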

SQS

You can also use Amazon SQS as a broker. Please note that SQS has not been actively tested by our development team.

One known limitation of using SQS with Celery is that no results backend is supported. However, Metamapper does not rely on the results backend, so it should (in theory) behave properly.

broker_url = "sqs://"
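With a bare `sqs://` URL, Celery falls back to AWS credentials found in the environment (for example `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`) or an instance role. If you would rather embed credentials in the URL, they need to be URL-encoded, because secret keys often contain characters such as `/` and `+`. The sketch below shows one way to do that; `build_sqs_broker_url` is a hypothetical helper, the keys are placeholders, and the region is an assumption you should replace with your own.

```python
from urllib.parse import quote


def build_sqs_broker_url(access_key_id, secret_access_key):
    # Percent-encode every reserved character, including "/" and "+".
    return "sqs://{}:{}@".format(
        quote(access_key_id, safe=""),
        quote(secret_access_key, safe=""),
    )


# Placeholder credentials, for illustration only.
broker_url = build_sqs_broker_url("EXAMPLEACCESSKEYID", "example/secret+key")

# Celery's SQS transport reads the AWS region from the transport options.
broker_transport_options = {"region": "us-east-1"}
```

Since SQS has not been actively tested with Metamapper, treat this as a starting point rather than a verified configuration.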

Starting the Scheduler

Metamapper relies on scheduled tasks to operate. For example, Metamapper inspects connected datastores every hour to check for schema changes. This periodic task scheduling is handled via the scheduler process.
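For readers unfamiliar with Celery's scheduler, periodic tasks are generally declared through the `beat_schedule` setting. The sketch below shows what an hourly entry looks like in general; the entry name and task path are hypothetical and are not Metamapper's actual task definitions, which ship with the application.

```python
from datetime import timedelta

# Generic Celery beat schedule entry (illustrative only).
beat_schedule = {
    "inspect-datastores-hourly": {  # hypothetical entry name
        "task": "example.tasks.inspect_datastores",  # hypothetical task path
        "schedule": timedelta(hours=1),  # run once every hour
    },
}
```

The scheduler process only enqueues these tasks on the broker at the configured interval; the workers still do the actual processing, so both need to be running.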

You can start the scheduler using the following command:

$ celery beat --app metamapper -l info

If you installed Metamapper using the suggested Docker setup, this is equivalent to running the following docker-entrypoint command:

$ docker run --env-file .env metamapper/metamapper:latest scheduler
Last updated on 7/9/2020
Copyright © 2020 Scott Cruwys