Application

Databricks is a cloud "lakehouse" platform that can handle both data warehouse and data lake workloads. It allows users to create and run data pipelines, and to develop and deploy analytics and machine learning models.

Databricks is based on Apache Spark and provides automated cluster management and IPython-style notebooks for building Databricks data pipelines.

Control-M Integration with Databricks enables you to do the following:

  • Connect to any Databricks workspace using a Databricks personal access token (PAT) for authentication. The PAT can be generated in your Databricks workspace regardless of the cloud vendor.
  • Integrate Databricks jobs with other Control-M jobs into a single scheduling environment.
  • Trigger and monitor your Databricks jobs and view the results in the Monitoring domain.
  • Attach an SLA job to your entire Databricks data service.
  • Introduce all Control-M capabilities to Databricks, including advanced scheduling criteria, complex dependencies, quantitative and control resources, and variables.
  • Run up to 50 Databricks jobs simultaneously per Control-M/Agent.
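To illustrate the PAT-based authentication and job triggering described above, the sketch below builds the HTTP request a client would send to the public Databricks Jobs API (`/api/2.1/jobs/run-now`) to trigger a job run. This is a minimal illustration, not Control-M's internal implementation; the workspace URL, token, and job ID are placeholder assumptions.

```python
# Minimal sketch: constructing a Databricks Jobs API "run-now" request
# authenticated with a personal access token (PAT).
# The host, token, and job ID are illustrative placeholders.

def build_run_now_request(host: str, token: str, job_id: int):
    """Return the (url, headers, payload) for triggering a Databricks job run."""
    url = f"{host}/api/2.1/jobs/run-now"
    headers = {
        "Authorization": f"Bearer {token}",  # the PAT is sent as a Bearer token
        "Content-Type": "application/json",
    }
    payload = {"job_id": job_id}  # ID of the job defined in the Databricks workspace
    return url, headers, payload

# Example usage with placeholder values; the request could then be sent
# with any HTTP client (e.g. requests.post(url, headers=headers, json=payload)).
url, headers, payload = build_run_now_request(
    "https://example.cloud.databricks.com",  # placeholder workspace URL
    "dapiXXXXXXXXXXXX",                      # placeholder PAT
    123,                                     # placeholder job ID
)
```

Because the token travels in the `Authorization` header, the same request shape works whether the workspace runs on AWS, Azure, or GCP, which is why a PAT can be generated regardless of the cloud vendor.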

Control-M Integration with Databricks is available for these product versions:

  • Control-M 20.200 and later
  • Control-M SaaS 21 and later

Supporting Documentation

For more information on this integration, including how to create a connection profile and define a job, please visit:

Integrating Databricks and Control-M

Integrating Databricks and Control-M SaaS

Topic

Business & IT Automation

Publisher

BMC Software