Which Scenario Would Be Best Tackled Using Databricks Machine Learning

This article describes the tools that Azure Databricks provides to help you build and monitor AI and ML workflows. The diagram shows how these components work together to help you implement your model development and deployment process.

Why use Databricks for machine learning and deep learning?

With Azure Databricks, you can implement the full ML lifecycle on a single platform with end-to-end governance throughout the ML pipeline. Azure Databricks includes the following built-in tools to support ML workflows:

  • Unity Catalog for governance, discovery, versioning, and access control for data, features, models, and functions.
  • Lakehouse Monitoring for data monitoring.
  • Feature engineering and serving.
  • Support for the model lifecycle:
    • Databricks AutoML for automated model training.
    • MLflow for model development tracking.
    • Unity Catalog for model management.
    • Databricks Model Serving for high-availability, low-latency model serving. This includes Foundation Model APIs for deploying LLMs which allow you to access and query state-of-the-art open models from a serving endpoint.
    • Lakehouse Monitoring to track model prediction quality and drift.
  • Databricks Workflows for automated workflows and production-ready ETL pipelines.
  • Databricks Repos for code management and Git integration.
Refer to more articles:  Which State Of Matter Does This Model Represent

Deep learning applications on Databricks

Configuring infrastructure for deep learning applications can be difficult.

Databricks Runtime for Machine Learning takes care of that for you, with clusters that have built-in compatible versions of the most common deep learning libraries like TensorFlow, PyTorch, and Keras, and supporting libraries such as Petastorm, Hyperopt, and Horovod. Databricks Runtime ML clusters also include pre-configured GPU support with drivers and supporting libraries.

For machine learning applications, Databricks recommends using a cluster running Databricks Runtime for Machine Learning. See Create a cluster using Databricks Runtime ML.

Use Databricks for deep learning applications

Databricks Machine Learning provides pre-built deep learning infrastructure, enabling development across the entire deep learning lifecycle. Databricks Model Serving and Databricks Runtime for Machine Learning include built-in, pre-configured GPU support with drivers and supporting libraries. Databricks Model Serving enables creation of scalable GPU endpoints for deep learning models with no extra configuration. Databricks Runtime for Machine Learning includes the most common deep learning libraries like TensorFlow, PyTorch, and Keras and supporting libraries like Petastorm, Hyperopt, and Horovod.

To get started with deep learning on Databricks, see:

  • Best practices for deep learning on Azure Databricks
  • Deep learning on Databricks
  • Reference solutions for deep learning

Large language models (LLMs) and generative AI on Databricks

Databricks Runtime for Machine Learning includes libraries like Hugging Face Transformers and LangChain that allow you to integrate existing pre-trained models or other open-source libraries into your workflow. The Databricks MLflow integration makes it easy to use the MLflow tracking service with transformer pipelines, models, and processing components. In addition, you can integrate OpenAI models or solutions from partners like John Snow Labs in your Azure Databricks workflows.

Refer to more articles:  Which Shows A President's Involvement In Civic Life

With Azure Databricks, you can customize a LLM on your data for your specific task. With the support of open source tooling, such as Hugging Face and DeepSpeed, you can efficiently take a foundation LLM and train it with your own data to improve its accuracy for your specific domain and workload. You can then leverage the custom LLM in your generative AI applications.

In addition, Databricks provides Foundation Model APIs which allows you to access and query state-of-the-art open models from a serving endpoint. Using Foundation Model APIs, developers can quickly and easily build applications that leverage a high-quality generative AI model without maintaining their own model deployment.

For SQL users, Databricks provides AI functions that SQL data analysts can use to access LLM models, including from OpenAI, directly within their data pipelines and workflows. See AI Functions on Azure Databricks.

Databricks Runtime for Machine Learning

Databricks Runtime for Machine Learning (Databricks Runtime ML) automates the creation of a cluster with pre-built machine learning and deep learning infrastructure including the most common ML and DL libraries. For the full list of libraries in each version of Databricks Runtime ML, see the release notes.

To access data in Unity Catalog for machine learning workflows, the access mode for the cluster must be single user (assigned). Shared clusters are not compatible with Databricks Runtime for Machine Learning. In addition, Databricks Runtime ML is not supported on TableACLs clusters or clusters with spark.databricks.pyspark.enableProcessIsolation config set to true.

Create a cluster using Databricks Runtime ML

When you create a cluster, select a Databricks Runtime ML version from the Databricks runtime version drop-down menu. Both CPU and GPU-enabled ML runtimes are available.

Refer to more articles:  Which Is An Example Of An Organizational Process Asset

If you select a cluster from the drop-down menu in the notebook, the Databricks Runtime version appears at the right of the cluster name:

If you select a GPU-enabled ML runtime, you are prompted to select a compatible Driver type and Worker type. Incompatible instance types are grayed out in the drop-down menu. GPU-enabled instance types are listed under the GPU accelerated label.

Libraries included in Databricks Runtime ML

Databricks Runtime ML includes a variety of popular ML libraries. The libraries are updated with each release to include new features and fixes.

Databricks has designated a subset of the supported libraries as top-tier libraries. For these libraries, Databricks provides a faster update cadence, updating to the latest package releases with each runtime release (barring dependency conflicts). Databricks also provides advanced support, testing, and embedded optimizations for top-tier libraries.

For a full list of top-tier and other provided libraries, see the release notes for Databricks Runtime ML.

Next steps

To get started, see:

  • Tutorials: Get started with ML

For a recommended MLOps workflow on Databricks Machine Learning, see:

  • MLOps workflows on Azure Databricks

To learn about key Databricks Machine Learning features, see:

  • What is AutoML?
  • What is a feature store?
  • Model serving with Azure Databricks
  • Lakehouse Monitoring
  • Manage model lifecycle
  • MLflow experiment tracking

Related Posts

Which Type Of Therapy Is Best For Me Quiz

Which Type Of Therapy Is Best For Me Quiz

About Rational Emotive Behavior Therapy (REBT) and Cognitive Behavior Therapy Rational Emotive Behavior Therapy (REBT) and Cognitive Behavior Therapy (CBT) both follow the ABC model. “A” stands…

Which Nfl Kicker Won Mvp

Which Nfl Kicker Won Mvp

You may be interested Which Of The Following Areas Would Utilize Hcahps Which Way Should Outside Ac Fan Spin Which Biome Is Characterized By Permafrost Which Of…

Which Witch Is Which Lyrics

[Sparkle sound] Two little witches (two little witches) Played a little game (played a little game) They were very good friends (very good friends) Who looked the…

Which Of The Following Is A Cause Of Gynecological Emergencies

Surgical Management of Ectopic Pregnancy In the shocked patient with intraperitoneal bleeding, which is the typical presentation in the tropics, laparotomy should be undertaken. In subacute presentation…

Which Is The Beautiful Airport In The World

Which Is The Beautiful Airport In The World

Airports are more than just gateways to our destinations; they can also be architectural marvels and serene retreats. Some airports around the globe are designed to impress,…

Which Of The Following Is True Regarding Project Methodologies

Which Of The Following Is True Regarding Project Methodologies

Learning Objectives After completing this module, you will be able to:You may be interested Which Dungeons And Dragons Class Are You Which Kerastase Shampoo Is Best For…