Accelerate big data analytics and artificial intelligence (AI) solutions with Azure Databricks, a fast, easy, and collaborative Apache Spark-based analytics service.
Data scientists, data engineers, and business analysts can collaborate on shared projects in an interactive workspace. Apply your existing skills with support for Python, Scala, R, and SQL, as well as deep learning and machine learning frameworks and libraries like TensorFlow, PyTorch, and scikit-learn. Native integration with Azure Active Directory (Azure AD) and other Azure services enables you to build modern data warehouse, machine learning, and real-time analytics solutions.
The Steps:
- Use Azure Data Factory to bring together all your structured, unstructured, and semi-structured data (logs, files, and media) in Azure Blob Storage.
- Use Azure Event Hubs to capture data continuously from any streaming source, such as logs from website clickstreams, and process it in near real time.
- Use Azure Databricks to clean and transform the unstructured datasets and combine them with structured data from operational databases or data warehouses.
- Use scalable machine learning and deep learning techniques to derive deeper insights from this data using Python, R, or Scala, with the built-in notebook experiences in Azure Databricks.
- Leverage native connectors between Azure Databricks and Azure SQL Data Warehouse to access and move data at scale.
- Power users can take advantage of the built-in capabilities of Azure Databricks to perform root cause determination and raw data analysis.
- Run ad hoc queries directly on data within Azure Databricks.
- Take the insights from Azure Databricks to Cosmos DB to make them accessible through web and mobile apps.
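
The clean, combine, and query steps above can be illustrated with a small sketch. It is written in plain Python so it runs anywhere; in an Azure Databricks notebook the same logic would more likely be expressed with Spark DataFrames. The sample events, the customer table, and the function names (clean_events, enrich, views_by_segment) are all hypothetical and are not part of any Azure API.

```python
import json
from collections import Counter

# Hypothetical raw clickstream lines, standing in for semi-structured logs
# that would have landed in Azure Blob Storage.
RAW_EVENTS = [
    '{"user_id": 1, "page": "/home", "ms": 120}',
    '{"user_id": 2, "page": "/pricing", "ms": 340}',
    'not valid json',
    '{"user_id": 1, "page": "/pricing", "ms": 95}',
    '{"page": "/home", "ms": 50}',
]

# Hypothetical structured rows, standing in for an operational database table.
CUSTOMERS = {
    1: {"name": "Ada", "segment": "enterprise"},
    2: {"name": "Lin", "segment": "startup"},
}


def clean_events(lines):
    """Parse JSON lines, dropping malformed records and records without a user_id."""
    cleaned = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(event, dict) and "user_id" in event:
            cleaned.append(event)
    return cleaned


def enrich(events, customers):
    """Inner-join cleaned events with structured customer data on user_id."""
    return [
        {**event, **customers[event["user_id"]]}
        for event in events
        if event["user_id"] in customers
    ]


def views_by_segment(rows):
    """Ad hoc aggregation: count page views per customer segment."""
    return dict(Counter(row["segment"] for row in rows))


enriched = enrich(clean_events(RAW_EVENTS), CUSTOMERS)
summary = views_by_segment(enriched)
print(summary)  # {'enterprise': 2, 'startup': 1}
```

The summary dictionary plays the role of the "insights" in the final step: a small, query-ready result that could then be written to a serving store for web and mobile apps.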