The Coeo Blog

Databricks Structured Streaming - Part 1 (Creating the Cluster)

Written by Andy Mitchell | 25-Feb-2020 12:15:00

Continuing on from our previous posts about Databricks, we are now going to look at structured streaming. For this series of blog posts we are going to use some data from the Seattle Fire Service that is updated every 5 minutes.

As not everyone has access to an Azure subscription or the resources to allow them to use an existing Databricks cluster, we will start off by creating a Databricks Community account.

By using the Community Edition of Databricks you, the reader, can experience the basics of structured streaming without the additional cost and configuration that come with the other editions.

The main objective of these blog posts is to get you up and running with Databricks quickly so that you can experiment with the features and learn at your own pace.

  1. Navigate to the Databricks website
        
  2. Click on the "TRY DATABRICKS" image on the right of the screen

        
  3. The following web page will be shown

        
  4. Choose "Community Edition" by clicking "GET STARTED"

         
  5. Sign up for the Community Edition

         
  6. Click "Sign Up" to confirm

        
  7. Once the sign-up is complete you will be prompted to check your email to verify your address
       
  8. Click on the link in the email to complete the sign-up process

    

If all goes well you should now be able to sign into the Databricks Community Portal:

    
  1. Navigate to the Databricks website

        
  2. Click on the "LOG IN" link

        
  3. On the "Login" page, click the "Sign in here" link next to "Looking for Community Edition"

        
  4. Log in to Community Edition with the credentials that you signed up with
       
  5. If you receive the "Your email address is not verified" message, you have not yet verified your account; click the link in the sign-up email and log in again

        
  6. Log in to Databricks at https://community.cloud.databricks.com/login.html

         
  7. From the left-hand side click "Clusters", or choose "New Cluster" from "Common Tasks"

         
  8. Create a new cluster 

          
  9. Enter a name for the cluster

        
  10. Click "Create Cluster" and wait for the cluster to be created

        
  11. Once the cluster is created we can continue
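
If you would like to confirm that the new cluster can actually run work before moving on, one option is to create a notebook, attach it to the cluster and run a trivial Spark job. The sketch below assumes a Python notebook, where Databricks provides a ready-made `spark` session; `smoke_test` is a hypothetical helper name used here for illustration only and is not part of Databricks.

```python
def smoke_test(spark):
    """Run a trivial job to confirm the cluster can execute Spark work.

    `spark` is assumed to be the SparkSession that a Databricks notebook
    provides once it is attached to a running cluster.
    """
    # spark.range(5) builds a single-column DataFrame with ids 0..4;
    # count() forces the cluster to actually run a job.
    row_count = spark.range(5).count()
    return {"spark_version": spark.version, "row_count": row_count}


# In a notebook cell attached to the new cluster you would call:
# print(smoke_test(spark))
```

If the call returns a row count of 5 along with a Spark version, the cluster is up and accepting jobs.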

 

In this post we have:

  • Created a Databricks Community account
  • Created our first Databricks cluster

In the next post we will explore the dataset that we want to use and prepare it for streaming.