Making SQL sense

+44 (0)20 3051 3595 info@coeo.com
coeoClose

Making SQL sense

+44 (0)20 3051 3595 info@coeo.com

Careers

We're looking for people who share our commitment to excellence in Microsoft's data platform to join us consultants working on exciting business intelligence, analytics, and SQL Server projects on-premises and in the cloud.

Getting Started with SSIS Scale Out

The Coeo Blog

Recently I was asked what my favourite SQL Server 2017 new feature was. The answer was easy, SSIS Scale Out. This is a feature that provides a quick and easy solution to a variety of ETL problems. If your ETL processing looks like one of these scenarios, this could be for you.

  1. You have a workloads that peak at certain times
    • at a specific time of day (overnight load, during business hours)
    • at specific times of month or year (month end, black Friday)
  2. You have a variety of unrelated ETL jobs owned by different people or business areas running on the same server and competing for resources at the same time
  3. You have an increasing data volume due to business growth, and you want your ETL solution to grow with you
  4. You have a variety of unrelated ETL jobs owned by different people or business areas installed on different servers making it hard to know what's installed where

Of course, there are many ways to manage these scenarios. The traditional approach would be to scale up; to install more resources, especially more memory. But that's not necessarily the ideal solution. For a start, if your issue is that all your loads are happening at the same time, then for the rest of the day your expensive additions are simply not getting used. And if your business is growing, do you resource your platform for now, 2 years’, 5 years’ time?

You could scale out and have multiple SSIS servers instead, adding more servers as your business grows. If you're using Azure VMs you could also stop and start them as required to ensure they're only running when needed. But then you introduce maintenance problems. Which server should packages be deployed to? What happens if they're deployed to the wrong place, or worse to multiple places? How many different versions of a package do you have doing similar things because nobody knew there was one already that did that?

Introducing SSIS Scale Out

SSIS Scale Out provides a way for you to have all your ETL in one place for ease of management but with your workloads distributed to avoid resource contention. You can even mix your on-premises and azure VMs in a hybrid estate. SSIS Scale Out consists of a master SSIS Server, and one or more workers. You deploy your packages to the master, schedule your executions and allow the workers to run the packages. And when you've done that, you can use the SSISDB Executions Dashboard to view the results. Sounds great, and it’s easy to set up too.

Overview

Installing Scale Out Master

The first thing to do is to install the Scale Out Master. This requires the following features to be installed:

SQL Server Installation Media - Feature Selection

  • SQL Server Database Engine   
  • Integration Services
  • Scale out Master

 

 

 

 

 

On the Database Engine Configuration screen, make sure that Mixed Mode is selected as SQL Authentication is required to allow the execution log user to write to SSISDB.

On the Integration Services Scale Out Configuration – Master Node screen, specify the port to use (the default is 8391), and a certificate to allow communication with the worker nodes. You have the option to create a new SSL Certificate, or use an existing one. Having the wizard create a new certificate is the easiest solution. By default, the machine name and private IP of the master node are included. If you are working in a hybrid environment and need to communicate between cloud and on-premises, you will need to include the public IP address as well, or use your own certificate that includes it.

04-SSISMasterNode-Certificate
 

Once the install is complete you’ll need to create a Firewall rule to allow traffic through the port that you’ve selected to use.

In SQL Server Management Studio, create an SSIS Catalog, ensuring that the “Enable this server as SSIS scale out master” is ticked. You can then deploy your packages as normal, and even execute them as normal, outside of the scale out architecture.

Installing Scale Out Worker

Copy the certificate generated by the installation of the Master (or your pre-created certificate) to the worker. The certificate can be found in your SQL Server t installation folder in \140\DTS\Binn\SSISScaleOutMaster.cer. The location on the worker doesn’t matter, I just use a temporary location where it’s easy to find. You don’t need to install the certificate.

 
SQL Server Installation Media - Feature Selection

To install the Scale Out Worker, run the SQL Server installation setup and select the Scale Out Worker feature on the Feature Selection screen.

 

 

 

 

On the Service Account screen, the Scale Out Worker is configured to use NT Service\SSISScaleOutWorker140. It is a good idea to change this to a domain account that will be permitted access to all the resources in the packages this worker will execute. I will cover this in more detail in a later post, but for my simple installation I created an account called SSIS_SVC which I use for all my workers. This account has read/write privileges on the databases as required. 

03-ServiceAccount

On the Integration Services Scale Out Configuration – Worker Node screen enter the master node endpoint. This is the fully qualified host name of the master, including the port number (e.g https://winssismaster.mydomain:8391). Enter the location of the SSISScaleOutMaster certificate that you copied earlier. The installation wizard will install the certificate for you.

Once the installation of the worker is complete, copy the generated worker certificate from \140\DTS\Binn\ SSISScaleOutWorker.cer to the SSIS Scale Out Master and install the certificate. Note, you must install this for Local Machine, into the Trusted Root Certification Authorities store.

Repeat this for each worker that you want to set up.

Configuring Scale Out

With all that done, you can now enable the workers for use. You can see the available workers by looking in the catalog.worker_agents view in SSISDB. You should see all of the workers you have installed, with IsEnabled set to 0. Note, it may take a few minutes for them to be visible.

You can enable each worker by calling catalog.enable_worker_agent and passing the WorkerAgentId. You can also enable workers by right clicking the catalog, selecting “Manage Scale Out” and clicking “Enable Worker” on the Worker Manager tab.

And there you are, up and running. You can execute packages in scale out by right clicking a package or project and selecting “Execute in Scale Out”. This allows you to pick one or more workers, or allow any worker.

And while this is useful for testing, in the real world you will want to be able to schedule your executions, and control which workers are selected. In the next post I'll talk about how to execute packages for different scenarios, and discuss adapting your Scale Out environment to meet different security requirements.

If you enjoyed this article, make sure you don't miss out on future updates - sign up to recieve the latest technical articles, exclusive event invitations and more:

New Call-to-action

Subscribe to Email Updates