Running Scheduled Tasks Against Azure SQL Database

Published On: 2020-07-14

If you've worked with Microsoft SQL Server for any period of time, you are familiar with the SQL Server Agent. The Agent, which has remained mostly unchanged since I started working with it in 1999, is a fairly robust job scheduler that can also alert you in the event of job failures or system errors. I feel as though it's a testament to the quality of the original architecture that the code hasn't changed very much–it still meets the needs of about 90-95% of SQL Server workloads, based on an informal Twitter discussion I had a few months ago. There are some cases where an enterprise scheduling tool is needed, but for maintaining most SQL Servers and executing basic ETL, the Agent works fine. There's one problem–the Agent is only available in SQL Server and Azure SQL Managed Instance.


The lack of an agent, along with the lack of cross-database queries (I know you can use elastic query; it's not the same thing), has been one of the major barriers to entry for Azure SQL Database. There are a few options available to you, depending on the nature of the operations you are running. I'm not going to walk through how to build each of them–that's a different post. I will talk about the costs and benefits of each option.

Automation Runbooks

Automation has been a part of Azure since at least 2013, and perhaps even longer than that. The main component is called a runbook, which houses your code and can be scheduled or manually executed. Runbooks can be PowerShell, Python, or graphical. The graphical runbooks are pretty limited in what they can do, and in seven years I've never seen a Python runbook used for SQL Server operations. Automation has a decent learning curve–it's not easy. For example, after creating your account, you will need to import all of the PowerShell modules you plan to use. And there is the wonkiness of executing SQL commands via PowerShell–you can use Invoke-Sqlcmd, or import dbatools and use some of the options there. Schedules have a minimum recurrence of an hour, which is fine for most maintenance activities, but probably won't work for more granular operations.
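
To give a sense of what a runbook looks like, here is a minimal sketch of a PowerShell runbook that runs a single query with Invoke-Sqlcmd. The credential asset, server, database, and stored procedure names are placeholders, and you would need the SqlServer module imported into the Automation account first.

# Minimal runbook sketch: run one query against an Azure SQL Database.
# 'SqlDbCredential', the server, database, and procedure names are placeholders.
$cred = Get-AutomationPSCredential -Name 'SqlDbCredential'

Invoke-Sqlcmd -ServerInstance 'myserver.database.windows.net' `
    -Database 'MyDatabase' `
    -Username $cred.UserName `
    -Password $cred.GetNetworkCredential().Password `
    -Query 'EXEC dbo.IndexMaintenance;' `
    -QueryTimeout 3600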

The other thing you should know about Automation is that you need to have your PowerShell code written and mostly working before you test it in your Automation account. There are a couple of commands you can only run in the context of a runbook (specifically around connecting to Azure itself–if it's just a SQL query, this is pretty trivial), so you want to have your basic code working before you have to troubleshoot the Automation path. You can connect to Azure from within your runbook–this is particularly useful if you need to run a query against all of the databases within a subscription. The alerting process for job failures is challenging–failures aren't always captured correctly by the Automation framework, so you need to build your own error handling.
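
As a rough sketch of that subscription-wide pattern (assuming the classic Run As account and the Az modules imported into the Automation account; every name below is a placeholder), it looks something like this:

# Connect with the Automation Run As account, then run a query against every
# database on every logical server in the subscription. Names are placeholders.
$conn = Get-AutomationConnection -Name 'AzureRunAsConnection'
Connect-AzAccount -ServicePrincipal `
    -Tenant $conn.TenantId `
    -ApplicationId $conn.ApplicationId `
    -CertificateThumbprint $conn.CertificateThumbprint | Out-Null

$cred = Get-AutomationPSCredential -Name 'SqlDbCredential'

foreach ($server in Get-AzSqlServer) {
    $databases = Get-AzSqlDatabase -ResourceGroupName $server.ResourceGroupName `
        -ServerName $server.ServerName |
        Where-Object { $_.DatabaseName -ne 'master' }

    foreach ($db in $databases) {
        Invoke-Sqlcmd -ServerInstance $server.FullyQualifiedDomainName `
            -Database $db.DatabaseName `
            -Username $cred.UserName `
            -Password $cred.GetNetworkCredential().Password `
            -Query 'SELECT @@SERVERNAME AS ServerName, DB_NAME() AS DatabaseName;'
    }
}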

With all that said–automation is probably the most robust option for large scale operations around Azure SQL Database. Which brings us to Logic Apps…

Logic Apps

Logic Apps, Azure Functions, and Power Automate all offer similar functionality at different levels of abstraction and code writing. Functions are, generally speaking, purely code operations and provide a great deal of flexibility, whereas Power Automate lands on the almost "no code" side of things. Logic Apps fall squarely in the middle, letting you use some code while still allowing for "development by click." To compare Logic Apps to Automation: it took me about 30 minutes to build a runbook this morning to execute some queries against a SQL DB (I had the PowerShell written already), whereas it took me about 5 minutes to build a Logic App to do the same task. Logic Apps can be triggered by a schedule (which has per-minute granularity). Logic Apps aren't quite as robust in terms of scoping–it would be really painful for me to build an app that queried every database in my subscription. I would probably have to build a large series of Logic Apps to do that.

A couple of things to note with Logic Apps: SQL Server = Azure SQL Database–they use the same connector–and it's a little confusing because Azure SQL DW (yes, it's Synapse Analytics now; the connectors haven't been updated) has its own set of connectors. If you need to repeatedly run a query (or small set of queries) against a single database, this is your best option.

Elastic Jobs

Elastic Jobs are an Azure SQL construct that requires a main job database, which orchestrates jobs amongst the other databases. While there is a portal option to create your job server, the jobs themselves have to be defined using either T-SQL or PowerShell. While Elastic Jobs are GA, and the job database effectively has an Agent-like schema offering very similar functionality, the lack of a GUI and of broad community adoption has limited their uptake. I tried to implement them for a project I was working on and ultimately gave up, because the documentation was limited and I had Automation code that just worked. I would really like to see better support for Elastic Jobs in the Azure portal–it should be really easy for a user to deploy a job from either the context of their job database or the database they are working on. I think this would be the best solution for Azure SQL Database scheduling.
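
If you do want to try Elastic Jobs, the PowerShell route looks roughly like the sketch below. This is an outline only, based on the Az.Sql elastic job cmdlets; the resource group, server, database, credential, and procedure names are all placeholders.

# Rough sketch of defining an elastic job with the Az.Sql cmdlets. Placeholders throughout.
$agent = New-AzSqlElasticJobAgent -ResourceGroupName 'rg-jobs' `
    -ServerName 'jobserver' -DatabaseName 'JobDatabase' -Name 'JobAgent'

# Credential the agent uses to connect to the target databases
$jobCred = Get-Credential
$cred = $agent | New-AzSqlElasticJobCredential -Name 'JobRunCredential' -Credential $jobCred

# Target group containing one logical server (the agent enumerates its databases)
$targetGroup = $agent | New-AzSqlElasticJobTargetGroup -Name 'AllAppDatabases'
$targetGroup | Add-AzSqlElasticJobTarget -ServerName 'appserver.database.windows.net' `
    -RefreshCredentialName $cred.CredentialName | Out-Null

# A job that runs hourly and executes a single T-SQL step against the target group
$job = $agent | New-AzSqlElasticJob -Name 'HourlyMaintenance' `
    -IntervalType Hour -IntervalCount 1 -StartTime (Get-Date)
$job | Add-AzSqlElasticJobStep -Name 'RunMaintenance' `
    -TargetGroupName $targetGroup.TargetGroupName `
    -CredentialName $cred.CredentialName `
    -CommandText 'EXEC dbo.IndexMaintenance;' | Out-Null

# Kick the job off once manually to test it
$job | Start-AzSqlElasticJob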


Imitation is NOT a Form of Flattery After All

Published On: 2020-07-13

Recently it came to my attention that a website has been reposting blog posts that I wrote. And they have been doing it for years. The oldest posts of mine on the site were from 2013. On top of that, they had links to download 3 of the books that I’ve written over the years. To add to this, if you go to the website with an ad blocker enabled, they complain that they don’t work for free.

Laptrinhx is stealing

Except that they do, because they steal content that I've written, as well as that of hundreds of other authors–possibly thousands, given that they are boasting 3.6M posts, most if not all of which they didn't write. But they are getting paid for them based on advertising.

Now I've got a Pi-hole set up at home, so they didn't get any ad revenue from my looking through their website to get a list of all of my posts that they stole.

Needless to say, I'm not impressed that they decided not only to take my hard work and pass it off as their own, but to profit off of it as well.

I've submitted a DMCA notice to have the content that DCAC owns the copyright for removed from their servers. It appears that they are hosted by Google, so my DMCA notice probably isn't going to do much on its own. But there are several big companies that will be able to weigh in, including Microsoft, Elsevier (my book publisher), Databricks, and I'm sure others.

Laptrinhx’s Twitter Header

Laptrinhx does have a Twitter account, but it doesn't appear to have much of a following: it's been around since 2009, has 5,800+ tweets, and 12 whole followers.

Don’t be like Laptrinhx, don’t be a thief

If you are a fan of someone’s posts, there are a few things that you can do.

  1. Add them to the blogroll on your site so that other people can go to their site and read their posts.
  2. Blog about them, and point your readers to their site.

The one thing you don't ever do is post their content on your site. It isn't your content; you don't get to post it.

If you want to see the massive takedown request that I filed with Google, it's here.

Denny


You Should Push Back on Your Software Vendors

Published On: 2020-07-09

I've seen two Twitter discussions in the last two days about terrible software vendor practices. The first was a vendor who wanted to install binaries on the C: drive (and only the C: drive) of the server hosting the SQL Server database for the application. The other was a vendor who didn't support using replication to another database to report against their database. Both of these scenarios are terrible–database servers should really only run database services, and it's none of your software vendor's business what you do with your transaction log.


Speaking of the software vendor's business, let's talk about the business model of software vendors. For the most part, smaller vendors don't make their profits on your initial purchase of licenses; instead, they charge an annual maintenance fee (sometimes under a different name, like support, or in Microsoft's case, Software Assurance). As part of this agreement, you are typically entitled to patches, security fixes, new versions of the software, and in some cases support tickets. In order to stay supported, you need to agree to a certain set of requirements from the vendor.

This is a lucrative business–software has very high profit margins, making it a target for investors, private equity, and venture capital. The latter two can do bad things to otherwise good companies in order to try to extract every penny of profit out of them. This can include laying off core engineering staff and replacing them with much cheaper offshore resources, who, while good engineers, aren't familiar with the core architecture and, more importantly, the use cases of the product. They may also cut testing resources, so the product is only "certified" on older versions of database and operating system software. Private equity has done terrible things to a lot of businesses, and software isn't exempt from that. Read the article linked at the end of this post about a company that has acquired a bunch of zombie independent software vendors (ISVs) and just milks support fees for profit. Seriously, read that article.

There are some good ISVs out there, but they are few and far between. At all times you need to remember that you are their customer, and they work for you. That doesn't mean you can yell at a support engineer on the phone, but when they tell you that you can only run their software on SQL Server 2005 running on Windows 2003, and oh yeah, they need the SA account for their application to run, you should push back.

A lot of DBAs I encounter are too timid to do this–the business needs some app for manufacturing widgets, and the widget team just wants that app, even though the vendor insists that the backups need to be shipped to a server in Russia. I will say this–pick your battles. It's not worth arguing about something like a MaxDOP requirement (unless there's an obvious performance problem), but when the vendor says you can't use Availability Groups with their app, or wants to trap you on a legacy version of the RDBMS, you should push back. The other #sqlhelp thread I saw was where someone wanted to build a log reader to apply transactions to a secondary database for reporting, because the vendor didn't support replication. That's stupid–you'd be building a very fragile system when SQL Server has a perfectly good feature (transactional replication) to do the same thing. Or even the vendor who wanted to install software on the SQL Server. No. Just say no.

In summary: you own your software, and you manage your database environment. You shouldn't do anything that puts your business's data at risk, but at the same time you want to manage your environment in as consistent a fashion as possible, while adhering to best practices.

The article mentioned above: https://www.forbes.com/sites/nathanvardi/2018/11/19/how-a-mysterious-tech-billionaire-created-two-fortunesand-a-global-software-sweatshop/#38fe205e6cff


Refreshing a Power BI Dataset in Azure Data Factory


I recently needed to ensure that a Power BI imported dataset would be refreshed after populating data in my data mart. I was already using Azure Data Factory to populate the data mart, so the most efficient thing to do was to call a pipeline at the end of my data load process to refresh the Power BI dataset.

Power BI offers REST APIs to programmatically refresh your data. For Data Factory to use them, you need to register an app (service principal) in AAD and give it the appropriate permissions in Power BI and to an Azure key vault.

I’m not the first to tackle this subject. Dave Ruijter has a great blog post with code and a step-by-step explanation of how to use Data Factory to refresh a Power BI dataset. I started with his code and added onto it. Before I jump into explaining my additions, let’s walk through the initial activities in the pipeline.

ADF pipeline that uses web activities to get secrets from AKV, get an AAD auth token, and call the Power BI API to refresh a dataset. Then an Until activity and an If activity are executed.
Refresh Power BI Dataset Pipeline in Data Factory

Before you can use this pipeline, you must have:

  • an app registration in Azure AD with a secret
  • a key vault that contains the Tenant ID, Client ID of your app registration, and the secret from your app registration as separate secrets.
  • granted the data factory managed identity access to the secrets in the key vault
  • allowed service principals to use the Power BI REST APIs in the Power BI tenant settings
  • granted the service principal admin access to the workspace containing your dataset

For more information on these setup steps, read Dave’s post.

The pipeline contains several parameters that need to be populated for execution.

ADF pipeline parameters

The first seven parameters are related to the key vault. The last two are related to Power BI. You need to provide the name and version of each of the three secrets in the key vault. The KeyVaultDNSName should be https://mykeyvaultname.vault.azure.net/ (replace mykeyvaultname with the actual name of your key vault). You can get your Power BI workspace ID and dataset ID from the url when you navigate to your dataset settings.

The “Get TenantId from AKV” activity retrieves the tenant ID from the key vault. The “Get ClientId from AKV” retrieves the Client ID from the key vault. The “Get Secret from AKV” activity retrieves the app registration secret from the key vault. Once all three of these activities have completed, Data Factory executes the “Get AAD Token” activity, which retrieves an auth token so we can make a call to the Power BI API.
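
If you want to sanity-check that token step outside of Data Factory, a PowerShell equivalent of the client credentials request looks roughly like this (a sketch; the tenant ID, client ID, and secret are placeholders you would pull from your own key vault):

# Sketch: request an AAD token for the Power BI API using the client credentials flow.
# The tenant ID, client ID, and secret values are placeholders.
$tenantId = '<tenant-guid>'
$body = @{
    grant_type    = 'client_credentials'
    client_id     = '<app-registration-client-id>'
    client_secret = '<app-registration-secret>'
    resource      = 'https://analysis.windows.net/powerbi/api'
}
$token = Invoke-RestMethod -Method Post `
    -Uri "https://login.microsoftonline.com/$tenantId/oauth2/token" -Body $body
$headers = @{ Authorization = "Bearer $($token.access_token)" }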

One thing to note is that this pipeline relies on a specified version of each key vault secret. If you always want to use the current version, you can delete the SecretVersion_TenantID, SecretVersion_SPClientID, and SecretVersion_SPSecret parameters. Then change the expression used in the URL property in each of the three web activities.

For example, the URL to get the tenant ID is currently:

@concat(pipeline().parameters.KeyVaultDNSName,'secrets/',pipeline().parameters.SecretName_TenantId,'/',pipeline().parameters.SecretVersion_TenantId,'?api-version=7.0')

To always refer to the current version, remove the slash and the reference to the SecretVersion_TenantID parameter so it looks like this:

@concat(pipeline().parameters.KeyVaultDNSName,'secrets/',pipeline().parameters.SecretName_TenantId,'?api-version=7.0')

The “Call Dataset Refresh” activity is where we make the call to the Power BI API. It is doing a POST to https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/refreshes and passes the previously obtained auth token in the header.
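
For comparison, the same refresh call can be made with a couple of lines of PowerShell (a sketch; the workspace and dataset IDs are placeholders, and $headers is the bearer token header from the previous sketch):

# Kick off the dataset refresh. The call is asynchronous and returns 202 Accepted.
$groupId   = '<workspace-guid>'
$datasetId = '<dataset-guid>'
Invoke-RestMethod -Method Post -Headers $headers `
    -Uri "https://api.powerbi.com/v1.0/myorg/groups/$groupId/datasets/$datasetId/refreshes"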

This is where the original pipeline ends and my additions begin.

Getting the Refresh Status

When you call the Power BI API to execute the data refresh, it is an asynchronous call. This means that the ADF activity will show success as soon as the call is made successfully, rather than waiting for the refresh itself to complete.

We have to add a polling pattern to periodically check on the status of the refresh until it is complete.

We start with an until activity. In the settings of the until loop, we set the expression so that the loop executes until the RefreshStatus variable is not equal to “Unknown”. (I added the RefreshStatus variable in my version of the pipeline with a default value of “Unknown”.) When a dataset is refreshing, “Unknown” is the status returned until it completes or fails.

ADF Until activity settings

Inside of the “Until Refresh Complete” activity are three inner activities.

ADF Until activity contents

The “Wait1” activity gives the dataset refresh a chance to execute before we check the status. I have it configured to 30 seconds, but you can change that to suit your needs. Next we get the status of the refresh.

This web activity does a GET to the same URL we used to start the dataset refresh, but it adds a parameter on the end.

GET https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/refreshes?$top={$top}

The API doesn’t accept a request ID for the newly initiated refresh, so we get the last initiated refresh by setting top equal to 1 and assume that is the refresh for which we want the status.

The API provides a JSON response containing an array called value with a property called status.

In the “Set RefreshStatus” activity, we retrieve the status value from the previous activity and set the value of the RefreshStatus variable to that value.

Setting the value of the RefreshStatus variable in the ADF pipeline

We want the status value in the first object in the value array.

The until activity then checks the value of the RefreshStatus variable. If your dataset refresh is complete, it will have a status of “Completed”. If it failed, the status returned will be “Failed”.

The If activity checks the refresh status.

If activity expression in the ADF pipeline

If the refresh status is "Completed", the pipeline execution is finished. If the refresh status isn't "Completed", then we can assume the refresh has failed. If the dataset refresh fails, we want the pipeline to fail.

There isn't a built-in way to cause the pipeline to fail, so we use a web activity to throw a bad request.

We do a POST to an invalid URL. This causes the activity to fail, which then causes the pipeline to fail.
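
To make the whole polling-and-fail pattern concrete, here is the equivalent logic condensed into PowerShell (a sketch reusing the $headers, $groupId, and $datasetId values from the earlier snippets):

# Poll the refresh history until the most recent refresh is no longer 'Unknown',
# then fail if it didn't complete successfully.
do {
    Start-Sleep -Seconds 30
    $history = Invoke-RestMethod -Method Get -Headers $headers `
        -Uri "https://api.powerbi.com/v1.0/myorg/groups/$groupId/datasets/$datasetId/refreshes?`$top=1"
    $refreshStatus = $history.value[0].status
} while ($refreshStatus -eq 'Unknown')

if ($refreshStatus -ne 'Completed') {
    throw "Dataset refresh failed with status '$refreshStatus'."
}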

Since this pipeline has no dependencies on datasets or linked services, you can just grab my code from GitHub and use it in your data factory.
