When creating resources and resource groups, the location in which the resource group is created rarely gets a thought. Resource group locations matter, though: not when everything is working, but when an Azure region fails.
When you select the location for your resource group, you aren't limited to placing only resources from that location in it. Resources from any Azure region can be placed in a resource group that is stored in any Azure region. However, when the Azure region that hosts the resource group fails, no changes can be made to the resources within that resource group, or to the resource group itself.
Firewalls as an Example
Let’s look at an example that I was working on for a client. Production for this client is in North Central US and the Disaster Recovery environment is in South Central US. This client wanted to use Azure Firewall, deployed in both their North Central US and South Central US environments. As we discussed the firewall rules we were going to put in place, we wanted a common rule set for outbound rules so that the exact same rules would be in place for both Azure regions. This meant that we needed to place all the Azure Firewall Policies in the same resource group (as Azure Firewall Policies can only inherit rules from another policy when that policy is stored in the same region as the parent policy).
This presented a problem. If we stored the resource group and the policies in North Central US and North Central US failed, then we couldn’t edit the Disaster Recovery policy. If we stored the policies and the resource group in South Central US and South Central US failed, then we couldn’t edit or change the production firewall policy until our DR site came back up. The same applied to the resource group location: if we put the resource group in North Central US and the policies somewhere else, and production failed, there was no guarantee that we’d be able to make firewall changes if those changes required updating the resource group itself.
What we ended up doing in this case was putting the resource group and the Firewall Policies in a third Azure region, West US 2 in this case. This way, if North Central US fails we can still edit the Azure Firewall Policies for our Disaster Recovery environment, and if South Central US fails we can still edit the Azure Firewall Policies for our Production environment.
This same process applies to a web farm that is spread across multiple Azure regions. If, for example, you had a WebApp in two Azure regions, one in Central US and the second in North Europe, you wouldn’t want both of these in the same resource group; and if they were in the same resource group, you’d want that resource group to be in a third region, because whatever region holds the resource group needs to be online for you to make changes to the resources within it, provided that those changes need to update the resource group directly.
The solution to this is to either have one resource group for each region or put the resource group itself in a third region so that a failure of either region won’t affect it.
Continuing with our Performance Tuning series: as you have learned, baselining and knowing your normal system performance metrics is an important part of performance tuning. In part 1 we discussed collecting vital system metrics such as CPU utilization and memory usage. In part 2 we took a glimpse into how we can gather those metrics and others using the different tools available to us, like Performance Monitor, Profiler, Extended Events, and DMVs. In part 3 we will look at how identifying the appropriate wait events and their corresponding resolutions is critical for resolving performance issues. But first you need to understand what waits are and where to find the information needed to identify waits in your system.
When a thread is being executed and is forced to wait on something, this is called queueing. The thread waits for system resources, lining up in a first come, first served fashion for things like a CPU scheduler. The time spent between the actively running and suspended (waiting) states is the measurable wait event that SQL Server keeps track of via internal metrics. The database engine tracks why executing threads must wait and how long they queue for resources. These metrics are commonly called wait statistics, and they are easily discernible using the dynamic management view (DMV) sys.dm_os_wait_stats (or, in Azure SQL Database, sys.dm_db_wait_stats).

This information is important to understanding the baseline performance of your database and can help you identify specific performance issues with both query execution and hardware limitations. Data gained from this view can be aggregated and tracked to provide a clearer picture of what might be causing an executing thread to wait. Knowing the top waits that occur in your environment is a key task for all DBAs. Below you will learn common actionable waits as well as those that are benign and occur normally as part of SQL Server’s thread queueing process.
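A quick way to see the top waits on an instance is to query the DMV directly. Below is a minimal sketch; the excluded wait types are only a small sample of the benign waits mentioned above, so expand the filter for your own environment.

-- Top waits by total wait time, excluding a small sample of benign wait types.
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,  -- time spent waiting on a CPU scheduler after the resource was ready
    wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'BROKER_TASK_STOP',
    N'XE_TIMER_EVENT', N'REQUEST_FOR_DEADLOCK_SEARCH', N'CHECKPOINT_QUEUE')
ORDER BY wait_time_ms DESC;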
Check out the full part 3 article on the Idera Community Site.
Question: When an activity in a Data Factory pipeline fails, does the entire pipeline fail? Answer: It depends.
In Azure Data Factory, a pipeline is a logical grouping of activities that together perform a task. It is the unit of execution – you schedule and execute a pipeline. Activities in a pipeline define actions to perform on your data. Activities can be categorized as data movement, data transformation, or control activities.
In many instances, when an activity fails during a pipeline run, the pipeline run will report failure as well. But this is not always the case.
There are two main scenarios where an activity would report failure, but the pipeline would report success:
The maximum number of retry attempts is greater than 0, and the initial activity execution fails but the second attempt succeeds
The failed activity has a failure path or a completion path to a subsequent activity and no success path
In the General settings of any activity is a property called Retry. This is the number of times Data Factory can try to execute the activity again if the initial execution fails. The default number of retries is 0. If we execute a pipeline containing one activity with the default Retry setting, the failure of the activity would cause the pipeline to fail.
I often set retries to a non-zero number in copy activities, lookups, and data flows in case there are transient issues that would cause a failure that might not be present if we waited 30 seconds and tried the activity again.
Dependency with a Failure Condition
Activities are linked together via dependencies. A dependency has a condition of one of the following: Succeeded, Failed, Skipped, or Completed. If we have a pipeline containing Activity1 and Activity2, and Activity2 has a success dependency on Activity1, it will only execute if Activity1 is successful. In this scenario, if Activity1 fails, the pipeline will fail.
But if we have a pipeline with two activities where Activity2 has a failure dependency on Activity1, the pipeline will not fail just because Activity1 failed. If Activity1 fails and Activity2 succeeds, the pipeline will succeed. This scenario is treated as a try-catch block by Data Factory.
Now let’s say we have a pipeline with 3 activities, where Activity1 has a success path to Activity2 and a failure path to Activity3. If Activity1 fails and Activity3 succeeds, the pipeline will fail. The presence of the success path alongside the failure path changes the outcome reported by the pipeline, even though the activities executed are the same as in the previous scenario.
What This Means for Monitoring
This difference between pipeline and activity status has a few implications of which we should be aware as we monitor our data factories.
If we are using Azure Monitor alerts, we need to understand that setting an alert for pipeline failures doesn’t catch all activity failures. If there is a retry of an activity and the second attempt is successful, there would be an activity failure but no pipeline failure.
Conversely, if we set an alert to notify us of activity failures, and we have a pipeline designed with the try-catch pattern, we might get an alert about an activity failure, but the pipeline would still show success. You would need to look at the status of the activities within the pipeline execution to see the failure of which you were alerted.
For many of my implementations, just setting an alert to notify me when any activity failure occurs is fine. For others, I really only care if the pipeline fails. Sometimes I need to set more specific alerts where I choose only certain activities to monitor for failure.
You could also use the Data Factory SDK to roll your own monitoring solution. If you write PowerShell, C#, or Python, you can retrieve the status of any pipeline or activity run and take subsequent actions based upon the results.
What This Means for Pipeline Design
You may need to add activities to your pipelines to support your monitoring scenarios if you need something more customized than what is offered from Azure Monitor and don’t want to use the SDK.
If you have notification needs that Azure Monitor can’t accommodate, you could add an activity in your pipelines to send an email based upon your desired activity outcomes. You can cause that activity to execute using an activity dependency alone, or by combining it with a variable and an If Condition activity.
There are times when we may need a pipeline to fail even though we are using the try-catch pattern that results in pipeline success. In that case, I add an additional web activity to the end of my pipeline failure path that hits an invalid URL like http://throwanerror. The failure of this activity will cause the pipeline to fail. Keep monitoring and notifications in mind as you design your pipelines so you are alerted as appropriate.
Azure Data Factory Activity and Pipeline Outcomes
To help clarify these concepts I made the below guide to Data Factory activity and pipeline outcomes. Feel free to share it with others. You can download it directly from this link.
When you are making changes to an existing non-clustered index, SQL Server provides a wide variety of options. One of the more frequently used methods is DROP_EXISTING; in this post you will learn all about that option. This option automatically drops and re-creates an existing index in a single statement, without the index being explicitly dropped first. Let us take a moment to understand the behavior of this choice.
DROP_EXISTING = ON, which is my preferred method, will drop the current index only after it finishes creating and building the index with the new definition. The pitfall is that if the index does not exist, you will get an error and must create it without the option or with it set to OFF. However, the more important benefit of using this option is all about performance: the index will still be used by active queries until it is rebuilt with the new definition.
CREATE NONCLUSTERED INDEX [dcacIDX_ServiceType] ON [dbo].[Accounts]
([ServiceType] ASC)
INCLUDE([AccountId]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = ON, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
If the index does not exist, you will get a 7999 error.
Msg 7999, Level 16, State 9, Line 1
Could not find any index named ‘dcacIDX_ServiceType’ for table ‘dbo.Accounts’.
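One way to avoid the error is to check sys.indexes first and flip the option accordingly. A minimal sketch, reusing the table and index from the example above:

-- Use DROP_EXISTING = ON only when the index already exists;
-- otherwise create it with the option set to OFF.
IF EXISTS (SELECT 1 FROM sys.indexes
    WHERE [name] = N'dcacIDX_ServiceType'
    AND [object_id] = OBJECT_ID(N'dbo.Accounts'))
BEGIN
    CREATE NONCLUSTERED INDEX [dcacIDX_ServiceType] ON [dbo].[Accounts]
    ([ServiceType] ASC)
    INCLUDE([AccountId]) WITH (DROP_EXISTING = ON) ON [PRIMARY]
END
ELSE
BEGIN
    CREATE NONCLUSTERED INDEX [dcacIDX_ServiceType] ON [dbo].[Accounts]
    ([ServiceType] ASC)
    INCLUDE([AccountId]) WITH (DROP_EXISTING = OFF) ON [PRIMARY]
END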
With DROP_EXISTING, you can change:
A nonclustered rowstore index to a clustered rowstore index.
With DROP_EXISTING, you cannot change:
A clustered rowstore index to a nonclustered rowstore index.
A clustered columnstore index to any type of rowstore index.
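For example, the first conversion on that list could look like the following. This is purely illustrative and assumes [dbo].[Accounts] has no existing clustered index; the INCLUDE clause goes away because a clustered index already contains every column.

-- Promote the existing nonclustered index to be the table's clustered index.
CREATE CLUSTERED INDEX [dcacIDX_ServiceType] ON [dbo].[Accounts]
([ServiceType] ASC)
WITH (DROP_EXISTING = ON) ON [PRIMARY]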
DROP and CREATE
This option is cleaner and won’t error if the index doesn’t already exist. However, I caution you when using this, especially on a large table. This option drops the index before it creates the new one, leaving your system without the previous index definition. That can create a huge performance issue while the system waits for the new index to be created. I know this firsthand, as I did this with a client a few years ago, during the day, while trying to fix a performance issue. I created a worse issue while waiting for the new index to be created. It took 45 minutes to create the new index with the new definition, which caused CPU to spike to 100% while active queries were trying to come through. That, sadly, in turn slowed down the new index creation.
DROP INDEX IF EXISTS [dcacIDX_ServiceType] ON [dbo].[Accounts]
CREATE NONCLUSTERED INDEX [dcacIDX_ServiceType] ON [dbo].[Accounts]
([ServiceType] ASC)
INCLUDE([AccountId]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
Now I should also note that the DROP_EXISTING method is also faster when you must modify a clustered index. Every non-clustered index refers to the clustered index using what is called the clustering key, essentially a pointer to the row in the clustered index. When a clustered index is dropped and re-created, SQL Server must rebuild the non-clustered indexes on that table. In fact, it happens twice: they are rebuilt on the drop and rebuilt again on the create of the clustered index. Using DROP_EXISTING = ON prevents you from having to rebuild all these indexes, as their clustering keys stay the same, thus making it significantly faster.
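As a sketch, assume a clustered index with the hypothetical name [cIDX_Accounts_AccountId] already exists on [dbo].[Accounts] and we only want to change its FILLFACTOR. Because the clustering key stays the same, the non-clustered indexes are left alone:

-- Rebuild the clustered index in place with a new definition; the non-clustered
-- indexes are not rebuilt twice as they would be with a separate DROP and CREATE.
CREATE CLUSTERED INDEX [cIDX_Accounts_AccountId] ON [dbo].[Accounts]
([AccountId] ASC)
WITH (DROP_EXISTING = ON, FILLFACTOR = 90) ON [PRIMARY]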
The reason I took the time to write this quick blog is to remind you to consider using DROP_EXISTING = ON rather than the DROP and CREATE method when possible. Do not introduce a performance issue when you can avoid it, and more efficiently make the desired changes you need. Just a friendly reminder.
As Microsoft MVPs and Partners, as well as VMware experts, we are summoned by companies all over the world to fine-tune and problem-solve the most difficult architecture, infrastructure, and network challenges.

And sometimes we’re asked to share what we did, at events like PASS Summit 2015.