As application developers, it’s our responsibility to ensure that the applications we create are using credentials and other secret configuration values in a secure way. Oftentimes, this task is overlooked in the pursuit of our primary concern: building new features and delivering business value quickly. In some cases, this translates into developers tolerating flat-out unsafe practices in the name of convenience, such as hardcoding secrets into the application source code or sharing secrets with team members via insecure communication channels and storing them on their development machines.

Fortunately, Microsoft provides a solution to this problem that should be attractive to both security experts and developers, known as “managed identities for Azure resources” (formerly “Managed Service Identities”). The idea is pretty simple: associate an Azure AD security principal* to your Asp.Net Core Web App and let it use this ‘identity’ to authenticate to Azure Key Vault and pull secrets into memory at runtime. Microsoft provides the glue to make all of this easy for developers: on the programming side, they provide a simple library for your Asp.Net Core app to pull the secrets from Key Vault (as demonstrated here), and on the hosting side they implement the mechanisms that make the identity available to the app’s runtime via first-class support in Azure hosting environments and local development tools.

For those unfamiliar, a security principal is any kind of digital ‘identity’ that can be authenticated and configured with permissions that authorize it to access Azure resources. Examples include a user’s personal login, an AD group, or a service principal. Service principals, which are created through App Registrations in Azure, allow you to create a custom identity and credentials just for applications or other automated processes, so they can be granted access to the Azure resources they need to interact with.

So, what’s to be gained with this approach, and what are the tradeoffs? There are two audiences that have a stake in this:

  • The business stakeholders and security team that place a high priority on protecting applications and user data from exposure
  • The developers that just want to make sure they can stay productive and spend less time worrying about how configuration values are provided

I’ll address these groups and their distinct concerns separately.

The Security Perspective

There are numerous security benefits that come with this approach. Most critically, there are far fewer points of exposure for your secrets. The reliance on developers to do the right thing and manage secrets responsibly is almost entirely removed, to the point where developers would have to go out of their way to do the wrong thing. Another benefit is the administrative access control built into Key Vault, which makes it easy to manage who should and shouldn’t be able to run the app and access secrets.

We will start with how this approach limits the exposure of your secrets. Without managed identity and Asp.Net Core Key Vault configuration, you are directly responsible for making your secrets available to your app, whether it’s hosted or running locally. For a hosted app, for example one running in Azure App Service, this means configuring the PaaS App Settings or modifying the appsettings.json file that you deploy with your app binaries. The secrets must be put there by the process that regularly builds and deploys your application. That process also needs to store and retrieve these secrets from somewhere, which could be Key Vault, a release variable, or some other data store, maybe even just a VM or a user’s file system. Local development also widens the surface area of secret exposure. In the best case, you might pull the secrets onto the developer’s machine using a script that stores them in the dev’s file system, but too often people will take the path of least resistance and send them to each other over email or chat, or, even worse, hardcode them and commit them to source control.

In a managed identity world, the app simply reaches out to Key Vault for these secrets at runtime. This trims out several problematic points of exposure:

  1. No more accessing these credentials from the deployment pipeline where they might accidentally get captured in logs and build artifacts, and where they may be visible to those with permission to manage deployments.
  2. If a person is tasked with running your deployment scripts directly (to be clear – not ideal) they wouldn’t need access to app secrets to do a code deployment.
  3. No more storing these credentials in persistent storage of the app runtime host, where they can be inspected by anyone with management access to the host.
  4. No more spreading secrets across developers’ local devices, and no more insecure transmission of secrets on channels such as email or chat. It also makes it easy to avoid bad habits like hardcoding secrets into the app and checking them into source control.

Another benefit of this approach is that it doesn’t rely so heavily on developers and operations folks being mindful and responsible about security. Not only can they avoid insecurely distributing secrets amongst teammates, but they also don’t have to worry about removing them from their local machines or VMs when they no longer need them, because they are never stored. Of course, developers should always be mindful of and responsible for security, but realistically things don’t always work out that way. Developers frequently overlook security concerns while focusing on being productive, and often people are simply under-educated about security. Any opportunity to improve security via architecture and design, and to make humans less capable of doing the wrong thing, is a win.

Those with a focus on security will also appreciate the level of access control that is provided by Key Vault. Access to secrets is not managed via typical Azure RBAC (role-based access control). Instead, access policies are created to grant specific permissions for each user, service principal, or group. You can grant specific kinds of access, such as reading or editing/adding secrets. This can make Key Vault serve as a control center for deciding who should be allowed to run the app for a given environment. Adding a new team member or granting temporary access to debug a higher environment is as easy as adding a user to a Key Vault access policy that allows reading secrets only, and revoking access is as easy as removing them. See here for more info on securing access to Key Vault.

The Developer Perspective

Developers may have concerns that a centralized configuration approach could slow things down, but let’s look at why that doesn’t have to be the case, and why this can even improve velocity in many cases. As we’ll see, this can make it super easy to onboard new team members, debug multiple environments, regenerate keys due to recycling or resource recreation, and implement a deployment process.

We will start with onboarding. With your app configured to use managed identity and Key Vault authentication, onboarding a new team member to run and debug the app locally simply involves adding them to an access policy granting them permission to read secrets from the key vault. An even easier approach is to create an AD group for your developers and assign a single Key Vault access policy to the entire group. After that, they just need to log in to the subscription from their personal machine using Visual Studio or the Azure CLI. Visual Studio has this support integrated and applies it when you start your app from there, and the Azure CLI extends this support to any other IDE that runs the app using the dotnet CLI, such as VS Code. Once they have been granted authorization and logged in, they can simply start the app, which will retrieve the secrets from Key Vault using their permissions. If this team member eventually leaves the team, their access can be revoked by removing their access policy. They will have nothing to clean up because the secrets were never stored on their computer; they only lived in the app’s runtime memory.

Another benefit of centralizing secrets when using shared resources shows up in situations where secrets change often. Particularly in a development environment, you may have good reason to delete resources and redeploy them, for example, to test an infrastructure deployment process. When you do this, secrets and connection strings for your resources will change. If every developer had their own copy of the secrets on their machine, this kind of change would break everyone’s local environments and disrupt their work until they acquired all the latest secrets. In a managed identity scenario, this kind of change is seamless. The same benefit applies when new resources are added to your infrastructure. Dev team members don’t need to acquire the new connection secrets when they get the latest code that uses a new service; the app will just pull them from Key Vault.

Another time secrets may change is when they expire or when you intentionally rotate them for the sake of security. Using a key vault can make it significantly easier to implement a key rotation strategy. The Key Vault configuration provider can be configured to pull app secrets once at app start time (which is the default) or at a regular interval. Both can be part of a secret/key rotation strategy, but the first requires orchestrating an app restart after changing a secret, which isn’t necessary with the second approach. Implementing key rotation support in your app is fairly straightforward: most Azure resources provide two valid keys at a time to support rotation. You should store both keys for each service in Key Vault, but only use one of them in your app until it becomes invalid. Once your client hits an auth error, you should catch that exception, set the other key as the actively used key, and replay the request. With the second approach, configure the Key Vault config provider to refresh on an interval, maybe 5 or 10 minutes, and then have an external process (Azure Automation Runbooks are a recommended solution for this) reset only one key at a time. If both keys are cycled at the same time, your app config won’t refresh fast enough to get the new keys and will start to fail. By doing one at a time, you ensure that at least one valid key is available to your app at any given time.
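To make the "catch the auth error, switch keys, replay" idea concrete, here is a minimal, language-neutral sketch in Python. Everything in it is hypothetical and for illustration only: FakeService stands in for an Azure resource with two valid keys, AuthError stands in for whatever exception your real client raises, and RotatingClient is not part of any Azure library.

```python
from dataclasses import dataclass


class AuthError(Exception):
    """Raised by the (hypothetical) service when a key is rejected."""


@dataclass
class FakeService:
    """Stand-in for a resource that accepts a set of currently valid keys."""
    valid_keys: set

    def call(self, key: str, request: str) -> str:
        if key not in self.valid_keys:
            raise AuthError("key rejected")
        return f"ok:{request}"


@dataclass
class RotatingClient:
    """Uses one key until it is rejected, then switches to the other and replays."""
    service: FakeService
    keys: list        # [primary, secondary], as loaded from configuration
    active: int = 0   # index of the key currently in use

    def send(self, request: str) -> str:
        try:
            return self.service.call(self.keys[self.active], request)
        except AuthError:
            # Make the other key the active one and replay the request once.
            self.active = 1 - self.active
            return self.service.call(self.keys[self.active], request)


service = FakeService(valid_keys={"key-a", "key-b"})
client = RotatingClient(service, keys=["key-a", "key-b"])
first = client.send("read")      # served with key-a

service.valid_keys = {"key-b"}   # the rotation process resets key-a only
second = client.send("read")     # key-a is rejected, so the client falls back to key-b
```

If both keys were reset at once, the replay would fail as well, which is why the rotation process should only reset one key at a time.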

Another way that this can improve developer agility is that you can easily change the environment you target with a simple configuration change. For example, let’s say some pesky issue is popping up in your UAT environment that isn’t showing up anywhere else, and you’re tearing out your hair looking through logs trying to understand it. You’re at the point where you’d give your left foot to just run the app locally targeting that environment so you can attach a debugger and step through the problematic code. Without using managed identity and the key vault configuration provider you would have to copy the secrets for that environment to your local computer. This is gross enough that you should probably seek any other option before resorting to it. However, if you were using managed identity and key vault, you could simply reconfigure the name of the key vault you want your local app to use with the one for the target environment and create a temporary access policy for yourself. As a good practice, you should still revoke your access afterward, but at least you have nothing sensitive on your local device to clean up.

Finally, let’s talk about the benefits of using this approach from the perspective of building a deployment pipeline. Specifically, the benefit is that you have one fewer thing to implement and worry about. Since secrets are centralized in the key vault and pulled during app runtime, you don’t need to have your process pull in the secrets from wherever you store them, then pave them into an appsettings.json file, or assign them as PaaS-level environment variables. This saves you time not having to code this behavior, and it also saves you time when something breaks because there’s one fewer place where something could have gone wrong. Having your app go directly to key vault streamlines the configuration and creates fewer opportunities to break things. It also has the advantage that you don’t need to run a full app deployment just to update a secret.

Counter Arguments

This may sound good so far, but I suspect you may already have a few concerns brewing. Maybe you’re thinking some of the following: Do I have to start keeping all my configuration values in Key Vault? Doesn’t this represent additional configuration management overhead? Won’t I have conflicts with other team members if I need to change secret values to personalize my local environment? Doesn’t this create a hard dependency on an internet connection, meaning I won’t be able to run a local environment fully offline? All of these are valid questions, but I think you’ll see that they all have acceptable and satisfying answers.

So, does this mean that Key Vault needs to become the singular place for all app configurations, both secret and non-secret? If we only put secrets there, then don’t many of the above arguments about the benefits of centralization become moot, since we still need to do distributed config management for non-secret values? Azure’s answer to this question is Azure App Configuration, a centralized app configuration service that gives you a nice level of control over non-secret configuration settings for your app, including cool features like config value versioning and feature flags. I won’t go too deep into the details of this service here, but it’s worth noting that it also supports managed identity and can integrate with your app in the same way as Key Vault. However, I’ll also offer the suggestion that you can incorporate App Configuration on an as-needed basis. If you are dealing with a small app with fewer than 10 environment-specific settings, then you might enjoy the convenience of just consolidating all your secret and non-secret values into Key Vault. The choice comes down to preference, but keep in mind that if your settings change semi-often or you expect your app to keep adding new config settings, you may get tired of editing every config value using Key Vault’s interface. Key Vault is tailored for security, so it should generally be locked down as much as possible, and it doesn’t have all the features that App Configuration does.

Regarding configuration management overhead, the fact is that, yes, this does require creating/managing a Key Vault service and managing access policies for dev team members. This may sound like work you didn’t previously have, but I assure you this kind of setup and ownership is lightweight work that’s well worth the cost. Consider all the other complexities you get to give up in exchange: with centralized config management, you can now do code-only app deployments that can ignore configuration management entirely. That makes it faster and easier to create your deployment process, especially when you have multiple environments to target, and will give you high marks for security. As we also mentioned, centralizing these config settings makes it simpler to onboard new team members and possibly to iterate on shared infrastructure without breaking things for the team.

You may also be concerned that sharing your configuration source will result in a lot of stepping on toes with your team during development. But consider this: nothing is stopping you from using the same kind of local environment configuration approaches that developers already use in addition to Key Vault. Asp.Net Core’s configuration system is based on the idea of layering configuration providers in a stack, where the last-in wins. If you want to allow your developers to override specific values for development purposes, for example, to point at a personal database instance (maybe even a local database, like SQL Server or the Cosmos DB Emulator), you can still pass those as environment variables, in appsettings.Development.json, or as ‘dotnet user-secrets’. This doesn’t necessarily defeat the purpose of centralizing secret or config management. The benefits of centralization apply most to shared resources. If you want to use a personal resource, there’s no harm in personalizing your config locally. An alternate approach to personalization is to provision your own complete set of resources that make up an environment in Azure. Ideally, you already have a script or template to create a new environment easily (and if you don’t, I strongly recommend creating one). In that case, you’ll get your own Key Vault as well, and you can simply point your local app at it.
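The "last-in wins" layering can be illustrated with a small Python sketch. This is only a model of the idea, not Asp.Net Core’s actual implementation; the build_config helper, the setting names, and the values are all made up for the example.

```python
from collections import ChainMap


def build_config(*providers: dict) -> ChainMap:
    """Layer providers so that later ones override earlier ones (last-in wins)."""
    # ChainMap searches its maps left to right, so reverse the registration order.
    return ChainMap(*reversed(providers))


# Shared values, as they might come from the team's Key Vault (hypothetical names).
key_vault = {
    "Db:ConnectionString": "shared-dev-db",
    "Storage:Key": "shared-storage-key",
}

# A developer's personal override, e.g. from 'dotnet user-secrets'.
user_secrets = {"Db:ConnectionString": "localhost-personal-db"}

# Key Vault is registered first, the local override last.
config = build_config(key_vault, user_secrets)
```

Here config["Db:ConnectionString"] resolves to the personal value, while config["Storage:Key"] still falls through to the shared one, so a developer only overrides what they actually need to personalize.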

Lastly, I’d like to address the question of whether this makes it impossible to do fully offline local development. There are a couple of considerations here:

  1. How to target local services instead of live-hosted ones
  2. Overcoming the fact that the Key Vault configuration provider relies on an internet connection.

The first is handled the same way you would handle configuration personalization, by overriding any config settings in something like appsettings.Development.json or ‘dotnet user-secrets’ to target your local database or Azure service emulator. The second is relatively simple: just put the line of code that configures Key Vault as a config provider within an ‘if’ condition that checks whether you are running in a development environment (see a sample approach here). This assumes that Key Vault is truly your only remaining dependency on an internet connection. If it seems strange to hear me recommend disabling Key Vault after advocating for it, consider again that the benefits of centralized configuration apply most to using shared resources, so if you are designing to support an entirely local development environment, then using Key Vault becomes unnecessary when running in that mode.
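As a rough sketch of that ‘if’ condition, the Python below decides which configuration sources to register. The provider_names function, the offline flag, and the ordering of the sources are assumptions made for illustration; your own startup code may use a different check or a different order.

```python
def provider_names(environment: str, offline: bool) -> list:
    """Return configuration source names in registration order (later entries win)."""
    names = ["appsettings.json"]
    # Skip Key Vault only when running a fully offline local development setup.
    if not (environment == "Development" and offline):
        names.append("key-vault")
    # Environment-specific overrides are registered last so they can win.
    names.append(f"appsettings.{environment}.json")
    return names


offline_dev = provider_names("Development", offline=True)
online_dev = provider_names("Development", offline=False)
production = provider_names("Production", offline=False)
```

With this shape, an offline developer never touches Key Vault, while every other mode keeps it in the stack.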

Using centralized configuration services like Key Vault via managed identity requires a different mindset for developers, but it comes with clear advantages, especially when it comes to limiting the exposure of your application secrets. This kind of solution is an absolute win from a security perspective, and it has the potential to considerably improve your development team’s experience as well. Due to Asp.Net Core’s pluggable configuration system, it’s easy to apply to existing projects, especially if you’re already storing secrets in Key Vault, so consider looking at how you could incorporate it into your existing projects today, and don’t miss out on the chance to try it in your next greenfield project. Your security advocates and fellow developers just might thank you.

If you are like me, you have used cloud services in a limited fashion to create VMs for testing, or perhaps you have used them extensively. You’d also like to gain an understanding of the broader group of services offered by cloud providers. In my situation, this was due to the recent attainment of an Engagement Manager position and my desire to help AIS expand our business through the development of new opportunities. I realized that I needed at least a top-level understanding of our offerings in order to recognize potential use cases AIS could present to solve problems, find more cost-effective options to current solutions, and develop completely new solutions to improve client business. It was obvious to start with Microsoft’s Azure and Amazon’s AWS platforms, since these are the top focus of AIS and the industry as a whole.

What was not obvious was where to start. Both platforms are not only extremely broad but also moving targets. I needed to find a way to dip into this process without drowning in all the information, while still handling the responsibilities of my day job. I looked at classroom training options, YouTube videos, and continued researching until I stumbled upon two paths. These paths not only provided a nice prepackaged set of materials that I could complete at my own pace, at home, but they also resulted in certifications. I will get to the details, but first a word about certifications.

I am sure many of you will be rolling your eyes when you read the “certifications” aspect of that second to the last sentence. Yes, certifications are not as valuable an indicator of a person’s skills and knowledge in an area as real-world experience. However, they provide the following benefits in order of least to most important:

  1. Provide a good starting point for someone that has no current projects in an area.
  2. Fill knowledge gaps that even a person with experience in an area has, especially in those services or techniques that are not used often.
  3. Provide value to AIS in maintaining various statuses.
  4. Provide a potential client with proof that you at least have an understanding of the basics.
  5. Most importantly, they result in a $500 bonus from AIS, and reimbursement of testing and training costs!

The paths I found are the Microsoft Azure Fundamentals learning path and certification and the Amazon AWS Cloud Practitioner training and certification. The training for both of these includes videos with the Azure path including an estimated ten hours of content and the AWS training about five hours. The Azure path estimates were spot on, and the AWS training took a bit longer, due to my complete lack of experience with the platform.

Microsoft Azure Fundamentals

This path included videos, reading, hands-on experience, and quick knowledge checks. It can be completed with an Azure account that you create just for the training or an account linked to the AIS subscription if you have one. Both the reading and videos provide just enough information without getting bogged down in the minutiae. The only thing I had done with Azure prior to the training was create a few VMs to set up SharePoint environments. I had done that years ago, and I didn’t do much within those environments.

For me, most of the content was new. I believe if I had a more in-depth experience, the training would have filled in gaps with specific details.

These were the topics I found either completely new or helpful in understanding how to look at and/or pitch Azure services to clients:

  • Containers, app services, and serverless options and how they work
  • Reducing latency with Traffic Manager
  • Azure policies and tags to enforce standards
  • Review of data centers, region pairs, geographies, availability zones
  • Various ways to predict and manage costs, such as calculators, Cost Manager, and Azure Advisor

The training took me probably two-thirds of the estimated time, after which I went through the knowledge checks for each section once more. After that, I spent maybe an hour reviewing some things from the beginning. From there, I took an exam and passed. The exam process was interesting and can be done from home with some software that enables someone to watch you. Prior to the exam, you are required to show the person the entire room and fix anything that might enable you to cheat.

After I completed the certification process, I submitted the cost of the exam ($100) as an expense and submitted my request for a certification bonus. I received both in a timely manner. See links at the end of this post for materials concerning reimbursements and bonuses. Don’t forget to get approval from your EM/AE prior to incurring any costs for which you might want reimbursement, and to submit your updated certifications spreadsheet to the AIS PI Team.

AWS Cloud Practitioner

This path exclusively contains videos. In my opinion, the content is not as straightforward as the Azure Fundamentals content, and the videos cannot be sped up, which can be very frustrating. The actual content was a bit difficult to find. I have provided links at the conclusion of this post for quick reference. Much of the video content involves Linux examples, so PuTTY and other command-line tools were used. This added a further layer of complexity that I felt took away from the actual content (do I really need to know how to SSH into something to learn about the service?).

As far as content, everything is video; there is no reading, and there are no hands-on examples or knowledge checks. I felt the reading in the Azure path broke things up. The hands-on exercises crystallized a few things for me, and the knowledge checks ensured I was tracking. I would like to see Amazon add some of these things. That being said, the videos are professionally done and include helpful graphics. Even with zero AWS experience, I found that I was able to grasp the concepts, and the videos do a decent job of presenting use cases for each service.

My biggest complaint is the inability to speed up videos that are obviously paced for the lowest common denominator, and I found my admittedly ADD attention waning often. Something I found that helps is taking notes. This allowed me to listen and write without getting bored.

Amazon provides a list of recommended prep (see links below) that includes self-paced training, a one-day classroom option, an exam guide, a list of four base whitepapers and links to many others, practice exams, and a link to schedule the certification exam. I scanned the whitepapers. They all looked useful, but not necessary to knock out the exam. I say this with confidence as I was able to pass the exam without a detailed review of the whitepapers. My technique was to outline the videos, then review my outlines over the course of a couple of weeks.

Summary

Whether you are a budding developer or analyst wishing to get a broad overview, a senior developer that wants to fill gaps, or a new EM like me who wants a bit of both, the Microsoft Azure Fundamentals learning track/certification and Amazon AWS Cloud Practitioner training/certification are good places to start. AIS will cover any costs and provide you with some additional scratch for your effort. Obtaining these certifications also improves AIS standings with providers, clients, and the community as a whole. It also greatly improves your value to clients, meets the criteria of certain AIS Career Paths and Competencies, and who knows, you might learn something!

Links:

  1. Azure Fundamentals Learning Path: https://docs.microsoft.com/en-us/learn/paths/azure-fundamentals/
  2. Azure Fundamentals Cert: AZ900 Microsoft Azure Fundamentals Exam.
  3. AWS Cloud Practitioner Preparation and Cert:  AWS Recommended Prep
  4. Training Reimbursements and Certification Bonuses: https://appliedis.sharepoint.com/sites/HR/Pages/Additional-benefits.aspx
  5. Submitting certification inventory to PI Team: Reach out to AIS – Process Improvement Team (ais-pi-team@appliedis.com) for more info.

While checking out one of the automated messengers a coworker created, we had an idea. Why not use Azure to help with or streamline routine daily tasks? The Logic Apps listed here take about 15-20 minutes at most to create and are ordered from easiest to hardest to set up. For each app, what you will need is listed first, followed by the steps. Keep in mind that while Logic Apps are available on Azure Gov, you might need to talk with a supervisor before implementing these there.

Completely new to Logic Apps? Click here to create your first one!

SharePoint Item Tracker

What you’ll need:

  • Office 365 account with Teams enabled
  • SharePoint access to the desired list

The Completed App looks like:

Completed App

Steps:

  1. Start with a blank Logic App.
  2. Select the SharePoint Trigger “When an item is created”. This will require you to sign in with your Office 365 credentials.
  3. You’ll see a box like this. Rename the title by clicking the three dots in the top right corner.
    When An Item is Created
  4. Select the Site Address of the SharePoint site with the desired list. The dropdown will be populated with all the available sites on the SharePoint domain. Select the List Name from the dropdown of SharePoint lists. If the list you’re looking for isn’t there, it might be hiding on a different site. Set the Interval and Frequency to “1” and “Day”, respectively.
  5. Add a new action to the Logic App and find the action “Post your own adaptive card as the Flow bot to a channel” (As of this writing, this action is still in preview). After signing in again to your Office 365 account, you’ll see this box. (Again, the dots in the top right corner will allow you to change the name)
    Notify the Channel
  6. Add the Team ID and Channel ID. These should correspond with where you want to send the notification to.
  7. In the Message Field, you can add the message to send to the channel. You also have the option of adding Dynamic Content, which can add the Name of the item, a link to the Item, or other various properties.
  8. Hit Save in the top left to save the Logic App.

Twitter Tracker

What you’ll need:

  • Office 365 account
  • Twitter account

The Completed App looks like:

Twitter Tracker

Steps:

  1. From the empty Logic App, scroll down to find the “Email yourself about new Tweets about a specific keyword via Office365” template.
  2. Click “Use this template”
  3. You’ll be asked to connect to the following: Twitter, Office 365 Users, and Office 365 Outlook.
  4. Click continue and include the desired keyword.
  5. Save the logic app by clicking the Save button in the upper right corner.

Event Time-boxer

This is a useful logic app to keep two different calendars updated with each other.

What you’ll need:

  • Office 365 account
  • The email address to forward events to

The Completed App looks like:

Event Time-Boxer

Steps:

  1. From the templates page, select “Empty Logic App”
  2. In the search box type “event is created”, and scroll down to select the one from Outlook.com
  3. You’ll be prompted to connect an Outlook account to the Logic App.
  4. Select the preferred calendar to check for events and set the Interval to 1 day.
    When a New Event is Created
  5. Add a new step, and in the search box, type “Condition”, and scroll down to the Control “Condition” action.
    Check if Event is New
  6. Under the condition, choose the Subject from the dynamic options. For the condition dropdown, select Contains, and in the right text field, type in “Timebox”.
    Timebox
  7. Under the “false” condition, add a new step. Look for the “Create Event” action from Outlook.com. You might be told to authenticate again.
  8. Select a Calendar for the event and set the Start time and End time to those of the trigger event. The subject should be “Timebox:” followed by the trigger event’s subject. This way, the newly created event won’t trigger another copy, because of our previous condition.
  9. Add the Required attendees field and include the email address you want to forward events to.
    Create Event V2
  10. Save the logic app.
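
The condition in steps 5 through 9 is easier to reason about when written out as plain logic. The Python below is only a sketch of what the Logic App does, not something you deploy; the timebox_event function and the dictionary field names are invented for illustration.

```python
from typing import Optional


def timebox_event(event: dict, forward_to: str) -> Optional[dict]:
    """Return the event to create on the other calendar, or None to skip."""
    # The "true" branch of the condition: the event is already a time-box copy.
    if "Timebox" in event["subject"]:
        return None
    # The "false" branch: create a copy with the same times and a marked subject.
    return {
        "subject": f"Timebox: {event['subject']}",
        "start": event["start"],
        "end": event["end"],
        "required_attendees": [forward_to],
    }


original = {"subject": "Sprint review", "start": "09:00", "end": "10:00"}
copy = timebox_event(original, "me@example.com")
# Running the copy through the same logic shows why the loop stops.
copy_of_copy = timebox_event(copy, "me@example.com")
```

Because the created event carries “Timebox” in its subject, the second pass takes the skip branch, which is what keeps the two calendars from echoing events back and forth.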

Honorable Mention: Traffic Sensor

So, there is one more use for Logic Apps that we didn’t cover. Microsoft has a cool tutorial for creating a Traffic Checker, which checks the traffic in the morning and sends an email based on the result. You can find it here.

There are a ton of different connectors for Logic Apps, so there are many more ideas out there than the ones listed.

Over the last couple of years, I’ve moved from serious SharePoint on-premises development to migrating web applications to Azure. My exposure to Azure prior to the first application was standing up SharePoint development virtual machines that pretty much just used the base settings. Trial by fire tends to be the way I learn things best, which is good because not only did I have to learn about Azure resources, but I also had to refamiliarize myself with web development practices that I hadn’t worked with since .Net 2.0 or earlier.

One of my first tasks, and the focus of this post, was to migrate the SQL Server database into an Azure SQL (Platform as a Service) instance. The databases generally migrated well. Schemas and the majority of the stored procedures were compatible with Azure SQL. Microsoft provides a program called Data Migration Assistant, which can both analyze and migrate the database. The analysis returns any suggested or required changes. Several types of issues can arise, including deprecated features, incompatible features, and syntax blockers.

Deprecated Field Types

There are three field types that will be removed from future versions: ntext, text, and image. This won’t hold up a migration, but I did change the fields to future-proof the database. The suggested replacements are nvarchar(max), varchar(max), and varbinary(max), respectively.

Azure SQL Database

T-SQL Concerns

I ran into several issues in scripts. Some had simple solutions while others required some redesign.

  • ISSUE: Scripts can’t connect to multiple databases using the USE statement.
  • SOLUTION: The scripts that used the USE statement didn’t actually need it, because each one only ever connected to a single database.
  • ISSUE: System stored procedures such as sp_send_dbmail are not supported. The scripts sent notification emails directly from the database rather than from the application.
  • SOLUTION: This required an involved redesign. I created a table that logged the email and a scheduled Azure Web Job to send the email and flag the record as sent. The reference to sp_send_dbmail was replaced with an INSERT INTO Email command. If the email needed to be sent near-instantly, I could have used an Azure Queue and had the Web Job listen to the Queue then send the email when a new one arrives.
  • ISSUE: Cross-database queries are not supported. A couple of the apps had multiple databases and data from one was needed to add or update records in the other.
  • SOLUTION: This was another major design change. I had to move that logic into the application’s business layer.
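
The email redesign described above (log the email to a table, then let a scheduled job send it and flag it as sent) can be sketched as follows. This is a minimal, illustrative model in Python using an in-memory SQLite database rather than Azure SQL and a WebJob. The column names, the `queue_email` and `send_pending` helpers, and the `send` callable are assumptions made for the example, not the original implementation.

```python
import sqlite3


def queue_email(conn, recipient, subject, body):
    """Stand-in for the INSERT INTO Email command that replaced sp_send_dbmail."""
    conn.execute(
        "INSERT INTO Email (recipient, subject, body, sent) VALUES (?, ?, ?, 0)",
        (recipient, subject, body),
    )
    conn.commit()


def send_pending(conn, send):
    """Body of the scheduled job: send each unsent email, then flag it as sent."""
    rows = conn.execute(
        "SELECT id, recipient, subject, body FROM Email WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for email_id, recipient, subject, body in rows:
        send(recipient, subject, body)  # hypothetical mail-sending callable
        conn.execute("UPDATE Email SET sent = 1 WHERE id = ?", (email_id,))
        conn.commit()  # commit per email so a crash does not resend earlier ones
    return len(rows)


# Demo with an in-memory database and a fake sender that records messages.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Email ("
    "id INTEGER PRIMARY KEY, recipient TEXT, subject TEXT, body TEXT, "
    "sent INTEGER NOT NULL DEFAULT 0)"
)
outbox = []
queue_email(conn, "ops@example.com", "Nightly report", "All jobs finished.")
sent_first = send_pending(conn, lambda *msg: outbox.append(msg))
sent_second = send_pending(conn, lambda *msg: outbox.append(msg))
```

The second run finds nothing to send because the first run flagged the record, which is what makes it safe to run the job on a timer.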

Linked Servers

Some of the applications accessed other databases, including other platforms such as Oracle. They used Linked Server connections to execute queries against those data sources. So far, these have all been read-only connections.

Linked Servers are not available in Azure SQL. To keep the functionality, we had to modify the data layer of the application to pull the resources from the external database and either:
A. Pass the records into the SQL Stored Procedure, or
B. Move the logic of the Stored Procedure into the application’s data layer and make the changes in the application

One of the applications had a requirement that the external database stay on-prem. This added an extra layer of complexity because solutions that create a connection back to the database, like Azure ExpressRoute, were not available or approved for the client. Another team was tasked with implementing a gateway: a web service that the Azure application would call to reach the on-prem database.

SQL Agent Jobs

SQL Agent Jobs allow for out-of-process data manipulation. A couple of the applications used these to send notification emails at night or to synchronize data from another source. While there are several options in Azure for recreating this functionality, such as Azure Functions and Logic Apps, we chose to use WebJobs. WebJobs can be triggered in several ways, including by a timer. The jobs didn’t require intensive compute resources, so they could share resources with the application in the same Azure App Service. This simplified the deployment story because the jobs and the application could be packaged and deployed together.

Database modifications tend to be one of the major parts of the migration project. Some of the projects have been simple T-SQL changes while others have needed heavy architectural changes to reproduce functionality in a PaaS environment. Despite the difficulties, there will be major cost savings for some of the clients because they no longer need to maintain an expensive, possibly underutilized, server. Future posts in this series will cover Automation & Deployment, Session State, Caching, Transient Fault Handling, and general Azure lessons learned.

The data lake has become a mainstay in data analytics architectures. By storing data in its native format, it allows organizations to defer the effort of structuring and organizing data upfront. This promotes data collection and serves as a rich platform for data analytics. Most data lakes are also backed by a distributed file system that enables massively parallel processing (MPP) and scales with even the largest of data sets. However, the increase in data privacy regulations and governance demands requires a new strategy. Simple tasks such as finding, updating, or deleting a record in a data lake can be difficult. They require an understanding of the data and typically involve an inefficient process that includes rewriting the entire data set. This can lead to resource contention and interruptions in critical analytics workloads.

Apache Spark has become one of the most adopted data analytics platforms. Earlier this year, the largest contributor, Databricks, open-sourced a library called Delta Lake. Delta Lake solves the problem of resource contention and interruption by creating an optimized ACID-compliant storage repository that is fully compatible with the Spark API and sits on top of your existing data lake. Files are stored in Parquet format which makes them portable to other analytics workloads. Optimizations like partitioning, caching and data skipping are built-in so additional performance gains will be realized over native formats.

Delta Lake is not intended to replace a traditional domain-modeled data warehouse. Rather, it is intended as an intermediate step to loosely structure and collect data. The schema can remain the same as the source system, and personally identifiable data like email addresses, phone numbers, or customer IDs can easily be found and modified. Another important Delta Lake capability is Spark Structured Streaming support for both ingest and data changes. This creates a unified ETL for both stream and batch while helping promote data quality.

Data Lake Lifecycle

  1. Ingest data directly from the source or from a temporary storage location (Azure Blob Storage with Lifecycle Management).
  2. Use Spark Structured Streaming or scheduled jobs to load data into Delta Lake table(s).
  3. Maintain data in the Delta Lake tables to keep the data lake in compliance with data regulations.
  4. Perform analytics on files stored in the data lake, using either Delta Lake tables in Spark or the Parquet files after they have been put in a consistent state with the `VACUUM` command.

Data Ingestion and Retention

The idea behind data retention is to establish policies that ensure data that cannot be retained is automatically removed as part of the process.
By default, Delta Lake stores a change history of all data modifications. Two table properties control how long that history is kept: `delta.logRetentionDuration` (default interval 30 days) and `delta.deletedFileRetentionDuration` (default interval 1 week).

%sql
ALTER TABLE table_name SET TBLPROPERTIES ('delta.logRetentionDuration'='interval 240 hours', 'delta.deletedFileRetentionDuration'='interval 1 hours');
SHOW TBLPROPERTIES table_name;

Load Data in Delta Lake

The key to Delta Lake is a SQL-style `MERGE` statement that is optimized to modify only the affected files. This eliminates the need to reprocess and rewrite the entire data set.

%sql
MERGE INTO customers
USING updates
ON customers.customerId = updates.customerId
WHEN MATCHED THEN
      UPDATE SET email_address = updates.email_address
WHEN NOT MATCHED THEN
      INSERT (customerId, email_address) VALUES (updates.customerId, updates.email_address)
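
To make the idea of modifying only the affected files more concrete, here is a toy model in Python. It is not Delta Lake’s implementation or API: the `files` dictionary, the `merge_upsert` function, and the `part-NNNN` file names are hypothetical stand-ins, and the sketch assumes each customerId appears at most once.

```python
def merge_upsert(files, updates, key="customerId"):
    """Toy MERGE: rewrite only files that hold a matched key, add unmatched rows as a new file."""
    incoming = {row[key]: row for row in updates}
    matched = set()
    rewritten = []
    for name, rows in files.items():
        hits = incoming.keys() & {row[key] for row in rows}
        if not hits:
            continue  # no matching keys in this file, so it is left untouched
        # WHEN MATCHED: overwrite the matched rows' columns with the incoming values
        files[name] = [
            {**row, **incoming[row[key]]} if row[key] in hits else row for row in rows
        ]
        matched |= hits
        rewritten.append(name)
    # WHEN NOT MATCHED: insert the remaining rows as a new file
    inserts = [row for k, row in incoming.items() if k not in matched]
    if inserts:
        new_name = f"part-{len(files):04d}"
        files[new_name] = inserts
        rewritten.append(new_name)
    return rewritten


files = {
    "part-0000": [
        {"customerId": 1, "email_address": "a@old.example"},
        {"customerId": 2, "email_address": "b@old.example"},
    ],
    "part-0001": [{"customerId": 3, "email_address": "c@old.example"}],
}
updates = [
    {"customerId": 2, "email_address": "b@new.example"},
    {"customerId": 4, "email_address": "d@new.example"},
]
rewritten = merge_upsert(files, updates)
```

In this model, `part-0001` is never touched because none of its keys appear in `updates`, which is the property that lets a merge avoid rewriting the whole data set.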

Maintain Data in Delta Lake

Just as data can be updated or inserted, it can be deleted as well. For example, if a list of opted-out customers is maintained in an opted_out_customers table, matching data in related tables can be purged.

%sql
MERGE INTO customers
USING opted_out_customers
ON opted_out_customers.customerId = customers.customerId
WHEN MATCHED THEN DELETE

Summary

In summary, Databricks Delta Lake enables organizations to continue to store data in data lakes even if it’s subject to privacy and data regulations. With Delta Lake’s performance optimizations and open Parquet storage format, data can be easily modified and accessed using familiar code and tooling. For more information, including Databricks Delta Lake and Python syntax references and examples, see the documentation: https://docs.databricks.com/delta/index.html

Once you’ve decided to instrument your ASP.NET Core application with Application Insights, you may be looking for how to anonymize or customize the data that is being sent to Application Insights. For details on why you should be using Application Insights and how to get started, please reference my previous post in this blog series.

Two things to consider with telemetry:

  1. Adding additional telemetry that is not recorded by Application Insights.
  2. Removing or anonymizing data that is already being sent to Application Insights.

A later post in this series will discuss how to add new telemetry; this post focuses on anonymizing or removing data.

Personally Identifiable Information (PII) Already in Application Insights Account

We’ll start with a likely scenario: during an audit or during testing, you discover that you are logging PII to your Application Insights account. How can you go about fixing that? The short answer is to delete the entire Application Insights resource. That means you’ll lose access to all historical telemetry that was in the account, and your production system will no longer be logging telemetry anywhere unless you create a new account and update your production system with the new telemetry key. However, this does solve your immediate problem of retaining PII. See the Microsoft documentation for details on what is captured by Application Insights and how it’s transmitted and stored.

Application Insights does provide a PURGE endpoint, but requests are not processed in a timely manner, the endpoint isn’t well documented, and it will not properly update metrics to account for the data that was purged. In general, if you have a compliance concern, delete the Application Insights account. Remember, Application Insights is designed to be a highly available, high-performance telemetry platform, which means it is designed around being an append-only system. The best solution is simply not to send data to the platform that you shouldn’t.

API Use Case

Think of an API your business may have built. This API allows us to search for customers by e-mail to find their customer id. Once we have the customer id, we can make updates to their record such as their first name, birthday, etc. By default, Application Insights records the request URL and the response code. It does NOT record any HTTP headers, the request body, or the response body by default. First, let’s think about how we might design the search endpoint. We have two approaches:

  1. GET /api/customer/search?emailAddress=test@appliedis.com
  2. POST /api/customer/search
    a. In the request body, send JSON:
    { "emailAddress": "test@appliedis.com" }

If we design the API using approach #1, we will be sending PII to Application Insights by default, because the e-mail address is part of the recorded URL. If we design the API using approach #2, no PII is sent to Application Insights by default. Always try to keep your URLs free of any PII.
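
One way to guard against approach #1 slipping into an API is a small check that flags e-mail addresses in a URL, for example in a unit test over sample routes. The sketch below is in Python purely for illustration. The `emails_in_url` helper and its regular expression are assumptions for this example, and the pattern is deliberately simple rather than a full e-mail validator.

```python
import re
from urllib.parse import unquote, urlsplit

# Deliberately simple pattern: something@something.tld, with no URL delimiters inside.
EMAIL = re.compile(r"[^@\s/?&=]+@[^@\s/?&=]+\.[A-Za-z]{2,}")


def emails_in_url(url):
    """Return any e-mail-like strings found in the path or query string of a URL."""
    parts = urlsplit(url)
    # Decode percent-encoding first so that %40 is seen as "@".
    text = unquote(parts.path) + "?" + unquote(parts.query)
    return EMAIL.findall(text)


approach_1 = emails_in_url("/api/customer/search?emailAddress=test@appliedis.com")
approach_1_encoded = emails_in_url("/api/customer/search?emailAddress=test%40appliedis.com")
approach_2 = emails_in_url("/api/customer/search")
```

Here the first two calls report the address that would end up in the recorded URL, while the POST-style URL from approach #2 comes back clean.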

That may not always be possible, so let’s look at another use case. Let’s say we have the primary key of a customer record and we want to view and make edits to that record. What would the endpoints look like?

  1. GET /api/customer/9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6
  2. PUT /api/customer/9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6

Now, depending on your regulatory environment, logging these URLs to Application Insights might present a problem. Notice that we are not logging e-mail addresses, phone numbers, or names; we are logging behavior about an individual, such as when the site was accessed and when their profile was updated. To avoid this, we would like to anonymize the URL data that is being sent to Application Insights.

Anonymize Data Sent to Application Insights

This section assumes you are using ASP.NET Core and have already configured Application Insights; see my previous blog post for details. Also, if you need to troubleshoot your configuration or verify it’s working as expected, please see my other blog post for details.

The Application Insights NuGet package provides an interface for exactly this purpose called ITelemetryProcessor. You simply need to implement the interface and its Process method. A telemetry processor acts much like ASP.NET Core middleware, in that there is a chain of telemetry processors. Your implementation must provide a constructor that accepts an ITelemetryProcessor, which is the next processor in the chain. The last processors in the chain are the ones provided by the NuGet package; they implement the same interface and send the telemetry over the wire to the actual Application Insights service in Azure.

The Process method receives a single argument, ITelemetry. You can cast that to one of its concrete types, e.g. DependencyTelemetry, RequestTelemetry, etc., and then mutate the telemetry in whatever way you need to, e.g. to anonymize data. You are then responsible for calling the Process method on the next telemetry processor in the chain, i.e. the one that was provided to the constructor of your class. If you want a given telemetry item to never be sent to Application Insights, simply omit that call.

Now we will look at the source code for a processor that does what we are proposing. It looks for anything that resembles a customer id in the URL of RequestTelemetry and replaces it with the word “customerid”.
[Image: RequestTelemetry]

As seen above, in the constructor we receive the next telemetry processor in the chain. In the Process method, we check whether we have RequestTelemetry, ignoring all other telemetry types, like TraceTelemetry or DependencyTelemetry. RequestTelemetry has .Name and .Url properties, both of which might contain the URL details that hold our PII. We use a regular expression to see if either contains a customer id and, if so, replace it with the word “customerid”. Finally, we always call the next telemetry processor in the chain so that the modified telemetry is sent to Application Insights.
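
The same chain can also be sketched as a pattern in Python. This is not the Application Insights SDK: the `RequestTelemetry`, `TraceTelemetry`, `Sink`, and `CustomerIdAnonymizer` classes are simplified stand-ins, and the GUID pattern assumes customer ids look like the ones in the example URLs earlier in this post.

```python
import re
from dataclasses import dataclass

# Assumption: customer ids are GUIDs, as in the example endpoints.
GUID = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


@dataclass
class RequestTelemetry:
    name: str
    url: str


@dataclass
class TraceTelemetry:
    message: str


class Sink:
    """Stands in for the final processor that would transmit telemetry."""

    def __init__(self):
        self.sent = []

    def process(self, item):
        self.sent.append(item)


class CustomerIdAnonymizer:
    """Rewrites customer ids in request telemetry, then hands off to the next processor."""

    def __init__(self, next_processor):
        self._next = next_processor

    def process(self, item):
        if isinstance(item, RequestTelemetry):  # other telemetry types pass through unchanged
            item.name = GUID.sub("customerid", item.name)
            item.url = GUID.sub("customerid", item.url)
        self._next.process(item)  # always continue the chain


sink = Sink()
chain = CustomerIdAnonymizer(sink)
chain.process(
    RequestTelemetry(
        name="GET /api/customer/9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6",
        url="https://localhost:5001/api/customer/9b02dd9d-0afd-4d06-aaf1-c34d3c051ec6",
    )
)
chain.process(TraceTelemetry(message="started"))
```

Skipping the final `self._next.process(item)` call for an item is how this pattern would drop that item entirely instead of anonymizing it.
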
Remember, just having this ITelemetryProcessor coded won’t do anything. We still need to register it with the Application Insights library. In Startup.cs, you will need to add a line to the ConfigureServices method, as shown below:

[Image: Add a Line to the ConfigureServices Method]

Now that you’ve registered the processor, it’s just a matter of deploying the code. If you wanted to debug this locally while still sending telemetry live to Application Insights in Azure, please see my blog post with details on how to do that.

Summary

You learned about the ITelemetryProcessor extension point in the Application Insights NuGet package and how to use that extension point to prevent PII data from being logged to your Application Insights account. We’ve discussed how to design your API endpoints so that they keep PII out of URLs by default, so that hopefully you don’t need to customize your telemetry configuration. Lastly, I have shown you how to delete PII data that may have been logged accidentally to your production Application Insights account. You may also want to take advantage of a relatively new feature of Application Insights to set the retention period of logs. Previously it was fixed at 90 days, but now you can configure it to as low as 30 days. This may be useful for handling “right to be forgotten” requests if your logging system is storing PII and you need to ensure it is removed within 30 days of being requested. Next in this blog series, we will discuss logging additional telemetry to create an overall better picture of your system’s health and performance.

Once you’ve decided to instrument your ASP.NET Core application with Application Insights, you may be looking for a quick way to troubleshoot telemetry configuration. For details on why you should be using Application Insights and how to get started, please reference my previous post in this blog series. How do you go about testing your telemetry configuration? Typically, developers would adjust the application and then deploy to Azure, ideally to a development environment in Azure. However, making a change to an application, building, publishing to Azure, testing the given endpoints, and waiting for telemetry to appear in Application Insights can require upwards of 30 minutes per change, assuming that you know what you are doing and don’t make any mistakes. Is there a way to work locally?

Troubleshooting Locally

Let me start by discussing the end goal. We want to simulate our production environment while running locally on our dev machine. Azure typically provides local emulators for services like Storage and Cosmos DB, but it does not provide an Application Insights emulator. So while we run locally to simulate production, we will need to send data to an actual Azure Application Insights account. In order to simulate running in production, we should publish our .NET application in Release mode and start it outside of Visual Studio. One reason for starting the application outside Visual Studio is that our production environment will not have Visual Studio installed. Another is that Visual Studio includes a Diagnostics panel that captures the Application Insights telemetry and prevents it from being sent to the Azure Application Insights account. I’d like to emphasize that the Diagnostics panel built into Visual Studio is not an emulator and shouldn’t be used for that purpose.

First, we must publish the application in Release mode. You can do that using the dotnet command-line as shown below.

[Image: Publish in Release Mode]

This will publish to a directory similar to the one below:

[Image: Publish to Directory]

Once in the directory where the build artifacts are, we should find both appsettings.json and the .dll for our main application, CustomerApi.dll in my case. From the command line, we can then run Kestrel directly using the following command:

[Image: Run Kestrel Directly]

If using the defaults, your application will now be running and available in a browser at either http://localhost:5000/ or https://localhost:5001/. We are likely still missing one step, which is configuring the telemetry key for Application Insights. In the bin\Release\netcoreapp3.0\ folder, locate the appsettings.json. Open the file and put the telemetry key in it.
[Image: Configuring the Telemetry Key]

If you go back to the command-line you can press Ctrl+C to exit the running web application and then re-run the dotnet CustomerApi.dll command to restart the application. We now have an application running locally that is sending telemetry to Application Insights in Azure.

View Live Telemetry from Local Machine

In the Azure portal, open the Application Insights resource and then locate the “Live Metrics Stream” blade.
[Image: Live Metrics Stream]

The Live Metrics panel should open and connect as long as the application is running locally using “dotnet CustomerApi.dll”. Once open, scroll to the bottom of the pane.

[Image: Bottom of Pane]

At the bottom, you will see a list of connected servers. In my example below, you see two servers. The one highlighted in red is my local developer machine. The other server is the Azure App Service that I have running in my development environment in Azure.

[Image: Developer and Azure Application Server]

To quickly recap: we have our application running locally outside Visual Studio from the command line, and in Azure Application Insights we can see that our local machine is connected to Live Metrics. In order to actually see telemetry flow into this panel, you will likely want to make one other change. In the upper-right, click on the filter icon to adjust the live metrics filters.

[Image: Telemetry Sample]

You will then be prompted with the following dialog. If you trust the servers, you can safely ignore it.

[Image: Authorize Connected Servers]

You will then see a dialog with the current filters. Notice that the default configuration will only show failed requests and dependency calls. Since we are troubleshooting it’s likely you will want to see all requests. Feel free to click the “x” to remove both filters and then the “Save” button.

[Image: Current Filter Dialog]

Once you have completed this step, you can go back to your web browser on your local machine, either http://localhost:5000/ or https://localhost:5001/ and then make a request to your API. I tried a URL I know returns a 404 response. You can see the live telemetry that showed up for me:

[Image: Sample Telemetry]

Then, click on that row for more details about the given telemetry item. This telemetry is also being logged to Application Insights, and you will be able to see it on all the usual dashboards and search for it using a Log Analytics query. Just be aware that there is still the typical 2 to 5-minute delay between when it is sent and when it appears in queries and dashboards.

Summary

You have now learned how to troubleshoot Azure Application Insights quickly and without needing to deploy your application to Azure. To summarize, you run “dotnet publish” in “Release” mode locally and then run the application from the command-line outside Visual Studio. This is done for a few reasons:

  • When publishing in release mode, you do not need to worry about appsettings.development.json
  • By running outside Visual Studio, you do not need to worry about launchSettings.json setting any special environment variables that don’t match your production environment (e.g. ASPNETCORE_ENVIRONMENT)
  • When running outside Visual Studio, you do not need to worry about the diagnostics panel deciding to capture your telemetry and preventing it from being sent to Azure.

Once your application is running locally and has the Application Insights telemetry key configured properly, you will find the telemetry in the “Live Metrics” view, which lets you avoid the typical 2 to 5-minute delay between sending telemetry and seeing it elsewhere in Application Insights.

If you are concerned that this setup will not allow for the use of the Visual Studio debugger, think again! Once you have the application running outside Visual Studio, simply use the “Attach To Process…” menu item in Visual Studio. This gives you the best of both worlds:

[Image: Visual Studio Debugger]

Hopefully, this post helped you understand how to troubleshoot your Application Insights telemetry configuration more quickly. That will come in handy in the next post in this series, where we talk about customizing telemetry to keep PII (Personally Identifiable Information) out of your Application Insights logs.

When building a web API or web application, it is critically important to know that the application is functioning as intended, whether from a performance perspective or simply in knowing that external clients are using the application correctly. Historically, for an on-premises solution, that involved installing agent monitoring software and configuring a logging solution with associated storage management. With Azure, that now becomes a turn-key solution using Application Insights. Application Insights can be used whether your actual application is deployed on-premises or in the cloud. In this post, I’d like to talk about configuring Application Insights for an ASP.NET Core application, and I’d also like to talk about structured logging.

Enable Application Insights for ASP.NET Core

The way to enable Application Insights for your ASP.NET Core application is to install the NuGet package into your .csproj file, as shown below.

[Image: Enable Application Insights for ASP.NET Core]

The rest of this article assumes you are using version 2.7.1 or later of the NuGet package. There have been several changes to the library in the last 6 months.
Please add the following code to your Startup.cs:

[Image: Add Code to Your Startup.cs]

Allocate your Application Insights resource in Azure, whichever way you prefer. This could be Azure Portal, Azure CLI, etc. See Azure Docs for more details.

In your appsettings.json, add the following:

[Image: appsettings.json]

By now you’ve enabled Application Insights for your ASP.NET Core application. You’ll now get the following features:

  • Request logging
  • Automatic dependency logging for SQL requests and HTTP requests
  • A 90-day retention period
  • Live metrics, which permit you to view and filter the above telemetry while viewing CPU and memory usage statistics live. For example, see the screenshot below.

[Image: Live Metrics Stream]

Telemetry Types and Waterfall View

One of the interesting features that Application Insights provides compared to other logging systems is that it has different kinds of telemetry. This includes RequestTelemetry, DependencyTelemetry, ExceptionTelemetry, and TraceTelemetry. Application Insights also provides the ability to have a parent operation that other telemetry operations belong to and you can view a waterfall view of a given request. For an example see the screenshot below:

[Image: End to end transaction]

Structured Logging

Any of the telemetry types will provide the ability to add arbitrary key-value pairs. Those values will then be logged as key-value pairs to Application Insights. Then, using the Log Analytics feature of Application Insights, you can query on those custom key-value pairs. Effectively, you are getting a schema-less ability to attach custom properties to any telemetry in real time. This is commonly referred to as structured logging in other frameworks. The point is that you are not creating one long message string and then trying to parse it. Instead, you get custom key-value pairs and can simply query for a given key having a given value. The screenshot below provides an example of a Log Analytics query on a custom property:

[Image: Customer API App Insights]
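
The difference between a formatted message string and structured key-value pairs can be sketched with Python’s standard `logging` module. This is an analogy only, not the Application Insights or ILogger API. The `ListHandler` class, the logger name, and the `customer_id` key are made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be queried afterwards."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("customer_api_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

# Unstructured: the id is baked into the message and would have to be parsed back out.
logger.info("Customer %s updated", "42")

# Structured: the id travels as its own key-value pair next to the message.
logger.info("Customer updated", extra={"customer_id": "42"})
logger.info("Customer updated", extra={"customer_id": "7"})

# Querying is a plain key/value comparison, with no string parsing involved.
matches = [r for r in handler.records if getattr(r, "customer_id", None) == "42"]
```

Only the structured record for customer 42 is found by the query; the unstructured one carries the same id but only inside its message text.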

Log Your Own Custom Messages and Telemetry

How do you go about logging custom telemetry to Application Insights from within your ASP.NET Core application? The Application Insights NuGet package automatically registers the TelemetryClient class provided by the library into the Dependency Injection container. You could add that as a constructor argument to your Controller, for instance, and then directly call methods on the TelemetryClient. However, at this point, you are coupling more parts of your application to Application Insights, and that isn’t necessary. The latest versions of the Application Insights NuGet package for ASP.NET Core register an ILogger implementation with ASP.NET Core. So, you could update your controller as follows:

[Image: Log Custom Messages]

In the above example, we have logged a message and a custom key-value pair. The key will be “id” and the value will be the value of the argument passed into the Get function. Notice that we have done this with a dependency only on ILogger, which is a generic abstraction provided by Microsoft. ILogger will typically log to multiple outputs, such as the console and Application Insights, and you can find many implementations of ILogger. ILogger natively supports structured logging and will pass the information down to the actual log implementation, in this example Application Insights.
Currently, by default, Application Insights will only log messages from ILogger at the Warning level or above. So, my above example would not work. To disable the built-in filter, you would need to add the following to Startup.cs in ConfigureServices.

[Image: ConfigureServices]

Summary

With Application Insights, we can provision a monitoring solution within minutes in Azure. You’ll receive 5 GB of data ingestion free per month and free data retention for 90 days. It is trivial to instrument your application. Some of the benefits you’ll receive are:

  • Automatic logging of requests/responses
  • Ability to drill into recent failures/exceptions in Azure portal
  • Waterfall mapping of a request
  • Automatic dependency logging of out-bound SQL and HTTP requests
  • Arbitrarily query your data using Log Analytics
  • Ability to drill into recent performance metrics in Azure portal
  • Live metrics view as your application is running in production with filtering.
  • Custom graphs and charts with Notebooks
  • Ability to create an Azure Portal Dashboard
  • Application map that will show the topology of your application with any external resources it uses.

Application Insights is a very powerful tool to ensure your application is functioning as intended, and it is very easy to get started. You spend your time instrumenting your application and checking application health, not time provisioning log storage solutions and picking log query tools.

Look for future blog posts covering additional topics like keeping Personally Identifiable Information (PII) out of your logs and troubleshooting your Application Insights configuration.