Of course, my flows are tested, error-free and always run. So it's not my fault if a mistake happens... But seriously, mistakes can happen, so what should I include in my flows to be prepared?
The idea for this article came to me at the European Collaboration Summit in Düsseldorf - after a conversation with Oliver Menzel, whom I often see on Fridays in the PowerAtelier, a community call initiated by the two MVPs Stefan Riedel and Tomislav Karafilov. Oliver and I exchanged ideas on how to help users faster in case of errors. I briefly outlined how we handle this in WeDit and in other applications. Words turned into an open laptop and a promise to create a .zip with the flow. But why only for Oliver? Hence this blog entry.
I started incorporating the process into my flows around the beginning of 2019 - at that time in a simpler form, pieced together from forum posts. Today there are good blog articles about it, e.g. here by Matthew Devaney or here by Richard A. Wilson.
Let's define a goal
Let's build all flows in a similar way so that...
- the flows are able to give feedback to the user about success/errors
- error messages are structured uniformly and always look the same
- errors are recorded in a central log
- first information about the error is sent to the application administrators
- optional/nice-to-have: the error handling can be rolled out to many environments
Hint: The complete solution is at the end of this article.
The basic idea for catching errors - Try/Catch/Finally
Anyone who has ever programmed will know the procedure:
- Try - try to execute one or more operations
- Catch - catch the errors if necessary
- Finally - execute a meaningful, concluding operation, e.g. for notification purposes
In Flow this can easily be implemented using scopes and adjusting the conditions under which the actions should run. The two blog articles mentioned above describe the basic setup in a simple way. I don't want to repeat that and will instead take a closer look at our "best practice".
For us, any flow that does not look like the one above (or slightly different, because more complex, in WeDit) fails internal quality testing. All operations must be inside the action "SCOPE - main". This way we can always refer to the same place for more information: the name "SCOPE - main". Within this scope we usually work with nested scopes again to get a better structure:
Two actions are placed after the scope: the response action that returns the statusCode 200 (i.e. success) and the scope "SCOPE - errorHandler", which only runs in the event of an error. The error handler contains a response action that returns the statusCode 500 (i.e. error) and then calls a child flow. This child flow takes care of the appropriate processing of the error.
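As a rough sketch, the top level of such a flow could look like this in the workflow definition (the view you get via "Peek code"). The action name "RESP - 200", the response kind "Http" and the exact run-after statuses are assumptions for illustration, and the contents of both scopes are left empty:

```json
{
  "SCOPE_-_main": {
    "type": "Scope",
    "actions": {},
    "runAfter": {}
  },
  "RESP_-_200": {
    "type": "Response",
    "kind": "Http",
    "inputs": { "statusCode": 200 },
    "runAfter": { "SCOPE_-_main": ["Succeeded"] }
  },
  "SCOPE_-_errorHandler": {
    "type": "Scope",
    "actions": {},
    "runAfter": { "SCOPE_-_main": ["Failed", "TimedOut"] }
  }
}
```

The Try/Catch behaviour comes entirely from the two "runAfter" entries: the success response only runs after a successful main scope, the error handler only after a failed or timed-out one.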
One small thing is also added to the scope - a Terminate action:
Without this Terminate, the flow run would be recorded as a "success" even though an error occurred, because the error interception itself worked. Therefore, I use the Terminate to throw the error again and make it visible in the automatic logging of Flow.
Another little thing is hidden in the run conditions of the action "RUN CHILD": it should also be triggered when the action "RESP - 500" is skipped. There may be cases in which no feedback needs to be given to an app or flow. However, we still want our errorProcessing to be triggered, of course.
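The inside of "SCOPE - errorHandler" might then look roughly like the following sketch. The name "TERMINATE", the response kind and the run-after statuses of the Terminate are assumptions; the inputs of the child flow call are omitted here:

```json
{
  "RESP_-_500": {
    "type": "Response",
    "kind": "Http",
    "inputs": { "statusCode": 500 },
    "runAfter": {}
  },
  "RUN_CHILD": {
    "type": "Workflow",
    "runAfter": { "RESP_-_500": ["Succeeded", "Skipped"] }
  },
  "TERMINATE": {
    "type": "Terminate",
    "inputs": { "runStatus": "Failed" },
    "runAfter": { "RUN_CHILD": ["Succeeded", "Failed"] }
  }
}
```

The "Skipped" status on "RUN_CHILD" is the detail described above: the child flow is still called when no response is sent. Letting the Terminate also run after a failed child flow call is my own choice for this sketch, so that the run is marked as failed either way.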
At this point, we can check off the requirement to "be able to give feedback to the apps/flows and have them be used for output to the user".
Conclusion so far...
- ✓ the flows are able to give feedback to the user about success/errors
- error messages are structured uniformly and always look the same
- errors are recorded in a central log
- first information about the error is sent to the application administrators
- optional/nice-to-have: the error handling can be rolled out to many environments
Processing the error
But what happens in the child flow that should help us with the other requirements? Actually only two essential things:
- adding a row to the central log
- sending a mail
However, the flow features many more actions than just these two - why?
Receive information - input variables
From the calling (i.e. the faulty) flow we can get a lot of information - for us the following three turned out to be useful:
workflow()
This function can be used to determine, among other things, the run (to create a link for the error analysis), the environment, the name of the flow and even more interesting things.
trigger()['outputs']['headers']
In this JSON we find, among other things, the triggering user of the flow - but of course only if it really was a "physical" user who triggered the flow. More about that later.
result('SCOPE_-_main')
This JSON contains the real "error messages", i.e. the potential reasons why a flow fails. In my example, it is a division by 0. This expression is also the reason why the scope in which the actual work happens must always be called "SCOPE - main": this way the inputs in the calling flow are always the same.
So on the calling flow side, the whole catch block looks like this:
Well, no, actually it looks like this:
Since the non-premium actions and triggers do not accept objects/arrays as inputs, we have to convert them to strings first. Unfortunately, the same thing needs to happen in the other direction on the child flow side.
Maybe you are asking yourself why I separated the INIT and the SET of the variables. Well, when setting a variable (in this case converting a string to JSON) something could go wrong, and I would like to catch that. In the case of errorProcessing this is a bit theoretical, but we use that approach in every flow.
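To illustrate the round trip: on the calling side each value would be wrapped in string(), e.g. string(workflow()) or string(result('SCOPE_-_main')). On the child flow side, a sketch of the separated INIT and SET could look like this. The action names and the trigger input name 'text' are assumptions (they depend on how the inputs of the child flow trigger are defined):

```json
{
  "INIT_-_objWorkflowData": {
    "type": "InitializeVariable",
    "inputs": {
      "variables": [
        { "name": "objWorkflowData", "type": "object" }
      ]
    },
    "runAfter": {}
  },
  "SCOPE_-_main": {
    "type": "Scope",
    "actions": {
      "SET_-_objWorkflowData": {
        "type": "SetVariable",
        "inputs": {
          "name": "objWorkflowData",
          "value": "@json(triggerBody()['text'])"
        },
        "runAfter": {}
      }
    },
    "runAfter": { "INIT_-_objWorkflowData": ["Succeeded"] }
  }
}
```

Variables can only be initialized at the top level of a flow, so the INIT stays outside the scope, while the json() conversion that might fail sits inside it and can be caught.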
By the way, there's one piece of information I'm still looking for: the id of the solution the flow is running in. If someone has an idea, please report here.
Central logging - SharePoint list
For recording the errors, I use a simple SharePoint list in our example. In the case of WeDit, this is of course a table in the Azure SQL database. Two scenarios would be conceivable for me: Should there be a central list that logs multiple applications? Should there be a separate list per application?
If you decide on a central list, you will of course have to think further about permissions etc. The good news up front: by using a child flow we have to store a fixed connection anyway, i.e. the end users do not need access to our list.
In my case the SharePoint list looks very simple:
A small note: The list can be used generically, i.e. PowerApps could also log to it, so there is an "errorType" column. The meaning of the columns will surely become much clearer by filling them in the flow:
I use the following functions - errorUser and errorSessionLink I would like to skip for a moment:
- errorMessage: string(variables('arrWorkflowResult'))
- errorSessionId: variables('objWorkflowData')?['run']?['name']
- errorFunctionName: variables('objWorkflowData')?['tags']?['flowDisplayName']
- errorData: variables('objWorkflowData')
errorUser
This expression is a little longer - why? Because the flow is not necessarily always called by a user; sometimes it is called by another flow as a child flow. Such a call is made by Azure Logic Apps, and the user is not passed through. Therefore, unfortunately, an if-condition must be used, otherwise an error will occur. We certainly want to prevent this in errorProcessing!
By checking whether the "User-Agent" element contains the string azure-logic-apps, we know whether the flow was called as a child flow or not. If it was not, we write the UPN of the calling user to the list.
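A hedged sketch of such an expression, wrapped in a Compose action: the variable name objTriggerHeaders, the header 'x-ms-user-email-encoded' (Base64-encoded in the triggers I have seen) and the fallback text 'child flow' are assumptions and may differ from the expression used in the actual solution:

```json
{
  "COMPOSE_-_errorUser": {
    "type": "Compose",
    "inputs": "@if(contains(coalesce(variables('objTriggerHeaders')?['User-Agent'], ''), 'azure-logic-apps'), 'child flow', base64ToString(variables('objTriggerHeaders')?['x-ms-user-email-encoded']))",
    "runAfter": {}
  }
}
```

The coalesce() around the "User-Agent" lookup keeps contains() from failing when the header is missing altogether.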
errorSessionLink
The sessionLink should be easily clickable in the mail, so I assemble it to point directly to the erroneous run. Exactly this is also well explained in the blog article by Matthew Devaney.
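One way to assemble such a link is sketched below. The base URL and the action name are assumptions; the three ids are read from the workflow() data that was passed in as objWorkflowData:

```json
{
  "COMPOSE_-_errorSessionLink": {
    "type": "Compose",
    "inputs": "@concat('https://make.powerautomate.com/environments/', variables('objWorkflowData')?['tags']?['environmentName'], '/flows/', variables('objWorkflowData')?['name'], '/runs/', variables('objWorkflowData')?['run']?['name'])",
    "runAfter": {}
  }
}
```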
Another conclusion...
- ✓ the flows are able to give feedback to the user about success/errors
- ✓ error messages are structured uniformly and always look the same
- ✓ errors are recorded in a central log
- first information about the error is sent to the application administrators
- optional/nice-to-have: the error handling can be rolled out to many environments
Just send the mail... or more?
One little thing is still pending - the sending of a mail that a flow has failed! A corresponding link that can be used to jump to the failed flow is created in the list, which in turn can also be used in the mail.
But let's take the following scenario: the flow fails because, e.g., a third-party service is out of order. This can happen and is reported accordingly in the flow via an error message on the action. Should the application maintainers really have to open the flow run in such cases? No. For this we have the expression result('SCOPE_-_main').
However, this is an array, so the relevant records need to be filtered out. For this we can use the status, which simply has to be "Failed". But how should more than one error message be displayed in the mail? A structured view as an HTML table makes sense, and the Select action can be used for exactly that. In it we prepare one HTML table row per error message. The first column contains the name of the failed action, the second column contains the error message:
There is a problem with the error message readout. In the result-JSON the error messages are in different places, depending on the failed action. I try to catch this via an if.
But wait - from the select action an array is returned as result. Can this be used in a mail at all? The answer is very simple: join() without a separator.
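Put together, the three steps (filter, select, join) could be sketched as follows. The action names and the two locations checked for the error message are assumptions, and for brevity this sketch uses coalesce() with a fallback text instead of the if described above:

```json
{
  "FILTER_-_failed": {
    "type": "Query",
    "inputs": {
      "from": "@variables('arrWorkflowResult')",
      "where": "@equals(item()?['status'], 'Failed')"
    },
    "runAfter": {}
  },
  "SELECT_-_rows": {
    "type": "Select",
    "inputs": {
      "from": "@body('FILTER_-_failed')",
      "select": "@concat('<tr><td>', item()?['name'], '</td><td>', coalesce(item()?['error']?['message'], item()?['outputs']?['body']?['error']?['message'], 'no message found'), '</td></tr>')"
    },
    "runAfter": { "FILTER_-_failed": ["Succeeded"] }
  },
  "COMPOSE_-_htmlRows": {
    "type": "Compose",
    "inputs": "@join(body('SELECT_-_rows'), '')",
    "runAfter": { "SELECT_-_rows": ["Succeeded"] }
  }
}
```

The output of the last action is a single string of table rows that can be placed between the table tags in the mail body.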
Since Microsoft unfortunately has not yet solved the "unfold bug" for mails and teams messages formatted as HTML:
As a result, we now get this mail:
Permissions for the flow
By using a child flow, we are forced to store fixed connections:
In our case, this is very helpful, as it means that users do not need any permissions to the SharePoint list or shared mailbox they are using. A technical account comes in handy in this case and leads me almost seamlessly to the last, optional requirement.
Final conclusion
- ✓ the flows are able to give feedback to the user about success/errors
- ✓ error messages are structured uniformly and always look the same
- ✓ errors are recorded in a central log
- ✓ first information about the error is sent to the application administrators
- optional/nice-to-have: the error handling can be rolled out to many environments
Roll out/use in multiple environments
In the present solution I have the flow together with the "application" - which in my demo consists of only two flows that generate errors. Would this be a concept that would make sense in reality? From my point of view no!
The errorProcessing flow, the environment variables and the connection references can be packed into a separate solution and rolled out uniformly to all environments. In the long run, a standardized errorProcessing does not have to be the only thing provided to makers in the business. Other examples would be logging flows at the start of PowerApps to meet requirements of the works council, and so on.
The big advantage is that makers don't have to worry about Try/Catch/Finally - except for the correct integration within their flows. But this is limited to using a scope called "SCOPE - main" and copying the errorHandler scope from a template.
All requirements would be implemented! Or is anything missing? Feel free to contact me via LinkedIn!
The Solution and the implementation
The .zip of the unmanaged solution can be found here. Before importing, a SharePoint list should be set up on an appropriate SharePoint site. When importing, the two SharePoint environment variables are to be filled with this site and the list. The connections can, or rather should, be set up with a service account (keep multiplexing in mind). Fill the environment variable envTxt_errorSharedMailbox with an appropriate shared mailbox (don't forget the authorization for the connection) and off you go!
Stefan Jackmuth