Findability: How to prevent missing metadata

09-01-2019

The underlying drive of the Search functionality in Office 365 isn’t searchability. It is findability. You want to be able to find the content you’re looking for, and finding documents relies on the indexing of the metadata attached to that content.

Search’s reliance on metadata makes it vital for your users to fill out said metadata. Of course, reality is hard to bend to theoretical ideals: end users often forget to fill out the metadata because of other obligations or distractions. How can we change this behaviour?

User story 

‘I want an automated reminder for my users when they forget to add metadata to content uploaded to our SharePoint intranet, in order to improve the acquisition of the metadata, which improves our search functionality.’

A desire for automated processes in SharePoint sounds like a job for Microsoft Flow.

Let’s deconstruct the user story. It requires an automated process that gets triggered whenever someone uploads a document. The uploader will receive a reminder to fill out the metadata of the content. That’s the basic requirement from the client. 

On my end, I have some additions to the user story to make for a more sustainable end product.

  • Must be scalable and easy for future implementation
    A quick fix in one library is nice, but we aren’t consultants for quick, temporary fixes. A client is helped best when we think ahead and work towards a sustainable product.
    The end product has to be easy to implement throughout the entire intranet whenever an admin wants it, and it has to remain viable as the intranet grows.
  • Loop until metadata has been filled
    Sending out one reminder is fine but might not influence people’s behaviour. Undesirable behaviour (in this case: not filling out the metadata) has to lead to some annoyance, so that the desired behaviour becomes the attractive alternative.
  • Keep resource allocation as low as possible
    Flow requires processing power from Microsoft’s servers, and it caps the number of runs you can do per month. Also, giving a Flow several parallel runs will slow down your SharePoint environment, because the service is preoccupied with Flow runs.

Looping Flows 

Wait, looping Flows? Don’t you mean an Apply to Each or Do Until loop?
Nope. I mean triggering the same Flow multiple times in a row until certain conditions are met. Apply to Each or Do Until loops force the Flow to keep running until the situation has been fixed, or to crawl the entire library of content to find the single file that isn’t filled out right. Imagine uploading 5 files to a document library with hundreds of files: it would take ages to crawl that library.

An Apply to Each or Do Until would cause the Flow to pick up, handle and put down every single file in the library that doesn’t meet the right conditions.
That won’t work well and puts massive, unnecessary stress on your SharePoint resources.

Looping an entire Flow turns into a bit of a challenge, as Flow does not actually have a loop functionality as of now. A Flow gets triggered, runs its course and stops at the end, with no clear way to fire the trigger again repeatedly without acting it out manually.

What it does have are the HTTP action and the Request trigger, which can be used for nested Flows. A nested Flow lets you call another Flow and run it with data from the current Flow. These functionalities are vital for building a construction that loops an entire Flow run.

How do you check for metadata? 

My solution for checking the metadata was to give at least one choice column a default value. For this to work, the column needs to be added to every document library (which could be automated through the Content Type hub, but that’s a topic for another time).

Flow checks this column for its value. It can be as simple as asking the user whether they ‘filled in all the metadata’, with a ‘yes’ or ‘no’ response (‘no’ being the default value, of course).
If this choice menu hasn’t been switched to anything other than the default option, chances are the user forgot all the other metadata as well.

The Flow process has a condition check for that column, to see whether its value still equals the default value.
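
As a minimal sketch, assuming a hypothetical choice column named MetadataComplete with ‘No’ as its default value, that condition in advanced mode boils down to a single expression:

    @equals(triggerBody()?['MetadataComplete']?['Value'], 'No')

The ?['Value'] part is there because the SharePoint connector hands over choice columns as objects rather than plain strings.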

Initial design 

With these requirements in mind, I was ready to draw out an initial design. To get a grip on an idea, it helps me to visualize what I have in mind. Drawing helps with that, so don’t be afraid to do something old-fashioned! My initial design looked like this:

[Image: sketch of the initial Flow design]

So, what happens here? Let’s walk through the process. 

Trigger Flow: The trigger Flows are at the top. This is where the first check occurs. When a file is created or modified, the Flow stalls for a while (you have to allow your end user to fill out the metadata by themselves) before checking whether the defined column still holds its default value.

If the metadata checks out as ‘non-default value’, the Flow is done and stops there. If it checks out as ‘default value’, we continue the Flow to the HTTP action.
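
Sketched as a list of steps, the whole trigger Flow is no more than this (the trigger name is Flow’s own; the column name is my hypothetical example):

    1. When a file is created or modified (properties only)   - SharePoint trigger
    2. Delay                                                   - give the user time to add the metadata
    3. Condition: MetadataComplete Value is equal to 'No'
       - If no:  stop, the metadata has been filled out
       - If yes: HTTP POST the file's details to the engine's Request URL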

HTTP action: Here you collect all the dynamic data the engine Flow needs. This is done in a simple JSON construction where you enter dynamic data as the values of the variables.
The JSON isn’t flexible, as this is the moment to define all the variables you will work with further on in the process. You can’t add more variables here or drop ones that are needed further down the line. There are just a few variables embedded in the engine for it to work with: adding variables in this HTTP action won’t bring you anything, and removing variables might disrupt the engine.

[Image: the JSON body of the HTTP action]

The HTTP action posts these variables as JSON to a defined URL, which is the URL generated by the Request trigger of the engine. Note how I also send out the Site Address and Library (in red) as fixed values. That will be important later.
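
To make that concrete, here is a sketch of what the JSON body could look like. Every name and value below is a hypothetical example; in the real action, everything except the Site Address and Library GUID is filled in with dynamic content:

    {
      "SiteAddress": "https://contoso.sharepoint.com/sites/intranet",
      "LibraryGUID": "d7f5c9e2-1a2b-4c3d-8e9f-0a1b2c3d4e5f",
      "FileID": 42,
      "FileName": "quarterly-report.docx",
      "CreatorName": "Jane Doe",
      "CreatorEmail": "jane.doe@contoso.com",
      "ModifierName": "John Smith",
      "ModifierEmail": "john.smith@contoso.com",
      "Created": "2019-01-09T10:00:00Z",
      "Modified": "2019-01-09T10:00:00Z",
      "MetadataComplete": "No"
    }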

Engine Flow: The engine Flow is where the magic happens. It’s triggered when it receives an HTTP call on its URL, and it receives the JSON content as a package, with all the values assigned to the variables. It’s important to realize that you only have to define what the Request trigger should expect: you will receive variable A, which has value X, which will be of the string type.
Flow takes care of the rest. It took me a while to figure that out.

[Image: the Request trigger with its JSON schema]
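
Assuming the hypothetical payload sketched above, the JSON schema in the Request trigger would look roughly like this (Flow can generate it for you via ‘Use sample payload to generate schema’):

    {
      "type": "object",
      "properties": {
        "SiteAddress": { "type": "string" },
        "LibraryGUID": { "type": "string" },
        "FileID": { "type": "integer" },
        "FileName": { "type": "string" },
        "CreatorName": { "type": "string" },
        "CreatorEmail": { "type": "string" },
        "ModifierName": { "type": "string" },
        "ModifierEmail": { "type": "string" },
        "Created": { "type": "string" },
        "Modified": { "type": "string" },
        "MetadataComplete": { "type": "string" }
      }
    }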

Next, we have an action that picks up the properties from the file’s library, to compare its values to the values coming from the other Flow in the JSON.
In the first run this is a bit redundant, since we just came from a Flow that worked with the values in said library, but we’re already thinking ahead. Since this engine is designed to loop, there has to be a point where we bring the actual variables from the library itself into the engine. Otherwise we’d create an infinite loop: we would keep working with the variable defined as ‘default value’ and would never be able to check whether the live version in the library has been updated in the meantime.

Dynamic Library look-up: It’s important to make this library check dynamic. The engine is designed to accommodate any trigger Flow, so you can’t apply this check to a fixed set of values. They need to be variable within the engine Flow.  

This is why we send along the Site Address (as a URL) and the Library (as a GUID). This is the moment to use those values. The reason they are fixed is that they are inherently linked to the specific trigger Flow. This way, we can point this check-up back to the right library.

[Image: the dynamic library look-up action]

The filter query is added to reduce the amount of data that comes out of this look-up. It makes sure the only data returned is the data attached to the file in question, not the entire library. This too reduces the stress on your resources.
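
As a sketch, using the hypothetical FileID variable from the JSON payload, the Filter Query of the ‘Get files (properties only)’ action can be as simple as:

    ID eq @{triggerBody()?['FileID']}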

Getting the GUID: Note that the Library value has to be the GUID, not the name of the library that normally pops up in the drop-down menu. That name is just the user interface helping you out; the inner system works with the GUID. Entering the name of the library will only result in an error, as Flow can’t identify the library. The GUID can be found in the browser’s URL when you open the library settings of said library.
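
For illustration, with a made-up site and GUID, the library settings URL looks like this, with the GUID sitting (URL-encoded) in the List parameter:

    https://contoso.sharepoint.com/sites/intranet/_layouts/15/listedit.aspx?List=%7Bd7f5c9e2-1a2b-4c3d-8e9f-0a1b2c3d4e5f%7D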

Onwards into the condition checks! 

Now that we have both data sets (the live document library and the values the file had when it entered the trigger Flow), it’s time to compare them. This comparison happens every time the engine loops, to see whether the end user has updated the metadata of the file.

We go into a series of condition checks. The first checks whether the ‘default value’ has been updated: if the live document library holds a different value than the default, we can assume the end user has taken heed and added the metadata. The Flow ends there.
If the default value is still present in the document library, we can assume the user hasn’t added the metadata yet, so we continue to the next condition check.

This condition check compares the creation date to the modification date. If they are equal, the next step works with the creator variables (display name, email address, etc.); if they differ, we work with the modifier variables.
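
Since both dates travel along in the JSON, this check is a single expression against the hypothetical payload sketched earlier:

    @equals(triggerBody()?['Created'], triggerBody()?['Modified'])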

Finally, we send out one of the two email templates we’ve written: one addressed to the creator, the other to the modifier. This speaks for itself, I think.

Looping 

The loop is actually nothing more than a trick with the HTTP and Request actions. At the end of the engine Flow I collect the data again in an HTTP post (all of it as variables this time), let it stall for X amount of time and send it to the Request trigger of the engine to trigger it again. It’s a Delay and an HTTP post back to the engine’s trigger.
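
A sketch of that closing HTTP post, assuming the same hypothetical payload shape as before: the fixed values are passed through untouched, while the file values are now taken from the library look-up (the remaining fields follow the same pattern):

    {
      "SiteAddress": "@{triggerBody()?['SiteAddress']}",
      "LibraryGUID": "@{triggerBody()?['LibraryGUID']}",
      "FileID": "@{triggerBody()?['FileID']}",
      "Modified": "@{first(body('Get_files_(properties_only)')?['value'])?['Modified']}",
      "MetadataComplete": "@{first(body('Get_files_(properties_only)')?['value'])?['MetadataComplete']?['Value']}"
    }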

It’s not really that complex and I think this blog explains much of what you will need to know to implement it yourself. 

When trying this design out, I kept the delays to a minimum of one minute each (to speed up the test process). If your inbox floods with e-mails telling you to ‘please add the metadata to your file!’, I bet you’ll feel an incentive to change your behaviour and add the metadata yourself before your files enter the automated process.

Of course, flooding someone’s inbox at one-minute intervals is a bit excessive. Apply common sense when setting these delays and find what works for you. If required, you might even make the delay a variable, so that libraries can have different urgencies. You can send it along with the rest of the information in the JSON inside the first HTTP post.
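
A sketch of that idea, with a hypothetical DelayMinutes field: add it to the JSON body of the first HTTP post and point the Count field of the engine’s Delay action at it:

    "DelayMinutes": 60

    Delay action:  Count = @{triggerBody()?['DelayMinutes']}
                   Unit  = Minute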

Does it meet my requirements? 

Remember how I said I was looking for a long-term solution that’s easy to implement in the future? To achieve this, I looked for a way to make an ‘engine’: a function that can be called upon by a trigger. By detaching the trigger from the function, I have an easy-to-build trigger Flow, while the more complex motor of the process sits off somewhere else, unchanged and only fired up when a trigger Flow sends it a JSON package.

Because of all the regular checks in the Flows, the process has plenty of opportunities to abort and quit, bringing the number of runs down to a minimum. The delays take little processing power, as they merely stall the process, so the design keeps the resources and time it takes to run to a minimum and puts no further stress on your SharePoint environment.

The design scales nicely, as you only need to create a new trigger Flow dedicated to each document library. Of course, I haven’t tried this out in an environment with hundreds of users and many thousands of document libraries in the intranet; that’s difficult to simulate in a test environment.

The trigger Flow is quite straightforward and requires little to no experience with variables, JSON or HTTP calls. I hope Microsoft creates a copy functionality for Flows soon, as it would make creating a new trigger Flow even easier: you would only need to define a different document library and add it to the JSON in the HTTP post, with the surrounding Flow structure already set up by the copying process.
