Pages

Wednesday, 24 May 2023

Control Containers - Part 2 - CReW Container Error Check Macro

In part 1 of this series we looked at the case where we didn't have an input hooked up to our control container in the logging container. This time we are going to look at what we can do with the input anchor connected.

How does the Input Anchor work?

We noted in the last post that without an input anchor the control container works very much like the CReW Runner macro. Well if we connect up the input anchor then the control container works very much like the CReW Conditional Runner macro. That is to say:
  • If the Input Anchor is disconnected
    • All tools inside run as though they were in a regular container (or straight on the canvas)
  • If the Input Anchor is connected
    • If the input anchor receives 0 records – The tools in the Control Container DO NOT run
    • If the input anchor receives >0 records – The tools in the Control Container run AFTER the last input record is received
Or in pictures



Introducing the Container Error Check Macro

With this we now are very close to replacing the CReW Runner macros with Control Containers! All we need is the S and F outputs provided by the runner macros. We replace this with the Container Error Check Macro.

You can download it from:


(To install it copy it into the macros folder where you have the CReW macros installed on your machine. For example C:\CReWMacros\Macros)

You connect it directly after a control container and it has the following behaviour:
  • If the incoming log messages contain an error all records get pushed out of F (Fail) output
  • If incoming log messages have no errors all records get pushed out of S (Success) output
Anyone who has used the Runner macros will find this very familiar.

Conditional Running

Using this macro we can now conditionally run parts of our workflow!

Success case:



Failure case:

This workflow generates a PDF report and then if there are no errors uploads it via an API using the download tool. If on the other hand there are errors, then the file is not uploaded, but an email is sent instead.

Learn More

If you are at Inspire today, please come to my breakout session to learn about this use case and more. 

If you aren't at Inspire, but would like me to present this talk at your local user group them please get in touch and we can set up a date!

Monday, 22 May 2023

Control Containers - Part 1 - Logging Container

Well it is Adam here. I'm back posting on my personal Alteryx Blog after a very long time, but Control Containers are such an important new feature to desktop designer (that I have wanted to see in the product for so long) that I just had to write a blog post about them.


What are Control Containers?

In short Control Containers are the biggest innovation in controlling the order of how things run in an Alteryx workflow since we released the CReW runner macros over 8 years ago. Ever wanted to control what order your outputs write in? Or run one section of a workflow before another? Or how about capturing logging and error messages? Control containers can do all of this and I suspect more things that we havn't even thought of yet. I don't say this lightly, but I think Control Containers are going to be a game changer with what you can do in an Alteryx workflow.

They were released last week as part of the 2023.1 release https://help.alteryx.com/release-notes/designer/designer-20231-release-notes

They are so innovative that myself and Jeff Arnold who worked on the initial design for control containers got a patent granted for the idea https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20230113187.


This series of blogs is going to teach you about how Control Containers work through a number of example use cases of what you can do with them.


First up the Logging Container!

When Jeff and I were first throwing around ideas for Control Containers, I actually had proposed that the feature would consist of two tools: A Logging Container and a Control Container. The Logging Container being exactly the same as a Control Container, but with no input anchor. In the end we ended up with a single Container with an optional input. So for now let us pretend the Control Container has no input anchor and just focus on what that output anchor does.

First of all like a regular container a Control Container groups together a set of tools. The first difference we will look at though is that new output anchor.

When a workflow runs, all of the log messages for tools inside the container that usually get written to the messages window, also get output as data from the Control Container's outpur anchor. For anyone familiar with the CReW runner macros this use case is very much like the CReW runner macro, but in a single workflow!




So what is this useful for? Well a number of things, but today we will use it to create an audit trail for our workflow. We enclose the entire workflow in a Control Container and then using regular Alteryx tools we add the current datetime and workflow name and write it to a database.


Now every time this workflow runs we can capture this information and build up a historical record of how things change over time!



Next time we will introduce a new CReW macro tool for making this logging data even more useful.

For anyone at Inspire this week in Las Vegas, please come to my session on Wednesday when we will talk about this use case and a lot more!




Tuesday, 16 August 2022

CReW Expect Records, or "What to expect when you're expecting data"

 This morning I awoke to a WhatsApp message from my friend, @grossal (Alex) asking the following:

[5:12 AM, 8/16/2022] Alex Gross: Is there a CReW "Expect Records" Macros? 

I see a lot of time Use Cases, e.g. Database Connections or files where I need to check if records come in. I think this could also be useful for the SHA256 Macro.

I just had the scenario where I connected to a database and got zero records back - but it didn't crash my workflow because the metadata was still provided by the db.

Alex had already explored the macro updates to either create a cloned copy of the existing macro or to include functionality to choose between expect 0 or expect something.  It would be easy.  

Easy is in the eye of the beholder.  I knew that I'd have to change the image and that change would complicate everything.  I also don't want to interfere with any existing use of the macro.  I opted for another macro.  Of course, I'd opt for Alteryx to include an ERROR condition on tools that they should allow for an ERROR if 0 records are passed through them.


Another vertical workflow!  Instead of using a COUNT RECORDS macro, I opted for using the JOIN tool (configured with join by record position).  When 0 records are present in the I (Input) anchor, then the "Error Message" data is passed to the MESSAGE tool.  That tool is configured as an "ERROR" and passes the contents of the TEXT INPUT tool to the Results panel.

This method is ever so slightly more performant.  But more importantly it doesn't include another macro inside of it.  That simplicity appealed to me.  When I can use fewer tools and less macros (oxymoron) I think that I've got a winner.  After testing with and without AMP and testing just to see that it works I am confident enough to let you, my friends check it out too.

You can find the macro in the Alteryx Community (Gallery).  Right now I'm getting ready for Alteryx & Beer with Alex and the German Alteryx Community.

Cheers,


Mark

P.S. During the Alteryx & Beer session, I was coerced into creating an output anchor for the macro.  If data is present (e.g. NOT an error), the data will pass through the macro.  This was an ask and so it was made to happen.

P.P.S  If you have a LARGE set of data, the "Count Records" method will literally count every record.  This method only requires 1 record to know if things are kosher or not.



Wednesday, 3 August 2022

CReW SHA256 Macro: Enter the Python

 Did CReW just freeze over?

CReW SHA256 Macro

This wasn't the first rodeo for me, but it is the first time that I've publicly created a macro that uses the python tool in Alteryx.  I pride myself in doing things with BASE-A (100% Alteryx) whenever possible.  That happens virtually 100% of the time.  But until Alteryx adds SHA256 (or other encryption) to their products (note:  a FIPS version of Alteryx is available help.alteryx.com), this macro will get SHA256 encryption on your current (2022.2) version.

Where is the macro?

This link is at the top of the article for those who don't care about the making of the macro....


How does it work?

Select a single data field (String Type) and it will be SHA256 encrypted as:  [SHA256_Output].  There are warnings if EMPTY data is present or if the incoming data is NOT lowercase (the macro will automagically lowercase the input field).  If you don't want these features turned-on, you can disable them in the macro configuration.

The entire record is read and output by the macro.  Only the SHA256Input field is passed into a python tool which uses HASHLIB and encrypts your field.  If the incoming data is Empty(), it is bypassed.  If you have lots of duplicate data, you might want to UNIQUE that data and only pass the unique values through the tool.  It runs about 2,000,000 records per minute on my computer (with AMP on).  Without AMP, it runs about 50% slower for this function.  

Memory and AMP and Python can lead to shortages of memory.  Use caution.  If you experience memory-related issues, you might want to keep things simple.  I ran 200 million records through this process in less than 1.5 hours while testing.

What's inside the box?

Input:

Your incoming data is routed through a common RecordID tool and the entire record circumvents the encryption process and awaits the results in a JOIN tool (append to end of record).

Process-Prepare for Encryption

A formula tool assigns the selected (Interface Drop-Down) string field into the __SHA256Input__ field.  Conversion warnings (MESSAGE) are issued if the input field is Empty().  Conversion warnings are also issued if the input field contains uppercase letters.  The input field is converted to lowercase() for all records.  If these function offend you, you can deselect the execution of the preparation functions in the configuration (via CheckBox that updates the DETOUR).

Python - SHA256 Encryption

Empty data is filtered and then unioned to the results of the encryption.  Data then goes into the Python tool and the following script is executed (please note that the script is saved as a comment inside of a disabled container):

#################################
from ayx import Package
from ayx import Alteryx
import hashlib
#################################
df = Alteryx.read('#1')
#################################
def text_to_sha256(text):
    h=hashlib.new('sha256')
    h.update(text.encode('ascii'))
    return h.hexdigest()
#################################
df['SHA256_Output'] = df.apply(lambda row: text_to_sha256(row['__SHA256Input__']),axis=1)
#################################
Alteryx.write(df,1)

The python tool reads #1 input and outputs via #1.  The imported library, hashlib, is used to created a function that converts text to sha256.  A dataframe for output uses a lambda function to parse the data from __SHA256Input__ and creates your SHA256_Output field.

In an effort to minimize the python use of memory (as tested python uses 50% more memory than Alteryx was consuming), only the RecordID and SHA256Input are passed through this tool.  The tool is configured in "Production Mode".  A copy of the script is saved as a comment as friends of mine tell me is a good practice (sometimes it is lost otherwise).

Error Checking:

What could go wrong?  I can't imagine records being lost in the join, but should it happen there are two (2) CReW Expect 0 macros waiting to throw error messages if any unjoined data tries to escape the macro.


In Closing:

It has been a while since I've posted.  The community gallery is where I've been stashing macros.  Every once in a while I hear questions about the use of CReW while using AMP.  On my to-do list is to look through the parallel-BUD tool and modify it to be AMP compatible.  In reading community posts it seems that it might not work without compatibility mode enabled.  If you have any CReW or AMP feedback/concerns (related to CReW), please let us know.

A special note of thanks to Alex Gross (AYX Community @grossal), ACE and friend who helped to get through some python woes.

Cheers,

Mark


Monday, 15 November 2021

Alteryx Runner 🏃 Macro: Check out the Alteryx Gallery!

Inside of the Alteryx Gallery, find the Districts page and look for Alteryx Product District:

Scroll down and you'll find the Runner macro (from Alteryx)

If that's all you need to get running, you can quit here.  The Alteryx runner macro works the same as the CReW Runner macro.  If you're asking, "Where can I find the other Runner macros?"  I don't have an answer for when they may appear.  But I can tell you that if you're looking for a List Runner macro that you're in luck.  We're prepared to show you a quick DIY hack for creating a List Runner macro out of the "official" Alteryx Runner macro.

Ingredients:

Download Link

Control Parameter

Action

Macro Output (x2)

Instructions:

Download the macro (yes, you can use the link above.)

Save the macro somewhere special.  Don't worry about error messages regarding their assets.

Open a new workflow and insert the Runner Macro.

Above the macro place your control parameter onto the canvas.

Connect the Q anchor to the Lightning bolt on the RUNNER tool set action to update the "Value" on the Runner.

Connect a Macro Output tool to each of the RUNNER anchors (Label them Success 'S' and Failure 'F' properly).

Save!

Additional Learning Option:

Watch Esther & me walk you through the process in a YouTube video:  
https://youtu.be/mKpJjVYH-yU

You can improve the quality of the macro and let us know what you've done.  Perhaps you can save your macro to Alteryx?  

I'm not posting this macro here.  Why?  We already have a CReW List Runner.

Cheers,


Mark





Thursday, 17 December 2020

CReW Data Cleanse

Yet another new macro is about to be released!  Alteryx has "AMP"ed up their game and while not all tools are AMP-Compatible, surely more speed is coming.  I've long thought about taking a stab at a CReW Data Cleanse tool, but since the release of an ALTERYX macro that delivers that functionality I have stayed in the peanut gallery.  I'll be releasing a CReW macro before Christmas 2020.  On Friday, December 18th I will surprise the Dallas-Fort Worth User Group (Virtual) with a presentation on the macro instead of (or in addition to) their expected presentation on CReW Generate Dummy Variables.



When you use one (or more) Data Cleanse macros in your workflow, you do sacrifice performance.  Think of it as cleaning the house with a toothbrush and a cup of water.  I am offering you a mop and a bucket.  Make your workflows run faster and give others a chance to use the shared resources.  If you're working with large sets of data, this macro upgrade is essential.

Alteryx Community Post:  https://bit.ly/34mfA0U

Presentation (YouTube) on this macro can be found here:  https://youtu.be/kIDqnwT4g5Q  

Walk-thru (YouTube) of macro can be found here:  https://youtu.be/uA5oyfmE2uc

Cheers,

Mark






Thursday, 10 December 2020

Building CReW Generate Dummy Variables

Background:

This story goes back to the 1980's when I was asked to test new bank software and to document what errors I uncovered.  I completed an Error Incident Report and walked that form to the next building for review by the development manager.  Finding errors meant that I got to leave my windowless cubicle and walk in the California sunshine (at least temporarily).  The developers needed to know what process they would need to perform to recreate the incident and what the expected results of such actions should be.  Upon return to my desk, I would have to summarize the days findings and explain back to the project management what issues were found and what their potential impacts would be on the bank's production environment if left unresolved.

The tried and true waterfall development process left testing until after the requirements and design were completed and I eagerly awaited software to find what were sure to be plenty of bugs.  The biggest contention was whether the "expected" results were well defined in the requirements because I would report "errors" that could be considered as missed requirements.

Creating a CReW macro does not include requirement documents, design specifications, use cases, or an approval process.  In my, Mark's, POV I create what I think captures the essence of a design pattern (assembly of Alteryx tools) into a single tool.  Some of these patterns are fairly simple and straight-forward and reduce the time to configure multiple-tools and some patterns are difficult for beginners to gain the concepts for.  In either case, the tool obfuscates the pattern and makes something easier for the user.

As an Alteryx ACE, I'm able to attend developer days where we have an opportunity to share feedback with the development (sometimes with product managers) team and discuss our views and needs as well as to give our feedback on their ideas.  Observant as I am to the NDA, that's where that story ends and this macro blog begins.

Let's talk Cross Tab"

Who doesn't like a Cross Tab tool?  Frankly, I don't know why it is a "Cross Tab" tool and isn't a "Crosstab" tool.  There are more than 200 ideas in the Alteryx Community returned when you query that term where 1/3 of these ideas are authored by Alteryx ACEs.  Just because there are literally hundreds of ideas posted, doesn't mean that the tool is broken.  It means that improvements are requested.  The cross tab tool is at the heart of the create dummy variables process and this process was the topic of my latest rant to Alteryx.

Cross Tab - Areas of potential improvement

Here are a few of my personal grievances:
  • Configuration is not saved as some new tools do
  • Dropdown and select are not searchable
  • Domain/headers replace non-alphanumeric characters with underscore
  • Output columns are in alphabetical order
  • You need data that supports all possible values otherwise Output columns are not generated
  • Downstream tools fail if not run with sufficient data

Cross Tab - Work-arounds

  • Use a find replace tool to prepend a sort sequence to each field value (to change the incoming data) so that the column name sorts data (e.g. 00_Sunday, 01_Monday).  Later I can dynamically rename the columns
  • Use an Ensure Fields (CReW) macro to create needed columns of data (after Cross Tab). 
  • Place a select in front of the cross tab.  Only bring a minimum amount of fields into cross tab.  Join back on either record sequence or recordid depending on need (e.g. use of AMP or if output records aren't 1:1 to input)
  • Sample the output data (0 records) and save to YXDB. This saves the metadata for a good output.  You can later Union this data within the workflow and ensure order and presence of data 

Let's talk Dummy Variables

After my short (not so) rant, I realized that I should take on the challenge to improve Alteryx myself through the creation of the Generate Dummy Variables tool.  It won't improve the Cross Tab, but does address the issues that take the greatest amount of my time when creating workflows that either model or score data.  With 30 or more community posts on the subject, maybe it will help folks in the same boat as I am in.



Defining Scope:

Since I don't apply requirement documents or use cases to my CReW activities, I need to set boundaries for functionality and time to build.  What can I create in a day?  Yes, when I build a macro I hope to have it created in a day.  That day generally spans for more than a week, but that includes blogs, builds, testing, feedback, documentation and icon selection.  

Scope for Dummy Variables:

The scope is for a single variable (despite my knowledge that you may have multiple source fields to convert).  I want to accept categorical data elements and provide the ability to rename the output columns to include the column name.  If possible, I want to add the functionality of an ENSURE FIELDS macro (CReW), but not require the user to have that macro downloaded.

Scope Creep:

That idea to include the variable name as the first part of the output column name (e.g. Color_G) created a thought that a custom prefix might be preferred to the actual field name.  Then I ran into problems with spaces (thank you Cross Tab) in the output column name field.  When I added the ensure fields capability (to pretend that data values not present were accounted for), I had space problems and NULL value issues that needed more creativity.  Finally, I was ready for testing and went to fellow ACE, Dan Languedoc for testing support.  He suggested (insisted) that I allow for INT fields to be supported as categorical variables.  "Fine!", I'll add them.  He later said that my macro was indestructible.

Development:

I was unhappy with my final product.  I felt that it wasn't easy to walk-through to explain to others and didn't feel satisfied with the macro.  I had added functionality and had to ensure compliance in multiple locations of the workflow.  It was not pretty.  So I salvaged what I could and re-wrote the macro.  It took me about an hour to come up with the current version.

Testing:

While I was building the macro, I was actively testing the macro in another workflow.  I was using different configuration settings as well as different incoming data (e.g. things with and without spaces).  When I want to see what's happening inside of the macro, I use a "debug" trick to place a Macro Output tool before and/or after a tool and look at the results in the secondary workflow.  I can see what the data looks like going into and out of a dynamic rename or other tools that are in the middle of the workflow and can quickly reconnect them to another point of interest.

Defensive Configuration or "APA Readiness":

Will this macro fail in the future?  I hope not.  I hope that it works and that I never have to maintain or support it again.  But what might be worse is to have consistently wrong results.  The biggest risk that I saw was dropping or muliplying records.  I use a TEST tool to ensure that the macro input record count exactly matches the output record count.  If you drop a record (e.g. Create Points) and don't let the user know about it, I think that it is an error.  I also added a check that if you ask to ENSURE field valuess that the macro will error if you don't actually input a field value.  Also, if you copy/paste a field value into the ensured fields, if you don't mark it as required, that we will ignore the request.  

Now if the macro fails, we can all blame Dan.  He said that it was good to go.

P.S.:  Here's a link to the macro!