Thursday, 10 December 2020

Building CReW Generate Dummy Variables


This story goes back to the 1980's when I was asked to test new bank software and to document what errors I uncovered.  I completed an Error Incident Report and walked that form to the next building for review by the development manager.  Finding errors meant that I got to leave my windowless cubicle and walk in the California sunshine (at least temporarily).  The developers needed to know what process they would need to perform to recreate the incident and what the expected results of such actions should be.  Upon return to my desk, I would have to summarize the days findings and explain back to the project management what issues were found and what their potential impacts would be on the bank's production environment if left unresolved.

The tried and true waterfall development process left testing until after the requirements and design were completed and I eagerly awaited software to find what were sure to be plenty of bugs.  The biggest contention was whether the "expected" results were well defined in the requirements because I would report "errors" that could be considered as missed requirements.

Creating a CReW macro does not include requirement documents, design specifications, use cases, or an approval process.  In my, Mark's, POV I create what I think captures the essence of a design pattern (assembly of Alteryx tools) into a single tool.  Some of these patterns are fairly simple and straight-forward and reduce the time to configure multiple-tools and some patterns are difficult for beginners to gain the concepts for.  In either case, the tool obfuscates the pattern and makes something easier for the user.

As an Alteryx ACE, I'm able to attend developer days where we have an opportunity to share feedback with the development (sometimes with product managers) team and discuss our views and needs as well as to give our feedback on their ideas.  Observant as I am to the NDA, that's where that story ends and this macro blog begins.

Let's talk Cross Tab"

Who doesn't like a Cross Tab tool?  Frankly, I don't know why it is a "Cross Tab" tool and isn't a "Crosstab" tool.  There are more than 200 ideas in the Alteryx Community returned when you query that term where 1/3 of these ideas are authored by Alteryx ACEs.  Just because there are literally hundreds of ideas posted, doesn't mean that the tool is broken.  It means that improvements are requested.  The cross tab tool is at the heart of the create dummy variables process and this process was the topic of my latest rant to Alteryx.

Cross Tab - Areas of potential improvement

Here are a few of my personal grievances:
  • Configuration is not saved as some new tools do
  • Dropdown and select are not searchable
  • Domain/headers replace non-alphanumeric characters with underscore
  • Output columns are in alphabetical order
  • You need data that supports all possible values otherwise Output columns are not generated
  • Downstream tools fail if not run with sufficient data

Cross Tab - Work-arounds

  • Use a find replace tool to prepend a sort sequence to each field value (to change the incoming data) so that the column name sorts data (e.g. 00_Sunday, 01_Monday).  Later I can dynamically rename the columns
  • Use an Ensure Fields (CReW) macro to create needed columns of data (after Cross Tab). 
  • Place a select in front of the cross tab.  Only bring a minimum amount of fields into cross tab.  Join back on either record sequence or recordid depending on need (e.g. use of AMP or if output records aren't 1:1 to input)
  • Sample the output data (0 records) and save to YXDB. This saves the metadata for a good output.  You can later Union this data within the workflow and ensure order and presence of data 

Let's talk Dummy Variables

After my short (not so) rant, I realized that I should take on the challenge to improve Alteryx myself through the creation of the Generate Dummy Variables tool.  It won't improve the Cross Tab, but does address the issues that take the greatest amount of my time when creating workflows that either model or score data.  With 30 or more community posts on the subject, maybe it will help folks in the same boat as I am in.

Defining Scope:

Since I don't apply requirement documents or use cases to my CReW activities, I need to set boundaries for functionality and time to build.  What can I create in a day?  Yes, when I build a macro I hope to have it created in a day.  That day generally spans for more than a week, but that includes blogs, builds, testing, feedback, documentation and icon selection.  

Scope for Dummy Variables:

The scope is for a single variable (despite my knowledge that you may have multiple source fields to convert).  I want to accept categorical data elements and provide the ability to rename the output columns to include the column name.  If possible, I want to add the functionality of an ENSURE FIELDS macro (CReW), but not require the user to have that macro downloaded.

Scope Creep:

That idea to include the variable name as the first part of the output column name (e.g. Color_G) created a thought that a custom prefix might be preferred to the actual field name.  Then I ran into problems with spaces (thank you Cross Tab) in the output column name field.  When I added the ensure fields capability (to pretend that data values not present were accounted for), I had space problems and NULL value issues that needed more creativity.  Finally, I was ready for testing and went to fellow ACE, Dan Languedoc for testing support.  He suggested (insisted) that I allow for INT fields to be supported as categorical variables.  "Fine!", I'll add them.  He later said that my macro was indestructible.


I was unhappy with my final product.  I felt that it wasn't easy to walk-through to explain to others and didn't feel satisfied with the macro.  I had added functionality and had to ensure compliance in multiple locations of the workflow.  It was not pretty.  So I salvaged what I could and re-wrote the macro.  It took me about an hour to come up with the current version.


While I was building the macro, I was actively testing the macro in another workflow.  I was using different configuration settings as well as different incoming data (e.g. things with and without spaces).  When I want to see what's happening inside of the macro, I use a "debug" trick to place a Macro Output tool before and/or after a tool and look at the results in the secondary workflow.  I can see what the data looks like going into and out of a dynamic rename or other tools that are in the middle of the workflow and can quickly reconnect them to another point of interest.

Defensive Configuration or "APA Readiness":

Will this macro fail in the future?  I hope not.  I hope that it works and that I never have to maintain or support it again.  But what might be worse is to have consistently wrong results.  The biggest risk that I saw was dropping or muliplying records.  I use a TEST tool to ensure that the macro input record count exactly matches the output record count.  If you drop a record (e.g. Create Points) and don't let the user know about it, I think that it is an error.  I also added a check that if you ask to ENSURE field valuess that the macro will error if you don't actually input a field value.  Also, if you copy/paste a field value into the ensured fields, if you don't mark it as required, that we will ignore the request.  

Now if the macro fails, we can all blame Dan.  He said that it was good to go.

P.S.:  Here's a link to the macro!


  1. You need to have a good network marketing strategy, using the Internet to provide online marketing tool. With Internet marketing company, will help you to get your website in search engine advertising and optimization services provided by many of the results, but eventually you need to determine what kind of network marketing will give you the best of a product line. Leadgeneration

  2. There are a number of quality construction career training possibilities that allow students to train for a career in construction. Students looking to pursue the necessary education to work in construction can do so by enrolling in an accredited vocational or trade school. Students can choose to specialize in a number of areas. Students can train for certificates or degree. Degrees range from an associates to a masters level. آرین سازه

  3. They will thusly educate you with respect to whatever your task should hold fast to in the Building Regulations. XMX

  4. Assuming you have an endorsed examiner giving the Building Control Service, the duty regarding ensuring that Building Regulations are conformed to is theirs.

  5. except if you're building reviewer is 130 years of age the consolidated insight of the organization is of little advantage. The experience and information on the individual building and additionally bug monitor nearby is what matters.

  6. When starting a job or start a new business, there are many difficulties encountered. Try to be persistent and achievements will bring good results to those who try.

  7. Hello! Today there are many great applications and software. If you follow someone now, it's really easy with today's developed technology. In the world of technology, there are a variety of applications, it will be difficult for you to choose, here, will help you solve, give advantages and disadvantages, suitable for your needs.

  8. This comment has been removed by the author.

  9. Probate real estate investing provides an opportunity for estate administrators to sell real estate holdings. This is particularly beneficial for administrators who are struggling to pay mortgage payments or maintain upkeep on property held in probate. pif finance

  10. Antistatic corrugated box for packaging Item Antistatic corrugated box Material Paperboard, corrugated Color Black MOQ 3,000pcs FOB Price USD 0.1-1.0/pcs Surface disposal Embossing Hot stamping, glossing/matte lamination, varnishing, glitter, creasing, die cutting Delivery time 10-15days. Visit our website anti static box to learn more.

  11. This site has particular software articles which emit an impression of being a significant and significant for you individual, able software installation. This is the spot you can get help for any software installation, usage, and crack.
    ez cd audio converter Crack
    wondershare tidymymusic Crack
    cyberghost vpn Crack
    macro recorder Crack