This story goes back to the 1980s, when I was asked to test new bank software and document the errors I uncovered. I completed an Error Incident Report and walked that form to the next building for review by the development manager. Finding errors meant that I got to leave my windowless cubicle and walk in the California sunshine (at least temporarily). The developers needed to know what steps they would have to perform to recreate the incident and what the expected results of those steps should be. Upon returning to my desk, I would summarize the day's findings and explain to project management what issues were found and what their potential impact on the bank's production environment would be if left unresolved.
The tried-and-true waterfall development process left testing until after the requirements and design were completed, and I eagerly awaited the software so I could find what were sure to be plenty of bugs. The biggest point of contention was whether the "expected" results were well defined in the requirements, because I would report "errors" that could be considered missed requirements.
Creating a CReW macro does not involve requirement documents, design specifications, use cases, or an approval process. From my (Mark's) point of view, I create what I think captures the essence of a design pattern (an assembly of Alteryx tools) in a single tool. Some of these patterns are fairly simple and straightforward, and the macro reduces the time needed to configure multiple tools; other patterns are difficult for beginners to grasp. In either case, the tool hides the pattern and makes something easier for the user.
As an Alteryx ACE, I'm able to attend developer days, where we have an opportunity to share feedback with the development team (and sometimes with product managers), discuss our views and needs, and respond to their ideas. Observant as I am of the NDA, that's where that story ends and this macro blog begins.
Let's talk Cross Tab
Who doesn't like a Cross Tab tool? Frankly, I don't know why it is a "Cross Tab" tool and isn't a "Crosstab" tool. A query for that term in the Alteryx Community returns more than 200 ideas, a third of them authored by Alteryx ACEs. Just because there are literally hundreds of ideas posted doesn't mean that the tool is broken. It means that improvements are requested. The Cross Tab tool is at the heart of the create-dummy-variables process, and this process was the topic of my latest rant to Alteryx.
Cross Tab - Areas of potential improvement
Here are a few of my personal grievances:
- Configuration is not saved, as it is in some newer tools
- Dropdown and select are not searchable
- Domain/headers replace non-alphanumeric characters with underscore
- Output columns are in alphabetical order
- You need data that supports all possible values; otherwise, output columns are not generated
- Downstream tools fail if the workflow is run without sufficient data
Cross Tab - Work-arounds
- Use a Find Replace tool to prepend a sort sequence to each field value (changing the incoming data) so that the column names sort in the desired order (e.g. 00_Sunday, 01_Monday). Later I can dynamically rename the columns.
- Use an Ensure Fields (CReW) macro to create the needed columns of data (after the Cross Tab).
- Place a Select in front of the Cross Tab and bring only the minimum number of fields into it. Join back on either record sequence or RecordID, depending on need (e.g. use of AMP, or if output records aren't 1:1 with input).
- Sample the output data (0 records) and save it to a YXDB. This saves the metadata of a good output. You can later Union this data within the workflow to ensure the order and presence of the columns.
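For readers who think in code, here is a minimal sketch of the idea behind the work-arounds above, written in plain Python rather than Alteryx. It is not how the Cross Tab tool or the Ensure Fields macro is implemented; the `cross_tab` function, the `domain` parameter, and the sample `sales` rows are all hypothetical names made up for illustration. Without a domain, the sketch mimics the grievances (only observed values become columns, in alphabetical order). With a domain, every expected column is created in the order you ask for.

```python
from collections import defaultdict

# Assumed full list of expected header values, in the desired column order.
WEEKDAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"]


def cross_tab(rows, key, header, value, domain=None):
    """Pivot rows into one dict per key, with one column per header value.

    Hypothetical helper for illustration only. Values are summed per cell.
    If `domain` is given, it alone decides which columns exist and in what
    order; observed values outside the domain are dropped.
    """
    totals = defaultdict(dict)
    seen = set()
    for row in rows:
        cells = totals[row[key]]
        cells[row[header]] = cells.get(row[header], 0) + row[value]
        seen.add(row[header])
    # No domain: only values present in the data, sorted alphabetically.
    columns = list(domain) if domain is not None else sorted(seen)
    return [
        {key: k, **{c: cells.get(c) for c in columns}}
        for k, cells in totals.items()
    ]


# Made-up sample data: only two of the seven weekdays appear.
sales = [
    {"store": "A", "day": "Monday", "amount": 10},
    {"store": "A", "day": "Sunday", "amount": 5},
    {"store": "B", "day": "Monday", "amount": 7},
]

plain = cross_tab(sales, "store", "day", "amount")
ensured = cross_tab(sales, "store", "day", "amount", domain=WEEKDAYS)

print(list(plain[0]))    # columns driven by the data
print(list(ensured[0]))  # columns driven by the expected domain
```

Cells with no supporting data come back as `None` in the second call, which is the code equivalent of a column that exists but is empty, so anything downstream that expects a Tuesday column still finds one.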
Let's talk Dummy Variables
Scope for Dummy Variables:
That idea to include the variable name as the first part of the output column name (e.g. Color_G) led to the thought that a custom prefix might be preferred to the actual field name. Then I ran into problems with spaces (thank you, Cross Tab) in the output column name field. When I added the Ensure Fields capability (to pretend that data values not present were accounted for), I had space problems and NULL value issues that needed more creativity. Finally, I was ready for testing and went to fellow ACE Dan Languedoc for testing support. He suggested (insisted) that I allow INT fields to be supported as categorical variables. "Fine!" I'll add them. He later said that my macro was indestructible.
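To make those requirements concrete, here is a rough sketch in plain Python of what they add up to: a prefix on every output column, names made safe when they contain spaces, NULLs given a column of their own, integers accepted as categories, and expected values created even when the data doesn't contain them. This is not the macro's actual logic. The `dummy_variables` function, its `expected` parameter, and the choice of "NULL" as the label for missing values are assumptions made for illustration.

```python
import re


def dummy_variables(values, prefix, expected=None):
    """One-hot encode values into columns named '<prefix>_<value>'.

    Hypothetical helper for illustration only. Note that None and the
    literal string "NULL" would share a column in this sketch.
    """
    def label(v):
        # Treat None as its own category; str() lets integers work too.
        text = "NULL" if v is None else str(v)
        # Replace anything that is not a letter, digit, or underscore,
        # so spaces in the prefix or the value cannot break the name.
        return re.sub(r"\W", "_", f"{prefix}_{text}")

    # Expected values come first, in the order given, followed by any
    # extra values that show up in the data.
    columns = []
    for v in list(expected or []) + list(values):
        name = label(v)
        if name not in columns:
            columns.append(name)

    return [{c: int(c == label(v)) for c in columns} for v in values]


# Made-up sample data: one known value, one NULL, one value with a space.
colors = dummy_variables(["G", None, "Light Green"], "Color",
                         expected=["R", "G", "B"])
sizes = dummy_variables([1, 2, 1], "Size")  # integers as categories

print(list(colors[0]))
print(colors[0])
print(sizes)
```

Color_R and Color_B are created even though no row contains them, which is the "pretend the missing values were accounted for" behavior described above.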