Wednesday, 18 March 2015

Testing Alteryx Macros - Part 3 - Comparing Data Values

OK time to write some more tests.  But first we need to add some more functionality to our "Expect Equal" test macro.  If you recall from out last post, currently all it does is check if the column names match between the expected and actual data.  Our next series of tests are going to consider the data in those tables which is going to need some new functionality to tell us if the data differs.

It looks like this:

The old functionality is still there above these tools, but I omitted it from the screenshot for clarity.

So what is this doing?  Well the Record ID tools are adding a row number onto both datasets and then the transpose tool uses this as a key and all other fields (importantly including unknowns) as data fields.

The Join then joins the two streams together on the row number and field name.

Finally the message tool is configured:

Whenever the data differs we get a nice error message:

Message (30) Unexpected Value - Field:Total Row:1 Expected:1 Actual:0

With this in place we can write our next tests.  So just a quick reminder of the TDD work cycle:
  1. Write a test for a piece of functionality which will fail.
  2. Write some code which makes the test pass.
  3. (Optionally) Re-factor your code.  Without adding any new functionality and without breaking any tests.
and we are off.

Write a Failing Test

The next smallest piece of functionality I could think of is this:

"Single Field Named Field1 With Value 1 Gives Total of 1"

I add a new module to my test folder called Single Field Named Field1 With Value 1 Gives Total of 1.yxmd and copy the module from my first test

All I change is the incoming test data to

and the Expected data to

Then I save the module and run my test runner which gives me this log

Great a failing test with a clear and precise error message.

Make The Test Pass

So next the smallest amount of code to make the test pass.  Easy open up our Add Totals macro

And just change the hard coded formula value from 0 to 1.  Run the tests again


Write a Failing Test

Well I can quickly see where my current implementation is going to fall down.  What if my user's data contains something other than the value 1?  So let us write a test to exercise that:

"Single Field Named Field1 With Values 1,2,3 Gives Totals 1,2,3"

So again copy and paste one of our existing tests and update the data

Incoming test data

Expected data
And run the tests

Fails as expected and again a pleasing and exact error message.

Make The Test Pass

Again easy enough to make this test pass.  We just update our formula to [Field1]


Oh dear.  Our new test is passing, but our first test "Adds A Column Called Total" has failed.  What's happened?

So the error message is telling us that our new formula is giving a string result.  If we recall our first test our fields didn't have any data in them at all.  This means that the text input tool has no data to use to determine the type of Field1 so defaults to a type of V_WString.  So yes trying to use that in a formula which should result in a double is an error.

But that's not a very good error for users of our macro.  It refers to internal tools in our macro and doesn't tell the user what to do to fix it.  We can do better than that, but not right now.  Right now we are working on making our tests pass having added a new one.  We shouldn't be adding any new functionality.

In TDD it is often helpful to have a small list somewhere (I tend to scribble on a pad on my desk) of tests that you are going to add in the near future, then when you find things like this you can simply add them to the list and get back to the actual problem you are working on.  So we'll do just that

Tests To Add
- If a user tries to total a string column they get a meaningful error message.

This is some of the strength in the TDD approach you are exploring ways in which your macro will be used and can break as you develop it.

So back to the problem at hand.  Our first test was never meant to test anything about wrong field types, it was just supposed to check the Total column got added.  So let's just fix up the test with a couple of select tools to force our columns to be numeric types


All good!

Write a Failing Test

"Two columns Field1, Field2 With Values 1,2,3 Gives Totals 2,4,6"

Again the test looks identical to our previous ones just with updated Test and Expected data.

Make The Test Pass

Now we're beginning to get to some real implementation.  The naive fix of updating our formula to [Field1] + [Field2] isn't going to cut it as it would fail on the tests which don't have Field2.

I need to dynamically get the names of my fields, use them to build up my formula and then apply that formula to my data. 

Which implemented looks like this

Quick overview from left to right:
  • The field info tool gets all of the field names
  • The summarize tool builds the total formula by concatenating the field names with a + separator
  • The formula tool sets up the Type, Size and Field Name that we need for the dynamic formula tool
  • The dynamic formula tool applies the formula to the data
Run the tests

and not only do we know we have implemented out latest piece of functionality, but we can also see that we have not broken any functionality that we previously added.

Next Time

Now we are beginning to get something that is beginning to look like it might be useful.  Looking at my list of future tests I think now is going to be a good point to tackle that better error message if my user tries to total a string field, which we will look at in my next post.

Wednesday, 11 March 2015

Testing Alteryx Macros - Part 2 - Our First Test and the Start of a Test Framework

At the end of my previous post I talked about the need for a test framework to start writing our tests.  I am going to build it out piece by piece as I need it.

Recall our first test was going to be

"The macro will add a new column called Total"

So what does a test look like for this?  Well it is an Alteryx module like this:

Let me talk you through it tool by tool.

The bottom left text input is our test input data.  It is configured like so

It is a single field called Field1 and has no data.  Having no data might seem a little strange, but if you look back to our test description, all we are testing is that a new column is added, at this point we don't care about the data just the columns.  Testing the data will be a future test.

The blue circle is the Add Totals tool we are building out, except that remember we haven't added any functionality yet, so right now all it does is pass the input data straight through and looks like this:

The top left text input is our expected data.  It is configured like this

So again we have no actual data, but this is setting the expectation of what we expect our macro to output in order for us to say it passes the test "The macro will add a new column called Total".

That is to say if we put a browse tool after the expected data text input and one after the macro we want the contents of those two browses to be identical.

Which brings us to our final tool and the first part of my test framework that I have built.  It is the "Expect Equal" macro.  The premise of it is fairly simple:  It takes two inputs: an Expected Data and an Actual data and if they differ will throw an error.  Hopefully the error will be detailed enough to quickly explain the difference.  

As I mentioned at the start of this post I am building up the functionality of the framework as I need it.  So right now all it compares is the field names on the two data streams, like so:

So we have a field info tool on each input. We join on name and then collect the mismatches.  If there are any we summarize these into a single record and use a message tool to issue an error message.

All of this together means that if we run our first test module (the one at the top of this page) we get the following output log:

Our first failing test!  With a useful error message telling us the field "Total" is missing from our actual data (of course because we haven't added anything to create it yet).  A good start to our TDD approach to macro building.  

Now before we go on to writing some code to make our test pass I want to look at a better way to run our tests.  My plan is that each test that we identify and build will be a separate Alteryx module and the name of that module will be the name and condensed description of the test.  That way if we break one of our tests we get fast feedback about what we broke.

My first module is called "Adds A Column Called Total.yxmd"  (I can use spaces in windows file names and it improves readability so why not?)

I will keep all of the tests together in a folder and what I want is a one button way to run them all at once.  And that brings us to the second part of our test framework the "Test Runner" macro

This macro will look in a particular folder (where we have saved all of our tests) and run each of the tests in turn and report a pass or fail for them.  So right now it produces this output log

From this log we can see straight away which test is failing and also the error message which is causing it to fail.

And how does the Test Runner macro work?  By using the CReW runner macros of course!

A quick run through of the tools from left to right:
  • The Directory tool picks up all of our test modules from the folder we have saved them in.
  • The List Runner macro runs each of those tests.
  • The first Message tool issues a passed info message for tests which do not have errors.
  • The second Message tool issues a failed error message for tests which have errors.
  • The Log Parser macro breaks the logs out into a record per row.
  • The final Message tool issues info messages for the errors that occurred in the failing test the second row that you can see in the log above.
  • The Browse is just there for testing.
It was a great feeling when I realized that the runner macros were exactly what I needed to solve this problem.

So all there is left for us to do is write some functionality to make our test pass and that functionality is as simple as adding a formula tool to our macro

The macro creates a new field called Total and hard codes it to zero.  "Wait a second" you might be saying, "that doesn't seem very useful".  And right now it isn't.  But a guiding principal of TDD is small, fully tested steps.  Write just enough test to make it fail and then juts enough code to make it pass.  And right now we have no tests about what data is in the field, just that the field is created.  So that formula tool is "just enough" code to make our test pass; and if we save the macro and re-run our test runner macro we get the following log

Success! We have completed our first TDD cycle.  Given we have just written the code there really shouldn't be anything to re-factor at this point so time to write our next failing test.

We will continue to build out this macro and the Test Framework in the next post.

Wednesday, 4 March 2015

Testing Alteryx Macros - Part 1 - A Question

I have been meaning to write a post about testing Alteryx macros for quite a long time now, but for some reason have never quite got started on it.  I hope that now is the time that I can finally get some of my thoughts down in writing.

And this series of posts is going to a start with a question.  A question whose answering I hope will take me to all the places that I want to talk about regarding testing Alteryx macros.  As I write this first post I don't know the answer to my question, but I also have a feeling that I will still achieve what I want from these posts regardless of what that answer may be.  After all "Life is a journey, not a destination".

So here is the question that I want to answer:

"Is it possible to use Test Driven Development (TDD) to develop an Alteryx macro?"

For anyone unfamiliar with TDD then I recommend a quick read of the Wikipedia article here which will fill you in on the basics.  But the work cycle looks like this:

  1. Write a test for a piece of functionality which will fail.
  2. Run the tests and see the test fail.
  3. Write some code which makes the test pass.
  4. Run the tests and check they all pass.
  5. (Optionally) Re-factor your code.  Without adding any new functionality and without breaking any tests.
  6. Repeat.

Working in this way means you add functionality in small and fully tested steps until you have built up a larger feature.  The nice thing with knowing that all the functionality is tested is that when you come to re-factor in step 5 you can do so confident in the knowledge that you haven't broken anything.

So how do we apply this to building an Alteryx macro?  Well let's have a go.  

For this example I am going to try and rebuild the CReW macro "Add Totals".  This is a fairly simple macro which allows the user to select a number of numeric columns and then adds a total row and total column on to the data table.  

This seems a good macro to start with as it has a reasonable amount of complexity (but not too much) and also has the benefit that I wrote it a long time ago and have no idea how it currently works.  I intend to build my new version without looking and then compare the two at the end.

OK so let's start.  Step 1 we need to identify a small piece of functionality and write a test for it which fails.  So I can imagine that the first piece of functionality we might identify is adding just a total column for a data stream which has a single field called Field1 on the incoming data stream.  Not very useful in itself, but is the smallest first step I can think to take and TDD emphasizes small steps.

So how do we write a test for that?  Ah... We have hit our first problem.  For other programming languages, where TDD is regularly used, there exist unit test tools.  Right now for Alteryx there really isn't an available tool for us to use.

Now as any of you that know me well will know, one of my mantra's with Alteryx is "if it doesn't exist then build it" and so building a Alteryx testing tool is going to be stop 1 on our journey towards TDDing an Alteryx macro.  Look out for my next post when will we start to do just that.