Monday 5 November 2018



Trick or Treat:  Find Closest 

I've been tinkering with spatial requirements from clients and have found myself down a rabbit hole.  The result was the creation of a new macro, Find Closest.
Unlike the Distance tool, the Find Closest macro calculates distances without the need for a spatial object.  In my speed comparisons, this macro was 100 times faster than the Distance tool.  It also differs from the Alteryx Find Nearest tool in that it does not use the spatial or drivetime functions to calculate distances.  Under the covers it uses the Haversine formula, which calculates great-circle distances on a sphere.  Although the earth is not a perfect sphere (it is slightly flattened at the poles), the formula does very well: over distances the size of a typical trade area, the results are virtually exact.  I've shown this macro and formula to a few ACE friends, @andy_moncla and @Treyson, for their opinions and have had positive feedback.  I am going to release it via the Alteryx Gallery as a CReW macro, with the caveat that the calculated distances may vary from those produced by the Distance tool.
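
For readers who have not met it before, the Haversine formula can be written as follows.  This is the standard textbook form, not copied from the macro itself: the symbols are my own labels, the latitudes φ and longitudes λ are in radians, and R is the radius of the sphere (a mean earth radius of about 6,371 km is a common assumption).

```latex
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
  + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right)

d = 2R \,\arcsin\!\left(\sqrt{a}\right)
```

Because the formula needs only sines, cosines, a square root and an arcsine, it can be evaluated with ordinary math functions and no spatial object.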

This macro is simple, but it offers features of the “Distance”, “Find Nearest” and “Create Points” tools in one macro.  While reviewing a workflow with a large set of customers and potential store locations, I was hoping to improve performance without the use of drivetime data.  The client could then take the potential matches through an API call to gather drivetime data (if needed) for the subset of “best” records.  Starting with a latitude and longitude, I was pleased to see the performance gains from a single Formula tool.  After validating the results, I wanted to make a general-purpose tool for the calculation of distances.  With the client's needs in mind (note: the engagement had already been completed), I started to think about how other analysts might benefit from a tool like this.  The first prototype took roughly 4 hours to build, test and review.  After that, I tinkered with more user options and the creation of a macro image.

Description:
The Find Closest macro calculates the distance between two sets of latitude & longitude coordinates.  Given input streams of Target and Source locations, it calculates the distance between all combinations of Targets and Sources.  Optionally, it can accept a single stream of data that contains both the source and target coordinates (for distance calculations only).  When calculating via the dual inputs, it allows the analyst to limit the matches that are output (by distance or by quantity) and can, if required downstream, create spatial objects.

Features:
  • Inputs
    • Allows for either 1 or 2 input data streams
  • Distance Configuration
    • Ignore 0 Distance Matches
  • Find Closest Matches
    • Maximum Distance (Kilometre or Mile)
    • Limit by quantity
  • Spatial Objects
    • Optional creation of spatial object(s)
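
To make the dual-input options above concrete, here is a minimal Python sketch of the same idea.  It is not the macro's implementation: the function names, the `(id, lat, lon)` record layout, the parameter names and the earth radius are all my own assumptions for illustration.

```python
import math

def distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0088):
    """Haversine distance in km; the radius is an assumed mean earth radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))

def find_closest(targets, sources, max_km=None, limit=None, ignore_zero=False):
    """Hypothetical helper: targets and sources are lists of (id, lat, lon).

    Returns {target_id: [(source_id, km), ...]} with the nearest match first.
    """
    results = {}
    for t_id, t_lat, t_lon in targets:
        matches = []
        for s_id, s_lat, s_lon in sources:  # every Target/Source combination
            km = distance_km(t_lat, t_lon, s_lat, s_lon)
            if ignore_zero and km == 0:
                continue  # "Ignore 0 Distance Matches"
            if max_km is not None and km > max_km:
                continue  # "Maximum Distance"
            matches.append((s_id, km))
        matches.sort(key=lambda match: match[1])
        # "Limit by quantity": keep only the nearest `limit` matches
        results[t_id] = matches if limit is None else matches[:limit]
    return results

# Illustrative data only: one customer and three candidate store locations.
customers = [("cust_1", 42.3314, -83.0458)]
stores = [("detroit", 42.3314, -83.0458),
          ("ann_arbor", 42.2808, -83.7430),
          ("chicago", 41.8781, -87.6298)]
nearby = find_closest(customers, stores, max_km=100, ignore_zero=True)
```

With these inputs, the store that shares the customer's coordinates is skipped as a zero-distance match and the far-away store is dropped by the maximum distance, leaving a single candidate.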
Benefits:
  • Speed.  Tested on 1 million records, this macro ran in under 3 seconds, compared to over 12 minutes using spatial tools.
  • Spatial Data.  Because the distances are only "as the crow flies", no spatial data is required (Find Nearest requires TomTom data).
  • Create Points.  The Create Points tool is NOT required.  An option to create Target, Source or Both points is available if they are needed downstream.
  • Output Miles.  Both Kilometre and Mile distances are displayed on output.
Notes:
If you take a look under the covers of this macro, you will find the Haversine formula implemented using native mathematical expressions.  Detours are used to skip unnecessary functions and, as a last-minute addition, I added a Formula tool to create spatial objects (if someone really wants them on output).  If you haven’t discovered it yet, you can perform many spatial functions from within the Formula tool.  Please check out my YouTube video that demonstrates the raw power of this elegant beast in computing distances.  If you haven’t yet subscribed to my YouTube channel, please do consider it.
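
If you would like to sanity-check the numbers outside of Alteryx, the same calculation can be sketched in a few lines of Python using only the standard math module.  This is an illustration rather than the macro's own expressions: the function names and the mean earth radius constant are my assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean earth radius
KM_PER_MILE = 1.609344       # exact length of the international mile in km

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 so rounding error can never push asin() out of its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def haversine_miles(lat1, lon1, lat2, lon2):
    """The same distance expressed in miles."""
    return haversine_km(lat1, lon1, lat2, lon2) / KM_PER_MILE
```

Returning both units mirrors the macro's output, which shows Kilometre and Mile distances side by side.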

As always, your comments and feedback are welcomed.  You can download the macro here.
