Friday, 30 January 2026

Spatial Capabilities in Enso Analytics - Creating Spatial Points and Spatial Joins

In our last post on spatial capabilities we looked at visualisations, reading in spatial files, and spatial expressions. In this post we will look at creating points and spatial joins.

A Spatial Question?

Let's do something more interesting and answer a spatial question that needs a join. The question we are going to answer is which county in the UK has the most castles? I will start by showing you the answer and then talk through the steps.



With the dataset that I have the answer is Lincolnshire with 13 castles. 

How did we get there? Let's walk through it step by step.

Creating Spatial Objects Using st_point

Like in the last blog post, we got the castle points using the Overpass API. Looking inside the get_castles component in our main workflow we see this.



This calls the Overpass API with an Overpass query to get the castles in the UK, and then extracts the castle name from the JSON that the API returns.

Then, back in the parent workflow, we trim the data down to the three columns we are interested in: Name, lat and long.


Right now this is still just a standard table of data. To do some spatial processing on it we need to load it into DuckDB and convert those lat/long pairs to spatial objects, which we do with the new st_point component. This component takes a longitude and latitude column and returns a DuckDB spatial object, as we can see below:
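Under the hood, this step maps onto DuckDB's spatial extension. A minimal raw-SQL sketch of what the component is doing might look like the following (the table and column names here are illustrative, not Enso's actual generated SQL):

```sql
-- Load DuckDB's bundled spatial extension (Enso handles this for you).
INSTALL spatial;
LOAD spatial;

-- Convert lat/long columns into DuckDB geometry points.
-- Note: ST_Point takes x (longitude) first, then y (latitude).
CREATE TABLE castles AS
SELECT
    "Name",
    ST_Point(long, lat) AS geom
FROM castles_raw;
```

The longitude-first argument order is a common tripping point: geometry functions work in x/y, while most people say "lat/long".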


Reading In the Boundary Data from a Shapefile

I found the UK county boundaries as a shapefile on the geoBoundaries website (https://www.geoboundaries.org/countryDownloads.html), which I downloaded to my local machine and unzipped.

We then read this in using the read_spatial_file component on our DuckDB connection.
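In raw DuckDB SQL, reading a shapefile goes through the spatial extension's ST_Read table function, which uses GDAL under the hood. A sketch, with an illustrative file name:

```sql
-- ST_Read uses GDAL to load the shapefile as a table,
-- including its geometry column.
CREATE TABLE county_boundaries AS
SELECT *
FROM ST_Read('geoBoundaries-GBR-ADM2.shp');
```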


I wrapped this into a User Defined Component called get_county_boundaries to keep our parent workflow looking clean and easy to read.


Spatial Join

Now that we have both our boundaries and our points as spatial objects in DuckDB, we can use the DuckDB spatial engine to do a spatial join. We do this using the same join component that we would use for a regular data join, but because these are spatial tables we get new spatial join options!


Here we choose Contains, as we want to match a county to a castle when the castle is contained by the county's boundary. We choose a Left_Outer join so we keep all of our counties, even those that don't contain any castles.
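Expressed as raw DuckDB SQL, this join is a left outer join on an ST_Contains predicate. A sketch (the table names, and the shapeName attribute from the geoBoundaries data, are assumptions):

```sql
-- Left outer join: keep every county, even those with no castles.
-- ST_Contains(a, b) is true when geometry a fully contains geometry b,
-- so counties with no castles get a NULL castle name.
SELECT
    c."shapeName" AS county,
    k."Name"      AS castle
FROM county_boundaries AS c
LEFT OUTER JOIN castles AS k
    ON ST_Contains(c.geom, k.geom);
```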


After the join we now have all of our castle data with their containing county attached.

The Answer

The last step to answer our question is to aggregate the data to county level, count the castles, and finally sort:
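In SQL terms this final step is an ordinary GROUP BY over the joined result (names again illustrative):

```sql
-- COUNT(castle) ignores the NULLs produced by the outer join,
-- so castle-free counties correctly score zero.
SELECT
    county,
    COUNT(castle) AS castle_count
FROM castles_by_county
GROUP BY county
ORDER BY castle_count DESC;
```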


Other Questions

Of course, now we have built the workflow we can re-use it to answer other questions. Like: which county has the most pubs?

This time I found a data file at https://www.getthedata.com/open-pubs and loaded it directly into DuckDB using read_file.


The rest of the workflow is the same as the castle one.

In the next post we will look at what DuckDB gives us in terms of handling larger data files, including my favourite csv file: the UK companies house dataset!

Friday, 23 January 2026

Spatial Capabilities in Enso Analytics - Visualisations, Spatial File Formats and Spatial Expressions

This is the third post in my series on DuckDB and Enso Analytics. In the last post we talked about how bringing spatial capabilities to Enso was a big motivation for adding DuckDB. In this post we will take a look at what spatial capabilities we now have and how they work.

Spatial Visualisation

This has actually existed in Enso for a long time now and the capabilities here haven't changed in this release. Enso has the ability to visualise points on a map when a data set has latitude and longitude points.

For example, if I want to use the Overpass API to visualise restaurants in Cambridge (or any other location), I can pull the data and quickly visualise it on a map:


Future releases will expand on this capability to allow more spatial types to be visualised and extra options around the visualisation itself.

Spatial Data formats

This is some new functionality brought to us via DuckDB, so we can start by looking at their help page.

https://duckdb.org/docs/stable/core_extensions/spatial/gdal

Of interest here, following on from my last post about building on DuckDB's spatial capabilities, we can see that DuckDB uses the GDAL library to implement its spatial file reading.

And following through we can see that running this query

SELECT * FROM ST_Drivers();

will give us all of the supported file formats. So let's run that in Enso and see what spatial file support we gained from DuckDB:


That is a huge 54 supported spatial file formats, complete with clickable links to help pages describing each format.

To actually load a spatial file you again need to connect to DuckDB using Database.connect. But don't worry, you don't need an existing DuckDB instance: Enso will just create one in memory for you.

And then you can use read_file to read in your spatial file


The geom column is the spatial object.

It is worth noting that today the spatial visualisation doesn't know how to render these spatial objects (that will come in a future release), so to visualise these points on a map we will have to make use of some of DuckDB's spatial capabilities!

Spatial Expressions

Now that we have some spatial data, what can we do with it? Well, we now have spatial functions in the expression language, and this is a great opportunity to show off another new feature of the latest Enso release: expression autocomplete!

If we add a set component, select expression, and start typing st_, we get a list of the spatial functions, complete with a description of what each does:


To visualise the points on the map we need two columns with the latitude and longitude values. We can create these using the st_latitude and st_longitude functions in two set components, and then we are able to see the points from the shapefile we loaded earlier:
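For anyone working in raw DuckDB SQL rather than Enso's expression language, the equivalent extraction uses the spatial extension's ST_X and ST_Y functions (geometries are x/y, so y is latitude). A sketch with an illustrative table name:

```sql
-- Add latitude/longitude columns so the points can be plotted on a map.
SELECT
    *,
    ST_Y(geom) AS latitude,
    ST_X(geom) AS longitude
FROM my_spatial_table;
```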



In the next blog post we will take a look at spatial joins and writing out spatial files.



Wednesday, 14 January 2026

Standing on the Shoulders of Ducks

 "If I have seen further, it is by standing on the shoulders of ducks..." 

--Isaac Newton (probably... I may have misremembered the quote...)


Why DuckDB?

In the first article in this series about Enso Analytics and DuckDB we looked at why those two pieces of technology work and fit so well together. But why did we want to add DuckDB to Enso Analytics at all? Why go to the effort of adding a whole new complex piece of technology to the product?

And the answer is an easy one: New features!

DuckDB has functionality and capabilities that, prior to the last release, Enso Analytics did not. And not only do we gain all of the functionality DuckDB has built so far, we will also gain all of the innovation and new features that DuckDB adds in the future. That is why this is such an accelerator for what Enso can do, both today and tomorrow.

So in this article we will take a high-level look at what some of that functionality looks like.

Spatial

This is the big one, and in many ways what started us on the journey to DuckDB. Our users have been asking us to add spatial capabilities to Enso for a while now (after all, everything happens somewhere) and it was going to take us too long to build that functionality from scratch. Bringing in a piece of technology that has a rich set of existing functionality, and is open source so we can build on top of it, allows us to accelerate both our own and our users' spatial journeys.

If you want to read more about the details I'd suggest reading some of the DuckDB documentation here https://duckdb.org/2023/04/28/spatial

Because Enso is a dual textual and visual language you don't have to write raw SQL to use the spatial capabilities: you can build your workflow using Enso's easy to use visual programming language.

More on what this looks like in a future post in the series dedicated to spatial.

File Formats

Another feature request we have had is reading and writing Parquet files. Well, again, with DuckDB we get this functionality for free (and with some rather nice performance too!).

Compressed csv files? ✅

From the DuckDB website:

"CSV files are often distributed in compressed format such as GZIP archives (.csv.gz). DuckDB can decompress these files on the fly. In fact, this is typically faster than decompressing the files first and loading them due to reduced IO."

Sounds good! And it is now available in Enso too!
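In raw DuckDB SQL this is completely transparent; a sketch with an illustrative file name:

```sql
-- DuckDB spots the .gz extension and decompresses on the fly,
-- inferring the CSV schema as it goes.
SELECT COUNT(*) FROM read_csv_auto('data.csv.gz');
```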

Fast CSV Reader

And while we are on file formats, the DuckDB CSV reader is rather good too. It is less forgiving than the built-in Enso CSV reader, but for well-formatted CSV files it is very fast. Take for example my favourite CSV file - the UK Companies House file (https://download.companieshouse.gov.uk/en_output.html). This is a 2.7 GB CSV file with 5.6 million rows. Prior to this release I would have said files this size belong in a database. Today I still do, but that database is DuckDB and it ships directly with Enso.


What Does Enso Bring to DuckDB?

Now all this is great, but you might be saying to yourself: "I can do all of this in DuckDB already. Why do I need Enso?"

Well, in the same way that DuckDB has features that Enso doesn't, Enso has features that DuckDB doesn't. Using the two pieces of technology together means you get to use *all* of that functionality, including:

- a rich visual programming environment with live updates showing the results of your changes as you make them

- easy interaction with web APIs: use Enso to query a web API and then combine the results of that with data in DuckDB

- version control and workflow sharing: easily keep track of your analytical pipeline and share it with your team members.

And more! Check out what makes Enso Analytics unique at our website www.ensoanalytics.com

Friday, 9 January 2026

Enso Analytics and DuckDB - A Union That Was Meant To Be

Enso Analytics 2025.3 introduces built-in DuckDB functionality, which is a game changer for what the product can do.

In this series of blog posts I will discuss what this means for you as a user and why Enso and DuckDB are such a good union.

But before we get to what this means for Enso first some background on DuckDB.

Back in 2022, when I was working at another data analytics company in the middle of a number of acquisitions, the CTO from one of those acquisitions came to me (as one of the principal engineers for the data engines) and said he thought we should consider replacing our internal data engine with DuckDB. Now, if you haven't heard of DuckDB (or have heard of it but aren't quite sure what it is), we should take a quick diversion into what DuckDB is.

From the duckdb.org website: DuckDB is a 

  • Fast
  • Analytical
  • In-process
  • Open-source
  • Portable
database system.

A lot of words there, but what do they all mean? Let's break it down. And we will start at the end. 

Database System


It is a database system. Great! So like Snowflake? Or Postgres? Fundamentally yes, with some key differences that we will get to later. At its core it is a database: it stores data in tables and you can query data from those tables using the standard language of the database world, SQL.

So nothing too interesting here: Enso already had support for running its processes inside Postgres and Snowflake, so is this just another DB that it now supports? Kind of. Let's look at some of the other features from the DuckDB website to see why DuckDB might be more than that.

Fast


Great! Everyone wants a fast database, right? But equally, fast is relative, and no database vendor is going to say their database is slow. The real question is: "Fast at what?"


Analytical


Now this one is more interesting. Here at Enso we are all about analytics (it is in our name!), but why does DuckDB call that out as the second point on their home page? What DuckDB means is that their database is designed for analytical use cases; taken together with the previous bullet, their engine is designed to be fast at analytical workloads.

You can read more about that over at DuckDB https://duckdb.org/why_duckdb#fast.


In-process


Now this comes back to why DuckDB is different to Snowflake or Postgres. When they say DuckDB is in-process, they mean it can run as part of the hosting application (in our case, Enso Analytics), as opposed to Snowflake, say, which runs in the cloud, or Postgres, which is typically installed on a server somewhere.


Open-Source and Portable


Both Enso Analytics and DuckDB are open source and can run on Windows, Macs and Linux machines, another checkmark on our list of why we are a perfect match.


Back to 2022 and my meeting with the CTO from the acquired company. My first question was: why? The answer was all the great features, performance and capabilities. Which sounded great, but we had just spent the last few years building our own highly performant data engine. My second question was: how? And this is where it got harder for that company, as their data engine didn't translate directly to SQL, and at times deliberately worked differently to other database engines, so compatibility was going to be a nightmare. Coupled with the fact that we were already trying to combine a lot of different acquired technologies, we really didn't need to throw yet another one into the mixing pot.

So why is it a different story here at Enso Analytics?

Well, the biggest reason is that when we built the Enso product, we built it to be compatible with other database technologies, not different to them. This means we can use the same set of components to run in-memory, in Snowflake, in Postgres... and now in DuckDB. Of course, not all databases support the same functionality: the Enso application lets you do as much in-database as possible, and gracefully tells you when something is not possible in-database and it is time to move the data in-memory. This meant the framework for integrating DuckDB was already there and ready to go. The how was obvious.

And then to the why. For that we go back to the list of DuckDB features and see how each matches against the Enso product.

✅ A fast analytics database - Enso is designed to do analytical work with data, so a fast analytics database is a perfect fit.

✅ In-process - This means DuckDB can be packaged in as part of the Enso installation. You don't need to set up anything apart from installing Enso and you have the power of DuckDB behind an easy to use visual data analytics programming language.

✅ Open-Source and Portable - DuckDB and Enso Analytics both share the same philosophy of being open source and work on any platform.

Oh, and did I mention that DuckDB has a rich set of spatial capabilities? So as of 2025.3, Enso Analytics has spatial support for the first time. But more on that in a future post. In fact, the next post looks at what new features DuckDB brings to Enso.

If you want to try it yourself download a free copy today at EnsoAnalytics.com

Monday, 11 December 2023

Advent Of Code and Alteryx


 Image generated with the assistance of AI [1]

It is now just over 5 years since I tweeted the below on X (Twitter):



and, pulling that tweet up to write this post, I realise I actually got very little engagement. But fast forward five years: at the time of writing (Day 9), over 100 people have solved at least one of this year's problems using Alteryx; there is a very active WhatsApp group (that I struggle to keep up with!) and you can get a shiny Advent of Code badge on the Alteryx Community!

Image from Alteryx community [2]

What is the Advent Of Code?

But what is this Advent Of Code (AoC)? And why do we care that people are solving it in Alteryx?

Advent of Code was created by Eric Wastl in 2015 [3] and is "an annual set of Christmas-themed computer programming challenges that follow an Advent calendar." [4] So they are not specifically data problems, nor designed to be solved with Alteryx; people solve them in all sorts of programming languages. Over the years I have solved them in many languages, including Python, R, Scala and Rust. As a software engineer I find them an interesting way of playing with a new language, learning its capabilities, strengths and weaknesses, and also a way of seeing how other people write code.

So why solve them in Alteryx? Well the simplest answer is why not? Doing strange things with Alteryx has been something I have enjoyed doing for many years. 

The second answer is: because I can :) (Sometimes...) What I noticed back in 2018, and what has held true since, is that Alteryx is very good at solving the early days of the AoC challenges. This is in many ways down to how Eric designed the puzzles: the input to every problem is always a single text file and the answer (or output) is always a single integer (well, two integers, as each day has two parts). So however complex the puzzle, all of the input is contained in a single text file that has to be parsed into its parts before you move on to solving the actual puzzle. And if there is one thing Alteryx does very well, it is parsing text files!

Where things become interesting is as the problems get more complex, we begin to run into the limits of Alteryx as a programming language. Which leads me to my next interesting question:

Is Alteryx A Programming Language?

This is an interesting question. It is certainly not marketed as a programming language: typically we think of a programming language as being made up of code, and Alteryx is talked about as being "code free".

But in reality: yes, Alteryx is a programming language.

As the user drags and drops tools onto the canvas, they are building up an XML document that represents the data transformation they have defined. That XML (or code) is then executed by the Alteryx engine.

If we are going to be more specific, we can say that Alteryx is a "visual programming language" and a "domain-specific language", the domain in question being data analytics. And it is this domain that brings certain limits to the language, which makes some of the Advent of Code problems more challenging...

But Isn't Alteryx Turing Complete?

Well yes. I think it was Steve Ahlgren who famously stated that Alteryx was Turing complete if you used a rock to hold down the run button. But being Turing complete isn't actually that interesting for real world applications. My 10 year old daughter with a pen and a pad of paper is Turing complete. As is Excel [5]. As is "Magic: the Gathering" [6]. But none of those are going to be particularly good for writing arbitrary computer programs in. Turing completeness is a useful concept when we are looking at the theory of what computer algorithms can or can not do. It is not a great measure for what they can practically do with a reasonable amount of computing power and time.

Where are the limits of the Alteryx language?

I think there are two major limitations that you run into in trying to solve Advent Of Code problems in Alteryx: Types and Loops.

(Please add in the comments if you can think of more.)

Types

The only data type in the top-level Alteryx language is the data table. This makes a lot of sense given our previous statement that it is a domain-specific language for manipulating data, but it does limit its capabilities as a more general-purpose language. Of course, within the data table there is a rich type system, and usually you can work within the table to represent the data variables you need.

Loops

I think this is the big one, and usually the point where I give up on AoC in Alteryx for the year. Alteryx does have the concept of loops, in the form of batch macros (loosely equivalent to a FOR loop) and iterative macros (loosely equivalent to a DO WHILE loop), but these can be difficult to configure (especially if you are manipulating complex data variables in tables). More of a problem on the practicality front: they can take a long time to execute. I have built AoC solutions that took over 24 hours to run. Generate Rows can be a good way to avoid a macro solution and simulate a loop, but it can sometimes bring different data-size issues.

Conclusion

Day 9 is usually about as far as I get with Advent of Code and Alteryx, before the busy-ness of family life and December overtakes me. (And having departed Alteryx last week, my trial Alteryx licence will soon run out.) So next year I will be completing AoC in a different language.

Good luck to everyone still playing!

Merry Christmas and a Happy New Year

Adam

References:

[1] Image Creator from Microsoft Designer (bing.com)

[2] Advent of Code 2023 (2023). https://community.alteryx.com/t5/Alter-Nation/Advent-of-Code-2023/ba-p/1211440.

[3] Advent of Code 2023 (2023). https://adventofcode.com/.

[4] Wikipedia contributors (2023) Advent of Code. https://en.wikipedia.org/wiki/Advent_of_Code.

[5] Couriol, B. (2021) 'The Excel Formula language is now Turing-Complete,' InfoQ, 2 August. https://www.infoq.com/articles/excel-lambda-turing-complete/.

[6] Churchill, A. (2019) Magic: The Gathering is Turing Complete. https://arxiv.org/abs/1904.09828.

Tuesday, 5 December 2023

The mountains are calling and I must go

 

Image generated with the assistance of AI [1]


...and I will work on while I can, studying incessantly." - John Muir, 1873 [2]

Some personal news to share with you all today: after almost 13 years working at Alteryx, the time has come for me to move on to new adventures.

My last day was December 1st 2023.

Thank Yous

I have to start with the thank yous, in case some of you don't make it to the end. And after 13 years there are a lot of people to thank, so apologies in advance to anyone I miss.

My biggest thank you has to go to Ned Harding for not only creating the fantastic piece of software that is Alteryx Desktop Designer, but for giving me the opportunity to be a part of building it. Thank you for your mentorship and friendship over the years. And of course thank you to Dean Stoecker and Libby Duane Adams for creating this place I have called home for so long. It has been an amazing journey!

My First Alteryx

This is a hard thank you list to write, as Alteryx has changed so much since when I started to when I left, that in many ways it feels like I have worked for multiple different companies. Each with its own people and styles. And suddenly I am leaving all of them at once, even though some of the earlier incarnations are long gone.

I would like to thank all of the fine folk of the Boulder office and my "first Alteryx" for making myself and my wife so welcome when we first moved there back in 2011. There are too many to name you all but they include: Linda Thompson, Amy Holland, Tara McCoy Giovenco, Rob McFadzean, Rob Bryan, Catherine Metzger, Margie Horvath, Damian Austin, Nathalie Smith, Hannah Keller, Wendy Chow and Kim Hands.

The Spirit Of Alteryx

The next set of thank yous go to the people who embody the "Spirit of Alteryx", starting with Steve Ahlgren and Linda Thompson (one of whom I think coined the phrase) for everything you do to keep the magic of what makes Alteryx special burning, especially in recent years at the Inspire conference and working with the ACEs. Which brings me to the ACEs: one of the most amazing groups of people I have the pleasure of knowing. Some of my happiest moments have been in a room with these people, brainstorming ideas for how to solve complex problems and make the product better. You are too many to call out individually, but I do have to thank Mark Frisch, my teammate on the CReW macros and someone who embodies the Spirit of Alteryx in all that he does.

Projects

I will end with some thank yous for a few of my favourite projects over the years and for the good people who were there with me:

  • The CReW macros - The project that I call my greatest success and failure at Alteryx - Mark Frisch, Chris Love, Joe Mako and Daniel Brun
  • The AMP engine - The second generation massively multi-threaded Alteryx engine. One of the projects that I am most proud of. - Ned Harding, Scott Wiesner, Chris Kingsley, Sergey Maruda, Roman Savchenko and all of the amazing C++ engineers who have contributed to that project.
  • The Black Pearl project - The one that got away... (Naming a project after a cursed pirate ship was perhaps in hindsight asking for trouble, but I still think in a parallel universe this ship still sails on and would have been a great feature.) - Boris Perusic, David Vonka and all of the good crew who sailed that short but exciting voyage with us.
  • Control Containers - My swan song. Another one that I wasn't sure would make it out at times, but I am so happy that it did. My last big contribution to desktop designer. - The one and only Jeff Arnold who pulled it over the line with me.

Apologies again to all the people who I have not been able to mention by name. I thank all of you for your contributions over the years.

My Journey

It was over 15 years ago that I first discovered a product that I then called Alteryx, made by a company called SRC, and that you would now call Desktop Designer, made by a company called Alteryx. And it is fair to say that product has been one of the great loves of my life. If you have ever seen the film Good Will Hunting, there is a scene in it where Matt Damon's character (Will) is trying to explain to Minnie Driver's character (Skylar) how he is so good at Maths:

Will : Beethoven, okay. He looked at a piano, and it just made sense to him. He could just play.
Skylar : So what are you saying? You play the piano?
Will : No, not a lick. I mean, I look at a piano, I see a bunch of keys, three pedals, and a box of wood. But Beethoven, Mozart, they saw it, they could just play. I couldn't paint you a picture, I probably can't hit the ball out of Fenway, and I can't play the piano.
Skylar : But you can do my o-chem paper in under an hour.
Will : Right. Well, I mean when it came to stuff like that... I could always just play. [3]

Well for me, when it came to Alteryx, I could always just play.

And what a journey that has taken me on. I have gone from a data analyst to a principal software engineer. From an individual contributor to a director of over 50 engineers. From my first Inspire, feeling absolutely terrified talking in front of 20 people to the London Inspire at Tobacco Dock, closing out the conference on the main stage with Ned.

I have laughed, cried (only once in the office), grown personally and professionally in more ways than I could ever have imagined, and made some lifelong friends who I am so happy are part of my life.

Please don't be a stranger. Always happy to meet up for a beer if you are ever in London, or jump on a zoom call and talk data and analytics.

Where Next?

Whenever someone posts a leaving post on LinkedIn, there is always the comment that asks "where are you off to?" Well, good readers, that is another book yet to be written, and one you will have to wait a little while to start reading. But for now, let us say I am excited for a new adventure, and I will leave you with a haiku from Alteryx past:

Chaos reigns within.

Reflect, repent, and restart.

Order shall return. [4]

Adam

References: