“Hey world — let me get my data!”

Brett Uglow
5 min read · Jun 20, 2021

My name is Brett Uglow. I’m a software engineer… and a bot-writer. But I don’t want to be a bot-writer. I just want my data.

This is the story of how I came to write web-bots, and why I’d rather not send my bots to your organisation’s website every day. So if you are from a digital service provider of any kind, keep reading. This article will hopefully give you more insight into why some people use bots, and how you can meet their needs more easily.

But first, some background. A “bot” is a computer program that does something useful — like automating a repetitive task. I write web-bots — bots that go to websites and get data. Website-makers generally hate web-bots, and do everything they can to make it difficult for web-bots to access their site. Conversely, web-bot writers do everything they can to avoid detection and keep getting the data they want to get.

Bad bots, good bots

Bots can be used to do a lot of damage to a website. They can send thousands of requests to the website, causing its service to degrade. They can register fake accounts. They can click on ads to generate fake ad-revenue. And, and, and… the list is endless. So it’s understandable that website creators do not want to support bots on their site.

But, there are also good bots. For example, there are bots that take screenshots of a website & compare them to baseline images to verify that the website looks the way it should. Other bots can check whether a website is down or not¹. Yet others can perform security scans to help avoid data being stolen.

So like any technology, bots are not intrinsically bad or good. It all comes down to how they are used, and why.

Losing my bot-ginity

My first experience with bots was when I worked for Telstra in the Cable Management division, fresh out of university. Telstra — in conjunction with Foxtel — had just installed a massive cable-network, and all of the data sat in an old IBM AS400 terminal-screen program. The operations team had to constantly check a bunch of different terminal-screens to view the performance of the network, and if issues were noted, send out maintenance crews.

One enterprising individual in the operations team decided to automate this process by learning just-enough Visual Basic 6 and MS Access to send commands to the terminal screens and scrape the data. The code was basically (and literally!) a giant for loop. And whilst the bot did work (mostly), Telstra didn’t want to trust its cable operations to a program written by a spare-time programmer. So my task was to document how it worked so we could write a “proper” cable management system.

As I documented the bot by running it, setting breakpoints, seeing it work & fail, I came to understand how brittle such programs are. A single character in the wrong place? Crash. Screen takes longer than expected to update? Crash. A new version of the terminal program is released? Massive crash. The lesson I learned: writing a program that depends on a user (human) interface is labour intensive and prone to failure.

A home for my data

Skip forward 22 years to November 2020, and I decided to try to track some of my personal data. Loyalty points. Bank balances. That kind of thing.

“But Brett, there’s an app for that!” Yes, thanks Steve 🙄. I have enough apps on my phone. But here’s the problem: there’s no single place I can go to see all the data I want to see. I want a dashboard, or something that I can customise to suit my needs. I don’t want to open 20 different apps each day just to see my data.

Now here’s an interesting fact: every app & website that displays my data uses APIs (Application Programming Interfaces) to get that data into the app/website.

If organisations would make some of their APIs publicly available, I would never need to write a bot. Ever.

Instead, most organisations would rather play a wasteful and endless game of cat & mouse to block all bots from their site — both the bad and the harmless — using Captchas and device fingerprinting.

I just want my own data

Despite numerous successes in retrieving my own data, there are still a few organisations that are very good at detecting my web-bot & causing it to fail. As I said, writing bots is labour intensive and prone to failure. But while I’m wasting time, organisations are wasting both time & money upgrading their bot-defences every few years. This works for a little while until the bot-writers come up with workarounds. Rinse & repeat.

Here’s the point: I just want access to my own data. I put the data into each organisation’s system. The data is about me. I have a legal right to it. Smart organisations would make it as easy as possible for customers to get their own data.

The API Solution

There is some good news — this is a solved problem! If an organisation has already built APIs for their app & website to use, it is relatively easy to provide public access to most of those APIs. Here’s what organisations should do:

  1. Add an API-key field to every user profile.
  2. When someone views their own user profile (in the app or on the website), display an option to generate an API key.
  3. Create basic API docs (how to log in, how to read your user profile) & link to the docs from the user profile screen.
  4. Create APIs that require the API key plus a user-token (e.g. a JWT) that can only be obtained by logging in (via an API). This prevents Bob from logging in and getting Alice’s data. Bob can only get Bob’s data.
  5. Create controls around API & API key usage & document these controls. For example, “You can make 20 API calls per hour”.
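
To make steps 1, 2 and 4 concrete, here is a minimal Python sketch of the "API key plus user-token" check. Everything in it is an assumption made for illustration: the in-memory `USERS` store, the function names, and the HMAC-signed token that stands in for a real JWT. A real service would use a proper user database, hashed passwords, and an established token library.

```python
import hashlib
import hmac
import secrets

SERVER_SECRET = b"demo-secret"  # hypothetical; a real service keeps this private

# Hypothetical in-memory user store. Each profile has an API-key field (step 1).
USERS = {
    "alice": {"password": "alice-pw", "api_key": None, "points": 1200},
    "bob": {"password": "bob-pw", "api_key": None, "points": 75},
}


def generate_api_key(user_id: str) -> str:
    """Step 2: the user generates a key from their own profile screen."""
    key = secrets.token_hex(16)
    USERS[user_id]["api_key"] = key
    return key


def _sign(user_id: str) -> str:
    """Sign the user id so the server can later verify who the token is for."""
    return hmac.new(SERVER_SECRET, user_id.encode(), hashlib.sha256).hexdigest()


def login(user_id: str, password: str) -> str:
    """Step 4: logging in returns a signed user token (a stand-in for a JWT)."""
    user = USERS.get(user_id)
    if user is None or not hmac.compare_digest(user["password"], password):
        raise PermissionError("invalid credentials")
    return f"{user_id}.{_sign(user_id)}"


def get_points(api_key: str, token: str) -> int:
    """Return the caller's own points. The key and the token must match one user."""
    user_id, _, signature = token.partition(".")
    if user_id not in USERS or not hmac.compare_digest(signature, _sign(user_id)):
        raise PermissionError("invalid token")
    stored_key = USERS[user_id]["api_key"]
    if stored_key is None or not hmac.compare_digest(stored_key, api_key):
        raise PermissionError("API key does not belong to this user")
    return USERS[user_id]["points"]
```

In this sketch the token decides whose data is returned, and the API key must belong to that same user. So Bob, holding only his own key and his own token, can read Bob's points and nothing else.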

An organisation that makes (some of) their APIs public will see the following benefits:

  • Bot traffic to the website from people like me will disappear!
  • API traffic can be controlled more easily than website traffic. For example: ignore anything without an API key, implement throttles & usage limits, and revoke API keys when needed. Many of these features will be readily available through the API gateway tool already being used to provide the APIs.
  • It becomes easier for developers to integrate an organisation’s services into other products, which creates financial opportunities for everyone (either through licensing, increased sales, or more traffic).
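
The usage controls mentioned above ("20 API calls per hour", ignoring requests without a key, revoking keys) are usually a gateway feature, but the idea is simple enough to sketch. The class below is a hypothetical sliding-window limiter written for illustration only. The name `ApiKeyLimiter` and its parameters are assumptions, not any particular gateway's API.

```python
import time
from collections import defaultdict, deque


class ApiKeyLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds per key."""

    def __init__(self, limit=20, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock                  # injectable so the limiter can be tested
        self.calls = defaultdict(deque)     # api_key -> timestamps of recent calls
        self.revoked = set()

    def revoke(self, api_key):
        """Permanently block a key, e.g. one that has been abused."""
        self.revoked.add(api_key)
        self.calls.pop(api_key, None)

    def allow(self, api_key):
        """Return True if this call may proceed, and record it if so."""
        if not api_key or api_key in self.revoked:
            return False                    # no key or revoked key: ignore the request
        now = self.clock()
        recent = self.calls[api_key]
        while recent and now - recent[0] >= self.window:
            recent.popleft()                # forget calls older than the window
        if len(recent) >= self.limit:
            return False                    # over the documented usage limit
        recent.append(now)
        return True
```

With the defaults, `ApiKeyLimiter().allow(key)` returns `True` for the first 20 calls made with a key inside any one-hour window and `False` after that, until older calls age out of the window.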
