Ghost in the Machine: Designing Interfaces for Machine Learning Features
Like so many other designers, I am currently following the development of new Machine Learning (ML) features in computer systems. For obvious reasons, I am really curious about the implications of ML for user interface design.
There is a lot going on, but a substantial part of the debate takes place on a fairly abstract level. Currently, very few real examples of user interfaces for ML systems or ML features exist. Machine Learning is usually kept in the background and is used for predictive analytics, recommendation engines or natural language processing.
There are few examples where users consciously interact with an ML system. As commercial products, voice assistants like Alexa, Siri or Google Assistant come to mind. A number of experimental projects like a GPT-3 Figma plugin are also out there. However, it is safe to say that consumer-facing interfaces for ML systems are rare.
The scarcity of user interfaces for Machine Learning features is obviously related to the main job of ML: automation. The whole point of ML is to automate tasks; we are delegating the decision-making to the algorithm.
The user interface (UI) is the space where the computer becomes visible, tangible and recognisable. It is a way to perceive and control programs, parameters and processes. Using input devices, we can interact with the computer and manage its performance.
An intriguing research question in user interface design is this: do we need different user interfaces for interacting with ML systems? Or can we simply continue to use existing UI languages? Computers are fundamentally tools for automation. If ML is just a different way of doing this, why should its interface differ from that of non-ML automation?
It is impossible to discuss this question on an abstract level — so let me give you a concrete example.
In 2019, the image editor Pixelmator Pro introduced a very interesting feature: ‘ML Super Resolution’. This feature is a nifty new way to increase the resolution of pixel-based images. It allows users to scale up images while keeping lines and shapes crisp and detailed. This level of image quality is achieved by using Machine Learning. The developers of Pixelmator wrote a detailed and insightful blog post about the new feature.
What does the user interface look like? Well, it is almost non-existent:
If you select ‘ML Super Resolution’ from the ‘Image’ menu, the image just scales up. This is a good example of an ML interface: the user delegates the task entirely to the machine. No questions asked, the image is scaled up. It just works.
Another way to invoke the feature is via the ‘Image Size’ dialogue box. Here, the user is provided with a couple of settings regarding the image size. ‘ML Super Resolution’, however, is just presented as one of the four available scaling algorithms. There are no ML-specific options or parameters.
Both interfaces are very simple and suggest that ML scaling works just like any other feature in the application. Only the highlighted ‘ML’ in the image size dialogue indicates that the feature is slightly different from the others. And why should it look different? It is just a user interface for scaling up images. ‘ML Super Resolution’ is just a scaling algorithm like the other three.
But is it?
ML algorithms work fundamentally differently from traditional programs. Conventional image scaling algorithms like Bilinear or Lanczos are content-agnostic. They just interpolate the pixels based on the input values. They are deterministic and will always produce the same result for the same input.
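To make ‘content-agnostic’ and ‘deterministic’ a little more tangible, here is a minimal sketch of bilinear upscaling in plain Python. It is only an illustration under simplifying assumptions (a greyscale image stored as a list of rows, an integer scale factor, a hypothetical function name `bilinear_upscale`) and not the implementation used by Pixelmator Pro or any other editor.

```python
def bilinear_upscale(pixels, factor):
    """Upscale a 2D grid of grey values by an integer factor.

    Each output pixel is a weighted average of its four nearest source
    pixels. The function never looks at what the image depicts.
    """
    src_h, src_w = len(pixels), len(pixels[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    out = []
    for y in range(dst_h):
        # Map the output row back to a (fractional) source row.
        sy = y * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            # Map the output column back to a (fractional) source column.
            sx = x * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend horizontally on the two rows, then vertically.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


grid = [[0, 100], [100, 200]]
first = bilinear_upscale(grid, 2)
second = bilinear_upscale(grid, 2)
print(first == second)  # the same input always yields the same output
```

The result depends only on the input values and the scale factor. There is no training data and no notion of what the pixels represent.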
ML image scaling algorithms, on the other hand, are learned from examples rather than written by hand. In effect, they try to identify elements of an image. Is an element a line or an edge? Is it an elephant or a car? Is it a number or a flower? ML algorithms are statistical, and generative ones that sample from random noise are stochastic and as such non-deterministic. They depend on training data, and a differently trained model will usually produce different results.
This has tremendous implications for the user interface. Just look at the following example:
To be clear: this is not the result of Pixelmator Pro’s ‘ML Super Resolution’. The image was created by the experimental tool ‘Face Depixelizer’. On June 20, 2020, the Twitter user @Chicken3gg published the above image as a reaction to the announcement of this tool. Face Depixelizer creates a high-resolution portrait based on a low-resolution, pixelated input image. And while the overall technology is impressive, the output is not very convincing. The pixelated image on the left is clearly Barack Obama; the generated portrait on the right is, well, clearly not Barack Obama.
I am in no way qualified to discuss the technical details of this process. But the fact that the first African-American president of the United States is converted into a white dude points to fundamental problems of ML image processing. Every output of a Machine Learning algorithm is an extrapolation of the underlying training data. And as the above image demonstrates, every bias in the data is at some point visible in the output. This fact makes each image that was scaled up using ML algorithms unpredictable and untrustworthy.
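The dependence on training data can be shown with a deliberately tiny toy. The sketch below is an assumption-laden illustration, not how any real super-resolution model works: the ‘model’ simply memorises pairs of a low-resolution value and a high-resolution patch, and the names `train` and `upscale` are made up for this example.

```python
def train(pairs):
    """'Train' a toy model by memorising (low-res value, high-res patch) pairs."""
    return list(pairs)


def upscale(model, low_res):
    """Replace every low-res value with the patch of the closest training example."""
    out = []
    for value in low_res:
        _, patch = min(model, key=lambda pair: abs(pair[0] - value))
        out.extend(patch)
    return out


# Two models that saw different examples for the same low-res value.
model_a = train([(0, [0, 0]), (100, [90, 110])])
model_b = train([(0, [0, 0]), (100, [50, 150])])

signal = [0, 100, 100]
print(upscale(model_a, signal))  # [0, 0, 90, 110, 90, 110]
print(upscale(model_b, signal))  # [0, 0, 50, 150, 50, 150]
```

The input is identical in both calls. The only thing that changes the output is what the model was shown beforehand, which is exactly the information a conventional scaling dialogue never has to surface.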
To be clear: I am very much aware of the fact that Pixelmator Pro and Face Depixelizer use very different ML algorithms. Pixelmator utilises a convolutional neural network, and the developers explain very well how this works in their blog post. Face Depixelizer, on the other hand, uses a generative model (StyleGAN) for creating high-resolution images that match the input image when downscaled. So strictly speaking, Face Depixelizer is not an image scaling algorithm.
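The idea of ‘images that match the input when downscaled’ can be sketched in a few lines. This is a toy under stated assumptions: one-dimensional lists of grey values instead of images, a fixed list of candidates standing in for the generative model, and hypothetical function names `downscale` and `depixelize`. It says nothing about how StyleGAN or Face Depixelizer are actually implemented.

```python
def downscale(high_res, factor=2):
    """Average neighbouring values, like pixelating an image."""
    return [
        sum(high_res[i:i + factor]) / factor
        for i in range(0, len(high_res), factor)
    ]


def depixelize(low_res, candidates):
    """Pick the candidate whose downscaled version is closest to the input."""
    def error(candidate):
        return sum((a - b) ** 2 for a, b in zip(downscale(candidate), low_res))
    return min(candidates, key=error)


# Two different 'high-resolution' signals ...
candidate_a = [10, 30, 40, 60]
candidate_b = [20, 20, 50, 50]

# ... that become indistinguishable once they are pixelated.
print(downscale(candidate_a) == downscale(candidate_b))  # True

# Both are perfect matches, so the result depends on what is on offer.
print(depixelize([20, 50], [candidate_a, candidate_b]))
print(depixelize([20, 50], [candidate_b, candidate_a]))
```

Downscaling throws information away, so many different high-resolution images are equally valid answers for one pixelated input. Which one the tool returns is decided by the candidates the generator can produce, not by the input alone.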
However, it is quite obvious that more ML features will appear in image editors or other design tools. It is not difficult to imagine that Pixelmator will have an ‘ML Super Resolution’ that is optimised for faces. And as soon as this feature is technically available as a commercial product, the interface designers will face a number of difficult ethical questions.
I don’t want to overly criticise Pixelmator Pro’s user interface. For now, it’s fine. But in the future, UI designers will have to make a decision. They can either pretend that ML features work just like regular image scalers and provide the users with almost no options or explanations. This would be pragmatic but also misleading and maybe even dangerous. Or the designers can be honest and develop a completely new user interface language that addresses the specific issues of Machine Learning in general and ML image scaling in particular.
What could such an interface look like? Well, I don’t know. It will definitely need to inform the users about the underlying process and the stochastic nature of the output image. The interface has to clarify that the output is a prediction and as such undetermined, speculative and not trustworthy. Maybe it will even make the training data and algorithm transparent. There will be many implications and novel challenges. Perhaps we need to completely re-think our established user interfaces and come up with new and visually creative solutions! The only way to find out is to design the actual user interface.
In any case, I strongly believe that user interfaces for interacting with Machine Learning algorithms should reveal the implications and specific mechanisms of those algorithms. Interacting with ML algorithms has unique challenges and the interfaces should reflect this. An ‘OK Button’ is not enough.