
Image super-resolution locally in the browser

This blog post showcases the main components needed to build a simple image super-resolution app that runs machine learning models entirely in the browser, without transferring any data to a server.

All relevant resources can be found here:

The main concepts

The application is built using Vue.js and styled with TailwindCSS. Machine learning models and inference are handled by the Hugging Face Transformers.js library. For deployment, we've used GitHub Pages. The following diagram outlines how the main components interact with each other.

Vue.js

Vue.js is one of the most popular JavaScript frameworks for building web applications. We use it here primarily for two reasons:

  1. It makes it easy to build reactive web applications. This means that the UI is automatically updated when the underlying data/state changes.

  2. It makes it easy to build reusable components that can be used in different parts of the application, or even in different applications.
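
As a hypothetical illustration of the first point, a minimal Vue single-file component might look like the sketch below. The component and variable names are my own, not taken from the app:

```vue
<script setup>
import { ref } from "vue";

// Reactive state: the template below re-renders automatically when this changes.
const status = ref("idle");

function enhance() {
  status.value = "running";
}
</script>

<template>
  <button @click="enhance">Get super-resolution</button>
  <p>Status: {{ status }}</p>
</template>
```

Clicking the button mutates `status`, and Vue's reactivity system updates the rendered paragraph without any manual DOM manipulation.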

Hugging Face Transformers.js

Hugging Face Transformers.js is a JavaScript library enabling the smooth integration of machine learning models into web applications. The Hugging Face team has ported many of the uploaded Python models using the ONNX Runtime (we will dive into how to do this in a later blog post). They also made it super easy to run these machine learning models. For instance, the primary code for the image super-resolution model is as follows:

```javascript
import { pipeline } from "@xenova/transformers";

// pipeline() returns a Promise, so it must be awaited
let pipe = await pipeline("image-to-image", "Xenova/swin2SR-lightweight-x2-64");
let result = await pipe(imageUrl);
```

TailwindCSS

TailwindCSS is a utility-first CSS framework. It provides a set of small utility classes that can be used to style the application. For me, it has two major advantages. First, it makes it easy to build responsive UIs. Second, unlike traditional approaches where styling logic is scattered across multiple files or concentrated at the end of one file, TailwindCSS lets the styling live directly in the markup, which makes it easier to see and modify.
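
For example, styling is expressed inline through utility classes. This is a generic sketch, not markup from the app:

```html
<!-- Padding, rounded corners, and a larger font on medium screens and up -->
<button class="px-4 py-2 rounded bg-blue-600 text-white md:text-lg">
  Get super-resolution
</button>
```

The `md:` prefix is how Tailwind handles responsive design: the class only applies at the medium breakpoint and above.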

GitHub Pages

GitHub Pages is a free static site hosting service. We use GitHub Actions to automatically build the application and deploy it to GitHub Pages. This makes it super easy to deploy the application and also to update it. All that's required is to push changes to the repository, and the application updates automatically.
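
A minimal workflow for this could look roughly like the following. The file name, branch, build command, and output directory are assumptions about a typical Vue project, not the app's actual configuration:

```yaml
# .github/workflows/deploy.yml -- sketch of a typical Pages deployment
name: Deploy to GitHub Pages
on:
  push:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pages: write
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci && npm run build
      - uses: actions/upload-pages-artifact@v3
        with:
          path: dist
      - uses: actions/deploy-pages@v4
```

Every push to `main` then rebuilds the site and publishes the `dist` folder to Pages.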

The code

The main parts of the code are the following:

HomeView

The HomeView serves as the primary interface of the application. It contains the main UI elements and the mechanisms to operate the image super-resolution model.

When the user clicks the Get super-resolution button, the run function is invoked. It first captures the user-selected area, then processes that area with the image super-resolution model in a background worker thread. Upon completion, the enhanced result is displayed in the UI.

Background worker

This brings us to the next important part of the code: the background worker. Running the model in a background worker thread ensures that the UI is not blocked while the model is running. The Hugging Face team has put together an excellent tutorial and explainer on how to use background workers in a React application. I highly recommend reading it.
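
In outline, the pattern looks like the sketch below, reusing the Transformers.js pipeline from earlier. File and message names here are illustrative, not the app's actual code:

```javascript
// worker.js -- runs off the main thread, so the UI stays responsive
import { pipeline } from "@xenova/transformers";

let pipe = null;

self.onmessage = async (event) => {
  // Load the model lazily on the first request, then reuse it
  pipe ??= await pipeline("image-to-image", "Xenova/swin2SR-lightweight-x2-64");
  const result = await pipe(event.data.imageUrl);
  self.postMessage({ result });
};
```

```javascript
// main thread -- send the selected area to the worker and await the result
const worker = new Worker(new URL("./worker.js", import.meta.url), { type: "module" });
worker.postMessage({ imageUrl });
worker.onmessage = (event) => {
  // display event.data.result in the UI
};
```

Because the model runs entirely inside the worker, the main thread only exchanges small messages and stays free to handle user input.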

Image Processing Composable

The image processing composable is a small utility function that is used to draw the selected area on the image and handle all the state changes.
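
The core of that selection logic can be reduced to a small pure function. The sketch below is my own simplification; the real composable's names and shape may differ. Given the start and end points of a drag gesture, it produces a crop rectangle regardless of drag direction, clamped to the image bounds:

```javascript
// Normalize a drag gesture (start/end points in image coordinates) into a
// crop rectangle, regardless of drag direction, clamped to the image bounds.
function selectionToRect(start, end, imageWidth, imageHeight) {
  const clampX = (x) => Math.min(Math.max(x, 0), imageWidth);
  const clampY = (y) => Math.min(Math.max(y, 0), imageHeight);

  // Sort the coordinates so the rectangle is valid even when the user
  // drags from bottom-right to top-left.
  const x1 = clampX(Math.min(start.x, end.x));
  const y1 = clampY(Math.min(start.y, end.y));
  const x2 = clampX(Math.max(start.x, end.x));
  const y2 = clampY(Math.max(start.y, end.y));

  return { x: x1, y: y1, width: x2 - x1, height: y2 - y1 };
}
```

Keeping this logic pure makes it easy to test in isolation, while the composable itself handles the reactive state and canvas drawing around it.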

Conclusion

As we've seen, developing machine learning applications that operate solely in the browser is becoming more accessible. This approach offers practical benefits like reduced server costs and keeping data local. It's likely we'll see more applications moving in this direction. In my next post, I'll discuss how to port a Hugging Face model to the browser using ONNX Runtime.

Created by M. Dennis Turp