aabidk.dev – Blog

Part of aabidk.dev

Recent content in Blog on aabidk.dev

Dockerizing a Laravel and Inertia App
Introduction

This post assumes you are familiar with Laravel and Inertia and have a basic understanding of Docker. To follow along, you can create a new blank Laravel project with React. Along the way, we will address some complexities around file permissions, health checks, and volumes.

Note

The Reddit discussion around this post is available here. I’ve updated the post to reflect some of the suggestions made there; see this comment thread for more details.

Prerequisites
  • Docker installed on your machine. You can download it from Docker’s official site.
  • Laravel installed on your machine. You can follow the official installation guide here.
  • Basic knowledge of Laravel and Inertia.js.
  • A Laravel Inertia application. If you already have Laravel installed, you can create a new project using the following command (skip the npm install and npm run build steps when prompted):
laravel new demo-project --using=laravel/react-starter-kit:^1.0.1

The version is pinned here only for stability; you can use the latest version as well. After creating the project, we will not install anything else locally. Let’s dockerize this application step by step.

Create a Dockerfile

First, we will quickly get the application up and running, and then we will improve it with multi-stage builds. We will be using serversideup/php images as our base. These images are optimized for running PHP applications, come with good defaults and easy configuration, and include some automation scripts for Laravel. In the root of the Laravel project, create a file named Dockerfile with the following content:

Dockerfile
# Use a base image with PHP, Nginx, and Alpine Linux.
# Use fixed version tags for stability.
FROM serversideup/php:8.4.11-fpm-nginx-alpine3.21-v3.6.0 AS development

# Switch to root to install dependencies
USER root

# Install any needed PHP extensions
# RUN install-php-extensions intl

# Defaults
ARG USER_ID=1000
ARG GROUP_ID=1000

RUN docker-php-serversideup-set-id www-data $USER_ID:$GROUP_ID \
 && docker-php-serversideup-set-file-permissions --owner $USER_ID:$GROUP_ID --service nginx


# Install Node (for building assets)
# This is quick and dirty, we'll fix it in the multi-stage version
RUN apk add --no-cache nodejs npm
USER www-data

# Copy composer files first
COPY --chown=www-data:www-data composer.json composer.lock /var/www/html/
RUN composer install --optimize-autoloader --no-interaction

# Copy package files for frontend
COPY --chown=www-data:www-data package.json package-lock.json /var/www/html/
RUN npm ci

# Copy rest of the app
COPY --chown=www-data:www-data . /var/www/html/

# Build frontend assets
USER root
RUN npm run build
USER www-data

# Run the application
EXPOSE 8080
CMD ["php", "artisan", "serve", "--host=0.0.0.0", "--port=8080"]

Let’s break down what each part does:

  1. FROM serversideup/php:8.4.11-fpm-nginx-alpine3.21-v3.6.0 AS development: This line specifies the base image we are using, which includes PHP 8.4 with FPM and Nginx on Alpine Linux v3.21, and names the stage development. The v3.6.0 suffix is the version of the serversideup image itself. We are using fixed versions as much as possible for stability. Using the Nginx variant of the image allows us to serve the static files, while PHP-FPM handles the PHP processing.

  2. USER root: Switches to the root user to install necessary dependencies.

  3. RUN docker-php-serversideup-set-id www-data $USER_ID:$GROUP_ID: This command sets the user and group IDs for the www-data user inside the container to match those of your host system. This helps avoid permission issues when mounting volumes during development. The script is included in the serversideup/php image itself. More information can be found here

  4. RUN docker-php-serversideup-set-file-permissions --owner $USER_ID:$GROUP_ID --service nginx: This command sets the correct file permissions for the Nginx service to ensure it can read and serve the application files.

  5. RUN apk add --no-cache nodejs npm: Installs Node.js and npm, which are required for building the frontend assets.

  6. COPY --chown=www-data:www-data composer.json composer.lock /var/www/html/: Copies the Composer files to the container and sets the ownership to www-data. This benefits from Docker’s layer caching: if the composer.json or composer.lock files have not changed, Docker can use the cached layer rather than reinstalling dependencies. Note that we are doing composer install in the image itself for now.

  7. RUN composer install --optimize-autoloader --no-interaction: Installs PHP dependencies using Composer.

  8. COPY --chown=www-data:www-data package.json package-lock.json /var/www/html: Copies the package files for the frontend and sets the ownership. Again, this is done before copying the rest of the application to benefit from caching.

  9. RUN npm ci: Installs Node.js dependencies. Using npm ci ensures a clean and consistent install based on the package-lock.json, and is preferred in CI/CD environments over npm install.

  10. COPY --chown=www-data:www-data . /var/www/html/: Copies the rest of the application files to the container.

  11. RUN npm run build: Builds the frontend assets.

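  12. EXPOSE 8080: Documents that the application listens on port 8080. EXPOSE does not publish the port by itself; that happens through the -p flag when running the container.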

  13. CMD ["php", "artisan", "serve", "--host=0.0.0.0", "--port=8080"]: Starts the Laravel development server.

You can build and run the Docker container by using the following commands:

docker build -t demo-project .
docker run -p 8080:8080 demo-project:latest

The application should now be accessible at http://localhost:8080.

A few things to note here:

  1. By default, the serversideup/php images use port 8080 for HTTP and 8443 for HTTPS. Our Dockerfile only exposes 8080 for now.
  2. If you have not changed any environment variables, the application will use SQLite as the database by default.
  3. Right now, the image is large (around 600 MB in my case). You can use dive to analyze the image layers. If you inspect the image at this point, you will notice that it includes many unnecessary files, such as node_modules, the vendor directory, tests, Git files and so on.

Add a .dockerignore File

To prevent unnecessary files from being copied into the Docker image, create a .dockerignore file in the root of your project with the following content:

.dockerignore
vendor
tests
.git
.github
storage/logs
storage/framework/cache/*
!storage/framework/sessions/*
storage/framework/views/*
node_modules
.env
.env.*
.gitignore
.gitattributes
.DS_Store
.idea
.vscode
.editorconfig
Dockerfile
compose.*.yml
npm-debug.log
yarn-error.log
_ide_helper.php
_ide_helper_models.php
stubs
eslint.config.js
phpunit.xml
.phpactor.yml
README.md

We try to exclude all the files and folders that are not needed in the Docker image:

  • vendor and node_modules directories as they will be installed inside the container.
  • .git directory to avoid copying version control data.
  • storage/logs and other cache directories to avoid copying log files and cached data.
  • Environment files (.env, .env.*) to avoid exposing sensitive information.
  • IDE and editor specific files and folders, log files, configuration files for linters, formatters, testing, etc.
  • Docker-related files to avoid copying Dockerfiles and compose files. These are only needed on the host machine.

This will reduce our Docker image size, decrease build times, and improve security by keeping sensitive files out of the image. You need to rebuild the image after adding this file.

Local Development with Docker Compose

We will use Docker Compose to set up our dev environment. We will be using PostgreSQL as our database and Mailpit as the local mail server. Create a file named compose.dev.yml in the root of the project (the build context, env file and bind mounts below are all relative to it) with the following content:

services:
  db:
    image: postgres:17.6-alpine3.21
    restart: "no"
    ports:
      - "5432:5432"
    env_file:
      - .env
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $POSTGRES_USER"]
      interval: 5s
      timeout: 5s
      retries: 5
    volumes:
      - demo-project-data:/var/lib/postgresql/data

  mailpit:
    image: axllent/mailpit:v1.27.7
    restart: always
    ports:
      - "${FORWARD_MAILPIT_PORT:-1025}:1025"
      - "${FORWARD_MAILPIT_DASHBOARD_PORT:-8025}:8025"

  app:
    build:
      context: .
      target: development
      args:
        USER_ID: ${USER_ID:-1000}
        GROUP_ID: ${GROUP_ID:-1000}
    restart: "no"
    depends_on:
      db:
        condition: service_healthy
    env_file:
      - .env
    healthcheck:
      # /up is Laravel's built-in health check route
      test: ["CMD-SHELL", "curl -f http://localhost:8080/up || exit 1"]
      interval: 5s
      timeout: 5s
      start_period: 5s
      retries: 5
    ports:
      - "80:8080"
      - "443:8443"
    volumes:
      - .:/var/www/html

  queue:
    build:
      context: .
      target: development
      args:
        USER_ID: ${USER_ID:-1000}
        GROUP_ID: ${GROUP_ID:-1000}
    restart: "no"
    depends_on:
      db:
        condition: service_healthy
    env_file:
      - .env
    environment:
      # Prevents migrations from running again; they will already have run in the app service
      - AUTORUN_ENABLED=false
    command: ["php", "/var/www/html/artisan", "queue:work", "--tries=3"]
    stop_signal: SIGTERM
    healthcheck:
      test: ["CMD", "healthcheck-queue"]
      start_period: 5s
    volumes:
      - .:/var/www/html

  node:
    image: node:22.19.0-alpine3.21
    working_dir: /app
    env_file:
      - .env
    volumes:
      - .:/app
    command: sh -c "npm ci --legacy-peer-deps && npm run dev"
    ports:
      - "5173:5173"

volumes:
  demo-project-data:

Let’s break down the services defined in this file:

  1. db service: We are using the official Alpine-based PostgreSQL image with a fixed version tag. We expose port 5432 for database connections and set up a health check to ensure the database is ready before other services depend on it. The database data is stored in a Docker volume named demo-project-data to persist data across container restarts.

  2. mailpit service: This service uses the Mailpit image to provide a local SMTP server for testing email functionality. It exposes ports 1025 for SMTP and 8025 for the web dashboard.

  3. app service: This service builds the Laravel application using the Dockerfile we created earlier. It depends on the db service and waits for it to be healthy before starting. It exposes ports 80 and 443 for HTTP and HTTPS access respectively. We are using a bind mount to mount the current directory into the container for hot reloading during development. A health check is also defined to ensure the application is running.

  4. queue service: This service is responsible for running Laravel’s queue worker. It builds from the same Dockerfile and also depends on the db service. The command specified runs the queue worker with a maximum of 3 tries for each job. We disable automatic migrations here since they will already have run in the app service. To gracefully stop the queue worker, we set the stop signal to SIGTERM. For our health check, we are using the healthcheck-queue command provided by the serversideup/php image. More information can be found here.

  5. node service: This service uses the official Node.js Alpine image to handle frontend asset compilation. It mounts the current directory into the container and runs npm ci followed by npm run dev to start the development server. Port 5173 is exposed for Vite’s development server.
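
To make the health check settings above more concrete, here is a rough TypeScript model of how a run of probe results turns into a container health status. It is a simplified sketch, not Docker's implementation: it assumes a container starts in the "starting" state, becomes "healthy" on a successful probe, and becomes "unhealthy" after `retries` consecutive failures, and it ignores `start_period`, `interval` and `timeout`. The function name `healthStatus` is hypothetical.

```typescript
type HealthStatus = "starting" | "healthy" | "unhealthy";

// Simplified model: one boolean per probe, in the order the probes ran.
function healthStatus(probeResults: boolean[], retries: number): HealthStatus {
  let status: HealthStatus = "starting";
  let consecutiveFailures = 0;

  for (const ok of probeResults) {
    if (ok) {
      // A single successful probe resets the failure streak.
      consecutiveFailures = 0;
      status = "healthy";
    } else {
      consecutiveFailures += 1;
      if (consecutiveFailures >= retries) {
        status = "unhealthy";
      }
    }
  }
  return status;
}

// With retries: 5, a database that answers on the third probe counts as healthy.
console.log(healthStatus([false, false, true], 5));
// Five failed probes in a row mark the container as unhealthy.
console.log(healthStatus([false, false, false, false, false], 5));
```

In this model, a service that uses depends_on with condition: service_healthy would only start once the status reaches "healthy".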

We are passing the .env file to all services to ensure they have the necessary environment variables. We need to add the following variables to our .env file to configure the database and migration settings:

.env
# These are for Laravel
APP_URL=http://localhost
APP_DEBUG=true
DB_CONNECTION=pgsql
DB_HOST=db
DB_PORT=5432
DB_DATABASE=laravel
DB_USERNAME=postgres
DB_PASSWORD=postgres

#Migration settings (These are for serversideup/php image automation scripts)
AUTORUN_ENABLED=true
AUTORUN_LARAVEL_MIGRATION_ISOLATION=false

#These are for db container, it expects database related variables with POSTGRES_ prefix
POSTGRES_DB=${DB_DATABASE}
POSTGRES_USER=${DB_USERNAME}
POSTGRES_PASSWORD=${DB_PASSWORD}
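
The POSTGRES_ entries above reuse the Laravel values through ${...} references. As a rough mental model only, the substitution can be sketched in TypeScript as below. This is not how Compose or Laravel's dotenv loader is implemented; it assumes plain ${NAME} references to variables defined earlier in the same file, and `resolveEnv` is a hypothetical helper.

```typescript
// Resolve ${NAME} references against variables defined earlier in the same file.
function resolveEnv(lines: string[]): Record<string, string> {
  const env: Record<string, string> = {};

  for (const line of lines) {
    const trimmed = line.trim();
    // Skip blank lines and comments.
    if (trimmed === "" || trimmed.startsWith("#")) continue;

    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;

    const key = trimmed.slice(0, eq);
    const raw = trimmed.slice(eq + 1);
    // Unknown names resolve to an empty string in this sketch.
    env[key] = raw.replace(/\$\{(\w+)\}/g, (_match, name: string) => env[name] ?? "");
  }
  return env;
}

const resolved = resolveEnv([
  "# These are for Laravel",
  "DB_DATABASE=laravel",
  "DB_USERNAME=postgres",
  "POSTGRES_DB=${DB_DATABASE}",
  "POSTGRES_USER=${DB_USERNAME}",
]);

console.log(resolved.POSTGRES_DB, resolved.POSTGRES_USER);
```

The point of the indirection is that the database name and credentials are written once and shared by both the Laravel and the PostgreSQL container.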

With the above compose file, we can simplify our Dockerfile a bit:

  1. Since we now have a separate Node service for handling frontend assets, we can remove the Node installation and asset building steps from our Dockerfile.
  2. We can also remove the port expose and CMD instructions as they will be handled by Docker Compose.
  3. Since we are using bind mounts for the application code, there is no need to copy the composer.json and composer.lock files and run composer install in the Dockerfile. The bind mount hides whatever the image installed into /var/www/html anyway. Instead, we will run composer install from the host machine when needed. See the linked Reddit discussion above for more details.

Our updated Dockerfile now looks like this:

Dockerfile
FROM serversideup/php:8.4.11-fpm-nginx-alpine3.21-v3.6.0 AS development

# Switch to root to install dependencies / fix permissions
USER root

ARG USER_ID=1000
ARG GROUP_ID=1000

RUN docker-php-serversideup-set-id www-data $USER_ID:$GROUP_ID \
 && docker-php-serversideup-set-file-permissions --owner $USER_ID:$GROUP_ID --service nginx

USER www-data

# Copy rest of the app
COPY --chown=www-data:www-data . /var/www/html/

Lastly, we need to update the dev command in our package.json and add the --host flag to ensure Vite listens on all interfaces, allowing access from outside the container:

package.json
"scripts": {
 "dev": "vite --host",
}

This should reduce our Docker image size significantly (around 110 MB in my case).

We can now start our development environment by using Docker Compose with the following command:

docker compose -f compose.dev.yml up --build

This command will build the images and start all the services defined in the compose.dev.yml file. The Laravel application should now be accessible at http://localhost, and the Mailpit dashboard at http://localhost:8025. Note that you do not need to use --build every time, only after changing the Dockerfile or anything else that affects the image build, such as the build arguments in the compose file. Hot reloading should now work seamlessly.

Building a Production-Ready Image with Multi-Stage Builds

There are some differences when building for production. The permission fixes and the UID/GID mapping we used during development are not needed in production. We will also pre-build the frontend assets during image build, so we do not need a separate node service in production.

Let’s update our Dockerfile to add separate stages for development and production:

Dockerfile
# ============================================================
# Base image - common for both dev and prod
# ============================================================
FROM serversideup/php:8.4.11-fpm-nginx-alpine3.21-v3.6.0 AS base

USER root
#Install any needed PHP extensions
RUN install-php-extensions intl
USER www-data

# ============================================================
# Node builder (for Inertia/Vite assets) - only for production
# ============================================================
FROM node:22.19.0-alpine3.21 AS node-builder

WORKDIR /app

# Copy package files first for caching
COPY package.json package-lock.json* pnpm-lock.yaml* yarn.lock* ./

# Install deps
RUN npm ci --legacy-peer-deps

# Copy the rest of the frontend
COPY resources/ resources/
COPY vite.config.js ./

# Build frontend
RUN npm run build

# ============================================================
# Development stage (fixes host UID/GID permissions)
# ============================================================

FROM base AS development

USER root

ARG USER_ID=1000
ARG GROUP_ID=1000

RUN docker-php-serversideup-set-id www-data $USER_ID:$GROUP_ID \
 && docker-php-serversideup-set-file-permissions --owner $USER_ID:$GROUP_ID --service nginx

USER www-data

# ============================================================
# Production stage
# ============================================================
FROM base AS production

ENV PHP_OPCACHE_ENABLE=1

USER www-data

COPY --chown=www-data:www-data composer.json composer.lock /var/www/html/

#Exclude dev dependencies for production with --no-dev
RUN composer install --no-dev --no-scripts --no-autoloader --no-interaction

COPY --chown=www-data:www-data . /var/www/html

# Copy only the built assets from node stage
COPY --from=node-builder /app/public/build /var/www/html/public/build

# Now run scripts and autoloader generation
RUN composer dump-autoload --optimize && \
 composer run-script post-autoload-dump

In this updated Dockerfile, we have made the following changes:

  1. Added a base stage that has common steps for both development and production stages.
  2. Added a node-builder stage that installs Node.js dependencies and builds the frontend assets. This stage is only used in the production build. Our development setup uses a separate Node service as defined in the compose.dev.yml file discussed earlier.
  3. The development stage stays the same: it still fixes permissions for local development and handles the UID/GID mapping, and, as before, it does not include a composer install step.
  4. The production stage installs only the necessary PHP dependencies, without dev dependencies. Unlike the development stage, the Composer dependencies are installed during the image build itself to produce a self-contained image, as we will not be using bind mounts in production. It then copies the pre-built frontend assets from the node-builder stage without including any build tools or source files from that stage.

You can build the development image the same way as before. To build the production image locally, use the following command:

docker build --target production -t demo-project:prod .

The production image will be slightly larger than the development image, since it includes the installed Composer dependencies. Please note that the production stage of the Dockerfile is meant for building the production image both locally and in deployment pipelines.

Testing The Production Image Locally With Docker Compose

Let’s create a new compose file for production named compose.prod.yml for building and running the production image locally:

compose.prod.yml
services:
  db:
    image: postgres:17.6-alpine3.21
    restart: unless-stopped
    env_file:
      - .env
      - .env.production
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $POSTGRES_USER"]
      interval: 5s
      timeout: 5s
      retries: 5
    volumes:
      - demo-project-prod-db:/var/lib/postgresql/data

  app:
    build:
      context: .
      target: production
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    env_file:
      - .env
      - .env.production
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:8080/up || exit 1"]
      interval: 5s
      timeout: 5s
      start_period: 5s
      retries: 5
    ports:
      - "80:8080"
      - "443:8443"
    volumes:
      - demo-project-prod-data:/var/www/html/storage

  queue:
    build:
      context: .
      target: production
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    env_file:
      - .env
      - .env.production
    environment:
      - AUTORUN_ENABLED=false
      - PHP_FPM_POOL_NAME=app_queue
    command:
      [
        "php",
        "/var/www/html/artisan",
        "queue:work",
        "--tries=3",
        "--timeout=90",
      ]
    stop_signal: SIGTERM
    healthcheck:
      test: ["CMD", "healthcheck-queue"]
      start_period: 5s
    volumes:
      # Same volume as app
      - demo-project-prod-data:/var/www/html/storage

volumes:
  demo-project-prod-db:
  demo-project-prod-data:

The key differences in this file compared to the development compose file are:

  1. We are using the production target from our Dockerfile to build the app and queue services.

  2. We have added an additional env file, .env.production, to separate production-specific environment variables. If a variable is defined in both files, the value in .env.production takes precedence because that file is listed later.

    For now, my .env.production only has the following variables:

    .env.production
    APP_ENV=production
    APP_DEBUG=false
    NODE_ENV=production

  3. We are using named volumes for persisting database data and storage data rather than bind mounts. Also, the volume is mounted on /var/www/html/storage only and not on the entire /var/www/html directory, since that is the only directory that needs to be writable by the application (for file uploads, cache, sessions, etc.).

  4. The restart policy is set to unless-stopped to ensure the containers restart automatically unless explicitly stopped.

In .env.production, you can add other environment variables to replicate your production environment, such as mail server settings, logging configurations, etc.
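
The precedence rule for the two env files can be pictured as a left-to-right merge in which later files overwrite earlier ones. The TypeScript sketch below only models that rule; `mergeEnvFiles` is a hypothetical helper and the values are taken from the snippets in this post.

```typescript
type Env = Record<string, string>;

// Later files win: each file overwrites keys already set by the files before it.
function mergeEnvFiles(...files: Env[]): Env {
  return files.reduce<Env>((merged, file) => ({ ...merged, ...file }), {});
}

const baseEnv: Env = { APP_DEBUG: "true", DB_HOST: "db" };
const productionEnv: Env = {
  APP_ENV: "production",
  APP_DEBUG: "false",
  NODE_ENV: "production",
};

// Same order as the env_file list: .env first, then .env.production.
const effectiveEnv = mergeEnvFiles(baseEnv, productionEnv);
console.log(effectiveEnv);
```

Here APP_DEBUG ends up as "false" because .env.production is listed last, while DB_HOST keeps the value from .env because the production file does not define it.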

You can run the compose file by using the following command:

docker compose -f compose.prod.yml up --build

Your application should now be accessible at http://localhost.

Again note that during development, for live reloads, you need to use the compose.dev.yml file, while for testing a production build locally, you should use compose.prod.yml.

Adapting The Production Compose File For Deployment

We can use the same Dockerfile for building the production image during deployment. However, we need to change a few things in the compose file:

  1. Generally, in production deployments, we do not build the image on the server itself. Instead, we build it in a CI/CD pipeline and push it to a container registry. The deployment server then pulls the pre-built image from the registry. If the server builds the images from the repository, a compose file similar to our compose.prod.yml can be used. But if we are pulling the pre-built image from a registry, we need to replace the build section with image in our compose.

  2. We will not use env_file in production deployments. Instead, we will set the environment variables directly in the deployment platform or server, and pass them to the containers. The .env and .env.production files should not be included in the image or the deployment compose file.

  3. You might not want to expose ports for the database and other services. Based on your requirements, you can remove the ones you do not need. If you are using a reverse proxy in front of your application, you might not need to expose ports from the app container either.

This part largely depends on your deployment platform and strategy. Below is a sample deployment compose file, compose.deploy.yml, assuming the above conditions:

compose.deploy.yml
services:
  db:
    image: postgres:17.6-alpine3.21
    restart: unless-stopped
    volumes:
      - demo-project-db-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=${DB_DATABASE}
      - POSTGRES_USER=${DB_USERNAME}
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    healthcheck:
      # $$ defers expansion to the container shell, where POSTGRES_USER is set;
      # without an env file, Compose itself may not know this variable
      test: ['CMD-SHELL', 'pg_isready -U $$POSTGRES_USER']
      interval: 5s
      timeout: 5s
      retries: 5

  app:
    image: ${REGISTRY_URL}/demo-project:latest
    pull_policy: always
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    environment:
      APP_NAME: '${APP_NAME}'
      APP_ENV: '${APP_ENV}'
      APP_KEY: '${APP_KEY}'
      APP_DEBUG: '${APP_DEBUG}'
      APP_URL: '${APP_URL}'
      ASSET_URL: '${ASSET_URL}'
      # Other environment variables as needed

    healthcheck:
      test: ['CMD-SHELL', 'curl -f http://localhost:8080/up || exit 1']
      interval: 5s
      timeout: 5s
      start_period: 5s
      retries: 5
    volumes:
      - demo-project-app-data:/var/www/html/storage

  queue:
    image: ${REGISTRY_URL}/demo-project:latest
    pull_policy: always
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    command: ['php', '/var/www/html/artisan', 'queue:work', '--tries=3', '--timeout=90']
    environment:
      APP_NAME: '${APP_NAME}'
      APP_ENV: '${APP_ENV}'
      APP_KEY: '${APP_KEY}'
      APP_DEBUG: '${APP_DEBUG}'
      APP_URL: '${APP_URL}'
      ASSET_URL: '${ASSET_URL}'
      PHP_FPM_POOL_NAME: app_queue
      # Other environment variables as needed
    healthcheck:
      test: ['CMD', 'healthcheck-queue']
      start_period: 5s
    volumes:
      - demo-project-app-data:/var/www/html/storage

volumes:
  demo-project-db-data:
  demo-project-app-data:

You can set the REGISTRY_URL environment variable in your server environment to point to your container registry.

Note

  1. The image above is using the latest tag, however it is recommended to use specific version tags or commit SHAs for better control and rollback capabilities.
  2. If you face any errors related to static assets in production, ensure that the ASSET_URL variable is set correctly (which should be the same as APP_URL unless you are using a CDN or separate domain for assets).

You can use GitHub Actions for building and pushing images to a registry. The following is a sample workflow you can use as a starting point:

.github/workflows/deploy.yml
name: Build and Deploy to Production

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Docker registry
        uses: docker/login-action@v3
        with:
          registry: ${{ secrets.REGISTRY_URL }}
          username: ${{ secrets.REGISTRY_USERNAME }}
          password: ${{ secrets.REGISTRY_PASSWORD }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          target: production
          tags: |
            ${{ secrets.REGISTRY_URL }}/demo-project:latest
            ${{ secrets.REGISTRY_URL }}/demo-project:${{ github.sha }}

  release:
    runs-on: ubuntu-latest
    needs: build
    permissions:
      contents: write

    steps:
      - name: Create GitHub Release
        uses: softprops/action-gh-release@v1
        with:
          tag_name: release-${{ github.run_number }}
          name: Release ${{ github.run_number }}
          body: |
            Commit: ${{ github.sha }}
            Message: ${{ github.event.head_commit.message }}
            Image: demo-project:${{ github.sha }}
          draft: false
          prerelease: false

  deploy:
    runs-on: ubuntu-latest
    needs: release
    steps:
      - name: Trigger Deployment
        run: |
          curl -X POST '${{ secrets.DEPLOY_WEBHOOK_URL }}'

You might need to adjust the workflow based on your requirements and configure the secrets in your repository settings. After building and pushing the image, the workflow triggers a deployment webhook.
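
The workflow above pushes every build under two tags, latest and the commit SHA. As a small illustration of why the SHA tag is useful for rollbacks, the TypeScript sketch below builds both references the same way. The helper `imageTags`, the registry host and the SHA value are all made up for the example.

```typescript
// Build the same two references that the workflow pushes for one commit.
function imageTags(registryUrl: string, repository: string, commitSha: string): string[] {
  // Drop trailing slashes so the joined reference has exactly one separator.
  const base = `${registryUrl.replace(/\/+$/, "")}/${repository}`;
  return [`${base}:latest`, `${base}:${commitSha}`];
}

// Hypothetical registry and commit SHA, for illustration only.
const [latestRef, pinnedRef] = imageTags("registry.example.com", "demo-project", "3f2a9c1");

console.log(latestRef); // moves with every push to main
console.log(pinnedRef); // always points at the image built from this commit
```

A deployment that references the pinned tag can be rolled back by switching to an earlier SHA, whereas latest only ever points at the most recent build.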

Further Improvements

You can further improve this setup by:

  1. Using a compose.base.yml file to define common services and configurations, and then extending it in the compose.dev.yml and compose.prod.yml files. This will help reduce duplication and make it easier to manage changes across different environments.

  2. Sending the commit SHA in the webhook payload and using it to pull specific image versions during deployment for better traceability.

  3. Setting up pre-production deployments using GitHub Actions based on branches or tags.

  4. Setting up caching in GitHub Actions (registry cache works with docker/build-push-action) to speed up builds.

    If you have any questions or suggestions, feel free to reach out via email.

https://aabidk.dev/blog/dockerizing-a-laravel-and-inertia-app/
Building Modern Cross Browser Web Extensions: Core Functionality, Storage and Permissions (Part 5)

In the previous post, we explored the background scripts and the messaging in extensions, as well as using @webext-core/messaging for a robust and type-safe messaging system. In this part, we will use the same concepts to build the core functionality of our extension.

Basic Commands

We want to add actions to the extension such as opening a new tab, a new window as well as state based actions such as muting/unmuting or pinning/unpinning a tab. Let’s follow the same pattern we explored in the previous post.

  1. First, we will define the types for the actions we want to add to the extension.
src/lib/types.ts
export type UserAction = {
  title: string;
  id: string;
  handler: () => void;
  visible: boolean; // To hide actions such as mute/unmute conditionally
};

  2. Next, we will set up messaging in src/lib/messaging.ts as follows:
src/lib/messaging.ts
import { defineExtensionMessaging } from "@webext-core/messaging";

interface ProtocolMap {
  userAction(data: { type: "newTab" | "newWindow" }): void;
}

export const { sendMessage, onMessage } = defineExtensionMessaging<ProtocolMap>();

  3. Create a new file src/lib/user.actions.ts to define the actions:
src/lib/user.actions.ts
import { sendMessage } from "@/lib/messaging";
import type { UserAction } from "@/lib/types";

export const userActions: UserAction[] = [
  {
    title: "New Tab",
    id: "newTab",
    handler: () => {
      sendMessage("userAction", { type: "newTab" });
    },
    visible: true,
  },
  {
    title: "New Window",
    id: "newWindow",
    handler: () => {
      sendMessage("userAction", { type: "newWindow" });
    },
    visible: true,
  },
];

Note that although this file is not explicitly marked as a content script, its actions will be utilized within the Command Palette, which itself is a part of our main content script.

  4. We will also add the listeners in the background script to handle these actions:
src/entrypoints/background.ts
import { onMessage } from "@/lib/messaging";

export default defineBackground(() => {
  onMessage("userAction", (userAction) => {
    switch (userAction.data.type) {
      case "newTab":
        browser.tabs.create({});
        break;
      case "newWindow":
        browser.windows.create({});
        break;
    }
  });
});

  5. Finally, we will use these actions in the Command Palette component:
src/components/CommandPalette.tsx
import * as React from "react";
import {
  Command,
  CommandDialog,
  CommandEmpty,
  CommandInput,
  CommandItem,
  CommandList,
} from "@/components/ui/command";

import { userActions } from "@/lib/user.actions";

export function CommandPalette() {
  const [open, setOpen] = React.useState(false);

  React.useEffect(() => {
    const down = (e: KeyboardEvent) => {
      if (e.key === "j" && (e.metaKey || e.ctrlKey)) {
        e.preventDefault();
        setOpen((open) => !open);
      }
    };

    document.addEventListener("keydown", down);
    return () => document.removeEventListener("keydown", down);
  }, []);

  return (
    <>
      <CommandDialog open={open} onOpenChange={setOpen}>
        <Command loop={true} className="max-h-96 min-h-96 rounded-lg shadow-md">
          <CommandInput placeholder="Type a command or search..." />
          <CommandList>
            <CommandEmpty>No results found.</CommandEmpty>
            {userActions.map((action) => (
              <CommandItem
                key={action.id}
                value={action.title}
                onSelect={() => {
                  action.handler();
                  setOpen(false);
                }}
              >
                <span className="flex-1 truncate px-3 text-start">
                  {action.title}
                </span>
              </CommandItem>
            ))}
          </CommandList>
        </Command>
      </CommandDialog>
    </>
  );
}

That’s all we need to do, and our actions should now be available in the Command Palette.
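
To see why a wrong action name is caught at compile time, it can help to look at a tiny in-memory stand-in for the messaging layer. The sketch below is not how @webext-core/messaging is implemented: it is synchronous, runs in a single context, and only mimics the idea of deriving the payload and return types of sendMessage and onMessage from a protocol map. The helper types `Payload` and `Result` are hypothetical names.

```typescript
// Each key is a message name; the signature gives its payload and return type.
interface ProtocolMap {
  userAction(data: { type: "newTab" | "newWindow" }): void;
}

type Payload<K extends keyof ProtocolMap> = Parameters<ProtocolMap[K]>[0];
type Result<K extends keyof ProtocolMap> = ReturnType<ProtocolMap[K]>;

// Handlers are stored untyped; the typed wrappers below are the only way in.
const handlers = new Map<string, (data: any) => unknown>();

function onMessage<K extends keyof ProtocolMap>(
  name: K,
  handler: (data: Payload<K>) => Result<K>,
): void {
  handlers.set(name, handler);
}

function sendMessage<K extends keyof ProtocolMap>(name: K, data: Payload<K>): Result<K> {
  const handler = handlers.get(name);
  if (!handler) throw new Error(`No handler registered for ${name}`);
  return handler(data) as Result<K>;
}

// Stand-in for the background script: record what would have been opened.
const opened: string[] = [];
onMessage("userAction", (data) => {
  opened.push(data.type);
});

sendMessage("userAction", { type: "newTab" });
sendMessage("userAction", { type: "newWindow" });
// sendMessage("userAction", { type: "closeTab" }); // would not compile: "closeTab" is not in the union

console.log(opened);
```

Because both functions are generic over the keys of ProtocolMap, adding a new message or action type in one place updates what every caller and listener is allowed to pass.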

State Dependent Commands

We can also add actions that are dependent on the state of the tab, such as muting/unmuting or pinning/unpinning. Let’s add these actions to the extension.

  1. Add the Tab type to src/lib/types.ts:
src/lib/types.ts
export type Tab = chrome.tabs.Tab;

  2. We will need the current tab to get the state as well as to perform actions on it. Let’s update our messaging setup for the same:
src/lib/messaging.ts
import { defineExtensionMessaging } from "@webext-core/messaging";
import type { Tab } from "@/lib/types";

interface ProtocolMap {
  userAction(data: {
    type: "newTab" | "newWindow" | "muteTab" | "unmuteTab";
    tab?: Tab; // We will need the tab info for updating its state
  }): void;
  getActiveTab(): Promise<Tab>;
}

export const { sendMessage, onMessage } =
  defineExtensionMessaging<ProtocolMap>();

We have added muteTab and unmuteTab actions, and an optional tab parameter to the userAction message. Whenever we want to use the browser APIs to update the state of a tab, we need to provide its unique id, which we can get from this tab parameter. We have also added a getActiveTab action to get the current active tab.

  3. Let’s update our background script to handle these actions:
src/entrypoints/background.ts
import { onMessage } from "@/lib/messaging";

export default defineBackground(() => {
  onMessage("userAction", (userAction) => {
    switch (userAction.data.type) {
      // ...listeners for newTab and newWindow actions remain the same

      case "muteTab":
        if (userAction.data.tab?.id) {
          browser.tabs.update(userAction.data.tab.id, { muted: true });
        }
        break;

      case "unmuteTab":
        if (userAction.data.tab?.id) {
          browser.tabs.update(userAction.data.tab.id, { muted: false });
        }
        break;
    }
  });

  onMessage("getActiveTab", async () => {
    const tabs = await browser.tabs.query({
      active: true,
      currentWindow: true,
    });
    return tabs[0];
  });
});

We have a separate listener for getActiveTab to get the current active tab, while the muteTab and unmuteTab actions are handled under the userAction listener, in accordance with our messaging setup. Although we have not annotated a return type on the getActiveTab listener, its return type will be Tab, which is again inferred from our messaging setup.

  4. In our content script, instead of directly returning userActions as an array, we will now use an async function, getUserActions, which will return actions based on the state of the tab:
src/lib/user.actions.ts
import { sendMessage } from "@/lib/messaging";
import type { UserAction } from "@/lib/types";

export const getUserActions = async (): Promise<UserAction[]> => {
 // Note that when we don't have any parameters, we have to pass undefined
 const activeTab = await sendMessage("getActiveTab", undefined);
 const isMuted: boolean = activeTab.mutedInfo?.muted || false;

 const userActions: UserAction[] = [
 // ...newTab and newWindow actions remain the same
 {
 title: "Mute Tab",
 id: "muteTab",
 handler: () => {
 sendMessage("userAction", { type: "muteTab", tab: activeTab });
 },
 visible: !isMuted,
 },
 {
 title: "Unmute Tab",
 id: "unmuteTab",
 handler: () => {
 sendMessage("userAction", { type: "unmuteTab", tab: activeTab });
 },
 visible: isMuted,
 },
 ];

 return userActions.filter((action) => action.visible);
};
  5. Since getUserActions is an async function, we need to update the Command Palette component to handle the promise:
src/components/CommandPalette.tsx

import * as React from "react";
import {
 Command,
 CommandDialog,
 CommandEmpty,
 CommandInput,
 CommandItem,
 CommandList,
} from "@/components/ui/command";

import { getUserActions } from "@/lib/user.actions";
import { UserAction } from "@/lib/types";

export function CommandPalette() {
 const [open, setOpen] = React.useState(false);
 const [userActions, setUserActions] = React.useState<UserAction[]>([]);

 React.useEffect(() => {
 const down = (e: KeyboardEvent) => {
 if (e.key === "j" && (e.metaKey || e.ctrlKey)) {
 e.preventDefault();
 setOpen((open) => !open);
 }
 };

 document.addEventListener("keydown", down);
 return () => document.removeEventListener("keydown", down);
 }, []);

 React.useEffect(() => {
 if (!open) return;
 getUserActions().then(setUserActions);
 }, [open]);

 return (
 <>
 <CommandDialog open={open} onOpenChange={setOpen}>
 <Command loop={true} className="max-h-96 min-h-96 rounded-lg shadow-md">
 <CommandInput placeholder="Type a command or search..." />
 <CommandList>
 <CommandEmpty>No results found.</CommandEmpty>
 {userActions.map((action) => (
 <CommandItem
 key={action.id}
 value={action.title}
 onSelect={() => {
 action.handler();
 setOpen(false);
 }}
 >
 <span className="flex-1 truncate px-3 text-start">
 {action.title}
 </span>
 </CommandItem>
 ))}
 </CommandList>
 </Command>
 </CommandDialog>
 </>
 );
}

Now, the Command Palette will show the Mute Tab and Unmute Tab actions based on the state of the active tab. We can also filter visible actions in useEffect instead of filtering them in the getUserActions function, if desired. We can similarly add other conditional actions such as Pin Tab and Unpin Tab.
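As a rough sketch of how Pin Tab and Unpin Tab could slot in next to the mute actions, the background switch can be reduced to a lookup from action type to the properties passed to browser.tabs.update. The names TabActionType, TabUpdateProps, updatePropsFor, and applyTabAction are hypothetical, and the update function is injected so the logic can be exercised without a browser; in the extension it would be browser.tabs.update.

```typescript
// Hypothetical names for illustration; not part of the extension's code so far.
type TabActionType = "muteTab" | "unmuteTab" | "pinTab" | "unpinTab";

// Subset of the update properties accepted by browser.tabs.update.
type TabUpdateProps = { muted?: boolean; pinned?: boolean };

// Signature compatible with browser.tabs.update(tabId, updateProperties).
type TabUpdater = (tabId: number, props: TabUpdateProps) => unknown;

// Map each state-dependent action to the tab properties it should set.
function updatePropsFor(type: TabActionType): TabUpdateProps {
  switch (type) {
    case "muteTab":
      return { muted: true };
    case "unmuteTab":
      return { muted: false };
    case "pinTab":
      return { pinned: true };
    case "unpinTab":
      return { pinned: false };
  }
}

// Apply the action only when a tab id is available; report whether it ran.
function applyTabAction(
  type: TabActionType,
  tabId: number | undefined,
  update: TabUpdater,
): boolean {
  if (tabId === undefined) return false;
  update(tabId, updatePropsFor(type));
  return true;
}
```

In the background script, a call might look like applyTabAction(userAction.data.type, userAction.data.tab?.id, browser.tabs.update), provided the action type has already been narrowed to one of the four tab actions.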

Adding Keybindings To The Command Palette

Let’s display keybindings for the actions. Later, we will provide a way for the user to toggle the visibility of keybindings using browser storage to store user preferences.

  1. First update our UserAction type to include a keybinding property:
src/lib/types.ts
export type UserAction = {
 title: string;
 id: string;
 handler: () => void;
 visible: boolean;
 keybinding?: string[]; //optional
};
  2. We can detect the OS using the browser.runtime.getPlatformInfo API and set the keybindings accordingly. This API can only be used in the background script, so we will update our messaging setup, add a listener to the background script, and update the user.actions.ts file:
src/lib/messaging.ts
interface ProtocolMap {
 //...existing protocol map
 getPlatformInfo(): Promise<chrome.runtime.PlatformInfo>;
}
src/entrypoints/background.ts
export default defineBackground(() => {
 //...existing listeners
 onMessage("getPlatformInfo", async () => {
 return browser.runtime.getPlatformInfo();
 });
});
src/lib/user.actions.ts
import { sendMessage } from "@/lib/messaging";
import type { UserAction } from "@/lib/types";

export const getUserActions = async (): Promise<UserAction[]> => {
 const activeTab = await sendMessage("getActiveTab", undefined);
 const isMuted: boolean = activeTab.mutedInfo?.muted || false;
 const isPinned: boolean = activeTab.pinned || false;

 const platform = await sendMessage("getPlatformInfo", undefined);
 const isMac = platform.os === "mac";

 const userActions: UserAction[] = [
 {
 title: "New Tab",
 id: "newTab",
 handler: () => {
 sendMessage("userAction", { type: "newTab" });
 },
 visible: true,
 keybinding: ["Command", "T"],
 },
 {
 title: "New Window",
 id: "newWindow",
 handler: () => {
 sendMessage("userAction", { type: "newWindow" });
 },
 visible: true,
 keybinding: ["Command", "N"],
 },
 {
 title: "Mute Tab",
 id: "muteTab",
 handler: () => {
 sendMessage("userAction", { type: "muteTab", tab: activeTab });
 },
 visible: !isMuted,
 keybinding: isMac ? ["Command", "Shift", "M"] : ["Command", "M"],
 },
 {
 title: "Unmute Tab",
 id: "unmuteTab",
 handler: () => {
 sendMessage("userAction", { type: "unmuteTab", tab: activeTab });
 },
 visible: isMuted,
 keybinding: isMac ? ["Command", "Shift", "M"] : ["Command", "M"],
 },
 {
 title: "Pin Tab",
 id: "pinTab",
 handler: () => {
 sendMessage("userAction", { type: "pinTab", tab: activeTab });
 },
 visible: !isPinned,
 },
 {
 title: "Unpin Tab",
 id: "unpinTab",
 handler: () => {
 sendMessage("userAction", { type: "unpinTab", tab: activeTab });
 },
 visible: isPinned,
 },
 ];

 const visibleActions = userActions.filter((action) => action.visible);

 // change 'Command' to '⌘', Alt to '⌥' and Shift to '⇧' on Mac
 // 'Command' is 'Ctrl' on Windows/Linux
 const formattedActions = visibleActions.map((action) => {
 action.keybinding = action.keybinding?.map((key) => {
 if (key === "Command") {
 return isMac ? "⌘" : "Ctrl";
 } else if (key === "Alt") {
 return isMac ? "⌥" : "Alt";
 } else if (key === "Shift") {
 return isMac ? "⇧" : "Shift";
 }
 return key;
 });
 return action;
 });
 return formattedActions;
};
  3. Finally, render the keybindings in the Command Palette component:
src/components/CommandPalette.tsx
{userActions.map((action) => (
 <CommandItem
 key={action.id}
 value={action.title}
 onSelect={() => {
 action.handler();
 setOpen(false);
 }}
 >
 <span className="flex-1 truncate px-3 text-start">
 {action.title}
 </span>
 {action.keybinding && (
 <div className="flex items-center space-x-2">
 {action.keybinding.map((key, i) => (
 <div key={key}>
 <kbd className="px-2 py-1 text-foreground bg-accent rounded text-sm text-center">
 {key}
 </kbd>
 <span>
 {action.keybinding?.length &&
 i < action.keybinding.length - 1
 ? "+"
 : ""}
 </span>
 </div>
 ))}
 </div>
 )}
 </CommandItem>
))}

The Command Palette will now show the keybindings based on the OS.

On Windows/Linux
On Mac

Note that the muteTab and unmuteTab actions have different keybindings for Mac and other platforms, which we have handled based on the isMac variable.
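The label substitution described above can also be written as a small pure helper that returns a new array instead of mutating the action. This is only a sketch: formatKeybinding, MAC_LABELS, and OTHER_LABELS are hypothetical names, and the isMac flag is assumed to come from the getPlatformInfo message as before.

```typescript
// Hypothetical lookup tables: logical key name -> label shown in the UI.
const MAC_LABELS = new Map<string, string>([
  ["Command", "⌘"],
  ["Alt", "⌥"],
  ["Shift", "⇧"],
]);
// On Windows/Linux only "Command" needs a different label.
const OTHER_LABELS = new Map<string, string>([["Command", "Ctrl"]]);

// Returns display labels for a keybinding without modifying the input array.
function formatKeybinding(keys: readonly string[], isMac: boolean): string[] {
  const labels = isMac ? MAC_LABELS : OTHER_LABELS;
  // Keys without an entry (such as "M") are shown unchanged.
  return keys.map((key) => labels.get(key) ?? key);
}
```

For example, formatKeybinding(["Command", "Shift", "M"], true) yields ["⌘", "⇧", "M"], while passing false yields ["Ctrl", "Shift", "M"].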

Storage

The browser’s asynchronous storage API provides persistent key-value storage for user preferences, extension state, and other application data. We can use the local storage, which is persistent across browser sessions, or the session storage, which is cleared when the browser is closed. We can also use the sync storage, which is synced across devices if the user is signed in to their browser account. Each type of storage has different limits on the amount of data it can store. The storage API is available in all extension contexts, including background scripts and content scripts. For more details, refer to the MDN Docs or Chrome Developer Docs . Before proceeding, you are encouraged to read these docs and understand the differences between using storage APIs and other storage methods such as localStorage or IndexedDB.

WXT provides a well-designed wrapper around the browser storage APIs, with features such as type safety, versioning, metadata, bulk operations, and watchers, which make it easier to work with storage in extensions. These are thoroughly documented in the WXT docs, so we will not go into details here.

Adding A Toggle To Show/Hide Keybindings

To demonstrate the use of storage API, we will add a toggle in the popup to show or hide keybindings in the Command Palette.

  1. We need to add the storage permission in our wxt.config.ts:
wxt.config.ts
import { defineConfig } from "wxt";

// See https://wxt.dev/api/config.html
export default defineConfig({
 srcDir: "src",
 extensionApi: "chrome",
 modules: ["@wxt-dev/module-react"],
 manifest: {
 name: "Command Palette",
 description: "A command palette to quickly perform actions",
 permissions: ["storage"],
 },
});
  2. We will add a new file src/lib/storage.ts to handle storage operations. We will define our storage item using storage.defineItem from WXT:
src/lib/storage.ts

// Note the local: prefix, which is used to store the data in local storage
const showKeybindings = storage.defineItem<boolean>("local:showKeybindings", {
 defaultValue: true,
 fallback: true,
 version: 1,
});

export const store = {
 showKeybindings,
};

storage is auto-imported here from @wxt-dev/storage. We have defined a new item showKeybindings with a default value of true, and we are using local storage for this item. For convenience, we export a single store object that contains all the storage items, which is easier to manage than exporting each item individually when there are many of them.

  3. We will add a switch (from Shadcn) in our src/entrypoints/popup/App.tsx:
src/entrypoints/popup/App.tsx
import { useState, useEffect } from "react";
import { Switch } from "@/components/ui/switch";
import { Label } from "@/components/ui/label";
import { store } from "@/lib/storage";

function App() {
 const [showKeybindings, setShowKeybindings] = useState(true);

 useEffect(() => {
 store.showKeybindings.getValue().then(setShowKeybindings);
 }, []);

 return (
 <div className="m-3 w-72">
 <h1 className="text-lg font-bold mb-3">Settings</h1>
 <div className="flex items-center justify-between mb-3">
 <Label htmlFor="showKeybindings" className="font-normal">
 Show keybindings
 </Label>
 <Switch
 id="showKeybindings"
 checked={showKeybindings}
 onCheckedChange={(checked) => {
 store.showKeybindings.setValue(checked);
 setShowKeybindings(checked);
 }}
 ></Switch>
 </div>
 </div>
 );
}

export default App;
  4. In the Command Palette, all we have to do is check the value from storage and render the keybindings accordingly. We will update the existing useEffect:
src/components/CommandPalette.tsx
import { store } from "@/lib/storage";

export function CommandPalette() {
 const [showKeybindings, setShowKeybindings] = React.useState(true);
 //...existing states and effects

 React.useEffect(() => {
 if (!open) return;
 store.showKeybindings.getValue().then(setShowKeybindings);
 getUserActions().then(setUserActions);
 }, [open]);

Use the showKeybindings state to conditionally render the keybindings in the Command Palette:

src/components/CommandPalette.tsx
{action.keybinding && showKeybindings && (
 <div className="flex items-center space-x-2">
 {action.keybinding.map((key, i) => (
 <div key={key}>
 <kbd
 className="px-2 py-1 text-foreground bg-accent rounded text-sm text-center"
 >
 {key}
 </kbd>
 <span>
 {action.keybinding?.length &&
 i < action.keybinding.length - 1
 ? "+"
 : ""}
 </span>
 </div>
 ))}
 </div>
)}

The value is read from the storage when the Command Palette is opened, so if the user changes the setting in the popup, they will need to close and reopen the Command Palette to see the changes. We can add a watcher to the storage item to update the state in real time:

src/components/CommandPalette.tsx
React.useEffect(() => {
 if (!open) return;
 store.showKeybindings.getValue().then(setShowKeybindings);
 getUserActions().then(setUserActions);
 const unwatch = store.showKeybindings.watch(setShowKeybindings);
 return () => unwatch();
}, [open]);

Since the storage API is available in all extension contexts, we can perform storage operations directly without needing to use message passing or adding listeners in the background script.
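To make the getValue / setValue / watch pattern used above more concrete, here is a deliberately simplified in-memory model of a storage item. It is not WXT's implementation and does not touch browser storage; defineDemoItem and DemoWatcher are hypothetical names, and the model only illustrates the fallback value, watcher notification, and the unwatch function returned by watch.

```typescript
// Simplified in-memory model of a storage item; NOT WXT's implementation.
type DemoWatcher<T> = (newValue: T, oldValue: T) => void;

function defineDemoItem<T>(fallback: T) {
  // Nothing is stored until setValue is called for the first time.
  let stored: { value: T } | null = null;
  const watchers = new Set<DemoWatcher<T>>();
  const read = (): T => (stored === null ? fallback : stored.value);

  return {
    // Resolves to the fallback while nothing has been stored.
    async getValue(): Promise<T> {
      return read();
    },
    // Stores the value, then notifies every registered watcher.
    async setValue(value: T): Promise<void> {
      const oldValue = read();
      stored = { value };
      watchers.forEach((watcher) => watcher(value, oldValue));
    },
    // Registers a watcher and returns a function that removes it.
    watch(watcher: DemoWatcher<T>): () => void {
      watchers.add(watcher);
      return () => {
        watchers.delete(watcher);
      };
    },
  };
}
```

With this model, a component would call watch when it mounts and invoke the returned function during cleanup, which is the same shape as the useEffect shown above.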

Viewing The Stored Data

We can view the stored data in the browser directly. Note that the storage data will be empty if the default value of showKeybindings in the popup has not been changed.

  1. In Chrome, open the service worker DevTools (the same way we accessed the console for the background script), and go to the Application tab. You will see the Extension Storage section on the left, which has different storage areas such as Local, Session, and Sync. Our stored data will be visible in the Local section.

Chrome Storage

  2. In Firefox, open the DevTools by going to about:debugging#/runtime/this-firefox and clicking on Inspect for your extension. Under the Storage tab, you will see the Extension Storage section on the left, which has the stored data.

Firefox Storage

This is a very simple demonstration, and WXT offers many more storage-related features which are not covered here. You can refer to the WXT storage docs for more details.

Permissions

So far, our actions were simple and did not require any permissions to be declared in the manifest.json (besides storage). However, some actions, such as adding or removing a bookmark, require additional permissions. We have already seen an example of using the history permission in the previous post, and we can similarly add the bookmarks permission to our manifest. But this is not the best way of handling permissions: permissions listed in the manifest must be granted at installation, which might put users off if our extension needs sensitive permissions. A better way is to require only the minimum permissions at installation and declare the rest as optional permissions in the manifest, which can be requested at runtime when needed. Let’s see how we can add actions that require permissions, and how to request these permissions at runtime.

Adding And Removing Bookmarks With Install-Time Permissions

Let’s quickly add new actions to add and remove the current tab as a bookmark. We will also add a new required permission bookmarks in the wxt.config.ts for now (which we will later update to runtime permission):

wxt.config.ts
import { defineConfig } from "wxt";

// See https://wxt.dev/api/config.html
export default defineConfig({
 srcDir: "src",
 extensionApi: "chrome",
 modules: ["@wxt-dev/module-react"],
 manifest: {
 name: "Command Palette",
 description: "A command palette to quickly perform actions",
 permissions: ["storage", "bookmarks"],
 },
});
src/lib/messaging.ts

interface ProtocolMap {
 userAction(data: {
 type:
 | "newTab"
 | "newWindow"
 | "muteTab"
 | "unmuteTab"
 | "pinTab"
 | "unpinTab"
 | "addBookmark"
 | "removeBookmark";

 tab?: Tab;
 }): void;
 isBookmarked(data: { tab: Tab }): Promise<boolean>;

 //...existing protocol map
src/lib/user.actions.ts
const isBookmarked = await sendMessage("isBookmarked", { tab: activeTab });

const userActions: UserAction[] = [
 //...existing actions
 {
 title: "Add Bookmark",
 id: "addBookmark",
 handler: () => {
 sendMessage("userAction", { type: "addBookmark", tab: activeTab });
 },
 visible: !isBookmarked,
 keybinding: ["Command", "D"],
 },
 {
 title: "Remove Bookmark",
 id: "removeBookmark",
 handler: () => {
 sendMessage("userAction", { type: "removeBookmark", tab: activeTab });
 },
 visible: isBookmarked,
 keybinding: ["Command", "D"],
 },
]
src/entrypoints/background.ts
 onMessage("userAction", (userAction) => {
 switch (userAction.data.type) {
 //..existing cases
 case "addBookmark":
 if (userAction.data.tab?.id) {
 browser.bookmarks.create({
 title: userAction.data.tab.title,
 url: userAction.data.tab.url,
 });
 }
 break;

 case "removeBookmark":
 if (userAction.data.tab?.id) {
 browser.bookmarks.search({ url: userAction.data.tab.url })
 .then((bookmarks) => {
 if (bookmarks.length) {
 browser.bookmarks.remove(bookmarks[0].id);
 }
 });
 }
 break;
 }
 });

 onMessage("isBookmarked", async (message) => {
 if (!message.data.tab?.url) return false;
 const bookmarks = await browser.bookmarks.search({
 url: message.data.tab.url,
 });
 return bookmarks.length > 0;
 });

Our add and remove bookmark actions should work now.

Conditionally Showing Actions Based On Permissions

The visibility of certain actions depends on the specific permissions being granted. If we don’t have the bookmarks permission, we should not show either of the ‘Add Bookmark’ and ‘Remove Bookmark’ actions, as they won’t work in that case. Before displaying such actions, we need to check the permissions, as well as provide a way for the user to grant them if they are listed as optional permissions. Let’s update wxt.config.ts to remove the bookmarks permission from the required permissions and add it to optional permissions:

wxt.config.ts
import { defineConfig } from "wxt";

// See https://wxt.dev/api/config.html
export default defineConfig({
 srcDir: "src",
 extensionApi: "chrome",
 modules: ["@wxt-dev/module-react"],
 manifest: {
 name: "Command Palette",
 description: "A command palette to quickly perform actions",
 permissions: ["storage"],
 optional_permissions: ["bookmarks"],
 },
});

Again we need to update types and the messaging setup:

src/lib/types.ts
export type Permissions = chrome.permissions.Permissions;
src/lib/messaging.ts
interface ProtocolMap {
 //...existing protocol map
 getPermissions(): Promise<Permissions>;
}
src/entrypoints/background.ts
 onMessage("getPermissions", async () => {
 return browser.permissions.getAll();
 });

Consider the scenarios under which we want the ‘Add Bookmark’ and ‘Remove Bookmark’ to be visible:

Has permission | Is bookmarked | Show ‘Add’ action | Show ‘Remove’ action
No             | No            | No                | No
No             | Yes           | No                | No
Yes            | No            | Yes               | No
Yes            | Yes           | No                | Yes

In simpler words, if we don’t have the permission, we won’t show either of the options. If we have the permission, we will check if the page is bookmarked or not and show the corresponding action, so we need to check both the parameters here. Update the user.actions.ts file to include this logic:

src/lib/user.actions.ts
import type { Tab, UserAction } from "@/lib/types";

const getBookmarkStatus = async (
 tab: Tab,
): Promise<[hasBookmarksPermission: boolean, isBookmarked: boolean]> => {
 const permissionsInfo = await sendMessage("getPermissions", undefined);
 // permissions may be undefined, so default to false
 const hasBookmarksPermission =
 permissionsInfo.permissions?.includes("bookmarks") ?? false;

 // Without the permission, neither action is shown and we skip the bookmark lookup
 if (!hasBookmarksPermission) {
 return [false, false];
 }
 const isBookmarked = await sendMessage("isBookmarked", { tab });
 return [hasBookmarksPermission, isBookmarked];
};


export const getUserActions = async (): Promise<UserAction[]> => {
 //...existing checks
 const [hasBookmarksPermission, isBookmarked] = await getBookmarkStatus(activeTab);

 const userActions: UserAction[] = [
 //...existing actions

 {
 title: "Add Bookmark",
 id: "addBookmark",
 handler: () => {
 sendMessage("userAction", { type: "addBookmark", tab: activeTab });
 },
 visible: hasBookmarksPermission && !isBookmarked,
 keybinding: ["Command", "D"],
 },
 {
 title: "Remove Bookmark",
 id: "removeBookmark",
 handler: () => {
 sendMessage("userAction", { type: "removeBookmark", tab: activeTab });
 },
 visible: hasBookmarksPermission && isBookmarked,
 keybinding: ["Command", "D"],
 },
];

We have added a new function getBookmarkStatus to check the discussed conditions. However, neither of our bookmark actions will be visible for now, as bookmarks is an optional permission and it hasn’t been granted by the user yet.
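The visibility rules for the two bookmark actions can be checked in isolation with a tiny pure function. This is only an illustrative sketch; bookmarkActionVisibility is a hypothetical helper, not something the extension needs.

```typescript
// Hypothetical helper encoding the visibility rules for the bookmark actions.
function bookmarkActionVisibility(
  hasPermission: boolean,
  isBookmarked: boolean,
): { showAdd: boolean; showRemove: boolean } {
  return {
    // "Add Bookmark" needs the permission and a page that is not bookmarked yet.
    showAdd: hasPermission && !isBookmarked,
    // "Remove Bookmark" needs the permission and an existing bookmark.
    showRemove: hasPermission && isBookmarked,
  };
}
```

Without the permission both flags are false regardless of the bookmark state, so at most one of the two actions is ever shown.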

Requesting Permissions At Runtime

Requesting permissions is quite simple – we just have to add a way for the user to trigger the request. Let’s add a button in our popup through which the user can request the permission:

src/entrypoints/popup/App.tsx
import { useState, useEffect } from "react";
import { Switch } from "@/components/ui/switch";
import { Button } from "@/components/ui/button";
import { store } from "@/lib/storage";
import { Label } from "@/components/ui/label";
import { sendMessage } from "@/lib/messaging";

function App() {
 const [showKeybindings, setShowKeybindings] = useState(true);
 const [hasBookmarkPermission, setHasBookmarkPermission] = useState(false);

 useEffect(() => {
 store.showKeybindings.getValue().then(setShowKeybindings);
 }, []);

 useEffect(() => {
 const onload = async () => {
 const permissionsInfo = await sendMessage("getPermissions", undefined);
 //permissionsInfo contains permissions as well as origins
 // we only need permissions here
 if (permissionsInfo.permissions?.includes("bookmarks")) {
 setHasBookmarkPermission(true);
 }
 };
 onload();
 }, []);

 const requestBookmarksPermission = async () => {
 if (!hasBookmarkPermission) {
 const granted = await browser.permissions.request({
 permissions: ["bookmarks"],
 });

 if (granted) {
 setHasBookmarkPermission(true);
 }
 }
 };

 return (
 <div className="m-3 w-72 flex flex-col">
 <h1 className="text-lg font-bold mb-3">Settings</h1>
 <div className="flex items-center justify-between mb-3">
 <Label htmlFor="showKeybindings" className="font-normal">
 Show keybindings
 </Label>
 <Switch
 id="showKeybindings"
 checked={showKeybindings}
 onCheckedChange={(checked) => {
 store.showKeybindings.setValue(checked);
 setShowKeybindings(checked);
 }}
 ></Switch>
 </div>
 {hasBookmarkPermission ? (
 <span className="text-sm text-wrap mr-2">
 Bookmarks permission granted
 </span>
 ) : (
 <div className="flex items-center justify-between">
 <span className="text-sm text-wrap mr-2">
 Request bookmarks permission (required for adding/removing bookmarks)
 </span>
 <Button className="font-normal" onClick={requestBookmarksPermission}>
 Request
 </Button>
 </div>
 )}
 </div>
 );
}
export default App;

We initially checked if the permission is already granted, and if it isn’t, we show a button to request the permission. On button click, we request the permission using browser.permissions.request and update the state accordingly.

Warning

Important note for Firefox: Permissions can only be requested via direct user interaction, such as a button click, and not programmatically. This is to prevent extensions from requesting permissions without user consent. The permission request must be the first thing on the button click. If you add any other code before the permission request, the request will be blocked. For example, try adding a simple timeout before the permission request and you should get an error on the background script’s console:

src/entrypoints/popup/App.tsx
 const requestBookmarksPermission = async () => {

 await new Promise((resolve) => setTimeout(resolve, 1000));

 if (!hasBookmarkPermission) {
 const granted = await browser.permissions.request({
 permissions: ["bookmarks"],
 });

 if (granted) {
 setHasBookmarkPermission(true);
 }
 }
 };

Firefox Permissions Error

This also means you cannot make several permission requests one after another from a single button click (this might also be problematic in Chrome). If you need multiple permissions, the best way is to request them separately, each from its own user interaction.
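The "request first" rule can be sketched as a small handler factory in which the permission request is the first thing the click handler does, and everything else happens only after it settles. The names PermissionRequester and makePermissionClickHandler are hypothetical, and the request function is injected for illustration; in the popup it would wrap browser.permissions.request.

```typescript
// Signature modeled on browser.permissions.request; injected for illustration.
type PermissionRequester = (details: { permissions: string[] }) => Promise<boolean>;

// Hypothetical factory: builds a click handler that requests the permission
// before doing anything else, and runs follow-up work only after it settles.
function makePermissionClickHandler(
  request: PermissionRequester,
  permission: string,
  onGranted: () => void,
): () => Promise<boolean> {
  return async () => {
    // No awaits or timers come before this call, so it is made while the
    // click handler is still running.
    const granted = await request({ permissions: [permission] });
    if (granted) onGranted();
    return granted;
  };
}
```

In the popup, the handler could be created once with a callback that calls setHasBookmarkPermission(true) and then passed to the button's onClick.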

Making The Shortcut Configurable

We have been listening to keyboard events in our Command Palette component to toggle its visibility. We can move this to the extension level rather than having it in the UI component, which allows the user to configure the shortcut.

First we need to add the suggested keybinding to the manifest in wxt.config.ts:

wxt.config.ts
import { defineConfig } from "wxt";

// See https://wxt.dev/api/config.html
export default defineConfig({
 srcDir: "src",
 extensionApi: "chrome",
 modules: ["@wxt-dev/module-react"],
 manifest: {
 name: "Command Palette",
 description: "A command palette to quickly perform actions",
 permissions: ["storage"],
 optional_permissions: ["bookmarks"],
 commands: {
 toggleMainDialog: { // This is name of the command which we will listen to
 suggested_key: {
 default: "Alt+J", // Ctrl+J is taken by Chrome/Firefox
 },
 description: "Toggle the main dialog",
 },
 },
 },
});

Next, add a message that can be sent from the background script to the content script to toggle the dialog:

src/lib/messaging.ts
interface ProtocolMap {
 //...existing protocol map
 toggleMainDialog(): void;
}

Add a listener for the command in the background script (the commands API is not accessible in content scripts), and send a message to the content script to toggle the dialog:

src/entrypoints/background.ts

import { onMessage, sendMessage } from "@/lib/messaging";

export default defineBackground(() => {
 //...existing listeners
 browser.commands.onCommand.addListener((command) => {
 if (command === "toggleMainDialog") { // same name as in the manifest

 // send the message to the content script of active tab
 browser.tabs
 .query({
 active: true,
 currentWindow: true,
 })
 .then((tabs) => {
 if (tabs[0]?.id) {
 // tab id is required to send message to the content script
 sendMessage("toggleMainDialog", undefined, tabs[0].id);
 }
 });
 }
 });
});

On receiving the message in the Command Palette component, simply update the state to toggle the dialog:

src/components/CommandPalette.tsx

import { onMessage } from "@/lib/messaging";

export function CommandPalette() {

 //...existing code
 React.useEffect(() => {

 // onMessage returns a function to remove the listener, which we can call in the cleanup function
 const removeListener = onMessage("toggleMainDialog", () => {
 setOpen((open) => !open);
 });
 return () => removeListener();
 }, []);


 // We can remove the keyboard event listener
 // React.useEffect(() => {
 // const down = (e: KeyboardEvent) => {
 // if (e.key === "j" && (e.metaKey || e.ctrlKey)) {
 // e.preventDefault();
 // setOpen((open) => !open);
 // }
 // };
 //
 // document.addEventListener("keydown", down);
 // return () => document.removeEventListener("keydown", down);
 // }, []);

 //.. rest of the code as is

Our default shortcut is now Alt+J, which the user can change from the browser’s extension settings.

Configurable Shortcut in Chrome
Configurable Shortcut in Firefox

Conclusion

Through this series, we have covered the fundamentals of building modern web extensions using WXT. To recall, we have covered the following topics:

  1. Fundamentals of Web Extensions, the current state of extension development, and the need for a modern approach.
  2. Setting up a new project with WXT, Tailwind CSS, and Shadcn.
  3. Working with Content Scripts, building isolated UIs, and fixing some common issues with UI.
  4. Background Scripts, built-in messaging APIs, and using an external wrapper for messaging.
  5. Using Storage API, Permissions / Runtime Permissions, and Commands.

This builds a strong foundation for building more complex extensions. We have covered a lot of ground, but there is still a lot more to explore:

  1. Other Entrypoints : Although we have used only the content script and the popup, there are other entrypoints such as the sidebar, DevTools, and options pages which you can use in your extension.
  2. Publishing : Different stores have different processes for publishing extensions. After the initial publication, WXT provides a way to automate the process of updating the extension. To use an extension in the release build of a browser, you need a signed version, which can be obtained from the respective store. You can also automate publishing using GitHub Actions.
  3. Testing : WXT has support for Vitest for unit testing and suggests using Playwright for end-to-end testing.
  4. Internationalization : WXT has a package @wxt-dev/i18n for internationalization, which can be used to localize the extension based on the user’s preferred language.

Finally, keep an eye on this space for future posts. If you have any questions or suggestions, feel free to reach out at reply@aabidk.dev

https://aabidk.dev/blog/building-modern-cross-browser-web-extensions-core-functionality-storage-and-permissions/
Building Modern Cross Browser Web Extensions: Background Scripts and Messaging (Part 4)

In the previous post, we explored how to work with content scripts and build isolated UIs for our extension. In this post, we will learn how to work with background scripts and how content scripts and background scripts communicate with each other.

Background Scripts

Background scripts enable us to listen to browser events (such as opening/closing tabs, bookmarking a page, etc.) and use sensitive browser APIs (which cannot be accessed from content scripts directly) as long as the user has granted the necessary permissions. The term “background script” is used with Manifest V2, and with Manifest V3, the term “service worker” is used. In our project, we have a src/entrypoints/background.ts file which is our background script. WXT automatically handles the registration of the background script in the manifest file as per Manifest V2 or Manifest V3. More information about background scripts can be found in MDN Web Docs and Chrome Extension Docs .

Listening to Browser Events

Recall from previous posts that we have a browser global available via the auto imports that WXT provides, and that we can use the same for both Chrome and Firefox. Here is how we can listen to the onInstalled event and perform some action when the extension is installed:

src/entrypoints/background.ts
export default defineBackground(() => {
 browser.runtime.onInstalled.addListener(() => {
 browser.tabs.create({ url: "https://www.google.com" });
 console.log("onInstalled");
 });
});

This opens a new tab with google.com when the extension is installed (note that onInstalled also fires when the extension or the browser is updated). The log message will be visible in the service worker’s console. Other events such as onStartup and onMessage are similarly available in browser.runtime. Other APIs, such as tabs and bookmarks, provide an extensive list of events which can be used in background scripts.
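The onInstalled listener receives a details object whose reason field distinguishes a first-time install from an update. As a minimal sketch, a welcome tab could be limited to fresh installs with a check like the one below; InstalledDetails and shouldOpenWelcomeTab are hypothetical names, and the type only models the fields used here.

```typescript
// Subset of the details object passed to runtime.onInstalled listeners.
type InstalledDetails = { reason: string; previousVersion?: string };

// Hypothetical helper: only a first-time install should open the welcome tab.
function shouldOpenWelcomeTab(details: InstalledDetails): boolean {
  return details.reason === "install";
}

// In the background script this could be used as:
// browser.runtime.onInstalled.addListener((details) => {
//   if (shouldOpenWelcomeTab(details)) {
//     browser.tabs.create({ url: "https://www.google.com" });
//   }
// });
```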

Communication between Content Scripts and Background Scripts

Communication is one of the most crucial and complex components of web extensions. Errors in communication logic are easy to introduce and difficult to debug. Hence, having a good understanding of communication fundamentals is essential.

To communicate between content scripts and background scripts, we can use the messaging APIs provided by the browser. Using these APIs, we can send one-time messages, or use long-lived connections for continuous communication. Bidirectional communication is possible, so the background script can send messages to the content script and vice versa. Let’s try to send a message from the content script to the background script and log the message in the background script:

src/entrypoints/test.content.ts
export default defineContentScript({
 matches: ["<all_urls>"],

// The main function of the content script can be async
 async main(ctx) {
 const res = await browser.runtime.sendMessage("testMessage");
 console.log(res);
 },
});

and in the background script:

src/entrypoints/background.ts
export default defineBackground(() => {

 // The listener function passed to addListener CANNOT be async
 browser.runtime.onMessage.addListener((message, sender, sendResponse) => {
 if (message === "testMessage") {
 console.log("got message from test.content.ts");
 sendResponse("testResponse");
 }
 });
});

A few important points to note here:

  • The browser.runtime.sendMessage function is used to send a message from the content script to the background script. When sending from the background script to the content script, we need to use browser.tabs.sendMessage, which also requires the tabId of the target tab.

  • The browser.runtime.onMessage.addListener function is used to listen to messages in the background script. The same function can be used in content scripts as well.

  • Although we haven’t used the sender parameter in the listener function, it can be used to get information about the tab that sent the message. Note that the browser.runtime.onMessage event only receives messages from our own extension or its content scripts. Messages from external sources are delivered through the separate browser.runtime.onMessageExternal event. By default, other extensions can send messages to our extension this way, but web pages cannot. To receive messages from web pages, we need to explicitly allow them in our manifest file by specifying the externally_connectable property. With onMessageExternal, the sender parameter will contain the URL of the page or the ID of the extension sending the message, which can be used to verify the authenticity of the message.

  • The sendResponse function here is used to send a synchronous response back to the content script. To send an asynchronous response with sendResponse, we must return true from the listener function. We cannot use an asynchronous listener function with addListener. For example:

src/entrypoints/background.ts
browser.runtime.onMessage.addListener((message, sender, sendResponse) => {
 if (message === "testMessage") {
 console.log("got message from test.content.ts, sending async response");
 setTimeout(() => {
 sendResponse("testResponse async");
 }, 1000);
 return true;
 }
});

This way the message channel will be kept open until the response is sent back to the content script using sendResponse. If we have multiple listeners for the same event, only the first one to call sendResponse will send the response, and the rest will be ignored. Consider the following code:

src/entrypoints/background.ts
browser.runtime.onMessage.addListener((message, sender, sendResponse) => {
 if (message === "testMessage") {
 console.log("got message from test.content.ts, sending async response");
 setTimeout(() => {
 sendResponse("testResponse async");
 }, 1000);
 return true;
 }
});


browser.runtime.onMessage.addListener((message, sender, sendResponse) => {
 if (message === "testMessage") {
 console.log("A second listener for the same message");
 setTimeout(() => {
 sendResponse("This will be sent first, the other listener will be ignored");
 }, 500);
 return true;
 }
});

In the above example, the response from the second listener will be sent first, and the response from the first listener will be ignored as both are listening for the same message.

For listening to multiple messages, a cleaner way would be to use a switch case:

src/entrypoints/background.ts
browser.runtime.onMessage.addListener((message, sender, sendResponse) => {
 switch (message) {
 case "message_1":
 console.log("got message_1");
 setTimeout(() => {
 sendResponse("response_1");
 }, 1000);
 return true;

 case "message_2":
 console.log("got message_2");
 setTimeout(() => {
 sendResponse("response_2");
 }, 1000);
 return true;

 //...more cases

 // on any unknown message
 default:
 console.log("got unknown message:", message);
 //No async response here, so no need to return true
 sendResponse("unknown message");
 return false;
 }
});

You can test this by sending different messages from the content script and checking the console logs in both the consoles.
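As the number of messages grows, the switch statement can be swapped for a lookup table of handlers. The following is only a sketch of one possible shape, not part of the browser API: createListener, Handler and the message names are hypothetical. The returned function mirrors the listener signature used above, stays non-async, and returns true for known messages so that the channel remains open until sendResponse is called.

```typescript
// Hypothetical helper: maps message names to handlers instead of using a switch.
// A handler may return a plain value or a Promise.
type Handler = (sender: unknown) => unknown;

function createListener(handlers: Map<string, Handler>) {
  // Plain (non-async) function, matching what addListener expects.
  return (
    message: unknown,
    sender: unknown,
    sendResponse: (response: unknown) => void,
  ): boolean => {
    const handler = typeof message === "string" ? handlers.get(message) : undefined;

    if (handler === undefined) {
      // Unknown message: respond synchronously, no need to keep the channel open.
      sendResponse("unknown message");
      return false;
    }

    // Run the handler in a promise chain so sync and async handlers are treated alike.
    Promise.resolve()
      .then(() => handler(sender))
      .then(sendResponse, (error: unknown) => {
        console.error("handler failed:", error);
        sendResponse(undefined); // assumption: reply with undefined on failure
      });

    return true; // keep the channel open until sendResponse runs
  };
}

const listener = createListener(
  new Map<string, Handler>([
    ["message_1", () => "response_1"],
    ["message_2", async () => "response_2"],
  ]),
);
```

In the background script, this listener would be registered with browser.runtime.onMessage.addListener(listener). Adding a new message then means adding one entry to the map rather than another case.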

Using sensitive browser APIs in Background Scripts

Let’s say we want to access the user’s browsing history in our extension. This is a sensitive API that requires the history permission in the manifest file. We cannot access the history API directly in our content scripts, so we will use messaging as discussed previously. We can add the history permission to the manifest section in our wxt.config.ts file as:

wxt.config.ts
import { defineConfig } from "wxt";

// See https://wxt.dev/api/config.html
export default defineConfig({
 srcDir: "src",
 extensionApi: "chrome",
 modules: ["@wxt-dev/module-react"],
 manifest: {
 name: "Command Palette",
 description: "A command palette to quickly perform actions",
 permissions: ["history"],
 },
});

Let’s send a message from the content script to the background script to get the history and log it on response:

src/entrypoints/test.content.ts
// The HistoryItem type from chrome namespace should work for both Chrome and Firefox
type HistoryItem = chrome.history.HistoryItem;

export default defineContentScript({
 matches: ["<all_urls>"],

 async main(ctx) {
 const history: HistoryItem[] = await browser.runtime.sendMessage("getHistory");
 console.log("history", history);
 },
});

and in the background script:

src/entrypoints/background.ts
export default defineBackground(() => {
 browser.runtime.onMessage.addListener((message, sender, sendResponse) => {
 if (message === "getHistory") {
 console.log("getting history");
 browser.history.search({ text: "" }).then((history) => {
 sendResponse(history);
 });
 return true;
 }
 });
});

Our response should look something like this in the browser console:

[
 {
 "id": "2",
 "lastVisitTime": 1735732800000.0,
 "title": "DuckDuckGo - Your protection, our priority.",
 "typedCount": 0,
 "url": "https://duckduckgo.com/",
 "visitCount": 2
 }
]

We have used the HistoryItem type, allowing us to easily access the available attributes of the history item. The extensive list of browser APIs, their attributes and their events is available on MDN Web Docs and Chrome Developer Docs.
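As a small illustration of working with these typed attributes, here is a sketch that picks the most visited entries from a history response. HistoryEntry is a local stand-in that mirrors only the fields shown in the sample output above (inside the extension, the HistoryItem type would be used instead), topVisited is a hypothetical helper, and the second sample entry is made up for the example.

```typescript
// Local stand-in for chrome.history.HistoryItem, limited to the fields shown above.
interface HistoryEntry {
  id: string;
  url?: string;
  title?: string;
  lastVisitTime?: number; // milliseconds since the epoch
  visitCount?: number;
  typedCount?: number;
}

// Hypothetical helper: returns the `limit` most visited entries, most visited first.
// The input array is copied first, so it is not mutated by the sort.
function topVisited(items: HistoryEntry[], limit: number): HistoryEntry[] {
  return [...items]
    .sort((a, b) => (b.visitCount ?? 0) - (a.visitCount ?? 0))
    .slice(0, limit);
}

const sample: HistoryEntry[] = [
  {
    id: "2",
    url: "https://duckduckgo.com/",
    title: "DuckDuckGo - Your protection, our priority.",
    lastVisitTime: 1735732800000,
    visitCount: 2,
    typedCount: 0,
  },
  // Made-up entry, only here to give the sort something to do.
  { id: "3", url: "https://example.com/", title: "Example Domain", visitCount: 5 },
];

for (const item of topVisited(sample, 1)) {
  console.log(item.title, item.visitCount);
}
```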

Warning

Although most of the browser APIs are similar between the browsers, there are some minor differences, which you may encounter when developing browser-specific features. For example, the browser.tabs.onActivated event provides an activeInfo object, which has previousTabId, tabId and windowId properties in Firefox[1], but only tabId and windowId in Chrome[2]. As we are using @types/chrome for types, TypeScript will complain that previousTabId is not available if we try to use it. When using such browser-specific features, we can add a check for the browser and a simple //@ts-ignore above the line to ignore the error. Another way is to use the WebExtension Polyfill (which is disabled by default as of now), but it might add further complexity in managing types in the project. In most cases, you should try to use only the widely available APIs unless absolutely necessary.
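One possible alternative to //@ts-ignore for the previousTabId case is a small type guard that checks for the property at runtime. This is only a sketch: ActiveInfo and FirefoxActiveInfo are local stand-ins for the real activeInfo types, and hasPreviousTabId and describeActivation are hypothetical helpers.

```typescript
// Shape shared by both browsers, as described above.
interface ActiveInfo {
  tabId: number;
  windowId: number;
}

// Firefox may additionally report the previously active tab.
interface FirefoxActiveInfo extends ActiveInfo {
  previousTabId?: number;
}

// Type guard: narrows the type only when the property is actually present.
function hasPreviousTabId(
  info: ActiveInfo,
): info is FirefoxActiveInfo & { previousTabId: number } {
  return typeof (info as FirefoxActiveInfo).previousTabId === "number";
}

function describeActivation(info: ActiveInfo): string {
  return hasPreviousTabId(info)
    ? `tab ${info.previousTabId} -> tab ${info.tabId}`
    : `tab ${info.tabId} activated`;
}

const chromeLike: ActiveInfo = { tabId: 5, windowId: 1 };
const firefoxLike: FirefoxActiveInfo = { tabId: 5, windowId: 1, previousTabId: 3 };

console.log(describeActivation(chromeLike));
console.log(describeActivation(firefoxLike));
```

Because the check happens at runtime, the same code path works in both browsers without branching on the browser name.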

A better way of messaging

The above approach of messaging using the browser APIs is quite error-prone and can get difficult to manage as the extension grows. We have used types only minimally, and the complexity of asynchronous operations adds to the complication. WXT recommends using a wrapper around the built-in messaging APIs for these reasons. We will be using @webext-core/messaging to simplify our messaging implementation and have a more robust and maintainable design.

First, we need to install the package:

pnpm add @webext-core/messaging

We will keep our types in a separate file src/lib/types.ts:

src/lib/types.ts
export type HistoryItem = chrome.history.HistoryItem;

Our messaging setup will be in src/lib/messaging.ts:

src/lib/messaging.ts
import { HistoryItem } from "@/lib/types";
import { defineExtensionMessaging } from "@webext-core/messaging";

interface ProtocolMap {
 getHistory(data: { size: number }): Promise<HistoryItem[]>;
 //or
 //getHistory(data: { size: number }): HistoryItem[];
 // Both are same, as all messages are async. We don't explicitly need to return a Promise, but it's good for clarity.

}

export const { sendMessage, onMessage } = defineExtensionMessaging<ProtocolMap>();

We have defined a ProtocolMap interface which will contain all the messages that we want to pass between the content script and the background script. We have defined a getHistory message which will take a size parameter and return an array of HistoryItem. We have also defined the sendMessage and onMessage functions which will be used for message passing instead of the built-in browser APIs.

Note

We can skip the data key if we have only one parameter in the message.

interface ProtocolMap {
 getHistory(size: number): HistoryItem[];
}

Protocol map functions expect only one parameter data. If we have multiple parameters, we can pass them as an object with multiple keys:

src/lib/messaging.ts
interface ProtocolMap {
 getHistory(data: { size: number, query: string }): HistoryItem[];
}

Even though we have a single parameter, size, we are using data for consistency. More information can be found in webext-core’s documentation.

Now in our background script, we can use the onMessage function to listen to messages:

src/entrypoints/background.ts
import { onMessage } from "@/lib/messaging";

//Notice that the function passed to defineBackground is still not async
export default defineBackground(() => {

// The message handler function can be async
 onMessage("getHistory", async (message) => {
 const history = browser.history.search({
 text: "",
 maxResults: message.data.size,
 });
 return history;
 });
});

We do not need to use sendResponse or return true here. The onMessage function will automatically handle the response. The message handler function can be async, and we can use await to wait for the response from the browser APIs if we need to modify it before returning, or just directly return the Promise. We also have additional attributes such as sender and timestamp available on the message parameter.

In the content script, we can use the sendMessage function to send messages:

src/entrypoints/test.content.ts
import { sendMessage } from "@/lib/messaging";
export default defineContentScript({
 matches: ["<all_urls>"],

 async main(ctx) {
 const history = await sendMessage("getHistory", { size: 4 });
 console.log(history);
 },
});

Instead of passing the size parameter directly, note that we are passing it as an object (the data object) with a key size. You should now be able to see the history logged in the browser console.

We now have a more robust and maintainable messaging system in place. We can easily add more messages to the ProtocolMap interface and use them in our content and background scripts. This approach allows us to catch errors early in development, as message names are checked as literals (e.g. getHistory), which makes typing mistakes in message passing easy to spot. For example, even gethistory would not be recognized as a valid message in this case. We can further organize the code by creating helper functions in src/lib/helpers.ts:

src/lib/helpers.ts
import { HistoryItem } from "@/lib/types";
import { sendMessage } from "@/lib/messaging";

export async function fetchHistory(size: number): Promise<HistoryItem[]> {
  try {
    // await inside the try block, otherwise a rejected promise would skip the catch
    return await sendMessage("getHistory", { size });
  } catch (error) {
    console.error("Error getting history", error);
    // Fall back to an empty list so callers always get an array
    return [];
  }
}

and then use the helper functions in our content script:

src/entrypoints/test.content.ts
import { fetchHistory } from "@/lib/helpers";

export default defineContentScript({
 matches: ["<all_urls>"],

 async main(ctx) {

 const history = await fetchHistory(4);
 console.log("history", history);

 },
});

In summary, our process for messaging will be:

  1. Define the appropriate types in src/lib/types.ts
  2. Define the messages in src/lib/messaging.ts
  3. Add the message handlers in the background script using onMessage
  4. Use the sendMessage function in the content script to send messages / use helper functions for more complex operations.

We can also send messages from the background script to the content script using the same messaging system. The only difference is that we have to pass the tabId as the third argument:

sendMessage("getHistory", { size: 4 }, tabId);

This will send the message to the content script of the tab with the given tabId. We would also need to use onMessage in the content script to listen to messages from the background script.

Using Proxy Service: an alternative to messaging

If the majority of your extension’s logic needs to run in the background script, you can look into @webext-core/proxy-service, which allows you to register a service in the background script and call its methods directly from other contexts without any message passing. We will continue to use messaging in our extension for now, but you can explore this option if it fits your use case better.

In the next post, we will use the discussed concepts to build the core functionality of our extension, as well as explore storage and permissions in web extensions.


  1. https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/tabs/onActivated#activeinfo

  2. https://developer.chrome.com/docs/extensions/reference/api/tabs#parameters_29

https://aabidk.dev/blog/building-modern-cross-browser-web-extensions-background-scripts-and-messaging/
Building Modern Cross Browser Web Extensions: Content Scripts and UI (Part 3)

In the previous post, we installed WXT, TailwindCSS and Shadcn and explored the project structure. In this post, we will explore how to work with content scripts and build isolated UIs for our extension as well as discuss some issues with UI and how to resolve them. We will build a Command Palette for quick browser actions, such as creating new tabs/windows, muting/unmuting tabs, and more over the next few posts.

Content Scripts

Content scripts are JavaScript files that run in the context of web pages. They can read and modify the DOM of web pages the browser visits but have limited access to browser APIs. We need to use message passing to communicate and get information through background script, which has access to more browser APIs.

Injecting Content Scripts

In the previous post, we saw that our project has a content script at src/entrypoints/content.ts. The WXT documentation explains how we can organize our content scripts in the entrypoints section. We will rename our content script to main.content.tsx. The documentation describes 3 methods for injecting UIs into pages, with well-explained differences. We’ll go with the Shadow Root method as it is pretty straightforward and provides isolation for our CSS styling.

First, let’s add the Shadcn Command component to our project:

pnpm dlx shadcn@latest add command

In src/components/, create a new file CommandPalette.tsx and add the example code from the Shadcn website for using Command with Dialog. Rename the default export from CommandDialogDemo to CommandPalette.

In our main.content.tsx, we will remove the existing code and add the following:

src/entrypoints/main.content.tsx
import ReactDOM from "react-dom/client";
import { CommandPalette } from "@/components/CommandPalette";
import "@/index.css";

export default defineContentScript({
 matches: ["<all_urls>"],
 cssInjectionMode: "ui",

 async main(ctx) {
 const ui = await createShadowRootUi(ctx, {
 name: "command-palette",
 position: "inline",
 anchor: "body",
 onMount: (container) => {
 const app = document.createElement("div");
 app.id = "command-palette-root";
 container.append(app);

 const root = ReactDOM.createRoot(app);
 root.render(<CommandPalette />);
 return root;
 },
 onRemove: (root) => {
 root?.unmount();
 },
 });

 ui.mount();
 },
});

A few things to note here:

  1. ctx, the parameter to the main function, provides information about the context. When an extension is disabled or uninstalled, the content script is not removed. In such cases, the “context” of our content script becomes invalid. We can use ctx to monitor if the context is still valid, for example to remove event listeners. The UI is removed automatically when the context becomes invalid, so we don’t need to use it explicitly here.
  2. We are using createShadowRootUi to create a shadow root for our UI (this code is the same as in the docs). In the DOM, our shadow root will be in a command-palette element (the name we passed to createShadowRootUi) like:
<command-palette
 data-wxt-shadow-root=""
 data-aria-hidden="true"
 aria-hidden="true"
>
 #shadow-root (open)
 <html>
 <head></head>
 <body>
 <div id="command-palette-root"></div>
 </body>
 </html>
</command-palette>
  3. The main function in content scripts can be async, but not in background scripts. We are using async here.

  4. Since we are using <all_urls> in the matches array, our content script will run on all pages. We can restrict it to specific pages by using a URL pattern.

  5. Importing @/index.css in the content script will inject the CSS into the shadow DOM, allowing our Shadcn components to use the styles.

We have used distinct names for the shadow root and the root element of our UI, so that they are easily identifiable in the DOM. You can inspect the DOM to see the shadow root and the root element structured as shown above.

There are some issues with the dialog that we need to resolve though.

Resolving Issues With The UI

If you go to any webpage and inspect the elements, you will see the above element in the DOM. However, if we try to trigger the dialog, it will look broken like this:

The broken command dialog at the bottom of the page

The issue here is that by default on keypress, the dialog div is added to the main DOM and not the shadow DOM. Shadcn is built on RadixUI, hence we can fix this by using RadixUI portals.

Using RadixUI Portals To Move The Dialog To Shadow DOM
  1. In our main.content.tsx, create a portal context and a content root component:
src/entrypoints/main.content.tsx
import React, { useState } from "react";

export const PortalContext = React.createContext<HTMLElement | null>(null);

const ContentRoot = () => {
 const [portalContainer, setPortalContainer] = useState<HTMLElement | null>(
 null,
 );

 return (
 <React.StrictMode>
 <PortalContext.Provider value={portalContainer}>
 <div ref={setPortalContainer} id="command-portal-container">
 <CommandPalette />
 </div>
 </PortalContext.Provider>
 </React.StrictMode>
 );
};
  2. In the same file, update the main function to use the ContentRoot instead of CommandPalette:
src/entrypoints/main.content.tsx
root.render(<ContentRoot />);
  3. Ensure that you have @/index.css imported in your src/entrypoints/main.content.tsx file.

  4. In src/components/ui/dialog.tsx, import the PortalContext and useContext, and update the DialogContent component to use our portal:

src/components/ui/dialog.tsx

import { useContext } from "react";
import { PortalContext } from "@/entrypoints/main.content.tsx";

//... rest of the code as is

const DialogContent = React.forwardRef<
 React.ElementRef<typeof DialogPrimitive.Content>,
 React.ComponentPropsWithoutRef<typeof DialogPrimitive.Content>
>(({ className, children, ...props }, ref) => (
 <DialogPortal container={useContext(PortalContext)}>
 <DialogOverlay />
 {/* ... rest of the component as is */}

After this, the dialog should be rendered properly.

What we did here is create a div and use it as the portal container where our dialog will be rendered, thus bringing the dialog inside the Shadow DOM.

Isolating Events

By default, events in the Shadow DOM are not isolated from the main DOM. This means the events such as key presses and scroll can still affect the main page. To see this in action, try opening YouTube, playing a video, and then opening our Command Palette. If you press the m key, the video will mute, showing that the event has bubbled up to the main DOM. Additionally, you’ll notice that scrolling doesn’t work as expected within the Command Palette. We can easily isolate events by updating our createShadowRootUi function:

src/entrypoints/main.content.tsx
 async main(ctx) {
 const ui = await createShadowRootUi(ctx, {
 name: "command-palette",
 position: "inline",
 anchor: "body",
 isolateEvents: ["keydown", "keyup", "keypress", "wheel"], // Add other events as needed
 onMount: (container) => {

 //... rest of the code as is

Once you add the isolateEvents array, the events will be isolated to the shadow DOM, and both keypresses and scrolling will work as expected.

Configuring PostCSS To Convert rem To px

There is one other issue that is not immediately visible, but that you might encounter on some pages. Shadcn and TailwindCSS use rem units widely. rem units are relative to the root element’s font size, and in the Shadow DOM they are still relative to the font size of the main document’s root element (i.e., the html element), not the shadow root. A page that sets an unusual root font size can therefore affect our UI.

The fix for this is rather simple: use other units like px, for example by converting with this package. We can do it as follows:

  1. Install the package:
# https://github.com/TheDutchCoder/postcss-rem-to-px
npm install --save-dev postcss @thedutchcoder/postcss-rem-to-px

#or
pnpm add -D postcss @thedutchcoder/postcss-rem-to-px
  2. Update our postcss.config.js to use the plugin:
postcss.config.js
export default {
 plugins: {
 autoprefixer: {},
 "@thedutchcoder/postcss-rem-to-px": {},
 tailwindcss: {},
 },
};

We can now safely use rem units in our CSS, and they will be converted to px units in the Shadow DOM automatically during build.
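To make the conversion concrete, here is a sketch of the arithmetic involved. It is not the plugin’s actual code: remToPx is a hypothetical helper, and the base font size of 16px is an assumption for the example.

```typescript
// Hypothetical helper illustrating the rem -> px arithmetic.
// Assumes a base font size of 16px unless another value is passed.
function remToPx(cssValue: string, baseFontSize = 16): string {
  // Matches numbers such as "1", "0.5", ".25" or "-1.5" directly followed by "rem".
  return cssValue.replace(/(-?\d*\.?\d+)rem\b/g, (_match, amount: string) => {
    return `${parseFloat(amount) * baseFontSize}px`;
  });
}

console.log(remToPx("0.5rem 1.5rem")); // "8px 24px"
console.log(remToPx("calc(100% - 2rem)")); // "calc(100% - 32px)"
```

Once values are expressed in px, they no longer depend on the font size of the host page’s html element.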

Note that we only have the main.content.tsx as a content script, and any UI component can be directly used by importing without marking them as content script separately (just as we don’t have CommandPalette marked as content script explicitly anywhere).

Other Styling Fixes

You might also see some minor styling issues with the dialog, such as z-index (again, noticeable on YouTube), which can easily be fixed by adding custom CSS either in src/index.css itself or in a separate file:

src/index.css
div[role="dialog"] {
 z-index: 999999;
}

The ShadowDOM, event isolation and rem to px conversion are one time fixes, while you might need styling fixes for UI based on the components you use. We now have a properly isolated UI for our extension, and we can start adding more features to it.

In the next post, we will explore background scripts, the communication process in extensions, and setting up a robust type-safe messaging system for our extension.

https://aabidk.dev/blog/building-modern-cross-browser-web-extensions-content-scripts-and-ui/
Building Modern Cross Browser Web Extensions: Project Setup (Part 2)

In the previous post, we discussed the basics of Web Extensions and the tools we will be using to build our extension. Out of the three mentioned frameworks, we will be using WXT. In this post, we will set up the project, install the necessary dependencies and get an overview of the project structure.

Edit 2025/02/03: The source code for the extension we will be building in this series is available on GitHub.

Note

The project uses pnpm as the package manager. If you are using other package managers, adjust the commands accordingly.

Installing and setting up WXT

The documentation for WXT is pretty straightforward, and you should refer to it for more detailed information at any step. The commands below are directly taken from the documentation:

# PNPM
pnpm dlx wxt@latest init
# Bun
bunx wxt@latest init
# NPM
npx wxt@latest init
# Yarn: use NPM initially, but select Yarn when prompted
npx wxt@latest init

Enter your project name when prompted. You will be asked to select a template, and a package manager. We will be choosing React and PNPM (or you can choose any other package manager) respectively. Once the command finishes, you will be asked to cd into the project folder and install the packages.

Warning

WXT is in active development, and the commands/structure might change in the future. The version used in the guide is 0.19.23 which you can pin in your package.json file to follow along.

You are encouraged to go through the documentation once at this point to get a better understanding of the project structure and the commands available.

We will add a src directory for better project organization. Create a src directory in your project’s root and move assets, entrypoints and public folders inside it. In the wxt.config.ts in root of your project, add the following:

wxt.config.ts
export default defineConfig({
 srcDir: "src",
});

Run pnpm dev (or pnpm dev:firefox for Firefox) to ensure that the setup is successful. Your browser should open, and you will see the extension installed.

During development, it is recommended to test the extension on both Chrome and Firefox continuously to ensure that it works as expected on both the browsers, as well as to catch any potential browser-specific bugs early.

Installing TailwindCSS and Shadcn

Although WXT uses Vite under the hood, the Shadcn theme instructions for Vite do not work directly with WXT. We will use the Manual installation steps with a few modifications.

  1. The first few commands are for installing TailwindCSS and some other dependencies. Run the following commands:
pnpm add -D tailwindcss postcss autoprefixer
pnpm dlx tailwindcss init -p
pnpm add tailwindcss-animate class-variance-authority clsx tailwind-merge lucide-react
  2. In the tsconfig.json file, add the following paths and baseUrl in the compilerOptions:
tsconfig.json
{
 "extends": "./.wxt/tsconfig.json",
 "compilerOptions": {
 "allowImportingTsExtensions": true,
 "jsx": "react-jsx",
 "baseUrl": ".",
 "paths": {
 "@/*": [
 "./src/*"
 ]
 }
 }
}

Warning

WXT explicitly advises against adding paths directly to tsconfig.json, and suggests using the alias option in wxt.config.ts in their documentation . Doing so could cause issues in the future if WXT changes its configurations. However, Shadcn fails to resolve the paths correctly if we do not add them to tsconfig.json directly. There is an open issue in Shadcn about the same. The manual addition is just a temporary workaround and you are advised to monitor the issue’s status and update your configuration accordingly in the future.

  3. Copy the contents of the tailwind.config.js file from the Shadcn documentation. Change the value of the content key as follows:
content: ["./src/**/*.{html,js,ts,jsx,tsx}", "./index.html"],
  4. Create a file index.css in the src/ folder. Copy the styles from the Configure styles section of the Shadcn documentation and paste them in the index.css file.

  5. Create a lib/ folder in src/. Inside lib/, create a file utils.ts and add the cn helper code.

  6. Create a components.json file in the project’s root and add the components.json code.

  7. Try adding a button with the following command:

pnpm dlx shadcn@latest add button

If everything is set up correctly, the button should be added in src/components/ui/button.tsx.

  8. Replace the contents of src/entrypoints/popup/App.tsx with the following:
src/entrypoints/popup/App.tsx
import { useState } from "react";
import { Button } from "@/components/ui/button";

function App() {
 const [count, setCount] = useState(0);

 return (
 <div className="m-10">
 <Button onClick={() => setCount(count + 1)}>Count: {count}</Button>
 </div>
 );
}

export default App;

(The main goal is to remove the App.css import, but we also removed rest of the code and added a simple button to test the setup.)

  9. In the src/entrypoints/popup/main.tsx file, remove the style.css import and add the @/index.css import.

  10. Run pnpm dev to start the server. You should see the Shadcn button in the popup window. We can now use TailwindCSS and Shadcn components (adding via CLI will also work).

  11. Also, we can delete the src/entrypoints/popup/App.css file and the src/entrypoints/popup/style.css file as they are not needed.

Understanding The Project Structure

Our directory structure should look something like this:

.
├── components.json
├── node_modules
├── package.json
├── pnpm-lock.yaml
├── postcss.config.js
├── README.md
├── src
│   ├── assets
│   │   └── react.svg
│   ├── components
│   │   ├── button.tsx
│   │   └── ui
│   │       └── button.tsx
│   ├── entrypoints
│   │   ├── background.ts
│   │   ├── content.ts
│   │   └── popup
│   │       ├── App.tsx
│   │       ├── index.html
│   │       └── main.tsx
│   ├── index.css
│   ├── lib
│   │   └── utils.ts
│   └── public
│       ├── icon
│       │   ├── 128.png
│       │   :
│       └── wxt.svg
├── tailwind.config.js
├── tsconfig.json
└── wxt.config.ts

All our configuration files are in the project’s root, and all the source code in src/ directory. It is mostly similar to the one shown in the WXT documentation , with some differences due to adding Shadcn as follows:

  • components.json - Shadcn configuration file
  • components/ui/ - Shadcn components directory
  • index.css - Our global CSS file
  • tailwind.config.js - TailwindCSS configuration file
  • postcss.config.js - PostCSS configuration file

The main files we are interested in are the ones in the src/entrypoints/ directory. These files are the entrypoints for our extension. We have three entrypoints initially:

  1. The background.ts file is the entrypoint for the background script.
  2. The content.ts file is the entrypoint for the content script.
  3. The popup/ directory contains the entrypoint for the popup.

More information about these can be found here. As of writing this guide, we can have only one background script in WXT, but multiple content scripts.

On inspecting the contents of the background.ts file, we can see the following code:

background.ts
export default defineBackground(() => {
 console.log("Hello background!", { id: browser.runtime.id });
});

and in the content.ts file:

content.ts
export default defineContentScript({
 matches: ["*://*.google.com/*"],
 main() {
 console.log("Hello content.");
 },
});

Notice that the default exports are wrapped in defineBackground and defineContentScript functions. These functions are provided by WXT and are used to define the entrypoints for the background and content scripts. The matches property in the defineContentScript function is used to specify the domains where we want the content script to run (the user is notified that we need access to these URLs). In the above example, the content script will run only on *.google.com domains.

The content script and background scripts use different browser consoles for logging. To see the above log statements, run pnpm dev and wait for the browser to open. As the content script only matches *.google.com domains, it will load only on matching pages. Navigate to www.google.com and open the browser console by pressing F12. You should see the Hello content message in the console.
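To get a feel for how the host part of a match pattern such as *.google.com behaves, here is a simplified sketch. The browser performs the real matching; hostMatches is a hypothetical helper that only covers the host portion and ignores the scheme and path.

```typescript
// Hypothetical, simplified check for the host part of a match pattern.
// "*"             matches any host
// "*.example.com" matches example.com and any of its subdomains
// "example.com"   matches only that exact host
function hostMatches(hostPattern: string, hostname: string): boolean {
  if (hostPattern === "*") return true;

  if (hostPattern.startsWith("*.")) {
    const base = hostPattern.slice(2);
    return hostname === base || hostname.endsWith("." + base);
  }

  return hostname === hostPattern;
}

console.log(hostMatches("*.google.com", "www.google.com")); // true
console.log(hostMatches("*.google.com", "google.com")); // true
console.log(hostMatches("*.google.com", "notgoogle.com")); // false
```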

To see the log from background script:

  1. In Chrome / Chromium-based browsers, navigate to chrome://extensions/ and click on the service worker link for your extension.
  2. In Firefox, navigate to about:debugging#/runtime/this-firefox, click on the Inspect link for your extension and go to Console.

This should open a console where you should see the Hello background message along with the browser.runtime.id.

Both these consoles are useful for debugging and logging messages. For our UI components, we can find logs in the browser console, while for background scripts they will be in the service worker / extension’s developer console.

Configurations

The wxt.config.ts file is the main configuration file for WXT. It is used to configure the extension’s name, version, and other settings. The wxt.config.ts file in our project looks like this:

wxt.config.ts
import { defineConfig } from "wxt";

// See https://wxt.dev/api/config.html
export default defineConfig({
 srcDir: "src",
 extensionApi: "chrome",
 modules: ["@wxt-dev/module-react"],
});

In the first post, we discussed the manifest.json file, which contains the metadata for the extension. For our extension to run, we have to list our background scripts in it, as well as our content scripts along with the URLs we want to run them on. However, manually writing manifest.json is complex, and the property names differ between MV2 and MV3. WXT abstracts away the manifest file and generates it based on our project structure (recall defineBackground and defineContentScript). Let’s add the name and description of our project in the manifest:

wxt.config.ts
import { defineConfig } from "wxt";

export default defineConfig({
  srcDir: "src",
  extensionApi: "chrome",
  modules: ["@wxt-dev/module-react"],
  // Keys under `manifest` end up in the generated manifest.json
  manifest: {
    name: "Command Palette",
    description: "A command palette to quickly perform actions",
  },
});

Run the pnpm dev command. The name and description of the extension should now be reflected in the browser. In the .output/chrome-mv3/ or .output/firefox-mv2/ folder in your project’s root, you should see the manifest.json generated by WXT.

Note

WXT uses Manifest V3 for Chrome and Manifest V2 for Firefox by default. In wxt.config.ts, we only have to write the manifest keys in their MV3 form, and WXT will automatically convert them for whichever manifest version it is targeting.
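As a rough illustration of what such a key conversion involves, the sketch below maps two MV3 keys to their MV2 equivalents: action becomes browser_action, and host_permissions are folded into permissions. The toMv2 function, the interfaces, and the sample action and permission values are all hypothetical and written only for this example. This is not WXT's actual implementation, which handles many more keys.

```typescript
// A small subset of MV3 manifest keys, enough for the illustration.
interface Mv3Manifest {
  name: string;
  description?: string;
  action?: { default_title?: string; default_popup?: string };
  permissions?: string[];
  host_permissions?: string[];
}

// The corresponding MV2 shape: `browser_action` instead of `action`,
// and host patterns listed directly under `permissions`.
interface Mv2Manifest {
  manifest_version: 2;
  name: string;
  description?: string;
  browser_action?: { default_title?: string; default_popup?: string };
  permissions?: string[];
}

// Hypothetical converter, not WXT's code.
function toMv2(manifest: Mv3Manifest): Mv2Manifest {
  const result: Mv2Manifest = { manifest_version: 2, name: manifest.name };
  if (manifest.description !== undefined) {
    result.description = manifest.description;
  }
  if (manifest.action !== undefined) {
    result.browser_action = manifest.action;
  }
  const permissions = (manifest.permissions ?? []).concat(
    manifest.host_permissions ?? []
  );
  if (permissions.length > 0) {
    result.permissions = permissions;
  }
  return result;
}

// Sample input: name and description come from the post, the rest is assumed.
const mv3: Mv3Manifest = {
  name: "Command Palette",
  description: "A command palette to quickly perform actions",
  action: { default_title: "Open palette" },
  permissions: ["tabs"],
  host_permissions: ["*://*.google.com/*"],
};

const mv2 = toMv2(mv3);
console.log(JSON.stringify(mv2, null, 2));
```

Comparing the generated manifest.json files in .output/chrome-mv3/ and .output/firefox-mv2/ is a good way to see which keys WXT really rewrites for your project.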

Auto Imports And Extension APIs

Notice that there is no import for defineBackground in the background script or defineContentScript in the content script. By default, WXT auto-imports its own APIs, as well as exports from a few source directories. The one we will use most is the browser global variable. It gives us access to the various browser APIs such as browser.tabs, browser.bookmarks, and many more.

Chrome uses the chrome global variable instead of browser. WXT abstracts this away and provides a consistent API across browsers, so we can use the browser global everywhere. However, we still have to check whether an API is available at runtime, as WXT assumes all APIs exist in all browsers.
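One way to do such an availability check is plain feature detection before calling an API. The sketch below is a minimal, framework-independent version: hasApi is a hypothetical helper, and fakeBrowser is a stand-in object used so the example runs outside an extension. Inside a real extension you would pass the browser global instead.

```typescript
// Hypothetical helper: walks a dotted path such as "tabs.query" on `root`
// and reports whether every segment exists.
function hasApi(root: unknown, path: string): boolean {
  let current: unknown = root;
  for (const key of path.split(".")) {
    const isContainer =
      current !== null &&
      (typeof current === "object" || typeof current === "function");
    if (!isContainer) return false;
    current = (current as Record<string, unknown>)[key];
    if (current === undefined) return false;
  }
  return true;
}

// Stand-in for the `browser` global (assumption for this example):
// it exposes `tabs` and `runtime`, but no `bookmarks` namespace.
const fakeBrowser = {
  tabs: { query: () => [] as string[] },
  runtime: { id: "example-extension-id" },
};

if (hasApi(fakeBrowser, "bookmarks.search")) {
  console.log("bookmarks API available");
} else {
  console.log("bookmarks API missing, skipping bookmark commands");
}
```

Guarding optional features this way lets the same code run in browsers where a particular namespace is absent, instead of failing with a TypeError at the first call.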

For reference, MDN Web Docs lists all available Firefox APIs (along with compatibility details for other browsers), and the Chrome Developer Docs list the Chrome APIs.

In the next post, we will discuss content scripts and the UI for our extension.

https://aabidk.dev/blog/building-modern-cross-web-extensions-project-setup/
Building Modern Cross Browser Web Extensions: Introduction (Part 1)

If you are already familiar with web extensions and want to jump directly to the implementation, you can skip this post and go to the next one.

Introduction

Extensions have become an integral part of modern browsers. While web development has evolved vastly over the last few years, development of extensions has mostly remained the same. This multi-post series will explore the development of cross browser web extensions using modern tools and frameworks.

Why Build Web Extensions?

Web extensions offer a unique blend of platform independence, a large user base, and feature rich APIs. They have become as significant as mobile apps in today’s world. Following are some of the reasons why you should consider building web extensions:

  • Platform Independence: Web extensions are built using web technologies, which makes them platform independent. With modern tools, you can build an extension once and deploy it across multiple browsers.

  • Large User Base: Browsers have a large user base, which means your extension can reach a large number of users. Chrome itself has over 3 billion active users.

  • Feature Rich APIs: Browsers provide a rich set of APIs that can be used to build powerful extensions. You get a well-established and robust platform - no need to worry about the backend or the infrastructure.

  • Persistent Availability: Extensions are installed by the user and stay with them (even across devices). This means you can build a personalized experience for the user. This is roughly equivalent to an installed desktop app or a mobile app.

  • Browser Integration: Browser extensions can act as a window to your core software. Todoist and Grammarly are really good examples of this - users can easily access functionalities of the core software from the browser itself.

  • Monetization: Extensions can be monetized in various ways — from ads to premium features, or as a paid addon to the core software.

Development Challenges

Extensions have been around for a long time, but the development process has not changed much. Unsurprisingly, most of the material on the web about extension development focuses on building from scratch. This is not necessarily a bad thing, but it makes things difficult down the road. While it may work for simple extensions, building any sufficiently complex extension can quickly become unmanageable without better tools and practices.

The transition between manifest versions adds complexity. Chrome’s push toward Manifest V3 requires significant architectural changes - from replacing background scripts with service workers to adapting to stricter API limitations. With Firefox maintaining Manifest V2 support, developers must maintain compatibility across different specifications.
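To give a sense of what this difference looks like in the manifest itself, the sketch below builds the background entry for each manifest version: MV2 lists background scripts, while MV3 names a single service worker file. The backgroundEntry function and the background.js file name are hypothetical and only illustrate the shape of the two formats.

```typescript
type ManifestVersion = 2 | 3;

// MV2: background pages are declared as a list of scripts.
interface BackgroundMv2 {
  scripts: string[];
}

// MV3: the background logic runs in a single service worker file.
interface BackgroundMv3 {
  service_worker: string;
}

// Hypothetical helper that returns the `background` manifest entry
// for the requested manifest version.
function backgroundEntry(
  version: ManifestVersion,
  file: string
): BackgroundMv2 | BackgroundMv3 {
  return version === 3 ? { service_worker: file } : { scripts: [file] };
}

console.log(JSON.stringify(backgroundEntry(2, "background.js")));
console.log(JSON.stringify(backgroundEntry(3, "background.js")));
```

The key name is only the visible part of the change: a service worker can be stopped when idle and has no DOM, so code written for a long-lived MV2 background page often needs restructuring as well.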

Browser-specific implementations create additional hurdles. While most APIs are similar across the popular browsers, some differ in behavior or availability, and each extension store has its own review process. An extension working perfectly in Chrome might need substantial modifications for Firefox, while Safari’s extension support is more limited. Cross-browser support is often not considered from the start, which either makes the transition difficult or requires maintaining multiple separate versions.

Besides these, conveniences like hot reloading and automated publishing workflows are often missing from traditional extension development.

Moving Forward With Modern Frameworks

Several modern frameworks are available for building extensions. These include CRXJS, which takes a Vite plugin approach and supports React, Solid, Vue and vanilla JavaScript. However, it requires more manual configuration than the other available options, and provides just enough tooling to get started.

Plasmo offers a more opinionated approach, with out-of-the-box support for React, TypeScript and other frameworks. It has a long list of features and is a good option to check out.

WXT, a relatively new framework, is quickly gaining popularity. It provides a good developer experience, with some wrappers around core browser APIs. We will be building our extension using WXT in this series. A comparison of these three can be found here.

While each framework offers its own advantages, WXT provides the best balance of features and flexibility for our needs. Its comprehensive tooling, framework-agnostic approach, and growing ecosystem make it an excellent choice for modern extension development. Let’s first understand some key terminology before diving into implementation.

Understanding The Terminology

There are a few terms that you should be familiar with before we dive into building extensions:

1. Manifest

The manifest file contains the configuration and metadata of the extension. It includes the name, version, permissions, and other details of the extension. It is the only required file, and must have the name manifest.json. MV2 (Manifest V2) and MV3 (Manifest V3) are successive versions of the browser extension manifest format, with MV3 being the latest iteration, introducing significant changes in security, permissions, and functionality.

2. Background Script / Service Worker

This is a script that runs in the background and can be used to listen to events, make network requests, and perform other tasks. Background scripts have access to sensitive browser APIs, although they do not have direct access to the DOM. The term ‘service worker’ is used with Manifest V3, while ‘background script’ is used with Manifest V2.

3. Content Script

Content scripts run in the context of web pages, as if a script were injected into the page itself. They can manipulate the DOM and interact with the page, but have limited access to browser APIs, so they have to use message passing to communicate with the rest of the extension. One important point to note is that content scripts run in an ‘isolated world’ by default (they can also run in the ‘main’ world), meaning the JavaScript environments of the page and the extension are separate. Thus, the web page and the content scripts of each extension run in isolation and cannot access each other’s context.

4. Popup

The popup is a small window that appears when the user clicks on the extension icon. It can be used to display information, settings, or other UI elements.

5. Options Page and Browser Action

The options page is a full HTML page that is used to display the settings of the extension. It can be used to configure the extension, set preferences, etc. Browser Action (called simply action in Manifest V3) is the button that the extension adds to the browser toolbar. It can be used to trigger actions, open the popup, etc.

Both Chrome and Firefox have excellent documentation on building extensions. You can refer to the Chrome Extension Docs and Firefox Extension Docs for more details.

Anatomy Of An Extension

The following diagram shows the anatomy of an extension:

Anatomy of an extension

It illustrates how the different components of an extension interact with each other, as previously discussed.
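As a rough illustration of one such interaction, the sketch below simulates message passing between a content script and a background script. FakeRuntime is a toy in-memory stand-in written for this example, and the COUNT_TABS message type and the tab list are hypothetical. In a real extension the two sides live in separate contexts and communicate asynchronously through browser.runtime.sendMessage and browser.runtime.onMessage, which this simulation only imitates.

```typescript
type Message = { type: string; payload?: unknown };
type SendResponse = (response: unknown) => void;
type Listener = (message: Message, sendResponse: SendResponse) => void;

// Toy stand-in for the extension runtime: delivers each message to every
// registered listener synchronously and in memory.
class FakeRuntime {
  private listeners: Listener[] = [];

  addListener(listener: Listener): void {
    this.listeners.push(listener);
  }

  sendMessage(message: Message, onResponse: SendResponse): void {
    for (const listener of this.listeners) {
      listener(message, onResponse);
    }
  }
}

const runtime = new FakeRuntime();

// "Background" side: has access to privileged data (here a fake tab list)
// and answers requests from other parts of the extension.
const openTabs = ["https://example.com/", "https://www.google.com/"];
runtime.addListener((message, sendResponse) => {
  if (message.type === "COUNT_TABS") {
    sendResponse(openTabs.length);
  }
});

// "Content script" side: cannot read the tab list directly,
// so it asks the background side through a message.
let tabCount: unknown;
runtime.sendMessage({ type: "COUNT_TABS" }, (response) => {
  tabCount = response;
});

console.log("Open tabs reported by background:", tabCount);
```

The point of the pattern is the separation of privileges: the content script only sees what the background script chooses to send back.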

In the next few posts, we will explore:

  • Project setup using WXT, TailwindCSS and Shadcn and understanding the project structure and configuration
  • Content Scripts and building isolated UIs for the extension
  • Background scripts and messaging
  • Storage and Permissions

We will build an extension along the way using the concepts as we learn.

https://aabidk.dev/blog/building-modern-cross-web-extensions-introduction/