Last updated on: Dec 10, 2024

How to create a robots.txt file in Webflow

Author: BRIX Templates

Search engines rely on automated programs called 'crawlers' to discover and understand your website's content. These crawlers look for a special file called robots.txt that provides them with instructions about which parts of your site they can access and index.
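
For illustration, a very small robots.txt is just a handful of plain-text directives; the folder and domain below are placeholders:

User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml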

However, if you're trying to create or modify this file in Webflow, you've probably noticed that there's no direct way to do it - Webflow doesn't provide built-in functionality for hosting custom files like robots.txt.

Google no longer relies on robots.txt for indexing control

In a significant change announced on July 1st, 2019, Google stated that it would stop supporting the unofficial noindex directive in robots.txt files. Instead, Google recommends using the meta robots noindex tag directly in your pages when you need to keep specific content out of search results.
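
For reference, that tag looks like this and goes in the <head> of any page you want kept out of search results (in Webflow it can be added through the page's custom code head section):

<!-- Keeps this page out of Google's index while still allowing it to be crawled -->
<meta name="robots" content="noindex">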

While robots.txt files can still be used for other important purposes, such as managing crawler access to certain sections of your site or specifying your sitemap location, implementing them requires some technical workarounds in Webflow.

If you need these capabilities for proper crawler management, or want to ensure search engines interact correctly with your site's resources, we'll explore how to set this up on Webflow using Cloudflare Workers.

Creating a robots.txt file in Webflow using Cloudflare Workers

Since Webflow doesn't allow direct file uploads, we'll need to use a reverse proxy to serve a custom robots.txt file.

How a reverse proxy works on Webflow and why it's relevant to robots.txt

Think of a reverse proxy as a traffic director that can modify how specific URLs on your site are handled. In this case, we'll use Cloudflare Workers to create a system where most requests go to your Webflow site, but requests for robots.txt will be handled differently.

Set up Cloudflare for your domain

Create a Cloudflare account:

  1. Go to cloudflare.com
  2. Sign up for a new account if you don't have one
  3. Click "Add a Site" in your dashboard
  4. Enter your website's domain name
  5. Select the Free plan (unless you need additional features)
Image: adding your domain to Cloudflare to set up the Webflow robots.txt file

Configure your DNS settings:

  1. After entering your domain, Cloudflare will scan your existing DNS records
  2. Review each record to ensure it matches your current setup
  3. Add any missing records
  4. Click "Continue"

Update your nameservers:

  1. Cloudflare will provide you with two nameserver addresses
  2. Log into your domain registrar's website (like GoDaddy or Namecheap)
  3. Find the nameserver settings
  4. Replace the current nameservers with Cloudflare's nameservers
  5. Save your changes
  6. Wait for the changes to take effect globally (propagation usually completes in under an hour, but allowing 24-48 hours is the safe recommendation); a quick way to check is shown below the image
Image: configuring Cloudflare nameservers for your Webflow site
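
If you'd like to confirm the nameserver change has propagated rather than waiting blindly, a short Node.js check works; 'yourdomain.com' below is a placeholder for your own domain:

// check-nameservers.js - run with: node check-nameservers.js
const { resolveNs } = require('node:dns/promises');

resolveNs('yourdomain.com').then(nameservers => {
    // Once propagation is complete, these should be the two *.ns.cloudflare.com hosts
    // that Cloudflare assigned to your account
    console.log(nameservers);
});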

Create and configure the Worker

Access Workers & Pages in Cloudflare:

  1. Log into your Cloudflare dashboard
  2. Click on "Workers & Pages" in the left sidebar
  3. Select "Create a Worker"
  4. Choose "Create a Service"
  5. Give your service a name (e.g., "robots-txt-handler")
Image: creating the Worker in Cloudflare for the Webflow robots.txt file

Understanding the Worker setup

The Worker we're creating will perform three important tasks:

  1. Serve a custom robots.txt file when requested
  2. Forward all other requests to your Webflow site
  3. Ensure proper canonical URLs are maintained for SEO

Set up the Worker code:

  1. Click "Deploy" to create a new Worker
  2. Select "Edit code"
  3. Replace the default code with our robots.txt handler
/**
 * Webflow Custom robots.txt Handler with Canonical URL Support
 * Serves a custom robots.txt file while proxying all other requests to Webflow
 * @author BRIX Templates
 * @version 1.0.0
 */

// Replace these with your site URLs
const WEBFLOW_SITE = 'https://your-site.webflow.io';
const CANONICAL_SITE = 'https://www.yourdomain.com'; // Include www if desired

// Your robots.txt content
const ROBOTS_TXT_CONTENT = `
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /account/
Disallow: /checkout/
Disallow: /cart/
Disallow: /order-confirmation/
Sitemap: ${CANONICAL_SITE}/sitemap.xml
`;

addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
    const url = new URL(request.url);

    // Serve custom robots.txt
    if (url.pathname === '/robots.txt') {
        return new Response(ROBOTS_TXT_CONTENT, {
            headers: {
                'content-type': 'text/plain',
                'cache-control': 'public, max-age=3600'
            }
        });
    }

    // Forward request to Webflow
    const modifiedRequest = new Request(
        WEBFLOW_SITE + url.pathname + url.search,
        request
    );

    // Get the response from Webflow
    const response = await fetch(modifiedRequest);

    // Only rewrite text-based responses (HTML, CSS, XML, sitemaps);
    // reading binary assets such as images as text would corrupt them
    const contentType = response.headers.get('content-type') || '';
    if (!contentType.includes('text') && !contentType.includes('xml')) {
        return response;
    }

    // Replace the Webflow staging URL with the canonical domain
    // (canonical tags, Open Graph URLs, internal links, etc.)
    const html = await response.text();
    const updatedHtml = html.replace(
        new RegExp(WEBFLOW_SITE, 'g'),
        CANONICAL_SITE
    );

    // Return modified response with correct canonical URLs
    return new Response(updatedHtml, {
        headers: response.headers,
        status: response.status,
        statusText: response.statusText
    });
}

Important notes about the code

Once you've pasted the code, pay close attention to the following settings; they are critical for the Worker to function properly with your domain and your Webflow site.

Canonical URL setup:

  1. Set CANONICAL_SITE to your actual domain (e.g., 'https://www.yourdomain.com' or 'https://yourdomain.com')
  2. Choose whether to include 'www' based on your preferred canonical URL format; ideally keep the format you already use so that pages aren't re-indexed under a new canonical URL
  3. The code automatically replaces all Webflow URLs with your canonical domain

Configuration settings:

  1. Replace WEBFLOW_SITE with your Webflow-hosted site URL.
  2. For this, you need to connect your Webflow project to a custom domain that is different from the one your visitors use. For example, if your site is 'yourdomain.com', you could set up 'website.yourdomain.com' or even a separate domain like 'yourdomainwebsite.com'
  3. Remember this domain is only used for the connection: Webflow doesn't allow reverse proxies to point at webflow.io domains, so we need an actual custom domain for the routing, even though it won't be used at the front-end level (see the sketch below)
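
Putting those two notes together, and using the hypothetical 'website.yourdomain.com' subdomain from the example above, the two constants at the top of the Worker would look something like this (both values are placeholders for your own domains):

// Custom domain connected to Webflow purely for the proxy connection (never shown to visitors)
const WEBFLOW_SITE = 'https://website.yourdomain.com';

// The domain visitors and search engines should actually see
const CANONICAL_SITE = 'https://www.yourdomain.com';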

Robots.txt configuration:

Finally, once you've set the CANONICAL_SITE and WEBFLOW_SITE URLs, you can edit your desired robots.txt content just below them (starting on line 13 of the script). Feel free to add any directives you need there, as in the example below.
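
As an example of the kind of directives you might add, the snippet below keeps one specific crawler out of an internal search folder while leaving everything else open; the bot name and path are purely illustrative:

User-agent: *
Allow: /

# Illustrative only: block one crawler from internal search result pages
User-agent: ExampleBot
Disallow: /search/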

Before clicking Deploy and continuing to the next step, it is recommended to test multiple URLs in the Preview sidebar to verify that all of your site's URLs are served correctly through the proxy, with the exception of /robots.txt, which should return the content you added.

Connect the robots.txt Cloudflare Worker to your domain

The final step is to set up the Cloudflare Worker route:

  1. Go to your website's Cloudflare dashboard
  2. Navigate to Workers & Pages
  3. Click "Add route"
  4. In the "Route" field, enter: 'yourdomain.com/*'
  5. Select your Worker from the dropdown
  6. Click "Save"
  7. Test your setup:
    • Wait a few minutes for the changes to take effect
    • Visit yourdomain.com/robots.txt in your browser
    • You should see your custom robots.txt content
    • Visit other pages on your site to ensure they load normally
Image: configuring the Cloudflare Worker route for Webflow
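
If you prefer to verify from the command line instead of the browser, a short Node.js (18+) script along these lines covers both checks; the two domains are placeholders, and the same script can be re-run for the periodic checks described below:

// verify-proxy.js - run with: node verify-proxy.js
const CANONICAL_SITE = 'https://www.yourdomain.com';    // your live domain (placeholder)
const WEBFLOW_SITE = 'https://website.yourdomain.com';  // staging domain used by the Worker (placeholder)

async function verify() {
    // 1. /robots.txt should return the custom content served by the Worker
    const robots = await fetch(`${CANONICAL_SITE}/robots.txt`);
    console.log('robots.txt status:', robots.status);
    console.log(await robots.text());

    // 2. A regular page should load through the proxy with its URLs rewritten,
    //    so no references to the staging domain should remain in the HTML
    const page = await fetch(CANONICAL_SITE);
    const html = await page.text();
    console.log('homepage status:', page.status);
    console.log('staging URLs remaining:', html.includes(WEBFLOW_SITE));
}

verify();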

Monitor and maintain your setup:

Finally, it is recommended to do regular checks every 2-4 weeks to ensure that everything is working as expected.

  • Ensure your canonical URLs are displaying correctly (you can use the free Detailed SEO Chrome extension)
  • Check that all pages load properly through the proxy
  • Verify your robots.txt is still accessible and returns your custom content

Need a hand from our Webflow team?

Setting up a custom robots.txt file through Cloudflare Workers involves several technical steps and careful configuration. If you're not comfortable with this process or need assistance ensuring it's set up correctly, our team of Webflow experts at BRIX Templates is here to help. We can assist with implementation, testing, and optimization to ensure your site's crawler directives are working as intended.

About BRIX Templates

At BRIX Templates we craft beautiful, modern and easy to use Webflow templates & UI Kits.

Explore our Webflow templates