Search engines rely on automated programs called 'crawlers' to discover and understand your website's content. These crawlers look for a special file called robots.txt that provides them with instructions about which parts of your site they can access and index.
However, if you're trying to create or modify this file in Webflow, you've probably noticed that there's no direct way to do it - Webflow doesn't provide built-in functionality for hosting custom files like robots.txt.
In a significant change announced on July 1st, 2019, Google stated it would stop supporting the unofficial noindex directive in robots.txt for controlling page indexing. Instead, Google recommends adding a meta noindex tag (for example, <meta name="robots" content="noindex">) directly to any page you want to keep out of search results.
While robots.txt files can still be used for other important purposes, such as managing crawler access to certain sections of your site or specifying your sitemap location, implementing them requires some technical workarounds in Webflow.
If you need these functionalities for proper crawler management, or want to ensure search engines interact correctly with your site's resources, we'll explore how to set this up on Webflow using Cloudflare Workers.
Since Webflow doesn't allow direct file uploads, we'll need to use a reverse proxy to serve a custom robots.txt file.
Think of a reverse proxy as a traffic director that can modify how specific URLs on your site are handled. In this case, we'll use Cloudflare Workers to create a system where most requests go to your Webflow site, but requests for robots.txt will be handled differently.
Create a Cloudflare account:
Configure your DNS settings:
Update your nameservers:
Access Workers & Pages in Cloudflare:
Understanding the Worker setup
The Worker we're creating will perform three important tasks: serve your custom robots.txt whenever /robots.txt is requested, proxy every other request to your Webflow site, and rewrite the Webflow staging URL to your canonical domain in the returned HTML.
Set up the Worker code:
/**
 * Webflow Custom robots.txt Handler with Canonical URL Support
 * Serves a custom robots.txt file while proxying all other requests to Webflow
 * @author BRIX Templates
 * @version 1.0.0
 */

// Replace these with your site URLs
const WEBFLOW_SITE = 'https://your-site.webflow.io';
const CANONICAL_SITE = 'https://www.yourdomain.com'; // Include www if desired

// Your robots.txt content
const ROBOTS_TXT_CONTENT = `
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /account/
Disallow: /checkout/
Disallow: /cart/
Disallow: /order-confirmation/

Sitemap: ${CANONICAL_SITE}/sitemap.xml
`;

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  const url = new URL(request.url);

  // Serve the custom robots.txt
  if (url.pathname === '/robots.txt') {
    return new Response(ROBOTS_TXT_CONTENT, {
      headers: {
        'content-type': 'text/plain',
        'cache-control': 'public, max-age=3600'
      }
    });
  }

  // Forward every other request to Webflow
  const modifiedRequest = new Request(
    WEBFLOW_SITE + url.pathname + url.search,
    request
  );

  // Get the response from Webflow
  const response = await fetch(modifiedRequest);

  // Only rewrite HTML responses; pass images, CSS, fonts and other assets through untouched
  const contentType = response.headers.get('content-type') || '';
  if (!contentType.includes('text/html')) {
    return response;
  }

  // Replace references to the Webflow staging URL (e.g. in canonical tags) with your live domain
  const html = await response.text();
  const updatedHtml = html.replace(
    new RegExp(WEBFLOW_SITE, 'g'),
    CANONICAL_SITE
  );

  // Return the modified response with correct canonical URLs
  return new Response(updatedHtml, {
    headers: response.headers,
    status: response.status,
    statusText: response.statusText
  });
}
Once you've pasted the code, pay close attention to the following settings, as they're critical for the Worker to function correctly with your domain and your Webflow site.
Canonical URL setup:
Configuration settings:
Robots.txt configuration:
Finally, once you've set the CANONICAL_SITE and WEBFLOW_SITE URLs, you can edit the robots.txt content just below them (the ROBOTS_TXT_CONTENT constant, starting on line 13 of the script). Feel free to add any directives you need here, as in the sketch below.
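For instance, a customized ROBOTS_TXT_CONTENT might look like the following sketch if you also wanted to block an internal search path and keep a specific crawler out entirely (the /search/ path and the GPTBot block are illustrative placeholders, not part of the original script - adjust them to your own site):

const ROBOTS_TXT_CONTENT = `
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /search/

User-agent: GPTBot
Disallow: /

Sitemap: ${CANONICAL_SITE}/sitemap.xml
`;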
Before clicking Deploy and continuing to the next step, test several URLs in the Preview sidebar to verify that regular page URLs load your Webflow content as expected, while /robots.txt returns the custom content you defined above.
The final step is to set up the Cloudflare Worker route, typically a route pattern such as www.yourdomain.com/* pointing to the Worker you just created:
Monitor and maintain your setup:
Finally, it's a good idea to run a quick check every 2-4 weeks to confirm that robots.txt is still being served correctly and that your pages continue to load through the Worker.
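If you'd like to automate that check, here's a minimal sketch of a verification script, assuming Node.js 18+ (which includes the built-in fetch API) and using placeholder URLs you'd replace with your own domain and Webflow staging URL:

// check-worker.js - quick sanity check for the Worker setup (requires Node.js 18+)
// Replace these placeholders with your own URLs
const CANONICAL_SITE = 'https://www.yourdomain.com';
const WEBFLOW_SITE = 'https://your-site.webflow.io';

async function main() {
  // 1. robots.txt should be served by the Worker with your custom content
  const robots = await fetch(`${CANONICAL_SITE}/robots.txt`);
  console.log('robots.txt status:', robots.status);
  console.log(await robots.text());

  // 2. Rendered pages should no longer reference the Webflow staging URL
  const home = await fetch(CANONICAL_SITE);
  const html = await home.text();
  console.log(
    html.includes(WEBFLOW_SITE)
      ? 'Warning: the staging URL still appears in the HTML'
      : 'OK: no staging URL found in the HTML'
  );
}

main();

Run it with node check-worker.js whenever you want to confirm the setup is still intact.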
Setting up a custom robots.txt file through Cloudflare Workers involves several technical steps and careful configuration. If you're not comfortable with this process or need assistance ensuring it's set up correctly, our team of Webflow experts at BRIX Templates is here to help. We can assist with implementation, testing, and optimization to ensure your site's crawler directives are working as intended.