Sitemaps are vital for any website that seeks to rank well on search engines and bring in organic traffic.
A sitemap is an index of all the pages on your website, so search engines can easily crawl them: creating one is a simple (yet important) technique to improve your website's SEO.
Next.js does not have a built-in way to create a sitemap, so this article will show you how to make a sitemap for your Next.js app using a popular package: next-sitemap. We will also show you how to decide whether a page should be indexed, so you can make the most of your website's crawl budget.
What is a Sitemap?
Sitemaps are a way for search engines to find all of the pages on your website. They aren't required for search engines to index your site or for your site to be found. Still, they can help increase the number of pages indexed by search engines and how often those pages appear in search engine results pages, especially if your website's navigation and hierarchy aren't optimized.
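For context, a sitemap is simply an XML file that lists your website's URLs, optionally with metadata such as the last modification date. A minimal example (with a placeholder domain) looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourwebsite.com/</loc>
    <lastmod>2023-06-01</lastmod>
  </url>
  <url>
    <loc>https://yourwebsite.com/blog</loc>
  </url>
</urlset>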
Do you want to learn more about SEO? Take a look at our article on improving your Next.js website's SEO ranking.
Make a Sitemap for a Next.js website
To make a sitemap for a Next.js website, we will use the open source package next-sitemap.
To get started, install the package with the following command:
npm i next-sitemap -D
Now we need to create a configuration file. To do so, create a file named next-sitemap.config.js in your root folder (i.e., next to package.json) and add the following content:
const SITE_URL = process.env.SITE_URL || 'https://yourwebsite.com';

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl: SITE_URL,
  generateRobotsTxt: true,
};

module.exports = config;
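Since generateRobotsTxt is enabled, next-sitemap will also write a robots.txt file to your public folder when it runs. Assuming the default options, the generated file should look something like the following (the exact comments may vary between versions):

# *
User-agent: *
Allow: /

# Host
Host: https://yourwebsite.com

# Sitemaps
Sitemap: https://yourwebsite.com/sitemap.xml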
Do not index "thin content" to preserve your search engine crawl budget
Search engines do not index all of your website's content, especially if you have many pages.
SEO professionals talk about a "crawl budget": the amount of crawling that search engines allocate to your website. If your website has too much content, search engines may spend that budget on "thin content" or low-quality pages instead of the high-quality content you should prioritize.
Furthermore, we want our pages to be as lightweight as possible to save bandwidth for Google's bots, which will then be able to index more of our content.
To do so, we want to add a few directives to our robots.txt file and tell Google's bots not to crawl certain URLs, such as Next.js's preloaded routes (in the form of json and js files).
Let's see how to do it.
Let's create a series of paths that tell Google's bots not to crawl the matching URLs:
// Save crawling budget by not fetching SSG meta files
const NEXT_SSG_FILES = [
  '/*.json$',
  '/*_buildManifest.js$',
  '/*_middlewareManifest.js$',
  '/*_ssgManifest.js$',
  '/*.js$',
];

// extend the configuration
const config = {
  siteUrl: SITE_URL,
  generateRobotsTxt: true,
  robotsTxtOptions: {
    policies: [
      {
        userAgent: '*',
        disallow: NEXT_SSG_FILES,
      },
    ],
  },
};
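With this policy in place, the User-agent section of the generated robots.txt should contain one Disallow line per pattern, roughly like the following (the Host and Sitemap lines are still appended below it):

# *
User-agent: *
Disallow: /*.json$
Disallow: /*_buildManifest.js$
Disallow: /*_middlewareManifest.js$
Disallow: /*_ssgManifest.js$
Disallow: /*.js$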
Exclude thin content and low-quality pages
Additionally, you should ensure not to include any page that can be classified as "duplicate" or "thin content," such as directory pages or pages with little or no useful content. Among these, I include "categories," "tags," and other directory pages that, while helpful to the website's hierarchy, ultimately do not add any value for search engines.
You also want to exclude your application's internal pages from the sitemap, i.e., any page gated by authentication. This is a step not to be missed in applications like Makerkit, where the marketing website and the application live on the same domain and in the same codebase.
To do so, we add the exclude field with a list of paths that should not be included in the sitemap.
These are the ones the Makerkit starter uses:
// add your private routes here
const exclude = [
  '/dashboard*',
  '/settings*',
  '/onboarding*',
  '/blog/tags*',
  '/auth*',
];

/** @type {import('next-sitemap').IConfig} */
const config = {
  exclude,
  // ...the configuration above
};

module.exports = config;
The final script will look like the one below:
const SITE_URL = process.env.SITE_URL || 'https://yourwebsite.com';

// add your private routes here
const exclude = [
  '/dashboard*',
  '/settings*',
  '/onboarding*',
  '/blog/tags*',
  '/auth*',
];

// Save crawling budget by not fetching SSG meta files
const NEXT_SSG_FILES = [
  '/*.json$',
  '/*_buildManifest.js$',
  '/*_middlewareManifest.js$',
  '/*_ssgManifest.js$',
  '/*.js$',
];

/** @type {import('next-sitemap').IConfig} */
const config = {
  siteUrl: SITE_URL,
  generateRobotsTxt: true,
  exclude,
  robotsTxtOptions: {
    policies: [
      {
        userAgent: '*',
        disallow: NEXT_SSG_FILES,
      },
    ],
  },
};

module.exports = config;
Remember: you will need to update the paths with the ones from your application!
Generating the sitemap at build time
Now that our configuration is complete, we want to add a script to the package.json.
The script generates the sitemap after building the Next.js application, as this package inspects the final build output to understand your website's structure.
To do so, let's add the following scripts:
{
  "scripts": {
    "postbuild": "npm run sitemap",
    "sitemap": "next-sitemap",
    // other scripts here
  }
}
The above will:
- create a sitemap script that calls the next-sitemap package
- call that script automatically after every build, thanks to the postbuild hook
The script will generate the sitemap and the robots.txt files in the public folder; therefore, they will be served from the root of your website.
For example, assuming your website's URL is https://mygreatsaas123.com, you will find:
- the sitemap at https://mygreatsaas123.com/sitemap.xml
- the robots.txt at https://mygreatsaas123.com/robots.txt
Do not forget to submit your sitemap through the Google Search Console so that Google learns about your website right away.
Why did it generate a sitemap index?
A sitemap index is a newer way to point search engines to your sitemaps when you have many URLs to index:
- the index contains a list of the sitemaps
- each sitemap contains a list of URLs on your website
When you submit the sitemap to Google, only submit the sitemap index: Google will automatically crawl the URLs within each sitemap it lists.
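For reference, the sitemap index is itself a small XML file that points to the individual sitemaps; next-sitemap typically names the chunks sitemap-0.xml, sitemap-1.xml, and so on:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://mygreatsaas123.com/sitemap-0.xml</loc>
  </sitemap>
</sitemapindex>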
Makerkit is an SEO-ready SaaS template built with Next.js and Firebase
Of course, we optimized our Next.js starter template to automatically generate your sitemap and robots.txt, applying all the optimization techniques listed above (and more!) from the start.
If you're looking for a SaaS starter for your Next.js application, give it a try!