Today we’re going to take on automating the submission of sitemap.xml for all of our websites.

We’ll use the following tools:

  • Google Search Console API - docs
  • Google Sheets API
  • Google Authentication
  • Ruby

Although we’re using Ruby here, you can easily implement almost exactly the same flow using Python.

Our code will have several routines that we’ll split into separate files for readability and ease of maintenance.

In general, we’ll need to take the following steps:

  • Authenticate
  • Retrieve the list of all websites
  • Submit all the sitemaps

The list of websites doesn’t need to be maintained in a Google sheet. Depending on how many you maintain and how frequently that list is changing, you could keep them in a JSON file or even hard-code them, that’s up to you (feel free to leave that part of the code out).

For clarity, here is the code hierarchy for the little tool we’re building:

submit_sitemap_tool/
    lib/
        auth.rb
        fetch.rb
        submit.rb
    submit_sitemap_tool.rb

In this file hierarchy, the main script is submit_sitemap_tool.rb, which imports the necessary modules and classes from the lib directory to authenticate with the Google API, fetch the list of domains from the Google Sheet, and submit the sitemap to Google for each domain, respectively.

Authentication

In our auth.rb, we implement the following:

require 'googleauth'

module SubmitSitemapTool
  class Auth
    SCOPE = 'https://www.googleapis.com/auth/webmasters https://www.googleapis.com/auth/spreadsheets.readonly'

    SERVICE_ACCOUNT_FILE = 'path/to/service_account.json'

    def self.authenticate
      credentials = Google::Auth::ServiceAccountCredentials.make_creds(
        json_key_io: File.open(SERVICE_ACCOUNT_FILE),
        scope: SCOPE
      )
      credentials.fetch_access_token!
      credentials
    end
  end
end

Fetch the list of domains

require 'google/apis/sheets_v4'

module SubmitSitemapTool
  class Fetch
    SPREADSHEET_ID = 'your-spreadsheet-id-here'
    RANGE_NAME = 'Sheet1!A1:A10'

    def self.fetch_domains(service)
      result = service.get_spreadsheet_values(SPREADSHEET_ID, RANGE_NAME)
      domains = result.values
      domains
    end
  end
end

Submit the sitemaps to Google Search Console

require 'google/apis/webmasters_v3'

module SubmitSitemapTool
  class Submit
    def self.submit_sitemaps(creds, domains)
      webmasters_service = Google::Apis::WebmastersV3::WebmastersService.new
      webmasters_service.authorization = creds
      domains.each do |domain|
        url = "https://#{domain[0]}/sitemap.xml"
        begin
          webmasters_service.submit_sitemap(domain[0], url)
          puts "Sitemap submitted successfully for #{domain[0]}"
        rescue Google::Apis::ClientError => e
          puts "Error submitting sitemap for #{domain[0]}: #{e.message}"
        end
      end
    end
  end
end

And all this is driven by the code in submit_sitemap_tool.rb:

require 'googleauth'
require 'google/apis/sheets_v4'
require 'google/apis/webmasters_v3'
require_relative 'lib/auth'
require_relative 'lib/fetch'
require_relative 'lib/submit'

def main
  creds = SubmitSitemapTool::Auth.authenticate

  # Fetch domains from Google Sheet
  service = Google::Apis::SheetsV4::SheetsService.new
  service.authorization = creds
  domains = SubmitSitemapTool::Fetch.fetch_domains(service)

  # Submit sitemap to Google for each domain
  SubmitSitemapTool::Submit.submit_sitemaps(creds, domains)
end

if __FILE__ == $0
  main
end

For clarity, I’m including my Gemfile here that brings in the dependencies, in case you don’t have them installed yet. You can bring those in with a simple bundle install.

source 'https://rubygems.org'

gem 'google-api-client'
gem 'googleauth'
gem 'google-cloud-storage'
gem 'minitest'

Few more notes on the Authentication

Both Google Sheets API and Google Search Console API require authentication. I chose to use Service Account credentials, but you can also use OAuth Client ID, if you’d like.

Here are step-by-step instructions to set up the Google Sheets API and obtain the credentials.json file:

  • Go to the Google Cloud Console.
  • If you don’t have a project yet, create a new project by clicking the “Select a project” dropdown menu at the top of the page and then clicking “New Project”. Enter a project name and click “Create”.
  • In the navigation menu on the left, click “APIs & Services” and then click “Dashboard”.
  • Click the “+ ENABLE APIS AND SERVICES” button at the top of the page.
  • Search for “Google Sheets API” in the search bar and click the result.
  • Click the “Enable” button to enable the API.

  • In the navigation menu on the left, click “Credentials” and then click “Create credentials”.
  • Choose “OAuth client ID” as the credential type.
  • Choose “Desktop app” as the application type.
  • Enter a name for the OAuth 2.0 client ID and click “Create”.
  • Click “Download” to download the credentials.json file. Save the file in a secure location on your computer.
  • Open the Google Sheet that contains the list of domains you want to submit sitemaps for.
  • Click the “Share” button in the top-right corner of the screen.
  • Enter the email address associated with the client ID you created earlier and click “Send”. The email address should have the format [email protected], where client-id is the client ID you created and PROJECT-ID is the ID of your Google Cloud Console project.
  • Once you have completed these steps, you should be able to use the Google Sheets API to fetch the list of domains from your Google Sheet.

Note that you need to use service account credentials for both the Google Sheets API and the Google Search Console API (which is used for sitemap submission).

To access the Google Sheets API, you’ll need to authenticate with OAuth 2.0 or a service account. Since you’re already using a service account to authenticate with the Google Search Console API for sitemap submission, it makes sense to also use the service account for the Google Sheets API to simplify the authentication process.

To use a service account with the Google Sheets API, you’ll need to share the Google Sheet with the email address associated with the service account (which should be in the format [email protected]). You’ll also need to download the JSON key file that contains the service account’s credentials and use it to authenticate your application with the Google Sheets API.

To get the JSON key file for your service account, follow these steps:

  • Go to the Google Cloud Console.
  • In the top navigation bar, make sure that you have selected the project that you want to use.
  • In the navigation menu on the left, click on “IAM & admin” and then click on “Service accounts”.
  • Find the service account that you created and click on its name.
  • Click on the “Keys” tab.
  • Click on the “Add Key” button and select “JSON”.
  • Save the downloaded JSON file to a secure location on your computer.

You can use this JSON file to authenticate your application with the Google Sheets API and the Google Search Console API. Make sure to keep the JSON file secure and not share it with anyone who shouldn’t have access to your project’s resources.

Permissions Troubleshooting for Google Sheets

Granting permissions to a specific Google Sheet

The error message indicates that the Google Sheets API is returning a PERMISSION_DENIED error because the authenticated user does not have sufficient authorization to access the requested resource.

To resolve this issue, you’ll need to make sure that the service account you’re using to authenticate with the API has the necessary permissions to access the Google Sheet. Specifically, the service account needs to have the “View” access level for the Google Sheet in question.

To grant the service account access to the Google Sheet, follow these steps:

  • Open the Google Sheet in your browser.
  • Click the “Share” button in the top-right corner of the screen.
  • In the “Share with people and groups” section, enter the email address associated with the service account (which should be in the format [email protected]).
  • Set the access level to “Editor”.
  • Click the “Send” button. After you’ve granted the service account access to the Google Sheet, try running your script again. If you continue to experience issues, you may need to double-check your authentication credentials to make sure they are correctly set up.

What’s the Spreadsheet ID?

To find the spreadsheet ID of a Google Sheet, follow these steps:

Open the Google Sheet in your browser. Look at the URL in the address bar. The spreadsheet ID is the long string of characters between the /d/ and the next forward slash (/), as shown in this example URL:

https://docs.google.com/spreadsheets/d/1aBcDeFgHiJkLmNOpQrStUvWxYzZaBc12dEfGhIjKlM/edit#gid=0

In this example, the spreadsheet ID is 1aBcDeFgHiJkLmNOpQrStUvWxYzZaBc12dEfGhIjKlM.

You can also find the spreadsheet ID by opening the Google Sheet and then clicking on “File” in the top-left corner of the screen and selecting “Spreadsheet settings”. The spreadsheet ID will be displayed next to “Spreadsheet ID” in the pop-up window.

Grant additional permissions in Google Search Console

To resolve this issue, you need to ensure that the service account has been granted “Full” or “Owner” access to the respective websites in the Google Search Console.

If you have already granted “Full” or “Owner” access to the service account and the issue persists, it’s possible that there is a problem with the way the sitemap URL is formatted or the service account doesn’t have access to the sitemap URL.

You should double-check that the sitemap URL is correct and that the service account has access to the URL. You can also try submitting the sitemap manually in the Google Search Console to ensure that the sitemap URL is correct and that the service account has access to it.

Comments