AI SEO: How to Do It? How to Appear in AI?

Getting indexed on Google depends on SEO for Google. And getting picked up by AI? Simple: that depends on AI SEO.

Few people are discussing this topic so far. In fact, very few are even searching Google for how to do AI SEO.

During keyword research for this article, we found that fewer than 100 people search for these terms monthly.

But AI continues to advance. And it’s always in search of new data—often with urgency.

Today, we’ll discuss the mechanism behind AI—what it considers valid to build its responses—and how you can make your content appealing to it.

Let’s dive into AI indexation. But first, let’s understand its current state.

AI Is Hungry for Data—and It’s Becoming Scarce

The primary engine of a generative AI system, especially one built on NLP (Natural Language Processing) models, needs data to generate results.

This isn’t news. However, we’ve always assumed these data sources were virtually infinite—while AI learns from a dataset, more data is created and made available daily.

The issue is that AI is beginning to face problems with its data sources.

Recently, several dataset providers (entities that aggregate high-quality links and data and either provide or sell them) have begun restricting how their content can be accessed.

According to a recent article in The New York Times, many of these aggregators of valuable content are completely closing their doors to AI.

For SEOs reading this, here’s a key detail: these sources are blocking AI crawlers directly via robots.txt. A crawler that honors those rules is barred from even knowing what the content contains.

Some data sources are going even further, filing lawsuits against OpenAI—as was the case with The New York Times and several other content creators.

AI needs quality data. But we’re venturing a bit beyond the main focus here.

Regarding SEO, SEM, and search in general, AI doesn’t need deeply detailed information on specific topics. But it still needs some level of data.

Let’s pause to discuss this briefly before diving into how to actually do AI SEO.

The Nature of Generative AI Data for Search

The shortage of data AI is facing—and will continue to face—is not directly tied to our work in organic traffic and content marketing.

We’re constantly producing content, and we won’t stop anytime soon.

The current landscape is straightforward. AI exists, and it’s fueled by the content brands produce.

What’s happening now is that new AI-driven search systems are emerging, most notably SearchGPT and Google Gemini, the latter now integrated into Google searches worldwide.

These systems have a ranking structure similar to Google’s, and they cite sources that help compose their responses.

This illustrates the first reason to do AI SEO: to become one of these authoritative source links.

Another critical reason: controlling the narrative about your niche and brand.

If you operate in a very specific niche, you need AI to deliver accurate answers to simple questions. It’s a business necessity.

For example, if you work with large-scale HDPE (high-density polyethylene) piping installations, AI must be able to inform users about HDPE’s characteristics and why it’s more suitable than other materials in this context.

This underscores the need for AI SEO. And now is the time, while the field is still in its infancy and the competition is not yet as fierce as it is in traditional SEO.

How to Do AI SEO

Technically, there are no fully established practical methodologies for AI SEO yet.

Conventional SEO for search engines, by contrast, came with some rules that are still used today.

Take the structuring of headings, for example, or the use of backlinks to make your site more attractive to search engines.

AI also has its own rules, which, incidentally, are quite similar. Subheading placement, clean code, and even some aspects of E-E-A-T are shared between conventional Google SEO and AI content cataloging.

This is for a straightforward reason: much of the on-page SEO work relates to the user experience when reading a particular result.

The same applies to AI SEO. The adjustments needed to appear in AI systems are not overly complex or entirely new most of the time.

Generally, you’ll need to follow the editorial and research principles you already work with, along with a few technical adjustments related to structured data, particularly for Gemini.

Let’s break down each point:

Maintain Strong Rankings on Google

The first step is simple: to have your content scanned and indexed by any AI, the best strategy is to continue ranking on the first page of search engines, especially Google.

It’s no secret that Google was ChatGPT’s primary source of information.

Long before AI faced challenges with reliable data sources, most NLP models had already scanned and indexed nearly everything that ranks on Google.

GPT-4 with web browsing, for instance, can pull live search results, even listing the top 10 results for a specific keyword.

So, the best approach is to keep investing in creating SEO-optimized content to appear in AI results.

Some claim that Google will become obsolete because of AI, but it hasn’t yet, and it remains the primary source of information that AI uses for the vast majority of results.

In fact, when we mentioned earlier the data scarcity AI faces, a significant part of this blackout occurred due to Reddit’s request to block indexing.

For those who frequently use Google, Reddit is a familiar presence. As a constantly evolving database, it began dominating the top search results in English.

It didn’t take long for AI to start using Reddit as a source.

In fact, Google had to strike a $60 million deal with Reddit to use those posts.

So, continuing to focus on classic SEO with the goal of appearing in the top positions on Google is, at least for now, the dominant factor in AI SEO.

This even has a name: Search Everywhere Optimization. More about that in the next topic.

The New SEO—Search Everywhere Optimization

What we need to understand now is that SEO is no longer just an effort to simply appear on Google.

It is an effort to make your content appear everywhere — on Google, in AIs, and anywhere else where search exists.

That last sentence is important. The arrival of AI has challenged the greatest institution of the internet — the famous “Google it.”

Everyone now understands that Google is no longer the only way to get answers to their questions. It is a means, primarily aimed at those seeking responses written by humans in article format.

What the new SEO proposes is that each piece of content exists within a search ecosystem, encompassing much more than just the good old Google.

“Writing a blog” used to be the standard for a traffic strategy, but that is now changing. You still need to write the article, but that article then needs to be broken down into:

  • Short videos for YouTube Shorts and TikTok;
  • Carousels of up to 20 images on Instagram;
  • Long-form videos on YouTube;
  • A full article for viewing on the blog;
  • Any other formats you find necessary.

This way, you increase your chances of appearing on all platforms used for content consumption today, which also increases your chances of shaping AI’s understanding.

Technical Adjustments for Gemini

But we also need to discuss Gemini more specifically, as it has some features that differ from ChatGPT and other AIs.

Gemini often works like a Google snippet. Its database is broad and varied, like most generative AIs, but its primary focus is the search engine itself.

Because of this, a significant portion of Gemini’s results is typically from content that ranks highly on the search engine.

It’s important to optimize your content so that Gemini can easily scan it and add it to its results.

Doing this is similar to the process of getting your articles into Google’s featured snippets and rich results: the work is done through structured data, also known as Schema Markup.

Schema Markup is code added directly to your page’s HTML. There are various types available, and you can find the full vocabulary, with examples to copy, at Schema.org.

For Google, the recommended format is JSON-LD. Adding it isn’t difficult, and you can do it in two ways:

  • Manually: Add the markup directly to your site with the help of a developer. Take the type definition and examples from the Schema.org site, adapt them to your page, and add the result to your WordPress site.
  • Using plugins: Use a WordPress plugin that adds Schema to your entire site or individual pages. Choose JSON-LD markup from the options. Schema Pro and Yoast are examples, though many such plugins are paid.
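
To make the manual route more concrete, here is a minimal Python sketch that builds a JSON-LD block for an article and wraps it in the script tag that goes into the page’s HTML. The Article type and its properties come from Schema.org. The function names, author, date, and URL are placeholders we made up for illustration, so treat this as a starting point under those assumptions, not as a finished implementation.

```python
import json


def article_json_ld(headline, author_name, date_published, url):
    """Build a schema.org Article description as a plain Python dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
        "mainEntityOfPage": url,
    }


def to_script_tag(data):
    """Wrap the JSON-LD in the script tag that is placed in the page's HTML."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value can never close the script tag early.
    # "\/" is a valid JSON escape for "/", so the data itself is unchanged.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values for illustration only (hypothetical author, date, URL).
markup = article_json_ld(
    headline="AI SEO: How to Do It? How to Appear in AI?",
    author_name="Jane Doe",
    date_published="2024-01-01",
    url="https://example.com/blog/ai-seo",
)
print(to_script_tag(markup))
```

The printed tag can be pasted into the page’s HTML or generated by your template for each post. Whichever route you choose, it’s worth running the page through a structured data validator before publishing.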

In addition to Schema, you’ll also need to make some basic technical SEO adjustments, including:

  • Entity optimization;
  • Site speed;
  • E-E-A-T signals;
  • Well-structured site architecture;
  • Sitemap inclusion;
  • Updated robots.txt file.
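
As an illustration of the sitemap item, the sketch below generates a minimal sitemap.xml with Python’s standard library. The namespace is the one defined by the sitemaps.org protocol. The function name and the example URLs and dates are assumptions for demonstration, and in practice most CMSs and SEO plugins generate this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    Dates are expected as YYYY-MM-DD strings.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
example_pages = [
    ("https://example.com/", "2024-01-01"),
    ("https://example.com/blog/ai-seo", "2024-02-15"),
]
print(build_sitemap(example_pages))
```

Save the output as sitemap.xml at the root of your site and submit it in your search engine’s webmaster tools so crawlers can find every page you want indexed.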

You can delve deeply into technical SEO and go far beyond these points, but even surface-level efforts can yield good results.

What matters most is implementing these actions into your AI SEO strategy.

Choose the Right Keywords

The same keyword research rules for traditional SEO apply strongly to AI SEO.

An early-stage study on Gemini showed that long-tail and informational keywords are far more likely to appear among the listed sources than transactional keywords.

In other words, Google’s AI, like most other AIs, draws on content targeting long-tail keywords to generate responses and prioritizes it over transactional content when displaying sources.

What does this mean? Simply put, producing informational content must continue. It’s this type of content that informs AIs.

It’s easy to see that AI SEO doesn’t differ much from traditional SEO done for SERPs.

Before we conclude: the million-dollar question—do you want to appear in AI results?

Is It Worth Appearing in AI?

Everything we’ve discussed here goes somewhat against traditional marketing practices.

The primary purpose of content is to bring new users to your site. By optimizing your content for AI, you’re contributing to the historic decline in website visits.

This decline is unprecedented in scale. Some websites are losing 20% to 50% of their traffic and are struggling to recover these numbers.

Search Engine Land offers a realistic interpretation of this drop: visits are decreasing due to lower CTRs on SERPs. Fewer people are clicking on results, opting instead for AI-generated responses.

So nobody wins when an AI model like Gemini is integrated directly into Google: not the user and not the site.

AI SEO becomes the fallback option to avoid losing too many visitors.

By having your content and a link cited as a source, you can at least capture visitors who want to delve deeper into the result.

If you absolutely don’t want your results displayed in AIs, you can add blocking rules directly to your robots.txt file.

Here are links to the documentation for each AI to remove access:

  • OpenAI
  • Common Crawl
  • Perplexity
  • Microsoft Copilot

I don’t recommend removing your site from Common Crawl, as it’s a dataset widely used for scientific research.

However, the others are all Generative AIs and provide simple and straightforward instructions for restricting access to your content.

Except, of course, for Gemini, because it’s part of Google itself.
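
For reference, here is a minimal Python sketch of what those blocking rules can look like and how a rule-abiding crawler would read them. The user-agent tokens GPTBot (OpenAI) and PerplexityBot (Perplexity) are the names we understand those providers to document, so confirm them against the current documentation before relying on this. Common Crawl is left out on purpose, following the recommendation above. The helper names and URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens; verify against each provider's documentation.
AI_CRAWLERS = ["GPTBot", "PerplexityBot"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text that disallows the whole site for each agent."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, agent, url):
    """Report whether a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots_txt = build_robots_txt(AI_CRAWLERS)
print(robots_txt)

# Sanity check with a hypothetical URL: AI crawlers blocked, Googlebot untouched.
page = "https://example.com/blog/ai-seo"
for agent in ["GPTBot", "PerplexityBot", "Googlebot"]:
    print(agent, "allowed:", is_allowed(robots_txt, agent, page))
```

If your site already has a robots.txt file, append the generated groups to it instead of replacing the file. Keep in mind that robots.txt is a request, not an enforcement mechanism: it only works for crawlers that choose to respect it.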

I hope these tips have helped you better understand the current state of organic search in the post-AI world.

If you have any questions, leave a comment—we’ll respond to them all. Thanks for reading, and see you in the next article!
