Quickest way to get indexed
So, you want to know the fastest way to get a new webpage indexed by Google? You probably use it already, you just don't know it yet.
When people think about SEO they usually think of magic and tricks. Actually, you just have to apply some logic: the kind of logic a web developer would use to build their own search engine. I don't have my own search engine, but when I think it through from that angle, most of what I read about SEO is just baloney.
In this document I will not only tell you what it takes to get your new webpage indexed in no time, but also why Google's crawler, aka Googlebot, will find it so quickly.
If you only care about the fastest way, you can skip ahead, but please read on to understand why it works. I will also list the other ways Google learns about a new webpage.
How do they know
First, let's talk about some background and what Google is, or more importantly, what it does. Actually, not just Google: we need to understand how any search engine (Bing, Yahoo, et al.) knows that there even is a webpage available on a given website. The most common way is hyperlinks. Hyperlink is the technical term for a link in a document, and a search engine discovers a webpage by finding its URL (a hyperlink) in another webpage it already knows.
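To make that discovery step concrete, here is a minimal sketch in Python of how a crawler might pull hyperlinks out of a page it has already downloaded. The URLs and the class name are made up for illustration; a real crawler would additionally handle robots.txt, de-duplication and much more.

```python
# A minimal sketch of link discovery, assuming the page is already downloaded.
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Turn relative links into absolute URLs.
                    self.links.append(urljoin(self.base_url, value))


html = '<p>See the <a href="/chapter/indexing.html">chapter list</a>.</p>'
extractor = LinkExtractor("https://example.com/book/")
extractor.feed(html)
print(extractor.links)  # ['https://example.com/chapter/indexing.html']
```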
Internal links
Ordinarily this means the menu on your website, or a page that lists all the links of a certain section. These are commonly known as internal links.
As an example, look at my website and this very document. This webpage or document has a parent folder, which in my system is called a chapter in a book, and that chapter has a chapter list; you can follow this upwards with the breadcrumbs above. Oh, and the links in the breadcrumbs work, too. In a perfect world, though, the pages in the breadcrumbs would already be known to a search engine.
If you don't have internal links, you have to wait for external links to tell Google that you have a new webpage. We should always have internal links just for our visitors, allowing them to browse our website and hopefully even click through to a document.
Just think about it: without internal links, even visitors wouldn't know a webpage exists.
sitemap.xml
Another way is to maintain a sitemap.xml. For a very large website you may not want to list every webpage in that single file: just the basic structure, and from there the crawler bots should find their way to each document via the section lists mentioned above. A sitemap should be a starting point for any web crawler, but there is in fact no need to have each and every webpage listed.
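If you go this route, the file itself is simple to produce. Below is a minimal sketch in Python that writes a sitemap.xml containing only top-level section pages; the section URLs are placeholders.

```python
# A minimal sitemap.xml listing only the top-level sections of a site.
import xml.etree.ElementTree as ET

SECTIONS = [
    "https://example.com/",
    "https://example.com/book/",
    "https://example.com/blog/",
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for section in SECTIONS:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = section

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```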
Webmaster Tools
Then there is Google's Webmaster Tools website. In the "Diagnostics" section you will find the "Fetch as Googlebot" page. Submit the link there and Google will quietly add it to its index.
Fast and furious
Now for the fast and furious methods to let Google know that you have a new webpage. Let's begin with the runner-up.
FeedBurner
If you have an RSS feed you may already be using FeedBurner. If not, consider it, and ping the crawler whenever you have new content.
FeedBurner used to be an independent company but is now part of Google. Whenever the FeedBurner crawler visits your site it will automatically check or sync all pages with the search index. If a new document link is found, it will be added to Google's crawler list.
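What does "pinging" look like in practice? The old-school convention is the weblogUpdates XML-RPC ping that feed services understand. The sketch below assumes such an endpoint; the ping URL, site name and feed URL are placeholders, not FeedBurner's actual addresses.

```python
# A minimal sketch of the classic weblog "ping": tell a ping service that
# the feed has new content so its crawler re-fetches it.
import xmlrpc.client

PING_URL = "https://ping.example.com/RPC2"   # hypothetical endpoint, substitute your service
SITE_NAME = "My Website"
SITE_URL = "https://example.com/"
FEED_URL = "https://example.com/feed.xml"

server = xmlrpc.client.ServerProxy(PING_URL)
# weblogUpdates.extendedPing(site name, site URL, changed URL, feed URL)
result = server.weblogUpdates.extendedPing(SITE_NAME, SITE_URL, SITE_URL, FEED_URL)
print(result)  # typically a dict with an error flag and a short message
```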
Google Analytics
The winner by far is certainly Google Analytics. If you have Google Analytics installed, you simply open your new webpage once so Google Analytics can do its work in the background. As with FeedBurner above, it will check any webpage against the search index.
From experience with this website, looking at the first Googlebot appearance in my logs, it usually takes less than an hour for any of my webpages to be visited by the Googlebot crawler.
Final warning
Or maybe it's also sort of a disclaimer: don't confuse all the activities of a search engine and assume they happen all at once.
As a web developer I would not program all of these into one single program. They would be independent modules, and I bet that Google and all other search engines do the same. Following is a little breakdown of what should, and probably does, happen with all search engines.
First, a search engine has to learn about a link and somehow manage all the links. This will be an API-like program that accepts links and maintains a list of all the links in a simple database table. Note that this is pretty much the only activity over which we have some sort of control.
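As a rough sketch of what such a link-intake module could look like (the table layout and names are my own invention, not anything Google has published):

```python
# A minimal link-intake sketch: accept a URL and store it in a simple
# database table, ignoring duplicates.
import sqlite3

conn = sqlite3.connect("links.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS links ("
    "  url TEXT PRIMARY KEY,"
    "  discovered_at TEXT DEFAULT CURRENT_TIMESTAMP,"
    "  crawled INTEGER DEFAULT 0)"
)


def submit_link(url):
    """Add a newly discovered URL to the crawl list (no-op if already known)."""
    conn.execute("INSERT OR IGNORE INTO links (url) VALUES (?)", (url,))
    conn.commit()


submit_link("https://example.com/new-page.html")
```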
Following that, the search engine can go about crawling the website behind the link and downloading the content. This is the program usually known as the crawler, spider or web bot. Its only function really is to fetch the content and store it somewhere.
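A correspondingly minimal crawler sketch, again only an illustration of the idea:

```python
# A minimal crawler sketch: download one URL and file the raw content away
# for later indexing. It deliberately does nothing else.
import urllib.request


def fetch(url):
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_one(url, storage):
    """Fetch a single URL and store the content, keyed by URL."""
    storage[url] = fetch(url)


pages = {}  # stand-in for whatever content store a real crawler uses
crawl_one("https://example.com/", pages)
```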
The next activity in line will be a program that evaluates the content of the downloaded page. It does what we usually understand as indexing: it picks out all the link references and determines keywords in the document. Indexing basically links the URL of the document to other information, like keywords or hyperlinks (other URLs).
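A toy indexer along those lines might look like this; the regex-based link and keyword extraction is deliberately crude and only meant to show the shape of the output:

```python
# A toy indexing sketch: pull links (to feed back into the crawl list) and a
# crude keyword set out of a stored page, and associate both with its URL.
import re

STOPWORDS = {"the", "a", "and", "of", "to", "in", "is", "it"}


def index_page(url, html):
    links = re.findall(r'href="([^"]+)"', html)   # crude link extraction
    text = re.sub(r"<[^>]+>", " ", html)          # strip tags
    words = re.findall(r"[a-z]{3,}", text.lower())
    keywords = sorted(set(words) - STOPWORDS)
    return {"url": url, "links": links, "keywords": keywords}


entry = index_page(
    "https://example.com/new-page.html",
    '<h1>Fast indexing</h1> <a href="/sitemap.xml">sitemap</a>',
)
print(entry["keywords"])  # ['fast', 'indexing', 'sitemap']
```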
Finally, there is what we know as ranking, which is what all the fuss about SEO is about. Based on the information from the indexing process, a search engine will compute some sort of value or importance factor for your new page. This will ultimately be measured against other documents on the web and determine your ranking in search results.
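To round it off, here is a toy illustration of the ranking idea: score each URL by how many other indexed pages link to it. Real ranking uses many more signals; this merely shows that the score is derived from the index data.

```python
# A toy ranking sketch: count inbound links per URL from the index entries.
from collections import Counter

# index_entries would come from the indexing stage sketched above.
index_entries = [
    {"url": "https://example.com/a.html",
     "links": ["https://example.com/new-page.html"]},
    {"url": "https://example.com/b.html",
     "links": ["https://example.com/new-page.html", "https://example.com/a.html"]},
]

inbound = Counter(link for entry in index_entries for link in entry["links"])
for url, score in inbound.most_common():
    print(score, url)
```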