The following guest post was written by David Harry of www.huomah.com
Hello folks…. Steve decided to keep playin’ Hooky so the loonies are still running the asylum it would seem… we will return you to your regular programming shortly;
Is your SEO world dull? Just doing that same ‘ol thing time and time again? Are you looking for something new and interesting to sink your teeth into? Well my friend... Step right up and we’ll make what was old new again!!
Ok, that’s my best carnival barker routine… needs work, certainly not my forte. I do though, want to take a little trip to a far away place that I once wrote about back at the beginning of this year. That locale of delight and wonder is the land of Phrase Based Optimization.
Stop Leaning on the Keyword
The world of SEO is dominated by the term ‘keyword.’ It has been ingrained in the lexicon for all time and I used to be hard pressed to get away from it. In truth it is ‘phrases’ that we should be considering far more than mere singular words. Recent data suggests that singular search terms account for only 14% of all searches and that 2 and 3 word phrases are more popular at roughly 32% and 27% respectively. This was even further askew in earlier research on the European market.
Obviously we can argue that combinations of (key) words create a ‘phrase,’ but it is not that simple. More often than not a phrase is a combination of words that seek to define a concept, not simply a set of singular ideas. Identifying these concepts and more importantly how a search engine seeks to understand them is always good to look at once in a while.
The main thing to consider when looking at this from an SEO perspective is that ‘phrases’ impart a concept. As such, there is always more than one way to describe something, and there are many ways that seemingly unrelated items (in a Boolean world) are cousins in a phrase based one. As phrases on a given web page are digested they can be sorted into ‘semantically meaningful groups of phrasings,’ which teaches the search engine new phrases that are related in any given space. In the end there is a library of related phrases/terms for any given topic – a theme so to speak.
How Phrase Based Optimization works
Now let’s consider value; in each category and its subsequent sub-categories there is an occurrence rate. That is a measure of how often a certain phrase is used, both in a given document and in other documents. This same process can be used to identify other ‘common’ terms that the search engine would ‘expect’ to see. This is much different than a simple ‘keyword density’ approach as there would be not only a ‘phrase density’ expectation but a ‘related phrase occurrence’ factor as well. Repeating the same term over and over at a high density, with none of the expected related phrases, would be a futile effort.
The ‘related phrase information’ is stored for a given phrase and when accessed it looks for a count of known related phrases. Documents that score highest on related phrases move higher up the weighting ladder. Through this a page can be given strength on a variety of related phrases concerning a single or multiple concepts or ‘topics.’
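To make the ‘related phrase occurrence’ idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `RELATED` table, the sample page text and the simple substring counting are hypothetical stand-ins, not how any search engine actually stores or scores phrases.

```python
# Hypothetical related-phrase table. A real engine would learn these
# relationships from co-occurrence data; here they are typed in by hand.
RELATED = {
    "wicker basket": ["picnic hamper", "woven willow", "rattan", "storage basket"],
}


def related_phrase_score(text, target, related=RELATED):
    """Count how many known related phrases for `target` occur in `text`."""
    lowered = text.lower()
    return sum(1 for phrase in related.get(target, []) if phrase in lowered)


# A page that just repeats the target term over and over.
stuffed = "Wicker basket wicker basket wicker basket. Buy a wicker basket today."

# A page that builds a theme around the target term.
themed = "Our wicker basket range uses woven willow and rattan, ideal as a picnic hamper."

scores = {
    "stuffed": related_phrase_score(stuffed, "wicker basket"),
    "themed": related_phrase_score(themed, "wicker basket"),
}
print(scores)
```

Under these made-up inputs the repetitive page picks up no related phrases at all while the themed page picks up three, which is the gap between ‘keyword density’ and ‘related phrase occurrence’ described above.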
Building the results sets
What is important to understand is that this methodology would merely be another layer in presenting the ultimate results page to the end user. It is neither a stand alone concept nor a magic bullet. So let’s have a look under the hood and see what makes it tick.
The identification would begin as such during the indexation process:
- Collect possible and good phrases,* along with frequency and co-occurrence statistics of the phrases
- Classify possible phrases to either good or bad phrases based on frequency statistics
- Prune good phrase list based on a predictive measure derived from the co-occurrence statistics
*A ‘good phrase’ is one that; ‘appears in a minimum number of documents, and appear a minimum number of instances in the document collection.’
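The first two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: candidate phrases are taken to be plain two-word sequences, the thresholds `min_docs` and `min_instances` are invented values, and the pruning step based on co-occurrence statistics is left out entirely.

```python
from collections import Counter


def candidate_phrases(text, n=2):
    """Split text into overlapping n-word phrases (an assumed, simplistic definition)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def classify_phrases(docs, min_docs=2, min_instances=3):
    """Split candidate phrases into 'good' and 'bad' using frequency statistics.

    A phrase is 'good' if it appears in at least `min_docs` documents and
    at least `min_instances` times across the whole collection.
    Both thresholds are hypothetical values chosen for this example.
    """
    doc_freq = Counter()   # number of documents containing each phrase
    total = Counter()      # number of instances across the collection
    for doc in docs:
        phrases = candidate_phrases(doc)
        total.update(phrases)
        doc_freq.update(set(phrases))
    good = {p for p in total
            if doc_freq[p] >= min_docs and total[p] >= min_instances}
    bad = set(total) - good
    return good, bad


docs = [
    "wicker basket sale wicker basket",
    "buy a wicker basket",
    "stanley cup final",
]
good, bad = classify_phrases(docs)
print(sorted(good))
```

In this tiny collection only ‘wicker basket’ clears both thresholds; ‘stanley cup’ shows up once in a single document and so lands in the bad pile.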
When a document (web page) is analyzed, collections of ‘good phrases’ can be stored in relation to it; this score and defined ‘phrases’ may be accessed later in the retrieval process. Now let’s look at the ranking process.
Our happy surfer comes along looking for a ‘wicker basket’
- Receive query or previous ranking information
- Create related phrases list
- Retrieve document set based on ‘good phrase’ thresholds
- Compare results for phrase thresholds on the result set
- Re-rank result documents for next phase
Let’s say we have a 2 phrase search such as ‘NHL Hockey, Stanley Cup’
- Searches for documents relating to ‘NHL Hockey’
- Secondary processing of those looking for ‘Stanley Cup’
- Compare results for phrase thresholds on the result set
- Rank documents for next phase
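As a rough sketch of the two-phrase flow above, assuming a toy in-memory collection: the documents, the `RELATED` list and the plain substring matching are all hypothetical, and a simple related-phrase count stands in for the ‘phrase thresholds’ comparison.

```python
def retrieve(docs, phrase):
    """Return the subset of docs whose text contains the phrase."""
    return {doc_id: text for doc_id, text in docs.items()
            if phrase in text.lower()}


def rank_two_phrase_query(docs, first, second, related):
    """Retrieve on the first phrase, filter on the second, then rank the
    survivors by how many related phrases each one contains."""
    primary = retrieve(docs, first)        # documents relating to the first phrase
    secondary = retrieve(primary, second)  # secondary processing for the second phrase

    def score(text):
        lowered = text.lower()
        return sum(phrase in lowered for phrase in related)

    return sorted(secondary, key=lambda doc_id: score(secondary[doc_id]),
                  reverse=True)


# Hypothetical document collection and related-phrase list.
DOCS = {
    "a": "NHL hockey news: the Stanley Cup playoffs and a power play recap.",
    "b": "NHL hockey tickets on sale now.",
    "c": "Stanley Cup history for NHL hockey fans.",
    "d": "NHL hockey and the Stanley Cup: playoffs, power play goals, hat trick leaders.",
}
RELATED = ["playoffs", "power play", "hat trick"]

ranking = rank_two_phrase_query(DOCS, "nhl hockey", "stanley cup", RELATED)
print(ranking)
```

Document ‘b’ never mentions the second phrase and drops out, and the remaining pages are ordered by how much of the surrounding theme they carry.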
With a ranking process as such, one would be well served in understanding how to create a ‘theme’ surrounding a given target page for an SEO program. This is not limited to your ‘on-site’ efforts alone as we will see in the next section.
How to leverage Phrase Based Optimization
On site: much of utilizing PaIR (Phrase Based Indexing and Retrieval) is understanding the theme of a given term you are targeting. There are a number of places where we can strengthen both the singular (page level) and over-all site theme.
- Domain name
- Site structure and page naming conventions
- Content creation
- Outbound link strategies
By utilizing not only the target term, but related terms we can begin to build a theme to the core target. The obvious one on the list is the actual page content and assigning a strong sense of relevance within the content. For most in the business the domain name is equally obvious. What is less obvious is that we can utilize the page naming conventions (and content therein) to create a stronger chain.
Along this line we can also create focussed content on each step in the chain. What’s important to remember is that each step is an opportunity to tie in more ‘relevance’ through related phrases within that category or theme. The same can be said of outbound links on your target page in that they can also be used to target related phrases on authority sites which themselves are scored on relevance from a phrase score perspective.
Much of the legwork can be done during the keyword/phrase research and targeting phase. There are many long-tail opportunities which now can have a secondary role in strengthening the core money terms. By creating sub-lists of semantically related phrases we can better target the eventual content on a web page.
Off-site: inbound links enter into the PaIR world in the sense that they can be valued depending upon:
- Link Text
- Page content ‘good phrase’ score
- Page Title relevance
- Site theme
Once again, this is not related to PageRank, simply a standalone metric that can be used to weight in-bound links beyond the PR concepts we all know and love. Further developing your link profile by considering relevance strengthening with such concepts can only help in the long run. This is all that needs to be taken away from it.
What’s the Point?
What should be taken away from our little adventure is that there are many other ways to look at what we are doing. The search engineers are always looking to evolve and so must we. The concepts of Phrase Based Indexing and Retrieval are not an end-game scenario, merely another angle from which to observe this beast. I stayed away from trying to teach a ‘how to’ and opted for a virtual tour through ideas which may give birth to more ideas.
I have been living and working with these ideas for nearly a year now and have had a great deal of experience in playing with them… Is it the basis of my SEO programs? Naw… just a passing interest; another tool in the toolbox. Of late I have been more obsessed with User Performance Metrics and Personalized Search. There are always new things to get one interested in SEO… never get bored.
“By becoming attached to names and forms, not realizing that they have no more basis than the activities of the mind itself, error rises and the way to emancipation is blocked.”
Thanks for letting me drop by…. Until next time; Play Safe.