
Critical Analysis Of Web Crawlers’ Algorithms

 


 Minou Parhizkar 0527553

Abstract- A web crawler is a program or automated script which browses the World Wide Web in a methodical, automated manner. The objective of the paper is to make a critical analysis of the algorithms used by Web Crawlers. It intends to review and evaluate the various approaches used by different web search engines to catalog information.

 

 

Index Terms-

Web Crawler, Search Engines, WWW, SEO

 

•I.     INTRODUCTION

 

The software that searches for information and returns sites which provide that information is referred to as a search engine or web crawler. Everyone uses web crawlers, indirectly at least! Every time you search the Internet using a service such as Alta Vista, Excite, or Lycos, you’re making use of an index that’s based on the output of a web crawler. Web crawlers, also known as spiders, robots, or wanderers, are software programs that automatically traverse the Web. Search engines use crawlers to find what’s on the Web; then they construct an index of the pages that were found.

 

Search Engines use spiders to index websites. When you submit your website pages to a search engine by completing their required submission page, the search engine spider will index your entire site. A ‘spider’ is an automated program that is run by the search engine system. The spider visits a web site, reads the content on the actual site and the site’s Meta tags, and also follows the links that the site connects to. The spider then returns all that information to a central depository, where the data is indexed. It will visit each link you have on your website and index those sites as well. Some spiders will only index a certain number of pages on your site.

A spider is almost like a book: it contains the table of contents, the actual content, and the links and references for all the websites it finds during its search, and it may index up to a million pages a day.

 

 

Example: Google spider

 

When you ask a search engine to locate information, it is actually searching through the index which it has created and not actually searching the Web. Different search engines produce different rankings because not every search engine uses the same algorithm to search through the indices.

One of the things that a search engine algorithm scans for is the frequency and location of keywords on a web page, but it can also detect artificial keyword stuffing or spamdexing. Then the algorithms analyze the way that pages link to other pages in the Web. By checking how pages link to each other, an engine can determine what a page is about, if the keywords of the linked pages are similar to the keywords on the original page. Most of the top-ranked search engines are crawler based search engines while some may be based on human compiled directories. The people behind the search engines want the same thing every webmaster wants – traffic to their site. Since their content is mainly links to other sites, the thing for them to do is to make their search engine bring up the most relevant sites to the search query, and to display the best of these results first. In order to accomplish this, they use a complex set of rules called algorithms. When a search query is submitted at a search engine, sites are determined to be relevant or not relevant to the search query according to these algorithms, and then ranked so that the best matches, as calculated by these algorithms, appear first.

Search engines keep their algorithms secret and change them often in order to prevent webmasters from manipulating their databases and dominating search results. They also want to provide new sites at the top of the search results on a regular basis rather than always having the same old sites show up month after month. An important difference to realize is that search engines and directories are not the same. Search engines use a spider to “crawl” the web and the web sites they find, as well as submitted sites. As they crawl the web, they gather the information that is used by their algorithms in order to rank your site.

This paper aims at critically analyzing various search engines, how they work, and comparing their algorithms.

•II.     Working of web crawlers – a detailed look up

Let us now look at a more detailed explanation on how Search Engines work. Crawler based search engines are primarily composed of three parts.

A search engine robot’s action is called spidering, as it resembles the multiple legged spiders. The spider’s job is to go to a web page, read the contents, connect to any other pages on that web site through links, and bring back the information. From one page it will travel to several pages and this proliferation follows several parallel and nested paths simultaneously. Spiders revisit the site at some interval, maybe a month to a few months, and re-index the pages. This way any changes that may have occurred in your pages can also be reflected in the index. The spiders automatically visit your web pages and create their listings. An important aspect is to study what factors promote “deep crawl” – the depth to which the spider will go into your website from the page it first visited. Listing (submitting or registering) with a search engine is a step that could accelerate and increase the chances of that engine “spidering” your pages.
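The traversal described above can be sketched as a breadth-first walk over links. This is a minimal sketch under stated assumptions, not any engine's actual spider: the `PAGES` dictionary stands in for the web so the example runs without network access, and the names `LinkExtractor`, `crawl`, and `max_depth` are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real spider would fetch over HTTP.
PAGES = {
    "/": '<a href="/about">About</a> <a href="/products">Products</a>',
    "/about": '<a href="/">Home</a>',
    "/products": '<a href="/products/widget">Widget</a>',
    "/products/widget": "<p>No links here.</p>",
}


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, max_depth):
    """Breadth-first crawl from `start`, following links up to `max_depth` hops."""
    visited = {}  # URL -> depth at which it was first reached
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in visited or url not in PAGES:
            continue
        visited[url] = depth
        if depth == max_depth:
            continue  # the "deep crawl" limit: do not follow links any further
        parser = LinkExtractor()
        parser.feed(PAGES[url])
        for link in parser.links:
            if link not in visited:
                queue.append((link, depth + 1))
    return visited
```

Calling `crawl("/", 1)` reaches only the pages linked directly from the start page, while a larger `max_depth` also reaches `/products/widget`. The depth limit is one simple way to model how deep a spider is willing to go into a site.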

The spider’s movement across web pages stores those pages in its memory, but the key action is in indexing. The index is a huge database containing all the information brought back by the spider. The index is constantly being updated as the spider collects more information. The entire page is not indexed and the searching and page-ranking algorithm is applied only to the index that has been created. Most search engines claim that they index the full visible body text of a page. In a subsequent section, we explain the key considerations to ensure that indexing of your web pages improves relevance during search. The combined understanding of the indexing and the page-ranking process will lead to developing the right strategies. The Meta tags ‘Description’ and ‘Keywords’ have a vital role as they are indexed in a specific way. Some of the top search engines do not index the keywords that they consider spam. They will also not index certain ‘stop words’ (commonly used words such as ‘a’ or ‘the’ or ‘of’) so as to save space or speed up the process. Images are obviously not indexed, but image descriptions, Alt text, or “text within comments” are included in the index by some search engines.
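The index described above is commonly built as an inverted index, which maps each word to the pages that contain it. The following is a minimal sketch of that idea with stop-word removal. The stop-word list, the function names `build_index` and `search`, and the sample pages are all assumptions made for illustration, not the data structures of any particular engine.

```python
import re
from collections import defaultdict

# Illustrative stop-word list; real engines use their own (unpublished) lists.
STOP_WORDS = {"a", "the", "of"}


def build_index(pages):
    """Map each non-stop word to the set of page IDs that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(page_id)
    return index


def search(index, query):
    """Return the pages that contain every non-stop word of the query."""
    words = [w for w in re.findall(r"[a-z0-9]+", query.lower()) if w not in STOP_WORDS]
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


pages = {
    "p1": "The history of the web crawler",
    "p2": "A crawler builds an index of pages",
}
index = build_index(pages)
```

Here `search(index, "the crawler")` matches both pages because the stop word is dropped from the query, and the search runs against the index only, never against the pages themselves.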

The search engine software or program is the final part. When a person requests a search on a keyword or phrase, the search engine software searches the index for relevant information. The software then provides a report back to the searcher with the most relevant web pages listed first. The algorithm-based processes used to determine ranking of results are discussed in greater detail later.

These directories compile listings of websites into specific industry and subject categories and they usually carry a short description about the website. Inclusion in directories is a human task and requires submission to the directory producers. Visitors and researchers over the net quite often use these directories to locate relevant sites and information sources. Thus directories assist in structured search. Another important reason is that crawler engines quite often find websites to crawl through their listing and links in directories. Yahoo and The Open Directory are amongst the largest and most well known directories. LookSmart is a directory that provides results to partner sites such as MSN Search, Excite and others. Lycos is an example of a site that pioneered the search engine but shifted to the Directory model depending on AlltheWeb.com for its listings.

Hybrid Search Engines are both crawler based as well as human powered. In plain words, these search engines have two sets of listings based on both the mechanisms mentioned above. The best example of hybrid search engines is Yahoo, which has got a human powered directory as well as a Search toolbar administered by Google. Although, such engines provide both listings they are generally dominated by one of the two mechanisms. Yahoo is known more for its directory rather than crawler based search engine.

Search engines rank web pages according to the software’s understanding of the web page’s relevancy to the term being searched. To determine relevancy, each search engine follows its own group of rules. The most important rules are:

- The location of keywords on your web page; and
- How often those keywords appear on the page (the frequency)

For example, if the keyword appears in the title of the page, then it would be considered to be far more relevant than the keyword appearing in the text at the bottom of the page. Search engines consider keywords to be more relevant if they appear sooner on the page (like in the headline) rather than later. The idea is that you’ll be putting the most important words – the ones that really have the relevant information – on the page first.
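As a rough illustration of location-based relevance, the sketch below gives a page more credit when the keyword appears in a more prominent field. The field names and the weights in `FIELD_WEIGHTS` are hypothetical; the text only says that earlier, more prominent positions count for more, and the numbers real engines use are not public.

```python
import re

# Hypothetical weights chosen only to show the ordering title > headline > body.
FIELD_WEIGHTS = {"title": 3.0, "headline": 2.0, "body": 1.0}


def location_score(page, keyword):
    """Sum the weights of the fields of `page` that contain `keyword`."""
    keyword = keyword.lower()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = re.findall(r"[a-z0-9]+", page.get(field, "").lower())
        if keyword in words:
            score += weight
    return score


in_title = {"title": "Web crawler basics", "body": "How spiders work."}
in_body = {"title": "Basics", "body": "How spiders work. A crawler is a spider."}
```

With these assumed weights, `location_score(in_title, "crawler")` is higher than `location_score(in_body, "crawler")`, matching the idea that a keyword in the title signals more relevance than one further down the page.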

Search engines also consider the frequency with which keywords appear. The frequency is usually determined by how often the keywords are used out of all the words on a page. If the keyword is used 4 times out of 100 words, the frequency would be 4%. Of course, you can now develop the perfect relevant page with one keyword at 100% frequency – just put a single word on the page and make it the title of the page as well. Unfortunately, the search engines don’t make things that simple.
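The frequency calculation above can be written out directly. This is a minimal sketch; the function name `keyword_frequency` and the simple word-splitting rule are assumptions for illustration.

```python
import re


def keyword_frequency(text, keyword):
    """Return the share of words on the page equal to `keyword`, as a percentage."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)


# 4 occurrences in 100 words, as in the example above.
page = " ".join(["crawler"] * 4 + ["filler"] * 96)
```

For this page `keyword_frequency(page, "crawler")` gives 4.0, and a page consisting of the single word gives 100.0, which is exactly the degenerate case that engines have to guard against.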

While all search engines do follow the same basic rules of relevancy, location and frequency, each search engine has its own special way of determining rankings. To make things more interesting, the search engines change the rules from time to time so that the rankings change even if the web pages have remained the same. One method of determining relevancy used by some search engines ‘like HotBot and Infoseek’, but not others ‘like Lycos’, is the Meta tags. Meta tags are hidden HTML codes that provide the search engine spiders with potentially important information like the page description and the page keywords.

Meta tags are often labeled as the secret to getting high rankings, but Meta tags alone will not get you a top 10 ranking. On the other hand, they certainly don’t hurt. Detailed information on meta-tags and other ways of improving search engine ranking is given later in this chapter.

In the early days of the web, webmasters would repeat a keyword hundreds of times in the Meta tags and then add it hundreds of times to the text on the web page by making it the same color as the background. However, now, major search engines have algorithms that may exclude a page from ranking if it has resorted to “keyword spamming”; in fact some search engines will downgrade ranking in such cases and penalize the page.

Link analysis and ‘clickthrough’ measurement are certain other factors that are “off the page” and yet crucial in the ranking mechanism adopted by some leading search engines. This is quickly emerging as the most important determinant of ranking, but before we study this, we must first look at the most popular search engines and then look at the various steps you can take to improve your success at each of the stages – spidering, indexing and ranking.

For March 2003, according to a study by Jupiter Media Metrix, there were an estimated 114 million Internet users online in the US at work or at home, 80 percent of whom are estimated to have made some type of search request during the month.

•III.     A Summarised Comparison of Search Engines

Yahoo!

- has been in the search game for many years
- is better than MSN but nowhere near as good as Google at determining if a link is a natural citation or not
- has a ton of internal content and a paid inclusion program, both of which give them incentive to bias search results toward commercial results
- things like cheesy off topic reciprocal links still work great in Yahoo!

MSN Search

- new to the search game
- is bad at determining if a link is natural or artificial in nature
- due to sucking at link analysis they place too much weight on the page content
- their poor relevancy algorithms cause a heavy bias toward commercial results
- likes bursty recent links
- new sites that are generally un-trusted in other systems can rank quickly in MSN Search
- things like cheesy off topic reciprocal links still work great in MSN Search

Google

- has been in the search game a long time, and saw the web graph when it was much cleaner than the current web graph
- is much better than the other engines at determining if a link is a true editorial citation or an artificial link
- looks for natural link growth over time
- heavily biases search results toward informational resources
- trusts old sites way too much
- a page on a site or sub-domain of a site with significant age or link related trust can rank much better than it should, even with no external citations
- they have aggressive duplicate content filters that filter out many pages with similar content
- if a page is obviously focused on a term they may filter the document out for that term. On page variation and link anchor text variation are important. A page with a single reference or a few references of a modifier will frequently outrank pages that are heavily focused on a search phrase containing that modifier
- crawl depth is determined not only by link quantity, but also link quality. Excessive low quality links may make your site less likely to be crawled deep or even included in the index.
- things like cheesy off topic reciprocal links are generally ineffective in Google when you consider the associated opportunity cost

Ask

- looks at topical communities
- due to their heavy emphasis on topical communities they are slow to rank sites until they are heavily cited from within their topical community
- due to their limited market share they probably are not worth paying much attention to unless you are in a vertical where they have a strong brand that drives significant search traffic

•IV.     Detailed Analysis of Search Engines

Now that we have understood the working and basics of web crawlers and reviewed a summarized comparison of a few major search engines on the market, we are in a position to analyze and compare them in detail and get into the nitty-gritty technical details. The sections below deal with each of these engines one by one.

•V.     Yahoo!

 

Yahoo! was founded in 1994 by David Filo and Jerry Yang as a directory of websites. For many years they outsourced their search service to other providers, but by the end of 2002 they realized the importance and value of search and started aggressively acquiring search companies.

Overture purchased AllTheWeb and AltaVista. Yahoo! purchased Inktomi (in December 2002) and then consumed Overture (in July of 2003), and combined the technologies from the various search companies they bought to make a new search engine.

•a)                   On Page Content

Yahoo! offers a paid inclusion program, so when Yahoo! Search users click on high ranked paid inclusion results in the organic search results Yahoo! profits. In part to make it easy for paid inclusion participants to rank, I believe Yahoo! places greater weight on on-the-page content than a search engine like Google does.

Being the #1 content destination site on the web, Yahoo! has a boatload of their own content which they frequently reference in the search results. Since they have so much of their own content and make money from some commercial organic search results it might make sense for them to bias their search results a bit toward commercial websites.

Using descriptive page titles and page content goes a long way in Yahoo!

In my opinion their results seem to be biased more toward commerce than informational sites, when compared with Google.

•b)                   Crawling

Yahoo! is pretty good at crawling sites deeply so long as they have sufficient link popularity to get all their pages indexed. One note of caution is that Yahoo! may not want to deeply index sites with many variables in the URL string, especially since:

- Yahoo! already has a boatload of their own content they would like to promote (including verticals like Yahoo! Shopping)
- Yahoo! offers paid inclusion, which can help Yahoo! increase revenue by charging merchants to index some of their deep database contents.

You can use Yahoo! Site Explorer to see how well they are indexing your site and which sites link at your site.

•c)                   Query Processing

Certain words in a search query are better at defining the goals of the searcher. If you search Yahoo! for something like “how to SEO ” many of the top ranked results will have “how to” and “SEO” in the page titles, which might indicate that Yahoo! puts quite a bit of weight even on common words that occur in the search query.

Yahoo! seems to be more about text matching when compared to Google, which seems to be more about concept matching.

•d)                   Link Reputation

Yahoo! is still fairly easy to manipulate using low to mid quality links and somewhat to aggressively focused anchor text. Rand Fishkin recently posted about many Technorati pages ranking well for their core terms in Yahoo!. Those pages primarily have the exact same anchor text in almost all of the links pointing at them.

Sites with the trust score of Technorati may be able to get away with more unnatural patterns than most webmasters can, but I have seen sites flamethrown with poorly mixed anchor text on low quality links, only to see the sites rank pretty well in Yahoo! quickly.

•e)                   Page vs Site

A few years ago at a Search Engine Strategies conference Jon Glick stated that Yahoo! looked at both links to a page and links to a site when determining the relevancy of a page. Pages on newer sites can still rank well even if their associated domain does not have much trust built up yet so long as they have some descriptive inbound links.

•f)                    Site Age

Yahoo! may place some weight on older sites, but the effect is nowhere near as pronounced as the effect in Google’s SERPs.

It is not unreasonable for new sites to rank in Yahoo! in as little as 2 or 3 months.

•g)                   Paid Search

Yahoo! prices their ads in an open auction, with the highest bidder ranking the highest. By early 2007 they aim to make Yahoo! Search Marketing more of a closed system which factors in clickthrough rate (and other algorithmic factors) into their ad ranking algorithm.

Yahoo! also offers a paid inclusion program which charges a flat rate per click to list your site in Yahoo!’s organic search results.

Yahoo! also offers a contextual ad network. The Yahoo! Publisher program does not have the depth that Google’s ad system has, and they seem to be trying to make up for that by biasing their targeting to more expensive ads, which generally causes their syndicated ads to have a higher click cost but lower average clickthrough rate.

•h)                   Editorial

Yahoo! has many editorial elements to their search product. When a person pays for Yahoo! Search Submit that content is reviewed to ensure it matches Yahoo!’s quality guidelines. Sites submitted to the Yahoo! Directory are reviewed for quality as well.

In addition to those two forms of paid reviews, Yahoo! also frequently reviews their search results in many industries. For competitive search queries some of the top search results may be hand coded. When I searched for Viagra, for example, the top 5 listings looked useful, and then I had to scroll down to #82 before I found another result that wasn’t spammy.

Yahoo! also manually reviews some of the spammy categories somewhat frequently and then reviews other samples of their index. Sometimes you will see a referral like http://corp.yahoo-inc.com/project/health-blogs/keepers if they reviewed your site and rated it well.

Sites which have been editorially reviewed and were of decent quality may be given a small boost in relevancy score. Sites which were reviewed and are of poor quality may be demoted in relevancy or removed from the search index.

Yahoo! has published their content quality guidelines. Some sites that are filtered out of search results by automated algorithms may return if the site cleans up the associated problems, but typically if any engine manually reviews your site and removes it for spamming you have to clean it up and then plead your case.

•i)                    Social Aspects

Yahoo! firmly believes in the human aspect of search. They paid many millions of dollars to buy Del.icio.us, a social bookmarking site. They also have a similar product native to Yahoo! called My Yahoo!

Yahoo! has also pushed a question answering service called Yahoo! Answers which they heavily promote in their search results and throughout their network. Yahoo! Answers allows anyone to ask or answer questions. Yahoo! is also trying to mix amateur content from Yahoo! Answers with professionally sourced content in verticals such as Yahoo! Tech.

•j)                    Yahoo! SEO Tools

Yahoo! has a number of useful SEO tools.

- Overture Keyword Selector Tool – shows prior month search volumes across Yahoo! and their search network.
- Overture View Bids Tool – displays the top ads and bid prices by keyword in the Yahoo! Search Marketing ad network.
- Yahoo! Site Explorer – shows which pages Yahoo! has indexed from a site and which pages they know of that link at pages on your site.
- Yahoo! Mindset – shows you how Yahoo! can bias search results more toward informational or commercial search results.
- Yahoo! Advanced Search Page – makes it easy to look for .edu and .gov backlinks
- Yahoo! Buzz – shows current popular searches

•k)                   Yahoo! Business Perspectives

Being the largest content site on the web makes Yahoo! run into some inefficiency issues due to being a large internal customer. For example, Yahoo! Shopping was a large link buyer for a period of time while Yahoo! Search pushed that they didn’t agree with link buying. Offering paid inclusion and having so much internal content makes it make sense for Yahoo! to have a somewhat commercial bias to their search results.

They believe strongly in the human and social aspects of search, pushing products like Yahoo! Answers and My Yahoo!.

I think Yahoo!’s biggest weakness is the diverse set of things that they do. In many fields they not only have internal customers, but in some fields they have product duplication, like with Yahoo! My Web and Del.icio.us. 

•l)                    Search Marketing Perspective

I believe if you do standard textbook SEO practices and actively build quality links it is reasonable to expect to be able to rank well in Yahoo! within 2 or 3 months. If you are trying to rank for highly spammed keyword phrases keep in mind that the top 5 or so results may be editorially selected, but if you use longer tail search queries or look beyond the top 5 for highly profitable terms you can see that many people are indeed still spamming them to bits.

As Yahoo! pushes more of their vertical offerings it may make sense to give your site and brand additional exposure to Yahoo!’s traffic by doing things like providing a few authoritative answers to topically relevant questions on Yahoo! Answers.

•VI.     MSN Search

MSN Search had many incarnations, being powered by the likes of Inktomi and Looksmart for a number of years. After Yahoo! bought Inktomi and Overture it was obvious to Microsoft that they needed to develop their own search product. They launched their technology preview of their search engine around July 1st of 2004. They formally switched from Yahoo! organic search results to their own in house technology on January 31st, 2005.

•a)                   On Page Content

Using descriptive page titles and page content goes a long way to help you rank in MSN. I have seen examples of many domains that ranked for things like

state name+ insurance type + insurance

on sites that were not very authoritative which only had a few instances of state name and insurance as the anchor text. Adding the word health, life, etc. to the page title made the site relevant for those types of insurance, in spite of the site having few authoritative links and no relevant anchor text for those specific niches.

Additionally, internal pages on sites like those can rank well for many relevant queries just by being hyper focused, but MSN currently drives little traffic when compared with the likes of Google.

•b)                   Crawling

MSN has got better at crawling, but I still think Yahoo! and Google are much better at crawling. It is best to avoid session IDs, sending bots cookies, or using many variables in the URL strings. MSN is nowhere near as comprehensive as Yahoo! or Google at crawling deeply through large sites like eBay.com or Amazon.com.
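One way to act on the advice about session IDs and URL variables is to give every page a single, stable URL. The sketch below strips session-style parameters and sorts the rest using Python's standard `urllib.parse` module. The parameter names in `SESSION_PARAMS` and the function name `crawl_friendly_url` are hypothetical; adjust them to whatever your own site actually emits.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of session-style parameter names; adjust for your own site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def crawl_friendly_url(url):
    """Drop session-ID parameters and sort the rest so each page has one URL."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

For example, `crawl_friendly_url("http://example.com/item?sid=abc123&id=7&cat=2")` returns `http://example.com/item?cat=2&id=7`, so a spider that arrives with different session IDs still sees one URL per page.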

•c)                   Query Processing

I believe MSN might be a bit better than Yahoo! at processing queries for meaning instead of taking them quite so literally, but I do not believe they are as good as Google is at it.

While MSN offers a tool that estimates how commercial a page or query is I think their lack of ability to distinguish quality links from low quality links makes their results exceptionally biased toward commercial results.

•d)                   Link Reputation

By the time Microsoft got in the search game the web graph was polluted with spammy and bought links. Because of this, and Microsoft’s limited crawling history, they are not as good as the other major search engines at telling the difference between real organic citations and low quality links.

MSN search reacts much more quickly than the other engines at ranking new sites due to link bursts. Sites with relatively few quality links that gain enough descriptive links are able to quickly rank in MSN. I have seen sites rank for one of the top few dozen most expensive phrases on the net in about a week.

•e)                   Page vs Site

I think all major search engines consider site authority when evaluating individual pages, but with MSN it seems as though you do not need to build as much site authority as you would to rank well in the other engines.

•f)                    Site Age

Due to MSN’s limited crawling history and the web graph being highly polluted before they got into search they are not as good as the other engines at determining age related trust scores. New sites doing general textbook SEO and acquiring a few descriptive inbound links (perhaps even low quality links) can rank well in MSN within a month.

•g)                   Paid Search

Microsoft’s paid search product, AdCenter, is the most advanced search ad platform on the web. Like Google, MSN ranks ads based on both max bid price and ad clickthrough rate. In addition to those relevancy factors MSN also allows you to place adjustable bids based on demographic details. For example, a mortgage lead from a wealthy older person might be worth more than an equivalent search from a younger and poorer person.
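The difference between ranking ads by bid alone and ranking by bid and clickthrough rate together can be sketched as follows. The text only says that both factors are used; multiplying them is an assumption made here for illustration, as are the `bid_multiplier` argument (a stand-in for demographic bid adjustments), the function name `rank_ads`, and the sample ads.

```python
def rank_ads(ads, bid_multiplier=None):
    """Order ads by (adjusted max bid) x (clickthrough rate), highest first."""
    bid_multiplier = bid_multiplier or {}

    def score(ad):
        # Assumed scoring rule: the source does not give the real formula.
        adjusted_bid = ad["max_bid"] * bid_multiplier.get(ad["name"], 1.0)
        return adjusted_bid * ad["ctr"]

    return [ad["name"] for ad in sorted(ads, key=score, reverse=True)]


ads = [
    {"name": "A", "max_bid": 2.00, "ctr": 0.01},  # highest bid, low clickthrough
    {"name": "B", "max_bid": 1.00, "ctr": 0.05},  # lower bid, high clickthrough
    {"name": "C", "max_bid": 1.50, "ctr": 0.02},
]
```

Under this rule ad B ranks first even though A bids the most, whereas a pure open auction would order the ads A, C, B. Passing `bid_multiplier={"A": 3.0}` models an advertiser raising its bid for a more valuable audience, which moves A back to the top.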

•h)                   Editorial

All major search engines have internal relevancy measurement teams. MSN seems to be highly lacking in this department, or they are trying to use the fact that their search results are spammy as a marketing angle.

MSN is running many promotional campaigns to try to get people to try out MSN Search, and in many cases some of the searches they are sending people to have bogus spam or pornography type results in them. A good example of this is when they used Stacey Kiebler to market their Celebrity Maps product. As of writing this, their top search result for Stacey Kiebler is still pure spam.

Based on MSN’s lack of feedback or concern toward the obvious search spam noted above on a popular search marketing community site I think MSN is trying to automate much of their spam detection, but it is not a topic you see people talk about very often. Here are MSN’s Guidelines for Successful Indexing, but they still have a lot of spam in their search results. ;)

•i)                    Social Aspects

Microsoft continues to lag in understanding what the web is about. Executives there should read The Cluetrain Manifesto. Twice. Or maybe three times.

They don’t get the web. They are a software company posing as a web company.

They launch many products as though they have the market stranglehold monopolies they once enjoyed, and as though they are not rapidly losing them. Many of Microsoft’s most innovative moves get little coverage because when they launch key products they often launch them without supporting other browsers and trying to lock you into logging in to Microsoft.

•j)                    MSN SEO Tools

MSN has a wide array of new and interesting search marketing tools. Their biggest limiting factor with them is that they have limited search market share.

Some of the more interesting tools are

- Keyword Search Funnel Tool – shows terms that people search for before or after they search for a particular keyword
- Demographic Prediction Tool – predicts the demographics of searchers by keyword or site visitors by website
- Online Commercial Intention Detection Tool – estimates the probability of a search query or web page being commercial, informational-transactional, or
- Search Result Clustering Tool – clusters search results based on related topics

You can view more of their tools under the demo section at Microsoft’s Adlab.

•VII.     Google Search

Google sprang out of a Stanford research project to find authoritative link sources on the web. In January of 1996 Larry Page and Sergey Brin began working on BackRub.

After they tried shopping the Google search technology to no avail they decided to set up their own search company. Within a few years of forming the company they won distribution partnerships with AOL and Yahoo! that helped build their brand as the industry leader in search. Traditionally search was viewed as a loss leader.

Google did not have a profitable business model until the third iteration of their popular AdWords advertising program in February of 2002, and was worth over 100 billion dollars by the end of 2005.

•a)                   On Page Content

If a phrase is obviously targeted (ie: the exact same phrase is in most of the following locations: in most of your inbound links, internal links, at the start of your page title, at the beginning of your first page header, etc.) then Google may filter the document out of the search results for that phrase. Other search engines may have similar algorithms, but if they do those algorithms are not as sophisticated or aggressively deployed as those used by Google.

Google is scanning millions of books, which should help them create an algorithm that is pretty good at differentiating real text patterns from spammy manipulative text (although I have seen many garbage content cloaked pages ranking well in Google, especially for 3 and 4 word search queries).

You need to write naturally and make your copy look more like a news article than a heavily SEOed page if you want to rank well in Google. Sometimes using fewer occurrences of the phrase you want to rank for will be better than using more.

You also want to sprinkle modifiers and semantically related text in your pages that you want to rank well in Google.

Some of Google’s content filters may look at pages on a page by page basis while others may look across a site or a section of a site to see how similar different pages on the same site are. If many pages are exceptionally similar to content on your own site or content on other sites Google may be less willing to crawl those pages and may throw them into their supplemental index. Pages in the supplemental index rarely rank well, since generally they are trusted far less than pages in the regular search index.

Duplicate content detection is not just based on some magical percentage of similar content on a page, but is based on a variety of factors. Both Bill Slawski and Todd Malicoat offer great posts about duplicate content detection. This shingles PDF explains some duplicate content detection techniques.
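To make the shingle idea concrete, here is a minimal sketch of shingle-based similarity. It is an illustration only: the shingle width of 4 words, the function names, and the use of plain Jaccard similarity are assumptions, and nothing here is confirmed to be what Google or any other engine actually runs.

```python
def shingles(text, k=4):
    """Return the set of k-word shingles (contiguous word windows) in text.

    k=4 is an arbitrary illustrative choice, not a documented setting.
    """
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle holding all of their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(a, b, k=4):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty documents are treated as identical
    return len(sa & sb) / len(sa | sb)
```

Two pages that share most of their word windows score close to 1.0, while unrelated pages score close to 0.0. As the text notes, a real filter would combine a score like this with other factors rather than apply a single cut-off percentage.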

•b)                   Crawling

While Google is more efficient at crawling than competing engines, it appears as though with Google’s BigDaddy update they are looking at both inbound and outbound link quality to help set crawl priority, crawl depth, and whether or not a site even gets crawled at all. To quote Matt Cutts:

The sites that fit “no pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling.

In the past crawl depth was generally a function of PageRank (PageRank is a measure of link equity – and the more of it you had the better you would get indexed), but now adding in this crawl penalty for having an excessive portion of your inbound or outbound links pointing into low quality parts of the web creates an added cost which makes dealing in spammy low quality links far less appealing for those who want to rank in Google.

•c)                   Query Processing

While I mentioned above that Yahoo! seemed to have a bit of a bias toward commercial search results it is also worth noting that Google’s organic search results are heavily biased toward informational websites and web pages.

Google is much better than Yahoo! or MSN at determining the true intent of a query and trying to match that instead of doing direct text matching. Common words like “how to” may be significantly deweighted compared to other terms in the search query that provide better discrimination value.
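One common way to express the “discrimination value” of a term is inverse document frequency (IDF), where words that appear in almost every document get a low weight. The sketch below is a generic textbook-style illustration, not Google’s actual formula; the function name, the smoothing constants, and the sample documents are all assumptions.

```python
import math


def idf_weights(query_terms, documents):
    """Weight each query term by how rare it is across the documents.

    Uses a smoothed IDF, log((N + 1) / (df + 1)) + 1, chosen here only
    for illustration.
    """
    n = len(documents)
    doc_sets = [set(d.lower().split()) for d in documents]
    weights = {}
    for term in query_terms:
        df = sum(1 for words in doc_sets if term in words)
        weights[term] = math.log((n + 1) / (df + 1)) + 1.0
    return weights


docs = [
    "how to bake bread",
    "how to fix a bike",
    "how to feed a sourdough starter",
]
weights = idf_weights(["how", "to", "sourdough"], docs)
```

Here “how” and “to” occur in every document and receive the minimum weight, while “sourdough” occurs in only one and is weighted more heavily.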

Google and some of the other major search engines may try to answer many common related questions to the concept being searched for. For example, in a given set of search results you may see any of the following:

a relevant .gov and/or .edu document
a recent news article about the topic
a page from a well known directory such as DMOZ or the Yahoo! Directory
a page from Wikipedia
an archived page from an authority site about the topic
the authoritative document about the history of the field and recent changes
a smaller hyper focused authority site on the topic
a PDF report on the topic
a relevant Amazon, eBay, or shopping comparison page on the topic
one of the most well branded and well known niche retailers catering to that market
product manufacturer or wholesaler sites
a blog post / review from a popular community or blog site about a slightly broader field

Some of the top results may answer specific relevant queries or be hard to beat, while others might be easy to compete with. You just have to think of how and why each result was chosen to be in the top 10 to learn which one you will be competing against and which ones may perhaps fall away over time.

•d)                   Link Reputation

PageRank is a weighted measure of link popularity, but Google’s search algorithms have moved far beyond just looking at PageRank.

As mentioned above, gaining an excessive number of low quality links may hurt your ability to get indexed in Google, so stay away from known spammy link exchange hubs and other sources of junk links. I still sometimes get a few junk links, but I make sure that I try to offset any junky link by getting a greater number of good links.

If your site ranks well some garbage automated links will end up linking to you whether you like it or not. Don’t worry about those links; just worry about trying to get a few real high quality editorial links.

Google is much better at being able to determine the difference between real editorial citations and low quality, spammy, bought, or artificial links.

When determining link reputation Google (and other engines) may look at

link age
rate of link acquisition
anchor text diversity
deep link ratio
link source quality (based on who links to them and who else they link to)
whether links are editorial citations in real content (or if they are on spammy pages or near other obviously non-editorial links)
whether anybody actually clicks on the link

It is generally believed that .edu and .gov links are trusted highly in Google because they are generally harder to influence than the average .com link, but keep in mind that there are some junky .edu links too (I have seen stuff like .edu casino link exchange directories).

When getting links for Google it is best to look in virgin lands that have not been combed over heavily by other SEOs. Either get real editorial citations or get citations from quality sites that have not yet been abused by others. Google may strip the ability to pass link authority (even from quality sites) if those sites are known obvious link sellers or other types of link manipulators. Make sure you mix up your anchor text and get some links with semantically related text.

Google likely collects usage data via Google search, Google Analytics, Google AdWords, Google AdSense, Google News, Google Accounts, Google Notebook, Google Calendar, Google Talk, Google’s feed reader, Google search history annotations, and Gmail. They also created a Firefox browser bookmark synch tool and an anti-phishing tool which is built into Firefox, and they have a relationship with Opera (another web browser company). Most likely they can lay some of this data over the top of the link graph to record a corroborating source of the legitimacy of the linkage data. Other search engines may also look at usage data.

•e)                   Page vs Site

Sites need to earn a certain amount of trust before they can rank for competitive search queries in Google. If you put up a new page on a new site and expect it to rank right away for competitive terms you are probably going to be disappointed.

If you put that exact same content on an old trusted domain and link to it from another page on that domain it can leverage the domain trust to quickly rank and bypass the concept many people call the Google Sandbox.

Many people have been exploiting this algorithmic hole by throwing up spammy subdomains on free hosting sites or other authoritative sites that allow users to sign up for a cheap or free publishing account. This is polluting Google’s SERPs pretty badly, so they are going to have to make some major changes on this front pretty soon.

•f)                    Site Age

Google filed a patent about information retrieval based on historical data which stated many of the things they may look for when determining how much to trust a site. Many of the things I mentioned in the link section above are relevant to site age related trust (i.e., to be well trusted due to site age you need to have at least some link trust score and some age score).

I have seen some old sites with exclusively low quality links rank well in Google based primarily on their site age, but if a site is old AND has powerful links it can go a long way to helping you rank just about any page you write (so long as you write it fairly naturally).

Older trusted sites may also be given a pass on many things that would cause newer lesser trusted sites to be demoted or de-indexed.

The Google Sandbox is a concept many SEOs mention frequently. The idea of the ‘box is that new sites that should be relevant struggle to rank for some queries they would be expected to rank for. While some people have debunked the existence of the sandbox as garbage, Google’s Matt Cutts said in an interview that they did not intentionally create the sandbox effect, but that it was created as a side effect of their algorithms:

“I think a lot of what’s perceived as the sandbox is artefacts where, in our indexing, some data may take longer to be computed than other data.”

•g)                   Paid Search

Google AdWords factors max bid price and clickthrough rate into their ad algorithm. In addition, they automate the review of landing page quality and use it as another factor in their ad relevancy algorithm, to reduce the amount of arbitrage and other noisy signals in the AdWords program.

The Google AdSense program is an extension of Google AdWords which offers a vast ad network across many content websites that distribute contextually relevant Google ads. These ads are sold on a cost per click or flat rate CPM basis.

•h)                   Editorial

Google is known to be far more aggressive with their filters and algorithms than the other search engines are. They are known to throw the baby out with the bath water quite often. They flat out despise relevancy manipulation, and have shown they are willing to trade some short term relevancy if it guides people along toward making higher quality content.

Short term if your site is filtered out of the results during an update it may be worth looking into common footprints of sites that were hurt in that update, but it is probably not worth changing your site structure and content format over one update if you are creating true value add content that is aimed at your customer base. Sometimes Google goes too far with their filters and then adjusts them back.

Google published their official webmaster guidelines and their thoughts on SEO. Matt Cutts is also known to publish SEO tips on his personal blog. Keep in mind that Matt’s job as Google’s search quality leader may bias his perspective a bit.

Google Sitemaps gives you a bit of useful information from Google about which keywords your site is ranking for and which keywords lead people to click on your listing.

•i)                    Social Aspects

Google allows people to write notes about different websites they visit using Google Notebook. Google also allows you to mark and share your favorite feeds and posts. Google also lets you flavorize search boxes on your site to be biased towards the topics your website covers.

Google is not as entrenched in the social aspects of search as Yahoo! is, but Google seems to throw out many more small tests hoping that one will perhaps stick. They are trying to make software more collaborative and trying to get people to share things like spreadsheets and calendars, while also integrating chat into email. If they can create a framework where things mesh well they may be able to gain further marketshare by offering free productivity tools.

•j)                    Google SEO Tools

Google Sitemaps – helps you determine if Google is having problems indexing your site
AdWords Keyword Tool – shows keywords related to an entered keyword, web page, or web site
AdWords Traffic Estimator – estimates the bid price required to rank #1 on 85% of Google AdWords ads near searches on Google, and how much traffic an AdWords ad would drive
Google Suggest – auto completes search queries based on the most common searches starting with the characters or words you have entered
Google Trends – shows multi-year search trends
Google Sets – creates semantically related keyword sets based on keyword(s) you enter
Google Zeitgeist – shows quickly rising and falling search queries
Google related sites – shows sites that Google thinks are related to your site (related:www.site.com)
Google related word search – shows terms semantically related to a keyword (~term -term)

•k)                   Business Perspectives

Google has the largest search distribution, the largest ad network, and by far the most efficient search ad auction. They have aggressively extended their brand and amazing search distribution network through partnerships with small web publishers, traditional media companies, portals like AOL, computer and other hardware manufacturers such as Dell, and popular web browsers such as Firefox and Opera.

I think Google’s biggest strength is also their biggest weakness. With some aspects of business they are exceptionally idealistic. While that may provide them an amazingly cheap marketing vehicle for spreading their messages and core beliefs it could also be part of what unravels Google.

As they throw out bits of their relevancy in an attempt to keep their algorithm hard to manipulate they create holes where competing search businesses can become more efficient.

In the real world there are celebrity endorsements. Google’s idealism, and the hatred of bought links and other things that act like online celebrity endorsements that comes with it, may leave holes in their algorithms, business model, and business philosophy that allow a competitor to sneak in and grab a large segment of the market by making the celebrity endorsement factor part of the way that businesses are marketed.

•VIII.     Ask Search

Ask was originally created as Ask Jeeves, and was founded by Garrett Gruener and David Warthen in 1996 and launched in April of 1997. It was a natural query processing engine that used editors to match common search queries, and backfilled the search results via a meta search engine that searched other popular engines.

As the web scaled and other search technologies improved Ask Jeeves tried using other technologies, such as Direct Hit (which roughly based popularity on page views until it was spammed to death), and then in 2001 they acquired Teoma, which is the core search technology they still use today. In March of 2005 InterActive Corp. announced they were buying Ask Jeeves, and by March of 2006 they dumped Jeeves, changing the brand to Ask.

•a)                   On Page Content

For topics where there is a large community Ask is good at matching concepts and authoritative sources. Where those communities do not exist Ask relies a bit much on the on page content and is pretty susceptible to repetitive keyword dense search spam.

•b)                   Crawling

Ask is generally slower at crawling new pages and sites than the other major engines are. They also own Bloglines, which gives them incentive to quickly index popular blog content and other rapidly updated content channels.

•c)                   Query Processing

I believe Ask has a heavy bias toward topical authority sites independent of anchor text or on the page content. This has a large effect on the result set they provide for any query in that it creates a result set that is more conceptually and community oriented than keyword oriented.

•d)                   Link Reputation

Ask is focused on topical communities using a concept they call Subject-Specific Popularity℠. This means that if you are entering a saturated or hyper saturated field, Ask will generally be one of the slowest engines to rank your site, since they will only trust it after many topical authorities have shown they trust it by citing it. Due to their heavy bias toward topical communities, for generic search they seem to weigh how many quality related citations you have far more than anchor text. For queries where there is not much of a topical community their relevancy algorithms are nowhere near as sharp.

•e)                   Page vs Site

Pages on a well referenced trusted site tend to rank better than one would expect. For example, I saw some spammy press releases on a popular press release site ranking well for some generic SEO related queries. Presumably many companies link to some of their press release pages and this perhaps helps those types of sites be seen as community hubs.

•f)                    Site Age

Directly I do not believe it is much of a factor. Indirectly I believe it is important in that it usually takes some finite amount of time to become a site that is approved by your topical peers.

•g)                   Paid Search

Ask gets most of their paid search ads from Google AdWords. Some ad buyers in verticals where Ask users convert well may also want to buy ads directly from Ask. Ask will only place their internal ads above the Google AdWords ads if they feel the internal ads will bring in more revenue.

•h)                   Editorial

Ask heavily relies upon the topical communities and industry experts to in essence be the editors of their search results. They give an overview of their ExpertRank technology on their web search FAQ page. While they have such limited distribution that few people talk about their search spam policies they reference a customer feedback form on their editorial guidelines page.

•i)                    Social Aspects

Ask is a true underdog in the search space. While they offer Bloglines and many of the save a search personalization type features that many other search companies offer they do not have the critical mass of users that some of the other major search companies have.

•j)                    Ask SEO Tools

Ask search results show related search phrases in the right hand column. Due to the nature of their algorithms Ask is generally not good at offering link citation searches, but recently their Bloglines service has allowed you to look for blog citations by authority, date, or relevance.

•IX.     Technical Working of a Search Engine – Taking Google as example

•1)     Google Architecture Overview

 

In this section, we will give a high level overview of how the whole system works as pictured in Figure below. Further sections will discuss the applications and data structures not mentioned in this section. Most of Google is implemented in C or C++ for efficiency and can run in either Solaris or Linux.

 

 

In Google, the web crawling (downloading of web pages) is done by several distributed crawlers. There is a URLserver that sends lists of URLs to be fetched to the crawlers. The web pages that are fetched are then sent to the storeserver. The storeserver then compresses and stores the web pages into a repository. Every web page has an associated ID number called a docID which is assigned whenever a new URL is parsed out of a web page. The indexing function is performed by the indexer and the sorter. The indexer performs a number of functions. It reads the repository, uncompresses the documents, and parses them. Each document is converted into a set of word occurrences called hits. The hits record the word, position in document, an approximation of font size, and capitalization. The indexer distributes these hits into a set of “barrels”, creating a partially sorted forward index. The indexer performs another important function. It parses out all the links in every web page and stores important information about them in an anchors file. This file contains enough information to determine where each link points from and to, and the text of the link.

The URLresolver reads the anchors file and converts relative URLs into absolute URLs and in turn into docIDs. It puts the anchor text into the forward index, associated with the docID that the anchor points to. It also generates a database of links which are pairs of docIDs. The links database is used to compute PageRanks for all the documents.
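As an illustration of how a links database of docID pairs can feed a PageRank computation, here is a small power-iteration sketch. It is a simplified textbook version and not Google’s implementation; the damping factor of 0.85, the fixed iteration count, the handling of pages with no outgoing links, and the function name are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank from a list of (source docID, target docID) pairs.

    damping and iterations are illustrative defaults, not values taken
    from the text above.
    """
    nodes = sorted({doc for pair in links for doc in pair})
    outgoing = {doc: [] for doc in nodes}
    for src, dst in links:
        outgoing[src].append(dst)

    n = len(nodes)
    rank = {doc: 1.0 / n for doc in nodes}
    for _ in range(iterations):
        new_rank = {doc: (1.0 - damping) / n for doc in nodes}
        for src in nodes:
            targets = outgoing[src]
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Assumption: a page with no out-links spreads its rank evenly.
                share = damping * rank[src] / n
                for doc in nodes:
                    new_rank[doc] += share
        rank = new_rank
    return rank


# docIDs 1 and 2 both link to 3, and 3 links back to 1.
ranks = pagerank([(1, 3), (2, 3), (3, 1)])
```

In this toy graph docID 3 ends up with the highest rank because two pages point to it, and docID 2 ends up lowest because nothing links to it.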

The sorter takes the barrels, which are sorted by docID, and resorts them by wordID to generate the inverted index. This is done in place so that little temporary space is needed for this operation. The sorter also produces a list of wordIDs and offsets into the inverted index. A program called DumpLexicon takes this list together with the lexicon produced by the indexer and generates a new lexicon to be used by the searcher. The searcher is run by a web server and uses the lexicon built by DumpLexicon together with the inverted index and the PageRanks to answer queries.
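The sorter’s job of turning a docID-ordered forward barrel into a wordID-ordered inverted index can be sketched as follows. This is an in-memory simplification: the real sorter is described above as working in place on barrels, whereas this example builds a new dictionary, and the data layout and function name are assumptions.

```python
from collections import defaultdict


def invert(forward_barrel):
    """Turn a forward barrel into an inverted index.

    forward_barrel: list of (docID, [(wordID, hit_list), ...]) entries.
    Returns {wordID: [(docID, hit_list), ...]} with wordIDs in ascending
    order and each posting list ordered by docID.
    """
    postings = defaultdict(list)
    for doc_id, entries in forward_barrel:
        for word_id, hits in entries:
            postings[word_id].append((doc_id, hits))
    return {
        word_id: sorted(plist, key=lambda posting: posting[0])
        for word_id, plist in sorted(postings.items())
    }


forward = [
    (1, [(10, [0, 5]), (20, [1])]),
    (2, [(10, [3])]),
]
inverted = invert(forward)
```

After inversion, wordID 10 maps to postings for docIDs 1 and 2, which is the shape the searcher needs in order to answer a query for that word.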

 

•2)     Major Data Structures

 

Google’s data structures are optimized so that a large document collection can be crawled, indexed, and searched with little cost. Although CPUs and bulk input/output rates have improved dramatically over the years, a disk seek still requires about 10 ms to complete. Google is designed to avoid disk seeks whenever possible, and this has had a considerable influence on the design of the data structures.

•a)                   BigFiles

 

BigFiles are virtual files spanning multiple file systems and are addressable by 64 bit integers. The allocation among multiple file systems is handled automatically. The BigFiles package also handles allocation and deallocation of file descriptors, since the operating systems do not provide enough for our needs. BigFiles also support rudimentary compression options.

•b)                    Repository

  

The repository contains the full HTML of every web page. Each page is compressed using zlib. The choice of compression technique is a tradeoff between speed and compression ratio. We chose zlib’s speed over a significant improvement in compression offered by bzip. The compression rate of bzip was approximately 4 to 1 on the repository as compared to zlib’s 3 to 1 compression. In the repository, the documents are stored one after the other and are prefixed by docID, length, and URL as can be seen in Figure below. The repository requires no other data structures to be used in order to access it. This helps with data consistency and makes development much easier; we can rebuild all the other data structures from only the repository and a file which lists crawler errors.
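A repository of this kind can be sketched with Python’s standard zlib and struct modules. The record layout below (an 8-byte docID, a 4-byte URL length, and a 4-byte compressed page length, followed by the URL and the compressed HTML) is a hypothetical layout chosen for the example; the text only says that each document is prefixed by docID, length, and URL.

```python
import struct
import zlib

# Assumed header layout: docID, URL length, compressed page length.
HEADER = struct.Struct("<QII")


def append_record(repo, doc_id, url, html):
    """Append one compressed page to the repository (a bytearray)."""
    url_bytes = url.encode("utf-8")
    packed = zlib.compress(html.encode("utf-8"))
    repo.extend(HEADER.pack(doc_id, len(url_bytes), len(packed)))
    repo.extend(url_bytes)
    repo.extend(packed)


def read_records(repo):
    """Yield (docID, URL, HTML) by scanning the repository front to back."""
    offset = 0
    while offset < len(repo):
        doc_id, url_len, page_len = HEADER.unpack_from(repo, offset)
        offset += HEADER.size
        url = bytes(repo[offset:offset + url_len]).decode("utf-8")
        offset += url_len
        page = bytes(repo[offset:offset + page_len])
        offset += page_len
        yield doc_id, url, zlib.decompress(page).decode("utf-8")


repo = bytearray()
append_record(repo, 1, "http://a.example/", "<html>first page</html>")
append_record(repo, 2, "http://b.example/", "<html>second page</html>")
pages = list(read_records(repo))
```

Because every record carries its own lengths, the whole repository can be read back sequentially with no side index, which matches the point that the other data structures can be rebuilt from the repository alone.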

•c)                   Document Index

 

The document index keeps information about each document. It is a fixed width ISAM (Index sequential access mode) index, ordered by docID. The information stored in each entry includes the current document status, a pointer into the repository, a document checksum, and various statistics. If the document has been crawled, it also contains a pointer into a variable width file called docinfo which contains its URL and title. Otherwise the pointer points into the URLlist which contains just the URL. This design decision was driven by the desire to have a reasonably compact data structure, and the ability to fetch a record in one disk seek during a search.

Additionally, there is a file which is used to convert URLs into docIDs. It is a list of URL checksums with their corresponding docIDs and is sorted by checksum. In order to find the docID of a particular URL, the URL’s checksum is computed and a binary search is performed on the checksums file to find its docID. URLs may be converted into docIDs in batch by doing a merge with this file. This is the technique the URLresolver uses to turn URLs into docIDs. This batch mode of update is crucial because otherwise we must perform one seek for every link, which, assuming one disk, would take more than a month for our 322 million link dataset.
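The “more than a month” figure follows from the seek time quoted earlier: at roughly 10 ms per seek, 322 million seeks take about 3.22 million seconds, or a little over 37 days. The sketch below shows both the single-URL binary search and the batch merge over a sorted checksum list. CRC-32 is used only as a stand-in checksum, and the function names and the in-memory list are assumptions made for illustration.

```python
import bisect
import zlib


def url_checksum(url):
    """Stand-in checksum; the actual checksum function is not specified."""
    return zlib.crc32(url.encode("utf-8"))


def build_checksum_file(url_to_docid):
    """Return (checksum, docID) pairs sorted by checksum."""
    return sorted((url_checksum(u), d) for u, d in url_to_docid.items())


def lookup_docid(checksum_file, url):
    """Binary search for one URL; returns None if it is not present."""
    target = url_checksum(url)
    i = bisect.bisect_left(checksum_file, (target,))
    if i < len(checksum_file) and checksum_file[i][0] == target:
        return checksum_file[i][1]
    return None


def resolve_batch(checksum_file, urls):
    """Merge-style batch resolution: one sequential pass over both lists."""
    wanted = sorted((url_checksum(u), u) for u in urls)
    resolved, i = {}, 0
    for target, url in wanted:
        while i < len(checksum_file) and checksum_file[i][0] < target:
            i += 1
        if i < len(checksum_file) and checksum_file[i][0] == target:
            resolved[url] = checksum_file[i][1]
    return resolved


table = build_checksum_file({"http://a.example/": 7, "http://b.example/": 9})
```

The batch version never moves backwards through the checksum list, which is what lets a disk-based implementation replace one seek per link with a single sequential read.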

•d)                   Lexicon

 

The lexicon has several different forms. One important change from earlier systems is that the lexicon can fit in memory for a reasonable price. In the current implementation we can keep the lexicon in memory on a machine with 256 MB of main memory. The current lexicon contains 14 million words (though some rare words were not added to the lexicon). It is implemented in two parts — a list of the words (concatenated together but separated by nulls) and a hash table of pointers. For various functions, the list of words has some auxiliary information which is beyond the scope of this paper to explain fully.
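The two-part layout described here, a null-separated word list plus a table of pointers into it, might look like the following toy version. A Python dict stands in for the hash table, and the function names are hypothetical; the auxiliary per-word information mentioned in the text is not modeled.

```python
def build_lexicon(words):
    """Return (blob, offsets): a null-separated word list and a pointer table."""
    blob = bytearray()
    offsets = {}
    for word in words:
        if word in offsets:
            continue  # each distinct word is stored once
        offsets[word] = len(blob)
        blob.extend(word.encode("utf-8") + b"\x00")
    return bytes(blob), offsets


def word_at(blob, offset):
    """Read the word that starts at the given offset in the word list."""
    end = blob.index(b"\x00", offset)
    return blob[offset:end].decode("utf-8")


blob, offsets = build_lexicon(["web", "crawler", "web"])
```

Storing the words back to back avoids per-string overhead, which is one plausible reason a 14 million word lexicon can be held in main memory.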

•e)                   Hit Lists

A hit list corresponds to a list of occurrences of a particular word in a particular document including position, font, and capitalization information. Hit lists account for most of the space used in both the forward and the inverted indices. Because of this, it is important to represent them as efficiently as possible. We considered several alternatives for encoding position, font, and capitalization — simple encoding (a triple of integers), a compact encoding (a hand optimized allocation of bits), and Huffman coding. In the end we chose a hand optimized compact encoding since it required far less space than the simple encoding and far less bit manipulation than Huffman coding. The details of the hits are shown in Figure below.

 

 

Our compact encoding uses two bytes for every hit. There are two types of hits: fancy hits and plain hits. Fancy hits include hits occurring in a URL, title, anchor text, or meta tag. Plain hits include everything else. A plain hit consists of a capitalization bit, font size, and 12 bits of word position in a document (all positions higher than 4095 are labeled 4096). Font size is represented relative to the rest of the document using three bits (only 7 values are actually used because 111 is the flag that signals a fancy hit). A fancy hit consists of a capitalization bit, the font size set to 7 to indicate it is a fancy hit, 4 bits to encode the type of fancy hit, and 8 bits of position. For anchor hits, the 8 bits of position are split into 4 bits for position in anchor and 4 bits for a hash of the docID the anchor occurs in. This gives us some limited phrase searching as long as there are not that many anchors for a particular word. We expect to update the way that anchor hits are stored to allow for greater resolution in the position and docIDhash fields. We use font size relative to the rest of the document because when searching, you do not want to rank otherwise identical documents differently just because one of the documents is in a larger font.
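The two-byte hit layout can be illustrated with a little bit packing. The bit order used here (capitalization in the top bit, then font size, then the remaining fields) is an assumption, since the text gives only the field widths. Positions that do not fit are clamped to the largest value the field can hold (4095 for plain hits), which is this sketch’s reading of the overflow rule above.

```python
FANCY_FONT = 0b111      # font value reserved to flag a fancy hit
MAX_PLAIN_POS = 0xFFF   # largest value that fits in 12 bits (4095)
MAX_FANCY_POS = 0xFF    # largest value that fits in 8 bits


def pack_plain_hit(capitalized, font_size, position):
    """Pack a plain hit: 1 capitalization bit, 3 font bits, 12 position bits."""
    if not 0 <= font_size < FANCY_FONT:
        raise ValueError("plain hits use font sizes 0..6")
    position = min(position, MAX_PLAIN_POS)
    return (int(capitalized) << 15) | (font_size << 12) | position


def pack_fancy_hit(capitalized, hit_type, position):
    """Pack a fancy hit: 1 cap bit, font=7, 4 type bits, 8 position bits."""
    if not 0 <= hit_type <= 0xF:
        raise ValueError("fancy hit type must fit in 4 bits")
    position = min(position, MAX_FANCY_POS)
    return (int(capitalized) << 15) | (FANCY_FONT << 12) | (hit_type << 8) | position


def unpack_hit(hit):
    """Decode a 16-bit hit back into its fields."""
    capitalized = bool(hit >> 15)
    font = (hit >> 12) & 0b111
    if font == FANCY_FONT:
        return {"fancy": True, "capitalized": capitalized,
                "type": (hit >> 8) & 0xF, "position": hit & 0xFF}
    return {"fancy": False, "capitalized": capitalized,
            "font_size": font, "position": hit & 0xFFF}
```

Every packed value fits in 16 bits, and the font field doubles as the plain/fancy discriminator, so no extra flag bit is needed. The split of the fancy position field into anchor position and docID hash for anchor hits is left out of this sketch.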

 

The length of a hit list is stored before the hits themselves. To save space, the length of the hit list is combined with the wordID in the forward index and the docID in the inverted index. This limits it to 8 and 5 bits respectively (there are some tricks which allow 8 bits to be borrowed from the wordID). If the length is longer than would fit in that many bits, an escape code is used in those bits, and the next two bytes contain the actual length.

•f)                    Forward Index

 

The forward index is actually already partially sorted. It is stored in a number of barrels (we used 64). Each barrel holds a range of wordID’s. If a document contains words that fall into a particular barrel, the docID is recorded into the barrel, followed by a list of wordID’s with hitlists which correspond to those words. This scheme requires slightly more storage because of duplicated docIDs but the difference is very small for a reasonable number of buckets and saves considerable time and coding complexity in the final indexing phase done by the sorter. Furthermore, instead of storing actual wordID’s, we store each wordID as a relative difference from the minimum wordID that falls into the barrel the wordID is in. This way, we can use just 24 bits for the wordID’s in the unsorted barrels, leaving 8 bits for the hit list length.
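The relative wordID trick can be shown with a 32-bit entry that holds a 24-bit offset from the barrel’s minimum wordID and an 8-bit hit list length. The field order and function names are assumptions for this example, and the escape code used for longer hit lists is not modeled; the sketch simply rejects lengths that do not fit.

```python
WORD_BITS = 24
LENGTH_BITS = 8


def pack_entry(word_id, barrel_min_word_id, hit_count):
    """Pack a forward-index entry as (wordID delta << 8) | hit list length."""
    delta = word_id - barrel_min_word_id
    if not 0 <= delta < (1 << WORD_BITS):
        raise ValueError("wordID does not belong in this barrel")
    if not 0 <= hit_count < (1 << LENGTH_BITS):
        raise ValueError("hit list too long for 8 bits (escape code not modeled)")
    return (delta << LENGTH_BITS) | hit_count


def unpack_entry(entry, barrel_min_word_id):
    """Return (wordID, hit list length) for a packed entry."""
    delta = entry >> LENGTH_BITS
    return delta + barrel_min_word_id, entry & ((1 << LENGTH_BITS) - 1)


entry = pack_entry(1_000_050, 1_000_000, 3)
```

Because each barrel covers a limited range of wordIDs, the offset always fits in 24 bits even when the absolute wordID would not, which is what frees the remaining 8 bits for the length.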

•g)  

Web Design – No worries

How should it be promoted? What should it look like? Can it be done on my own or should a professional be hired to do it?

These are just some of the questions that need to be answered before designing a web site. You can turn to experts in this field for help and have them do the job for you.

Doing it yourself is also an option if you are weighing the expense and the time that can be saved by doing so. There are things that need to be considered in designing your website. And questions, too.

What is the goal of the site? It would be helpful if you know from the start what you want your site to do. Simple as it may seem, you need to get ideas organized into clear details first. Think of the site from the point of view of others and of the impressions they will have upon seeing it.

Putting graphics and pictures into the site as attention-getters is important to keep up with the many sites available nowadays. Having a site does not only mean having information to give and share. It also means creating a work of art that people will be interested enough to see and read through.

What have the others got? By doing your homework and looking up probable competition sites, you can get an edge on what your site should possess.

Do your homework. You can get lessons, feedback and even inspiration from seeing the work of other people. Looking them up does not mean you have to copy them. It means you have to think of other ways to get leverage over the others. Once this has been done, consider yourself on the frontline and be ready to set some trends.

How do you find a good designer? In this case, you have chosen someone to do the designs for you. In finding the right designer, choose someone who understands and is in harmony with what you want your site to be.

It is important to note that some designers want to put their own designs into your site rather than yours. Consider someone who is interested in what you’re doing, who thinks along the same lines as you, and who makes your ideas the center of their goal.

Is it accessible? Make it easy for people to see your site and contact you with any complaints or suggestions. Putting up contact details makes it easier for people to reach not only your site but you as well.

What is there to remember? Keep it simple, from the words to the logos to the graphic designs. People did not come to your site for those, so stick to the more important things.

Basic Web Design tips Ignored

Web researchers found that you have about 2 minutes to make that first impression a good one. Visitors will judge your site in that short window on its professionalism and its appropriateness to what they are looking for.

In fact, a website can lose about one-third of its potential customers due to poor design, according to a recent user study conducted by some professionals.

Take a long hard look at your site. Or ask a friend to give you a brutally honest review of your site. Does it pass the test of professionalism?

Are the graphics of good quality and clear? Is the formatting, font size and font colors consistent throughout the site? Or does your site commit design mistakes that speak amateur as soon as it loads?

There are some common mistakes website owners make that may cause visitors to leave early. What are these?

They post “Under Construction” signs all over the site.

Under Construction signs posted all over the website spell unprofessional in a big way. Seasoned site owners understand the power of patience. They know that timing the launch of your completed website is much more effective than doing it prematurely.

Be patient. Wait until the website is complete before publicizing your site. Doing it this way, your visitors will be impressed and gain trust faster. They won’t feel uneasy and run away because they see amateur stamped all over your site with each Under Construction sign.

Some place brightly colored counters on every page as a badge of honor.

The truth is most everyone knows counters can be set to whatever number you like. If you don’t want to start your counter at zero, you can easily start it at 10,000. It raises a red flag of questions. Therefore, it may repel your visitors faster than it attracts them. Why raise the red flag of questions if you don’t have to?

Look at your in-depth statistics instead if you need to analyze your traffic.

Some websites do not use copyright statements.

Some uninformed site owners don’t know that their copyright is effective the moment their creative work is set in a fixed form. So they fail to put their stamp of ownership on their work.

If you truly own your work, claim it. Post your copyright information at the bottom of every page.

Charlotte Web Design: Submission Tools

Get listing in search engines and directories is not as easy as some small businesses and entrepreneurs may think. This task normally takes a lot of time and commitment, and if you are not aware of the rules and regulations you run the risk of being rejected from being listed. As of lately, Search Engine Optimization professionals and companies who tend to their own websites are beginning to rely on software tools – submission tools

A submission tool is defined as software applications that automatically submit a web site to search engines and directories. There are two types of submission tools available, they include:

Multi-submission tools – submit a website to multiple search engines and directories
Deep submission tools – submit deep pages, which search engine crawlers usually miss, to search engines and directories

When dealing with submission tools, all companies and Search Engine Optimization (SEO) professionals should be very cautious. Certain submission tools try to scam web site designers. These tools make you believe that they are going to submit a site to a large number of directories and search engines for a low fee. Most of these search engines either do not exist or are of no value to a web site.

Why you should invest in a submission tool:

Improve your ranking
Save valuable time
Bring more traffic to your website
Create more backlinks

SEO professionals and small businesses should always thoroughly research each submission tool they come across before actually investing in one.

Stop Wasting Time Searching The Web. Let The Web Come To You

Take a moment and think about how you use the internet. Do you have favorite websites that you visit frequently? How long does it take you to browse those individual websites? Do you get distracted by ads and other miscellaneous items? How long does it take you to determine what information is new versus what information is old? Do you find yourself spending hours on all of your favorite websites trying to determine if new information is available for you to view?

Do you…

- visit news, weather, sports, movie, classifieds, and auction web sites?

- want to know what is happening?

- more importantly, want to know what is happening before others know?

- want to find the latest and greatest updates from your favorite websites?

- use social networking sites and blogs to keep in touch with friends and family?

Do you also…

- spend hours of your precious time checking your favorite news, weather, sports, classified, and auction sites for new information?

- waste time on a website trying to determine what information is new versus what is old?

- become distracted by website ads designed to take you to other websites which only waste more of your precious time?

- forget which websites you have visited that contain useful information?

- stare endlessly at social networking sites and blogs waiting for updates?

Now, let us show you a solution to help you reclaim your precious time and still be updated with the latest and greatest the Internet has to offer. Try Web Notifier Software free for 10 days! Once our software determines that an update has occurred on a website you instructed it to monitor, a message will appear from your taskbar with the title of the change, and you can click on the message to be taken to the website. Not only can our software display the message on the screen, it can also read the message title to you, play a sound file, send a text message (via email), and display the message in your configured message color. You can even configure our software to behave differently for different websites and different website users (for example, users of social networking websites). Go ahead and try our software free for 10 days. If you decide you would like to keep the software, you will be given the opportunity to purchase it for a one-time fee of only $29.99. Our software will notify you when your trial period is over and provide a link for you to make your purchase if you desire.

Try it FREE today!

Colours in professional Web Designing

When designing a website, people always think about getting the right text and pictures on it, but they don’t focus much on the colours. Frankly speaking, colours are just as important as text and pictures. The colours used in a website design make a huge difference. The points below will give you an idea of how to choose the right colours for your website:

Industry Type: A lot depends on the kind of website you have. If it is a corporate website, then you will use sober colours, but if it is a fashion-based website, you would like to use some bright colours.

Readable Text: Colours should be used in such a way that the text is readable. Use contrasting tones for the text and the background.

Background & Foreground: Background and Foreground Colours should always contrast.
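One way to check whether text and background colours contrast enough is to calculate a contrast ratio. The sketch below is only an illustration and is not something the tips above prescribe: it assumes the relative luminance and contrast ratio formulas from the W3C's WCAG 2.0 accessibility guidelines, and the function names are made up for this example. The result runs from 1 (identical colours) to 21 (black on white), and WCAG suggests at least 4.5 for normal body text.

```python
def _linearize(channel: int) -> float:
    """Convert one 0-255 sRGB channel to its linear-light value (WCAG 2.0 formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour written as '#RRGGBB'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colours; the order of the arguments does not matter."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Hypothetical colour pair: dark grey text on a white background.
    print(round(contrast_ratio("#333333", "#FFFFFF"), 2))
```

A pair of colours that scores well below 4.5 is a hint to darken the text or lighten the background.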

Gradient: These days gradients are in and look very smart. Moreover, they look professional as well.

Use of tools available: Kuler is an application where you can create some great colour combinations. The best part is its ease of use. You can also share a colour theme with other people, and users can then vote for your colours.

Use of Multicolours: Be very careful while designing a multicoloured website. All the colours should gel with each other; if they do not, the result will not be a good design for your website.

Logo Colours: Always use logo colour elements in the design. This will not only impress your client but will also gel with the corporate identity of the company.

Web Safe Colours: There are 216 web safe colours in total. While designing a website, one should only use colours which look good and are soothing to the user’s eyes.
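The figure of 216 comes from the way the web safe palette is usually defined: each of the red, green and blue channels takes one of six values (00, 33, 66, 99, CC or FF in hexadecimal), and 6 × 6 × 6 = 216. The short sketch below lists the palette and checks whether a colour belongs to it. The function names are made up for this example.

```python
from itertools import product

# The six channel values used by the conventional web safe palette.
WEB_SAFE_LEVELS = (0x00, 0x33, 0x66, 0x99, 0xCC, 0xFF)


def web_safe_palette() -> list[str]:
    """Return all web safe colours as '#RRGGBB' strings."""
    return [f"#{r:02X}{g:02X}{b:02X}" for r, g, b in product(WEB_SAFE_LEVELS, repeat=3)]


def is_web_safe(hex_colour: str) -> bool:
    """True if every channel of '#RRGGBB' is one of the six web safe values."""
    h = hex_colour.lstrip("#")
    return all(int(h[i:i + 2], 16) in WEB_SAFE_LEVELS for i in (0, 2, 4))


if __name__ == "__main__":
    print(len(web_safe_palette()))
    print(is_web_safe("#FF6633"), is_web_safe("#123456"))
```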

Citec is a professional website design company in Delhi which helps you create great-looking websites using appropriate colours and graphics.

Web Conferencing and President Jackson

The field trip is thought of as a staple of the high school experience. It gives students the opportunity to learn outside of the standard classroom setting. These days, with shrinking budgets and tightening of belts, it is not surprising that schools have to look for ways to save money. Making the most of modest budgets is one thing, though, as it still has to be done in the context of enhancing the learning experience.

With this in mind, some schools could learn a lesson from what is now being done by the Hermitage, former home of America’s seventh President. Recent reports indicate that people who are unable to visit in person can take a tour using web conferencing software tools. While many people tout the economic benefits of web conferencing, it is also an effective teaching tool. All it takes is a visit to The Hermitage website to set up the online tour. This is done after opening an account, and the participants only need to meet standard requirements such as having PCs and Internet access.

Naturally, internet video conferencing offers endless possibilities for historical landmarks, schools, and historical and scientific repositories such as museums. Will this generate a wave of interest in such places among the public? Only time will tell, but it is easy to see how the flagging interest of otherwise occupied communities can be somewhat revived. When people hear the term web conferencing, webinars or learning environments may spring to mind. It is, however, about so much more than the name implies.

Exploring Possibilities

It is true that the benefits of using a web conferencing solution are many. The above example reminds us that even financially strapped entities do have options. Apart from the fact that many teleconferencing options are reasonably priced, they provide new ways of doing business. Undoubtedly, a virtual tour of a museum is logistically preferable for a stressed-out teacher.

There are other advantages to be gained from exploring what is fast becoming an essential work and learning tool. These include:

- The ability to create more engaging content
- In some cases, information can be recorded to be used at a later date
- Better use of time and increased productivity
- Vastly reduced travel costs
- Rapid delivery of information

So will more museums and other places of interest be rushing to set up online conferencing venues? That remains to be seen. Truthfully, it is not an option that is palatable to all institutions of this type. Many may simply be unwilling to give up the personal interaction which they see as vital to the learning experience. On the other hand, if they should decide to follow the example of The Hermitage, they will have no shortage of options. Companies like RHUB Communications have a variety of webinar software products that suit various needs and budgets.