Google: Search Inconsistency in Mainland China Is Caused by a Subset of Queries - Video: How Google Search Works

For the past couple of years, Google Search in mainland China has been accused of being "inconsistent and unreliable," with users regularly getting error messages like "This webpage is not available" or "The connection was reset." Google's Alan Eustace, SVP, Knowledge, said today that the problem is caused by a particular set of queries:

"We've taken a long, hard look at our systems and have not found any problems. However, after digging into user reports, we've noticed that these interruptions are closely correlated with searches for a particular subset of queries," he said.

"In order to figure out which keywords are causing problems, a team of engineers in the U.S. reviewed the 350,000 most popular search queries in China. In their research, they looked at multiple signals to identify the disruptive queries, and from there they identified specific terms at the root of the issue," Eustace explains.

"So starting today we'll notify users in mainland China when they enter a keyword that may cause connection issues. By prompting people to revise their queries, we hope to reduce these disruptions and improve our user experience from mainland China. Of course, if users want to press ahead with their original queries they can carry on," Eustace said.

To learn more, users can click on the "interruption" link, which takes them to this help center article. "They can continue with their original query (which will likely lead to an error message), or click 'Edit search terms,' which will remove the highlighted characters and prompt users to try other search terms."

Per Eustace's post:

We've observed that many of the terms triggering error messages are simple everyday Chinese characters, which can have different meanings in different contexts. For example a search for the single character [江] (Jiāng, a common surname that also means "river") causes a problem on its own, but 江 is also part of other common searches like [丽江] (Lijiang, the name of a city in Yunnan Province), [锦江之星] (the Jinjiang Star hotel chain), and [江苏移动] (Jiangsu Mobile, a mobile phone service). Likewise, searching for [周] (Zhōu, another common surname that also means "week") triggers an error message, so including this character in other searches--like [周杰伦] (Jay Chou, the Taiwanese pop star), [周星驰] (Stephen Chow, a popular comedian from Hong Kong), or any publication that includes the word "week"--would also be problematic.

Now, when a user types in a common term like [长江] (Yangtze River) from China, Google highlights the problem term [江] as they type, and when they press "enter" a drop-down menu appears beneath the search box (see above pic).

In order to avoid connection problems, users can refine their searches without the problem keywords. For example, instead of searching for [长江], they could search for [changjiang]--which also means Yangtze River, but is written using pinyin, the system used to transliterate Chinese characters into Latin script. This won't cause a timeout, but will still generate search results related to the Yangtze River.
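The highlighting and "Edit search terms" behavior described above can be illustrated with a minimal sketch. It is not Google's implementation: the `FLAGGED` set is a hypothetical stand-in containing only the two characters named in Eustace's post, and the function names and the square-bracket highlighting are assumptions made for this example.

```python
# Hypothetical flagged-character set; only the two examples from the post are listed.
FLAGGED = {"江", "周"}


def find_flagged(query, flagged=FLAGGED):
    """Return (position, character) pairs for each flagged character in the query."""
    return [(i, ch) for i, ch in enumerate(query) if ch in flagged]


def highlight(query, flagged=FLAGGED):
    """Mark each flagged character by wrapping it in square brackets."""
    return "".join(f"[{ch}]" if ch in flagged else ch for ch in query)


def edit_search_terms(query, flagged=FLAGGED):
    """Drop the flagged characters, mimicking the described 'Edit search terms' option."""
    return "".join(ch for ch in query if ch not in flagged)


if __name__ == "__main__":
    for q in ["长江", "周杰伦", "changjiang"]:
        print(q, find_flagged(q), highlight(q), edit_search_terms(q))
```

Because the check is per character, any query containing a flagged character is caught regardless of context, which matches the post's point that 江 is a problem inside 长江 as well as on its own. The pinyin query contains no flagged characters and passes through unchanged.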

Those outside China who are curious to see what the notifications look like can visit this link to try it out.

Here is the video explaining what's happening:

In other Google search news, the company today posted a video giving a more visual look at how Google Search works.

"Google has lots of computers that continually visit and analyze web pages it knows about. These computers are collectively known as Googlebot. We give Googlebot an initial set of sites, then send it out to visit those sites. It scans the content, and then follows links to other sites or pages that it finds. It then repeats the process on each page it lands on, and continues to spider out, hence the term "spidering" or "crawling" many use to refer to a search engine's discovery process.

When Googlebot visits a webpage, it downloads and stores a copy (called a "cached" page) to our index. It analyzes each page, noting the words and any other relevant content. Googlebot understands some types of content, like text, better than others, like images or Flash (you can find some ways to make these better understood in Webmaster Academy). In order to perform well when customers search for you, it's important that Googlebot can access and understand the content of your website.

Each time someone searches on Google, our ranking algorithms draw up a list of relevant webpages from the index of information that Googlebot has saved while "crawling" the web. This list is given back as the Google Search results page. To see if your website is included in Google's index, you can use the site search operator, restricting search results to your site's domain. For example a search for [site:youtube.com] would only show results from the website youtube.com," explains Garen Checkly, Search Quality Team.
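The crawl, index, and search steps described in the quote can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Googlebot or Google's ranking works: the `WEB` dictionary is a made-up in-memory stand-in for real pages, there is no ranking, and the `site:` handling matches the host name exactly, which is a simplification of the real operator.

```python
from collections import deque
from urllib.parse import urlparse

# Toy in-memory "web": URL -> (page text, outgoing links). All entries are invented.
WEB = {
    "http://a.example/": ("yangtze river cruise",
                          ["http://a.example/hotels", "http://b.example/"]),
    "http://a.example/hotels": ("river hotels", []),
    "http://b.example/": ("mobile phone service", ["http://a.example/"]),
}


def crawl(seeds, web=WEB):
    """Breadth-first crawl from the seed URLs; returns {url: text} for pages reached."""
    seen, queue, pages = set(seeds), deque(seeds), {}
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # link points to a page that cannot be fetched
        text, links = web[url]
        pages[url] = text  # store a copy of the page
        for link in links:
            if link not in seen:  # follow each link only once
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Build an inverted index mapping each word to the set of URLs containing it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return sorted URLs containing every query word; 'site:host' filters by host."""
    site, words = None, []
    for token in query.lower().split():
        if token.startswith("site:"):
            site = token[len("site:"):]
        else:
            words.append(token)
    if words:
        results = set.intersection(*(index.get(w, set()) for w in words))
    else:
        results = set().union(*index.values())  # no words: every indexed page
    if site:
        results = {u for u in results if urlparse(u).hostname == site}
    return sorted(results)


if __name__ == "__main__":
    index = build_index(crawl(["http://a.example/"]))
    print(search(index, "river"))
    print(search(index, "site:b.example"))
```

Starting from a single seed, the crawler reaches the other two pages by following links, which is the "spidering" the quote describes. A query consisting only of a `site:` token lists every indexed page on that host, mirroring the [site:youtube.com] check mentioned above.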