MC+A Stream

Our Blog and News Stream

Software Update : Google Search Appliance 6.10 P4

July 15th, 2011

Our customers are advised that Google Enterprise has released a patch to Google Search Appliance software 6.10.  This patch is labeled P4.  Google strongly recommends upgrading to this patch version if you are on the following versions:

  • 6.10.4.G.22
  • 6.10.4.G.22-P1

A list of the fixes included in this release can be found here.

MC+A supported customers should contact their account rep or MC+A support to schedule a time when the update can be applied.  Expect a brief outage of roughly 20 seconds multiplied by the number of collections.
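As a rough illustration of the outage estimate above, the arithmetic can be sketched in Python. The 20-second figure comes from the post; the function names and the example count of 15 collections are made up for illustration, and actual downtime may differ.

```python
# Figure quoted in the post: about 20 seconds of outage per collection.
SECONDS_PER_COLLECTION = 20


def estimated_outage_seconds(collection_count: int) -> int:
    """Return the estimated outage in seconds for a given number of collections."""
    if collection_count < 0:
        raise ValueError("collection_count must not be negative")
    return SECONDS_PER_COLLECTION * collection_count


def format_duration(seconds: int) -> str:
    """Render a number of seconds as minutes and seconds."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes} min {secs} s"


if __name__ == "__main__":
    # Hypothetical appliance with 15 collections.
    print(format_duration(estimated_outage_seconds(15)))  # 5 min 0 s
```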

Simple Regular Expression To Keep Unnecessary Files From Being Indexed By The Google Search Appliance

June 3rd, 2011

Duplicate Documents Create Noise and Consume Your License

Duplicate links cause the same document to be indexed more than once, which:

  1. Can cost you more in licensing
  2. Produces duplicate search results, frustrating your user base

In a recent engagement, a global company’s CMS was producing and accepting URLs in both of these formats:

  • http://www.mcplusa.com/company/about
  • http://www.mcplusa.com/company/about.html

The Google Search Appliance sees these as two separate URLs, so the same document is indexed twice.  I reviewed the client’s requirements and found that all of the site content was available at a URL ending in either a ‘/’ or a file type, so the extensionless variants could safely be excluded.  This is fairly common among CMSs and other SEO-friendly publishing systems.

The Expression

The regular expression that I came up with was

regex:http://[put your site here]/.*/$|[put your site here]/.*\.([a-zA-Z]{3,9})$

Here, [put your site here] stands for the host name of the content source.
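To see how the expression behaves, here is a minimal sketch that rebuilds it with Python's `re` module and tries it against the example URLs from this post. It assumes the appliance treats the pattern as an unanchored search and that Python's regex syntax is close enough for this pattern; the appliance's own regex engine may differ. The `regex:` prefix is appliance syntax and is left out, and the helper names `build_follow_pattern` and `should_index` are made up for the example.

```python
import re


def build_follow_pattern(host: str) -> "re.Pattern[str]":
    """Build the post's pattern for one host (hypothetical helper).

    The host is escaped so that its dots match literally. The two
    alternatives mirror the original: URLs ending in '/' and URLs
    ending in a 3 to 9 letter file extension.
    """
    escaped = re.escape(host)
    return re.compile(
        rf"http://{escaped}/.*/$|{escaped}/.*\.([a-zA-Z]{{3,9}})$"
    )


def should_index(url: str, pattern: "re.Pattern[str]") -> bool:
    """Return True when the URL matches the follow pattern."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    pattern = build_follow_pattern("www.mcplusa.com")
    for url in (
        "http://www.mcplusa.com/company/about",
        "http://www.mcplusa.com/company/about.html",
        "http://www.mcplusa.com/company/",
    ):
        print(url, should_index(url, pattern))
```

Only the extensionless `/company/about` variant fails to match, which is the duplicate the pattern is meant to drop.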

The Take Away

We tested it out, and after applying the pattern we reduced the total number of documents by 20%.   This was especially beneficial since the client was at about 480k documents on their current 500k license.  The change took them well below the license limit and cleaned up the search interface.

*Plug* – If you have had your Google Search Appliance for a while, I would recommend considering our Health Check, in which we review multiple configuration settings to see whether your appliance is properly tuned.

Michael Cizmar
Managing Partner

http://www.twitter.com/michaelcizmar

Google announces end of life schedule for Google Search Appliance software

February 9th, 2011

Google announced on Tuesday the end of life schedule for software versions 5.2, 6.0 and 6.2.  The 5.2 software version is scheduled to be deprecated on April 30, 2011.  Once deprecated, a software version is no longer supported by Google. This means we may require you to update to a supported version, should you require technical support.

The schedule for 6.0 and 6.2 end of life is as follows:

6.0: August 30, 2011
6.2: March 31, 2012

The latest release of the GSA software is 6.8.

Webinar: Google Search Appliance version 6.8

December 8th, 2010

On December 13th, join MC+A’s founder and managing partner Michael Cizmar for a detailed debrief on the most recent update to the Google Search Appliance, version 6.8.


REGISTER TO ATTEND THIS FREE WEBINAR.

Monday, December 13, 2010

11:30AM PST | 2:30PM EST

THE GOOGLE SEARCH 6.8 WEBINAR WILL COVER:

  • Cloud Connect – integrated search with Google Apps, Site Search and Twitter
  • People Search – search profile information about people in your organization
  • Dynamic Navigation – filter search results with specific metadata attributes
  • Active-Active – provide high availability by directing search traffic to multiple appliances
  • SharePoint 2010 – search all content within SharePoint 2010 out of the box

About Michael
Michael Cizmar, Managing Partner, MC+A
Michael Cizmar is the founder and President of MC+A, one of the very first Google Enterprise Professional partners, and has advised organizations such as Volkswagen, New York Post, Los Angeles Metro, and Federal Home Loan Bank of San Francisco on findability in the enterprise. Michael also serves on the Advisory Board to Open Pipeline.

For more information about this and any upcoming events, please contact us.

Google Search Appliance 6.8 patch released (P2)

November 24th, 2010

Google released P2 for software release 6.8.  Google strongly recommends that you install the patch if you are running 6.8.0.G.30 or 6.8.0.G.30-P1.

The following fixes are included:

Issue ID   Issue Description
2736134    Only use queries that return results to generate the Query Suggestion database.
2959255    Security Manager (Universal Login) fails authentication if Kerberos Authorization HTTP header is larger than 8K.
3120706    A feed can be submitted to the appliance regardless of IP restrictions in certain circumstances.
218751     In Database – Advanced Settings, if meta data is selected and primary document is submitted via Document URL Field or Document ID field, “action=delete” will not work.
3100032    Dynamic Navigation may return counts of unauthorized results with access=s parameter.
2086514    GSA only supports DES encryption for Kerberos.
2551148    Syslog files are not rotated.
3122895    SAML Security Manager configuration is not properly migrated when updating to 6.8.0.G.30.

MC+A support customers should contact support to schedule an update time.  All others will need to log in to Google’s support portal, which is referenced in your Google Search Appliance welcome email.

Happy Thanksgiving!

MC+A Support

Webinar: Combining the Google Search Appliance with the power of semantics to improve findability

September 15th, 2010

On September 23rd, join Smartlogic’s Toby Conrad and MC+A’s founder and managing partner Michael Cizmar for a detailed debrief on how combining Smartlogic’s Semaphore with a Google Search Appliance turns a good search engine into the platform for compelling, intuitive search applications.

Register to Attend this Free Webinar.

Thursday, September 23, 2010

12:00PM PST | 3:00PM EST

The Google Search / Semaphore webinar will cover:

  • GSA search foundations (including the 6.4 updates!)
  • The semantic extension for ontology and classification that is Smartlogic’s “Semaphore”
  • Understanding the GSA integration
  • Using metadata to drive faceted search and intuitive, dynamic content navigation
  • Providing intelligent links and recommendations between topics
  • Using topic pages to drive more traffic to your site and to keep people for longer

About the Presenters

About Michael
Michael Cizmar, Managing Partner, MC+A
Michael Cizmar is the founder and President of MC+A, one of the very first Google Enterprise Professional partners, and has advised organizations such as Volkswagen, New York Post, Los Angeles Metro, and Federal Home Loan Bank of San Francisco on findability in the enterprise. Michael also serves on the Advisory Board to Open Pipeline.

About Toby
Toby Conrad
Toby Conrad has been a Senior Consultant with Smartlogic for seven years and has advised organizations such as NASA, Bank of America, RBS, and Schlumberger.

For more information about this and any upcoming events, please contact us.

Google Releases Software Patch for Version 6.4 labeled 6.4.0.G22 P8

September 14th, 2010

New Version of Google Search Appliance software 6.4

Google released a patch for appliance software release 6.4.  Release notes can be found here (goes to Google support page).

Issue ID   Issue Description
2936082    LDAP group lookup feature does not work.
2631545    The Forms Authentication Wizard does not support https over proxy.
2759198    Unable to use regexp in Crawler Access configuration.
2874459    In rare circumstances DNS lookups may stop working.
1303761    Crawling with certificate auth does not support AES256, 3DES, AES128 ciphers.
2762588    Certain crawling patterns may prevent appliance from crawling.

MC+A support will be contacting MC+A support customers about scheduling a time for the update.

Hidden Features of 6.4 : Head Requestor Deny Rules

August 5th, 2010

The Problem

Header requests are the default method the Google Search Appliance (GSA) uses to perform authorization at the document level (also known as late binding) for web-based content (see The Header Requester).  The approach has advantages and disadvantages.  One drawback is that it relies on the content source adhering to the HTTP protocol.

We’ve encountered numerous content systems that don’t return the correct HTTP response for this to work.  In many cases with Lotus Domino or Microsoft SharePoint, a friendly message is returned or there is an embedded header.  This causes the GSA to misinterpret the response from the server and conclude that the user has access to the document.

The common method before 6.4 was to implement a SAML interface and develop custom code to handle the logic for the variety of content sources.  Google released several open source projects to jump start your efforts.

The Solution: Head Requestor Deny Rules

Those projects tended to be difficult for our clients to implement and were another piece of infrastructure to deploy and manage.  In version 6.4, Google added rule validation on the appliance itself.  You can now check for the most common cases on the appliance with simple configuration:

Screen shot of the Header Request Deny Rule Form

This eliminates many of the customizations that we’ve made to handle incorrect responses.  How neat!
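The idea behind a deny rule can be sketched in a few lines of Python: treat a response as "access denied" not only when the status code says so, but also when a header or the body carries a known marker, such as a friendly error page served with a 200. This is a conceptual illustration only. The `DenyRule` shape, the `is_denied` function, and the sample markers and header name are all hypothetical and do not reproduce the appliance's actual form fields or behavior.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class DenyRule:
    """Hypothetical rule: deny when a marker appears in a header or the body."""
    marker: str                        # text that signals "access denied"
    header_name: Optional[str] = None  # header to inspect; None means inspect the body


def is_denied(
    status: int,
    headers: Mapping[str, str],
    body: str,
    rules: Sequence[DenyRule],
) -> bool:
    """Return True when the response should be treated as a denial."""
    # Well-behaved servers signal denial through the status code.
    if status in (401, 403):
        return True
    # Header names are case-insensitive in HTTP, so normalize them once.
    lowered = {name.lower(): value for name, value in headers.items()}
    for rule in rules:
        if rule.header_name is None:
            haystack = body
        else:
            haystack = lowered.get(rule.header_name.lower(), "")
        if rule.marker.lower() in haystack.lower():
            return True
    return False


if __name__ == "__main__":
    rules = [
        DenyRule(marker="You are not authorized"),
        DenyRule(marker="login", header_name="X-Form-Redirect"),
    ]
    friendly_page = "<html>You are not authorized to view this page</html>"
    print(is_denied(200, {}, friendly_page, rules))
```

Keeping the rules as data rather than code mirrors the appeal described above: the logic lives in configuration instead of a separately deployed service.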

Google Releases Software Patch for Version 6.4 labeled 6.4.0.G22

August 3rd, 2010

New Version Released For 6.4

Google released a patch for the latest search appliance release.

Release notes can be found here.

MC+A support customers will be contacted to schedule the update.

G44-P10 patch released for Google Search Appliance GB-1001, GB-7007 and GB-9009

July 11th, 2010

Google has released a patch update to the 6.2 software version. The fixes are listed below. Compare this list to any issues you have outstanding.

Issue ID   Issue Description
2399366    Sometimes documents with the same date are not sorted by relevance when sorting by date.
2386544    Sometimes sort by date does not sort documents that are chronologically close in the correct order.
2371696    Documents with dates that are in the future are not sorted properly when sorting by ascending dates.
1894928    Error on GSA when configuring Google Apps: “Cannot enable Google Apps Integration.”
2311340    GSA Admin API fails to authenticate requests if timezone is not set to US/Pacific.
1837591    Some Word documents can not be converted resulting in a crawling error.
2300088    GSA cannot convert some documents that contain embedded docs.
2327144    Some internal indexing processes are not properly terminated.
2435021    Stalled connections are not properly detected in replication setup.
2551148    System logs are not properly rotated which can fill disks.
1971858    Search Latency graph is broken under ‘Status and Reports>Serving Status’.
1408031    Encoded non-ascii filenames become garbled if saved with Internet Explorer.
2287126    SupportCall via proxy server fails with Error establishing TCP connection to supportcall.google.com:443: (111, ‘Connection refused’).
2534465    Permanent Redirect URLs with extensions that require Conversion(.doc, .pdf, .xls, etc) are not followed if at redirect time the content-length response is different than 0.
2517840    The Crawl Queue URLs in 6.2 are shown as encoded.
2633250    In federation the primary node tries to authorize urls that have already been authorized by secondary nodes.
2335370    Results with same snippet and title do not get filtered in the search results when using the ‘filter=p’ search parameter.
2487684    The lang_zh-tw language filter fails to show search results beyond the first page.
2358978    RK parameter is not working properly.
2728200    In 6.2.0.G.44-P6 Admin Console is only available in English.
