Earlier today, we published the second release of the Pika Google Apps project. The LSNC Google API Project is a national demonstration project funded by the LSC Technology Initiative Grants Program. It is maintained by Legal Services of Northern California (LSNC), a ten-office legal aid program serving low-income clients in 23 counties in the upper third of California. The second release focuses on integration with Google Docs.
Now a Pika user can sync their case documents to Google Apps. Additionally, we’ve resolved some bugs from the previous release. We’ll document the project a bit more soon, but go here to download the latest files:
Our customers are advised that Google Enterprise has released a patch to Google Search Appliance software 6.10. This patch is labeled P4. Google strongly recommends upgrading to this patch version if you are running one of the following versions:
6.10.4.G.22
6.10.4.G.22-P1
A list of the fixes included in this release can be found here.
MC+A supported customers should contact their account rep or MC+A support to schedule a time when the update can be applied. The update causes a short outage equal to 20 seconds times the number of collections.
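As a quick way to plan the maintenance window, the outage estimate above can be written as a small Python helper. This is only a sketch of the arithmetic; the function name and the 12-collection example are made up for illustration.

```python
def estimated_outage_seconds(collections: int, seconds_per_collection: int = 20) -> int:
    """Estimate the patch outage as 20 seconds per collection (figure from the post)."""
    if collections < 0:
        raise ValueError("collections must be non-negative")
    return collections * seconds_per_collection


if __name__ == "__main__":
    # Hypothetical appliance with 12 collections.
    seconds = estimated_outage_seconds(12)
    print(f"Plan for roughly {seconds} seconds ({seconds / 60:.0f} minutes) of downtime")
```

An appliance with 12 collections would therefore be unavailable for about four minutes.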
Having difficulty finding things in Salesforce.com? Frustrated at having to log in to Salesforce.com simply to look up a contact?
Join MC+A for a webinar on how to better leverage your Salesforce content with a Google Search Appliance (GSA).
Salesforce is an excellent CRM. Businesses like yours trust an increasing amount of important sales data to Salesforce CRM. Yet finding relevant sales, lead and contact information quickly involves way too much effort. Search should be simple and just work. By integrating the Google Search Appliance with Salesforce, you can provide a universal search experience with Salesforce results integrated into your portal or search results page.
Duplicate Documents Create Noise and Consume Your License
Duplicate links can produce duplicate search results, which:
Can cost you more in licensing
Can anger your user base with repeated results
In a recent engagement, a global company’s CMS was producing and accepting URLs in both of these formats:
http://www.mcplusa.com/company/about
http://www.mcplusa.com/company/about.html
The Google Search Appliance will treat these two URLs as separate documents. I reviewed the client’s requirements, and every URL the site itself produced ended in either a ‘/’ or a file extension. This is fairly common among CMSs and other SEO-friendly publishing systems.
The Expression
The regular expression that I came up with was:
regex:http://[put your site here]/.*/$|[put your site here]/.*\.([a-zA-Z]{3,9})$
Here, [put your site here] stands for the host name of the content source.
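To see how the expression treats the two URL formats, here is a minimal Python sketch. Python's `re` module is not the appliance's regex engine, so this is only an approximation of its behavior. It assumes www.mcplusa.com in place of [put your site here], and the helper name `matches_pattern` is made up for illustration.

```python
import re

# Assumption: www.mcplusa.com stands in for "[put your site here]".
SITE = "www.mcplusa.com"

# Same two alternatives as the expression above, without the "regex:" prefix.
# re.escape() keeps the dots in the host name literal, which is slightly
# stricter than pasting the host name in unescaped.
PATTERN = re.compile(
    r"http://" + re.escape(SITE) + r"/.*/$"
    r"|" + re.escape(SITE) + r"/.*\.([a-zA-Z]{3,9})$"
)


def matches_pattern(url: str) -> bool:
    """Return True if the URL ends in '/' or in a 3-9 letter file extension."""
    return PATTERN.search(url) is not None


if __name__ == "__main__":
    for url in (
        "http://www.mcplusa.com/company/about",       # extensionless duplicate
        "http://www.mcplusa.com/company/about.html",  # file extension
        "http://www.mcplusa.com/company/",            # trailing slash
    ):
        print(url, "->", matches_pattern(url))
```

In this sketch only the extensionless form fails to match, which is the duplicate we wanted to keep out of the index.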
The Take Away
We tested it out, and after applying the pattern we reduced the total number of documents by 20%. This was especially beneficial since the client was at about 480k documents on their current 500k license. The change took them well below the license limit and cleaned up the search interface.
*Plug* – If you have had your Google Search Appliance for a while, I would recommend considering our Health Check, where we review multiple configuration settings to see if your appliance is properly tuned.
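For a sense of scale, the license math works out roughly as follows. This sketch assumes the 480k figure is the document count before the change and that the 20% reduction applies to that count; the function name is made up for illustration.

```python
def documents_after_reduction(current_docs: int, reduction_pct: int) -> int:
    """Document count left after removing reduction_pct percent of documents."""
    return current_docs * (100 - reduction_pct) // 100


if __name__ == "__main__":
    license_limit = 500_000  # license size from the post
    before = 480_000         # assumed to be the count before the change
    after = documents_after_reduction(before, 20)
    print(f"Documents after cleanup: {after}")
    print(f"Headroom under the license: {license_limit - after}")
```

Under those assumptions the index drops to about 384k documents, leaving roughly 116k documents of headroom instead of 20k.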
Google announced on Tuesday the end of life schedule for software versions 5.2, 6.0 and 6.2. The 5.2 software version is scheduled to be deprecated on April 30, 2011. Once deprecated, a software version is no longer supported by Google. This means we may require you to update to a supported version, should you require technical support.
The schedule for 6.0 and 6.2 end of life is as follows:
Google released P2 to software release 6.8. Google strongly recommends that you install the patch if you are running 6.8.0.G30 or 6.8.0.G30-P1.
The following fixes are included:
Issue ID | Issue Description
2736134 | Only use queries that return results to generate the Query Suggestion database.
2959255 | Security Manager (Universal Login) fails authentication if the Kerberos Authorization HTTP header is larger than 8K.
3120706 | A feed can be submitted to the appliance regardless of IP restrictions in certain circumstances.
218751 | In Database – Advanced Settings, if metadata is selected and the primary document is submitted via the Document URL Field or Document ID field, “action=delete” will not work.
3100032 | Dynamic Navigation may return counts of unauthorized results with the access=s parameter.
2086514 | GSA only supports DES encryption for Kerberos.
2551148 | Syslog files are not rotated.
3122895 | SAML Security Manager configuration is not properly migrated when updating to 6.8.0.G.30.
MC+A supported customers should contact support to schedule an update time. Other customers will need to log in to Google’s support portal, using the details included in your Google Search Appliance welcome email.
OpenPipeline released version 0.9 last week. The most notable changes in this release are a basic web crawler, the Connector becoming an abstract class, and the Stage being given more access to its environment.
Google.com has begun rolling out Google Instant. This new feature goes beyond autocomplete and search-as-you-type suggestions: it actually performs a search based on what you have typed so far. In preliminary testing, users saved 2-5 seconds per search.
The feature is described in detail at http://www.google.com/instant/. It’s a significant overhaul of the current search infrastructure and requires additional capacity from Google’s infrastructure to support the additional queries.
Key features:
Dynamic Results – In addition to autocomplete and search-as-you-type suggestions, searches are performed and results returned while you type.
Predictions – The autocomplete attempts to predict and display your full query.
Scroll to Search – You can scroll through predictions instantly with the down arrow key.
Header requests are the default method by which the Google Search Appliance (GSA) performs document-level authorization (also known as late binding) for web-based content (see The Header Requester). There are numerous advantages and disadvantages. One of the disadvantages is that it relies on the content source adhering to the HTTP protocol.
We’ve encountered numerous content systems that don’t return the correct HTTP response for this to work. In many cases with Lotus Domino or Microsoft SharePoint, a friendly message is returned or there is an embedded header. This causes the appliance to misinterpret the response from the server and conclude that the user has access to the document.
The common approach before version 6.4 was to implement a SAML interface and develop custom code to handle the logic for the variety of content sources. Google released several open source projects to jump-start your efforts. Most notably they are:
Those tended to be difficult for our clients to implement, and they were another piece of infrastructure to deploy and manage. In version 6.4, Google has added additional rule validation on the appliance. You can now check for the most common cases on the appliance with simple configuration:
Screen shot of the Header Request Deny Rule Form
This eliminates most of the customizations that we’ve made to handle incorrect responses. How neat!
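To illustrate the idea, here is a rough Python sketch of why a status-only check is fooled by a friendly login page, and how an extra deny rule closes the gap. It is not the appliance's actual logic or rule format: the status-code mapping is a simplified assumption, and the `HeadResponse` and `DenyRule` types and the `X-Login-Required` header are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class HeadResponse:
    """What came back from the content server for one header request."""
    status: int
    headers: dict = field(default_factory=dict)


@dataclass
class DenyRule:
    """Hypothetical rule: deny when a header is present and contains a value."""
    header: str
    contains: str = ""  # empty string means "any value"


def status_only_decision(resp: HeadResponse) -> str:
    # Simplified assumption: 2xx means the user may see the document,
    # 401/403 means they may not, anything else is inconclusive.
    if 200 <= resp.status < 300:
        return "PERMIT"
    if resp.status in (401, 403):
        return "DENY"
    return "INDETERMINATE"


def decision_with_deny_rules(resp: HeadResponse, rules: list) -> str:
    # Header names are compared case-insensitively.
    headers = {name.lower(): value for name, value in resp.headers.items()}
    for rule in rules:
        value = headers.get(rule.header.lower())
        if value is not None and rule.contains in value:
            return "DENY"
    return status_only_decision(resp)


if __name__ == "__main__":
    # A server that answers 200 with a friendly login page instead of 401.
    login_page = HeadResponse(200, {"X-Login-Required": "true"})
    rules = [DenyRule(header="X-Login-Required")]
    print(status_only_decision(login_page))             # status alone looks like access
    print(decision_with_deny_rules(login_page, rules))  # the rule catches it
```

The point of the design is that the rule is evaluated before the status code, so a misleading 200 response can no longer grant access on its own.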