Google Search Appliance 6.0 New Features Video
July 13th, 2009
Google released a video overview of GSA 6.0 features. If you haven’t seen it, you should check it out.
July 8th, 2009
I just installed OffiSync. I’m excited about its capabilities. I love the online collaboration abilities of Google Docs, but I constantly find myself needing more advanced features.
OffiSync is supposed to:
It’s currently in beta. We’ll post updates as we find out more about this product.
July 8th, 2009
Replacing SharePoint search with Google search is a no-brainer! SharePoint’s built-in search is finicky and unreliable. Not having to worry about your intranet’s search failing, plus the ease of use that comes with Google search, will make you understand why the appliance is worth it.
How do you know which appliance is right for your SharePoint and other systems? A good approach is to know how much content you are going to be indexing. The Google Mini has a limit of 100,000 documents; the Google Search Appliance starts at 500,000 and goes up to a billion when you chain several GSAs together! Many people want to know whether there is a way to get a page count or otherwise account for all the content on a SharePoint site. I hope some of these SQL queries help you find the answers you are looking for.
Here are some SQL queries to find the total number of documents/files:
1. Total number of documents:
SELECT COUNT(*)
FROM Docs INNER JOIN Webs On Docs.WebId = Webs.Id
INNER JOIN Sites ON Webs.SiteId = Sites.Id
WHERE
Docs.Type <> 1 AND (LeafName NOT LIKE '%.stp')
AND (LeafName NOT LIKE '%.aspx')
AND (LeafName NOT LIKE '%.xfp')
AND (LeafName NOT LIKE '%.dwp')
AND (LeafName NOT LIKE '%template%')
AND (LeafName NOT LIKE '%.inf')
AND (LeafName NOT LIKE '%.css')
2. Total MS Word documents:
SELECT count(*)
FROM Docs INNER JOIN Webs On Docs.WebId = Webs.Id
INNER JOIN Sites ON Webs.SiteId = Sites.Id
WHERE
Docs.Type <> 1 AND (LeafName LIKE '%.doc')
AND (LeafName NOT LIKE '%template%')
3. Total MS Excel documents:
SELECT count(*)
FROM Docs INNER JOIN Webs On Docs.WebId = Webs.Id
INNER JOIN Sites ON Webs.SiteId = Sites.Id
WHERE
Docs.Type <> 1 AND (LeafName LIKE '%.xls')
AND (LeafName NOT LIKE '%template%')
4. Total MS PowerPoint documents:
SELECT count(*)
FROM Docs INNER JOIN Webs On Docs.WebId = Webs.Id
INNER JOIN Sites ON Webs.SiteId = Sites.Id
WHERE
Docs.Type <> 1 AND (LeafName LIKE '%.ppt')
AND (LeafName NOT LIKE '%template%')
5. Total TXT documents:
SELECT count(*)
FROM Docs INNER JOIN Webs On Docs.WebId = Webs.Id
INNER JOIN Sites ON Webs.SiteId = Sites.Id
WHERE
Docs.Type <> 1 AND (LeafName LIKE '%.txt')
AND (LeafName NOT LIKE '%template%')
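Once you have a total, the thresholds mentioned above can be turned into a quick check. Here is a minimal sketch, assuming only the two limits quoted in this post (100,000 documents for the Mini and 500,000 for the entry-level GSA). The `ApplianceSizing` class and its `Recommend` method are hypothetical names for illustration, and actual license tiers should be confirmed with Google.

```csharp
using System;

// Hypothetical helper: maps a document count to the appliance sizes
// quoted in this post. Not an official licensing table.
public static class ApplianceSizing
{
    public const long MiniDocumentLimit = 100000;      // Google Mini limit quoted above
    public const long GsaEntryDocumentLimit = 500000;  // entry-level GSA size quoted above

    public static string Recommend(long documentCount)
    {
        if (documentCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(documentCount));
        }
        if (documentCount <= MiniDocumentLimit)
        {
            return "Google Mini";
        }
        if (documentCount <= GsaEntryDocumentLimit)
        {
            return "Google Search Appliance (entry license)";
        }
        return "Google Search Appliance (larger license or chained units)";
    }
}
```

Feed it the result of query 1, for example `ApplianceSizing.Recommend(80000)`, and leave yourself some headroom for growth before settling on a size.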
June 24th, 2009
It’s been 5 years since the current MC+A organization was formed (we reorganized when we relocated from San Francisco to Chicago). In that time many changes have begun to surface. We’ve seen the explosion of Microsoft SharePoint into the once Java-dominated portal market. We’ve seen the Google Search Appliance become the de facto intranet search tool, and we are now seeing the extension of portals into commercial SaaS platforms (i.e. mash-ups and Web 2.0).
The first two years are supposed to be the most difficult. As many companies across the world can attest, these last two have been the most difficult for us. At a certain point, only sheer will and determination produce positive outcomes. I definitely want to thank all of those who are involved in MC+A, my family, and my wife for their support and dedication. We’ve taken a concept that I planned on a beach in Brazil and grown it into a multimillion-dollar business with global reach.
To our clients and community, I definitely want to thank you for working with us. In the coming months we’ll be releasing new services and open source code that will help you connect to your information assets. As always, we see the utilization of these as being the single most important asset your company retains.
MC+A – Chicago will be celebrating this Friday down at the Taste of Chicago. Email me if you want to meet up at my flat and join us.
June 23rd, 2009
Calling the Google Search Appliance from SharePoint or ASP.NET is a bit of a challenge when a SAML or Forms Authentication security interface is enabled. The following is some code that we developed to handle all of the handshaking from the redirects and cookie exchange. This will be published shortly as an open source project.
public string SecureSearch(string term)
{
try
{
StringBuilder defaultquery = new StringBuilder();
defaultquery.Append(ConfigurationManager.AppSettings["GSADEVICEURL"]);
defaultquery.Append("/search?q=" + term);
defaultquery.Append("&client=" + ConfigurationManager.AppSettings["GSAFRONTEND"]);
defaultquery.Append("&output=xml_no_dtd");
defaultquery.Append("&site=" + ConfigurationManager.AppSettings["GSACOLLECTION"]);
defaultquery.Append("&access=a");
defaultquery.Append("&entqr=3&ud=1&oe=UTF-8&ie=UTF-8");
defaultquery.Append("&filter=" + ConfigurationManager.AppSettings["GSASEARCHFILTER"]);
defaultquery.Append("&num=" + ConfigurationManager.AppSettings["RESULTSPERPAGE"]);
Regex cookieCheck = new Regex("googlecookiecheck", RegexOptions.IgnoreCase);
log.Info("Step 1 Call GSA for secure query");
string loginurl = callGSA(defaultquery.ToString(), false);
//if the return url has the cookie check added, send the url back to the gsa
//the cookie check cookie will be attached
if (cookieCheck.Match(loginurl).Success)
{
//send the url returned with cookiecheck back to GSA
log.Info("Step 2 Call GSA for secure doc, respond to cookiecheck request");
loginurl = callGSA(ConfigurationManager.AppSettings["GSADEVICEURL"] + loginurl, false);
if (!gsaResultsReturned)
{
log.Info("Step 3 Call GSA for secure only with redirect from cookiecheck request");
loginurl = callGSA(ConfigurationManager.AppSettings["GSADEVICEURL"] + loginurl, false);
}
}
if (!gsaResultsReturned)
{
log.Info("Step 4 Call GSA to get result set");
loginurl = callGSA(loginurl, true);
}
log.Info("Step 5 COMPLETE");
}
catch (Exception ex)
{
log.Error("SecureSearch", ex);
}
return results.OuterXml.ToString();
}
private string callGSA(string url, bool allowredirect )
{
gsaResultsReturned = false;
string returnURL = string.Empty;
try
{
log.Debug("callGSA() url ->" + url);
HttpWebRequest HttpWReq = (HttpWebRequest)WebRequest.Create(url);
HttpWReq.KeepAlive = true;
HttpWReq.Accept = "*/*";
HttpWReq.Headers.Add("Accept-Language", "en-us");
HttpWReq.Headers.Add("Accept-Encoding", "gzip, deflate");
HttpWReq.AllowAutoRedirect = allowredirect;
HttpWReq.CookieContainer = cookieC;
HttpWReq.Credentials = System.Net.CredentialCache.DefaultCredentials;
HttpWReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727)";
if (allowredirect)
{
HttpWReq.MaximumAutomaticRedirections = 10;
}
CookieCollection mycookies = HttpWReq.CookieContainer.GetCookies(new Uri(ConfigurationManager.AppSettings["GSACOOKIEURL"]));
HttpWebResponse HttpWResp = (HttpWebResponse)HttpWReq.GetResponse();
log.Info("......callGSA response http status = " + HttpWResp.StatusCode);
//get header collection
WebHeaderCollection col = HttpWResp.Headers;
if (HttpWResp.StatusCode.Equals(HttpStatusCode.Found))
{
//get location and make request
string[] redirectURL = col.GetValues("Location");
returnURL = redirectURL[0];
log.Debug("......callGSA() found redirect URL from GSA to->" + redirectURL[0]);
}
//logHeader(col);
CookieCollection csk = HttpWResp.Cookies;
for (int i = 0; i < csk.Count; i++)
{
log.Debug("............. callGSA response cookies " + csk[i].Name + " value " + csk[i].Value);
log.Debug("............. callGSA response cook domain " + csk[i].Domain);
log.Debug("............. callGSA response cook Path " + csk[i].Path);
log.Debug("............. callGSA response cook expires " + csk[i].Expires.ToLongTimeString());
log.Debug("............. callGSA response cook secure " + csk[i].Secure.ToString());
String[] values = col.GetValues("Set-Cookie");
if (null != values && values.Length > 0)
{
for (int j = 0; j < values.Length; j++)
{
Uri cookuri = new UriBuilder("http", csk[i].Domain, 80).Uri;
log.Info("......callGSA moving cookie " + cookuri.ToString() + " " + values[j]);
cookieC.SetCookies(cookuri, values[j]);
}
}
}
//this is the final step, we have a result set from the GSA
if (HttpWResp.StatusCode.Equals(HttpStatusCode.OK) ||
(returnURL.Equals(string.Empty)))
{
gsaResultsReturned = true;
Stream receiveStream = HttpWResp.GetResponseStream();
Encoding UTF8_Encoding = System.Text.Encoding.GetEncoding("utf-8");
StreamReader readStream = new StreamReader(receiveStream, UTF8_Encoding);
//get the response and display
results.LoadXml(readStream.ReadToEnd());
}
}
catch (Exception ex)
{
log.Error("CallGSA ", ex);
}
return returnURL;
}
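The two methods above lean on class-level state that the snippet does not show (`log`, `cookieC`, `results`, and `gsaResultsReturned`) and on six appSettings keys. Here is a rough sketch of the scaffolding they would need. The field types are inferred from how the code uses them; the `GsaSearchClient` and `ConsoleLog` names and the `FindMissingSettings` helper are assumptions for illustration, not part of the original code. The logger in particular is a console stand-in for whatever logging library was actually in use.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Net;
using System.Xml;

// Hypothetical console-backed stand-in for the logger used in the post.
public class ConsoleLog
{
    public void Debug(string message) { Console.WriteLine("DEBUG " + message); }
    public void Info(string message) { Console.WriteLine("INFO " + message); }
    public void Error(string message, Exception ex) { Console.WriteLine("ERROR " + message + ": " + ex); }
}

// Hypothetical containing class for SecureSearch and callGSA.
public class GsaSearchClient
{
    // appSettings keys that the two methods read.
    public static readonly string[] RequiredSettings =
    {
        "GSADEVICEURL", "GSAFRONTEND", "GSACOLLECTION",
        "GSASEARCHFILTER", "RESULTSPERPAGE", "GSACOOKIEURL"
    };

    private readonly ConsoleLog log = new ConsoleLog();
    // One container for the whole handshake, so cookies set by one
    // response are sent back on the following requests.
    private readonly CookieContainer cookieC = new CookieContainer();
    // Holds the XML result set once the GSA finally returns it.
    private readonly XmlDocument results = new XmlDocument();
    // Set by callGSA when a response carried results rather than a redirect.
    private bool gsaResultsReturned;

    // Returns the required keys that are absent or empty in the given settings.
    public static List<string> FindMissingSettings(NameValueCollection settings)
    {
        var missing = new List<string>();
        foreach (string key in RequiredSettings)
        {
            if (string.IsNullOrEmpty(settings[key]))
            {
                missing.Add(key);
            }
        }
        return missing;
    }

    // SecureSearch and callGSA from the post would go here.
}
```

Calling `GsaSearchClient.FindMissingSettings(ConfigurationManager.AppSettings)` at startup is one way to catch a missing key before the first search request fails halfway through the handshake.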
June 12th, 2009
In a recent article on IT Knowledge Exchange, “The impending cloud,” MC+A’s president, Michael Cizmar, discussed the position of VARs and cloud solutions.
Excerpt:
Cizmar thinks that value-added partners — those who integrate disparate services and applications, those who build applications, and those who support customers — will be more valuable than ever in a cloudy future. And that will be true whether the infrastructure provider of choice is Salesforce.com, Amazon.com, Microsoft, NetSuite, add-your-own-favorite-company here.
“Solution providers provide [domain] and technology implementation expertise. That will still be needed. Look at Salesforce.com or NetSuite. We are a tech company and we have a partner that implements our NetSuite solution,” he notes.
May 28th, 2009
Yesterday on the Google Geo Developers blog, Google announced the availability of a new API for Google Geo mashups. From the blog post, the major improvements are:
We’ll be incorporating this new API into our physician finder as soon as the API is a little more baked.
May 27th, 2009
Today Google posted an announcement on their enterprise blog about a long-awaited feature for scripting Google Apps. You’ll have to sign up to be part of the beta before it’s released widely.
Here is a video describing the new feature:
May 27th, 2009
Something pretty cool to check out:
http://www.google.com/search?hl=en&rls=com.microsoft%3Aen-us&tbo=1&tbs=ww%3A1&q=michael+cizmar&tbo=1
Looks like it’s been out for a few months. You get your terms in the middle and then the related topics radiating outward. Neat!
May 23rd, 2009
Rather than learning by trial and error, here are some excludes that you will want to add to your Google Search Appliance configuration if you are trying to crawl Jira Studio:
#workday/jira excludes
contains:delete
contains:Delete
contains:create
contains:Create
contains:edit
contains:Edit
contains:configureReport!
contains:reset
contains:lazyloader
contains:ViewUserIssueColumns!
contains:SaveAsFilter!
contains:ProjectRepositoryPermissions!Anon
contains:AddComment
contains:AddPortlet
contains:AttachFile
contains:ConfigurePortalPages
http://mcplusa.jira.com/secure/admin/
(replace mcplusa.jira.com with your site to not crawl the administrative pages)
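If you want to preview which URLs a list like this would knock out before kicking off a crawl, here is a rough sketch. It is only an approximation of the appliance's matching and assumes three things: `contains:` entries are case-sensitive substring tests, bare URLs are prefix matches, and lines starting with `#` are comments. The `ExcludePatternPreview` class is a hypothetical name, and the appliance's full pattern syntax is richer than this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: approximates how the exclude list above would
// apply to a URL. Simplified; not the appliance's actual matcher.
public static class ExcludePatternPreview
{
    private const string ContainsPrefix = "contains:";

    public static bool IsExcluded(string url, IEnumerable<string> patterns)
    {
        foreach (string raw in patterns)
        {
            string pattern = raw.Trim();
            // Skip blank lines and comment lines such as "#workday/jira excludes".
            if (pattern.Length == 0 || pattern[0] == '#')
            {
                continue;
            }
            if (pattern.StartsWith(ContainsPrefix, StringComparison.Ordinal))
            {
                // Assumed: case-sensitive substring match anywhere in the URL.
                string needle = pattern.Substring(ContainsPrefix.Length);
                if (url.IndexOf(needle, StringComparison.Ordinal) >= 0)
                {
                    return true;
                }
            }
            else if (url.StartsWith(pattern, StringComparison.Ordinal))
            {
                // Assumed: a bare URL excludes everything beneath it.
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, a broad entry like `contains:edit` also catches any URL that merely has "edit" somewhere in it (a page with "credit" in its path, say), so it is worth running a sample of your real URLs through a check like this first.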