[dba-VB] "Smart" fulltext search on large documents(data)base...

Gustav Brock Gustav at cactus.dk
Fri Feb 4 03:32:37 CST 2011


Hi Shamil

How fast is this? Sounds very clever.

/gustav


>>> shamil at smsconsulting.spb.ru 04-02-2011 03:38 >>>
Hi Doug,

Yes, searching Google Docs can be automated. I found and used
http://code.google.com/p/google-gdata/ 
It wasn't very quick to get working, but I got it working in the
end.

My quick & dirty sample code, which worked for me, was as follows:

    // Requires the google-gdata .NET library; namespaces assumed:
    // using System; using System.Collections.Generic;
    // using Google.GData.Client; using Google.GData.Documents; using Google.Documents;

    string userName = "myTest@gmail.com";
    string password = "mypassword";

    List<Document> all = new List<Document>();
    GDataCredentials credentials = new GDataCredentials(userName, password);
    // (not used further in this sample)
    DocumentsService service = new DocumentsService("GoogleDocumentsSample");

    System.Console.WriteLine("Logging in...");

    RequestSettings settings = new RequestSettings("GoogleDocumentsSample", credentials);
    settings.AutoPaging = true;
    settings.PageSize = 100;

    try
    {
        DocumentsRequest request = new DocumentsRequest(settings);

        System.Console.WriteLine("Getting docs...");

        FeedQuery query = new FeedQuery();
        query.Uri = new Uri(request.BaseUri);
        query.Query = "quick brown fox";

        Feed<Document> feed = request.Get<Document>(query);

        // AutoPaging takes care of paging the results in
        System.Console.WriteLine("Collecting docs info...");

        int index = 1;
        foreach (Document entry in feed.Entries)
        {
            System.Console.WriteLine("{0}. {1}", index, entry.Title);
            all.Add(entry);
            index++;
        }

        System.Console.WriteLine("\n *** Docs collected - processing them ***\n");

        index = 1;
        foreach (Document doc in all)
        {
            // just listing collected docs...
            System.Console.WriteLine("{0}. {1}", index, doc.Title);
            index++;
        }
    }
    catch (Exception ex)
    {
        // settings is never null after "new", so a null check cannot detect
        // a failed login. Assumption: a failed login surfaces as an
        // exception from the request instead.
        System.Console.WriteLine("Login or request failed: " + ex.Message);
    }

 
Uploading docs to GoogleDocs can also be automated using the same C# lib.

Funnily enough, one can also use the GoogleDocs engine as a document format
converter, e.g. txt -> pdf or txt -> doc: just upload in one format and
download in another...


Thank you.

--
Shamil
 
-----Original Message-----
From: dba-vb-bounces at databaseadvisors.com 
[mailto:dba-vb-bounces at databaseadvisors.com] On Behalf Of Doug Murphy
Sent: 22 January 2011 20:56
To: 'Discussion concerning Visual Basic and related programming issues.'
Subject: Re: [dba-VB] "Smart" fulltext search on large
documents(data)base...

Shamil,

Very innovative approach. Good use of the low cost and high power offered by
the "Cloud" services. I'll be interested in how this comes out.

Doug 

-----Original Message-----
From: dba-vb-bounces at databaseadvisors.com 
[mailto:dba-vb-bounces at databaseadvisors.com] On Behalf Of Shamil
Salakhetdinov
Sent: Friday, January 21, 2011 8:34 AM
To: 'Discussion concerning Visual Basic and related programming issues.'
Subject: [dba-VB] "Smart" fulltext search on large documents (data)base...

Hi All --

I have a task to implement a system providing "smart" fulltext search over a
large base of text documents.
My current plan is to use Google Docs.

In the future I plan to get 80 GB ($20.00 USD per year) of hosted space on
GoogleDocs, put all the subject docs there, and then use the Google API to
search my document base.

That seems to be it?

It should even be possible to create a simple (free?) Google Web Site as a
front-end to that GoogleDocs document base?

That GoogleDocs base/site is planned to be used by a non-profit organization.

Am I missing something? 
Any additional overhead costs for keeping that solution's stuff on Google's site?

And why am I writing about that solution here in dba-VB? Because I plan to
implement a front-end to that application system as a WinForms application
communicating with Windows API...

Thank you.

--
Shamil




