[dba-Tech] The fastest search

Jim Lawrence accessd at shaw.ca
Sat Feb 8 23:43:21 CST 2014


Hi All:

Here is a basic search and data retrieval engine/database called Elasticsearch.

http://www.elasticsearch.org/webinars/getting-started-with-elasticsearch/?watch=1

(You may have to give your name and email, but nothing more serious than that. The video is a little slow to get started, but once the preamble is finished...) I understand that the product runs on both Linux and Windows.

The above will show you a little sample of how the system works. If you need curl, it comes in both Linux and Windows versions: http://curl.haxx.se/dlwiz
  
Its capabilities are rather incredible. Like other NoSQL, map-reduce style products, it is designed as a data store with no fixed schema; you define your own in code. Rows are called documents, as a single one can have 30,000-plus columns, and though in theory all the data can be free-form, it rarely is. What gives this system its incredible performance is that it indexes its entire data store, whether that is a single directory, an entire hard drive or, with the help of a Hadoop framework, a cluster of hundreds of drives.
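
To make the "no schema" point concrete, here is a minimal Python sketch of two documents that could sit side by side in the same store even though their fields differ. The field names and values are made up for illustration; nothing is sent to a server, the code only shows the JSON text that would be submitted for indexing.

```python
import json

# Two hypothetical documents. Neither structure was declared in advance,
# and the two do not share the same set of fields.
doc_a = {"title": "Server outage", "severity": 3, "tags": ["network", "urgent"]}
doc_b = {"title": "Meeting notes", "attendees": ["Ann", "Raj"], "location": {"room": "2B"}}


def to_request_body(doc):
    """Serialize one document to the JSON text that would be sent for indexing."""
    return json.dumps(doc, sort_keys=True)


bodies = [to_request_body(d) for d in (doc_a, doc_b)]
for body in bodies:
    print(body)
```

In practice most documents in one index end up sharing a common core of fields, which matches the remark above that the data is rarely truly free-form.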

http://www.elasticsearch.org/overview/hadoop 

Hadoop comes along with a host of related products, database managers, programming environments and so on. 
 
Once the product is installed on your computer, the indexing process runs as a service, keeping the data stores always current. The indexing process is fast: according to the documentation, given the appropriate hardware, it could index a one TB drive in a little over 90 seconds.

To search the data, it takes a request in the form of a JSON object and returns the results in JSON format, suitable for further processing and display, for example with D3: http://d3js.org
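
As a rough illustration of that JSON-in, JSON-out cycle, the Python sketch below builds a simple "match" query body and pulls the matching documents out of a response. The field name, the search text and the sample response are all hand-written assumptions for the example, not output from a real server; the request itself would normally be sent to the `_search` endpoint of an index, for instance with curl.

```python
import json


def build_search_body(field, text, size=10):
    """Build the JSON text of a simple full-text 'match' query."""
    return json.dumps({"query": {"match": {field: text}}, "size": size})


# Hand-written stand-in for a search response, trimmed to the parts used
# below. A real response carries more metadata than this.
SAMPLE_RESPONSE = """
{
  "hits": {
    "hits": [
      {"_id": "1", "_score": 1.2, "_source": {"title": "Server outage", "severity": 3}},
      {"_id": "7", "_score": 0.4, "_source": {"title": "Outage follow-up", "severity": 1}}
    ]
  }
}
"""


def extract_hits(response_text):
    """Return (id, score, document) tuples from a search response."""
    response = json.loads(response_text)
    return [(h["_id"], h["_score"], h["_source"]) for h in response["hits"]["hits"]]


print(build_search_body("title", "outage", size=5))
for doc_id, score, source in extract_hits(SAMPLE_RESPONSE):
    print(doc_id, score, source["title"])
```

The list returned by `extract_hits` is plain Python data, so it can be dumped back out as JSON and handed to a charting library on the browser side.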

Elasticsearch itself is fully OSS, but there are a number of related products that are sold as network services. One product specializes in displaying summaries of active system logs in real time.

As a database it does not have the structure of a real SQL DB (I would never use an ES-type product for invoicing), but it bypasses the limitations of SQL, like changing data structures, multiple joins and multiple indexes. For data summary operations, it just cannot be beat.
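
For a feel of what a data summary request looks like, here is a small Python sketch that asks for a count of documents per distinct value of one field, roughly what a SQL GROUP BY with COUNT would give. It assumes a version of Elasticsearch that supports "terms" aggregations; the field name `severity`, the aggregation name `by_severity` and the sample response are invented for the example.

```python
import json


def build_summary_body(field, name):
    """Build a request that counts documents per distinct value of `field`.

    "size": 0 asks for the summary only, without the matching documents.
    """
    return json.dumps({"size": 0, "aggs": {name: {"terms": {"field": field}}}})


# Hand-written stand-in for the aggregation part of a response.
SAMPLE_SUMMARY = """
{
  "aggregations": {
    "by_severity": {
      "buckets": [
        {"key": 1, "doc_count": 40},
        {"key": 3, "doc_count": 2}
      ]
    }
  }
}
"""


def bucket_counts(response_text, name):
    """Turn the buckets of one named aggregation into a {value: count} dict."""
    buckets = json.loads(response_text)["aggregations"][name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


print(build_summary_body("severity", "by_severity"))
print(bucket_counts(SAMPLE_SUMMARY, "by_severity"))
```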

Jim  

    

