[AccessD] ElasticSearch my hind leg...

Jim Lawrence accessd at shaw.ca
Mon Mar 10 18:20:18 CDT 2014


Hi John:

Note: This is not Cloud computing specifically...it can be done on any server(s) you have in the office.

Much of my work now that I am semi-retired is on legacy applications, some Linux stuff and web stuff...that is what the client wants, but I do wish I had access to the technology that is out there today. To say the least it is far superior, and if I ever had a chance to build a new major system I would never do what I did ten years ago.

Aside: A few weeks ago a company whose invoicing and accounting software I had been writing and managing since 1989 closed its doors. I had recently turned the single-station batching package into a multi-user/multi-tasking product...these capabilities are standard in today's development packages but were not twenty years ago. Twenty-five years of work, and a product that did everything in the business, is now history...the time and effort put into the application runs to thousands of hours. But with the knowledge I have gained I could probably rebuild a better product in about three months.

First of all, I am not criticizing what you have done, or dismissing the effort you have put into this project or the results you have achieved; I think what you have done is stellar, to say the least. But if you were starting from scratch today, you could generate the same result far cheaper and faster.

I do not know which packages of ElasticSearch you were looking at, but I downloaded the "ElasticSearch" engine for free...(It is not set up yet, as I need another 64-bit server or a loaded motherboard and a bunch of drives...) When you are talking nodes you are basically talking computers...100 nodes is a hundred-computer cluster...this is not what you will need...probably ever.

Jim

----- Original Message -----
From: "John W Colby" <jwcolby at gmail.com>
To: "Access Developers discussion and problem solving" <accessd at databaseadvisors.com>
Sent: Sunday, 9 March, 2014 10:02:48 PM
Subject: [AccessD] ElasticSearch my hind leg...

 >>"I was able to get (7) 200gb SSDs and form the raid array..." OMG...every home should have one. ;-)

LOL, this is a business not a home.

 >>It indexes everything and it is quick; according to the webinar, one TB can be indexed in about 
90 seconds. The application can group millions of rows of data in milliseconds.

And read the fine print.  NOBODY does those kinds of numbers without enormous cloud compute (and 
enormous budgets).

Give me some credit please for what I have managed to do for a virtual company of about 7 people, 
with a total hardware budget of around $20K over 9 years.  I started with NO hardware and had never 
even seen SQL Server, and I hand built (eventually) a dual processor 16 core machine with 96 gigs of 
RAM, 9 TB of main (rotating) storage (RAID 6), a TB of SSD storage (RAID 5) to handle SQL Server, 
and a second server with 6 cores and 32 GB of RAM and 6 VMs running third party software, CAS and 
NCOA processing 500 million addresses every month AND handling the actual orders for the client as 
well.  AND I designed and executed a very complex system in C# automating that SQL Server to push 
those 500 MILLION records to CSV files every month (that's 1000 CSV files BTW), pushing those files 
out to Accuzip on the virtual machines, babysitting Accuzip (third party software written in Visual 
FoxPro), and merging the 1000 result files back into SQL Server.
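
The export step described above (500 million records pushed out as 1000 CSV files) comes to 500,000 records per file. What follows is only a minimal sketch of that kind of chunked export, written in Python rather than the C# the actual system uses. The function name, file naming scheme, and demo data are hypothetical and are not taken from the real application.

```python
import csv
import os
import tempfile
from itertools import islice


def export_in_chunks(rows, out_dir, rows_per_file, header):
    """Write an iterable of rows to numbered CSV files (hypothetical helper).

    Each file gets the header plus at most rows_per_file data rows.
    Returns the list of file paths written, in order.
    """
    paths = []
    it = iter(rows)
    part = 0
    while True:
        # Pull the next batch without loading the whole data set in memory.
        chunk = list(islice(it, rows_per_file))
        if not chunk:
            break
        part += 1
        path = os.path.join(out_dir, f"part_{part:04d}.csv")
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(chunk)
        paths.append(path)
    return paths


# 500 million rows spread over 1000 files.
ROWS_PER_FILE = 500_000_000 // 1000

# Tiny demonstration with made-up data: 10 rows, 4 rows per file.
demo_rows = [(i, f"{i} Main St") for i in range(10)]
demo_dir = tempfile.mkdtemp()
demo_paths = export_in_chunks(demo_rows, demo_dir, 4, ["id", "address"])
```

In a real run the rows would come from a database cursor rather than a list, so the generator-style batching matters more than it does in the demo.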

With the exception of a student C# programmer (a two-year graduate whom I met when I took my C# 
classes) helping me, I did this all BY MY SELF.

It is more than slightly annoying to have folks say "go look at xyz".  Buddy, I looked at a TON of 
stuff trying to get something that I could build and handle BY MY SELF, starting in 2004 when NONE 
of this hi-falutin crap you mention was even a gleam in its daddy's eye.

I hope you got the BY MY SELF reference.  This is NOT IBM or Google or Facebook with a 50 million 
dollar data center and a team of programmers.  This is Colby Consulting with John W. Colby doing the 
whole damned thing.  When I say EVERYTHING I mean researching and ordering hardware from Newegg, 
joining the Microsoft program to get my hands on the software, BUILDING the hardware (and 
maintaining it, and upgrading it), installing all of the Windows 2003, then 2008 and SQL Server 2000 
/ 2005 / 2008 software, researching the Accuzip solution for CAS / NCOA, buying it and learning how 
it worked and how to automate it, designing the methodology for getting these big tables (text 
files) into databases in SQL Server, designing the C# application and writing same (with my 
assistant) over 18 months, all while actually performing work on those same SQL Server databases 
providing counts and fulfilling orders for my client.

You are clueless about what it took to get where I am today and what it would take to throw all this away 
just to use some other data store.  The data store is 1/4 of the business that I manage. Maybe only 
1/10th.  I look back on the last nine years and wonder how I managed to get all that crap done.

So no, it seems unlikely I am going to do that ElasticSearch thing.  Not that it isn't fascinating 
and all, but being a one man show I have to pick my battles and that isn't something I need.

Only $500 per year to monitor your first 5 nodes
$3,000 per year for each 5 node cluster thereafter

To get the numbers you mention I probably only need a thousand nodes.  Uh yea... Or rather no...
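
For scale, the two price lines quoted above can be worked through for a given node count. This is a rough sketch that assumes they mean $500 per year for the first block of five nodes and $3,000 per year for each further block of five. That is one reading of the quote, not a confirmed price list, and the function and parameter names are hypothetical.

```python
import math


def annual_monitoring_cost(nodes, first_block=500, later_block=3000, block_size=5):
    """Yearly monitoring cost under the assumed reading of the quoted pricing.

    Assumption: nodes are billed in blocks of block_size; the first block
    costs first_block dollars and every further block costs later_block.
    """
    if nodes <= 0:
        return 0
    blocks = math.ceil(nodes / block_size)
    return first_block + (blocks - 1) * later_block


cost_5 = annual_monitoring_cost(5)        # 1 block:    500
cost_100 = annual_monitoring_cost(100)    # 20 blocks:  500 + 19 * 3000
cost_1000 = annual_monitoring_cost(1000)  # 200 blocks: 500 + 199 * 3000
```

Under that reading, a thousand nodes would run to $597,500 per year for monitoring alone, before any hardware.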

John W. Colby

Reality is what refuses to go away
when you do not believe in it

On 3/9/2014 8:18 PM, Jim Lawrence wrote:
> Hi John:
>
> "I was able to get (7) 200gb SSDs and form the raid array..." OMG...every home should have one. ;-)
>
> I know we have gone through this discussion before, but given the amount of data you are working with and the complexity of the searches required, I would be so bold as to suggest that you at least look at the following technology from ElasticSearch:
>
> http://www.elasticsearch.org ...and... http://www.elasticsearch.org/resources < check out the webinar...
>
> The system in a nutshell is text based. The number of rows (documents) depends on the hardware, and it can handle thirty-thousand-plus columns. It indexes everything and it is quick; according to the webinar, one TB can be indexed in about 90 seconds. The application can group millions of rows of data in milliseconds. The data can be limited to a single directory, a HD, a computer or a whole cluster.
>
> Jim
>



-- 
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

