[dba-Tech] PDF data extractor recommendation

jmoss111 at bellsouth.net
Fri Mar 18 12:01:27 CST 2005


Thanks, Marty. I want to extract columnar text from PDFs created by the invoicing module in QuickBooks Pro 2004. I would imagine the PDFs are somewhere between v3 and v6, depending on whichever engine Intuit uses to send invoices by email. I'm only talking about twenty 25-page PDF files.

I will look into the ScanSoft product. Once again, thanks.
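
For what it's worth, once a converter has turned an invoice into a plain text file, splitting out the columns is the easier half of the job. Here is a minimal Python sketch, assuming the converter leaves the columns separated by runs of two or more spaces. The sample lines, the item codes, and the function names are made up for illustration and are not taken from a real QuickBooks invoice.

```python
import re


def split_columns(line):
    """Split one line of columnar text on runs of two or more whitespace characters."""
    return re.split(r"\s{2,}", line.strip())


def parse_rows(text, n_columns):
    """Return only the lines that split into exactly n_columns fields."""
    rows = []
    for line in text.splitlines():
        fields = split_columns(line)
        if len(fields) == n_columns:
            rows.append(fields)
    return rows


# Hypothetical converter output, for illustration only.
sample = """\
Item        Description            Qty    Amount
WID-01      Blue widget, large     3      45.00
WID-02      Red widget             10     120.50

Thank you for your business.
"""

rows = parse_rows(sample, 4)
for row in rows:
    print(row)
```

Lines that do not split into the expected number of fields (blank lines, footers) are simply skipped. If the real text output separates columns with single spaces or tabs, the pattern would need adjusting.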

Jim


> 
> From: MartyConnelly <martyconnelly at shaw.ca>
> Date: 2005/03/18 Fri PM 12:42:49 EST
> To: Discussion of Hardware and Software issues <dba-tech at databaseadvisors.com>
> Subject: Re: [dba-Tech] PDF data extractor recommendation
> 
> Do you mean text extraction or image extraction? It also depends on whether
> the text has been rendered by OCR software.
> I believe Acrobat 7 holds an internal XML file of searchable text if the
> underlying image has been OCRed.
> You could use PDF Converter Professional from ScanSoft ($99).
> It converts PDF to Word or text files (I think; I'm not sure whether there
> was a SendTo command) or vice versa.
>  http://www.scansoft.com/pdfconverter/professional/
> 
> If you just want to index and search non-OCRed PDF files on a disk, i.e.
> a fax scanned into a PDF, you could use this new beta:
> The ScanSoft OmniPage Search Indexer enables you to search text found in 
> image files using world-leading optical character recognition (OCR) 
> technology. For example, it enables you to search the text found in 
> electronic fax documents that you may receive via email, as well as 
> other image formats including PDF, TIF, JPG, BMP, and Paperport.
> http://desktop.google.com/plugins/omnipagesearch.html
> 
> It requires installing this three-month beta from ScanSoft plus the Google
> Desktop Search engine.
> Indexing takes a fair amount of disk space and memory, so if you plan to
> index a couple of thousand image PDFs, plan on running it overnight.
> 
> jmoss111 at bellsouth.net wrote:
> 
> > Can anyone recommend a reasonably priced or free PDF extraction tool?
> >
> >Thanks,
> >
> >Jim
> >
> >
> >_______________________________________________
> >dba-Tech mailing list
> >dba-Tech at databaseadvisors.com
> >http://databaseadvisors.com/mailman/listinfo/dba-tech
> >Website: http://www.databaseadvisors.com
> >
> >  
> >
> 
> -- 
> Marty Connelly
> Victoria, B.C.
> Canada
> 




