Gustav Brock
gustav at cactus.dk
Fri Oct 3 15:49:52 CDT 2008
Hi Arthur

I don't see a problem with that approach. However, no matter how fast your C++ part is, the "disk part" may be slower: it takes a fraction of a second to create, write and close the file, so a chance exists that the bulk insert will start reading a file before it has been closed. This can be avoided by creating the file under another name, say xxx_20081103_01.tmp, and then letting the C++ part rename it to .csv after it has been closed.

/gustav

>>> fuller.artful at gmail.com 03-10-2008 22:09 >>>
Ok, I have all the glitches sorted out. It required a change or two on the incoming side (creation of the source files from the stream) and a change or two on the destination side, but we're there.

On to the next problem... The C++ program reads the stream and periodically writes an ASCII file, which I then inhale using Bulk Insert. This part is insanely fast; I can't believe how fast it is.

The C++ part creates an in-memory database and then periodically writes an ASCII file. At that point, I grab the file and do the bulk insert. Although the insert takes only seconds, it could well happen that more ASCII files appear in the meantime. So the directory may at any moment contain N files, to be processed in sequence.

The current plan is to grab the first file, inhale it, move it to another directory, and repeat. The files are going to have names such as xxx_20081103_01.csv, then 02 and so on. I'm pondering how to know which file to open next. I suppose that if I read the directory listing sorted by name or date, that will give me the earliest file, and then I can just pass that name to the sproc that calls Bulk Insert.

But before I code it, I thought I'd ask you wizards whether you can see any holes in this approach.

TIA,
Arthur
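
A minimal sketch of that consumer loop, written in modern C++17 for brevity (the "incoming" and "archive" directory names are made up for illustration, and the bulk-insert call is only stubbed):

#include <algorithm>
#include <filesystem>
#include <iostream>
#include <vector>

namespace fs = std::filesystem;

int main()
{
    const fs::path incoming = "incoming";   // hypothetical drop directory
    const fs::path archive  = "archive";    // hypothetical done directory
    fs::create_directory(archive);          // no-op if it already exists

    // Collect the .csv files currently waiting in the drop directory.
    std::vector<fs::path> files;
    for (const auto& entry : fs::directory_iterator(incoming))
        if (entry.is_regular_file() && entry.path().extension() == ".csv")
            files.push_back(entry.path());

    // Names like xxx_20081103_01.csv embed the date and a zero-padded
    // sequence number, so plain lexicographic order is chronological
    // order; no file-date comparison is needed.
    std::sort(files.begin(), files.end());

    for (const auto& f : files) {
        std::cout << "BULK INSERT from " << f << '\n';  // call the sproc here
        fs::rename(f, archive / f.filename());          // move out of the way
    }
}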
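And a minimal sketch of Gustav's write-then-rename suggestion, again C++17 with a hypothetical helper name; the real writer would use whatever file API it already has:

#include <cstdio>      // std::rename
#include <fstream>
#include <string>

// Write one batch to a .tmp file, close it, then rename it to .csv,
// so the bulk-insert side never sees a half-written file.
void publish_batch(const std::string& base, const std::string& data)
{
    const std::string tmp = base + ".tmp";   // e.g. xxx_20081103_01.tmp
    const std::string csv = base + ".csv";   // e.g. xxx_20081103_01.csv

    {
        std::ofstream out(tmp, std::ios::binary);
        out << data;
    }   // the file is flushed and closed when out goes out of scope

    // On the same volume the rename is effectively atomic: the consumer
    // either sees the finished .csv or nothing at all.
    std::rename(tmp.c_str(), csv.c_str());
}

Because the consumer only ever looks for *.csv, the rename is the handoff point; the window between create and close that Gustav describes disappears entirely.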