[dba-SQLServer] Really Interesting puzzle

Gustav Brock gustav at cactus.dk
Fri Oct 3 15:49:52 CDT 2008


Hi Arthur

I don't see a problem with that approach.

However, no matter how fast your C++ part is, the "disk part" may be slower: creating, writing and closing the file takes a small but non-zero amount of time, so there is a chance that the bulk insert starts reading a file before it has been closed.
This can be avoided by creating the file under another name, say xxx_20081103_01.tmp, and letting the C++ part rename it to .csv only after it has been closed.
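
A minimal sketch of the write-then-rename idea, assuming C++17 and std::filesystem are available to the C++ part. The function name write_then_publish and its parameters are made up for illustration and are not taken from Arthur's program:

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: writes `contents` to a .tmp sibling of `final_path`,
// closes it, and only then renames it to the final .csv name.
fs::path write_then_publish(const fs::path& final_path, const std::string& contents) {
    fs::path tmp_path = final_path;
    tmp_path.replace_extension(".tmp");  // xxx_20081103_01.csv -> xxx_20081103_01.tmp

    {
        std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
        if (!out) throw std::runtime_error("cannot open " + tmp_path.string());
        out << contents;
        out.close();  // flush and close before the file becomes visible as .csv
        if (!out) throw std::runtime_error("write failed for " + tmp_path.string());
    }

    // The reader only looks for *.csv, so it never sees a half-written file.
    fs::rename(tmp_path, final_path);
    return final_path;
}
```

This relies on the .tmp and .csv names living in the same directory, so the rename does not have to copy data between volumes.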

/gustav

>>> fuller.artful at gmail.com 03-10-2008 22:09 >>>
Ok I have all the glitches sorted out. It required a change or two on the
incoming side (creation of the source files from the stream) and a change or
two on the destination side, but we're there. On to the next problem...

The C++ program reads the stream and periodically writes an ASCII file,
which I then inhale using Bulk Insert. This part is insanely fast; I can't
believe how fast it is. The C++ part creates an in-memory database and then
periodically writes an ASCII file. At that point, I grab the file and do the
bulk insert. Although the insert only takes seconds, it could well happen
that more ASCII files appear. So the directory may at any moment contain N
files, to be processed in sequence. The current plan is to grab the first
file, inhale it, move it to another directory, and repeat.

The files are going to have names such as xxx_20081103_01.csv. And then 02
and so on.

I'm pondering how to know the next file to open. I suppose that if I read
the directory list sorted by name or date that will give me the earliest
file, and then I can just pass that name to the sproc that calls Bulk
Insert. But before I code it, I thought to ask you wizards if you can see
any holes in this approach.
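
One possible hole in sorting by name: a plain string sort only matches the intended order while the sequence number keeps a fixed width (after 99, "100" sorts before "99"). A small C++17 sketch that orders by the date and the numeric sequence instead, assuming the names always end in _YYYYMMDD_NN.ext as in the example above (parse_key and earliest are hypothetical names):

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <tuple>
#include <vector>

struct FileKey {
    std::string date;  // e.g. "20081103"
    int sequence;      // e.g. 1
};

// Extracts date and sequence from names shaped like xxx_20081103_01.csv.
std::optional<FileKey> parse_key(const std::string& name) {
    const auto dot = name.rfind('.');
    if (dot == std::string::npos || dot == 0) return std::nullopt;
    const auto us2 = name.rfind('_', dot);
    if (us2 == std::string::npos || us2 == 0) return std::nullopt;
    const auto us1 = name.rfind('_', us2 - 1);
    if (us1 == std::string::npos) return std::nullopt;

    const std::string date = name.substr(us1 + 1, us2 - us1 - 1);
    const std::string seq = name.substr(us2 + 1, dot - us2 - 1);
    auto all_digits = [](const std::string& s) {
        return !s.empty() && std::all_of(s.begin(), s.end(),
            [](unsigned char c) { return std::isdigit(c) != 0; });
    };
    if (date.size() != 8 || !all_digits(date)) return std::nullopt;
    if (!all_digits(seq) || seq.size() > 9) return std::nullopt;
    return FileKey{date, std::stoi(seq)};
}

// Returns the name with the smallest (date, sequence); skips unparseable names.
std::optional<std::string> earliest(const std::vector<std::string>& names) {
    std::optional<std::string> best;
    std::optional<FileKey> best_key;
    for (const auto& n : names) {
        auto key = parse_key(n);
        if (!key) continue;
        if (!best_key || std::tie(key->date, key->sequence) <
                             std::tie(best_key->date, best_key->sequence)) {
            best = n;
            best_key = key;
        }
    }
    return best;
}
```

Ordering by the name rather than by file date also avoids depending on timestamps, which can tie when several files are written within the same second.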

TIA,
Arthur




