This is a discussion on How to remember last record read within the SQL Server forums, part of the Microsoft SQL Server category; --> Hi, Can you guys see if there's a solution to this problem? I have a database from which I ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, Can you guys see if there's a solution to this problem? I have a database from which I have to read each record and process that record. New records are being added all the time, so I need to go back and check for new records and process them. However: -- there is no 'identity' column in the database design (so I cannot keep track of the last record read by use of a numeric variable) -- I am not allowed to update the database (so I cannot flag the records I have read). My problem is: how can I know which records I have already read and which ones I haven't read yet? I don't want to process records twice and don't want to miss any records. Is there a known solution to this problem? Any ideas? Thanks. |
| |||
| Use a filed which has the datetime of type. Every time a record is inserted update this with latest datetime. When viewing the record get the latest of this field and check whether this different from the record by getting the latest datetime from the table again Madhivanan |
| |||
| On 2 Mar 2005 22:36:48 -0800, francisds@hotmail.com wrote: (snip) >My problem is: how can I know which records I have already read and >which ones I haven't read yet? I don't want to process records twice >and don't want to miss any records. Hi francisds, Madhivanan already pointed out the possibility to use a datetime column with the date a row is inserted in the table. If such a column is not available and you're not allowed to add one, here are a few other options. * You say that you're not allowed to update the database. Is that just this table? If so, you can create a new table to hold the ey values of all rows you already have processed. * If you're not allowed to add tables to the database either, you could consider keeping track of rows already processed in a table in another database. This is quite a hack, though. It can be hard to keep the databases synchronized, especially if you ever have to restore to a backup. * Yet another option would be to use a trigger on the table. Either do the complete processing in that trigger (if it's short and can't fail), or write rows with the primary keys of the inserted rows to another table that you can use to decide which rows to process. Note that even this trigger will slow down the insert process somewhat! Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address) |
| |||
| (francisds@hotmail.com) writes: > Can you guys see if there's a solution to this problem? > > I have a database from which I have to read each record and process > that record. New records are being added all the time, so I need to go > back and check for new records and process them. > > However: > > -- there is no 'identity' column in the database design (so I cannot > keep track of the last record read by use of a numeric variable) > > -- I am not allowed to update the database (so I cannot flag the > records I have read). > > My problem is: how can I know which records I have already read and > which ones I haven't read yet? I don't want to process records twice > and don't want to miss any records. > > Is there a known solution to this problem? Any ideas? So where does your process live? Is it a stored procedure or a client program? If it's stored procedure maybe a temp table would do. More information about this process and the circumstances would be needed to give a good answer. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinf...2000/books.asp |
| |||
| Thank you for the suggestions. I am not sure they will work yet, but it does give me hope of finding a solution. I still need to digest them. As for the Erland's question. I am writing an application which runs on a separate PC from the SQL Server database which holds the table. The table contains records of a business event (for example, a payment). Each time an event occurs, then a record describing that event is appended in the table. My application needs to process all those records. Again, my problem is, I am not allowed to update the table (it's a stupid political issue, not a technical one). I checked and it has no identity field. Since the table has a column containing the datetime of the event that occured, I tried using the datetime to remember the last event I processed. Unfortunately, I found out that events can be generated from many points and there is no guarantee that the records of that event will be saved in the order they occured (IOW, an event that occured at 2:30pm could be saved before an event that occured at 2:15). -Francis Erland Sommarskog wrote: > So where does your process live? Is it a stored procedure or a client > program? If it's stored procedure maybe a temp table would do. > > More information about this process and the circumstances would be > needed to give a good answer. > > |
| |||
| > * You say that you're not allowed to update the database. Is that just > this table? If so, you can create a new table to hold the ey values of > all rows you already have processed. Hi Hugo, Novice question: what do you mean by the "value" of a row? How do I get that? I discovered that there is a column of 'uniqueidentifier' data type. I could use this probably, although since the database generates thousands of records a day, the list of processed keys will be fairly large. The trigger suggestion is intriguing. I might be able to use it for another project, but not for this since I cannot change the database (uhm, I haven't used triggers before, but I assume it requires modifying the database, at least to save a trigger?) Thanks. -Francis |
| |||
| (francisds@hotmail.com) writes: > Thank you for the suggestions. I am not sure they will work yet, but > it does give me hope of finding a solution. I still need to digest > them. > > As for the Erland's question. I am writing an application which runs > on a separate PC from the SQL Server database which holds the table. > > The table contains records of a business event (for example, a > payment). Each time an event occurs, then a record describing that > event is appended in the table. My application needs to process all > those records. OK, so in theory you could just keep track in memory of which records you have processed. But if you application crashes, you will lose the knowledge of what you have processed. Then again, that depends on what your processing consists of. If you have to redo the processing in case of a crash that's the right thing. Else your alternative appears to be to write to a file on the machine you application runs on. When your application starts, you read the file to see what you already have processed. The challenge is two have a two-phase commit between the file and the processing. If you write to the file too soon, you mail fail to process a record in case of a restart. If you write too late, you may reprocess a record on a restart. Then again, if your processing involves some external mechanism, you have the same problem, if you were to update the table. > Again, my problem is, I am not allowed to update the table (it's a > stupid political issue, not a technical one). I checked and it has no > identity field. > > Since the table has a column containing the datetime of the event that > occured, I tried using the datetime to remember the last event I > processed. Unfortunately, I found out that events can be generated > from many points and there is no guarantee that the records of that > event will be saved in the order they occured (IOW, an event that > occured at 2:30pm could be saved before an event that occured at 2:15). You mentioned in another post that there is a uniqueidentifier column. Thus the table has a primary key. An identity column had been a little easier to work with since it monotonic, and a uniqueidentifier is not. But you could use a combination of datetime and uniqueidentifer. The nice with the datetime column, is that it permits you prune the log of processed records easily. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinf...2000/books.asp |
| |||
| On 4 Mar 2005 10:58:19 -0800, francisds@hotmail.com wrote: >> * You say that you're not allowed to update the database. Is that >just >> this table? If so, you can create a new table to hold the ey values >of >> all rows you already have processed. > >Hi Hugo, > >Novice question: what do you mean by the "value" of a row? How do I >get that? Hi Francis, My fault. I should have written "you can create a new table to hold the values of the primary key columns in the rows you already have processed". >I discovered that there is a column of 'uniqueidentifier' data type. I >could use this probably, although since the database generates >thousands of records a day, the list of processed keys will be fairly >large. You could only use this if you were allowed to add a column to the table. And if you are allowed to do that, than there are better solutions than uniqueidentifier. >The trigger suggestion is intriguing. I might be able to use it for >another project, but not for this since I cannot change the database >(uhm, I haven't used triggers before, but I assume it requires >modifying the database, at least to save a trigger?) Yes, the trigger is stored in the database. But I'm not sure exactly what you may and may not change, as in other messages you wrote that you are not allowed to update the table. The trigger is not part of the table, it's stored seperately in the same database. BTW, have you already considered how to handle updates to a row that you have already processed? Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address) |
| |||
| > You mentioned in another post that there is a uniqueidentifier column. > Thus the table has a primary key. An identity column had been a little > easier to work with since it monotonic, and a uniqueidentifier is not. > But you could use a combination of datetime and uniqueidentifer. The > nice with the datetime column, is that it permits you prune the log > of processed records easily. As I wrote earlier, I am not guaranteed about the order of arrival of the records. How do you figure I might use the datetime in combination with the uniqueidentifier? -Francis |
| ||||
| > >I discovered that there is a column of 'uniqueidentifier' data type. I > > You could only use this if you were allowed to add a column to the > table. And if you are allowed to do that, than there are better > solutions than uniqueidentifier. Oh, I meant to say that I discovered that the table I am supposed to read has a column of uniqueidentifiers, not that I can create one for my use. > Yes, the trigger is stored in the database. But I'm not sure exactly > what you may and may not change, as in other messages you wrote that you > are not allowed to update the table. The trigger is not part of the > table, it's stored seperately in the same database. I can't touch the database at all. :-( > BTW, have you already considered how to handle updates to a row that you > have already processed? Good alert. It's not needed for this application (the events are mechanically generated and no one will be modifying the data). (If it makes things clearer, the database I am reading is a Cisco CallManager CDR table). http://www.cisco.com/en/US/products/...html #wp33440 -Francis |