This is a discussion on Updating data in table depending on data in same table within the Oracle Miscellaneous forums, part of the Oracle Database category; --> Hey everyone, First of all apologies for posting this in the oracle.server section I meant to post it in ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hey everyone, First of all apologies for posting this in the oracle.server section I meant to post it in here first. I'm still a relative newb so go easy please! I've inherited a system from someone that is in a bit of a mess and I need to tidy it up. It's an Oracle 10g database. The part i have to clean up basically has two tables. Table A attributes are: SOURCEID, NAME, DESCRIPTION, DEPARTMENT, LANGUAGES. Table B attributes are: SOURCEID, LANGID, ROLE, STATUS. Now the way this system is supposed to work is that a row is entered into table A and X amount of languages are added to the LANGUAGES field. This field is then split up into a row for each language into table B with the corresponding SOURCEID. e.g. Table A: 123 -- Homer -- lazy -- HR -- French;Spanish;German Table B: 123 -- French -- manager -- approved 123 -- Spanish -- admin -- pending 123 -- German -- teaboy -- proposed This system had been working fine until a new person started and instead of just adding a new language to the original row in table A she added 1000+ rows that are identical to those already in the table except for the fact the language is Japanese. So now the table looks like Table A: 123 -- Homer -- lazy -- HR -- French;Spanish;German 456 -- Homer -- lazy -- HR -- Japanese; Table B: 123 -- French -- manager -- approved 123 -- Spanish -- admin -- pending 123 -- German -- teaboy -- proposed 456 -- Japanese -- janitor -- approved I have been given the unenviable task of fixing this by, identifying all terms that have identical attributes (except for the LANGUAGES attribute), changing the SOURCEID of the duplicate in table B to the master term SOURCEID (so in the example above, in table B change the row 456 -- Japanese -- janitor -- approved to 123 -- Japanese -- janitor -- approved), tagging on Japanese onto the master term (in the example above, in table A change the row 123 -- Homer -- lazy -- HR -- French;Spanish;German to 123 -- Homer -- lazy -- HR -- French;Spanish;German;Japanese in table A (in the example above, delete the row 456 -- Homer -- lazy -- HR -- Japanese Thanks for sticking with me this far (if you have!). Luckily enough I know for a fact that all the master terms are in the SOURCEID range 1 to 2000 and the duplicates are in the SOURCEID range 2500 to 6000. so identifying the terms is not a problem. I just ran an intersect query and found my ~1000 culprit terms. As you probably know though, from this I am not getting the SOURCEIDS of either row so basically i have been getting the SOURCEIDS of the identified terms and running the following command manually. update target set target.SOURCEID=MASTER_NUMBER where target.SOURCEID=DUPLICATE_NUMBER; update source set source.LANGSREQUIRED='French;Spanish;German;Italia n;Brazilian Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French Canadian;Dutch;Korean;Japanese;' where source.SOURCEID=MASTER_NUMBER; DELETE FROM source WHERE sourceid = DUPLICATE_NUMBER; As you can tell my wrists are about to fall off with the RSI of continually trying to find out the IDs and then run these commands. SQL is not really my game so i am not sure whether i would be able to use some procedure or cursor or something along those lines... I have already wasted an awful lot of time trying to read of some way to help me by running different commands but i am very quickly running out of time... Any help at all would be appreciated or even just a big fat no to say you're gonna have to put up with it and just do it manually. Thank you reading all this and I hope that I have explained it clearly enough, Jonathan. |
| |||
| The J Man wrote: > Hey everyone, > > First of all apologies for posting this in the oracle.server section I > meant to post it in here first. > > I'm still a relative newb so go easy please! > > I've inherited a system from someone that is in a bit of a mess and I > need to tidy it up. It's an Oracle 10g database. > > The part i have to clean up basically has two tables. > Table A attributes are: SOURCEID, NAME, DESCRIPTION, DEPARTMENT, > LANGUAGES. > Table B attributes are: SOURCEID, LANGID, ROLE, STATUS. > > Now the way this system is supposed to work is that a row is entered > into table A and X amount of languages are added to the LANGUAGES > field. This field is then split up into a row for each language into > table B with the corresponding SOURCEID. > e.g. > Table A: > 123 -- Homer -- lazy -- HR -- French;Spanish;German > > Table B: > 123 -- French -- manager -- approved > 123 -- Spanish -- admin -- pending > 123 -- German -- teaboy -- proposed > > This system had been working fine until a new person started and > instead of just adding a new language to the original row in table A > she added 1000+ rows that are identical to those already in the table > except for the fact the language is Japanese. So now the table looks > like > Table A: > 123 -- Homer -- lazy -- HR -- French;Spanish;German > 456 -- Homer -- lazy -- HR -- Japanese; > > Table B: > 123 -- French -- manager -- approved > 123 -- Spanish -- admin -- pending > 123 -- German -- teaboy -- proposed > 456 -- Japanese -- janitor -- approved > > I have been given the unenviable task of fixing this by, identifying > all terms that have identical attributes (except for the LANGUAGES > attribute), changing the SOURCEID of the duplicate in table B to the > master term SOURCEID (so in the example above, in table B change the > row 456 -- Japanese -- janitor -- approved to 123 -- Japanese -- > janitor -- approved), tagging on Japanese onto the master term (in the > example above, in table A change the row 123 -- Homer -- lazy -- HR -- > French;Spanish;German to 123 -- Homer -- lazy -- HR -- > French;Spanish;German;Japanese > in table A (in the example above, delete the row 456 -- Homer -- lazy > -- HR -- Japanese > > Thanks for sticking with me this far (if you have!). Luckily enough I > know for a fact that all the master terms are in the SOURCEID range 1 > to 2000 and the duplicates are in the SOURCEID range 2500 to 6000. so > identifying the terms is not a problem. I just ran an intersect query > and found my ~1000 culprit terms. As you probably know though, from > this I am not getting the SOURCEIDS of either row so basically i have > been getting the SOURCEIDS of the identified terms and running the > following command manually. > > update target set target.SOURCEID=MASTER_NUMBER where > target.SOURCEID=DUPLICATE_NUMBER; > update source set > source.LANGSREQUIRED='French;Spanish;German;Italia n;Brazilian > Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French > Canadian;Dutch;Korean;Japanese;' where source.SOURCEID=MASTER_NUMBER; > DELETE FROM source WHERE sourceid = DUPLICATE_NUMBER; > > As you can tell my wrists are about to fall off with the RSI of > continually trying to find out the IDs and then run these commands. > SQL is not really my game so i am not sure whether i would be able to > use some procedure or cursor or something along those lines... I have > already wasted an awful lot of time trying to read of some way to help > me by running different commands but i am very quickly running out of > time... > > Any help at all would be appreciated or even just a big fat no to say > you're gonna have to put up with it and just do it manually. > > Thank you reading all this and I hope that I have explained it clearly > enough, > Jonathan. What you have described above, if I understand it correctly, is a relational horror story. What happened should be technically impossible. While you certainly need to clean up the mess the LANGUAGES column serves no useful purpose as the information is available, properly stored in table B. Also be aware that some of your column names are Oracle reserved words. NAME and ROLE jump immediately to mind. But this should start you off: SELECT col1, col2, col3, col4, COUNT(*) FROM table GROUP BY col1, col2, col3, col4 HAVING COUNT(*) > 1; -- Daniel A. Morgan University of Washington damorgan@x.washington.edu (replace x with u to respond) Puget Sound Oracle Users Group www.psoug.org |
| |||
| On Feb 8, 2:14 pm, "The J Man" <jonathan.kee...@gmail.com> wrote: > Hey everyone, > > First of all apologies for posting this in the oracle.server section I > meant to post it in here first. > > I'm still a relative newb so go easy please! > > I've inherited a system from someone that is in a bit of a mess and I > need to tidy it up. It's an Oracle 10g database. > > The part i have to clean up basically has two tables. > Table A attributes are: SOURCEID, NAME, DESCRIPTION, DEPARTMENT, > LANGUAGES. > Table B attributes are: SOURCEID, LANGID, ROLE, STATUS. > > Now the way this system is supposed to work is that a row is entered > into table A and X amount of languages are added to the LANGUAGES > field. This field is then split up into a row for each language into > table B with the corresponding SOURCEID. > e.g. > Table A: > 123 -- Homer -- lazy -- HR -- French;Spanish;German > > Table B: > 123 -- French -- manager -- approved > 123 -- Spanish -- admin -- pending > 123 -- German -- teaboy -- proposed it is not clear how B gets populated from A. Table A doesn't seems like a staging table as I first thought. There is obviously more going on in the process. So is it manual or automated? What are the rules? > > This system had been working fine until a new person started and > instead of just adding a new language to the original row in table A > she added 1000+ rows that are identical to those already in the table > except for the fact the language is Japanese. So now the table looks > like > Table A: > 123 -- Homer -- lazy -- HR -- French;Spanish;German > 456 -- Homer -- lazy -- HR -- Japanese; > > Table B: > 123 -- French -- manager -- approved > 123 -- Spanish -- admin -- pending > 123 -- German -- teaboy -- proposed > 456 -- Japanese -- janitor -- approved > > I have been given the unenviable task of fixing this The first error this shows is that using an ID column as the PK buys you ZERO data integrity. The second error this shows is tha application fails to find the related data in table A. yes you have a real mess. So go to your backup from the night before and start over. reload those two tables and tell the entry person to do it right this time (it's actually not her fault, but she still has to reenter any data from that day.) You do do backups don't you? (By now this solution is too late, you let too much other stuff get inserted. time for plan B) > ... by, identifying > all terms that have identical attributes (except for the LANGUAGES > attribute), But the only common attribute is the source ID between table B and table A. > ... changing the SOURCEID of the duplicate in table B to the > master term SOURCEID (so in the example above, in table B change the > row 456 -- Japanese -- janitor -- approved to 123 -- Japanese -- > janitor -- approved), tagging on Japanese onto the master term (in the > example above, in table A change the row 123 -- Homer -- lazy -- HR -- > French;Spanish;German to 123 -- Homer -- lazy -- HR -- > French;Spanish;German;Japanese > in table A (in the example above, delete the row 456 -- Homer -- lazy > -- HR -- Japanese > > Thanks for sticking with me this far (if you have!). Luckily enough I > know for a fact that all the master terms are in the SOURCEID range 1 > to 2000 and the duplicates are in the SOURCEID range 2500 to 6000. so so this was either a bulk load, or manual data entry over a period of time. in either case the process is flawed. > identifying the terms is not a problem. I just ran an intersect query > and found my ~1000 culprit terms. As you probably know though, from > this I am not getting the SOURCEIDS of either row so basically i have > been getting the SOURCEIDS of the identified terms and running the > following command manually. > > update target set target.SOURCEID=MASTER_NUMBER where > target.SOURCEID=DUPLICATE_NUMBER; > update source set > source.LANGSREQUIRED='French;Spanish;German;Italia n;Brazilian > Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French > Canadian;Dutch;Korean;Japanese;' where source.SOURCEID=MASTER_NUMBER; > DELETE FROM source WHERE sourceid = DUPLICATE_NUMBER; Where's table A and table B??? which is which?? > > As you can tell my wrists are about to fall off with the RSI of > continually trying to find out the IDs and then run these commands. > SQL is not really my game so i am not sure whether i would be able to > use some procedure or cursor or something along those lines... I have > already wasted an awful lot of time trying to read of some way to help > me by running different commands but i am very quickly running out of > time... > > Any help at all would be appreciated or even just a big fat no to say > you're gonna have to put up with it and just do it manually. > > Thank you reading all this and I hope that I have explained it clearly > enough, > Jonathan. Solutions: 1. A PL/SQL procedure would be the fastest IF you knew PL/SQL. Given you feel less than familiar with SQL, might I ask how you got this assignment? No one else there knows Oracle? 2. If you can do it manually, then why not spool out the list of new/ old ids, and languages (ie just a SELECT query) and edit the file to make the UPDATE statements you need. You do know how to use an editor like VI with macro commands like global substitute, right? (If you say no to that, then I pity you since you clearly work for an incompetent manager who assigns people to tasks they cannot pereform.) long term, either change this application or get another job. ed |
| |||
| On Feb 8, 8:53 pm, "Ed Prochak" <edproc...@gmail.com> wrote: > On Feb 8, 2:14 pm, "The J Man" <jonathan.kee...@gmail.com> wrote: > Thanks for both of the replies... > > > it is not clear how B gets populated from A. Table A doesn't seems > like a staging table as I first thought. There is obviously more going > on in the process. So is it manual or automated? What are the rules? > Sorry, my explanation was not too clear and i was using sample data rather that the actual data, my bad. Table A is basically a list of English terms. Each term has a number of translations it needs to be translated into. This entry goes through a few status changes and when it gets to approved a row is created in table B for each language with the SOURCEID of the English term. Table A is manually added. Table B is automatically populated and then manually edited.( i.e. the translation of the actual English term is added) > The first error this shows is that using an ID column as the PK buys > you ZERO data integrity. > The second error this shows is tha application fails to find the > related data in table A. > yes you have a real mess. > Well not really, because we may want duplicate entries in the table as long as they have a valid reason for doing so. This is why this problem occurred. When the user enters her row she is warned that an entry for this already exists and asks her are they sure they want to add the new row. This is for the scenario where an English word may have different contexts in different languages. I may have given the wrong impression of this as i did not clearly explain what i was actually doing, i tried to ask a generic question. > So go to your backup from the night before and start over. reload > those two tables and tell the entry person to do it right this time > (it's actually not her fault, but she still has to reenter any data > from that day.) You do do backups don't you? > (By now this solution is too late, you let too much other stuff get > inserted. time for plan B) as you said, too late. Unfortunately. > > But the only common attribute is the source ID between table B and > table A. I meant common attributes between tuples in table A. So any tuple within table A that has the same name and description. > > so this was either a bulk load, or manual data entry over a period of > time. > in either case the process is flawed. > It was a manual data entry over time. Could you tell me why it is flawed? > > update TARGET set TARGET.SOURCEID=MASTER_NUMBER where > > TARGET.SOURCEID=DUPLICATE_NUMBER; > > update SOURCE set > > SOURCE .LANGSREQUIRED='French;Spanish;German;Italian;Braz ilian > > Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French > > Canadian;Dutch;Korean;Japanese;' where SOURCE .SOURCEID=MASTER_NUMBER; > > DELETE FROM SOURCE WHERE SOURCEID= DUPLICATE_NUMBER; > > Where's table A and table B??? which is which?? > Apologies, i copied that straight from my SQL Developer, table A is SOURCE and table B is TARGET. > Solutions: > 1. A PL/SQL procedure would be the fastest IF you knew PL/SQL. > I think i will go down this road. I was thinking of using two cursors, one to get all entries under 2000, and another to see if there are any dupes. If there are, update the rows and then fetch the next row. > Given you feel less than familiar with SQL, might I ask how you got > this assignment? No one else there knows Oracle? Bingo. When i said i was less than familiar, i meant apart from creating tables, triggers, sequences etc. I would have no in depth experience of using cursors or anything like that. (which is about to change!) > You do know how to use an editor like VI with macro commands like > global substitute, right? > (If you say no to that, then I pity you since you clearly work for an > incompetent manager who assigns people to tasks they cannot pereform.) Yes, i do know how to that. But i do have an incompetent manager. > > long term, either change this application or get another job. > Thank you for your help, J. |
| |||
| On Feb 9, 6:21 am, "The J Man" <jonathan.kee...@gmail.com> wrote: > On Feb 8, 8:53 pm, "Ed Prochak" <edproc...@gmail.com> wrote: > [] > Sorry, my explanation was not too clear and i was using sample data > rather that the actual data, my bad. > Table A is basically a list of English terms. Each term has a number > of translations it needs to be translated into. This entry goes > through a few status changes and when it gets to approved a row is > created in table B for each language with the SOURCEID of the English > term. Table A is manually added. Table B is automatically populated > and then manually edited.( i.e. the translation of the actual English > term is added) > > > The first error this shows is that using an ID column as the PK buys > > you ZERO data integrity. > > The second error this shows is tha application fails to find the > > related data in table A. > > yes you have a real mess. > > Well not really, because we may want duplicate entries in the table as > long as they have a valid reason for doing so. This is why this > problem occurred. > When the user enters her row she is warned that an entry for this > already exists and asks her are they sure they want to add the new > row. This is for the scenario where an English word may have different > contexts in different languages. I may have given the wrong impression > of this as i did not clearly explain what i was actually doing, i > tried to ask a generic question. > [] > > so this was either a bulk load, or manual data entry over a period of > > time. > > in either case the process is flawed. > > It was a manual data entry over time. Could you tell me why it is > flawed? it is flawed because the design allows these duplicates. A better design would have dealt with this diferently. Not knowing all your business requirements I won't suggest a solution here (and I don't have one off the top of my head). Another easy to point out flaw is the DEPARTMENT attribute in table A. That really should be another table, or (since dups are allowed in this table) multiple rows. You did not come here for a design review, but a big part of what I posted was just that. I apologize if I offended you. > [] > > Solutions: > > 1. A PL/SQL procedure would be the fastest IF you knew PL/SQL. > > I think i will go down this road. I was thinking of using two cursors, > one to get all entries under 2000, and another to see if there are any > dupes. If there are, update the rows and then fetch the next row. > > > Given you feel less than familiar with SQL, might I ask how you got > > this assignment? No one else there knows Oracle? > > Bingo. When i said i was less than familiar, i meant apart from > creating tables, triggers, sequences etc. I would have no in depth > experience of using cursors or anything like that. (which is about to > change!) Okay, sounds like a good plan. Any assignment where you expand you knowledge is good. > > > You do know how to use an editor like VI with macro commands like > > global substitute, right? > > (If you say no to that, then I pity you since you clearly work for an > > incompetent manager who assigns people to tasks they cannot pereform.) > > Yes, i do know how to that. But i do have an incompetent manager. Another mea culpa. I assumed you were like some of the posters who can barely write a SELECT query but get assigned projects like this. I still think some table redesign is in order, but that doesn't solve your immediate problem. > > > > > long term, either change this application or get another job. > > Thank you for your help, > J. I hope I did. Now it sounds more like you are a good fit there. so keep the job, and thank you for not taking offense at my less than complimentary remarks. Ed |
| ||||
| On Feb 9, 2:10 pm, "Ed Prochak" <edproc...@gmail.com> wrote: > On Feb 9, 6:21 am, "The J Man" <jonathan.kee...@gmail.com> wrote: > > > it is flawed because the design allows these duplicates. A better > design would have dealt with this diferently. Not knowing all your > business requirements I won't suggest a solution here (and I don't > have one off the top of my head). Another easy to point out flaw is > the DEPARTMENT attribute in table A. That really should be another > table, or (since dups are allowed in this table) multiple rows. > > You did not come here for a design review, but a big part of what I > posted was just that. I apologize if I offended you. > No, no offense taken. I realize that any design that allows dupes in a database sucks but the end user (read stupid manager) won out in this case. > > Okay, sounds like a good plan. Any assignment where you expand you > knowledge is good. > Yes it seems to have worked on the test tables i have done out so its just a matter of running it now on the production database, all the while hovering over the rollback button! > > Another mea culpa. I assumed you were like some of the posters who can > barely write a SELECT query but get assigned projects like this. I > still think some table redesign is in order, but that doesn't solve > your immediate problem. > Nah, i realize it probably came off like that because i fluffed around the facts of what it was actually trying to do. A re design could be in order as well though. > > I hope I did. Now it sounds more like you are a good fit there. so > keep the job, and thank you for not taking offense at my less than > complimentary remarks. > Ed No problem, thanks again, I'd rather get any replies than none at all. J. |