This is a discussion on Having difficulty writing a "best-fit" query.. within the pgsql Sql forums, part of the PostgreSQL category; --> Hi list, I have a many-to-many relationship between movies and genres and, in the link table I have a ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi list, I have a many-to-many relationship between movies and genres and, in the link table I have a third field called which orders the "appropriateness" of the relations within each movie. For example: movie_id, genre_id, relevance (i've used movie/genre titles for clarity here, but in reality they're id's) -------------------------------------------- beverly hills cop, action, 2 beverly hills cop, comedy, 1 the heartbreak kid, comedy, 2 the heartbreak kid, romance, 1 The above would mean, to my application: "Beverly Hills Cop is both an Action and a Comedy movie but primarily an Action movie." "The Heartbreak Kid is both a Comedy and a Romance movie but primarily a Comedy movie." First of all, if there's a better way to model this kind of ranking/ordering of many-to-many relationships, please let me know. Now, to my problem.. I'm taking a subset of all my genres, and I want to get ONE row for each movie in the subset alongside its most appropriate genre (whichever has the highest relevance). In other words, the best fit. I've had a few goes at writing this query but I can't seem to get it right.. The theory in my mind is to: 1. filter the link table down to rows that fit the subset of categories (easy) 2. filter the link table further to keep only the max(relevance) for each movie_id .. this is where i'm having trouble. If someone can shed some light on this for me, I'd really appreciate it. Thanks for your time, Jamie |
| |||
| On 10/16/07, Jamie Tufnell <diesql@googlemail.com> wrote: > I'm taking a subset of all my genres, and I want to get ONE row for each > movie in the subset alongside its most appropriate genre (whichever has the > highest relevance). In other words, the best fit. You didn't provide the expected output, but try: SELECT * FROM t tt WHERE relevance = (SELECT MAX(relevance) FROM t WHERE movie_id = tt.movie_id) ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| --- Jamie Tufnell <diesql@googlemail.com> wrote: > movie_id, genre_id, relevance (i've used movie/genre titles for clarity > here, but in reality they're id's) > -------------------------------------------- > beverly hills cop, action, 2 > beverly hills cop, comedy, 1 > the heartbreak kid, comedy, 2 > the heartbreak kid, romance, 1 > First of all, if there's a better way to model this kind of ranking/ordering > of many-to-many relationships, please let me know. Joe Celko had an example like this in his book: 17: EMPLOYMENT AGENCY PUZZLE http://www.elsevier.com/wps/find/boo...on#description the only difference was that he was modeling employees and skillsets. IIRC, the terminology for the improved model was the "full disjuctive" model. Regards, Richard Broersma Jr. ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| On Oct 16, 2007, at 12:14 , Richard Broersma Jr wrote: > the only difference was that he was modeling employees and > skillsets. IIRC, the terminology for > the improved model was the "full disjuctive" model. Off chance, might the full disjunction work be relevant here? http://pgfoundry.org/projects/fulldisjunction/ Michael Glaesemann grzm seespotcode net ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| ||||
| In article <b0a4f3350710160946y4a31a227sbffbf085cdbfa487@mail .gmail.com>, "Jamie Tufnell" <diesql@googlemail.com> writes: > Hi list, > I have a many-to-many relationship between movies and genres and, in the link > table I have a third field called which orders the "appropriateness" of the > relations within each movie. > For example: > movie_id, genre_id, relevance (i've used movie/genre titles for clarity here, > but in reality they're id's) > -------------------------------------------- > beverly hills cop, action, 2 > beverly hills cop, comedy, 1 > the heartbreak kid, comedy, 2 > the heartbreak kid, romance, 1 > The above would mean, to my application: > "Beverly Hills Cop is both an Action and a Comedy movie but primarily an Action > movie." > "The Heartbreak Kid is both a Comedy and a Romance movie but primarily a Comedy > movie." > First of all, if there's a better way to model this kind of ranking/ordering of > many-to-many relationships, please let me know. This looks fine to me. > Now, to my problem.. > I'm taking a subset of all my genres, and I want to get ONE row for each movie > in the subset alongside its most appropriate genre (whichever has the highest > relevance). In other words, the best fit. You could use something like that: SELECT m.name, g.name, mg.relevance FROM movies m JOIN mg ON mg.movie_id = m.id JOIN genres g ON g.id = mg.genre_id LEFT JOIN mg mg1 ON mg1.movie_id = mg.movie_id AND mg1.relevance > mg.relevance WHERE mg1.movie_id IS NULL This means that there must not be an link table entry for the same movie with a higher relevance. ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |