Unix Technical Forum

Having difficulty writing a "best-fit" query..

This is a discussion on Having difficulty writing a "best-fit" query.. within the pgsql Sql forums, part of the PostgreSQL category; --> Hi list, I have a many-to-many relationship between movies and genres and, in the link table I have a ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Sql

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 05:58 PM
Jamie Tufnell
 
Posts: n/a
Default Having difficulty writing a "best-fit" query..

Hi list,

I have a many-to-many relationship between movies and genres and, in the
link table I have a third field called which orders the "appropriateness" of
the relations within each movie.

For example:

movie_id, genre_id, relevance (i've used movie/genre titles for clarity
here, but in reality they're id's)
--------------------------------------------
beverly hills cop, action, 2
beverly hills cop, comedy, 1
the heartbreak kid, comedy, 2
the heartbreak kid, romance, 1

The above would mean, to my application:
"Beverly Hills Cop is both an Action and a Comedy movie but primarily an
Action movie."
"The Heartbreak Kid is both a Comedy and a Romance movie but primarily a
Comedy movie."

First of all, if there's a better way to model this kind of ranking/ordering
of many-to-many relationships, please let me know.

Now, to my problem..

I'm taking a subset of all my genres, and I want to get ONE row for each
movie in the subset alongside its most appropriate genre (whichever has the
highest relevance). In other words, the best fit.

I've had a few goes at writing this query but I can't seem to get it right..
The theory in my mind is to:

1. filter the link table down to rows that fit the subset of categories
(easy)

2. filter the link table further to keep only the max(relevance) for each
movie_id .. this is where i'm having trouble.

If someone can shed some light on this for me, I'd really appreciate it.

Thanks for your time,
Jamie

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 05:58 PM
=?UTF-8?Q?Rodrigo_De_Le=C3=B3n?=
 
Posts: n/a
Default Re: Having difficulty writing a "best-fit" query..

On 10/16/07, Jamie Tufnell <diesql@googlemail.com> wrote:
> I'm taking a subset of all my genres, and I want to get ONE row for each
> movie in the subset alongside its most appropriate genre (whichever has the
> highest relevance). In other words, the best fit.


You didn't provide the expected output, but try:

SELECT *
FROM t tt
WHERE relevance = (SELECT MAX(relevance)
FROM t
WHERE movie_id = tt.movie_id)

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 05:58 PM
Richard Broersma Jr
 
Posts: n/a
Default Re: Having difficulty writing a "best-fit" query..

--- Jamie Tufnell <diesql@googlemail.com> wrote:
> movie_id, genre_id, relevance (i've used movie/genre titles for clarity
> here, but in reality they're id's)
> --------------------------------------------
> beverly hills cop, action, 2
> beverly hills cop, comedy, 1
> the heartbreak kid, comedy, 2
> the heartbreak kid, romance, 1
> First of all, if there's a better way to model this kind of ranking/ordering
> of many-to-many relationships, please let me know.


Joe Celko had an example like this in his book:

17: EMPLOYMENT AGENCY PUZZLE
http://www.elsevier.com/wps/find/boo...on#description

the only difference was that he was modeling employees and skillsets. IIRC, the terminology for
the improved model was the "full disjuctive" model.
Regards,
Richard Broersma Jr.


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 05:58 PM
Michael Glaesemann
 
Posts: n/a
Default Re: Having difficulty writing a "best-fit" query..


On Oct 16, 2007, at 12:14 , Richard Broersma Jr wrote:

> the only difference was that he was modeling employees and
> skillsets. IIRC, the terminology for
> the improved model was the "full disjuctive" model.


Off chance, might the full disjunction work be relevant here?

http://pgfoundry.org/projects/fulldisjunction/

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 05:58 PM
Harald Fuchs
 
Posts: n/a
Default Re: Having difficulty writing a "best-fit" query..

In article <b0a4f3350710160946y4a31a227sbffbf085cdbfa487@mail .gmail.com>,
"Jamie Tufnell" <diesql@googlemail.com> writes:

> Hi list,
> I have a many-to-many relationship between movies and genres and, in the link
> table I have a third field called which orders the "appropriateness" of the
> relations within each movie.


> For example:


> movie_id, genre_id, relevance (i've used movie/genre titles for clarity here,
> but in reality they're id's)
> --------------------------------------------
> beverly hills cop, action, 2
> beverly hills cop, comedy, 1
> the heartbreak kid, comedy, 2
> the heartbreak kid, romance, 1


> The above would mean, to my application:
> "Beverly Hills Cop is both an Action and a Comedy movie but primarily an Action
> movie."
> "The Heartbreak Kid is both a Comedy and a Romance movie but primarily a Comedy
> movie."


> First of all, if there's a better way to model this kind of ranking/ordering of
> many-to-many relationships, please let me know.


This looks fine to me.

> Now, to my problem..


> I'm taking a subset of all my genres, and I want to get ONE row for each movie
> in the subset alongside its most appropriate genre (whichever has the highest
> relevance). In other words, the best fit.


You could use something like that:

SELECT m.name, g.name, mg.relevance
FROM movies m
JOIN mg ON mg.movie_id = m.id
JOIN genres g ON g.id = mg.genre_id
LEFT JOIN mg mg1 ON mg1.movie_id = mg.movie_id
AND mg1.relevance > mg.relevance
WHERE mg1.movie_id IS NULL

This means that there must not be an link table entry for the same
movie with a higher relevance.


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 07:45 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com