Optimizing a Search Query Hello.
I have a really hard search query to perform and it's taking at least
4 seconds to get a result. Can anyone offer a better approach or
optimize my SQL?
We're searching for candidates with certain skills sets, software
experience and work experience from the following tables:
CREATE TABLE `candidates`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`first_name` varchar (127),
`surname` varchar (127),
`created_date` date,
`date_of_birth` date,
PRIMARY KEY (`id`)
) TYPE=InnoDB CHARACTER SET latin1 COLLATE latin1_swedish_ci;
CREATE TABLE `candidates_skills`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`candidate_id` integer (11) NOT NULL DEFAULT 0,
`skill_id` integer (11) NOT NULL DEFAULT 0,
`rating` tinyint (4) NOT NULL DEFAULT 0,
`years_experience` tinyint (4) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`)
) TYPE=InnoDB CHARACTER SET latin1 COLLATE latin1_swedish_ci;
CREATE TABLE `candidates_softwares`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`candidate_id` integer (11) NOT NULL DEFAULT 0,
`software_id` integer (11) NOT NULL DEFAULT 0,
`rating` tinyint (4) NOT NULL DEFAULT 0,
`years_experience` tinyint (4) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`)
) TYPE=InnoDB CHARACTER SET latin1 COLLATE latin1_swedish_ci;
CREATE TABLE `candidates_work_subcategories`
(
`id` integer (11) NOT NULL AUTO_INCREMENT ,
`candidate_id` integer (11),
`work_subcategory_id` integer (11),
PRIMARY KEY (`id`)
) TYPE=InnoDB CHARACTER SET latin1 COLLATE latin1_swedish_ci;
Both rating and years_experience in the candidates_skills and
candidates_softwares are scores out of 5.
When client's submit their search, sets of skills_id's, software_id's
and work_subcategory_id's are sent back to the server and the query is
performed in the following way:
A view is created to sum of the candidates rating and years_experience
scores for each related skill:
create view skill_view( candidate_id, skill_score) as
SELECT candidates.id, SUM( candidates_skills.rating +
candidates_skills.years_experience) as rating
FROM candidates, candidates_skills
WHERE candidates_skills.candidate_id = candidates.id AND
( candidates_skills.skill_id = 250 or candidates_skills.skill_id =
368)
GROUP by candidates.id
A second view is created to sum of the candidates rating and
years_experience scores for each related software:
create view software_view( candidate_id, software_score) as
SELECT candidates.id, SUM( candidates_softwares.rating +
candidates_softwares.years_experience) as rating
FROM candidates, candidates_softwares
WHERE candidates_softwares.candidate_id = candidates.id
AND ( candidates_softwares.software_id = 120 or
candidates_softwares.software_id = 112)
GROUP by candidates.id
A third view is created to sum of the number of work types applicable
to each candidates:
create view work_type_view( candidate_id, work_type_score) as
SELECT candidates.id, SUM( 1) as rating
FROM candidates, candidates_work_subcategories
WHERE candidates_work_subcategories.candidate_id = candidates.id
AND ( work_subcategory_id = 7 OR work_subcategory_id = 15) group by
candidates.id
And now the skills_view and software views are combined to create a
subtotal score; my understanding is that this needs to be done using a
Full Outer Join as candidates may be present in one view but not in
the other:
create view skill_software_view( candidate_id, sub_score) as
SELECT skill_view.candidate_id, ( if( software_score IS NULL, 0,
software_score) + if( skill_score IS NULL, 0, skill_score)) AS
total_score
FROM skill_view LEFT JOIN software_view ON skill_view.candidate_id =
software_view.candidate_id
UNION
SELECT software_view.candidate_id, ( if( software_score IS NULL, 0,
software_score) + if( skill_score IS NULL, 0, skill_score)) AS
total_score
FROM $skill_view RIGHT JOIN $software_view ON
$skill_view.candidate_id = $software_view.candidate_id
And finally the work type score is included by completing a full outer
join with the work_type_view and the skill_software_view:
create view best_candidates( candidate_id, total_score) as
SELECT work_type_view.candidate_id, ( if( sub_score IS NULL, 0,
sub_score) + if( work_type_score IS NULL, 0, work_type_score)) AS
total_score
FROM work_type_view LEFT JOIN skill_software_view ON
work_type_view.candidate_id = skill_software_view.candidate_id
UNION
SELECT skill_software_view.candidate_id, ( if( sub_score IS NULL, 0,
sub_score) + if( work_type_score IS NULL, 0, work_type_score)) AS
total_score
FROM work_type_view RIGHT JOIN skill_software_view ON
work_type_view.candidate_id = skill_software_view.candidate_id
And that's it! I can now perform queries on the best_candidates view.
Unfortunately these queries are taking at least 4 seconds for even a
few skills, software and work types.
Is there a way to perform the search more efficiently? I'd appreciate
any help or comment! |