On 5/26/05, [email protected] <[email protected]> wrote:
> Hello,
>
> I have 3 tables (2 tables + 1 lookup table that ties them) and running
> a straight-forward aggregate count(*) query with a couple of joins
> takes about 10 seconds (and I need it to be sub-second or so).
> Also, I am wondering if this approach is scalable with my row-counts
> and my hardware (below).
>
> My slow query is this:
> ----------------------
> SELECT keyword.name, count(*)
> FROM user_data, user_data_keyword, keyword
> WHERE (user_data.user_id = 1)
> AND (user_data.id = user_data_keyword.user_data_id)
> AND (user_data_keyword.keyword_id = keyword.id)
> GROUP BY keyword.name
> ORDER BY COUNT(*) DESC LIMIT 10;
<SNIP>
> Is there any way of speeding up my query?
>
> Also, given the number of rows expected in those tables:
> user_data: 10M
> user_data_keyword: 40M
> keyword: 4M
This sounds like a perfect candidate for a summary table. You should
read Jonathan Gardner's writeup about materialized views. Depending on
your requirements, you'll either need to build triggers or a periodic
summarization you run. This sounds like a "top N" report so a periodic
update out to work.
http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html
Tony Wasson