Copy link to clipboard
Copied
I have a table in my database with the following fields:
user_name
score
points
Some users have 30 scores and others have 5. What I want to do is create a recordset that will calculate the highest 15 points for each user. Then count the total scores (either 15 or less) and get an average points. The only way I have been able to figure anything out is by defining an exact user. I can't get it to calculate it for each users and create a list.
Any ideas?
Copy link to clipboard
Copied
It hard to understand without knowing the definition of your fields. What's the difference between score and points? Please provide some sample data. Sounds like you are looking for a 'top n per group' solution.
Search for that term or see if this helps:
http://sqlblog.com/blogs/adam_machanic/archive/2008/02/08/who-s-on-first-solving-the-top-per-group-problem-part-1-technique.aspx
http://www.sql-ex.ru/help/select16.php
Copy link to clipboard
Copied
Example:
Field: username
Field: score
Field: points
Data example:
Username: bill
Score: 80
Points: 130
Username: john
Score: 84
Points: 125
Username: bill
Score: 99
Points: 150
Username: jim
Score: 77
Points: 65
Username: john
Score: 104
Points: 55
Obviously, I have hundreds of lines of data, but this is an example. The
score is the score and the points are how many points they get associated
with the score. Varies all the time.
I want to find the highest 15 points for each username. Then I want to add
the points and divide by how many records they actually have. Not all users
will have 15 records. I want this for all different usernames.
Copy link to clipboard
Copied
Maybe this is better. Here is the code written for an individual user.
This works and is the data I need, however, I need a recordset to give me
this information for ALL users in a single recordset.
Here it is:
SELECT SUM(event_points) AS e_sum, COUNT(event_points) AS e_count, usern
FROM (
SELECT event_points, usern
FROM events
WHERE usern = 'orccim' AND event_date >= '2010-03-22'
ORDER BY event_points DESC
LIMIT 15) AS new_table
I must be missing something because I can't figure out how to get it to
calculate for each different user automatically.
Thanks for the help!
Copy link to clipboard
Copied
>I must be missing something because I can't figure out how to get it to
>calculate for each different user automatically.
The LIMIT clause limits the total rows returned from a query - so you can't use it alone in either the outer query or the derived table query as it would limit the rows returned for all users. I would suggest using a correlated subquery. A correlated subquery has an inner query that runs once for each row of the outer query. In your case, the outer query's where clause would 'join' the userid with the inner query (using the in predicate). Each outer row is evaulated by the inner query to determine if it's in the top 15.
If you're not familiar with correlated subqueries, search the web for that term. They take a little time to wrap your head around. Write back if you still need help.
Copy link to clipboard
Copied
Thanks. You are right...it looks like that is what I need, but whoa, it is
a little tough to understand.
Could you give me a query example with my fields as a basis for me to figure
out? Thanks.
Copy link to clipboard
Copied
What is the primary key?
Copy link to clipboard
Copied
User_name
Copy link to clipboard
Copied
Your sample data indicated that the user_name may appear in more than one record, so it is not a primary key. Does this table have a primary key?
Copy link to clipboard
Copied
I am only using ONE table.
Just to recap.
Table fields are:
Username
Score
Points
Date
I want to take the best 15 points from each username (if they don't have 15,
just however many they have). Then count those points (15 or less) and give
me an average of points. And I want this in one recordset by username.
Copy link to clipboard
Copied
>I am only using ONE table.
Huh? I didn't ask how many tables. I asked what was the primary key. You said it was user_name, but your sample data listed more than one row with the same name. A primary key must uniquely identify each record.
Copy link to clipboard
Copied
Ok, I didn't really understand how that was relevant, but the primary key is
the field:
Event_id
Copy link to clipboard
Copied
>Ok, I didn't really understand how that was relevant,
OK. With some DBMS's, you have a few ways to solve the 'top n per group' problem. I don't know MySQL that well, but as far as I can tell your only option is a correlated subquery. In order to use a correlated subquery, you have to test each row from the outer query inside the inner query - sooo, you need to be able to uniquely identify each row.
I need to get home now, so I will try to work on this for you tomorrow.
Copy link to clipboard
Copied
Try this:
Select event_points, usern
FROM events as e1
Where Event_id in
(SELECT event_id
FROM events as e2
WHERE e2.event_date >= '2010-03-22'
and e2.usern = e1.usern
ORDER BY event_points DESC
LIMIT 15)
Copy link to clipboard
Copied
What I can tell from what data you need, even if you can achieve this in a single SQL shot, you will get awful amount of mixed garbage, which you have to pick and choose the info you need with big and most likely nested loops and control statements.
Sometimes, the simplest and and the most primitive solutions yields the best results:
Here is the code abstract (not actual programming code):
#Get all the users first:
mysql_result = SELECT user_name FROM events GROUP BY user_name
#then, loop inside that result:
while (rows = mysql_result->fetch){
each_user_result = SELECT SUM(user_points) AS sum_points, COUNT(user_score) as total_records,
user_points, user_score, user_name FROM events WHERE user_name = rows["user_name"] ORDER BY user_points DESC LIMIT 15
while(user_info = each_user_result->fetch){
#use the user data here....
}
}
Copy link to clipboard
Copied
>What I can tell from what data you need,
>even if you can achieve this in a single SQL shot,
>you will get awful amount of mixed garbage,
You can do it with a single SQL statement. Why do you think there will be garbage? The output requirements they posted are quite simple. You would not need any nested loops. On the other hand, you are suggesting a nested loop which (if I am reading your code correctly) puts a call back to the database for each outer row.
>Sometimes, the simplest and and the most
>primitive solutions yields the best results:
Sometimes, but not always. Correlated subqueries often have poor performance, but so does a procedure that has to send multiple queries back to the database. You'd have to test with trial and error to see.
Copy link to clipboard
Copied
What I understand is, the SQL result with single SQL statement will be something like this:
each user's 15 top records with points and scores of all 15 records individually, AND sum of each user's points AND count of each user's records (less than 15), which can be a big chunk of data with all the user's names and data mixed in same result.
The code I posted, as I said, is an abstract. Every SQL call in the loop limits the query to 15, and if you use some prepared statements, like PDO Mysql, you can retrieve the data a lot faster by only changing the variables (user name) in the same database call.
Probably a matter of taste, but single SQL statement can easily return very big amount of data if the user list grows to few thousands or more.
Copy link to clipboard
Copied
>Probably a matter of taste, but single SQL statement
>can easily return very big amount of data if the user list
>grows to few thousands or more.
True. The single SQL statement will result in more data (not garbage data) being returned as each row will contain the user's individual points/score as well as the aggregates. But assuming each 'query' does result in thousands of rows, I think I would worry more about the thousands of database requests over the increase in data. If the numbers are small, either method would work fine.
Copy link to clipboard
Copied
I tried both ways and can't get either to work properly.
What do you mean at the end where you say #use the user data here?
I am not to familiar with loops so I am obviously missing something...
Get ready! An upgraded Adobe Community experience is coming in January.
Learn more