Copy link to clipboard
Copied
Simple question.
I've got a query that pulls four random records, like so:
<CFQUERY NAME="my_query" DATASOURCE="#request.my_datasource#" DBTYPE=ODBC MAXROWS=4>
SELECT p.first_name, p.last_name
FROM people p, offices o, buildings b, cities c
WHERE p.people_id = o.people_id AND o.building_id = b.building_id AND b.city_id = c.city_id AND c.city_id = 12
ORDER BY newid()
</CFQUERY>
Problem is, duplicates can occur if a the "offices" table has multiple instances of the same person, or "people_id".
"distinct" is out of the question, as it would have to be added to the SELECT list.
So, how can one generate random records, when "distinct" is not an option? I feel I got it to work with some GROUP BY trickery...
...but I'm not confident that's the best solution.
Copy link to clipboard
Copied
Why is "distinct" out of the question?
You can add min(b.building_id) to the select clause. This will accomplish two things. The group by clause will effectively be your "distinct" keyword, and you'll select fewer records. How may records are you selecting anyhow to get the 4 you want?
Many db's have a way of selecting "only x" or "top x" records using only sql. That method is more efficient than what you are doing with the maxrows attribute.
Copy link to clipboard
Copied
Do a subquery on the people table to select the random four people, then join just those four to the other tables.
--
Adam
Copy link to clipboard
Copied
Many db's have a way of selecting "only x" or "top x" records
using only sql. That method is more efficient than what you are
doing with the maxrows attribute.
Yes. I cannot recall the behavior of the current drivers, but in some previous versions MAXROWS would retrieve all records. Then discard everything but the first X rows.
Copy link to clipboard
Copied
-==cfSearching==- wrote:
Many db's have a way of selecting "only x" or "top x" records
using only sql. That method is more efficient than what you are
doing with the maxrows attribute.
Yes. I cannot recall the behavior of the current drivers, but in some previous versions MAXROWS would retrieve all records. Then discard everything but the first X rows.
A more important consideration is that - irrespctive of what CF & the DB driver know to do about row-limiting - unless one tells the DB to only fetch n rows, it'll fetch the whole recordset. Which is a large unnecessary overhead on the DB side of things.
One should always row-limit one's queries in the SQL in addition to the MAXROWS attribute.
--
Adam
Copy link to clipboard
Copied
A more important consideration is that - irrespctive of
what CF & the DB driver know to do about row-limiting -
unless one tells the DB to only fetch n rows, it'll
fetch the whole recordset. Which is a large unnecessary
overhead on the DB side of things.
Agreed. I was not suggesting one should rely on the driver behavior, but instead pointing out that it is not a reliable method. ie It does not necessarily limit the results in the way you might be thinking. As both of you have said, it should always be done explicitly in sql.
Copy link to clipboard
Copied
Possible query option:
<CFQUERY NAME="my_query" DATASOURCE="#request.my_datasource#" DBTYPE=ODBC MAXROWS=4>
SELECT DISTINCT TOP 4 first_name,last_name FROM (SELECT TOP 100 p.first_name, p.last_nameFROM people p, offices o, buildings b, cities c
WHERE p.people_id = o.people_id AND o.building_id = b.building_id AND b.city_id = c.city_id AND c.city_id = 12
ORDER BY newid()) subquery
</CFQUERY>
The TOP 100 is going to minimize the subquery, but you can adjust this or remove it depending on the number of possible sub-offices, buildings, etc.
Find more inspiration, events, and resources on the new Adobe Community
Explore Now