Skip to main content
Inspiring
December 4, 2006
Question

Stats from Query of Query - is there a better way?

  • December 4, 2006
  • 3 replies
  • 715 views
I am building a dynamic statistics page for authors. It should show the number of publications generated by year (but still show a column if no pubs for a year between first year and latest year). Also, the number of times a publication is cited should be totaled up by type, per year. The example I have running seems very slow -- am I doing something wrong? is there a faster way?

The first script generates a simulated query with three columns and x number of rows. The initial query is inflexable and can not be used to generate any additional data.

This first part is to get the totals by type from latest year to first found year.

Any comments on how to make this better, stronger, faster for less than six-million dollars would be great.

THANKS!

    This topic has been closed for replies.

    3 replies

    December 6, 2006
    While it's great to have fully functional demo code in your problem, you're not really using CF as the database are you?

    The reason I ask is that the secret too this kind of report is to get the database to spit out a pivot/summary table all ready for display.

    This avoids loops and avoids needing CF to perform running queries and calcs.

    Just about every RDBMS can get the data this way. (CF is not an RDBMS and cannot.)

    If you tell me what DB you are using, I may be able to supply code for your particulars.
    DaveF67Author
    Inspiring
    December 7, 2006
    It is a Sybase Database.
    This was originally an afterthought to use the data from the output query to also generate stats. Yes, the Query of Queries is bad and slow, but it does work.
    The problem I had with Sybase was making the resulting data "fill in" years.

    If you run the sample ColdFusion code provided above you will see that data like this:

    1 0 Jrnl 2006
    2 0 Jrnl 2006
    3 0 Patn 2005
    4 1 Jrnl 2005
    5 14 Jrnl 2002
    6 3 Book 1999
    7 3 Jrnl 1999
    8 12 Jrnl 1999


    goes from year 1999, 2002, 2005, and 2006.

    How would I get a query to give me zeros or nulls for the missing years (i.e. 2000,2001,2003,2004) etc.

    How would I get the SQL to give me output that does this:

    <CFLOOP INDEX="x" FROM="#pubs['pubYear'][1]#" TO="#pubsMinYr.pubMinYear#" STEP="-1">

    This code loop from the highest year #pubs['pubYear'][1]# to the lowest year #pubsMinYr.pubMinYear# filling in any missing years. I then have to check if the year that is currently beeing looped over is one that has data.



    December 8, 2006
    Sybase, eh?
    That's a bummer, I was hoping you would say Access, MS SQL, MySQL or Oracle.

    I'm not familiar with Sybase yet, but if you tell me what version and whether it's ASE or SQL Anywhere, I will download it and adapt the following code as necessary.
    Please also, supply the original table and column names used for the original query.

    Anyway, this is a very common type of rollup / summary / pivot.
    Some databases make it easier than others but it's always possible to get results faster than with CF (or any non-SQL approach).

    Attached is code that is simplified for your particulars, works on my mockup, and should only require small syntax adjustments for Sybase (if any).

    The approach is this:
    1. Create a simple but very useful utility table (just has the digits 0 through 9).
    2. Use a cartesian join to get the target date range.
    3. Use another cartesian join to get the group-by matrix.
    4. Right Join the Raw table with the group-by matrix, grouping and computing the desired stats.
    5. Note that in practice, steps 2 through 4 are done in one query and some DB's automate the process more and can add roll-up data.
    6. Output now becomes very straight forward using CF's CFOUTPUT, group function.
    7. Rollups and grand totals are easy with low-impact Q of Q's.
    8. No running totals and no extraneous loops are needed.
      Skip down to the display code at bottom. That code produces the same results and the same look (styles) as your original demo code.


    Anyway, see the attached.
    Also, for another take on this kind of problem, see http://www.sswug.org/see/21626 (paid membership required) or google on that subject title.

    Inspiring
    December 4, 2006
    This was discussed at the CF Underground conference this past October.
    Straight from a ColdFusion Engineer's mouth, paraphrased by three months
    occupancy in my brain.

    "Cold Fusion is not a database management system. It's query parsing
    and processing is never going to be as efficient as what can be done in
    a database. If you have speed concerns when working with large and/or
    complex data, either try and have the database handle the heavy lifting
    and return just the results you need in the form you need. Or, convert
    the base query to some type of structure. Cold Fusion is very efficient
    at manipulating data structures."

    HTH

    Ian
    DaveF67Author
    Inspiring
    December 4, 2006
    This second part gets the citation counts by type, by year.

    Same structure as above. Still slower than I would like.