Click or scroll down Circle me on Google+ Fork me on GitHub Follow me on Twitter Ask me on Stack Overflow Gild me on Reddit Code Ninja, Entrepreneur, Idiot ChalkHQ — consulting, prototyping, mentoring HighF.in — resolve innefficiencies in your startup's workflow DearDrum.org — online open-mic / creative space The Dirac Equation (click to WikiPedia) A maxim Sun Tzu references in his magnum opus The Art of War

If you know the enemy and know yourself, you need not fear the result of a hundred battles.
If you know yourself but not the enemy, for every victory gained you will also suffer a defeat.
If you know neither the enemy nor yourself, you will succumb in every battle.
Fork me on GitHub

Tags

actionscript ad-hoc networks Adobe AIR anonymous Apple array Browsing convert Debugger Error Facebook file permissions Flash Flex fonts function future Google Google Plus grid hackers html javascript logs loop network p2p php privacy regexp Security Server social ssl technology terminal time Twitter upgrade Web 2.0 Web 3.0 Web 4.0 Web 5.0 wordpress

Featured Posts

  • Javascript: Undefined parameters
  • The Web, A Look Forward
  • Let Postfix send mail through your Gmail Account – Snow Leopard
  • Archives

  • April 2013
  • December 2011
  • July 2011
  • June 2011
  • March 2011
  • February 2011
  • January 2011
  • November 2010
  • October 2010
  • September 2010
  • July 2010
  • May 2010
  • Categories

  • Code
  • Design
  • Opinion
  • Security
  • Tools
  • Uncategorized
  • MySQL Table Order

    Here's the situation. You have User_Table — which is a table of users, and a Content_Table — which is a table of content. They have a column in common called u_id which is the user's unique id whom submitted the content.

    What we want to do is get a list of users where for each user we show their latest piece of content. We want to show every user once, even if they haven't submitted anything, and we want the user list organized by date of last submitted content. You could do this with two queries, the challenge is to do it with one.

    So here's a SQL query:

    "SELECT User_Table.u_id, User_Table.username, Content_Table.c_id FROM User_Table
    LEFT JOIN Content_Table ON User_Table.u_id = Content_Table.u_id
    GROUP BY User_Table.u_id HAVING COUNT(User_Table.u_id) >= 2
    ORDER BY Content_Table.date_added DESC"
    

    Let's just run through the query. We're doing a SELECT on the user's id and username, and the content's id. We want a list of users so we get it FROM the User_Table, and do a LEFT JOIN with the Content_Table based on the u_id column. This multiplies the two tables together and you end up with this big imaginary table so we GROUP BY u_id. We don't want to return the entire big multiplied table, we only want 1 entry for each user and only HAVING more than two entries in the big multiplied table. Because the big multiplied table includes the User_Table there will be at least one row for each user. If there's 2 rows or more it means there's at least 1 row in the Content_Table by that user. We then run an ORDER BY on the resulting smaller imaginary table which will organize the list of users based on the date_added column on the Content_Table.

    If you run this query you likely won't get the results you expected. What you get is a list of users who have submitted at least one content item. The list of users is sorted correctly - by the last content item submitted. So if UserG submitted content after UserD it'll show them in the appropriate order, but if a given user has submitted more than one content item, the c_id fetched will be for the first content item they submitted. This might seem weird because the list of users is correctly sorted based on the last content item they submitted.

    What's happening is MySQL processes the query in the order it's written, so it's running GROUP BY before it runs ORDER BY. What we'd like to happen is for the big imaginary table to be sorted by Content_Table.date_uploaded and then grouped by u_id. There are a number of ways to broach this issue that involve intensely convoluted queries; things like nesting, aliasing, and referencing the same table several times in different ways. That shit is hard to read and not compatible with older MySQL servers or different server configurations.

    Now you could run the following query just before you run the one above:

    "ALTER TABLE Content_Table ORDER BY date_added DESC"
    

    and it'll physically resort the table. The problem with doing this is that if you later INSERT, DELETE, or UPDATE(in some cases) the table, it'll mess with the ordering and reordering the table on the physical disk, and running the ALTER TABLE query can be resource intensive - especially in cases where you have a very large frequently used table.

    When I drafted this post a couple weeks ago I was on a tight deadline working on the new DearDrum.org and decided to de-prioritize the functionality until I had a chance to figure out a better way to do this. Still dealing with deadlines I figured I'd just post it and see if anyone else has a solution, or at least open it up for discussion.

    Category: Code

    Tagged: , ,