As I've said before, I'm a real big fan of SQL Profiler. You see stuff there you can't see any place else. So yesterday afternoon, looking to get a little more speed out of the framework, I fired it up and looked for inspiration.
I usually look to the Duration column to identify prime offenders. High values mean a slow and costly query. If that query is being made repeatedly, the impact can be significant. Based on the information I found there, I went back through SqlDataProvider (I'm focusing my attention on SqlServer for now) and made a few tweaks. Mostly, these consisted of moving any non-database operations executed inside readers outside, with the theory that the shorter the period of time that a reader stay open, the better. These resulted in minor improvements, but nothing dramatic.
Rob and I have a testing database (a BIG thanks to the donor) that could be considered "worst case scenario." This monstrosity has over 230 tables and well over 600 stored procedures. If there are SubSonic performance problems, this database will them apparent. So as I sat there watching SubCommander and Profiler scroll by, I had a bit of an epiphany.
SubSonic relies heavily on INFORMATION_SCHEMA queries to build classes. The queries and provide useful information about keys, constraints, column, etc. However, due to the large number of JOINs, these queries are very costly. And since these get called several times for each table, the larger the database, the worse the hit.
But the funny thing about these queries is that really the only thing that varies from on execution to the next is the parameter(s) passed to it (usually the table name) that should be used to limit the results. So what if we removed the limiting parameters, and returned the whole list to an in memory datatable? That way, we would only have to perform the SQL query once, and after that we'd simply perform as DataTable.Select() to get the given subset! So I made the change. Here are the results against the aforementioned database:
SubCommander Full Generation (Before DataTables)
Execution Time: 4:57
Schema Queries: ~1500
SubCommander Full Generation (After DataTables)
Execution Time: 1:45
Schema Queries: <10
a 282% improvement... Cool!
But this was with SQL Server on a local machine. What if it was LAN connection? I scripted up the DB on an idle server and gave it a shot: (100 MBp/s)
SubCommander Full Generation (Before DataTables)
Execution Time: 17:50 (Full SQL CPU utilization the ENTIRE time);
Schema Queries: ~1500
SubCommander Full Generation (After DataTables)
Execution Time: 1:45
Schema Queries: <10
a 1200% improvement! And with so few queries, there's virtually no penalty for remote execution.
Cool stuff... And you'll make your DBA happy...
Other Changes
- Added the first pass at SubSonic Central, which will replace SampleWeb in the near future.
- Remove the nUnit testing project, which had become so out of sync that it was basically useless.
- Made several String handling optimization across SubSonic in potentially high impact
- Minor feedback improvements to SubCommander.
Download