Ray Camden asked me to provide some info on why I used nested structs instead of one large flat struct in my CF_Accelerate tag. This all came about when I was doing performance testing on an application that was doing a lot of caching. I tested with CF_Supercache and it uses a large flat struct and we noticed performance degradation as the cache grew very large 5,000-10,000 entries. My thought was that if we broke this down into a tree of structures with less entries in each struct we could speed up the performance. Our load tests in comparison proved the theory. I didn't have exact numbers so I tried another test on my own.
Instead of just wrapping getTickCount() around the code and seeing how long it took to complete I used a load test tool to see how it would act underload in a multithreaded environment. I first created a flat structure and ran Microsoft Web Application Stress Tool against a page that called StructKeyExists() then tried isdefined(). Both StructKeyExists() and isdefined() churned out similar numbers (structure example below)
application.struct1.x
application.struct2.x
......
application.struct10000.x
|
Flat Structure |
| Struct Entries |
Pages/sec |
| 10 |
761 |
| 10000 |
210 |
The key I found in this test with the 10,000 entry structure that if I called isDefined() on the first entry in the flat structure I got 700+ pages/sec but if I called it on a entry that was further down the numbers dropped. In this test I called it on the 5,000 entry in the structure.
Next I created a nested structure tree. (example below)
application.struct1.stuct1.x
....
application.struct1.tree1000.x
application.struct2.tree1.x
....
application.struct2.tree1000.x
I used 10 structures each containing 1000 entries. Basically the same test as above with the data broken out into 10 structures. I load tested a page that was looking for the 500th entry in one of the structures and found that I was able to get 650 pages/sec. This is three times faster than the flat structure above and using this approach in a caching tag makes a big difference.
I also tried one other test with an 11th structure in this nested tree that contained 100 entries. This brought the total number of entries to 10100. I put an isdefined() call in a page that looked for application.struct11.tree50 and found that I was able to get in excess of 750 pages/sec. This proved my theory that you can improve performance by breaking out the structures into smaller trees. At least this approach worked for my caching tag.
One other note I think that is worth mentioning. I found that this theory applies even more on a multi CPU machine.
The custom in-memory caching tag CF_Accelerate uses this approach and you can download it here
Thanks for providing more info! I'm going to try to confirm this on my end as well. I have my own caching tag, and I think I may try adding a simpler version of this into my code. (My version caches by a cache name - so I think I'll try making subtrees based on the first letter of the cache name. Not great, but a quickie fix.)
On the subject of struct access speeds, I have found that struct loading speeds can be very slow for large (>1000 keys) structs if the keys are fairly similar.
In the code below, setting 5000 "similar" keys in a struct takes 35.5 seconds on my development machine. If we repeat the same loop, but reverse the similar keys to make them more "different", we get it done in 1.7 seconds.
I have submitted a bug/request to MM about this, but I don't know how/whether one can search their site to reference the bugID.
Jon Gunnip
<cfset Start1 = GetTickCount() > <cfset MyStruct = StructNew() > <cfloop from="1" to="5000" index="index" > <cfset MyStruct["FM00000000#index#"] = "John Doe" > </cfloop> <cfset End1 = GetTickCount() >
<cfset Start2 = GetTickCount() > <cfset MyStruct2 = StructNew() > <cfloop from="1" to="5000" index="index2" > <cfset MyStruct2[Reverse("FM00000000#index2#")] = "John Doe" > </cfloop> <cfset End2 = GetTickCount() >
<cfoutput> Unencrypted Loop: #End1 - Start1# <br> Encrypted Loop #End2 - Start2# </cfoutput>
We have had similar issue as well using CFMX 7.0.1. Our solution was to switch from using CF structure to using a hashmap.
The first timer executes in 6 minutes on my machine. The second timer executes in 1 second....
<cftimer label="structtest" type="debug"> <cfset s = structNew()> <cfloop from="1" to="100000" index="idx">
<cfset s["xxxx#idx#"] = 0> </cfloop> </cftimer>
<cftimer label="javatest" type="debug"> <cfobject type="java" class="java.util.HashMap" action="create" name="o"> <cfloop from="1" to="100000" index="idx">
<cfset o.put( "xxxx#idx#", "0" )> </cfloop> </cftimer>
I have done some research into why this happens. It is to do with the hashcode algorithm used for structs in CF. A more detailed writeup is at <a href="http://cfdumped.blogspot.com/2006/09/bad-hash-function-causes-problems-with.html">http://cfdumped.blogspot.com/2006/09/bad-hash-function-causes-problems-with.html</a>