Thursday, May 29, 2008

GSoC 2008: Progress report

I have made a lot of progress on my Google Summer of Code Project.

First, I've completed the code for the generation of the directory structure for the forms. This is done using the form id, which is the name with all illegal characters removed. The only valid characters left are alpha-numeric and periods. Let me tell you, the regular expression to replace all of those characters was hell to write. But, thanks to the help of Jason Davis, I was able to make it much more readable and easier digest. You've gotta see it to get an idea of how bad it was, so here goes:


/**
* Removes all illegal characters, then removes all spaces and makes it all lower-case.
*
* @param str the form name
* @return a viable form id.
*/
public static String formNameToFormId(String str) {
return str.replaceAll("[\\$\\(\\)%`~!@#&;:\\^=\\+\\?<>\"\\*\\\\////\\\\{\\}'\\]\\[\\|\\s+]", "").toLowerCase();

}

That is pretty much hell on earth to read, and it was hell on earth to type. Here's the *MUCH* simpler version:


/**
* Removes all illegal characters, then removes all spaces and makes it all lower-case.
*
* @param str the form name
* @return a viable form id.
*/
public static String formNameToFormId(String str) {
return str.replaceAll("[^a-zA-Z0-9.]", "").toLowerCase();

}

Note, you can clearly see that i'm negating that entire sequence. It does the SAME thing, but is MUCH more readable. That reads: "replace everything that isn't a letter from A to Z *OR* a to z(lower case letters) *OR* from 0 through 9 *OR* a period. Whereas the above just excluded the invalid characters explicitly but was a MONSTER of a regular expression.

Secondly, metadata persistence is written in, I'm using xstream. It's a great tool for XML serialization.

That's it. Oh, by the way, these are the things I earmarked TWO WEEKS FOR! Now, I push domain class model processing up, will work on that tomorrow.