Tagsvenson

Generating List<Something> from JSON with svenson

Often you’ll find yourself wanting to parse a JSON into a Java collection but want the values inside the collection to be of a specific type. Nothing easier than that.

import org.svenson.JSONParser;
…
// Getting a list containing your own type Something.
// Assume json to be a String containing the JSON dataset.
JSONParser parser = new JSONParser();
parser.addTypeHint("[]", Something.class);
List<Something> someThings = parser.parse(List.class, json);
// someThings will be a ArrayList instance by default. You can change
// that by changing the mappings for interfaces by calling
// org.svenson.JSONParser.setInterfaceMappings(Map<Class, Class>)

Parsing into a map is not much more complicated either

JSONParser jsonParser = new JSONParser();
jsonParser.addTypeHint(new RegExPathMatcher("\\.(f1|f2)"), Something.class);
Map<String,Object> someThings = jsonParser.parse(Map.class, json);

If we want to have our Something type for more than a single field, we need to setup a matcher. Here you see an example of a RegExPathMatcher that makes sure that both the keys “f1″ and “f2″ of the map we receive will be converted to Something, while all other fields are not.

If you want to convert all map properties to Something, the RegExPathMatcher would be like this

    … new RegExPathMatcher("\\..*") …

This would match every JSON path that starts with a property. If you don’t like RegularExpressions, or are on some kind of diet on them, you can also construct a more complex matcher tree from the compositable Matchers like this

JSONParser jsonParser = new JSONParser();
jsonParser.addTypeHint(new OrMatcher(
    new PrefixPathMatcher(".f1"),
    new PrefixPathMatcher(".f2")), Something.class);
Map<String,Object> someThings = jsonParser.parse(Map.class, json);

Update: Due to me fucking up both the Prefix-/Suffix- matchers as well as their tests, the last example will only really work with the current svenson trunk/future svenson 1.3.8


Annotating DOM nodes with JSON, Part 2

It’s been a while since I wrote Annotating DOM nodes with JSON and in retrospective I can say that I never really used the method described in a real life project. Now I’d like to present another method of decorating DOM nodes with JSON based on classes. This one I actually implemented in OpenSAGA to have arbitrary metadata from some of the OpenSAGA Widgets.

I didn’t really like the idea of misusing onclick for the purpose of meta-data and thought about a better way of doing it. Browsing the w3 HTML specs I came upon the fact that classes can be any character separated by spaces. So for use-cases where I only needed one meta-data value I used classes like

<div class="refId:id-1234">
    DIV content
</div>

A use-case specific prefix is used to mark a class as meta-data container containing the string after the prefix. The code to evaluate this in javascript is very easy

/**
 * Returns the class value with the given prefix using the giving separator
 * @param {DOMElement} elem DOM element to fetch metadata from
 * @param {String} name of the classval value
 * @param {String} separator to use between name and value. Default is ":"
 */
function classval(elem, name, separator)
{
    var match = new RegExp("\\b" + name + (separator || ":") + "([^ ]*)($| )")
                          .exec(elem.className);
    if (match)
    {
        return match[1];
    }
    return null;
}
…
// assume divElement to be DOM element of the div
var refId = classval(divElement, "refId");

I thought about going for a more elaborate prefix scheme to support nested metadata but in the end decided against it because I already have a nicely supported format for exchanging data between server and client: JSON. So I tried to come up with a scheme of using arbitrary JSON for the metadata decoration.

Only problem: Spaces are not valid inside classes, so I needed a method to encode and decode JSON into valid classes. The method should not totally mangle the JSON to keep readability and maybe write the encoded variant by hand for simple cases.

Solution:

  • HTML encode the JSON-String
  • Replace spaces with underlines and underlines with \u005f

The replacement of underlines is valid because underlines can only occur inside quoted JSON strings so they can just be replaced by their escaped unicode value \u005f.

Here is the java code to do the escaping. Since it’s basically a combination of string replacement and HTML encoding this should be easily doable in any server-side language:

    public String escapeDecoration(String s)
    {
        String escaped = StringEscapeUtils.escapeHtml(s);

        StringBuilder sb = new StringBuilder(escaped.length());
        sb.append("deco:");
        for (int i = 0; i < escaped.length() ; i++)
        {
            char c = escaped.charAt(i);
            switch(c)
            {
                case '_':
                    sb.append("\\u005F");
                    break;
                case ' ':
                    sb.append('_');
                    break;
                default:
                    sb.append(c);
                    break;
            }
        }

        return sb.toString();
    }

The escape method uses the escapeHTML method from Apache commons-lang's StringEscapeUtil. Going the other way in javascript is not that complicated either:

/**
 * Decodes the given string containing HTML entities.
 */
function htmlDecode(s)
{
    var helper = document.createElement("SPAN");
    helper.innerHTML = s;
    return helper.innerHTML;
}

/**
 * Returns the JSON decoration of the given element.
 * @param {DOMElement} DOM element
 * @param {String} decorator classval name, default is "deco".
 */
function decoration(elem, name)
{
    var value, data, result;

    value = classval(elem, name || "deco");
    if (value)
    {
       // get raw data from DOM element
       data = value.replace(/_/g, " ");
       // replace HTML entities with the original characters
       data = htmlDecode(data);
       // evaluate JSON
       result = eval("("+data+")");
    }
    return result || {};
}

In order to achieve a better readability of escaped JSON, I also used svenson's ability to deviate from the JSON standard by using single quotes instead of double quotes. Just comparing

<div id="tst2" class="deco:{'foo':'xxx\u005f_yyy','baz':[1,3,5,7,9]}">
JSON annotation
</div>

to

<div id="tst2" class="deco:{&quot;foo&quot;:&quot;xxx\u005f_yyy&quot;,&quot;baz&quot;:[1,3,5,7,9]}">
JSON annotation
</div>

should demonstrate that single quotes are not only much better readable, but also shorter. If you use eval() evaluate the JSON string, the single quotes are no problem at all. If you want json2.js / native JSON-parsing, you might have to replace the quote chars before parsing.

Links:

HTML test page with both metadata strategies

Hood: example application for jcouchdb 0.10.0-1

On the occasion of presenting CouchDB and jcouchdb at my place of work, I got around to finally create a small example application that is now downloadable as sneak preview. There need to be bugs fixed, features implemented and lots of documentation to be added, but it kind of works.

It’s called “Hood” for neighbourhood and allows you to mark places or people around a place of activity of yours, called hood. it is meant to foster collaboration / tips on local places etc.

It’s Spring Web Application demonstrating some techniques of working with jcouchdb. It’s an eclipse WTP/Spring IDE project with all dependencies you need besides couchdb and tomcat or another servlet container.

Stay tuned for hood to grow into a fullblown app.

Links:

Memory consumption changes in svenson 1.3

Implementing a streaming attachment feature for jcouchdb, I started to wonder whether it would be a good idea for svenson to support JSON parsing from a stream, too, as I don’t really need the complete stream to start constructing the java object graph.

Implementing stream parsing was really nice and easy thanks to the units test present in svenson. After that, I came upon two ways to generally cut down on memory use. All tokens with fixed values could just have a single instance. The recording of tokens to provide token based look ahead was not really needed in all cases. But how much does that save?

As a test case I wrote a small tool class to generate random, nested JSON datasets, generated two test files of 65kb and 4.5mb size and parsed these with svenson 1.2.8 and what now is svenson 1.3.

Measuring the actual memory usage for these two test files proved to be difficult. Somehow none of the programs I tried seemed to give me the data I wanted. Eclipse TPTP just ignored Strings that were no member of any class but just parameters, making stream and string parsing look exactly the same memory-wise. tijmp and others did not provide the data I wanted at all.

So in the end I wrote a little python script that parses a hprof ASCII output to

  • sum up all memory use
  • group allocations by class type, but only if the stack trace of it touches svenson
  • output the top 10 of those classes and the sums

This provided meaningful data and also showed some points for further improvement. There was a huge number of java.lang.reflect.Method allocations which turned out to be caused by svenson inspecting the target classes for annotations and appropriate methods which was done on a per target basis instead of the better per target class basis.

All in all the memory usage went down quite a bit:

memory usage for different svenson versions, with and without streaming

memory usage for different svenson versions, with and without streaming

45% less memory for the small file and 62% for the large file for all allocations. I think that is really good..

Below are some links to the files needed to repeat the benchmarking. The transform hprof script might also prove to be useful for other projects if changed appropriately.

The new jcouchdb release will also use stream parsing.

Links:

edit:
The command to generate the hprof file was something like

java -agentlib:hprof=heap=sites,depth=100,cutoff=0 -cp .. svensonperf.ReadJSONOld big.json

© 2014 fforw.de

Theme by Anders NorenUp ↑