Wednesday, 17 February 2010

Parsing SharePoint Metadata

SharePoint document libraries hold a lot of metadata against each document, which the web services return in the ows_MetaInfo field. This is a long string of metadata which, in the GetListItemChanges method, is broken up into lines, making it easy to pick out each key and value pair. However, the GetListItems method returns this metadata as a single plain string, leaving it up to you to parse it and turn it into something meaningful. After much head scratching and memory refreshing on regular expressions, I have come up with a snippet of code that, in all my testing so far, successfully pulls out each key and value pair:

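import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A key marker is a run of word characters, a colon, a two-character type code and a pipe.
// ows_MetaInfo is assumed to be a String already read from the web service response.
Pattern pattern = Pattern.compile("(\\w*):\\w{2}\\|");
Matcher matcher = pattern.matcher(ows_MetaInfo);

boolean fnd = matcher.find();

while (fnd)
{
    String key = matcher.group(1).trim();

    // The value starts where the current key marker ends...
    int start = matcher.end();

    // ...and runs up to the separator before the next key marker, or to the end of the string.
    fnd = matcher.find();

    // Math.max guards against an empty value with no separator before the next key,
    // which would otherwise make end smaller than start and throw in substring().
    int end = fnd ? Math.max(start, matcher.start() - 1) : ows_MetaInfo.length();

    String val = ows_MetaInfo.substring(start, end);

    // Skip keys that have no value.
    if (val.length() > 0)
    {
        System.out.println("Key: " + key + " - Val: " + val);
    }
}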
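
If you want the pairs in a collection rather than printed, the same loop can be wrapped in a small helper. What follows is only a sketch: the class name MetaInfoParser, the parse method and the sample string are made up for illustration, and the sample assumes the pairs are separated by single spaces, which may not match what your server returns. The Math.max call is a small guard of my own against an empty value that is immediately followed by the next key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any SharePoint API.
public class MetaInfoParser {

    // Same pattern as the snippet above: key, colon, two-character type code, pipe.
    private static final Pattern KEY_MARKER = Pattern.compile("(\\w*):\\w{2}\\|");

    // Returns the key and value pairs in the order they appear; empty values are skipped.
    public static Map<String, String> parse(String metaInfo) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = KEY_MARKER.matcher(metaInfo);

        boolean found = matcher.find();
        while (found) {
            String key = matcher.group(1).trim();
            int start = matcher.end();

            // Advance to the next key marker to find where this value stops.
            found = matcher.find();
            int end = found ? Math.max(start, matcher.start() - 1) : metaInfo.length();

            String value = metaInfo.substring(start, end);
            if (!value.isEmpty()) {
                result.put(key, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data, assuming single-space separators between pairs.
        String sample = "vti_author:SW|jsmith vti_title:SW|Quarterly Report vti_filesize:IR|20480";
        parse(sample).forEach((k, v) -> System.out.println("Key: " + k + " - Val: " + v));
    }
}
```

A LinkedHashMap keeps the pairs in the order they appear in the string, which makes the output easier to compare against the raw field while testing. Note that the two-character type code (SW, IR and so on) is matched but thrown away, exactly as in the original snippet.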