Wednesday, 17 February 2010

Parsing SharePoint Metadata

SharePoint document libraries hold a lot of metadata against each document, and the web services return it in the ows_MetaInfo field. This is a long string of metadata which, in the GetListItemChanges method, is broken up into lines, making it easy to pick out each key and value pair. However, the GetListItems method returns this metadata as a plain string, leaving it up to you to parse it and turn it into something meaningful. After much head scratching and memory refreshing on regular expressions, I have come up with a snippet of code that, in all my testing so far, successfully pulls out each key and value pair:

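import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A key marker is a run of word characters, a colon, a two-character type code and a pipe.
Pattern pattern = Pattern.compile("(\\w*):\\w{2}\\|");
Matcher matcher = pattern.matcher(ows_MetaInfo);

boolean found = matcher.find();

while (found)
{
    String key = matcher.group(1).trim();

    // The value starts where the current key marker ends...
    int start = matcher.end();

    found = matcher.find();

    // ...and stops one character (the separator) before the next key,
    // or at the end of the string if this was the last key.
    int end = found ? matcher.start() - 1 : ows_MetaInfo.length();

    // Math.max guards against a key that directly follows an empty value.
    String val = ows_MetaInfo.substring(start, Math.max(start, end));

    if (val.length() > 0)
    {
        System.out.println("Key: " + key + " - Val: " + val);
    }
}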
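
For anyone who wants to try the same approach outside a web service call, here is a minimal, self-contained sketch. The class name MetaInfoParser, the parse method and the sample string are made up for illustration; the sample only imitates the key:type|value layout described above and is not real SharePoint output. The Math.max guard is an extra safety check for an empty value with no separator after it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any SharePoint API.
final class MetaInfoParser {

    // Key marker: word characters, a colon, a two-character type code and a pipe.
    private static final Pattern KEY = Pattern.compile("(\\w*):\\w{2}\\|");

    // Returns the non-empty key/value pairs in the order they appear.
    static Map<String, String> parse(String metaInfo) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher matcher = KEY.matcher(metaInfo);
        boolean found = matcher.find();

        while (found) {
            String key = matcher.group(1);
            int start = matcher.end();

            // Look ahead to the next key marker to find where this value stops.
            found = matcher.find();
            int end = found ? matcher.start() - 1 : metaInfo.length();

            // Avoid end < start when the next key directly follows an empty value.
            String val = metaInfo.substring(start, Math.max(start, end));
            if (!val.isEmpty()) {
                pairs.put(key, val);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Assumed sample input, written to resemble the flat string form.
        String sample = "vti_parserversion:SR|12.0.0.6421 ContentType:SW|Document vti_title:SW|Quarterly report";
        parse(sample).forEach((k, v) -> System.out.println("Key: " + k + " - Val: " + v));
    }
}
```

Collecting the pairs into a LinkedHashMap rather than printing them keeps the original order and makes individual values easy to look up by key.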

5 comments:

  1. Hi, this was of great use to me. By the way, if anyone wants to do this using ColdFusion use the following:

    #REMatchNoCase("(\w*):\w{2}\|",results.ows_metaInfo)#

    Then you can simply loop through the array and create a structure containing easily referenced name=value pairs. Thanks again!

  2. Glad you found it useful.

  3. This is good information. I'm gonna bookmark your site and visit it from time to time.

  4. Thanks for the comment - glad you found it useful.

  5. Hi... note that if you have custom metadata columns with a space in them this will not work.... you'll need something like this:
    Pattern pattern = Pattern.compile( "(\\w*\\s)*?(\\w*):\\w{2}\\|" );

    Also note http://gskinner.com/RegExr/ is really really good for testing this stuff :)
