CCL String Extractor

This document refers to xstring version 1.2.0.

Introduction

The String Extractor tool parses strings from source and XML files and exports them to a file format suitable for localization.

Usage:

xstring -[MODE] [INPUT_FOLDER] -[OUTPUT_FORMAT] [OUTPUT_FILE] [MODEL_PATH] -[OPTION]

Command-line Arguments

Argument       Required  Description
-------------  --------  -----------
MODE           No        Mode: {skin, menu, tutorial, metainfo, template, custom, auto, code}, see parser modes, default: auto.
INPUT_FOLDER   Yes       Input folder to parse, must be a path to a folder.
OUTPUT_FORMAT  No        Output format: {po, xliff}, default: po.
OUTPUT_FILE    Yes       Path to result file.
MODEL_PATH     Yes/No    Specify XML model file, see parser modes for details.
OPTION         No        Parser options: {v = print debug logs}.

Parser Modes

Mode      Description                                                                  Exclusive  Use of XML model file
--------  ---------------------------------------------------------------------------  ---------  -------------------------------------------------------------
skin      Parse <Skin> XML files.                                                      Yes        Optional, overrides the <Skin> default.
menu      Parse <MenuBar> XML files.                                                   Yes        Optional, overrides the <MenuBar> default.
tutorial  Parse <TutorialHelpCollection> XML files.                                    Yes        Optional, overrides the <TutorialHelpCollection> default.
metainfo  Parse <MetaInformation> XML files.                                           Yes        Optional, overrides the <MetaInformation> default.
template  Parse <DocumentTemplate> XML files.                                          Yes        Optional, overrides the <DocumentTemplate> default.
custom    Parse custom format XML files.                                               Yes        Required, specifies the custom XML format.
auto      Parse files of any supported content type, see auto scan supported formats.  No         Optional, adds additional formats, overrides built-in formats.
code      Parse source files, supported file extensions: .cpp, .h, .js, .mm.           Yes        Unsupported.

Exclusive: parse the specified content type only, ignore other formats even if supported.

Tip

Run xstring without any arguments to get a usage description.

Usage examples:

Parse any supported file format from /path/, create result file /path/output.po in gettext format:

xstring /path/ /path/output.po

Parse skin XML files from /path/to/skin, create result file /path/skin.po in gettext format:

xstring -skin /path/to/skin -po /path/skin.po

Parse custom XML format from /path/to/custom, create result file /path/custom.po, use model definition from file custom.json, print debug logs:

xstring -custom /path/to/custom -po /path/custom.po custom.json -v

Auto scan

Scan mode ‘-auto’ reads any supported content type from a given input directory without the need to explicitly specify an expected input file format. It enables the user to extract strings from multiple files of different formats in a single scan. Auto scan supports the following file formats:

Format Description                          CCL Defined
------------------------------------------  -----------
<Skin> XML                                  Yes
<MenuBar> XML                               Yes
<TutorialHelpCollection> XML                Yes
<MetaInformation> XML                       Yes
<DocumentTemplate> XML                      Yes
Android <resources> XML                     No
Annotated source code (.cpp, .h, .js, .mm)  Yes

XML model format

Introduction

The String Extractor tool supports parsing of custom XML-based formats. Parser rules (XML models) are defined via a JSON model file. The file format is:

{
  "root":
  {
    "name": "SomeRootName", // Mandatory, expected root element name.
    "conditions":
    [
      // Optional, root element conditions (file is skipped on mismatch)
    ]
  },
  "extensions": "XML", // Mandatory, expected file extensions.
  "scope":
  {
    // Optional, scope rule applied for all matchers.
  },
  "matchers":
  [
    // List of matchers
  ]
}

With the matcher structure being:

{
  // matcher attributes depending on matcher type, see reference below
  "scope":
  {
    // Optional, matcher specific scope rule (overrides model level scope)
  },
  "conditions":
  [
    // Optional, list of conditions, AND-combined
  ]
}

Model

The name attribute of the root entry denotes the expected root element name of the XML file. The String Extractor uses it to match XML files to models. The extensions value denotes the file extension for this model, so the String Extractor does not skip files of this type when traversing a directory. Note that different XML formats may share the same file extension.

Matchers

Matchers define which attributes to read from the XML file. Matchers are evaluated per XML node when the String Extractor traverses the XML tree. Matchers are OR-combined: a single XML node may satisfy multiple matchers resulting in multiple strings being returned from the same node.
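
As an illustration of OR-combined matchers, the following sketch registers two attribute matchers in one model. The format name DemoFormat and the attribute names title and tooltip are made up for this example; only the model structure follows the reference in this document.

```json
{
  "root":
  {
    "name": "DemoFormat" // hypothetical root element name
  },
  "extensions": "xml",
  "matchers":
  [
    // Matcher 1: read the 'title' attribute of any element.
    {
      "kind": "attribute",
      "name": "title"
    },
    // Matcher 2: read the 'tooltip' attribute of any element.
    {
      "kind": "attribute",
      "name": "tooltip"
    }
  ]
}
```

Under this model, a hypothetical node such as <Button title="Open" tooltip="Open a file"/> satisfies both matchers, so both strings would be expected in the result file.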

Conditions

Conditions add additional constraints to a matcher. The parser requires all conditions to be satisfied to export a certain attribute. Conditions are AND-combined.
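
A minimal sketch of AND-combined conditions, assuming a hypothetical format in which <Item> elements carry ‘label’ and ‘type’ attributes (these names are not taken from any built-in model):

```json
{
  "kind": "attribute",
  "name": "label", // hypothetical attribute to export
  "conditions":
  [
    // Condition 1: the node must be an <Item> element ...
    {
      "kind": "element",
      "name": "Item",
      "operator": "equal"
    },
    // ... AND condition 2: its 'type' attribute must not be 'number'.
    {
      "kind": "attribute",
      "name": "type",
      "value": "number",
      "operator": "notequal"
    }
  ]
}
```

The ‘label’ attribute is exported only for nodes that pass both checks. How notequal treats a node that has no ‘type’ attribute at all is not stated here, so it is worth verifying with the -v option before relying on it.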

Scopes

Scopes define which string scope is exported. A scope rule can be defined on model or matcher level. Matcher level rules override the model level (global) rule. If a matcher level rule cannot return a scope, the parser attempts to fall back to the model level scope.
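
The interaction of the two scope levels can be sketched as follows. The names DemoFormat, Dialog, title, tooltip and General are hypothetical and chosen for illustration only:

```json
{
  "root":
  {
    "name": "DemoFormat" // hypothetical root element name
  },
  "extensions": "xml",
  // Model level (global) scope rule.
  "scope":
  {
    "kind": "static",
    "value": "General"
  },
  "matchers":
  [
    // Matcher level rule: use the 'name' attribute of a <Dialog> parent.
    // No "fallback" is set, so outside of a <Dialog> this rule
    // cannot return a scope.
    {
      "kind": "attribute",
      "name": "title",
      "scope":
      {
        "kind": "parent",
        "element": "Dialog",
        "attribute": "name"
      }
    },
    // No matcher level rule: the model level scope is used.
    {
      "kind": "attribute",
      "name": "tooltip"
    }
  ]
}
```

Going by the rules described above, a ‘title’ inside <Dialog name="Export"> would be exported with scope ‘Export’, a ‘title’ outside of any <Dialog> would fall back to ‘General’, and every ‘tooltip’ would use ‘General’.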

Tip

Simplified examples for these concepts can be found under XML model examples; a production example file can be found under Skin model.

Features

Some model convenience features:

Combining models

A single model JSON file may contain multiple models:

{
  "models":
  [
    {
      // model 1
    },
    {
      // model 2
    }
  ]
}
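
A slightly fuller sketch of a combined file, assuming two hypothetical formats (DemoFormatA and DemoFormatB) that share the xml extension but differ in their root element:

```json
{
  "models":
  [
    // Model 1: files with root element <DemoFormatA>.
    {
      "root":
      {
        "name": "DemoFormatA" // hypothetical root element name
      },
      "extensions": "xml",
      "matchers":
      [
        {
          "kind": "attribute",
          "name": "text"
        }
      ]
    },
    // Model 2: files with root element <DemoFormatB>,
    // same file extension as model 1.
    {
      "root":
      {
        "name": "DemoFormatB" // hypothetical root element name
      },
      "extensions": "xml",
      "matchers":
      [
        {
          "kind": "element",
          "name": "Entry", // hypothetical element name
          "text": true
        }
      ]
    }
  ]
}
```

Because models are matched by root element name, both formats can be registered for the same extension without replacing each other.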

Model inheritance

A model for which a root is set typically adds support for this XML format or replaces an existing model for this format entirely. If a built-in model should be reused but for a different file extension, the inherit attribute can be used instead of root:

{
    "inherit": "DocumentTemplate",
    "extensions": "apptemplate"
}

In this example, the parser loads the DocumentTemplate built-in format but replaces its associated file extension with apptemplate. Note that if inherit is set, further configuration attributes such as scope rules or matchers will be ignored.

Avoid specifying multiple models for the same format as the parser can register a single model per format only. When the parser loads a model it may overwrite an existing one if the format (root or inherited) was previously registered. Thus, if the same model is to be used for multiple file extensions, specify the extensions as a list but in a single model definition:

// Correct: both extensions are registered for <DocumentTemplate>
{
  "inherit": "DocumentTemplate",
  "extensions":
  [
    "apptemplate1",
    "apptemplate2"
  ]
}

// Wrong: only apptemplate2 is registered for <DocumentTemplate>
{
  "models":
  [
    {
      "inherit": "DocumentTemplate",
      "extensions": "apptemplate1"
    },
    {
      "inherit": "DocumentTemplate",
      "extensions": "apptemplate2"
    }
  ]
}

XML model reference

Overview

Matchers: [Attribute] [Element]
Conditions: [Attribute Sibling] [Element Name]
Scopes: [Static Value] [Parent Element Attribute]


Matcher: attribute

Semantic: parse all attributes of a certain name, regardless of the element.

Attributes

  • kind {string}: required, object id for internal use, must be “attribute”

  • name {string}: required, name of the attribute to match, must be single value

  • options {string, stringlist}: optional, string processing options

    • split: tokenize comma separated string

  • scope {scope handler}: optional, define scope to export for this string

  • conditions {list of conditions}: optional, matcher constraints

Example: parse all ‘text’ attributes, split each value into tokens.
{
  "kind": "attribute",
  "name": "text",
  "options": "split",
  "scope":
  {
    // ...
  },
  "conditions":
  [
    // ...
  ]
}


Matcher: element

Semantic: Parse all elements of a certain name, read string from the element text or an element attribute.

Attributes

  • kind {string}: required, object id for internal use, must be “element”

  • name {string}: required, name of the element to match

  • text {bool}: optional, try to read string from element text before attribute (default: off)

  • attribute {string, stringlist}: required if text is off, name of the attribute to read the string from (priority list); has no effect if the text option is enabled and already returned a value

  • options {string, stringlist}: optional, string processing options

    • split: tokenize comma separated string

  • scope {scope handler}: optional, define scope to export for this string

  • conditions {list of conditions}: optional, matcher constraints

Example: Parse ‘text’ attribute for all <DemoElement>, use ‘name’ or ‘label’ attribute if ‘text’ attribute does not exist, split value into tokens.
{
  "kind": "element",
  "name": "DemoElement",
  "attribute":
  [
    "text",
    "name",
    "label"
  ],
  "options": "split",
  "scope":
  {
    // ...
  },
  "conditions":
  [
    // ...
  ]
}
Example: Parse <DemoElement> element text.
{
  "kind": "element",
  "name": "DemoElement",
  "text": true,
  "scope":
  {
    // ...
  },
  "conditions":
  [
    // ...
  ]
}
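
The text and attribute settings can presumably be combined, since the attribute list is described as having no effect only when the text option already returned a value. A sketch under that assumption, using the hypothetical attribute name ‘label’:

```json
{
  "kind": "element",
  "name": "DemoElement",
  // Try the element text first ...
  "text": true,
  // ... and read 'label' only if the element text returned no value
  // (assumed from the attribute description above).
  "attribute": "label"
}
```

With this matcher, <DemoElement>Save</DemoElement> would be expected to yield the element text, while <DemoElement label="Save"/> would be expected to yield the ‘label’ value instead.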


Condition: attribute sibling

Semantic: Current node has another (or same) attribute with a certain value.

Attributes

  • kind {string}: required, object id for internal use, must be “attribute”

  • name {string}: required, name of the attribute to compare to

  • value {string}: required, expected value of the attribute that is compared to

  • operator {string}: optional, comparison type

    • equal

    • notequal

Example: node must have a ‘type’ attribute with value ‘string’.
{
  "kind": "attribute",
  "name": "type",
  "value": "string",
  "operator": "equal"
}


Condition: element name

Semantic: Current node must have a certain element name.

Attributes

  • kind {string}: required, object id for internal use, must be “element”

  • name {string}: required, element name to match

  • operator {string}: optional, comparison type

    • equal

    • notequal

Example: current node must be <Parameter>
{
  "kind": "element",
  "name": "Parameter",
  "operator": "equal"
}


Scope: static

Semantic: constant scope value.

Attributes

  • kind {string}: required, object id for internal use, must be “static”

  • value {string}: required, scope string value to use

Example: set scope to “[SomeText]”
"scope":
{
  "kind": "static",
  "value": "SomeText"
}


Scope: parent attribute

Semantic: Use parent element attribute value as scope.

Attributes

  • kind {string}: required, object id for internal use, must be “parent”

  • element {string, stringlist}: required, name of the parent element, priority list

  • attribute {string, stringlist}: required, name of parent attribute to use, priority list

  • fallback {string}: optional, constant string value to use as fallback (default value)

Example: use the ‘name’ attribute (or ‘label’ if ‘name’ does not exist) of the parent <ParentName> or <AltParentName>, use ‘SomeText’ if there is no such parent.
{
  "kind": "parent",
  "element":
  [
    "ParentName",
    "AltParentName"
  ],
  "attribute":
  [
    "name",
    "label"
  ],
  "fallback": "SomeText"
}


XML model examples

Here are a few model examples:

Attribute matcher

Parse all ‘text’ attributes, regardless of element name.
{
  "root":
  {
    "name": "DemoFormat"
  },
  "extensions": "xml",
  "matchers":
  [
    {
      "kind": "attribute",
      "name": "text"
    }
  ]
}

Element matcher

Parse ‘text’ attribute from all <DemoElement>.
{
  "root":
  {
    "name": "DemoFormat"
  },
  "extensions": "xml",
  "matchers":
  [
    {
      "kind": "element",
      "name": "DemoElement",
      "attribute":
      [
        "text"
      ]
    }
  ]
}

Element condition

Parse ‘text’ attribute if element name is not ‘Parameter’.
{
  "root":
  {
    "name": "DemoFormat"
  },
  "extensions": "xml",
  "matchers":
  [
    {
      "kind": "attribute",
      "name": "text",
      "conditions":
      [
        {
          "kind": "element",
          "name": "Parameter",
          "operator": "notequal"
        }
      ]
    }
  ]
}

Root element filter

Skip the file if the root element does not have a ‘localization’ attribute with a non-empty value.
{
  "extensions": "xml",
  "root":
  {
    "name": "DemoFormat",
    "conditions":
    [
      {
        "kind": "attribute",
        "name": "localization",
        "value": "",
        "operator": "notequal"
      }
    ]
  },
  // ...
}

Static scope

Parse all ‘text’ attributes, regardless of element name. Export each string with scope ‘TestScope’.
{
  "root":
  {
    "name": "DemoFormat"
  },
  "extensions": "xml",
  "scope":
  {
    "kind": "static",
    "value": "TestScope"
  },
  "matchers":
  [
    {
      "kind": "attribute",
      "name": "text"
    }
  ]
}

Skin model

Production example: this is the model definition for the built-in <Skin> format, with added comments for illustration purposes:

{
  "root":
  {
    "name": "Skin"
  },
  "extensions": "xml",
  // For all elements: use the 'name' attribute of the parent
  // element as scope if a parent named Form, WindowClass or
  // Perspective exists, use 'Skin' otherwise.
  "scope":
  {
    "kind": "parent",
    "element":
    [
      "Form",
      "WindowClass",
      "Perspective"
    ],
    "attribute": "name",
    "fallback": "Skin"
  },
  "matchers":
  [
    // Parse title attribute from any element.
    {
      "kind": "attribute",
      "name": "title"
    },
    // Parse tooltip attribute from any element.
    {
      "kind": "attribute",
      "name": "tooltip"
    },
    // Parse placeholder attribute from any element.
    {
      "kind": "attribute",
      "name": "placeholder"
    },
    // Parse command.name attribute from any element but
    // use "Command" as scope (overrides model level 'parent'
    // scope rule).
    {
      "kind": "attribute",
      "name": "command.name",
      "scope":
      {
        "kind": "static",
        "value": "Command"
      }
    },
    // Parse command.category attribute from any element but
    // use "Command" as scope (overrides model level 'parent'
    // scope rule).
    {
      "kind": "attribute",
      "name": "command.category",
      "scope":
      {
        "kind": "static",
        "value": "Command"
      }
    },
    // Parse range attribute from all <Parameter> elements
    // if the sibling 'type' attribute value is 'string'.
    {
      "kind": "attribute",
      "name": "range",
      "conditions":
      [
        {
          "kind": "element",
          "name": "Parameter",
          "operator": "equal"
        },
        {
          "kind": "attribute",
          "name": "type",
          "value": "string",
          "operator": "equal"
        }
      ]
    },
    // Parse range attribute from all <Parameter> elements
    // if the sibling 'type' attribute value is 'list'.
    // Split the list value into separate tokens.
    {
      "kind": "attribute",
      "name": "range",
      "options": "split",
      "conditions":
      [
        {
          "kind": "element",
          "name": "Parameter",
          "operator": "equal"
        },
        {
          "kind": "attribute",
          "name": "type",
          "value": "list",
          "operator": "equal"
        }
      ]
    }
  ]
}

Version history

Changelog v1.2.0

  • scan mode ‘-auto’ supports Android resources XML files as built-in format

Changelog v1.1.2

  • change built-in skin parser to scan ‘placeholder’ attribute on all elements

Changelog v1.1.1

  • parse <MetaInformation> ‘Package:Name’

Changelog v1.1.0

  • maintain XML formats as JSON model files

  • added -custom parser mode: scan custom XML formats

  • added -auto parser mode: scan any known content type

  • added support for built-in meta information format

  • added support for built-in document template format

  • improved logging

  • added -v option to toggle debug logs