- Creating a Taxonomy job and using the Taxonomy Manager
- Additional Attributes for Item Selection
- Taxonomy File Formats
- CSV File Validations
cml:taxonomy_beta is a new tag that will replace cml:taxonomy. It renders a widget that allows contributors to search and browse through a hierarchical list of items (a taxonomy) and select an item (or multiple) to be submitted. Taxonomy data must be formatted according to the Taxonomy File Formats section below.
Please contact your Customer Success Manager or our Platform Support team via email or chat if you would like to use the new taxonomy tool in your job.
Figure 1: cml:taxonomy_beta in Preview
Creating a taxonomy job and using the Taxonomy Manager
On the Design page, once a taxonomy job is created and the job design is saved, a section with a link to the Taxonomy Manager is displayed, as shown in Figure 2.
Figure 2: Taxonomy Manager link in Job Design view.
On the Taxonomy Manager page, the requestor can upload the taxonomy file (JSON or CSV). If the job already has a taxonomy, a download link is available, as shown in Figure 3.
Figure 3: Taxonomy Manager
Additional Attributes for item selection
multi-select: Accepts Boolean values, "true" or "false". If set to "true", the taxonomy tool allows contributors to select multiple items. By default, a contributor can select only one item. An example of the multi-select option is shown in Figure 4.
Figure 4: Example of multi-select=true. User can select more than one item (only leaf/endpoint items)
select-all: Accepts Boolean values, "true" or "false". If set to "true", every taxonomy item is selectable (normally only taxonomy endpoints are selectable). An example of the select-all option is shown in Figure 5.
Figure 5: User can select parent items when select-all is set to "true"
Users can set both `select-all` and `multi-select` to true to enable multiple selections of items at each level as shown in Figure 6.
Figure 6: Both parent and child items can be selected when both select-all and multi-select are set to true.
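The way the two attributes combine can be summarized in a small model. This is only a sketch of the rules described above, not the widget's actual code: the class `SelectionRules` and its method names are hypothetical, and whether a second pick replaces the first in single-select mode is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionRules:
    """Hypothetical model of multi-select and select-all (illustration only)."""
    multi_select: bool = False  # default: a contributor can select only one item
    select_all: bool = False    # default: only endpoints (leaves) are selectable

    def is_selectable(self, is_leaf: bool) -> bool:
        # With select-all, parent items become selectable as well.
        return self.select_all or is_leaf

    def can_add(self, already_selected: int) -> bool:
        # Without multi-select, at most one item can be selected.
        return self.multi_select or already_selected == 0


default_rules = SelectionRules()
both_rules = SelectionRules(multi_select=True, select_all=True)

print(default_rules.is_selectable(is_leaf=False))  # False: parents are not selectable
print(both_rules.is_selectable(is_leaf=False))     # True: select-all opens every level
print(both_rules.can_add(already_selected=3))      # True: multi-select allows more
```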
In addition to selecting items in the taxonomy, the tool also includes a search field as shown in Figure 7.
Figure 7: Search bar with matching results
Nodes can be displayed alphabetically to contributors by setting sort="true". If the attribute is not set or is set to "false", nodes are displayed in the order uploaded.
Taxonomy File Formats
1. CSV Flat
Each row represents a category and its parent. Headers must include:
- description (optional): any information you want displayed in the taxonomy to help the user understand what it is.
When a file is uploaded, the Taxonomy Manager converts it to JSON format, which is then available for download.
E.g., flat csv file:
Taxonomy view for the above file:
2. Nested CSV
In nested taxonomies, each row describes a single node, and its relationships to its parent and children are expressed by position, so row order matters. Headers must include:
- category_1 through category_n: category_1 is the top-level category, followed by any number of sub-categories. Required fields.
- id: (optional).
- description: any information you want displayed in the taxonomy to help the user understand what it is (optional).
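To make the positional rule concrete, here is a minimal sketch that reads a nested CSV into full paths. It assumes each row fills exactly one category_N column and that a node's parent is the nearest preceding row one level up. The function name `parse_nested_csv` and the sample data are made up for illustration, and the Taxonomy Manager's own parser may behave differently.

```python
import csv
import io


def parse_nested_csv(text):
    """Return a list of (path, description) tuples from nested-layout CSV text.

    Assumption: each row fills exactly one category_N column. The column's
    position gives the node's depth.
    """
    reader = csv.DictReader(io.StringIO(text))
    cat_cols = [h for h in reader.fieldnames if h.startswith("category_")]
    stack = []  # names of the current ancestors, top level first
    nodes = []
    for row in reader:
        filled = [(i, row[c]) for i, c in enumerate(cat_cols) if row[c]]
        if len(filled) != 1:
            raise ValueError(f"expected one category per row, got {filled}")
        depth, name = filled[0]
        if depth > len(stack):
            raise ValueError(f"{name!r} skips a level")
        # Drop ancestors deeper than this node, then add the node itself.
        stack = stack[:depth] + [name]
        nodes.append((tuple(stack), row.get("description") or ""))
    return nodes


nested_sample = """category_1,category_2,category_3,description
Animals,,,All animals
,Mammals,,
,,Dog,A domestic dog
,,Cat,
,Birds,,
Plants,,,
"""

for path, description in parse_nested_csv(nested_sample):
    print(" > ".join(path), "|", description)
```

Because the parent is inferred from the rows above, moving a row changes where the node lands in the tree.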
Example nested csv:
3. Path by Row CSV
In path-by-row CSV format, each row describes a full path, so values in the parent columns may repeat across rows. The headers are the same as for the Nested CSV format (see above).
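As a rough illustration of how repeated parent values collapse into one tree, the sketch below merges path rows into a nested dictionary. The function name `parse_path_rows` and the sample data are hypothetical, and it assumes that shorter paths leave their trailing category columns empty.

```python
import csv
import io


def parse_path_rows(text):
    """Build a nested dict {name: {child: {...}}} from path-by-row CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    cat_cols = [h for h in reader.fieldnames if h.startswith("category_")]
    tree = {}
    for row in reader:
        level = tree
        for col in cat_cols:
            name = row[col]
            if not name:
                break  # assumption: a shorter path leaves trailing columns empty
            # setdefault reuses the parent already created by an earlier row
            level = level.setdefault(name, {})
    return tree


path_sample = """category_1,category_2,category_3
Animals,Mammals,Dog
Animals,Mammals,Cat
Animals,Birds,
Plants,,
"""

print(parse_path_rows(path_sample))
```

Unlike the nested layout, each row here is self-contained, so the resulting tree does not depend on row order (only the display order of siblings does).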
Example: path by row csv
4. JSON file format
An example JSON file format:
5. JSON format to support Directed Acyclic Graph (DAG)
The Taxonomy Manager will also support a graph that includes a DAG, like the diagram below.
The JSON format to support the above DAG example will be:
The same example can be supported in CSV using the path-by-row format as shown below:
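To illustrate how path-by-row rows can express a DAG, the sketch below uses made-up node names (not the ones from the diagram) in which one node is reachable through two different parents. It assumes nodes are matched by name; the Taxonomy Manager may instead match on id, and the helper `parents_from_paths` is hypothetical.

```python
import csv
import io
from collections import defaultdict


def parents_from_paths(text):
    """Map each node name to the set of its direct parents."""
    rows = list(csv.reader(io.StringIO(text)))[1:]  # skip the header row
    parents = defaultdict(set)
    for row in rows:
        path = [cell for cell in row if cell]
        # Every adjacent pair in a path is one parent -> child edge.
        for parent, child in zip(path, path[1:]):
            parents[child].add(parent)
    return dict(parents)


dag_sample = """category_1,category_2,category_3
Clothing,Shoes,Sneakers
Sports,Footwear,Sneakers
"""

parent_map = parents_from_paths(dag_sample)
print(sorted(parent_map["Sneakers"]))  # ['Footwear', 'Shoes']
```

A node with more than one entry in its parent set is what distinguishes a DAG from a plain tree.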
CSV file validations
In the Taxonomy Manager, before uploading a CSV file, the file must be formatted as follows to avoid bad parses:
- The CSV must have a header row. The headers must be exactly category_1,category_2,…,category_N,description,id, in that order. The description and id fields are both optional and may appear only after the N category headers.
- The required delimiter for the CSV is a comma (","). Do not put spaces around the delimiting comma in your data rows; otherwise the parsed results may not be correct.
- If a field value includes a comma, you must wrap the entire field value in double quotes, e.g.
"I have, comma"
There must be no spaces before the opening quote or after the closing quote. The quotes must be immediately adjacent to the delimiter (,) to indicate a quoted field.
- Each category path must be entirely on one row. Category paths with category names extended into multiple lines are not yet supported. Category paths that extend to a new line will be parsed as new category paths.
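Some of these rules can be checked locally before uploading. The following is a minimal pre-flight sketch, not the Taxonomy Manager's own validation: the function `check_taxonomy_csv` is hypothetical, and it covers only the header order and stray spaces next to delimiters or quotes.

```python
import csv
import io
import re


def check_taxonomy_csv(text):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty, so the header row is missing"]
    header = rows[0]

    # Split the header into leading category_N columns and the optional tail.
    n = 0
    while n < len(header) and re.fullmatch(r"category_\d+", header[n]):
        n += 1
    expected = [f"category_{i}" for i in range(1, n + 1)]
    if n == 0 or header[:n] != expected:
        problems.append("header must start with category_1, category_2, ... in order")
    if header[n:] not in ([], ["description"], ["id"], ["description", "id"]):
        problems.append(f"unexpected trailing headers: {header[n:]}")

    # A space next to a delimiter (or outside a quote) survives parsing as
    # leading or trailing whitespace in the field value.
    for row_no, row in enumerate(rows[1:], start=2):
        for value in row:
            if value != value.strip():
                problems.append(f"row {row_no}: stray space around {value!r}")
    return problems


good_csv = 'category_1,category_2,description\nAnimals,Mammals,"Warm-blooded, mostly furry"\n'
bad_csv = "category_1,category_2,description\nAnimals, Mammals,x\n"

print(check_taxonomy_csv(good_csv))  # []
print(check_taxonomy_csv(bad_csv))   # ["row 2: stray space around ' Mammals'"]
```

An empty result only means these particular checks passed; the upload may still be rejected for reasons not covered here.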