We've found a need for more complex tag values in our use cases for Spectrum Discover. An example of this is the ability to associate an entity with a document, where the entity has its own attributes. One such entity could be a "creator", with attributes "name", "email", "role" etc. JSON documents are perfect for this case, and some experimentation indicates that tag values support JSON-formatted values, such as:
{"name": "Jane Doe", "email": "john@doe.com", "role": "dataset_contact"}
where the tag name is 'test_json'.
Using the DB2 documentation for JSON_VALUE [1] we've verfied that querying the attributes of a JSON-formatted tag value is possible in Spectrum Discover. It works in the SQL query section of the GUI, the db2whrest/search and db2whrest/sql_query API endpoints. A working query for the above JSON-string:
"JSON_VALUE(test_json, 'strict $.name' RETURNING VARCHAR(32)) LIKE '%Doe'"
Which will return any object in the database where the 'name' attribute of the document matches.
However, the DB2 documentation also suggests that it should be possible to perform more complex queries than this. In particular, we would like to query tag values with more than one document.
Example:
A tag value similar to the one above, but with multiple instances of the same type of document.
It could look like this:
"{
creators = [
{
"name": "Jane Doe",
"email": "jane@doe.com",
"role": "dataset_contact"
},
{
"name": "John Doe",
"email": "john@doe.com",
"role": "author"
},
{
"name": "Jack Doe",
"email": "jack@doe.com",
"role": "dataset_contact"
}
]
}"
Querying with JSON_VALUE won't work in this instance. We've made some attempts using JSON_TABLE [2], with limited success.
For example:
SELECT t.name, t.email, t.role
FROM T_TEST_JSON,
JSON_TABLE(
T_TEST_JSON.value,
'strict $'
COLUMNS(
name VARCHAR(2048) FORMAT JSON PATH 'lax $.creators[*].name' WITH CONDITIONAL WRAPPER,
email VARCHAR(2048) FORMAT JSON PATH 'lax $.creators[*].email' WITH CONDITIONAL WRAPPER,
role VARCHAR(2048) FORMAT JSON PATH 'lax $.creators[*].role' WITH CONDITIONAL WRAPPER
) ERROR ON ERROR
) as t
will yield the following result when sent to the API endpoint /db2whrest/sql_query:
0,"[""Jane Doe"",""John Doe"",""Jack Doe""]","[""jane@doe.com"",""john@doe.com"",""jack@doe.com""]","[""dataset_contact"",""author"",""dataset_contact""]"
while what we would like to see is a something like:
0,""Jane Doe"",""jane@doe.com"",""dataset_contact""
1,""John Doe"",""john@doe.com"",""author""
2,""Jack Doe"",""jack@doe.com"",""dataset_contact""
on which we could apply a WHERE clause to further filter the results.
We've been unable to find any documentation or examples of JSON capabilities in the Spectrum Discover documentation. Are there any resources other than the DB2 documentation available for this? Or perhaps some examples from usage in Spectrum Discover?
[1] https://www.ibm.com/docs/en/db2/11.5?topic=functions-json-value
[2] https://www.ibm.com/docs/en/db2/11.5?topic=functions-json-table
[3] https://www.ibm.com/docs/en/i/7.3?topic=data-using-json-table