I don't have such a transformation handy, but I'd sure like one!

Some background: As Robert Figura described in his wonderful post, History of Databases, software developers, engineers, and architects regularly distinguish between SQL and NoSQL databases. An SQL database (like MySQL, PostgreSQL, Oracle, or SQL Server) is a relational database management system that exposes its query interface through the Structured Query Language (SQL), a well-defined standard. A NoSQL database (like MongoDB, DynamoDB, Cassandra, or Redis) does not necessarily use SQL to work with data. Rather than say there's no well-defined standard, I think you might say there's a multitude of "standards" available for these database engines. Redis, for example, has a very well-defined Redis Serialization Protocol (RESP) for working with data; it's just not christened with an ISO or ANSI specification for use among multiple database vendors, although other vendors could implement it if they wished. Suffice it to say, the SQL ecosystem consists of many vendors implementing the same (or close to the same) standardized language, and the NoSQL ecosystem consists of many vendors implementing many languages for data storage and querying.

Here's my question: Given a schema for an SQL-based RDBMS, what's a natural transformation to a schema for a document-based store? My instinct as a programmer would be to identify some kind of interchange format for records, SQL<->NoSQL. One easy such format may be JavaScript Object Notation (JSON). A recent favorite of mine, protocol buffers, also fits the need. Dare I mention the Extensible Markup Language (XML)? **But do any of these languages, used as an interchange format from SQL to NoSQL, have a precise mathematical expression in the language of category theory?**
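To make the programmer's instinct concrete, here is a minimal sketch of one ad hoc SQL-to-document mapping, using JSON as the interchange format. Everything in it is an assumption for illustration: the `customer` and `purchase` tables, the `_id` and `orders` field names, and the choice to embed child rows under their parent by following the foreign key. It is one possible mapping among many, not a canonical or category-theoretic one.

```python
import json
import sqlite3


def build_example_db():
    """Create a tiny, hypothetical relational schema in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE purchase (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            total REAL NOT NULL
        );
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO purchase VALUES (10, 1, 9.5), (11, 1, 20.0), (12, 2, 5.0);
    """)
    return conn


def rows_to_documents(conn):
    """Embed each customer's purchases inside the customer document.

    The foreign key purchase.customer_id -> customer.id becomes nesting.
    """
    conn.row_factory = sqlite3.Row
    customers = conn.execute(
        "SELECT id, name FROM customer ORDER BY id"
    ).fetchall()
    docs = []
    for cust in customers:
        purchases = conn.execute(
            "SELECT id, total FROM purchase WHERE customer_id = ? ORDER BY id",
            (cust["id"],),
        ).fetchall()
        docs.append({
            "_id": cust["id"],
            "name": cust["name"],
            "orders": [{"id": p["id"], "total": p["total"]} for p in purchases],
        })
    return docs


if __name__ == "__main__":
    # Serialize the documents as JSON, ready to load into a document store.
    print(json.dumps(rows_to_documents(build_example_db()), indent=2))
```

Note that the sketch already makes a design decision (embed rather than reference) that the relational schema alone does not determine, which is part of why I'm asking whether a principled transformation exists.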

## Comments

I do not know of any. I also do not think it can be done 'declaratively' any time soon. One would need a library of higher-order functions (functions that produce functions) as well as domain- or business-specific features. Just as there is no single 'statistical model' that works for all cases, I think the same problem applies here.

That's because a single schema expresses several 'areas' or 'dimensions' of data model design considerations: read access optimization, write/update optimization, access control, allowances for new data attributes and their relations, enforceable constraints, logical constraints, and the time dimension.

So to create an 'equivalent' and formally provable transformation, one would have to come up with a way to preserve capabilities in all of the above-mentioned dimensions.

For that to happen, we would need a formal definition for each of the above dimensions, one that describes in some detail an 'algebra' applicable to each area. Then we would need the same thing for the target data model and, finally, for the transformations themselves.

None of those things is, as far as I know, well defined formally. Instead they are 'crafted', so to speak, by data modelers, architects, and so on (and with varying quality, I might add).

This is probably too pessimistic, but to me it feels like we in industry are still using an abacus in this area, while doing what you are asking (and many similar industrial-scale data transformations) needs at least a 486 CPU.
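As a rough illustration of the 'library of higher-order functions' idea mentioned above, here is a small sketch in which each function returns a row-to-document transformation, and the transformations are then composed. All names here (`rename`, `embed`, `compose`, and the sample data) are hypothetical, and the sketch covers only the shape of the data, none of the other dimensions listed above such as access control or constraints.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Transform = Callable[[Row], Row]


def rename(old: str, new: str) -> Transform:
    """Return a transform that renames one field."""
    def apply(row: Row) -> Row:
        out = dict(row)  # copy so the input row is not mutated
        out[new] = out.pop(old)
        return out
    return apply


def embed(field: str, lookup: Callable[[Row], List[Row]]) -> Transform:
    """Return a transform that nests related rows under a new field."""
    def apply(row: Row) -> Row:
        out = dict(row)
        out[field] = lookup(row)
        return out
    return apply


def compose(*steps: Transform) -> Transform:
    """Return a transform that applies the given steps left to right."""
    def apply(row: Row) -> Row:
        for step in steps:
            row = step(row)
        return row
    return apply


# Hypothetical child rows, already grouped by their foreign key.
orders_by_customer = {1: [{"id": 10, "total": 9.5}]}

to_document = compose(
    rename("id", "_id"),
    embed("orders", lambda r: orders_by_customer.get(r["_id"], [])),
)
```

Calling `to_document({"id": 1, "name": "Ada"})` yields a dictionary with `_id`, `name`, and an embedded `orders` list. The domain-specific part is exactly which steps get composed, and that choice still has to be made by hand.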
