As we explain in Logical Database Design
(forthcoming), LDD assigns the meaning of terms in conceptual models
(CMs)—properties, entities, groups, multigroup—to non-logical symbols of a
formal logic theory.
If the theory is RDM,
the symbols stand for sets—domains/attributes, tuples, relations,
database—adapted for database management. For each CM the theory acquires an
interpretation, which produces an LM (application) of the theory for database
representation and manipulation.
Here are the adapted
sets symbolized in RDM which acquire the interpretation of the terms in CMs.
“Table (n.) – a collection of information (data?)
describing a population of entities which possess some common characteristics,
called attributes.
-itis
– “suffix denoting diseases characterized by inflammation, itself often caused
by an infection.” ---------- from the Wikipedia Wiktionary.”
Tables
are the building block of relational databases. Tables must generally be
“normalized,” at least to 1NF. That may be an appropriate way to think of
databases when implemented in a modern day DBMS. However, it is not the way the
world thinks logically. People have no problem with commonly occurring
phenomena such as:
· A multi-valued attribute, e.g., an Employee possesses multiple Skills.
· Many-to-many (M:N) relationships, e.g., as between Employees and Projects
· A relationship with attributes
even though our systems may. None of these situations
can be handled directly in a relational database."
This just now, on LinkedIn (check out my comments).
“Putting to one side the argument that your
data almost certainly didn't start out broken out in to tables, and it almost
certainly isn't consumed that way either, here's the thing; MongoDB, if you
squint, is essentially a relational database with an unorthodox take on first
normal form and some great high availability and scalability features.” -- Graeme Robinson
“...
In ORM there is no concept of an entity record (tuple), although relational
tables can be automatically generated from an ORM model (furthermore,
guaranteed to be fully normalized).” --Online
comment
“Object
Role Modeling (ORM) is a ...a fact-oriented modeling approach for
specifying, transforming, and querying information at a conceptual level.
Unlike [other modeling approaches] ... fact-oriented modeling is
attribute-free, treating all elementary facts as relationships ... In practice,
ORM data models often capture more business rules, and are easier to validate
and evolve than data models in other approaches.” --ORM.net
Chris
Date once published an article at the old DBDebunk titled “Models, Models,
Everywhere, Nor Any Time to Think”. If you want to get a hold of what he
meant then, you oughta do a search on the title now and see what you get.
The
continuous proliferation of models is an indication and measure of the
industry's disregard for, if not outright hostility to, sound theoretical
foundations. It keeps reminding me of a decades-old piece I posted in response
to David Hay's critique of Ron Ross's then proposal of a “fact model” (yet
another one) as an alternative to a data model. It is more relevant than ever,
which is why I decided to bring it up to date. The problem is so entrenched and
widespread that even those who try to address it fail to realize that they are
victims of it too.
Hay
correctly observed:
“In our industry, there is a
strong desire to put names on things. This is natural enough, given the amount
of information that we have to classify and deal with in our work. To give
something a name is to gain control over it, and this is not necessarily a bad
thing. The problem is when the name takes the place of true understanding of
the thing named. Discourse tends to be the bantering of names, without true
understanding of the concepts involved.”
In
this industry, many of the names are just re-labeling, whether it fits or
not. Here are a couple of exquisite examples of both cases:
“I was amused to read in
[Ralph Kimball's] article that my own suppliers and parts database design was
"a perfect, beautiful star schema!" When I first learned the term
"star schema", my reaction was that a properly designed star schema
would be nothing neither more, nor less than a properly designed schema per se
(in other words, one that did obey those scientific principles of relational
design that do exist). So to see RK say that my schema was in fact a star
schema reminded me (I’m afraid) of Peter Chen’s original E/R paper, in
which—among other things—he reinvented the concept of domains, but called them
value sets, and then went on to analyze the relational model in terms of his
own ideas and said “Look, domains are just value sets!” --C. J. Date
Note: Kimball's "star schema" is, of
course, not a relational schema, but quite an attempt to avoid it, due to
failure to distinguish application views of the database from the database
schema.
“If
we step back and look at what RDBMS is, we’ll no doubt be able to conclude
that, as its name suggests (i.e., Relational Database Management System), it is
a system that specializes in managing the data in a relational fashion. Nothing
more. Folks, it’s important to keep in mind that it manages the data, not the
MEANING of the data! And if you really need a parallel, RDBMS is much more akin
to a word processor than to an operating system. A word processor (such as the
much maligned MS Word, or a much nicer WordPress, for example) specializes in
managing words. It does not specialize in managing the meaning of the words ...
So who is then responsible for managing the meaning of the words? It’s the
author, who else? Why should we tolerate RDBMS opinions on our data? We’re the
masters, RDBMS is the servant, it should shut up and serve. End of discussion.”
--Alex Bunardzik, Should Database
Manage The Meaning?
Series Preface
Introduction
1. Interpretation of Database Relations
1.1. Attributes as Constrained Domains
1.2. Time-Varying Relations
2. Representation of Database Relations
2.1. Physical Data Independence
2.1.1. Uniquely Named Attributes
2.1.2. Primary Keys
2.1.3. Relations and R-tables
3. Normalization
3.1. First Normal Form and “Simple” Domains
3.2. Normalization and Non-simple Domains
3.2.1. Foreign Keys
Conclusion
In "Codd Almighty!
Has it been half a century of SQL already?" the Register's Lindsay
Clark interviews "Donald Chamberlin, Michael Stonebraker and more"
about the legendary programming [sic] language. Chamberlin and Raymond Boyce
were the authors of "the 1974 paper SEQUEL: A structured English query language
as a way of addressing data in IBM's newly proposed System R, the first
database to embody Edgar Codd's paper describing the relational model for database management."
C. J. Date, who worked at IBM at the time, has often stated
that the designers of SQL never understood RDM, and I expressed a similar
stance in If You Liked SQL, You'll love XQuery. This has had an
extremely detrimental effect on
database technology--regress rather than progress--none of which transpires in
the interview. So here is my reality check take on what you would not know from the interview.
I am working on entirely new papers (not re-writes) in the PRACTICAL DATABASE FOUNDATIONS series. I have already published two:
THE FIRST NORMAL FORM - A DEFINITIVE GUIDE
PRIMARY KEYS - A NEW UNDERSTANDING
available for ordering from the PAPERS page, and two more:
RELATIONAL DATABASE DOMAINS: A DEFINITIVE GUIDE
DATABASE RELATIONS: A DEFINITIVE GUIDE
are in progress and forthcoming, respectively.
In the process I am coming across common and entrenched industry "pearls" that I am using for my "Setting Matters Straight" (SMS) and "To Laugh or Cry" (TLC) posts on LinkedIn. I do those posts to help the few remaining thinking database professionals realize how scarce foundation knowledge is, and to illustrate fallacies that abound in the industry, of which they are unaware, and which the papers are intended to dispel.
Time permitting, I may expose and dispel some of those fallacies, treated in more depth in the papers, so that those thinking professionals can test their knowledge and decide whether the papers are a worthy educational investment.
Here's one.
“A domain in most SQL usage is essentially an alias name for an existing type + restrictions on an existing type that can be used in a column. As for an attribute, it's essentially a COLUMN in SQL, a field in other types of databases, etc.”
Can you identify the fallacies before you proceed?
Here's one.
“Data
is stored in two-dimensional tables consisting of columns (fields) and
rows (records). Multi-dimensional data is represented by a system of
relationships among two-dimensional tables.”
Here's one:
“There
seams to be some confusion between what a Primary Key is, and what an
Index is and how they are used. The Primary Key is a logical object. By
that I mean that is simply defines a set of properties on one column or a
set of columns to require that the columns which make up the primary
key are unique and that none of them are null. Because they are unique
and not null, these values (or value if your primary key is a single
column) can then be used to identify a single row in the table every
time. In most if not all database platforms the Primary Key will have an
index created on it. An index on the other hand doesn’t define
uniqueness. An index is used to more quickly find rows in the table based
on the values which are part of the index. When you create an index
within the database, you are creating a physical object which is being
saved to disk.”
Can you identify the fallacies before you proceed?
“The company was using a [SQL] RDBMS . . . to handle data transactions for its trading applications. However, the applications required arbitrary data types, which is nearly impossible for relational systems, according to experts.”
This statement contains three fallacies. Can you identify them before you proceed?
SUPPORT
THIS SITE
DBDebunk was maintained and kept free with the proceeds from my @AllAnalytics
column. The site was discontinued in 2018. The content here is not available
anywhere else, so if you deem it useful, particularly if you are a regular
reader, please help with its upkeep by purchasing publications, or donating. On-site
seminars and consulting are available. Thank you.
HOW TO USE THIS SITE
- To work around Blogger limitations, the labels are mostly abbreviations or
acronyms of the terms listed on the SEARCH page.
For detailed instructions on how to understand and use the labels in
conjunction with that page, see the ABOUT page. The 2017 and 2016 posts, including earlier
posts rewritten in 2017, were relabeled accordingly. As other older posts are
rewritten, they will also be relabeled. For all other older posts use Blogger
search.
- The links to my AllAnalytics columns no longer work. I re-published only the
2017 columns @dbdebunk, and within them links to sources external to
AllAnalytics may or may not work.
"SQL RDBMS" is a contradiction in terms. Not only are SQL DBMSs not relational (and, thus, fail to provide RDM's advantages), but, even leaving SQL out, the interpretation (and, thus, understanding, such as it is) of RDM dominant in the industry is flawed. Do you know why, and what the missed advantages are?
"Arbitrary data types" (more precisely, domains of arbitrary complexity, not to be confused with SQL built-in types) are not impossible in RDM properly understood, namely, as coupled with a strong type system: a notion of type hierarchy, derived from a theory of types, that governs manipulation of domain values. That type system is orthogonal to RDM, albeit necessary for support of domains in general and of so-called "complex" domains in particular; it is orthogonal in the sense that the relational data sublanguage is insulated from the implementation of the domains and their operators. Such a type system is incorporated in McGoveran's Semantic-Relational Data Model (SRDM), the correct interpretation, extension and formalization of Codd's work.
As to "experts", I do not know many (to understate the case) in RDM and I assure you that the above statement was not made by any of them.
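The insulation of the relational level from domain implementation, mentioned in the second point above, can be illustrated with a minimal Python sketch. It is not SRDM, nor any DBMS API: the `Money` domain, its `exceeds` operator, the `Trade` tuple type and the `restrict` function are all hypothetical names invented for illustration. The only point shown is that a relation-level operation invokes a domain's operators without knowing how the domain is implemented.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, NamedTuple


@dataclass(frozen=True)
class Money:
    """Hypothetical 'complex' domain: an amount in minor units plus a currency."""
    minor_units: int
    currency: str

    def __post_init__(self) -> None:
        # Domain constraint: only well-formed values belong to the domain.
        if self.minor_units < 0 or len(self.currency) != 3:
            raise ValueError("not a value of the Money domain")

    def exceeds(self, other: "Money") -> bool:
        # Domain-level operator; its implementation is hidden from the relation level.
        if self.currency != other.currency:
            raise TypeError("cannot compare amounts in different currencies")
        return self.minor_units > other.minor_units


class Trade(NamedTuple):
    """Hypothetical tuple type with one attribute drawn from the Money domain."""
    trade_id: int
    notional: Money


Relation = FrozenSet[Trade]


def restrict(relation: Relation, predicate: Callable[[Trade], bool]) -> Relation:
    # Relation-level operation: it only evaluates the predicate per tuple and
    # never inspects how Money values are represented.
    return frozenset(t for t in relation if predicate(t))


TRADES: Relation = frozenset({
    Trade(1, Money(500_000, "USD")),
    Trade(2, Money(25_000, "USD")),
})

THRESHOLD = Money(100_000, "USD")
LARGE_TRADES = restrict(TRADES, lambda t: t.notional.exceeds(THRESHOLD))
```

Changing the internal representation of `Money` (say, to a decimal amount) would leave `restrict` untouched, which is the sense of orthogonality this sketch assumes.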
References
McGoveran, D., LOGIC FOR SERIOUS DATABASE FOLK (draft chapters), forthcoming.
Pascal, F., RELATIONAL DATABASE DOMAINS, forthcoming.
In Part 1 we introduced in the conceptual model (CM) the
metalogical designation property. It represents, in the absence of known
shared defining properties of an entity type, the designation by a group's
definer that an entity identifier (aka assigned name) or property value is a
member of the group. Such a group is not a group of entities, but a
group of name and property values. In the logical model (LM), it is
formalized as a designation predicate (DP) and defines a domain.
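As a rough illustration only, the idea of a domain defined by designation can be sketched in Python. This is an assumption-laden reading, not the formalization used in the series: the department names, `is_department_name` and `check_domain` are hypothetical, and membership holds solely because the definer listed the values, not because they share a defining property.

```python
from typing import FrozenSet

# Hypothetical designation: the group's definer simply lists the member names.
DESIGNATED_DEPARTMENT_NAMES: FrozenSet[str] = frozenset(
    {"Sales", "Research", "Audit"}
)


def is_department_name(value: str) -> bool:
    """Designation predicate (DP): true exactly for the designated values."""
    return value in DESIGNATED_DEPARTMENT_NAMES


def check_domain(value: str) -> str:
    """Accept a value only if it satisfies the DP, i.e. belongs to the domain."""
    if not is_department_name(value):
        raise ValueError(f"{value!r} is not in the designated domain")
    return value
```

Under this sketch, adding a member to the domain means changing the designation itself, since no rule exists from which membership could be derived.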
In Part 2, we introduce the metalogical assertion property.
It represents the assertion by an authorized database user that a specific
entity, represented by a tuple, either does or does not correspond to an actual
entity in the real world.
One purpose of our contributions here is to suggest a vocabulary
that avoids confusion not just within the formal logical level, but also
between conceptual and logical terminologies, which is widespread in the
industry and is exacerbated by limitations of natural language (NL). We use the
following terminology in our approach to conceptual modeling:
Objects are:
- Primitive (basic entities);
- Compound:
  - groups of related entities;
  - multigroups (groups of related groups);
Properties are:
- Individual (of basic entities);
- Collective:
  - Of groups: relationships among entities within a group;
  - Of multigroups: relationships among groups within a multigroup.
Note: It is a McGoveran insight that relationships
between objects at a lower aggregate level are properties of the object at the
higher aggregate level which the former comprise (LOGIC FOR SERIOUS DATABASE
FOLK, forthcoming; see draft chapters at http://www.alternativetech.com/ATpubs_dir.html).
For classification of properties as first, second, third and fourth order (1OP,
2OP, 3OP and 4OP) see RELATIONSHIPS AND THE RDM Parts 1-3 (https://www.dbdebunk.com/2023/03/relationships-and-rdm-v2-part-1.html).
All such properties can be expressed logically in a FOPL-based relational data
sublanguage as constraints, which is beyond the scope of this
discussion.
“I have read this article in an effort to boost my academic knowledge on data modeling a bit and still have no idea what this academic author wanted to say. Apparently First Normal Form (1NF) doesn't get enough respect and then proceeds to talk about Non-First Normal Form (NFNF). But what about First Normal Form (1NF) damnit.”