UML For The C Programming Language
Abstract
In this paper we describe a substantial example of model-driven development (MDD)
applied to model transformation (MT) development. A detailed requirements engineering
process was followed, together with an agile MDD process. The transformation maps UML
specifications to ANSI C code, and it is written using the UML-RSDS MT specification
language.
1 Introduction
In this paper we describe the requirements and specification of a code generator for mapping
UML to ANSI C. Although the C language is now over 40 years old, it is still in widespread
use, especially for applications requiring high efficiency and small code size. Therefore it was
considered useful to provide such a generator. Existing C code generators for UML do not map
detailed functional specifications (expressed, for example, in OCL or by UML activities) into
C, and we wished to provide a generator which would implement the OCL library and provide
full functional implementations of UML operation specifications and activities. In this respect,
the generator can be used as a ‘virtual machine’ for the execution of UML/OCL and for model
transformations (MT) expressed in UML/OCL, as an alternative to the more usual Java/JVM
implementation route for UML [15].
We use the UML-RSDS subset of UML as our input language [7]. UML-RSDS models systems,
including model transformations, using UML 2 class diagrams, OCL 2.4 and use cases at the
specification level. State machines and interactions may optionally be used. At the design level,
UML activities using a pseudocode notation are used. Relative to the fUML executable subset of
UML [12], UML-RSDS has a more declarative orientation. As in the case of fUML, certain UML
notations and modelling elements have been excluded from UML-RSDS in order to achieve a more
coherent and precise modelling language. Applications (such as the UML to C code generator
itself) are specified using class diagrams, use cases and OCL constraints with a clear semantic
meaning, and designs are then generated semi-automatically from these. From the design, code
in Java, C# or C++ can be automatically generated (Figure 1). The UML to C translator will
take as input text files model.txt which define the design level of an application as an instance
model of the UML-RSDS class diagram, OCL and activities metamodels (Figures 4, 6, 8). It is
assumed that these models are syntactically correct with respect to the metamodels, and that type checking
and design generation steps have been performed prior to export of the data.
The code generator is an example of the reflexive application of UML-RSDS to operate on
UML-RSDS models. This technique enables the extension of the UML-RSDS tools with new
capabilities such as customised XML production, document and code generators, analysis reports
and new semantic mappings, etc. It also supports the use of UML-RSDS to define textual DSLs
and associated tools.
Figure 1: UML-RSDS process
Figure 2: Functional requirements decomposition in SysML
The development was therefore organised into five iterations, one for each translator part, and
each iteration was given a maximum duration of one month. The overall development bound was
6 months (process requirement NF10).
Other high-priority requirements identified for the translator were the following functional and
non-functional system (product) requirements:
• F3: Model-level semantic preservation: the semantics of the source and target models are
equivalent.
• F4: Traceability: a record should be maintained of the correspondence between source and
target elements.
• NF2: Efficiency: input models with 100 classes and 100 attributes per class should be
processed within 30 seconds.
• NF6: Produce efficient code, of the same efficiency as equivalent hand-produced code.
• NF7: Produce compact code, of the same size as equivalent hand-produced code.
• NF2 conflicts with F4, F5 and NF3 because the additional structure needed for tracing and
bx properties impairs efficiency, and the decomposition of the transformation into subtrans-
formations composed sequentially also slows execution.
• NF4 conflicts with NF10 as the additional work required for NF4 would need substantial
additional time resources.
• NF6 conflicts with F3 because in some cases semantic correctness will require inefficient
coding, e.g., because OCL collection operators produce modified copies of their arguments
instead of updating them in-place.
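The NF6/F3 conflict can be illustrated with a minimal C sketch. It assumes that an OCL sequence of objects is held as a NULL-terminated array of pointers; the names seq_length, reverse_copy and reverse_in_place are hypothetical and are not part of the generated library. The copying version follows the OCL semantics (the argument is left unchanged), while the in-place version is what a hand-written program might use instead:

```c
#include <assert.h>
#include <stdlib.h>

/* Number of elements before the NULL terminator. */
int seq_length(void** col)
{ int n = 0;
  assert(col != NULL);
  while (col[n] != NULL) { n++; }
  return n;
}

/* OCL-style reverse: the argument is left unchanged and a new,
   NULL-terminated array is returned (one allocation, n copies). */
void** reverse_copy(void** col)
{ int n = seq_length(col);
  void** result = (void**) calloc(n + 1, sizeof(void*));
  int i = 0;
  if (result == NULL) { return NULL; }
  for ( ; i < n; i++)
  { result[i] = col[n - 1 - i]; }
  result[n] = NULL;
  return result;
}

/* Hand-written alternative: reverses in place with no allocation,
   but the caller's original ordering is lost. */
void reverse_in_place(void** col)
{ int n = seq_length(col);
  int i = 0;
  for ( ; i < n / 2; i++)
  { void* tmp = col[i];
    col[i] = col[n - 1 - i];
    col[n - 1 - i] = tmp;
  }
}
```

The copying version costs an allocation and n pointer copies on every call, which is the kind of overhead that NF6 would rule out but that F3 may require.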
The project attributes are as follows:
1. Size: medium
2. Complexity: high
3. Volatility: low
4. Customer relationship: low
5. Safety: low
6. Quality: high
7. Cost constraint: medium
8. Time constraint: medium
9. Domain knowledge: medium
It was identified that a suitable overall architecture for the transformation was a sequential
decomposition of a model-to-model transformation design2C, and a model-to-text transformation
genCtext. Decomposing the code generator into two sub-transformations improves its modularity,
and simplifies the transformation rules, which would otherwise need to combine language trans-
lation and text production. Figure 3 shows the resulting transformation architecture. This is an
example of the architectural pattern Factor Code Generation into Model-to-model and Model-to-
code [2].
This decomposition means that each of the high-level requirements needs to be satisfied by
both design2C and genCtext. The requirements for bidirectionality and traceability are, however,
specific to design2C.
After a further interview, the application of model-based testing and bx to achieve F3 was
identified as an important area of work. Tests for the synthesised C code should, ideally, be
automatically generated based on the source UML model. The bx property can be utilised for
testing semantic equivalence by transforming UML to C, applying the reverse transformation, and
comparing to identify if the two UML models are isomorphic.
Sections 2, 3, 4, 5 and 6 describe the five iterations, Section 7 gives an evaluation, Section
8 describes related work, and Section 10 gives conclusions. The appendix summarises the C
semantics which we use to justify semantic preservation.
1.2 Methodology
We adopted an agile MDD approach to develop the code generator.
The key principles of agile development include (agilemanifesto.org): (i) satisfy the customer
through early and continuous software delivery; (ii) welcome changing requirements; (iii) deliver
working software frequently (every 2 weeks to every 2 months); (iv) business people and devel-
opers to work together daily; (v) rely on face-to-face communication to convey information; (vi)
continuous attention to software quality; (vii) simplicity is essential.
To achieve these principles, the following agile practices were used in the development: (i)
short iterations (principles (i) and (iii)); (ii) refactoring (principle (vi)); (iii) emphasis on sim-
plicity (principle (vii)); (iv) product and iteration backlogs (principles (i), (ii)); (v) Scrumboards
(principles (i), (ii)); (vi) continuous integration and testing (principle (vi)). Due to the lack of
direct customer contacts, it was not possible to achieve principle (iv).
The following MDD practices were used: (i) metamodelling; (ii) transformations; (iii) exe-
cutable modelling. The Scrum process [13] was followed, with executable models being used in
place of code.
Within each iteration, phases of requirements analysis, specification, implementation and test-
ing were applied to each task, and the following process was generally followed:
1. Exploratory prototyping and research to assess the feasibility and semantic validity of pos-
sible C programming approaches to express UML and OCL elements. The Visual Studio
(2012), gcc (2016) and lcc (2016) C compilers were used.
2. Informal specification in concrete grammar of the mappings from UML to C. Discussion of
the informal specification with stakeholders/team members.
3. Formalisation of the mappings as UML-RSDS rules and operations.
4. Specification review/refactoring.
5. Prototyping and testing of these specifications; integration with other software elements;
revision of specifications as necessary to pass tests and efficiency requirements. Test cases
were manually constructed for each local functional requirement.
is not needed, nor are the associations linkedClass, ownedLiteral or the classes Enumeration,
EnumerationLiteral in this version of the transformation.
A model of Figure 4 is concretely represented as a text file with each line describing either (i)
that an object exists in a particular concrete class, e.g.:
c : Entity
p : Property
or (ii) that an attribute or 1-multiplicity role has a particular value:
c.name = "A"
p.type = Integer
or (iii) that an object is in a many-multiplicity role:
p : c.ownedAttribute
c : p.owner
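The three kinds of line can be told apart mechanically. The following sketch is an assumption about how a reader of model.txt might classify lines, and is not the loader used by UML-RSDS; the names LineKind and classify_line are hypothetical:

```c
#include <assert.h>
#include <string.h>

enum LineKind { OBJECT_DECL, FEATURE_VALUE, ROLE_MEMBER, UNKNOWN_LINE };

/* Classify one line of a model.txt file:
   "x : C"      object x exists in class C
   "x.f = v"    attribute or 1-multiplicity role f of x has value v
   "x : y.r"    object x is a member of many-multiplicity role r of y */
enum LineKind classify_line(const char* line)
{ const char* eq;
  const char* colon;
  assert(line != NULL);
  eq = strstr(line, " = ");
  colon = strstr(line, " : ");
  /* An assignment wins if " = " occurs before any " : ", so that a
     string value containing " : " is not misread as a declaration. */
  if (eq != NULL && (colon == NULL || eq < colon))
  { return FEATURE_VALUE; }
  if (colon != NULL)
  { /* A dot on the right of " : " names a role, not a class. */
    if (strchr(colon + 3, '.') != NULL) { return ROLE_MEMBER; }
    return OBJECT_DECL;
  }
  return UNKNOWN_LINE;
}
```

For example, classify_line("p : c.ownedAttribute") yields ROLE_MEMBER, while classify_line("c : Entity") yields OBJECT_DECL.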
The initial target language is a simplified version of the abstract syntax of C programs, sufficient
to represent UML types (Figure 5). This language is open to further elaboration and extension
during the development.
Using goal decomposition, the requirements were decomposed into specific mapping requirements;
these are the local functional requirements F1.1.1 to F1.1.4 in Figure 2. Table 1 shows
the informal scenarios for these local mapping requirements, based on the concrete metaclasses of
Type and the different cases of instances of these metaclasses. The schematic concrete grammar
is shown for the C elements representing the UML concepts. This mapping has the advantage
of simplicity, and the T* operator directly interprets Collection(T). As a result of requirements
evaluation and negotiation with the principal stakeholder, using exploratory prototyping and sce-
nario analysis, it was determined that all of these local requirements are of high priority except
Figure 5: C language metamodel
for the mapping F1.1.2 of enumerations (medium priority). The justification for this is that enu-
merations are not an essential UML language element. Bidirectionality was considered a high
priority for this subtransformation. It was identified that to meet this requirement, all source
model Property elements must have a defined type – and specifically that elements representing
many-valued association ends must have some CollectionType representing their actual type. A
limitation of the mapping is that mapping collections of numbers or booleans to C is not pos-
sible, because there is no means to identify the end of the collection in C (NULL is used as
the terminator for collections of objects and collections of strings). This means that expressions
such as objs→collect(att)→sort() must be coded as objs→sortedBy(att)→collect(att). Likewise,
objs→collect(att)→count(val) should be expressed as objs→select(att = val)→size(). In general,
operators such as sorting and iterators should only be applied to collections of objects or of strings,
and not to collections of numeric/boolean values.
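The workaround can be sketched in C as follows. This is an illustration under stated assumptions, not the ocl.h implementation: the struct Item, the helpers sortedByAtt and collectAtt, and the body given here for length are hypothetical. Sorting is applied while the collection still holds objects (so that NULL can terminate it), and only the last step produces numeric values, whose count must be carried separately:

```c
#include <assert.h>
#include <stdlib.h>

struct Item { int att; };

/* Assumed helper: counts elements up to the NULL terminator. */
int length(void** col)
{ int n = 0;
  if (col == NULL) { return 0; }
  while (col[n] != NULL) { n++; }
  return n;
}

/* Comparison for qsort over an array of struct Item* values. */
static int compare_att(const void* x, const void* y)
{ const struct Item* a = *(const struct Item* const*) x;
  const struct Item* b = *(const struct Item* const*) y;
  return (a->att > b->att) - (a->att < b->att);
}

/* objs->sortedBy(att): returns a sorted, NULL-terminated copy. */
struct Item** sortedByAtt(struct Item** objs)
{ int n = length((void**) objs);
  struct Item** result = (struct Item**) calloc(n + 1, sizeof(struct Item*));
  int i = 0;
  if (result == NULL) { return NULL; }
  for ( ; i < n; i++) { result[i] = objs[i]; }
  result[n] = NULL;
  qsort(result, n, sizeof(struct Item*), compare_att);
  return result;
}

/* ...->collect(att): the int result has no terminator, because any
   int (including 0) is a legal element; the caller keeps n itself. */
int* collectAtt(struct Item** objs, int n)
{ int* result;
  int i = 0;
  assert(n >= 0);
  result = (int*) calloc(n > 0 ? (size_t) n : 1, sizeof(int));
  if (result == NULL) { return NULL; }
  for ( ; i < n; i++) { result[i] = objs[i]->att; }
  return result;
}
```

Note that an element with att = 0 is handled correctly here only because the length n is passed explicitly; a 0-terminated int array could not represent it.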
Requirements specification formalises these mappings as UML-RSDS rules, defining the postconditions
of a transformation types2C. Scenario analysis and evolutionary prototyping were used
for this stage. The Auxiliary Correspondence Model pattern [6] was used to achieve the bidirectionality
requirement and the traceability requirement. A new identity attribute typeId : String
was introduced into Type, and ctypeId into CType. This enables Type and CType instances to be
looked up by key value: Type[id] is the type instance t with t.typeId = id.
An example of a scenario expressed in the SBVRSE notation for SysML [11] is (for F1.1.1.1):
It is necessary that each string type PrimitiveType instance s maps to a CPointerType
instance p such that p.ctypeId = s.typeId, and to a CPrimitiveType instance c such
that p.pointsTo = c and c.name = “char”.
Each PrimitiveType is considered to be a string type PrimitiveType if it has name
“String”.
Such representations can be directly mapped to UML-RSDS rule specifications:
PrimitiveType::
name = "String" =>
CPointerType->exists( p | p.ctypeId = typeId &
CPrimitiveType->exists( c | c.name = "char" & p.pointsTo = c ) )
In this rule, the quantification is over objects self : PrimitiveType, and expresses that whenever
the lhs of the rule is true, the rhs should be made true, i.e., the relevant C types should be looked up
or created if they do not already exist.
Primitive types, entity types and collection types are successively mapped. typeId : String
and ctypeId : String are new identity attributes introduced for the Auxiliary Correspondence Model pattern.
These provide a correspondence between the occurrences of types in the source UML-RSDS design
and in the target C implementation. For model.txt files, typeId values are strings consisting of
digits. It is assumed that collection types can only have entity types or primitive types as their
element types:
CollectionType ::
elementType : Entity or elementType : PrimitiveType
This ensures that, in the final two rules, CType[elementType.typeId ] does exist at the point where
it is looked-up: it has been created by earlier rules operating on the element type.
PrimitiveType::
name = "int" =>
CPrimitiveType->exists( p | p.ctypeId = typeId & p.name = "int" )
PrimitiveType::
name = "long" =>
CPrimitiveType->exists( p | p.ctypeId = typeId & p.name = "long" )
PrimitiveType::
name = "double" =>
CPrimitiveType->exists( p | p.ctypeId = typeId & p.name = "double" )
PrimitiveType::
name = "boolean" =>
CPrimitiveType->exists( p | p.ctypeId = typeId & p.name = "unsigned char" )
PrimitiveType::
name = "void" =>
CPrimitiveType->exists( p | p.ctypeId = typeId & p.name = "void" )
PrimitiveType::
name = "String" =>
CPointerType->exists( t | t.ctypeId = typeId &
CPrimitiveType->exists( p | p.name = "char" & t.pointsTo = p ) )
Entity::
CPointerType->exists( p | p.ctypeId = typeId &
CStruct->exists( c | c.name = name & c.ctypeId = name & p.pointsTo = c ) )
CollectionType::
name = "Sequence" =>
CArrayType->exists( a | a.ctypeId = typeId & a.duplicates = true &
a.componentType = CType[elementType.typeId] )
CollectionType::
name = "Set" =>
CArrayType->exists( c | c.ctypeId = typeId & c.duplicates = false &
c.componentType = CType[elementType.typeId] )
The mapping of strings and entity types is an example of the Entity Splitting pattern [3].
During requirements validation and verification, model-level semantic preservation can be
shown based on the bx properties. The above constraints can be inverted to:
CPrimitiveType::
name = "int" =>
PrimitiveType->exists( t | t.typeId = ctypeId & t.name = "int" )
CPrimitiveType::
name = "long" =>
PrimitiveType->exists( t | t.typeId = ctypeId & t.name = "long" )
CPrimitiveType::
name = "double" =>
PrimitiveType->exists( t | t.typeId = ctypeId & t.name = "double" )
CPrimitiveType::
name = "unsigned char" =>
PrimitiveType->exists( t | t.typeId = ctypeId & t.name = "boolean" )
CPrimitiveType::
name = "void" =>
PrimitiveType->exists( t | t.typeId = ctypeId & t.name = "void" )
CPointerType::
p : CPrimitiveType & p.name = "char" & pointsTo = p =>
PrimitiveType->exists( t | t.typeId = ctypeId & t.name = "String" )
CPointerType::
c : CStruct & pointsTo = c =>
Entity->exists( e | e.typeId = ctypeId & e.name = c.name )
CArrayType::
duplicates = true =>
CollectionType->exists( t | t.typeId = ctypeId & t.name = "Sequence" &
t.elementType = Type[componentType.ctypeId] )
CArrayType::
duplicates = false =>
CollectionType->exists( t | t.typeId = ctypeId & t.name = "Set" &
t.elementType = Type[componentType.ctypeId] )
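The use of the inverse rules as a round-trip check can be sketched in C for the primitive types. The sketch is illustrative only: the names TypePair, uml_to_c, c_to_uml and round_trips are hypothetical, and "String" is abbreviated to the single name "char*" although in the metamodel it is a CPointerType whose pointsTo is the CPrimitiveType "char". The table entries themselves are those of the rules above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct TypePair { const char* uml; const char* c; };

/* Primitive-type correspondences taken from the forward rules. */
static const struct TypePair pairs[] = {
  { "int", "int" }, { "long", "long" }, { "double", "double" },
  { "boolean", "unsigned char" }, { "void", "void" },
  { "String", "char*" }
};
static const size_t npairs = sizeof(pairs) / sizeof(pairs[0]);

/* Forward direction; NULL if the UML name is not a known primitive. */
const char* uml_to_c(const char* uml)
{ size_t i = 0;
  assert(uml != NULL);
  for ( ; i < npairs; i++)
  { if (strcmp(pairs[i].uml, uml) == 0) { return pairs[i].c; } }
  return NULL;
}

/* Inverse direction; NULL if the C name is not in the table. */
const char* c_to_uml(const char* c)
{ size_t i = 0;
  assert(c != NULL);
  for ( ; i < npairs; i++)
  { if (strcmp(pairs[i].c, c) == 0) { return pairs[i].uml; } }
  return NULL;
}

/* 1 if mapping forward and then backward returns the original name. */
int round_trips(const char* uml)
{ const char* c = uml_to_c(uml);
  const char* back = (c == NULL) ? NULL : c_to_uml(c);
  return back != NULL && strcmp(back, uml) == 0;
}
```

The round trip succeeds only because no two UML names share a C name in the table; this is the property that makes the forward rules invertible.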
formation constraints are of type 1 (bounded loops). F2 was ensured by checking the generated
syntax against the C standard. F3 was ensured by constructing a semantic model SemC (p) of
the generated C programs p (in first-order set theory) and checking that this is equivalent to the
corresponding mathematical model SemUML (m) of UML/OCL models m [8], when p is generated
from m. F4 is ensured by the use of the Auxiliary Correspondence Model pattern, as is the bx
requirement F5. NF2 was checked for the uml2Ca release and found to be satisfied (Table 4).
In this iteration, the specification effort included construction of the C metamodels for use
in subsequent iterations, and the verification effort included verification of the ocl.h library. The
transformation size was 10 rules and 5 operations. A further medium level non-functional require-
ment was added during this iteration:
• NF5: Usability of the transformation – it should be a simple process to invoke it from the
UML-RSDS toolset interface.
Scenario   UML element e                   C representation e'
F1.2.1     Class diagram D                 C program with D's name
F1.2.2     Class E                         struct E { ... };
                                           Global variable struct E** e_instances;
                                           Global variable int e_size;
                                           struct E* createE(void) operation
                                           struct E** appendE(struct E**, struct E*) operation
                                           struct E** insertE(struct E**, struct E*) operation
                                           struct E** newEList(void) operation
F1.2.3.1   Instance property p : T         Member T' p; of the struct for p's owner,
           (not principal identity         where T' represents T
           attribute)                      Operations T' getE_p(E' self)
                                           and setE_p(E' self, T' px)
F1.2.3.2   Principal identity attribute    Operations getE_p, setE_p,
           p : String of class E           struct E* getEByPK(char* v)
                                           Key member char* p; of the struct for E
F1.2.4     Operation op(p : P) : T of E    C operation
           (non-static)                    T' op_E(E' self, P' p)
                                           with scope = “entity”
F1.2.5     Inheritance of A by B           Member struct A* super; of struct B
                                           Operations getB_att(x) for inherited att
                                           invoke getA_att(x->super), etc.
conditionals; (ii) embedded superclass struct instance in each subclass struct, and function point-
ers for each supported method; (iii) as (ii) but with vtables for function pointers; (iv) Objective-C
style metaprogramming of classes and objects.
We chose option (i) as the simplest possible scheme, with the intention to move to (ii) or (iii)
in future releases. Option (iv) was rejected because the resulting code is excessively complex and
quite distant in style from conventional C coding.
Global variables are elements of the variables list of the CProgram instance representing the
class diagram.
All of these local mapping requirements F1.2.1 to F1.2.5 are of high priority. As with iteration
1, bx properties are of high priority for this subtransformation.
The scenarios for this transformation are more complex, and were considered during evaluation
and negotiation, and during requirements specification. The scenarios included:
• F1.2.2, F1.2.3.1, F1.2.3.2: simple and primary key attributes of a single class; access and
update to these, and creation of class instances. Consideration of this scenario led to sim-
plification of the representation of primary keys. It was noted that for future improvement,
a hash-based or BST-based key lookup should be used.
• F1.2.4: operations with the same name and parameters but in different classes. Functions
must have distinct names in C, so it was decided to use distinct names op_E, op_F, etc., if
an operation op appears in several classes E, F. In this case the name op itself can be used
for a polymorphic implementation op(void* self ...) which selects the most-specific op_E
based on the actual class of its self argument.
• F1.2.5: a typical situation where client class A has a many-many association with ordered
role br at an (abstract) superclass B of concrete classes C and D. Navigations of the form
br[i].x for a property x of B must be possible. This was confirmed: the navigation would be
getB_x(br[i − 1]). However, using option (i) for modelling inheritance, C and D elements
can only be stored in br by ‘upcasting’ them: br = appendB(br, cx->super), and likewise
for D. This means that C and D instances are treated as B instances when in br.
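A minimal sketch of the F1.2.5 scenario may make the upcasting concrete. It assumes the representation of inheritance by a super pointer member; the attributes x, y and z and the helper x_at are hypothetical, and a fixed array stands in for the result of the appendB calls:

```c
#include <assert.h>
#include <stddef.h>

struct B { int x; };                      /* abstract superclass */
struct C { struct B* super; int y; };     /* concrete subclass */
struct D { struct B* super; double z; };  /* concrete subclass */

int getB_x(struct B* self) { return self->x; }

/* Inherited getters delegate through the super member. */
int getC_x(struct C* self) { return getB_x(self->super); }
int getD_x(struct D* self) { return getB_x(self->super); }

/* OCL br[i].x, with 1-based i, becomes getB_x(br[i - 1]). */
int x_at(struct B** br, int i)
{ assert(br != NULL && i >= 1);
  return getB_x(br[i - 1]);
}
```

A role br that contains a C instance cx and a D instance dx would then hold { cx->super, dx->super, NULL }: only the B parts are stored, so the C-specific member y and the D-specific member z are not reachable from br.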
The detailed form of the createE operation was determined: for classes without primary keys, this
operation has no parameter and has the form:
struct E* createE(void)
{ struct E* result = (struct E*) malloc(sizeof(struct E));
/* initialisations of E members */
e_instances = appendE(e_instances, result);
e_size++;
return result;
}
The initialisations are the standard defaults: 0 for numerics, the empty string for char*, NULL
for roles, FALSE for booleans. If a super member is present, this is initialised by calling createF()
for its class F:
CMember::
query initialiser() : String
post:
(isKey = true => result = "") &
(name = "super" =>
result = " result->super = create" + type.pointsTo.name + "();\n") &
(type : CPointerType or type : CArrayType =>
result = " result->" + name + " = NULL;\n") &
(type : CPrimitiveType =>
result = " result->" + name + " = 0;\n")
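As an illustration, the following shows what createE and its initialisations might look like for a hypothetical class Account with an integer attribute balance and a many-valued role linked. The body of appendAccount is an assumption (the generated appendE is not shown in this section), as is the added check on the result of malloc:

```c
#include <assert.h>
#include <stdlib.h>

struct Account { int balance; struct Account** linked; };

struct Account** account_instances = NULL;
int account_size = 0;

/* Assumed behaviour of appendE: grow the NULL-terminated array by one
   slot and return the (possibly moved) array. On allocation failure
   the old array is returned unchanged. */
struct Account** appendAccount(struct Account** col, struct Account* ex)
{ int n = 0;
  struct Account** result;
  if (col != NULL) { while (col[n] != NULL) { n++; } }
  result = (struct Account**) realloc(col, (n + 2) * sizeof(struct Account*));
  if (result == NULL) { return col; }
  result[n] = ex;
  result[n + 1] = NULL;
  return result;
}

struct Account* createAccount(void)
{ struct Account* result = (struct Account*) malloc(sizeof(struct Account));
  if (result == NULL) { return NULL; }   /* check added in this sketch */
  result->balance = 0;                   /* CPrimitiveType: 0 */
  result->linked = NULL;                 /* CArrayType: NULL */
  account_instances = appendAccount(account_instances, result);
  assert(account_instances != NULL);
  account_size++;
  return result;
}
```

Each call registers the new object in account_instances, so that Account.allInstances can later be read directly from that global array.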
struct E** result = (struct E**) calloc(n+1, sizeof(struct E*));
int i = 0;
int j = 0;
for ( ; i < n; i++)
{ char* attv = col[i];
struct E* ex = getEByPK(attv);
if (ex != NULL)
{ result[j] = ex; j++; }
}
result[j] = NULL;
return result;
}
The mappings were formalised as UML-RSDS rules. The corresponding part of genCtext was
also written, enabling complete C programs to be produced: operations printProgramHeader (),
printDeclarations(), printOperations() and printMainOperation() of CProgram display the text of
these C program parts.
For F1.2.2, the definition of structs for entities is carried out by types2C. The global variables
are created by the following constraint of printDeclarations():
CStruct::
("struct " + name + "** " + name.toLowerCase() + "_instances = NULL;")->display() &
("int " + name.toLowerCase() + "_size = 0;")->display()
Getters and setters are created by getterOp, setterOp, getAllOp:
CMember::
query getterOp(ent : String) : String
post:
result =
type + " get" + ent + "_" + name +
"(struct " + ent + "* self) { return self->" + name + "; }\n"
CMember::
query inheritedGetterOp(ent : String, sup : String) : String
post:
(name /= "super" =>
result =
type + " get" + ent + "_" + name +
"(struct " + ent + "* self) { return get" +
sup + "_" + name + "(self->super); }\n") &
(name = "super" =>
result = self.ancestorGetterOps(ent,sup) )
CMember::
query inheritedGetterOps(ent : String) : String
pre: type : CPointerType &
type.pointsTo : CStruct
post:
sup = type.pointsTo &
result = sup.members->collect( m | m.inheritedGetterOp(ent, sup.name) )->sum()
CMember::
query ancestorGetterOps(ent : String, sup : String) : String
pre: type : CPointerType &
type.pointsTo : CStruct
post:
anc = type.pointsTo &
result = anc.members->collect( m | m.inheritedGetterOp(ent, sup) )->sum()
CMember::
query setterOp(ent : String) : String
post:
result =
"void set" + ent + "_" + name +
"(struct " + ent + "* self, " + type + " _value) { self->" + name + " = _value; }\n"
CMember::
query inheritedSetterOp(ent : String, sup : String) : String
post:
(name /= "super" =>
result =
"void set" + ent + "_" + name +
"(struct " + ent + "* self, " + type + " _value) { set" +
sup + "_" + name + "(self->super, _value); }\n") &
(name = "super" =>
result = self.ancestorSetterOps(ent,sup) )
CMember::
query inheritedSetterOps(ent : String) : String
pre: type : CPointerType &
type.pointsTo : CStruct
post:
sup = type.pointsTo &
result = sup.members->collect( m | m.inheritedSetterOp(ent, sup.name) )->sum()
CMember::
query ancestorSetterOps(ent : String, sup : String) : String
pre: type : CPointerType &
type.pointsTo : CStruct
post:
anc = type.pointsTo &
result = anc.members->collect( m | m.inheritedSetterOp(ent, sup) )->sum()
CMember::
query getAllOp(ent : String) : String
pre: type : CPrimitiveType
post:
result = type + "* getAll" + ent + "_" + name + "(struct " + ent + "* col[])\n" +
"{ int n = length((void**) col);\n" +
" " + type + "* result = (" + type + "*) calloc(n, sizeof(" + type + "));\n" +
" int i = 0;\n" +
" for ( ; i < n; i++)\n" +
" { result[i] = get" + ent + "_" + name + "(col[i]); }\n" +
" return result;\n" +
"}\n"
CMember::
query getAllOp1(ent : String) : String
pre: type : CPointerType
post:
result = type + "* getAll" + ent + "_" + name + "(struct " + ent + "* col[])\n" +
"{ int n = length((void**) col);\n" +
" " + type + "* result = (" + type + "*) calloc(n+1, sizeof(" + type + "));\n" +
" int i = 0;\n" +
" for ( ; i < n; i++)\n" +
" { result[i] = get" + ent + "_" + name + "(col[i]); }\n" +
" result[n] = NULL;\n" +
" return result;\n" +
"}\n"
CMember::
query inheritedAllOp(ent : String, sup : String) : String
post:
(name /= "super" & type : CPrimitiveType =>
result = getAllOp(ent)) &
(name /= "super" & type : CPointerType =>
result = getAllOp1(ent)) &
(name = "super" =>
result = self.ancestorAllOps(ent,sup) )
CMember::
query ancestorAllOps(ent : String, sup : String) : String
pre: type : CPointerType &
type.pointsTo : CStruct
post:
anc = type.pointsTo &
result = anc.members->collect( m | m.inheritedAllOp(ent, sup) )->sum()
CMember::
query inheritedAllOps(ent : String) : String
pre: type : CPointerType &
type.pointsTo : CStruct
post:
sup = type.pointsTo &
result = sup.members->collect( m | m.inheritedAllOp(ent, sup.name) )->sum()
CMember::
query getPKOp(ent : String) : String
post:
e = ent.toLowerCase &
result =
"struct " + ent + "* get" + ent + "ByPK(char* ex)\n" +
"{ int n = length((void**) " + e + "_instances);\n" +
" int i = 0;\n" +
" for ( ; i < n; i++)\n" +
" { char* attv = get" + ent + "_" + name + "(" + e + "_instances[i]);\n" +
" if (attv != NULL && strcmp(attv,ex) == 0)\n" +
" { return " + e + "_instances[i]; }\n" +
" }\n" +
" return NULL;\n" +
"}\n"
CStruct::
query getPKsOp() : String
post:
result = "struct " + name + "** get" + name + "ByPKs(char* col[])\n" +
"{ int n = length((void**) col);\n" +
" struct " + name + "** result = (struct " + name +
"**) calloc(n+1, sizeof(struct " + name + "*));\n" +
" int i = 0; \n" +
" int j = 0; \n" +
" for ( ; i < n; i++)\n" +
" { char* attv = col[i];\n" +
" struct " + name + "* ex = get" + name + "ByPK(attv);\n" +
" if (ex != NULL) \n" +
" { result[j] = ex; j++; }\n" +
" }\n" +
" result[j] = NULL; \n" +
" return result;\n" +
"}\n"
Here, the Text Templates pattern is used to combine fixed text portions with variable
model-derived elements. The Auxiliary Metamodel pattern is used to introduce an auxiliary association
allMembers from CStruct to CMember, which records all C members explicitly in or inherited by
a struct.
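The Text Templates pattern has a direct analogue in C using snprintf. The sketch below is not part of genCtext (which is specified in UML-RSDS); it reproduces the text built by getterOp, and the function name getter_text is hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the text of a getter for member `name` of type `type` in
   struct `ent` into buf. The format string is the fixed part of the
   template; the arguments are the model-derived parts.
   Returns 1 if the text fitted into buf, 0 otherwise. */
int getter_text(char* buf, size_t size,
                const char* type, const char* ent, const char* name)
{ int n;
  assert(buf != NULL && size > 0);
  n = snprintf(buf, size,
        "%s get%s_%s(struct %s* self) { return self->%s; }\n",
        type, ent, name, ent, name);
  return n >= 0 && (size_t) n < size;
}
```

For type "int", entity "Person" and member "age", the buffer receives the single line int getPerson_age(struct Person* self) { return self->age; } followed by a newline.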
The text generation operations are then used in the postconditions of the genCtext use case:
CStruct::
f : members & f.name /= "super" => f.getterOp(name)->display()
CStruct::
f : members & f.name = "super" => f.inheritedGetterOps(name)->display()
CStruct::
members->exists( k | k.isKey ) & key = members->select( isKey )->any() =>
key.getPKOp(name)->display() & self.getPKsOp()->display()
CStruct::
f : members & f.name /= "super" => f.setterOp(name)->display()
CStruct::
f : members & f.name = "super" => f.inheritedSetterOps(name)->display()
CStruct::
f : members & f.type : CPrimitiveType => f.getAllOp(name)->display()
CStruct::
f : members & f.name /= "super" & f.type : CPointerType => f.getAllOp1(name)->display()
CStruct::
f : members & f.name = "super" & f.type : CPointerType => f.inheritedAllOps(name)->display()
CStruct::
members->exists( k | k.isKey ) & key = members->select( isKey )->any() =>
self.createPKOp(name, key.name)->display()
CStruct::
true =>
self.createOp(name)->display()
This use case also generates the createE and newEList operations, and other operations specific
to E, such as collectE, selectE, rejectE, intersectionE, unionE, reverseE, frontE, tailE, asSetE,
concatenateE, removeE, removeAllE, subrangeE, isUniqueE, insertAtE, etc:
CStruct::
query createOp(ent : String) : String
post:
einst = ent.toLowerCase + "_instances" &
result = "struct " + ent + "* create" + ent + "(void)\n" +
"{ struct " + ent + "* result = (struct " + ent +
"*) malloc(sizeof(struct " + ent + "));\n" +
members->collect( m | m.initialiser() )->sum() +
" " + einst + " = append" + ent + "(" + einst + ", result);\n" +
" " + ent.toLowerCase + "_size++;\n" +
" return result;\n" +
"}\n"
CStruct::
query createPKOp(ent : String, key : String) : String
post:
einst = ent.toLowerCase + "_instances" &
result = "struct " + ent + "* create" + ent + "(char* _value)\n" +
"{ struct " + ent + "* result = NULL;\n" +
" result = get" + ent + "ByPK(_value);\n" +
" if (result != NULL) { return result; }\n" +
" result = (struct " + ent + "*) malloc(sizeof(struct " + ent + "));\n" +
members->collect( m | m.initialiser() )->sum() +
" set" + ent + "_" + key + "(result, _value);\n" +
" " + einst + " = append" + ent + "(" + einst + ", result);\n" +
" " + ent.toLowerCase + "_size++;\n" +
" return result;\n" +
"}\n"
Selective generation of OCL operations is used, so that operations opE are only generated if there
is an occurrence of →op applied to a collection of E elements in the source model. This facilitates
achieving NF7.
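For a hypothetical class Customer with principal identity attribute id, the text produced by the getPKOp and createPKOp templates would look roughly as follows. Two simplifications are assumptions of this sketch: customer_size is used as the loop bound instead of length((void**) customer_instances), and the call to appendCustomer is replaced by an inline realloc with an added failure check:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Customer { char* id; int points; };

struct Customer** customer_instances = NULL;
int customer_size = 0;

char* getCustomer_id(struct Customer* self) { return self->id; }
void setCustomer_id(struct Customer* self, char* _value) { self->id = _value; }

/* Linear search over all instances, as in the getPKOp template. */
struct Customer* getCustomerByPK(char* ex)
{ int i = 0;
  for ( ; i < customer_size; i++)
  { char* attv = getCustomer_id(customer_instances[i]);
    if (attv != NULL && strcmp(attv, ex) == 0)
    { return customer_instances[i]; }
  }
  return NULL;
}

/* Lookup-or-create, following the createPKOp template. */
struct Customer* createCustomer(char* _value)
{ struct Customer* result;
  struct Customer** grown;
  assert(_value != NULL);
  result = getCustomerByPK(_value);
  if (result != NULL) { return result; }
  result = (struct Customer*) malloc(sizeof(struct Customer));
  if (result == NULL) { return NULL; }
  result->points = 0;
  setCustomer_id(result, _value);
  /* Stand-in for appendCustomer: grow the NULL-terminated array. */
  grown = (struct Customer**) realloc(customer_instances,
            (customer_size + 2) * sizeof(struct Customer*));
  if (grown == NULL) { free(result); return NULL; }
  grown[customer_size] = result;
  grown[customer_size + 1] = NULL;
  customer_instances = grown;
  customer_size++;
  return result;
}
```

Calling createCustomer twice with the same key returns the same object, so the key remains unique among the instances; the linear search is the part that a hash-based or BST-based lookup would replace.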
The attributes (and association ends) owned by a class are mapped to members of its corre-
sponding struct (F1.2.3):
Entity::
CStruct->exists( c | c.name = name &
ownedAttribute->forAll( p | p.name.size > 0 =>
CMember->exists( m | m.name = p.name & m.isKey = p.isUnique &
m.type = CType[p.type.typeId] & m : c.members ) ) )
This constraint is in an invertible form according to [6]. Both single-valued and multi-valued
attributes are mapped correctly by this scheme. However, static attributes would need to be
represented as global scope C variables external to the struct definition of their entity.
F1.2.4 is specified by:
Operation::
COperation->exists( op | op.name = name + "_" + owner.name &
op.opId = name + "_" + owner.name &
CVariable->exists( p | p.name = "self" &
p : op.parameters & p.kind = "parameter" & p.type = CType[owner.typeId] ) &
parameters->forAll( x | CVariable->exists( y | y.name = x.name &
y.kind = "parameter" &
y.type = CType[x.type.typeId] &
y : op.parameters ) ) &
op.isQuery = isQuery &
op.isStatic = isStatic &
op.scope = "entity" &
op.returnType = CType[type.typeId] )
For a static operation, the self parameter is omitted. Update operations are given a void result
type in the UML model data file model.txt (if they do not have a specific result type). To
support bx properties, a new attribute isQuery needs to be introduced to the C metamodel class
COperation and set as above. To support lookup of COperations in the mapping of activities to
C, a new identity attribute opId is introduced.
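The naming scheme can be illustrated with two hypothetical classes Circle and Square that both declare a query operation area, together with a static operation unit and an update operation scale of Square. The operation bodies are invented for the example; only the names and parameter lists follow the mapping above:

```c
#include <assert.h>
#include <stddef.h>

struct Circle { double radius; };
struct Square { double side; };

/* Circle::area() : double, non-static, so self is the first parameter. */
double area_Circle(struct Circle* self)
{ assert(self != NULL);
  return 3.14159265358979 * self->radius * self->radius;
}

/* Square::area() : double has the same UML name but a distinct C name. */
double area_Square(struct Square* self)
{ assert(self != NULL);
  return self->side * self->side;
}

/* static Square::unit() : double, so the self parameter is omitted. */
double unit_Square(void)
{ return 1.0; }

/* Update operation Square::scale(k : double) with no declared result,
   so its C result type is void. */
void scale_Square(struct Square* self, double k)
{ assert(self != NULL);
  self->side = self->side * k;
}
```

An OCL call sq.area() on a Square instance then becomes area_Square(sq'), and the static call Square.unit() becomes unit_Square().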
If an inheritance exists from entity E to entity F , then an additional member of type struct
F* is inserted into the struct for E (F1.2.5):
Generalization::
CMember->exists( m | m.name = "super" &
CStruct->exists( sub | sub.ctypeId = specific.name &
m : sub.members & m.type = CPointerType[general.typeId] ) )
and:
CMember::
sub : CStruct & name = "super" & self : sub.members =>
Generalization->exists( g | g.specific.name = sub.ctypeId &
g.general = Entity[type.ctypeId] )
CStruct::
Entity->exists( e | e.name = name &
members->forAll( m |
Property->exists( p | p.name = m.name & p.isUnique = m.isKey &
p.type = Type[m.type.ctypeId] & p : e.ownedAttribute ) ) )
Stage                     Effort (person days)
Req. Elicitation          2
Eval./Negotiation         1
Specification             12
Review/Validation         12
Implementation/Testing    10
Total                     37
as output. Statement and OCL data is implicitly copied from the source to the target model by
the model loading/model saving mechanisms, an example of the Implicit Copy pattern.
The completed prototype after iterations 1 and 2 has been implemented as a jar file uml2Ca.jar
at www.dcs.kcl.ac.uk/staff/kcl/uml2Ca/. This reads an input file model.txt, produced by the Save
As Model option of UML-RSDS. The C code is written to a file app.h:
Figure 6: UML-RSDS OCL metamodel
OCL expression e                        C representation e'
self                                    self as an operation parameter
super                                   self->super
Variable v                              v
  or v[ind]                             v[ind' - 1]
Data feature f of context E             self->f (E is root)
  with no objectRef                     getE_f(self) (otherwise)
E data feature f                        ex'->f (E is root)
  of instance ex                        getE_f(ex') (otherwise)
Operation call op(e1, ..., en)          op_E(self, e1', ..., en')
  or obj.op(e1, ..., en) of             op_E(obj', e1', ..., en')
  instance entity scope op of E
Call op(e1, ..., en) of                 op(e1', ..., en')
  static/application scope op
E attribute f                           getAllE_f(exs')
  of collection exs                     (duplicate values preserved)
Single-valued role r : F                getAllE_r(exs') defined by
  of E collection exs                   (struct F**) collectE(exs', getE_r)
col[ind]                                (col')[ind' - 1]
  ordered collection col
E[v]                                    getEByPK(v')
  v single-valued
E[vs]                                   getEByPKs(vs')
  vs collection-valued
E.allInstances                          e_instances
value of enumerated type,               value
  numeric or string value
boolean true, false                     TRUE, FALSE
Not included are objs.r for collection-valued r and objs, or operation applications objs.op(pars)
on collections: the latter can be expressed instead using forAll or collect.
x.oclAsType(T) needs to be considered separately. For T as long, int or double, a C cast can
be used. For String, a cast to char* can be used. For an entity type, navigation using the super
member will be necessary. The decision was made to omit this operator from the first version of
the deliverable.
CExpression::
static defineCOpRef(op : COperation) : CExpression
post:
CBasicExpression->exists( be | be.cexpId = op.name + "_ref" &
be.data = op.name &
be.type = op.returnType &
be.elementType = op.elementType & result = be )
CExpression::
static defineCOpRefCast(op : COperation, cst : String) : CExpression
post:
CBasicExpression->exists( be | be.cexpId = op.name + "_ref" &
be.data = op.name &
be.type = op.returnType &
be.elementType = op.elementType & result = Expression.cast(cst, be) )
CExpression::
static defineCOpReference(op : String, typ : String) : CExpression
post:
CBasicExpression->exists( be | be.cexpId = op + "_ref" &
be.data = op &
CPrimitiveType->exists( t | t.name = typ & t.ctypeId = op + "_" + typ &
be.type = t & be.elementType = t & result = be ) )
Note that the auxiliary operations have only a single parameter. This means that the mapping of
forAll, select, etc. is only supported where the iterator/collect RHS expressions depend on a single
variable. The alternative (used in the Java and other translators) is to create a specialised iterator
operation for each different use of an iterator operation.
The auxiliary operations created for iterator expressions are not mapped back to UML via the
inverse transformation.
OCL expression e                 C representation e′
x : E                            isIn((void*) x′, (void**) e_instances)
  E entity type
x : s                            isIn((void*) x′, (void**) s′)
  s collection
s->includes(x)                   Same as x : s
  s collection
x /: E                           !isIn((void*) x′, (void**) e_instances)
  E entity type
x /: s                           !isIn((void*) x′, (void**) s′)
  s collection
s->excludes(x)                   Same as x /: s
  s collection
x = y
  Numerics, booleans             x′ == y′
  Strings                        strcmp(x′, y′) == 0
  objects                        x′ == y′
  Sets                           equalsSet((void**) x′, (void**) y′)
  Sequences                      equalsSequence((void**) x′, (void**) y′)
x < y
  numerics                       x′ < y′
  Strings                        strcmp(x′, y′) < 0
Similarly for >, <=, >=          >, <=, >=
/=                               !=
s <: t                           containsAll((void**) t′, (void**) s′)
  s, t collections
s /<: t                          !containsAll((void**) t′, (void**) s′)
  s, t collections
t->includesAll(s)                Same as s <: t
t->excludesAll(s)                disjoint((void**) t′, (void**) s′)
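The library functions named in this table might be implemented along the following lines. This is only a sketch under two assumptions that the text does not state explicitly: object collections are NULL-terminated arrays of pointers, and membership is decided by pointer identity (matching the x′ == y′ rule for objects). The function bodies are illustrative and are not taken from ocl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements before the NULL terminator (assumed representation). */
int length(void* col[]) {
    int n = 0;
    assert(col != NULL);
    while (col[n] != NULL) n++;
    return n;
}

/* x : s, decided by pointer identity. */
int isIn(void* x, void* col[]) {
    int i;
    for (i = 0; col[i] != NULL; i++)
        if (col[i] == x) return 1;
    return 0;
}

/* s <: t is containsAll(t', s'): every element of sub occurs in sup. */
int containsAll(void* sup[], void* sub[]) {
    int i;
    for (i = 0; sub[i] != NULL; i++)
        if (!isIn(sub[i], sup)) return 0;
    return 1;
}

/* t->excludesAll(s): the two collections share no element. */
int disjoint(void* a[], void* b[]) {
    int i;
    for (i = 0; a[i] != NULL; i++)
        if (isIn(a[i], b)) return 0;
    return 1;
}
```

Note the argument order of containsAll: the containing collection comes first, which is why s <: t passes t′ before s′.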
Expression e C translation e ′
x + y concatenateStrings(x’, y’)
x->size() strlen(x’)
x->first() firstString(x’) defined as subString(x’,1,1)
x->front() frontString(x’) defined as subString(x’, 1, strlen(x’)-1)
x->last() lastString(x’) defined as subString(x’, strlen(x’), strlen(x’))
x->tail() tailString(x’) defined as subString(x’, 2, strlen(x’))
x.subrange(i,j) subString(x’, i’, j’)
x[i] subString(x’,i’,i’)
x->toLowerCase() toLowerCase(x’)
x->toUpperCase() toUpperCase(x’)
s->indexOf(x) indexOfString(s’,x’)
s->hasPrefix(x) startsWith(s’,x’)
s->hasSuffix(x) endsWith(s’,x’)
s->characters() characters(s’)
s.insertAt(i,s1) insertAtString(s’,i’,s1’)
s->count(s1) countString(s1’, s’)
single character s1
s->reverse() reverseString(s’)
e->display() displayString(e’) defined as printf("%s\n", e’)
for String-valued e,
displayNumeric(e’) defined as printf("%d\n", e’)
or printf("%f\n", e’) for numeric e
s1 - s2 subtractString(s1’, s2’)
e->isInteger() –
e->isReal() –
e->toInteger() atoi(e’)
e->toReal() atof(e’)
e->toLong() atol(e’)
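Several of the String operations in this table are defined in terms of subString with 1-based, inclusive indexes. A possible shape for these definitions is sketched below. The bodies are assumptions for illustration (in particular the clipping of out-of-range indexes and the ownership of the returned memory), not the library code itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Characters i..j of s, 1-based and inclusive, as in OCL.
   Out-of-range bounds are clipped; the caller frees the result. */
char* subString(const char* s, int i, int j) {
    int n, len;
    char* res;
    assert(s != NULL);
    n = (int) strlen(s);
    if (i < 1) i = 1;
    if (j > n) j = n;
    len = (j >= i) ? j - i + 1 : 0;
    res = malloc((size_t) len + 1);
    if (res == NULL) return NULL;
    if (len > 0) memcpy(res, s + i - 1, (size_t) len);
    res[len] = '\0';
    return res;
}

/* The derived operations follow the definitions given in the table. */
char* firstString(const char* x) { return subString(x, 1, 1); }
char* frontString(const char* x) { return subString(x, 1, (int) strlen(x) - 1); }
char* lastString(const char* x)  { return subString(x, (int) strlen(x), (int) strlen(x)); }
char* tailString(const char* x)  { return subString(x, 2, (int) strlen(x)); }
```

The shift from 1-based OCL indexes to 0-based C indexes happens once, inside subString, so the derived operations can be written exactly as in the table.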
Expression e C translation e’
Set{} newEList()
Sequence{} newEList()
Set{x1, x2, ..., xn} insertE(... insertE(newEList(), x1’), ..., xn’)
Sequence{x1, x2, ..., xn} appendE(... appendE(newEList(), x1’), ..., xn’)
s->size() length((void**) s’)
s->including(x) insertE(s’,x’) or appendE(s’,x’)
s->excluding(x) removeE(s’,x’)
s - t removeAllE(s’,t’)
s->prepend(x) prependE(s’,x’)
s->append(x) appendE(s’,x’)
s->count(x) count((void*) x’, (void**) s’)
s->indexOf(x) indexOf((void*) x’, (void**) s’)
x \/ y unionE(x’,y’)
x /\ y intersectionE(x’,y’)
x ^ y concatenateE(x’,y’)
x->union(y) unionE(x’,y’)
x->intersection(y) intersectionE(x’, y’)
x->unionAll(e) –
x->intersectAll(e) –
x->symmetricDifference(y) –
x->any() (x’)[0]
x->at(i) (x’)[i’-1]
x->subcollections() –
x->reverse() reverseE(x’)
x->front() frontE(x’) defined as subrangeE(x’,1,length((void**) x’)-1)
x->tail() tailE(x’) defined as subrangeE(x’,2,length((void**) x’))
x->first() firstE(x’) defined as x’[0]
x->last() lastE(x’) defined as x’[length((void**) x’)-1]
x->sort() (struct E**) treesort((void**) x’, compareTo_E)
x of entity element type E
(char**) treesort((void**) x’, compareTo_String)
x of String element type
x->sortedBy(e) (struct E**) treesort((void**) x’, comparee)
comparee defines e-order
x->sum() sumString(x’,n), sumint(x’,n), sumlong(x’,n), sumdouble(x’,n)
n is length of x
x->prd() prdint(x’,n), prdlong(x’,n), prddouble(x’,n)
n is length of x
Integer.Sum(a,b,x,e) intSum(a’,b’,fe), longSum(a’,b’,fe), doubleSum(a’,b’,fe)
fe computes e’(x’)
Integer.Prd(a,b,x,e) intPrd(a’,b’,fe), longPrd(a’,b’,fe), doublePrd(a’,b’,fe)
x->max() maxint(x’,n), maxlong(x’,n),
maxdouble(x’,n), maxString(x’,n)
n is length of x.
x->min() minint(x’,n), minlong(x’,n),
mindouble(x’,n), minString(x’,n)
n is length of x.
x->asSet() asSetE(x’)
x->asSequence() x’
s->isUnique(e) isUniqueE(s’,fe)
x->isDeleted() killE(x’)
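The nested-call translation of literal collections can be illustrated with a small sketch. It assumes a hypothetical entity E, a NULL-terminated array representation for collections of objects, and realloc-based growth; none of these details is confirmed by the text, and the real newEList, appendE and insertE may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entity type. */
struct E {
    int id;
};

/* Number of elements before the NULL terminator (assumed representation). */
int length(void* col[]) {
    int n = 0;
    while (col[n] != NULL) n++;
    return n;
}

/* Set{} and Sequence{}: an array holding only the terminator. */
struct E** newEList(void) {
    struct E** res = malloc(sizeof *res);
    if (res == NULL) abort();
    res[0] = NULL;
    return res;
}

/* s->append(x): grows the array in place and returns it. */
struct E** appendE(struct E* col[], struct E* ex) {
    int n = length((void**) col);
    struct E** res;
    assert(ex != NULL);              /* NULL is reserved as terminator */
    res = realloc(col, (size_t) (n + 2) * sizeof *res);
    if (res == NULL) abort();
    res[n] = ex;
    res[n + 1] = NULL;
    return res;
}

/* s->including(x) for sets: adds ex only if it is not already present. */
struct E** insertE(struct E* col[], struct E* ex) {
    int i;
    for (i = 0; col[i] != NULL; i++)
        if (col[i] == ex) return col;
    return appendE(col, ex);
}
```

Under these assumptions, Sequence{a, b, a} becomes appendE(appendE(appendE(newEList(), &a), &b), &a) and keeps the duplicate, while the corresponding Set literal built with insertE does not.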
returns e evaluated for (struct E*) other - e evaluated for (struct E*) self. For String-valued e,
strcmp is used, and for objects, the appropriate compareTo_F operation.
A common form of OCL expression is the evaluation of a reduce operation (min, max, sum,
prd) applied to the result of a collect, eg.:
s→collect(e)→sum()
because it is not possible to find the length of a collection of primitive values. Likewise, s.att.sum
is mapped to sumdouble(getAllE_att(s′), length((void**) s′)). For a literal collection the length
can be directly determined and used.
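The need to pass the length explicitly can be seen in a short sketch. It assumes a hypothetical entity Item with a double-valued attribute price, and a NULL-terminated representation for collections of objects. The function bodies are illustrative only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entity with one numeric attribute. */
struct Item {
    double price;
};

/* Object collections are assumed to be NULL-terminated pointer arrays. */
int length(void* col[]) {
    int n = 0;
    while (col[n] != NULL) n++;
    return n;
}

/* s.price on a collection s: one double per element, duplicates preserved.
   The result carries no terminator, because 0.0 is a legitimate value. */
double* getAllItem_price(struct Item* col[]) {
    int n = length((void**) col);
    int i;
    double* res = malloc((size_t) (n + 1) * sizeof *res);
    if (res == NULL) return NULL;
    for (i = 0; i < n; i++) res[i] = col[i]->price;
    return res;
}

/* Reduce operation: the number of elements is supplied by the caller. */
double sumdouble(double col[], int n) {
    double total = 0.0;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++) total += col[i];
    return total;
}
```

Here s.price->sum() would be evaluated as sumdouble(getAllItem_price(s), length((void**) s)): the length is taken from the source collection of objects, since the array of doubles cannot describe its own size.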
After evaluation and negotiation, it was decided that full implementation of delete should be
deferred, because of the complex semantics of data deletion in C. Integer.Sum and Integer.Prd
were also deferred. In addition, prototyping on different platforms revealed that compiler
differences made the use of qsort impractical, and instead a custom sorting algorithm, treesort,
was implemented. This has signature
void** treesort(void* col[], int (*comp)(void*, void*))
and the translation of x→sort() is then: (rt) treesort((void**) x’, comp) for the appropriate result
type rt and comparator function comp. The algorithm uses an underlying data structure of binary
search trees, which can be used in the future to support the definition of maps (for caching and
model input) and a collection type of sorted sets.
For sort, the entity type E must have a compareTo(other : E) : int operation defined; this will
become a function int compareTo_E(struct E* self, struct E* other) in C.
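A sorting function with the treesort signature can be built from a binary search tree roughly as follows. This is a sketch of the general technique, not the implementation used by the generator. It assumes NULL-terminated input and output arrays and a comparator that returns a negative value when its first argument should precede its second. The entity Person and the comparator body are hypothetical, and the comparator is declared with void* parameters here so that it matches the function pointer type without a cast.

```c
#include <assert.h>
#include <stdlib.h>

struct TreeNode {
    void* value;
    struct TreeNode* left;
    struct TreeNode* right;
};

/* Inserts v into the search tree rooted at t; equal values go right. */
static struct TreeNode* insertNode(struct TreeNode* t, void* v,
                                   int (*comp)(void*, void*)) {
    if (t == NULL) {
        struct TreeNode* n = malloc(sizeof *n);
        if (n == NULL) abort();
        n->value = v;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (comp(v, t->value) < 0)
        t->left = insertNode(t->left, v, comp);
    else
        t->right = insertNode(t->right, v, comp);
    return t;
}

/* In-order traversal into out starting at index i; frees the nodes. */
static int flatten(struct TreeNode* t, void** out, int i) {
    if (t == NULL) return i;
    i = flatten(t->left, out, i);
    out[i++] = t->value;
    i = flatten(t->right, out, i);
    free(t);
    return i;
}

/* Returns a new NULL-terminated array holding col in sorted order. */
void** treesort(void* col[], int (*comp)(void*, void*)) {
    struct TreeNode* root = NULL;
    void** res;
    int n = 0;
    assert(col != NULL && comp != NULL);
    while (col[n] != NULL) {
        root = insertNode(root, col[n], comp);
        n++;
    }
    res = malloc((size_t) (n + 1) * sizeof *res);
    if (res == NULL) abort();
    res[flatten(root, res, 0)] = NULL;
    return res;
}

/* Hypothetical entity and comparator, ordering by age. */
struct Person {
    int age;
};

int compareTo_Person(void* self, void* other) {
    int a = ((struct Person*) self)->age;
    int b = ((struct Person*) other)->age;
    return (a > b) - (a < b);
}
```

The input array is left unchanged and a new array is returned, which matches the way the translation wraps the call in a cast to the result type.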
Table 12 shows the translation of select and collect operators. These mappings are grouped as
requirement F1.3.7.
Table 12: Scenarios for the mapping of selection and collection expressions
Unlike the types and class diagram mappings, a recursive descent style of specification is
needed for the expressions mapping (and for activities). This is because the subordinate parts
of an expression are themselves expressions. Thus it is not possible in general to map all the
subordinate parts of an expression by prior rules: even for basic expressions, the parameters may
be general expressions. In contrast, the element types of collection types cannot themselves be
collection types or involve subparts that are collection types, so it is possible to map all element
types before considering collection types. A recursive descent style of mapping specification uses
operations of each source entity type to map instances of that type, invoking mapping operations
recursively to map subparts of the instances.
An operation
mapExpression() : CExpression
is defined in each Expression subclass. This is also an example of the Rule Inheritance pattern.
For each category of expression, the subparts of the expression are mapped to C first, and then
composed by a separate operation. For example:
BinaryExpression::
mapExpression() : CExpression
post:
result = mapBinaryExpression(
left.mapExpression(),
right.mapExpression())
UnaryExpression::
mapExpression() : CExpression
post:
result = mapUnaryExpression(
argument.mapExpression())
BasicExpression::
mapExpression() : CExpression
post:
result = mapBasicExpression(
objectRef.mapExpression(),
arrayIndex.mapExpression(),
parameters.mapExpression())
CollectionExpression::
mapExpression() : CExpression
post:
result = mapCollectionExpression(expId,
elements.mapExpression())
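The recursive descent idea, mapping the subparts first and then composing the results, can be shown with a simplified C sketch. Unlike the transformation described here, which builds CExpression model elements, this sketch emits C text directly for brevity. The Expr structure and the small operator table are assumptions made for the example; only the = and /= rows are taken from the tables in this paper, and the & row is a guess at the obvious C equivalent.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum ExprKind { BASIC, UNARY, BINARY };

/* Hypothetical, much reduced expression tree. */
struct Expr {
    enum ExprKind kind;
    const char* data;     /* identifier or literal for BASIC, else operator */
    struct Expr* left;    /* argument of UNARY, left operand of BINARY */
    struct Expr* right;   /* right operand of BINARY */
};

/* OCL operator to C operator; unlisted operators are kept unchanged. */
static const char* cop(const char* op) {
    if (strcmp(op, "=") == 0) return "==";
    if (strcmp(op, "/=") == 0) return "!=";
    if (strcmp(op, "&") == 0) return "&&";   /* assumed correspondence */
    return op;
}

/* Maps the subparts recursively, then composes the mapped results.
   Returns a newly allocated string that the caller frees. */
char* mapExpression(const struct Expr* e) {
    char* l;
    char* r;
    char* res;
    assert(e != NULL);
    switch (e->kind) {
    case BASIC:
        res = malloc(strlen(e->data) + 1);
        if (res == NULL) abort();
        strcpy(res, e->data);
        return res;
    case UNARY:
        l = mapExpression(e->left);
        res = malloc(strlen(cop(e->data)) + strlen(l) + 3);
        if (res == NULL) abort();
        sprintf(res, "%s(%s)", cop(e->data), l);
        free(l);
        return res;
    default: /* BINARY */
        l = mapExpression(e->left);
        r = mapExpression(e->right);
        res = malloc(strlen(l) + strlen(cop(e->data)) + strlen(r) + 5);
        if (res == NULL) abort();
        sprintf(res, "(%s %s %s)", l, cop(e->data), r);
        free(l);
        free(r);
        return res;
    }
}
```

The recursion depth is bounded by the depth of the expression tree, which is the same argument used later for termination of the transformation.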
For each category of expression, the mapping is further decomposed into cases:
BinaryExpression::
mapBinaryExpression(lexp : CExpression,
rexp : CExpression) : CBinaryExpression
pre:
lexp = CExpression[left.expId] &
rexp = CExpression[right.expId]
post:
CBinaryExpression->exists( c | c.cexpId = expId &
c.operator = Expression.cop(operator) &
c.left = lexp & c.right = rexp &
c.needsBracket = needsBracket &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
result = c )
UnaryExpression::
mapUnaryExpression(arg : CExpression) : CExpression
pre:
arg = CExpression[argument.expId]
post:
CUnaryExpression->exists( c | c.cexpId = expId &
c.operator = Expression.cop(operator) &
c.argument = arg &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
result = c )
BasicExpression::
query mapBasicExpression(ob : Set(CExpression),
aind : Set(CExpression),
pars : Sequence(CExpression)) : CExpression
pre:
ob = CExpression[objectRef.expId] &
aind = CExpression[arrayIndex.expId] &
pars = CExpression[parameters.expId]
post:
(umlKind = value =>
result = mapValueExpression(ob,aind,pars)) &
(umlKind = variable =>
result = mapVariableExpression(ob,aind,pars)) &
(umlKind = attribute =>
result = mapAttributeExpression(ob,aind,pars)) &
(umlKind = role =>
result = mapRoleExpression(ob,aind,pars)) &
(umlKind = operation =>
result = mapOperationExpression(ob,aind,pars)) &
(umlKind = classid =>
result = mapClassExpression(ob,aind,pars)) &
(umlKind = function =>
result = mapFunctionExpression(ob,aind,pars)) &
(true => result = mapVariableExpression(ob,aind,pars))
CollectionExpression::
query mapCollectionExpression(id : String, elems : Sequence(CExpression)) : CExpression
post:
(elems.size = 0 =>
result = createCOpCall(id, "new" + elementType.name + "List") ) &
(elems.size > 0 & type.name = "Set" =>
result = createCBinOpCall(id, "insert" + elementType.name,
mapCollectionExpression(id + "_f", elems.front), elems.last) ) &
(elems.size > 0 & type.name = "Sequence" =>
result = createCBinOpCall(id, "append" + elementType.name,
mapCollectionExpression(id + "_f", elems.front), elems.last) )
This style of specification involves the use of update operations that also return results (or query
operations that create objects and have side-effects), which is considered undesirable. Such
operations cannot be translated into the B formalism for verification. The operation precondition
asserts that the parameters correspond to the sub-parts of the basic expression. The kind attribute
records the origin of the C expression. This enables an inverse operation to be defined, eg.:
CBinaryExpression::
mapCBinaryExpression(lexp : Expression,
rexp : Expression) : BinaryExpression
pre:
lexp = Expression[left.cexpId] &
rexp = Expression[right.cexpId]
post:
BinaryExpression->exists( c | c.expId = cexpId &
c.operator = CExpression.uop(operator) &
c.left = lexp & c.right = rexp &
c.type = Type[type.ctypeId] &
c.elementType = Type[elementType.ctypeId] &
result = c )
CUnaryExpression::
mapCUnaryExpression(arg : Expression) : Expression
pre:
arg = Expression[argument.cexpId]
post:
UnaryExpression->exists( c | c.expId = cexpId &
c.operator = CExpression.uop(operator) &
c.argument = arg &
c.type = Type[type.ctypeId] &
c.elementType = Type[elementType.ctypeId] &
result = c )
CBasicExpression::
query mapCBasicExpression(ob : Set(Expression),
aind : Set(Expression),
pars : Sequence(Expression)) : Expression
pre:
ob = Expression[reference.cexpId] &
aind = Expression[arrayIndex.cexpId] &
pars = Expression[parameters.cexpId]
post:
(kind = "value" =>
result = mapCValueExpression(ob,aind,pars)) &
(kind = "variable" =>
result = mapCVariableExpression(ob,aind,pars)) &
(kind = "attribute" =>
result = mapCAttributeExpression(ob,aind,pars)) &
(kind = "role" =>
result = mapCRoleExpression(ob,aind,pars)) &
(kind = "operation" =>
result = mapCOperationExpression(ob,aind,pars)) &
(kind = "classid" =>
result = mapCClassExpression(ob,aind,pars)) &
(kind = "function" =>
result = mapCFunctionExpression(ob,aind,pars))
An alternative style of specification would be to use the Map Objects Before Links pattern [3].
However, this would involve separating the mapping of expression instances from the mapping of
relation instances: attribute values of target objects would be set in separate rules from the
setting of their associations.
The detailed cases for mapping different forms of basic expression are as follows:
query mapValueExpression(ob : Set(CExpression),
aind : Set(CExpression),
pars : Sequence(CExpression)) : CBasicExpression
pre: umlKind = value
post:
CBasicExpression->exists( c | c.cexpId = expId & c.kind = "value" &
(data = "true" => c.data = "TRUE") &
(data = "false" => c.data = "FALSE") &
(data /= "true" & data /= "false" => c.data = data) &
c.arrayIndex = aind &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] & result = c)
query mapVariableExpression(obs : Set(CExpression),
aind : Set(CExpression),
pars : Sequence(CExpression)) : CBasicExpression
post:
CBasicExpression->exists( c | c.cexpId = expId & c.kind = "variable" &
c.data = data &
c.arrayIndex = aind &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] & result = c)
Note that context must be set for attributes and roles, so that the context entity using the feature can be
accessed (this may be a subclass of the owner of the feature).
(context.size = 0 & objectRef.size = 0 =>
CBasicExpression->exists( c | c.cexpId = expId & c.kind = "role" &
c.data = data &
c.isStatic = true &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
c.arrayIndex = aind &
result = c ) ) &
(context.size > 0 & objectRef.size = 0 =>
CBasicExpression->exists( c | c.cexpId = expId & c.kind = "role" &
c.data = "get" + context.any.name + "_" + data &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
c.arrayIndex = aind &
CBasicExpression->exists( s | s.data = "self" & s.kind = "variable" &
c.parameters = Sequence{s} &
s.cexpId = expId + "_self" &
s.type = CType[context.any.typeId] &
s.elementType = s.type ) & result = c ) ) &
(objectRef.size > 0 & objectRef.any.type : CollectionType =>
CBasicExpression->exists( c | c.cexpId = expId & c.kind = "role" &
c.data = "getAll" + objectRef.any.elementType.name + "_" + data &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
c.parameters = obs &
c.arrayIndex = aind &
result = c ) ) &
(objectRef.size > 0 & objectRef.any.type /: CollectionType =>
CBasicExpression->exists( c | c.cexpId = expId & c.kind = "role" &
c.data = "get" + objectRef.any.elementType.name + "_" + data &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
c.parameters = obs &
c.arrayIndex = aind &
result = c ) )
c.data = data + "_" + context.any.name &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
c.arrayIndex = aind &
CBasicExpression->exists( s | s.data = "self" & s.kind = "variable" &
s.cexpId = expId + "_self" &
s.type = CType[context.any.typeId] &
s.elementType = s.type &
c.parameters = Sequence{s}^pars) & result = c ) ) &
(objectRef.size > 0 =>
CBasicExpression->exists( c | c.cexpId = expId & c.kind = "operation" &
c.data = data + "_" + objectRef.any.elementType &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
c.parameters = obs^pars &
c.arrayIndex = aind &
result = c ) )
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] & result = c) ) &
(true =>
CBasicExpression->exists( c | c.cexpId = expId & c.data = Expression.cfunctionName(data) &
c.kind = "function" &
c.parameters = obs^pars &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] & result = c) )
(data = "front" =>
result = createCUnaryOpCall(expId, "front" + elementType.name, obs.any) ) &
(data = "tail" =>
result = createCUnaryOpCall(expId, "tail" + elementType.name, obs.any) ) &
(data = "reverse" =>
result = createCUnaryOpCall(expId, "reverse" + elementType.name, obs.any) ) &
(data = "asSet" =>
result = createCUnaryOpCall(expId, "asSet" + elementType.name, obs.any) ) &
(data = "asSequence" =>
result = obs.any ) &
(true =>
result = createCUnaryOpCall(expId, Expression.cfunctionName(data), obs.any) )
Testing of these operations revealed some errors regarding the metamodels, eg., that elementType
should be of 1 multiplicity, not 0..1, and that any is needed with objectRef because this is of 0..1
multiplicity. name is needed for the copies of Type and CType in the uml2Cb metamodel. The
generation of model.txt needed to be adjusted in several cases to ensure that appropriate
information was available for the code generator.
isCfunction1 is true if the function name is one of: sqrt, exp, log, sin, cos, tan, pow, log10,
cbrt, tanh, cosh, sinh, asin, acos, atan.
The correspondence of OCL and C operators is given in Table 13.
Expression::
static query cfunctionName(f : String) : String
post:
(f = "round" => result = "oclRound") &
(f = "floor" => result = "oclFloor") &
(f = "ceil" => result = "oclCeil") &
(f = "abs" => result = "fabs") &
(f = "->toLowerCase" => result = "toLowerCase") &
(f = "->toUpperCase" => result = "toUpperCase") &
(f = "->hasPrefix" or f = "hasPrefix" => result = "startsWith") &
(f = "->hasSuffix" or f = "hasSuffix" => result = "endsWith") &
(f = "->characters" => result = "characters") &
(f = "->toInteger" => result = "atoi") &
(f = "->toReal" => result = "atof") &
(f = "->toLong" => result = "atol") &
(f = "->display" => result = "displayNumber") &
(true => result = f)
The final clause here is an ‘else’ case.
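Read as a chain of cases with a default, cfunctionName amounts to a table lookup. The following C sketch shows one way to express that reading. The table-driven structure and the first-match interpretation are assumptions; the name pairs themselves are copied from the specification above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct NameMapping {
    const char* ocl;   /* OCL function or operator name */
    const char* c;     /* name of the C function that implements it */
};

static const struct NameMapping mappings[] = {
    { "round", "oclRound" },
    { "floor", "oclFloor" },
    { "ceil", "oclCeil" },
    { "abs", "fabs" },
    { "->toLowerCase", "toLowerCase" },
    { "->toUpperCase", "toUpperCase" },
    { "->hasPrefix", "startsWith" },
    { "hasPrefix", "startsWith" },
    { "->hasSuffix", "endsWith" },
    { "hasSuffix", "endsWith" },
    { "->characters", "characters" },
    { "->toInteger", "atoi" },
    { "->toReal", "atof" },
    { "->toLong", "atol" },
    { "->display", "displayNumber" }
};

/* Returns the C name for f, or f itself when no case applies. */
const char* cfunctionName(const char* f) {
    size_t i;
    assert(f != NULL);
    for (i = 0; i < sizeof mappings / sizeof mappings[0]; i++)
        if (strcmp(f, mappings[i].ocl) == 0) return mappings[i].c;
    return f;   /* the 'else' case */
}
```

Names such as sqrt, which have the same name in C, fall through to the default and are returned unchanged.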
Various convenience operations were introduced during design, such as:
Expression::
createCBinOpCall(id : String, name : String, le : CExpression,
re: CExpression) : CBasicExpression
post:
CBasicExpression->exists( cbe |
cbe.cexpId = id & cbe.data = name &
le : cbe.parameters & re : cbe.parameters &
cbe.kind = "operation" &
cbe.type = CType[type.typeId] &
cbe.elementType = CType[elementType.typeId] & result = cbe )
Expression::
createCUnaryOpCall(id : String, name : String, arg : CExpression) : CBasicExpression
post:
CBasicExpression->exists( cbe |
cbe.cexpId = id &
cbe.data = name &
arg : cbe.parameters &
cbe.kind = "operation" &
cbe.type = CType[type.typeId] &
cbe.elementType = CType[elementType.typeId] & result = cbe )
Expression::
createCOpCall(id : String, name : String) : CBasicExpression
post:
CBasicExpression->exists( cbe |
cbe.cexpId = id &
cbe.data = name &
cbe.kind = "operation" &
cbe.type = CType[type.typeId] &
cbe.elementType = CType[elementType.typeId] & result = cbe )
Expression::
static cast(typ : String, e : CExpression) : CExpression
post:
CUnaryExpression->exists( ce | ce.cexpId = e.cexpId + "_cast" &
ce.operator = "(" + typ + ") " &
ce.argument = e &
ce.type = e.type &
ce.elementType = e.elementType &
result = ce )
These are used to create C representations of binary and unary expressions as operation calls.
They are an application of the pattern Factor out Duplicated Expression Evaluations.
For certain kinds of binary expression (+, -, =, /=, comparators), the type of the arguments
determines the representation:
BinaryExpression::
query mapAddExpression(le : CExpression, re : CExpression) : CExpression
post:
(left.type.name = "String" & right.type.name = "String" =>
result = createCBinOpCall(expId, "concatenateStrings", le, re)) &
(true =>
CBinaryExpression->exists( ce | ce.cexpId = expId & ce.operator = "+" &
ce.left = le & ce.right = re &
ce.needsBracket = needsBracket &
ce.type = CType[type.typeId] &
ce.elementType = CType[elementType.typeId] &
result = ce ) )
BinaryExpression::
query mapSubtractExpression(le : CExpression, re : CExpression) : CExpression
post:
(left.type.name = "String" & right.type.name = "String" =>
result = createCBinOpCall(expId, "subtractString", le, re)) &
(left.type.name = "Set" or left.type.name = "Sequence" =>
result = createCBinOpCall(expId, "removeAll" + left.elementType.name, le, re)) &
(true =>
CBinaryExpression->exists( ce | ce.operator = "-" &
ce.cexpId = expId &
ce.left = le & ce.right = re &
ce.needsBracket = needsBracket &
ce.type = CType[type.typeId] &
ce.elementType = CType[elementType.typeId] &
result = ce ) )
BinaryExpression::
query mapComparitorExpression(le : CExpression, re : CExpression) : CExpression
post:
(left.type.name = "String" & right.type.name = "String" =>
CBinaryExpression->exists( be | be.cexpId = expId &
be.operator = Expression.cop(operator) &
be.left = createCBinOpCall(expId + "_strcmp", "strcmp", le, re) &
CBasicExpression->exists( zero | zero.data = "0" &
zero.cexpId = expId + "_0" &
be.right = zero ) &
be.needsBracket = true &
result = be ) ) &
(true =>
CBinaryExpression->exists( ce | ce.cexpId = expId & ce.operator = operator &
ce.left = le & ce.right = re &
ce.needsBracket = needsBracket &
ce.type = CType[type.typeId] &
ce.elementType = CType[elementType.typeId] &
result = ce ) )
BinaryExpression::
query mapEqualityExpression(le : CExpression, re : CExpression) : CExpression
post:
(left.type.name = "String" & right.type.name = "String" =>
result = mapComparitorExpression(le, re)) &
(left.type.name = "Set" =>
result = createCBinOpCall(expId, "equalsSet",
Expression.cast("void**", le),
Expression.cast("void**", re)) ) &
(left.type.name = "Sequence" =>
result = createCBinOpCall(expId, "equalsSequence",
Expression.cast("void**", le),
Expression.cast("void**", re)) ) &
(true =>
CBinaryExpression->exists( ce | ce.cexpId = expId & ce.operator = "==" &
ce.left = le & ce.right = re &
ce.needsBracket = needsBracket &
ce.type = CType[type.typeId] &
ce.elementType = CType[elementType.typeId] &
result = ce ) )
BinaryExpression::
query mapInclusionExpression(le : CExpression, re : CExpression) : CExpression
post:
(operator = ":" =>
result = createCBinOpCall(expId, "isIn",
Expression.cast("void*", le),
Expression.cast("void**", re)) ) &
(operator = "->includes" =>
result = createCBinOpCall(expId, "isIn",
Expression.cast("void*", re),
Expression.cast("void**", le)) ) &
(operator = "->includesAll" =>
result = createCBinOpCall(expId, "containsAll",
Expression.cast("void**",le),
Expression.cast("void**",re)) ) &
(operator = "<:" =>
result = createCBinOpCall(expId, "containsAll",
Expression.cast("void**", re),
Expression.cast("void**", le)) )
BinaryExpression::
query mapExclusionExpression(le : CExpression, re : CExpression) : CExpression
post:
(operator = "/:" =>
CUnaryExpression->exists( nin | nin.cexpId = expId & nin.operator = "!" &
nin.argument = createCBinOpCall(expId + "_isIn", "isIn",
Expression.cast("void*", le),
Expression.cast("void**", re)) &
nin.type = CType[type.typeId] &
nin.elementType = CType[elementType.typeId] &
result = nin ) ) &
(operator = "->excludes" =>
CUnaryExpression->exists( nin | nin.cexpId = expId & nin.operator = "!" &
nin.argument = createCBinOpCall(expId + "_isIn", "isIn",
Expression.cast("void*", re),
Expression.cast("void**", le)) &
nin.type = CType[type.typeId] &
nin.elementType = CType[elementType.typeId] &
result = nin ) ) &
(operator = "->excludesAll" =>
result = createCBinOpCall(expId, "disjoint",
Expression.cast("void**", le),
Expression.cast("void**", re)) ) &
(operator = "/<:" =>
CUnaryExpression->exists( nin | nin.cexpId = expId & nin.operator = "!" &
nin.argument = createCBinOpCall(expId + "_containsAll", "containsAll",
Expression.cast("void**", re),
Expression.cast("void**", le)) &
nin.type = CType[type.typeId] &
nin.elementType = CType[elementType.typeId] &
result = nin ) )
These create new auxiliary operations that evaluate the predicate of the iterator/collect (the RHS
argument), add it to the program, and create a reference to this operation as an argument for the
C operation that evaluates the iterator/collect. For collect, the function reference must be cast
to the function pointer type void* (*)(struct E*) where the LHS has element type E. As noted
above, there is only one parameter variable of the auxiliary operation.
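The role of the auxiliary operation and the function reference can be illustrated with a small C sketch of a collect over a hypothetical entity Emp with a role dept. The names collectEmp and collect_aux1, and the NULL-terminated array representation, are assumptions for the example. The auxiliary function here is declared directly with the type void* (*)(struct Emp*), so this sketch needs no cast on the function reference, whereas the generator described above casts the reference to that type.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entities: each Emp has a single-valued role dept. */
struct Dept {
    const char* name;
};

struct Emp {
    struct Dept* dept;
};

struct Dept* getEmp_dept(struct Emp* self) {
    return self->dept;
}

/* Auxiliary operation for the collect RHS: it has exactly one parameter,
   the iterator variable, and evaluates the RHS expression for it. */
void* collect_aux1(struct Emp* self) {
    return getEmp_dept(self);
}

/* Applies fe to every element of col; the result is NULL-terminated,
   so fe is assumed never to return NULL in this sketch. */
void** collectEmp(struct Emp* col[], void* (*fe)(struct Emp*)) {
    void** res;
    int n = 0;
    int i;
    assert(col != NULL && fe != NULL);
    while (col[n] != NULL) n++;
    res = malloc((size_t) (n + 1) * sizeof *res);
    if (res == NULL) abort();
    for (i = 0; i < n; i++) res[i] = fe(col[i]);
    res[n] = NULL;
    return res;
}
```

An OCL expression such as emps->collect(dept) would then correspond to (struct Dept**) collectEmp(emps, collect_aux1). Because the auxiliary function receives only the iterator variable, the RHS cannot refer to any other local variable, which is the single-variable restriction noted earlier.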
Sorted-by expressions are mapped by:
BinaryExpression::
query mapSortByExpression(arg : CExpression) : CExpression
post:
result = Expression.cast("struct " + elementType.name + "**",
createCBinOpCall(expId, "treesort", Expression.cast("void**", arg),
CExpression.defineCOpReference("compareTo_" + right.type.name, "int")))
Comparators compareTo_int, compareTo_long, etc. are already defined in ocl.h.
The definition of binary expression mapping is updated to:
BinaryExpression::
query mapBinaryExpression(lexp : CExpression,
rexp : CExpression) : CBinaryExpression
pre:
lexp = CExpression[left.expId] &
rexp = CExpression[right.expId]
post:
(operator = "+" =>
result = mapAddExpression(lexp, rexp)) &
(operator = "-" =>
result = mapSubtractExpression(lexp, rexp)) &
(operator = "=" =>
result = mapEqualityExpression(lexp,rexp)) &
(Expression.isComparitor(operator) =>
result = mapComparitorExpression(lexp,rexp)) &
(Expression.isInclusion(operator) =>
result = mapInclusionExpression(lexp,rexp)) &
(Expression.isExclusion(operator) =>
result = mapExclusionExpression(lexp,rexp)) &
(Expression.isIteratorOp(operator) =>
result = mapIteratorExpression(operator.tail.tail, lexp, rexp)) &
(operator = "->collect" =>
result = Expression.cast(rexp.type + "*", mapCollectExpression(lexp, rexp))) &
(operator = "->sortedBy" =>
result = mapSortByExpression(lexp)) &
(left.type.name = "String" & Expression.isStringOp(operator) =>
result = mapStringExpression(lexp,rexp)) &
((left.type.name = "Set" or left.type.name = "Sequence") &
Expression.isCollectionOp(operator) =>
result = mapCollectionExpression(lexp,rexp)) &
(true =>
CBinaryExpression->exists( c | c.cexpId = expId &
c.operator = Expression.cop(operator) &
c.left = lexp & c.right = rexp &
c.needsBracket = needsBracket &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
result = c ) )
Expression.cast("void**", le)) ) &
(operator = "->union" =>
result = createCBinOpCall(expId, "union" + left.elementType.name, le, re) ) &
(operator = "^" =>
result = createCBinOpCall(expId, "concatenate" + left.elementType.name, le, re) ) &
(operator = "->intersection" =>
result = createCBinOpCall(expId, "intersection" + left.elementType.name, le, re) ) &
(operator = "->isUnique" =>
result = createCBinOpCall(expId,
"isUnique" + left.elementType.name, le, re) )
Unary expressions are mapped as follows:
UnaryExpression::
query mapUnaryExpression(arg : CExpression) : CExpression
pre:
arg = CExpression[argument.expId]
post:
(operator.size > 2 & Expression.isCfunction1(operator.tail.tail) =>
CBasicExpression->exists( c | c.cexpId = expId &
c.data = Expression.cfunctionName(operator.tail.tail) &
c.kind = "function" &
c.parameters = Sequence{ arg } &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] & result = c) ) &
(operator = "->sort" => result = mapSortExpression(arg)) &
(Expression.isReduceOp(operator) =>
result = mapReduceExpression(arg)) &
(argument.type.name = "String" & Expression.isUnaryStringOp(operator) =>
result = mapStringExpression(arg)) &
(operator = "->display" =>
result = createCUnaryOpCall(expId, "display" + argument.type, arg)) &
((argument.type.name = "Set" or argument.type.name = "Sequence") &
Expression.isUnaryCollectionOp(operator) =>
result = mapCollectionExpression(arg)) &
(true =>
CUnaryExpression->exists( c | c.cexpId = expId &
c.operator = Expression.cop(operator) &
c.argument = arg &
c.type = CType[type.typeId] &
c.elementType = CType[elementType.typeId] &
result = c ) )
A reduce operator is one of →min, →max, →sum, →prd. These expressions are mapped by:
UnaryExpression::
query mapReduceExpression(arg : CExpression) : CExpression
post:
(operator[1] = "-" & operator[2] = ">" =>
result = createCBinOpCall(expId,
operator.tail.tail + argument.elementType.name,
arg, argument.clength(arg)) ) &
(true =>
result = createCBinOpCall(expId, operator + argument.elementType.name,
arg, argument.clength(arg)) )
clength by default returns length((void**) arg), but for s→collect on a collection, and for attribute
applications s.att to a collection, it returns the size of s′. The size of literal collections can be
directly calculated, and for Integer.subrange(a, b) it is b′ − a′ + 1.
Sort expressions are mapped by:
UnaryExpression::
query mapSortExpression(arg : CExpression) : CExpression
post:
(elementType.name = "String" =>
result = Expression.cast("char**",
createCBinOpCall(expId, "treesort",
Expression.cast("void**", arg),
CExpression.defineCOpReference("compareTo_String", "int"))) ) &
(true =>
result = Expression.cast("struct " + elementType.name + "**",
createCBinOpCall(expId, "treesort",
Expression.cast("void**", arg),
CExpression.defineCOpReference("compareTo_" + elementType.name, "int"))))
Unary String expressions are mapped by:
UnaryExpression::
query mapStringExpression(arg : CExpression) : CExpression
post:
(operator = "->size" =>
result = createCUnaryOpCall(expId, "strlen", arg) ) &
(operator = "->first" =>
result = createCUnaryOpCall(expId, "firstString", arg) ) &
(operator = "->last" =>
result = createCUnaryOpCall(expId, "lastString", arg) ) &
(operator = "->front" =>
result = createCUnaryOpCall(expId, "frontString", arg) ) &
(operator = "->tail" =>
result = createCUnaryOpCall(expId, "tailString", arg) ) &
(operator = "->reverse" =>
result = createCUnaryOpCall(expId, "reverseString", arg) ) &
(operator = "->display" =>
result = createCUnaryOpCall(expId, "displayString", arg) ) &
(true =>
result = createCUnaryOpCall(expId, Expression.cfunctionName(operator), arg) )
A (unary) collection operator is one of →size, →any, →reverse, →front, →last, →tail, →first,
→sort, →asSet, →asSequence. These are mapped by:
UnaryExpression::
query mapCollectionExpression(arg : CExpression) : CExpression
post:
(operator = "->size" =>
result = createCUnaryOpCall(expId, "length", Expression.cast("void**", arg)) ) &
(operator = "->any" =>
result = createCUnaryOpCall(expId, "any", Expression.cast("void**", arg)) ) &
(operator = "->first" =>
result = createCUnaryOpCall(expId, "first", Expression.cast("void**", arg)) ) &
(operator = "->last" =>
result = createCUnaryOpCall(expId, "last", Expression.cast("void**", arg)) ) &
(operator = "->front" =>
result = createCUnaryOpCall(expId, "front" + elementType.name, arg) ) &
(operator = "->tail" =>
result = createCUnaryOpCall(expId, "tail" + elementType.name, arg) ) &
(operator = "->reverse" =>
result = createCUnaryOpCall(expId, "reverse" + elementType.name, arg) ) &
(operator = "->asSet" =>
result = createCUnaryOpCall(expId, "asSet" + elementType.name, arg) ) &
(operator = "->asSequence" =>
result = arg ) &
(true =>
result = createCUnaryOpCall(expId, Expression.cfunctionName(operator), arg) )
For any, first, last, the result also needs to be cast to arg.elementType.
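The void** casts and the result cast can be illustrated with a minimal sketch of length, first and last. It assumes the representation described in the semantics appendix, where a collection is an array of pointers ended by the first NULL element. The function names are those used in the mapping, but the bodies are illustrative assumptions, not the library implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements before the terminating NULL. */
int length(void* col[])
{ int n = 0;
  if (col == NULL) { return 0; }
  while (col[n] != NULL) { n++; }
  return n;
}

/* First element, or NULL for an empty collection. */
void* first(void* col[])
{ assert(col != NULL);
  return col[0];
}

/* Last element, or NULL for an empty collection. */
void* last(void* col[])
{ int n = length(col);
  if (n == 0) { return NULL; }
  return col[n - 1];
}
```

Because the functions work on void*, a call site has to cast the result back to the element type, for example `struct E* x = (struct E*) last((void**) es);`.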
During this iteration, a new functional requirement F1.2.6 to map static operations to C was
added, which involved reworking the uml2Ca transformation. In particular, the Expression and
CExpression and COperation classes all need an additional boolean attribute isStatic. This effort
is included in the budget of iteration 3.
There are 92 operations and 33 transformation rules for this use case. Testing and inspec-
tion were used for validation and verification. NF1 was achieved because all the transformation
constraints are of type 1 (bounded loops), and recursions in operations are bounded by the maxi-
mum depth of input model expressions as abstract syntax trees. F2 was ensured by checking the
generated syntax against the C standard. F3 was ensured by checking SemC (p) ≡ SemUML (m)
when p is generated from m. F4 is ensured by the use of the Auxiliary Correspondence Model
and Recursive Descent patterns, as is the bx requirement F5. NF2 was checked for the uml2Cb
release and found to be satisfied (Table 18).
The estimated effort for this iteration is shown in Table 15.
Considerable effort was spent on debugging, with most of the problems traced to the input
model data generated from UML models, or (less often) to errors in the preceding uml2Ca trans-
formation. Errors in the uml2Cb transformation itself were usually quickly traced and corrected,
and were for the most part minor, eg., incorrectly treating a 0..1-multiplicity role as if it were
1-multiplicity, etc.
Figure 8: UML-RSDS activity metamodel
Requirement | UML activity st                              | C statement st’
F1.4.1      | Creation statement x : T                     | T’ x = defaultT’;
            |                                              | (defaultT’ is the default value of T’)
F1.4.2      | Assign statement v := e                      | v’ = e’;
            | Assign statement v : T := e                  | T’ v’ = e’;
F1.4.3      | Sequence statement st1 ; ... ; stn           | st1’ ... stn’
F1.4.4      | Conditional statement if e then st1 else st2 | if (e’) { st1’ } else { st2’ }
            | Conditional statement if e then st1          | if (e’) { st1’ }
F1.4.5      | Return statement return e                    | return e’;
            | Return statement return                      | return;
F1.4.6      | Break statement break                        | break;
F1.4.7      | Bounded loop for (x : e) do st               | int i = 0;
            | on object collection e of entity             | for ( ; i < length((void**) e’); i++)
            | element type E                               | { struct E* x = e’[i]; st’ }
            |                                              | (new index variable i)
F1.4.8      | Unbounded loop while e do st                 | while (e’) { st’ }
F1.4.9      | Operation call ex.op(pars)                   | op’(ex’,pars’)
            | on single object ex                          |
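As a worked instance of the bounded loop scheme F1.4.7, the sketch below shows the C that the scheme would give for the activity `for (a : as) do total := total + a.x`. The entity A, the operation sumX and the body of length are hypothetical and chosen only to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

struct A { int x; };

/* Assumed helper: counts the elements before the terminating NULL. */
int length(void* col[])
{ int n = 0;
  while (col != NULL && col[n] != NULL) { n++; }
  return n;
}

/* Hypothetical operation whose activity is the bounded loop above. */
int sumX(struct A* as[])
{ int total = 0;
  int i = 0;                                  /* new index variable */
  for ( ; i < length((void**) as); i++)
  { struct A* a = as[i];                      /* loop variable x of type E* */
    total = total + a->x;                     /* mapped loop body st' */
  }
  return total;
}
```

The loop test re-evaluates length on every iteration, exactly as the scheme is written; the bound could be hoisted when the body does not modify the collection.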
mapStatement() : CStatement
is defined in each Statement subclass.
For basic statements, this is defined as follows:
AssignStatement::
mapStatement() : CStatement
post:
CAssignment->exists( ca | ca.cstatId = statId &
ca.type = CType[type.typeId] &
ca.left = left.mapExpression() &
ca.right = right.mapExpression() &
result = ca )
BreakStatement::
mapStatement() : CStatement
post:
CBreakStatement->exists( ca | ca.cstatId = statId & result = ca )
OperationCallStatement::
mapStatement() : CStatement
post:
OpCallStatement->exists( ca | ca.cstatId = statId &
ca.callExp = callExp.mapExpression() &
ca.assignsTo = assignsTo & result = ca )
ImplicitCallStatement::
mapStatement() : CStatement
post:
OpCallStatement->exists( ca | ca.cstatId = statId &
ca.callExp = callExp.mapExpression() &
ca.assignsTo = assignsTo & result = ca )
CreationStatement::
mapStatement() : CStatement
post:
DeclarationStatement->exists( ds | ds.cstatId = statId &
ds.createsInstanceOf = createsInstanceOf &
ds.assignsTo = assignsTo &
ds.type = CType[type.typeId] &
ds.elementType = CType[elementType.typeId] & result = ds)
For each category of composite statement, the subparts of the statement are mapped to C first,
and then composed by a separate operation. For example:
SequenceStatement::
mapStatement() : CStatement
post:
result = mapSequenceStatement(
statements.mapStatement())
ConditionalStatement::
mapStatement() : CStatement
post:
result = mapConditionalStatement(
ifPart.mapStatement(), elsePart.mapStatement())
UnboundedLoopStatement::
mapStatement() : CStatement
post:
result = mapUnboundedLoopStatement(
body.mapStatement())
BoundedLoopStatement::
mapStatement() : CStatement
post:
result = mapBoundedLoopStatement(
body.mapStatement())
The definitions of these mappings were revised and simplified during the specification stage, prior
to implementation.
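The "map the subparts first, then compose" structure of the composite statement rules can be sketched in C as a recursive descent over a statement tree. The node types below are hypothetical simplifications of the Statement and CStatement metaclasses (two optional subparts, no expressions), and compose stands in for operations such as mapConditionalStatement.

```c
#include <assert.h>
#include <stdlib.h>

enum Kind { BASIC, CONDITIONAL, UNBOUNDED_LOOP };

/* Simplified, hypothetical source (UML) statement node. */
struct Statement
{ enum Kind kind;
  int statId;
  struct Statement* first;    /* ifPart or loop body; NULL for BASIC */
  struct Statement* second;   /* elsePart, or NULL */
};

/* Simplified, hypothetical target (C) statement node. */
struct CStatement
{ enum Kind kind;
  int cstatId;
  struct CStatement* first;
  struct CStatement* second;
};

/* Builds the target node from already-mapped subparts. */
struct CStatement* compose(enum Kind kind, int id,
                           struct CStatement* first, struct CStatement* second)
{ struct CStatement* res = malloc(sizeof *res);
  assert(res != NULL);        /* allocation is assumed to succeed */
  res->kind = kind; res->cstatId = id;
  res->first = first; res->second = second;
  return res;
}

struct CStatement* mapStatement(const struct Statement* st)
{ if (st == NULL) { return NULL; }
  /* Recursive descent: map the subparts first ... */
  struct CStatement* first = mapStatement(st->first);
  struct CStatement* second = mapStatement(st->second);
  /* ... then compose the results, keeping cstatId = statId. */
  return compose(st->kind, st->statId, first, second);
}
```

The recursion depth is bounded by the depth of the input statement tree, which is the same argument used for termination of the transformation.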
As with the mappings of expressions, these mappings can be directly inverted:
CSequenceStatement::
mapCStatement() : Statement
post:
result = mapCSequenceStatement(
statements.mapCStatement())
IfStatement::
mapCStatement() : Statement
post:
result = mapIfStatement(
ifPart.mapCStatement(), elsePart.mapCStatement())
WhileStatement::
mapCStatement() : Statement
post:
result = mapWhileStatement(
body.mapCStatement())
result = lp )
This can be inverted using mapCExpression to convert the value expressions from C to UML.
Statements are printed by toString operations, eg.:
DeclarationStatement::
query toString() : String
post:
(createsInstanceOf = "String" =>
result = type + " " + assignsTo + " = \"\";") &
(type : CPrimitiveType =>
result = type + " " + assignsTo + " = 0;") &
(true =>
result = type + " " + assignsTo + " = NULL;")
CSequenceStatement::
query toString() : String
post:
result = statements->collect( s | s.toString() + "\n " )->sum()
IfStatement::
query toString() : String
post:
(elsePart.size = 0 =>
result = "if (" + test + ")\n { " +
ifPart + " }") &
(elsePart.size > 0 =>
result = "if (" + test + ")\n { " +
ifPart + " }\n else { " + elsePart.any + " }")
ForStatement::
query toString() : String
post:
ind = "ind_" + cstatId &
result = " int " + ind + " = 0;\n" +
" for ( ; " + ind + " < length((void**) " + loopRange + "); " + ind + "++)\n" +
" { " + loopRange.elementType + " " + loopVar + " = (" + loopRange + ")[" + ind + "];\n" +
" " + body + "\n" +
" } "
It was identified that the elsePart association ends should be 0..1 multiplicities to include
the cases of If statements without Else clauses. Likewise, the returnValue of a Return statement
should be optional. There are 28 operations and 4 rules for this iteration.
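The ForStatement template above is a fixed text skeleton with holes for the index name, loop range, element type, loop variable and body. A minimal C sketch of the same template using snprintf is given below; printForStatement and its parameter list are assumptions for illustration, with the parameters standing for the already-printed text of the corresponding model elements.

```c
#include <assert.h>
#include <stdio.h>

/* Writes the C text of a bounded loop into out (capacity cap).
   Returns the number of characters the full text needs, as snprintf does. */
int printForStatement(char* out, size_t cap, int cstatId,
                      const char* elementType, const char* loopVar,
                      const char* loopRange, const char* body)
{ assert(out != NULL && cap > 0);
  return snprintf(out, cap,
    "  int ind_%d = 0;\n"
    "  for ( ; ind_%d < length((void**) %s); ind_%d++)\n"
    "  { %s %s = (%s)[ind_%d];\n"
    "    %s\n"
    "  } ",
    cstatId, cstatId, loopRange, cstatId,
    elementType, loopVar, loopRange, cstatId,
    body);
}
```

Deriving the index name from cstatId, as the specification does, keeps index variables of different loops in the same function distinct.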
The estimated effort for this iteration is shown in Table 17.
Stage                  | Effort (person days)
Req. Elicitation       | 2
Eval./Negotiation      | 1
Specification          | 6
Review/Validation      | 6
Implementation/Testing | 4
Total                  | 19
UseCase::
COperation->exists( cop | cop.name = name & cop.scope = "application" &
cop.isQuery = false &
cop.isStatic = true &
cop.code = classifierBehaviour.mapStatement() &
parameters->forAll( x | CVariable->exists( y | y.name = x.name &
y.kind = "parameter" &
y.type = CType[x.type.typeId] &
y : cop.parameters ) ) &
cop.returnType = CType[returnType.typeId] )
The other features of the COperation in this case are set by the iteration 2 mapping.
All auxiliary scope operations are printed before all entity operations, and these are printed
before all application operations:
COperation::
scope = "auxiliary" => self->display()
COperation::
scope = "entity" => self->display()
COperation::
scope = "application" => self->display()
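The three constraints above amount to three passes over the operations, one per scope, in a fixed order. The sketch below shows that ordering in C. The struct COperation here is a hypothetical two-field simplification, displayByScope is an illustrative name, and the function collects names into an array rather than printing them so that the order is easy to inspect.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical simplification of the COperation metaclass. */
struct COperation { const char* name; const char* scope; };

/* Stores the names of ops in out, all auxiliary operations first, then
   entity operations, then application operations. Returns the count. */
int displayByScope(const struct COperation ops[], int n, const char* out[])
{ const char* order[] = { "auxiliary", "entity", "application" };
  int k = 0;
  assert(ops != NULL && out != NULL);
  for (int s = 0; s < 3; s++)          /* one pass per scope */
  { for (int i = 0; i < n; i++)
    { if (strcmp(ops[i].scope, order[s]) == 0)
      { out[k] = ops[i].name; k++; }
    }
  }
  return k;
}
```

Within one scope the original relative order of the operations is preserved.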
NF1 was achieved because all the transformation constraints are of type 1 (bounded loops).
F2 was ensured by checking the generated syntax against the C standard. F3 was ensured by
checking the equivalence of the semantics of the UML and C models. F4 is ensured by the use
of the Auxiliary Correspondence Model and Recursive Descent patterns, as is the bx requirement
F5. NF2 was checked for the uml2Cb release and found to be satisfied (Table 18).
7 Evaluation
In this section we evaluate the outcomes of the development, the effectiveness of UML-RSDS for
the development, and the agile MDD approach that we have used.
Requirement                    | Priority | Achievement
NF1: Termination               | High     | Proved
NF10: Development time         | High     | Achieved
F2: Syntactic correctness      | High     | Rigorous argument and testing
F3: Semantic preservation      | High     | Rigorous argument and testing
F4: Traceability               | High     | Achieved
F5: Bidirectionality           | Medium   | Partly achieved
NF2: Transformation efficiency | Medium   | Achieved
NF3: Transformation modularity | Medium   | Achieved
NF5: Usability                 | Medium   | Achieved
NF6: Efficient code            | Medium   | Partly achieved
NF7: Compact code              | Medium   | Partly achieved
F6: Confluence                 | Low      | Proved
NF4: Flexibility               | Low      | Not achieved
In order to test NF6 and NF7 we wrote a test UML specification involving a fixed-point
computation of the maximum-value node in a graph of nodes. This has one entity A, with an
attribute x : int and a self-association neighbours : A → Set(A). There is a use case maxnode
with the postcondition
A::
n : neighbours & n.x > x@pre => x = n.x
This updates a node to have the maximum x value of its neighbours. Because this constraint
reads and writes A :: x , a fixed-point design is generated, with a running time of cubic order in
the number of nodes.
The generated C code of the use case and its auxiliary functions is:
void maxnode1(struct A* self, struct A* n)
{ setA_x(self, getA_x(n)); }
void maxnode(void)
{ unsigned char maxnode1_running = 0; maxnode1_running = TRUE;
while (maxnode1_running)
{ maxnode1_running = maxnode1search(); }
}
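The fixed-point behaviour of this code can be tried out with the self-contained sketch below. The functions maxnode1 and maxnode follow the generated code; everything else is an assumption made for illustration: the struct layout, the accessor bodies, the NULL-terminated a_instances array of all existing A objects, and in particular the body of maxnode1search, which is not shown in the listing.

```c
#include <assert.h>
#include <stddef.h>

#define TRUE 1
#define FALSE 0

struct A { int x; struct A** neighbours; };   /* NULL-terminated set */

struct A** a_instances = NULL;   /* assumed: all A objects, NULL-terminated */

int getA_x(struct A* self) { return self->x; }
void setA_x(struct A* self, int v) { self->x = v; }

void maxnode1(struct A* self, struct A* n)
{ setA_x(self, getA_x(n)); }

/* Hypothetical search: applies maxnode1 to the first pair (self, n) that
   satisfies the constraint antecedent; returns TRUE if one was found. */
unsigned char maxnode1search(void)
{ assert(a_instances != NULL);
  for (int i = 0; a_instances[i] != NULL; i++)
  { struct A* self = a_instances[i];
    for (int j = 0; self->neighbours[j] != NULL; j++)
    { struct A* n = self->neighbours[j];
      if (getA_x(n) > getA_x(self))
      { maxnode1(self, n); return TRUE; }
    }
  }
  return FALSE;
}

void maxnode(void)
{ unsigned char maxnode1_running = TRUE;
  while (maxnode1_running)
  { maxnode1_running = maxnode1search(); }
}
```

The loop terminates because each application strictly increases one x value and no value can exceed the largest x in the graph. Restarting the search after every update is what makes the cost grow quickly with the number of nodes.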
Table 21 compares the code size (for the complete applications, including library code) and
the efficiency of the C code with the Java code. The lcc compiler was used for C. The results show that
code size is halved by using C, and that efficiency is improved.
Table 22: Development effort for previous code generators (person months)
The best comparison with the C code generator is perhaps the C++ generator, which in-
volved considerable background research into the semantics, language and libraries of C++, and
significant revision of the existing Java-oriented code generator. Likewise, the C code generator
involved substantial new research work on the code generation strategy, in addition to the technical
challenge of implementing this strategy.
Summarising Tables 2, 5, 15, 17, 19, we obtain an overall estimate for the C code generator in
Table 23.
This amounts to 4.5 person months for requirements analysis/specification activities, compared
to 6 months for the manually-developed C++ generator (which had no specification). 49 days
were spent on implementation and testing, compared to 8 months for the C++ generator. A
major factor in this difference is the simpler and more concise transformation specification of
the C code generator (expressed in UML-RSDS) compared to the Java code of the C++ code
generator. Not only is the UML-RSDS specification 4 times shorter than the Java code, but
the latter is scattered over multiple source files (eg., Attribute.java, Association.java, Entity.java,
etc), making debugging and maintenance more complex compared to the C translator, which is
defined in 2 specification files. In total, the core code of the UML-RSDS tools is 90,500 lines of
Java code, of which approximately 20% (18,100 lines) is the C++ code generator. In contrast the
UML2C specification is 2,200 (uml2Ca) and 2,700 (uml2Cb) lines, in total 4,900 lines. The OCL
specification of UML2C is highly declarative and corresponds directly to the informal requirements,
hence it is easier to understand and modify compared to a programming language implementation.
In iterations 3 and 4 the specification style is less purely declarative than in iterations 1, 2 and 5,
and is instead in a functional programming style. It was found that this was also more concise and
easier to understand and change than the imperative Java coding of the C++ transformation.
Stage                  | Effort (person days)
Req. Elicitation       | 17
Eval./Negotiation      | 5
Specification          | 56
Review/Validation      | 57
Implementation/Testing | 49
Total                  | 184
• Recursive Descent: The mappings of expressions and activities are naturally expressed using
this pattern, because of their self-recursive structures.
• Rule Inheritance: In some cases (such as the definition of mapExpression, mapStatement,
clength, etc), a generalised mapping rule is expressed in a superclass definition, and special-
isations defined in certain subclasses.
• Transformation Chain: The implementation is organised as the sequential composition of
uml2Ca and uml2Cb. This allows each to operate on only subsets of the complete UML and
C metamodels.
• Implicit Copy: The executable implementations of UML-RSDS transformations automati-
cally copy any unmodified input data to the output model, so that there is no need to write
copying rules for entities unaffected by the transformation. Uml2Ca copies UML expression
and activity data unchanged, for example.
• Factor Code Generation into Model-to-model and Model-to-code: The UML model is mapped
to a C language model and then code text is printed from this.
• Text Templates: The printing of code uses fixed text templates with variable elements spec-
ified by C model data.
• Factor out Duplicated Expression Evaluations: Helper operations are introduced, such as
createCBinOpCall, to factor out repeated processing. Let variables are also used in certain
operations (eg., getPKOp).
We found that all of these patterns had positive contributions in the development, as shown in
Table 24.
Pattern                     | Benefits
Phased Constr.              | Helps achieve NF3
Aux. Corr. Model            | Helps achieve F3, F4, F5
Object Indexing             | Helps achieve NF2
Entity Splitting            | Helps achieve NF3, F4
Entity Merging              | Helps achieve NF3
Recursive Descent           | Helps achieve F5, NF3
Rule Inheritance            | Helps achieve NF3, simplicity
Transformation Chain        | Helps achieve NF3
Implicit Copy               | Simplicity
Factor Code Gen.            | Helps achieve F2, F3, NF3
Text Templates              | Helps achieve simplicity, NF3, potentially NF4
Factor out Dup. Expressions | Helps NF2, simplicity
However there were some disadvantages: in particular the use of Transformation Chain required
additional effort to ensure that all information needed by uml2Cb was correctly produced by
uml2Ca. The detection and correction of errors in uml2Cb caused by incorrect UML model data
was more difficult because this data was accessed via uml2Ca. The organisation and division of
metamodels between these transformations also required effort, and deployment was made more
complex. However, we consider that overall the benefits of modularity and cohesion gained by
this architecture outweigh the costs.
8 Related work
Systematic software engineering of model transformations is only practised in a minority of MT
developments, according to surveys of MT development [1, 5]. The emphasis in MT developments
has been on implementation, with less attention paid to requirements engineering. One
example of a detailed development process is the migration case study of [14], which describes the
techniques used in this industrial project. Details of the development process for an industrial
transformation project are also provided in [10]. In this paper we have given a detailed description
of the development process and engineering techniques used, together with evaluations of their
effectiveness.
Code generation from UML to ANSI C is also an unusual topic, with only one recent publication
describing such a translator [2]. This code generator is described in a high-level manner, and it
is not clear how OCL expressions or UML activities are mapped to C using the transformation.
In contrast, we have provided explicit mappings for all elements of a substantial subset of UML,
including a large subset of OCL.
There has been some work on providing virtual machines for standard OCL [15]. This work
uses Java as the basis of the VM, instead of C. Compared to [15] we consider a subset of OCL
which (i) omits null and invalid values, (ii) uses classical logic, (iii) uses computational numeric
types, (iv) omits OclAny, bags and ordered sets. These modifications make the correspondence
between a (UML-RSDS) OCL specification and a Java/C#/C++/C implementation more direct
and also simplify specification verification, eg., using the B formal method. The OCL VM of [15]
does not seem to consider state-modifying OCL constraints or issues around object deletion. We
have found that execution efficiency for code generated from UML-RSDS OCL is acceptable, and
in some cases, superior to manually-constructed code [4]. Therefore we consider that our approach
has potential for the practical execution of OCL.
9 Future work
The code generator specification can be used as the basis of alternative C translators. In particular,
there is interest in mapping to the high-integrity MISRA C subset [9]. For this subset, dynamic
memory allocation is not permitted, so for each class, a maximum bound must be provided for
the number of objects of the class. Classes can again be represented by C structs, but e_instances
would be an array of structures, instead of an array of pointers to structures. Objects would
be represented as ints indexing into these arrays, and collections represented as fixed-size arrays.
Our code generator already satisfies most of the code structuring restrictions of MISRA C, and
union data structures and other cases of overlapping memory usage are already excluded. Bitfields
are not used, nor are static elements. Recursive functions are not permitted by MISRA C (rule
16.2), so an alternative means of sorting (not treesort or qsort) must be used. Declarations of all
operations must be provided in separate header files (rule 8.1). The following rules of MISRA-C
are satisfied by our code generator: 1.1; 1.2; 1.3; 2.1; 2.2; 2.3; 2.4; 3.5; 4.1; 4.2; 5.1*; 5.2*; 5.3;
5.4; 5.5; 6.1; 6.4; 6.5; 7.1; 8.2; 8.3; 8.4; 8.6; 8.7; 8.8; 8.9; 9.1; 9.2; 9.3; 10.3; 10.4; 10.5; 11.2; 11.3;
11.5; 12.1*; 12.2*; 12.3; 12.4*; 12.5*; 12.6; 12.7; 12.8; 12.9; 12.12; 12.13; 13.1; 13.2; 13.3*; 13.4;
13.5; 13.6; 14.2; 14.3; 14.4; 14.5; 14.6*; 14.7*; 14.8; 14.9; 15.0; 15.1; 15.2; 15.3; 15.4; 15.5; 16.1;
16.3; 16.4; 16.6; 16.8; 17.1; 17.2; 17.3; 17.5; 17.6; 18.1; 18.2; 18.4; 19.1; 19.2; 19.3; 19.4; 19.5; 19.6;
19.7; 19.8; 19.9; 19.10; 19.11; 19.12; 19.13; 19.14; 19.16; 19.17; 20.1; 20.2; 20.5; 20.6; 20.7; 20.8;
20.11; 20.12. The rules marked with an asterisk are only satisfied if corresponding restrictions are
observed in the UML model.
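The proposed representation without dynamic allocation can be sketched as follows: a fixed-size array of structures per class, with objects identified by int indexes into it. This is only an illustration of the idea under stated assumptions. The bound MAX_A, the sentinel NO_OBJECT and the function bodies are hypothetical, and the sketch makes no claim to satisfy the MISRA C rules listed above.

```c
#include <assert.h>

#define MAX_A 4            /* assumed user-supplied bound on A objects */
#define NO_OBJECT (-1)     /* hypothetical "no object" index */

struct A { int x; };

struct A a_instances[MAX_A];   /* array of structures, not of pointers */
int a_count = 0;               /* number of A objects created so far */

/* Returns the index of a new A object, or NO_OBJECT at the bound. */
int createA(void)
{ int res = NO_OBJECT;
  if (a_count < MAX_A)
  { res = a_count;
    a_instances[res].x = 0;
    a_count++;
  }
  return res;
}

int getA_x(int self)
{ assert(self >= 0 && self < a_count);
  return a_instances[self].x;
}

void setA_x(int self, int v)
{ assert(self >= 0 && self < a_count);
  a_instances[self].x = v;
}
```

With this layout an object reference is a plain int, so a collection of A objects becomes a fixed-size int array rather than an array of pointers.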
The transformation can also serve as a basis for re-engineering the UML to Java code gener-
ators as UML-RSDS specifications. The transformation is an example of a UML-RSDS plugin,
that is, an additional tool which can process UML-RSDS model data in text form to produce
specialised code or documentation. Developers may add such plugins f.jar by placing them in a
/f subdirectory of the UML-RSDS tool directory. More generally, UML-RSDS can be used to
define domain-specific languages (DSLs) and supporting tools in a similar manner: the abstract
syntax of a DSL is specified as a UML class diagram (such as Figure 8 as an abstract syntax
for pseudocode statements), the concrete text syntax is given by text lines of the three forms
object : Entity, object.feature = value and object1 : object2.feature as described in Section 2. This
is also the serialisation syntax of the DSL. Plugins can then operate on files of such text to perform
analysis, to generate other representations of DSL models (such as graphical syntax, or semantic
representations) and to generate executable code or configuration files, etc. Using UML-RSDS,
such plugins can be themselves written as transformations that use the DSL metamodel as their
source language.
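A plugin that reads such model text first has to recognise which of the three line forms it is looking at. The sketch below shows one way to do this in C. It assumes that the separators are written with single surrounding spaces (" : " and " = ") as in the forms quoted above; the enum and the function classifyLine are hypothetical names, not part of the UML-RSDS tools.

```c
#include <assert.h>
#include <string.h>

enum LineForm { INSTANCE_DECL,    /* object : Entity           */
                FEATURE_VALUE,    /* object.feature = value    */
                FEATURE_MEMBER,   /* object1 : object2.feature */
                UNKNOWN_FORM };

/* Classifies one line of model text by its first separator. */
enum LineForm classifyLine(const char* line)
{ assert(line != NULL);
  const char* eq = strstr(line, " = ");
  const char* colon = strstr(line, " : ");
  /* An assignment wins if " = " comes first, so that a value which
     itself contains " : " is not misread. */
  if (eq != NULL && (colon == NULL || eq < colon))
  { return FEATURE_VALUE; }
  if (colon != NULL)
  { /* A dot on the right-hand side marks object2.feature. */
    return (strchr(colon + 3, '.') != NULL) ? FEATURE_MEMBER : INSTANCE_DECL;
  }
  return UNKNOWN_FORM;
}
```

For example, `a1 : A` is classified as an instance declaration, `a1.x = 5` as a feature value and `a2 : a1.neighbours` as a membership line.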
10 Conclusions
We have shown that a systematic MDD process including requirements engineering and agile
practices can be beneficial for practical MT development. In addition, we have shown that it is
feasible to use a declarative and semantically simple specification approach (using OCL without
null or undefined values) to define a substantial system.
This case study is the largest transformation which has been developed using UML-RSDS, in
terms of the number of rules (of the order of 250 rules/operations in 5 subtransformations). By
using a systematic requirements engineering and agile development approach, we were able to ef-
fectively modularise the transformation and to organise its structure and manage its requirements.
Despite the complexity of the transformation, it was possible to use patterns to enforce bx and
other properties, and to effectively prove these properties. The translator has been incorporated
into the UML-RSDS tools version 1.7.
References
[1] E. Batot, H. Sahraoui, E. Syriani, P. Molins, W. Sboui, Systematic mapping study of model transfor-
mations for concrete problems, Modelsward 2016, pp. 176–183.
[2] M. Funk, A. Nysen, H. Lichter, From UML to ANSI-C: an Eclipse-based code generation framework,
RWTH, 2007.
[3] K. Lano, S. Kolahdouz-Rahimi, Model-transformation Design Patterns, IEEE Transactions on
Software Engineering, vol 40, 2014.
[4] K. Lano, H. Alfraihi, S. Yassipour-Tehrani, H. Haughton, Improving the Application of Agile Model-
based Development: Experiences from Case Studies, ICSEA 2015.
[5] S. Yassipour-Tehrani, K. Lano, S. Zschaler, Requirements engineering in MT development, ICMT
2016.
[6] K. Lano, S. Yassipour-Tehrani, Verified bidirectional transformations by construction, VOLT ’16,
MODELS 2016.
[7] K. Lano, Agile Model-based Development using UML-RSDS, Taylor and Francis, 2016.
[8] K. Lano, The UML-RSDS Manual, 2017.
[9] MIRA Ltd., MISRA-C:2004 Guidelines for the use of the C language in critical systems, 2004.
[10] M. Nakicenovic, Agile driven architecture modernization to a MDD solution, IJAS, vol. 5, 2012.
[11] OMG, Semantics of Business vocabulary rules (SBVR), Version 1.2 (2013),
www.omg.org/spec/SBVR/1.2/PDF.
[12] OMG, Semantics of a Foundational Subset for Executable UML Models (FUML), v1.1, 2015.
[13] K. Schwaber, M. Beedle, Agile software development with Scrum, Pearson, 2012.
[14] G. Selim, S. Wang, J. Cordy, J. Dingel, Model transformations for migrating legacy deployment models
in the automotive industry, SoSyM (2015), 14: 365–381.
[15] E. Willink, An extensible OCL virtual machine and code generator, OCL ’12, 2012.
C element e             | Semantic denotation e’
int                     | Range INT_MIN..INT_MAX for particular implementation, denoted INT
long                    | LONG_MIN..LONG_MAX, denoted LONG
double                  | Appropriate floating-point representation, eg., IEEE 754
struct E*               | Domain OBJ of object identifiers
void*                   | OBJ
char*                   | seq(INT)
struct E**              | seq(OBJ)
e_instances             | Set es of existing E instances, es ⊆ OBJ
Member T f; of struct E | Map f_E : es → T′
We assume that malloc and calloc always succeed and allocate areas of memory which are not already
used. Semantically, these operations add elements to certain es sets, and OBJ must be of sufficient size
that this is always possible. We assume that two literal strings are never placed in overlapping areas of
memory.
To prove semantic preservation, we need to show that SemUML (e) = SemC (c(e)) for each OCL ex-
pression e in a specification, where SemUML is the semantic denotation of OCL given in [8], c(e) is the
translation of e by UML2C, and SemC is the semantic representation of C expressions. We assume that
e is well-defined, ie., that Def (e) and Det(e) hold [8]. We can consider expressions case by case; only
selected examples will be given here. c(e) is CExpression[e.expId ] for e : Expression.
For numeric expressions, the main issues are datatype sizes and agreement of the specified and im-
plemented numeric operators. Both int and INT should be the domain of 32-bit signed integers, and
both long and LONG should be 64-bit signed integers. In special cases alternative datatypes could be
used, but then the specifier must adopt the same types as used in the specific implementation, and adapt
definitions of definedness appropriately. The operators +, *, - at specification and implementation levels
agree using the conventional definitions, provided the result is in the appropriate numeric domain. The
operator / on integers should use the ‘round towards zero’ convention, ie., -5/3 is -1, not -2. Likewise, %
should have the same sign as its first argument. Real numbers should be interpreted by IEEE standard
754 double-precision floating point numbers at specification and implementation levels.
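The division and remainder conventions can be pinned down in code. C99 and later define integer / as truncating towards zero and give % the sign of its first operand, whereas C89 left both implementation-defined for negative operands. The standard library function div truncates towards zero in either case. The sketch below uses div; the wrapper names specDiv and specMod are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Integer division with the 'round towards zero' convention. */
int specDiv(int a, int b)
{ assert(b != 0);
  return div(a, b).quot;
}

/* Remainder with the same sign as the first argument. */
int specMod(int a, int b)
{ assert(b != 0);
  return div(a, b).rem;
}
```

For any b other than 0 the two results satisfy a == specDiv(a, b) * b + specMod(a, b), for example -5 == (-1) * 3 + (-2).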
For string values, SemC (str ) is the sequence of characters up to but not including the first 0 value.
Thus for literal strings, SemUML (str ) = SemC (c(str )) since the mapping to C inserts a 0 character at the
end of the OCL string, which cannot itself contain 0. We also need to show, for each unary operator
op on strings, that
SemUML (str ) = SemC (c(str )) ⇒ SemUML (str →op()) = SemC (c(str →op()))
and likewise for binary and ternary operators.
If arr is of type struct E**, then SemC (arr ) is the representation of this as a mathematical set or
sequence, and is the collection of elements up to and not including the first NULL element. Likewise for
char* arrays. Thus, if e is Set{}, c(e) is an array arr with arr [0] = NULL, and SemC (arr ) = {}, which
is the same as SemUML (e).
The appendE(col,x) operation in C operates to add x to the end of the collection col, and returns this
collection:
SemC (append(col, x)) = SemC (col) ⌢ [SemC (x)]
Thus by structural induction we can infer that
SemC (c(Sequence{x1, ..., xn})) = SemUML (Sequence{x1, ..., xn})
for any finite sequence of elements. Likewise for insert and other OCL collection operators. The ocl.h
library provides implementations of OCL operators that agree with this semantic interpretation. Note that
for subString, indexOfString, indexOf , insertAtString and at, the OCL numbering convention (indexes
start at 1) is used, not the C convention. For any collection operator op we need to show that:
SemUML (str ) = SemC (c(str )) ⇒ SemUML (str →op()) = SemC (c(str →op()))
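The append equation and the NULL-terminated reading of SemC can be checked against a small C sketch of appendE for a hypothetical entity E. The body below is an assumption for illustration, not the ocl.h code: it grows the array with realloc, so col must be NULL or a pointer previously returned by appendE.

```c
#include <assert.h>
#include <stdlib.h>

struct E { int id; };

/* Assumed helper: counts the elements before the terminating NULL. */
int length(void* col[])
{ int n = 0;
  while (col != NULL && col[n] != NULL) { n++; }
  return n;
}

/* Hypothetical body: adds x at the end of col, keeps the NULL terminator,
   and returns the (possibly moved) collection. */
struct E** appendE(struct E* col[], struct E* x)
{ assert(x != NULL);              /* NULL is reserved as the end marker */
  int n = length((void**) col);
  struct E** res = realloc(col, (n + 2) * sizeof(struct E*));
  assert(res != NULL);            /* allocation is assumed to succeed */
  res[n] = x;
  res[n + 1] = NULL;
  return res;
}
```

Starting from an empty collection and appending x1, ..., xn in turn yields an array whose elements before the first NULL are exactly x1, ..., xn, which is the induction step used above.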