Thursday, March 20, 2008

Textual Editing Framework Tutorial

It's done: the first major release of our Textual Editing Framework is out. After watching all the TEF screencasts, you can finally try it yourself. We provide a small tutorial, based on plug-in extension templates, that allows you to create your own first textual editor in seconds. For all of you who always wanted a textual notation and a proper textual editor for languages based on existing meta-models: check it out.



A typical workspace for textual editor development with TEF.

Tuesday, March 18, 2008

Combining Graphical and Textual Model Editing

This is a small video presentation in which I explain how graphical and textual modeling can be used in combination. It is based on TEF and GMF. We simply put TEF editors for single model elements into small overlay windows. These embedded textual editors can be opened from the graphical host editor. The embedded textual editors then edit a sub-model within the model managed by the graphical host editor.


Click the image, or simply download the video as Xvid-encoded AVI. This will be part of our next release at the end of March 2008.

Wednesday, February 20, 2008

What Does It Take to Automatically Resolve Names in Programs?

A big part of a programming language's static semantics (and that's true for all textual languages defined with a context-free grammar) deals with the resolution of names. Best example: you declare a variable, giving it a name; you later use that variable by referencing the declaration, using that name. How can one describe this part of a language's static semantics so that automatically generated editors/compilers can resolve names automatically?

The identity of an element is a value that uniquely identifies this element within a program. The language defines a function (1) that assigns an identity to each element of a language instance. Example identities are the element itself, the name of the element, or more complex constructs like fully qualified names. An identifier is a value used to identify a model element based on the element's identity. In simple languages, identifiers can often be used directly to identify elements: if an identifier and the identity of an element are the same, this identifier identifies this element. Many languages, however, require more complex identification, including name spaces, name hiding, imports, etc. In those cases, an identifier depends on the context it is used in. The language must define a function (2) that assigns a set of possible global identifiers to an identifier and its context. These global identifiers are then used to find an element with an identity that matches one of those global identifiers.

When we describe a language with meta-modeling, all model elements are objects of given meta-classes. Identification for a specific language can easily be programmed by implementing a simple interface (here based on EMF):
  1. public Object getIdentitiy(EObject object);
  2. public Object[] getGlobalIdentities(Object identifier, EObject context);
With this interface, implemented by language engineers for their languages, our Textual Editing Framework TEF can simply customize name resolution and aspects of code completion for languages with specific identification mechanisms.
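
As a minimal example, assume a very simple language where every named element carries a "name" attribute and identifiers are valid globally, regardless of context. An implementation of the interface could then look roughly like this (the class name and the "name" feature are illustrative assumptions, not part of TEF):

  import org.eclipse.emf.ecore.EObject;
  import org.eclipse.emf.ecore.EStructuralFeature;

  // Illustrative sketch for a flat language without name spaces or imports.
  public class SimpleNameIdentification {

      // (1) the identity of an element is the value of its "name" attribute
      public Object getIdentitiy(EObject object) {
          EStructuralFeature nameFeature =
                  object.eClass().getEStructuralFeature("name");
          return nameFeature == null ? null : object.eGet(nameFeature);
      }

      // (2) without name spaces, the identifier itself is the only possible
      //     global identifier, independent of the context it is used in
      public Object[] getGlobalIdentities(Object identifier, EObject context) {
          return new Object[] { identifier };
      }
  }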

In the future, we will try to use this interface to implement an identification scheme for a language with a complex, name-space based identification mechanism, and see if it really is enough to describe identification sufficiently.

Monday, February 18, 2008

Textual Modelling and More Than Text Editors

On March 20, 2008, we will release the first major version of our Textual Editing Framework. We take this as an opportunity to talk about existing frameworks and new ways of using text editors for editing models.

"Text" Editors for Domain Specific Languages

There are several projects that try to ease the pain of creating text editors for your textual (domain specific) (modeling) language. Based on the Eclipse platform, these projects have created frameworks that allow you to describe a textual notation and automatically generate a "text" editor from this description. Some of these projects are:
What all of these frameworks have in common is that they create editors that allow users to edit text files. Consequently, these generated editors are text editors.

Beyond Text Editors

Text editors, nothing wrong with that. But to use a file edited with such an editor, it has to be transformed into a model first. So why do we use text files with concrete syntax as artifacts to store models? The reasons are manifold, but reviewed carefully, most of them are doomed to become obsolete. Storing models, programs, whatever, in concrete syntax is still the favored way of persistence, because accepted storage and versioning systems are tailored for that. CVS, SVN and most of the commercial products all rely on text-file comparison. Text files are also preferred when it comes to migrating sources from one language version to another. Taking models from one meta-model version to the next is still a risky task. And even if it is not easier with concrete syntax, we are at least used to migrating concrete-syntax files with aged but proven technology like regular expressions.

Anyway, all these problems that hold us back from persisting models as models rather than as text are about to be solved. Projects like EMF Compare or Teneo allow comparing and storing models as models. And there are similar projects behind closed doors at SAP or IKV (just the two I happen to know). Once this technology becomes more accepted, the need for "text" editors that edit models and not text will increase.

TEF's Three Editor Kinds

To promote the idea of text editors that work on model (XMI) files, we integrated support for different types of editors into our own framework for textual modeling. In TEF we distinguish three different editor kinds, for three different purposes. Text editors are the usual editors that edit the textual model representation and not the model. Textual model editors are actual model editors: when loaded, they create a textual representation for the edited model, the user can change it, and when stored, a new model is created from the user-changed text representation. The third editor kind allows editing a partial model. These editors can be embedded into already existing model editors (host editors), like graphical or tree-based editors. They show in a small overlay window and edit only a part of the host editor's model.

The last two editor kinds require creating an initial textual representation for a model. This means they have to pretty-print the model. To do that, they have to choose the right representation when multiple representations are possible for the same model constructs, and they have to generate white space (layout information), which is not part of the model. TEF solves both problems by allowing language engineers to add corresponding clues to the language notation. The language engineer can prioritize notation elements to favor one of multiple possible representations, and can add white-space roles to the notation that allow editors to intelligently create breaks, spaces and indentations.
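
As a rough sketch of these two kinds of clues (the class, method, and role names below are purely illustrative and not TEF's actual notation API), a pretty printer needs to pick among alternative notations by priority and turn white-space roles into generated layout:

  import java.util.List;

  // Illustrative sketch only; not TEF's actual API.
  class PrettyPrinterSketch {

      // a notation alternative carries a priority set by the language engineer
      static class Notation {
          final String rule;
          final int priority;
          Notation(String rule, int priority) { this.rule = rule; this.priority = priority; }
      }

      // when several notations can represent the same model construct,
      // pick the one the language engineer favored
      static Notation choose(List<Notation> alternatives) {
          Notation best = null;
          for (Notation candidate : alternatives) {
              if (best == null || candidate.priority > best.priority) {
                  best = candidate;
              }
          }
          return best;
      }

      // white-space roles describe the layout to generate between tokens,
      // since layout information is not part of the model
      enum WhitespaceRole { NONE, SPACE, LINE_BREAK, INDENT }

      static String layout(WhitespaceRole role, int indentLevel) {
          switch (role) {
              case SPACE: return " ";
              case LINE_BREAK: return "\n";
              case INDENT: {
                  StringBuilder result = new StringBuilder("\n");
                  for (int i = 0; i < indentLevel; i++) result.append('\t');
                  return result.toString();
              }
              default: return "";
          }
      }
  }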

to be continued ...

Upcoming posts will discuss pretty-printing and textual modeling embedded in graphical editing in more detail. We also want to talk about identifiers and references, which are still causing pain in all frameworks. We will keep you posted about TEF marching towards its first release.

Friday, October 19, 2007

Textual Editing Framework - First Release and Tutorial

It took me quite a bit longer than expected, but finally there is something that every one of you can try. There is now an Eclipse update site to install a pretty early but functional version of TEF, and a tutorial that helps you with the first steps. If you're interested, everything is laid out in great detail on our TEF web-site.

Thursday, August 16, 2007

About the Who's Who in Domain Specific Languages

In this post we try to get a different perspective on what domain specific development/languages can be by looking at the roles involved.

Roles and stakeholders in traditional software engineering are pretty simple. We create and use software. You have the software developer and you have the software user. One is the technical genius that creates the product, the other simply uses it.

In software development with domain specific languages, things become more complicated. Since we create languages, use languages to create software, and use software, we need more than two roles. Now we have the language developer, the language user/software developer, and the user. The language developer still needs to be a technical genius; he creates the DSLs that make all the domain concepts usable for anybody. Then, we have the language user; he is either a technician or a domain expert. The user stays the user, simply using the end product.

Who is the language user/software developer/domain expert? First we should separate all the roles into two different sets. The competence perspective: computer scientist, domain expert, user. The computer scientist knows about platforms, programming, software. The domain expert knows his domain; he knows about concepts and applications. The user knows nothing, except that he needs this software. The other role set comes from the language perspective: language engineer, language user, user.

Now we look at different scenarios with different individuals and distribute the roles from both sets to the people involved:
  • Software development with specialised languages: DSLs are made for more efficient software development. The domain is software development. We have the traditional role allocation. The software developer is computer scientist and domain expert at the same time. Software developers play both roles, language engineer and language user. The software user is the user. Example DSLs are Makefiles, JET, JSP, etc.
  • Domain specific development: DSLs are made to integrate the user into the software engineering process. The computer scientist constrains himself to his core expertise. He simply takes the domain concepts, even though he has no idea how to use them, and "computerizes" them: binds them to a specific system platform, builds some GUI around them, etc. The computer scientist is not a software developer anymore. The user is the one who knows the domain, who knows how to use the concepts. Software user and domain expert are the same stakeholder. The user is also the language user and software user. Example DSLs are made for simulations in sciences like physics, biology, chemistry, etc., or DSLs to create specific forms of data, like in geo-information systems or gene databases.
  • DSL based software engineering: Here we have three stakeholders. The computer scientist is language engineer. The domain expert is software developer and language user. The user is the software user. An example is the famous traffic light control language example.

Tuesday, July 3, 2007

From What We Code Complete

Code completion is an important part of modern development environments for textual languages. Over the past decades, the proposals offered to the user during code completion have evolved from simple hippo completion (each word in the opened document is used as a proposal) to sophisticated proposals rooted in the semantics of the language used. But when we say code completion is based on language semantics, what semantics is suitable to provide code completion? For which languages can we or can we not offer code completion?

The "Normal" Case, e.g. A Single Java File

In any case, and therefore in the normal case, we need a more sophisticated model of the written text than the text itself. The edited text has to be parsed and transformed into a model that allows semantic analysis of the text. One possibility is to use a parse tree and a library that allows resolving variables, types, basically everything that can be referenced and is therefore a possible subject for code completion. Based on that, a single edited file can be parsed and the gathered information can be used for code-completion proposals within that file.
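
A minimal sketch of this idea (the node interfaces below are illustrative and not tied to any particular parser library): starting from the node under the cursor, walk outwards through the enclosing scopes of the parse tree and collect every declaration found on the way as a proposal.

  import java.util.ArrayList;
  import java.util.List;

  // Illustrative sketch of single-file, scope-based completion proposals.
  class SingleFileCompletion {

      interface Node {
          Node getParent();
          List<Node> getChildren();
      }

      interface Declaration extends Node {
          String getName();
      }

      // collect all declarations visible from the node under the cursor
      static List<String> proposalsAt(Node cursorNode) {
          List<String> proposals = new ArrayList<String>();
          for (Node scope = cursorNode; scope != null; scope = scope.getParent()) {
              for (Node child : scope.getChildren()) {
                  if (child instanceof Declaration) {
                      proposals.add(((Declaration) child).getName());
                  }
              }
          }
          return proposals;
      }
  }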

Using a Model That Spans Multiple Sources, e.g. A Java Project

The same technique as in the normal case, but we don't use just the model of one file. All the models of all the files organized in a project are combined. This means not only all the source files, but also names and types from library files or compiled object files, etc.
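
As a sketch (all names are illustrative), such a project-wide model can be pictured as an index that merges the names contributed by every source file, library, or compiled object file:

  import java.util.ArrayList;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  // Illustrative sketch of a project-wide symbol index for completion proposals.
  class ProjectIndex {

      private final Map<String, List<String>> namesPerSource =
              new HashMap<String, List<String>>();

      // register the names contributed by one source file or library entry
      void addSource(String sourceName, List<String> declaredNames) {
          namesPerSource.put(sourceName, declaredNames);
      }

      // completion proposals are the union of all registered names
      List<String> allProposals() {
          List<String> proposals = new ArrayList<String>();
          for (List<String> names : namesPerSource.values()) {
              proposals.addAll(names);
          }
          return proposals;
      }
  }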

Using Context Information from External Models, e.g. OCL

Some languages, like OCL, do not only use the names and types that are defined using the language itself. OCL, for example, is used to define expressions over models. It therefore references into these models using names from these models. Code completion in OCL often means proposing the names of properties and operations from external models.
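
For an Ecore-based context model, such proposals could, as a sketch, be collected from the structural features and operations of the expression's context class (illustrative code, not the API of any actual OCL tool):

  import java.util.ArrayList;
  import java.util.List;
  import org.eclipse.emf.ecore.EClass;
  import org.eclipse.emf.ecore.EOperation;
  import org.eclipse.emf.ecore.EStructuralFeature;

  // Illustrative sketch: proposals come from the external model, not the OCL text.
  class ExternalModelCompletion {

      static List<String> proposalsFor(EClass contextClass) {
          List<String> proposals = new ArrayList<String>();
          for (EStructuralFeature feature : contextClass.getEAllStructuralFeatures()) {
              proposals.add(feature.getName());
          }
          for (EOperation operation : contextClass.getEAllOperations()) {
              proposals.add(operation.getName() + "()");
          }
          return proposals;
      }
  }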

Using Context Information from a Runtime System, e.g. Python

Some languages have only weak or no static types. This means that when you edit a program in such a language before the program runs, you don't exactly know what the types of your variables, parameters, etc. are. And with no information about the types, you don't know the properties, operations, or whatever features the values in your variables might provide. In such cases, the development environment has to allow editing the program at runtime. So you run your program, stop the execution at some point, and then start to edit your program. Now you have a completely different situation: since you are at runtime, the environment can read the types from the current values of variables, etc. From this runtime context an editor can provide code completion using runtime types.
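
As an illustration of the idea (using Java reflection as a stand-in for the runtime introspection of a dynamically typed language like Python): once execution is suspended, the editor can ask the value currently held by a variable for its runtime type and propose that type's features.

  import java.lang.reflect.Field;
  import java.lang.reflect.Method;
  import java.util.ArrayList;
  import java.util.List;

  // Illustrative sketch of completion proposals taken from a runtime value.
  class RuntimeCompletion {

      static List<String> proposalsFor(Object runtimeValue) {
          List<String> proposals = new ArrayList<String>();
          Class<?> runtimeType = runtimeValue.getClass();
          for (Field field : runtimeType.getFields()) {
              proposals.add(field.getName());
          }
          for (Method method : runtimeType.getMethods()) {
              proposals.add(method.getName() + "()");
          }
          return proposals;
      }
  }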

Conclusion

Depending on the language semantics, the sources for code completion proposals vary. This, of course, can make developing code completion more or less challenging. But you probably have to search really hard to find a language that would not allow any intelligent code completion at all.