So, I’ve Created a Programming Language!

Come on, you knew it would happen, and it has. I’ve created CMPL, a modern programming language which aims to implement some of my ideas.

Many of my friends can now feel justified and validated in their assessment of me.


The project is still very much newborn, I hope not stillborn. The current state of CMPL represents a couple of weekends’ worth of work.

CMPL will be a fully fledged Object Oriented programming language. It will not be virtual machine based and will compile to native, via C.

The CMPL compiler is implemented in Java, split into three subsystems: parser, analyser and compilation backends. The syntax is defined as a JavaCC grammar, which is used to generate the parser. The parser constructs an abstract syntax tree which is used throughout the compiler. The parser is the most complete part of the project at the moment and where most of the effort is being spent.

The analyser implements all the complex compilation rules which are not covered by the parser, such as type resolution and name checking: all the dull stuff.

CMPL supports compilation to many targets, be it C, Java or SQL, and therefore has a number of compilation backends. These take the analysed abstract syntax tree and output compiled code for whatever the target is.
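
The parser, analyser and backend split described above could be sketched in Java roughly as follows. This is only an illustration under assumptions: `Node`, `Backend`, `CBackend`, `JavaBackend` and `Pipeline` are hypothetical names, the only node kind is a made-up "print", and no string escaping is attempted. It is not the actual CMPL compiler code.

```java
import java.util.List;

// Hypothetical AST node: a kind plus a text payload (not CMPL's real AST).
final class Node {
    final String kind;
    final String text;

    Node(String kind, String text) {
        this.kind = kind;
        this.text = text;
    }
}

// A backend turns an analysed tree into source text for one target.
interface Backend {
    String emit(List<Node> tree);
}

// Toy C backend: emits one puts() call per node.
final class CBackend implements Backend {
    @Override
    public String emit(List<Node> tree) {
        StringBuilder out = new StringBuilder("#include <stdio.h>\n\nint main(void)\n{\n");
        for (Node node : tree) {
            out.append("    puts(\"").append(node.text).append("\");\n");
        }
        return out.append("    return 0;\n}\n").toString();
    }
}

// Toy Java backend for the same tree.
final class JavaBackend implements Backend {
    @Override
    public String emit(List<Node> tree) {
        StringBuilder out = new StringBuilder(
            "public class Main {\n    public static void main(String[] args) {\n");
        for (Node node : tree) {
            out.append("        System.out.println(\"").append(node.text).append("\");\n");
        }
        return out.append("    }\n}\n").toString();
    }
}

final class Pipeline {
    // Stand-in for the analyser: rejects any node kind it does not know.
    static List<Node> analyse(List<Node> tree) {
        for (Node node : tree) {
            if (!"print".equals(node.kind)) {
                throw new IllegalArgumentException("unknown node kind: " + node.kind);
            }
        }
        return tree;
    }

    // Analyse once, then hand the tree to whichever backend was chosen.
    static String compile(List<Node> parsed, Backend backend) {
        return backend.emit(analyse(parsed));
    }

    public static void main(String[] args) {
        List<Node> tree = List.of(new Node("print", "Hello World"));
        System.out.print(compile(tree, new CBackend()));
        System.out.print(compile(tree, new JavaBackend()));
    }
}
```

The point of the shape is that the analysed tree is target-neutral, so adding a target means adding one more `Backend` implementation and nothing else.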

I guess the only question left is: why?

and for the adventurous

Hello World

Why create another programming language?

Many people will ask why I decided to create yet another programming language. This is a complicated question to answer. To some extent I created CMPL for fun: it is a chance to learn, and a chance to expand upon a number of ideas from existing Intrbiz projects.

Another part of the equation is to provide features I consider lacking from other languages. The smart approach here would be to work with an existing language. However, if truth be known, I’m far too much of a control freak for that. What I want to achieve with CMPL is also different from the goals of existing languages. Above all, I want CMPL to be simple.

Another part is wanting to implement some ideas discussed at University, after many bottles of wine with two of my closest friends.

What do I want to achieve with CMPL?

Foremost, simplicity.  I believe that the most important part of any programming language is a simple, clean syntax, where the rules are unambiguous and strictly defined.  This is something I like about Java: it has a clean syntax which is easy to read, and a simple object hierarchy.

Open.  Too many programming languages are closed, and recent events demonstrate the fragility of not having any control over your assets.  Oracle’s purchase of Sun, and with it Java, highlighted this.  I think this is also why, after roughly 40 years, C is still prominent.  I struggle to find an alternative I like: .NET is bloated and controlled by a dictatorship.  Python is interesting but has too many oddities.  Perl is nice but dated.

Strong, static typing.  I hate loosely typed languages, PHP being a case in point, it’s just fucked.  I’m also not a fan of dynamic typing: while it is quick to ‘prototype’ in these languages, I think it is difficult to write robust code.  I like knowing the type of something when I’m coding rather than having to guess all the time.  I also like to be able to glance at code and not have to think too much about what type something is, freeing me up to focus on the interesting points.

Top level functions.  I prefer Object Oriented programming to procedural.  However, there are many use cases where all you need is to parcel up some functionality as utility methods.  These inevitably become static methods on a utility class, which tend to create more problems than they solve.  Simple solution: let’s define a function just as we would a class.

Composition.  I like the simplicity of the Java OO model, however it lacks one major feature: composition.  I’ve been in many situations where I simply want to compose a class from two other classes.  In Perl (with Moose) this is really well done with Roles.  Yet it is important to prevent the full-on madness which is multiple inheritance.

Simple data types.  There are times where all you need is a set of properties.  Defining a full-on class is overboard.  The ability to quickly define common data structures like Lists and Maps makes life so much easier.

Compiled.  I’ve had enough of Virtual Machines.  The whole write once, run anywhere idea never fully worked and fails to take advantage of the features that differentiate platforms.

Native integration.  Sometimes I need to branch out into native code, generally C.  I want this to be easy: I want to just intermix CMPL with native code.

Multiple targets.  CMPL is essentially a cross compiler.  I want support for a number of different targets.

Contextual memory allocation.  I spend a lot of time writing web applications.  These tend to generate a number of objects with very short lifetimes.  I implicitly know when I can throw these away.  I want to allocate a slab of memory and then throw it all away at once, rather than waiting for each object to individually be collected.

Data validation.  The biggest flaw in most applications is the inability to validate input data.  This needs to be a first-class construct.

Data domains.  One fantastic feature of databases is domains: the ability to place constraints upon a data type to derive another data type.

Panorama

I’ve played around with some of the OS OpenData mapping data. I have the majority of it loaded on my PostgreSQL server (damn it, it’s broken). A while back I loaded the Land-Form PANORAMA data.

This data is presented in a rather unhelpful file format: DXF. I found a Java parser for DXF and, armed with no knowledge of DXF, managed to extract some data. I transformed the data into line-strings and dumped them into PostgreSQL with PostGIS.

I then used Mapnik to render the data. I was struck by the simplicity and elegance of the result.

I will release all the code behind it as soon as I get a chance.

CMPL Annotations

CMPL annotations allow metadata to be placed onto types, domains, interfaces, roles, classes, functions, methods and attributes. CMPL annotations have no type and do not need to be defined; they are essentially hashes. CMPL annotations can only contain information which can be written as a literal in the language, that is: numbers, strings, booleans, hashes, lists and null. CMPL annotations are available at runtime, along with other class metadata. CMPL supports as many annotations as you wish and even allows multiple annotations of the same name.

Annotations are placed before the modifiers and start with an ‘@’ symbol, followed by the annotation name. The annotation data then follows, enclosed in brackets, as a single value or a set of name-value pairs.

@Note("Some message of interest")
@Searchable(["title","description"])
@Deprecated(since: "1.5.0", because: "It was poorly thought out.")
@Foo(bar: [ { name: "Value" }, { name: "Value" } ])

As an example, we could annotate a type with extra information required to generate a database table:

@SQLTable(schema: "blog", name: "post")
@SQLType(schema: "blog", name: "t_post", composite: true)
@Searchable(on: ["summary", "description", "content"])
public type Post
{
	@Primary() UUID id;
	@Unique()  String name;
	           String summary;
	           String description;
	           String content;
	           Timestamp create;
	           Timestamp modified;
	           Author author;
	           List categories;
}
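
To make the "essentially hashes" idea concrete, here is a rough Java sketch of how untyped, repeatable annotations might look at runtime. It is an assumption-laden illustration, not CMPL's runtime API: `Annotation`, `AnnotationSet`, `add`, `addValue` and `find` are hypothetical names, and storing a single-value annotation under the key "value" is my own guess.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical runtime view of one annotation: a name plus untyped literal data.
final class Annotation {
    final String name;
    final Map<String, Object> data;

    Annotation(String name, Map<String, Object> data) {
        this.name = name;
        this.data = data;
    }
}

// Metadata attached to one element (a type, function, attribute and so on).
final class AnnotationSet {
    private final List<Annotation> annotations = new ArrayList<>();

    // Annotations are never declared, so any name is accepted.
    AnnotationSet add(String name, Map<String, Object> data) {
        annotations.add(new Annotation(name, data));
        return this;
    }

    // Assumption: a single-value annotation is stored under the key "value".
    AnnotationSet addValue(String name, Object value) {
        return add(name, Map.of("value", value));
    }

    // Every annotation with the given name, in declaration order.
    List<Annotation> find(String name) {
        List<Annotation> found = new ArrayList<>();
        for (Annotation annotation : annotations) {
            if (annotation.name.equals(name)) {
                found.add(annotation);
            }
        }
        return found;
    }
}

final class AnnotationDemo {
    public static void main(String[] args) {
        AnnotationSet post = new AnnotationSet()
            .add("SQLTable", Map.of("schema", "blog", "name", "post"))
            .addValue("Note", "First note")
            .addValue("Note", "Second note");
        System.out.println(post.find("Note").size());
        System.out.println(post.find("SQLTable").get(0).data.get("schema"));
    }
}
```

Because lookups return a list rather than a single entry, the same name can appear more than once without any special handling.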

CMPL Functions

CMPL departs from the convention of many Object Oriented languages by dispensing with static methods. Instead CMPL allows functions to be defined in the same way that classes can be defined.

CMPL functions, like types, exist within packages and can be imported. Functions follow the same conventions as methods: a return type, a name and a list of arguments. CMPL functions can only be defined as public, protected or private. Private functions can only be called within the same file; protected functions can only be called within the same package.

public Int getTheAnswer(String question)
{
    return 42;
}

Functions can be annotated with metadata, just like types, attributes and methods can be.

@Note("But what is the question?")
public Int getTheAnswer(String question)
{
    return 42;
}

CMPL Hello World

To get to grips with CMPL’s syntax, let’s have a look at the de facto Hello World.

package hello.world;

import cmpl.sys.*;

public function void main(List args)
{
    print("Hello World");
}

In a more object-oriented way.

package hello.world;

import cmpl.sys.*;

public class HelloWorld
{
    private String message = "Hello World";

    public void sayHello()
    {
        print(this.message);
    }
}

public function void main(List args)
{
    HelloWorld hello = new HelloWorld();
    hello.sayHello();
}

Let’s mix it up a little.

package hello.world;

import cmpl.sys.*;

public role Talk
{
    public void say( String message )
    {
        print("Hello " + message);
    }
}

public class HelloWorld with Talk
{
    public void sayHello()
    {
        say("World");
    }
}

public function void main(List args)
{
    HelloWorld hello = new HelloWorld();
    hello.sayHello();
}

Basic structure

CMPL source code is stored in plain (UTF-8) text files. These files are organised into an architectural package structure. CMPL allows many ‘elements’ to be stored in one file. No significance is placed on the file name, only on the directory: all ‘elements’ in a file are within the same package. The first line of the file must define the package:

package hello.world;

Next, referenced code is imported. ‘Elements’ can be imported directly, or all ‘elements’ within a package can be imported. Individual imports can alias an ‘element’. CMPL does not permit fully qualified names except with the import keyword.

import cmpl.sys.*;
import cmpl.util.log.Context as LoggingContext;
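
One way to picture these import rules is as a lookup table the compiler could consult when it meets a name. The Java sketch below is a guess at the behaviour, not CMPL's implementation: `ImportTable`, `add` and `resolve` are hypothetical names, `known` stands in for whatever the compiler knows to exist, and I assume an aliased import is reachable only through its alias.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class ImportTable {
    // Name usable in the file (simple name or alias) -> fully qualified name.
    private final Map<String, String> direct = new HashMap<>();
    // Packages imported with a trailing ".*".
    private final List<String> wildcardPackages = new ArrayList<>();

    // Records one import line; pass null as the alias when there is no "as" clause.
    void add(String path, String alias) {
        if (path.endsWith(".*")) {
            wildcardPackages.add(path.substring(0, path.length() - 2));
            return;
        }
        String simpleName = path.substring(path.lastIndexOf('.') + 1);
        direct.put(alias != null ? alias : simpleName, path);
    }

    // Resolves a name used in the file body against the imports.
    Optional<String> resolve(String name, Set<String> known) {
        if (name.contains(".")) {
            // Fully qualified names are only allowed on import lines.
            return Optional.empty();
        }
        String exact = direct.get(name);
        if (exact != null) {
            return Optional.of(exact);
        }
        for (String pkg : wildcardPackages) {
            String candidate = pkg + "." + name;
            if (known.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ImportTable imports = new ImportTable();
        imports.add("cmpl.sys.*", null);
        imports.add("cmpl.util.log.Context", "LoggingContext");
        Set<String> known = Set.of("cmpl.sys.print", "cmpl.util.log.Context");
        System.out.println(imports.resolve("LoggingContext", known));
        System.out.println(imports.resolve("print", known));
    }
}
```

Direct imports are checked before wildcard packages here, which is a design choice of the sketch rather than anything stated about CMPL.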

All programs require an entry point. In CMPL this is a function called main. The main function has a single argument, a List of strings, which are the command line arguments. The main function does not specify a return value; CMPL will by default exit with status 0, and a program can exit with a different status code if need be.

public function void main(List args)
{
    print("Hello World");
    exit(1);
}

Those are the basics. Next, look at:

CMPL Types

CMPL has five key data types which create a flexible Object model with simple rules.

Type

Types are simple data structures: they are simply a set of attributes, much like a C structure or SQL table. Types can only define attributes, and these attributes are always public.

public type Author
{
    UUID   id;
    String name;
    String role;
    String email;
    String website;
    String twitter;
    String content;
}

Domain

Domains are types with validation. A domain is an alias to any CMPL type, be it a class, interface, type or role. Domains have an implicit validation method, which can be invoked to check if a value of the domain is considered valid.

public domain TheAnswer as Number
{
    return value == 42;
}
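
For readers more at home in Java, a domain can be approximated as a base type paired with a validation rule. The sketch below is only an analogy under assumptions: `Domain`, `isValid` and `check` are hypothetical names, and treating null as invalid is my own choice rather than documented CMPL behaviour.

```java
import java.util.function.Predicate;

// A named constraint over an existing type T, similar in spirit to an SQL domain.
final class Domain<T> {
    private final String name;
    private final Predicate<T> rule;

    Domain(String name, Predicate<T> rule) {
        this.name = name;
        this.rule = rule;
    }

    // The validation method: true when the value satisfies the rule.
    // Assumption: null is never a valid value of a domain.
    boolean isValid(T value) {
        return value != null && rule.test(value);
    }

    // Returns the value unchanged, or fails loudly when validation fails.
    T check(T value) {
        if (!isValid(value)) {
            throw new IllegalArgumentException(name + " rejected: " + value);
        }
        return value;
    }
}

final class DomainDemo {
    // Mirrors the TheAnswer domain: an integer that must equal 42.
    static final Domain<Integer> THE_ANSWER = new Domain<>("TheAnswer", v -> v == 42);

    public static void main(String[] args) {
        System.out.println(THE_ANSWER.isValid(42));
        System.out.println(THE_ANSWER.isValid(41));
    }
}
```

In CMPL the rule is part of the type declaration itself, whereas in this Java approximation the caller has to remember to run `check` at the boundary.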

Interface

Interfaces specify methods which must be available on a class. Interfaces can define public methods and constructors. Interfaces can extend many other interfaces.

public interface Logger
{
    void log( Level level, String message );
}

Role

Roles create functionality which can be composed into a class. Roles can define attributes, methods and constructors. Roles can inherit from a parent role, include other roles and implement interfaces. Roles are designed for composition, allowing common functionality to be placed within a role and used by classes.

When roles are composed into other types, the ordering is defined left to right. As such if two roles provide a method of the same name, the role included first will win.

public role CanLog
{
    protected void warn( String message )
    {
        print(ERR, "WARN: " + message);
    }
}
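
The left-to-right rule for conflicting role methods can be illustrated with a small Java sketch that builds a method table. All names here (`Role`, `Composer`, `compose`) are hypothetical, roles are reduced to a name plus a list of method names, and how a class's own methods interact with role methods is deliberately left out because the text above does not say.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A role reduced to its name and the names of the methods it provides.
final class Role {
    final String name;
    final List<String> methods;

    Role(String name, List<String> methods) {
        this.name = name;
        this.methods = methods;
    }
}

final class Composer {
    // Builds a table of method name -> providing role.
    // Roles are visited left to right, and the first provider of a name wins.
    static Map<String, String> compose(List<Role> roles) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Role role : roles) {
            for (String method : role.methods) {
                table.putIfAbsent(method, role.name);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        Role canLog = new Role("CanLog", List.of("warn", "log"));
        Role talk = new Role("Talk", List.of("say", "warn"));
        // Both roles provide warn; CanLog is listed first, so it supplies it.
        System.out.println(compose(List.of(canLog, talk)));
    }
}
```

Swapping the order of the two roles in the list changes which one supplies `warn`, which is the whole of the conflict rule.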

Class

Classes are the workhorse of Object Oriented languages, and CMPL is no exception. Classes can define attributes, methods and constructors. Classes can inherit from a parent class, include many roles and implement many interfaces.

public class Earth with CanLog
{
    public Earth()
    {
    }

    public TheAnswer answerTheQuestion()
    {
        warn("But what is the question?");
        return 42;
    }
}