Code Generation

As hand-writing code for a lot of drivers in multiple languages would be quite a nightmare, we have invested a very large amount of time into finding a way to automate this.

So in the end we need 3 parts:

  1. Protocol definition

  2. Language template

  3. A maven plugin which generates the code

This maven plugin uses a given protocol definition as well as a language template and generates code for reading/writing data in that protocol with the given language.

code-generation-intro

The Types Base module provides all the structures the Protocol modules output which are then used in the Language templates to generate code.

Protocol Base and Language Base hereby just provide the interfaces that reference these types and provide the API for the plc4x-maven-plugin to use.

These modules are also maintained in a repository which is separate from the rest of the PLC4X code.

This is due to some restrictions in the Maven build system. If you are interested in understanding the reasons - please read the chapter on Problems with Maven near the end of this page.

Concrete protocol spec parsers and templates that actually generate code are implemented in derived modules.

We didn’t want to tie ourselves to only one way to specify protocols and to generate code. Generally multiple types of formats for specifying drivers are thinkable and the same way multiple ways of generating code are possible. Currently however we only have one parser: MSpec and one generator: Freemarker.

These add more layers to the hierarchy.

So for example in case of generating a Siemens S7 Driver for Java this would look like this:

code-generation-intro-s7-java

The dark blue parts are the ones released externally, the turquoise ones are part of the main PLC4X repo.

Introduction

The maven plugin is built up very modular.

So in general it is possible to add new forms of providing protocol definitions as well as language templates.

For the formats of specifying a protocol we have tried out numerous tools and frameworks, however the results were never quite satisfying.

Usually using them required a large amount of workarounds, which made the solution quite complicated.

In the end only DFDL and the corresponding Apache project Apache Daffodil seemed to provide what we were looking for.

With this we were able to provide first driver versions fully specified in XML.

The downside was, that the PLC4X community regarded this XML format as pretty complicated and when implementing an experimental code generator we quickly noticed that generating a nice object model would not be possible, due to the lack ability to model the inheritance of types in DFDL.

In the end we came up with our own solution which we called MSpec and is described in the MSpec Format description.

Configuration

The plc4x-maven-plugin has a very limited set of configuration options.

In general all you need to specify, is the protocolName and the languageName.

An additional option outputFlavor allows generating multiple versions of a driver for a given language. This can come in handy if we want to be able to generate read-only or passive mode driver variants.

Last, not least, we have a pretty generic options config option, which is a Map type.

With options is it possible to pass generic options to the code-generation. So if a driver or language requires further customization, these options can be used.

Currently, the Java module makes use of such an option for specifying the Java package the generated code uses. If no package option is provided, the default package org.apache.plc4x.{language-name}.{protocol-name}.{output-flavor} is used, but especially when generating custom drivers, which are not part of the Apache PLC4X project, different package names are better suited. So in these cases, the user can simply override the default package name.

There is also an additional parameter: outputDir, which defaults to ${project.build.directory}/generated-sources/plc4x/ and usually shouldn’t require being changed in case of a Java project, but usually requires tweaking when generating code for other languages.

Here’s an example of a driver pom for building a S7 driver for java:

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

      https://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied.  See the License for the
  specific language governing permissions and limitations
  under the License.
  -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>org.apache.plc4x.plugins</groupId>
    <artifactId>plc4x-code-generation</artifactId>
    <version>0.6.0-SNAPSHOT</version>
  </parent>

  <artifactId>test-java-s7-driver</artifactId>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.plc4x.plugins</groupId>
        <artifactId>plc4x-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>test</id>
            <phase>generate-sources</phase>
            <goals>
              <goal>generate-driver</goal>
            </goals>
            <configuration>
              <protocolName>s7</protocolName>
              <languageName>java</languageName>
              <outputFlavor>read-write</outputFlavor>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <dependencies>
    <dependency>
      <groupId>org.apache.plc4x.plugins</groupId>
      <artifactId>plc4x-code-generation-driver-base-java</artifactId>
      <version>0.6.0-SNAPSHOT</version>
    </dependency>

    <dependency>
      <groupId>org.apache.plc4x.plugins</groupId>
      <artifactId>plc4x-code-generation-language-java</artifactId>
      <version>0.6.0-SNAPSHOT</version>
      <!-- Scope is 'provided' as this way it's not shipped with the driver -->
      <scope>provided</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.plc4x.plugins</groupId>
      <artifactId>plc4x-code-generation-protocol-s7</artifactId>
      <version>0.6.0-SNAPSHOT</version>
      <!-- Scope is 'provided' as this way it's not shipped with the driver -->
      <scope>provided</scope>
    </dependency>
  </dependencies>

</project>

So the plugin configuration is pretty straight forward, all that is specified, is the protocolName, languageName and the output-flavor.

The dependency:

<dependency>
  <groupId>org.apache.plc4x.plugins</groupId>
  <artifactId>plc4x-code-generation-driver-base-java</artifactId>
  <version>0.6.0-SNAPSHOT</version>
</dependency>

For example contains all classes the generated code relies on.

The definitions of both the s7 protocol and java language are provided by the two dependencies:

<dependency>
  <groupId>org.apache.plc4x.plugins</groupId>
  <artifactId>plc4x-code-generation-language-java</artifactId>
  <version>0.6.0-SNAPSHOT</version>
  <!-- Scope is 'provided' as this way it's not shipped with the driver -->
  <scope>provided</scope>
</dependency>

and:

<dependency>
  <groupId>org.apache.plc4x.plugins</groupId>
  <artifactId>plc4x-code-generation-protocol-s7</artifactId>
  <version>0.6.0-SNAPSHOT</version>
  <!-- Scope is 'provided' as this way it's not shipped with the driver -->
  <scope>provided</scope>
</dependency>

The reason for why the dependencies are added as code-dependencies and why the scope is set the way it is, is described in the Why are the protocol and language dependencies done so strangely? section.

Custom Modules

The plugin uses the Java Serviceloader mechanism to find modules.

Protocol Modules

In order to provide a new protocol module, all that is required, it so create a module containing a META-INF/services/org.apache.plc4x.plugins.codegenerator.protocol.Protocol file referencing an implementation of the org.apache.plc4x.plugins.codegenerator.protocol.Protocol interface.

This interface is located in the org.apache.plc4x.plugins:plc4x-code-generation-protocol-base module and generally only defines two methods:

package org.apache.plc4x.plugins.codegenerator.protocol;

import org.apache.plc4x.plugins.codegenerator.types.definitions.ComplexTypeDefinition;
import org.apache.plc4x.plugins.codegenerator.types.exceptions.GenerationException;

import java.util.Map;

public interface Protocol {

    /**
     * The name of the protocol what the plugin will use to select the correct protocol module.
     *
     * @return the name of the protocol.
     */
    String getName();

    /**
     * Returns a map of complex type definitions for which code has to be generated.
     *
     * @return the Map of types that need to be generated.
     * @throws GenerationException if anything goes wrong parsing.
     */
    Map<String, TypeDefinition> getTypeDefinitions() throws GenerationException;

}

These implementations could use any form of way to generate the Map of `ComplexTypeDefinition’s. They could even be hard coded.

However, we have currently implemented utilities for universally providing input:

Language Modules

Analog to the Protocol Modules the Language modules are constructed equally.

The Language interface is very simplistic too and is located in the org.apache.plc4x.plugins:plc4x-code-generation-language-base module and generally only defines two methods:

package org.apache.plc4x.plugins.codegenerator.language;

import org.apache.plc4x.plugins.codegenerator.types.definitions.ComplexTypeDefinition;
import org.apache.plc4x.plugins.codegenerator.types.exceptions.GenerationException;

import java.io.File;
import java.util.Map;

public interface LanguageOutput {

    /**
     * The name of the template is what the plugin will use to select the correct language module.
     *
     * @return the name of the template.
     */
    String getName();

    List<String> supportedOutputFlavors();

    /**
     * An additional method which allows generator to have a hint which options are supported by it.
     * This method might be used to improve user experience and warn, if set options are ones generator does not support.
     *
     * @return Set containing names of options this language output can accept.
     */
    Set<String> supportedOptions();

    void generate(File outputDir, String languageName, String protocolName, String outputFlavor,
        Map<String, TypeDefinition> types, Map<String, String> options) throws GenerationException;

}

The file for registering Language modules is located at: META-INF/services/org.apache.plc4x.plugins.codegenerator.language.LanguageOutput

Same as with the protocol modules, the language modules could also be implemented in any thinkable way, however we have already implemented some helpers for using:

Problems with Maven

Why are the 4 modules released separately?

We mentioned in the introduction, that the first 4 modules are maintained and released from outside the main PLC4X repository.

This is due to some restrictions in Maven, which result from the way Maven generally works.

The main problem is that when starting a build, in the validate-phase, Maven goes through the configuration, downloads the plugins and configures these. This means that Maven also tries to download the dependencies of the plugins too.

In case of using a Maven plugin in a project which also produces the maven plugin, this is guaranteed to fail - Especially during releases. While during normal development, Maven will probably just download the latest SNAPSHOT from our Maven repository and be happy with this and not complain that this version will be overwritten later on in the build. It will just use the new version as soon as it has to.

During releases however the release plugin changes the version to a release version and then spawns a build. In this case the build will fail because there is no Plugin with that version to download. In this case the only option would be to manually build and install the plugin in the release version and to re-start the release (Which is not a nice thing for the release manager).

For this reason we have stripped down the plugin and its dependencies to an absolute minimum and have released (or will release) that separately from the rest, hoping due to the minimality of the dependencies that we will not have to do it very often.

As soon as the tooling is released, the version is updated in the PLC4X build and the release version is used without any complications.

Why are the protocol and language dependencies done so strangely?

It would certainly be a lot cleaner, if we provided the modules as plugin dependencies.

However, as we mentioned in the previous sub-chapter, Maven tries to download and configure the plugins prior to running the build. So during a release the new versions of the modules wouldn’t exist, this would cause the build to fail.

We could release the protocol- and the language modules separately too, but we want the language and protocol modules to be part of the project, to not over-complicate things - especially during a release.

So the Maven plugin is built in a way, that it uses the modules dependencies and creates its own Classloader to contain all of these modules at runtime.

This brings the benefit of being able to utilize Maven’s capability of determining the build order and dynamically creating the modules build classpath.

Adding a normal dependency however would make Maven deploy the artifacts with the rest of the modules.

We don’t want that as the modules are useless as soon as they have been used to generate the code.

So we use a trick that is usually used in Web applications, for example: Here the vendor of a Servlet engine is expected to provide an implementation of the Servlet API. It is forbidden for an application to bring this along, but it is required to build the application.

For this the Maven scope provided, which tells Maven to provide it during the build, but to exclude it from any applications it builds, because it will be provided by the system running the application.

This is not quite true, but it does the trick.