Inkling Overview

This reference contains the specification and usage of the Inkling language, Bonsai’s special-purpose programming language for training AI.

Inkling is a declarative, statically-typed language that provides a layer of abstraction between the Inkling programmer and the vast and dynamic set of AI algorithms that require expertise in machine learning. Inkling allows you to focus at a conceptual level on what you want the machine to learn. That is why this model of programming is called machine teaching.

An Inkling file contains the concepts (what and how you want to teach the AI) and the set of types (which describe the data) necessary to train your BRAIN. Training your BRAIN also requires a simulator. Python simulators are introduced in the Quick Start.

The version of Inkling described below replaces an earlier version, which is documented in the Inkling v1 Reference.

Purpose

The purpose of the Inkling Reference is to give you the level of detailed specification you need to write valid Inkling code. The syntax for Inkling statements is available here. This document also contains examples and explanation so that you can understand Inkling better.

Functionality

Use the Inkling Reference for quick lookups while coding and also for deep dives into subjects like type equivalence and range expressions.

Lexical Structure

The lexical structure of Inkling includes these lexical elements:

  • keyword: words that Inkling reserves for its own use
  • operator: mathematical, logical, and grouping operators
  • identifier: user-defined names in Inkling
  • literal: numeric and string literal values
  • comment: user-specified comments

Keywords

Keywords are reserved words in the grammar. Each keyword is a lowercase ASCII string, such as concept, input, or type.

Operators

Operators are symbols that represent mathematical, logical or grouping operations like +, %, (), or [].

Operators Table

<= < == != >= >
{ } [ ] ( )
: + - / % ,
** * . ..

Identifiers

    velocity
    _control_axis_1
    Rotate90Degrees
    `Search & Rescue`
    `input`
    `1 Microsoft Way`
    `\`A \\ B\``

Identifiers are user-defined names of an object, such as a concept or a type. Identifiers start with an alphabetical ASCII character or underscore and contain only alphanumeric characters and underscores.

If an identifier doesn’t follow these rules (e.g. it contains spaces or non-ASCII characters, or starts with a digit) or is a keyword, it must be surrounded by backticks. Within backticks, a backslash escapes a backtick or another backslash.

The use of capitalized identifiers is encouraged to avoid possible collisions with future keywords, which are guaranteed to be lowercase.
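
The quoting rules above can be sketched as a small Python helper. This is an illustration only, not part of any Inkling tooling: the function names are hypothetical, and the keyword set lists only the words this reference explicitly calls keywords, so the real reserved set is likely larger.

```python
import re

# Only the words this reference explicitly calls keywords; the full
# reserved set is an assumption left to the real compiler.
KEYWORDS = {"concept", "input", "type", "const", "graph", "source", "constraint"}

# Starts with an ASCII letter or underscore, then letters, digits, underscores.
_PLAIN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def needs_backticks(name: str) -> bool:
    """True when the name breaks the plain-identifier rules or is a keyword."""
    return name in KEYWORDS or _PLAIN.fullmatch(name) is None


def quote_identifier(name: str) -> str:
    """Return the name as it would be written in source code."""
    if not needs_backticks(name):
        return name
    # Escape backslashes first so the backtick escapes are not doubled.
    escaped = name.replace("\\", "\\\\").replace("`", "\\`")
    return "`" + escaped + "`"


for raw in ["velocity", "_control_axis_1", "input", "1 Microsoft Way"]:
    print(quote_identifier(raw))
```

The first two names pass through unchanged, while the keyword and the name starting with a digit come back wrapped in backticks.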

Literals

Inkling supports string and numeric literals (floating point and integer).

  • String Literals
    "Foo"
    "My \"String\""
    ""

String literals are surrounded by double quote characters. A backslash is used as an escape character.

  • Integer Literals
    +99
    -45
    9999

Integer literals are a string of digits with an optional sign and no decimal point.

  • Floating Point Literals
    12.0
    .5
    13.0f7
    .3f+2

Floating-point literal values use decimal points and optional exponents.

  • Array literals
    [0, 2, 37]

Array literals are surrounded by square brackets and contain a comma-separated list of values. Each value can be specified as an expression. All elements within an array literal must be the same type. Empty (zero-length) array literals are not supported.

  • Structure literals
    { X: 2, Y: 3, Name: "Bob" }

Struct literals are surrounded by curly brackets and contain a comma-separated list of field name/value pairs separated by a colon. The value can be specified as an expression and can even be another struct literal.

Comments

  # This is a comment
  • An Inkling comment begins after the character #. Comments must be on a line by themselves.

Inkling Version

    inkling "2.0"

All Inkling programs must start with an inkling statement that specifies the current language version. This allows the language to evolve without breaking existing programs.

Types

Inkling is a statically-typed language, which means that each variable, constant, and parameter has a well-defined type. The type system in Inkling supports the following basic types:

  • Number - numeric values (integer or floating point)
  • String (future) - an array of zero or more unicode characters
  • Array (future) - a non-zero-length array of same-typed values
  • Structure - a collection of typed fields with unique names
  • Complex - built-in object types for specialized uses (like images)

Type Literals

  number
  string
  number[3][3]
  { A: number, B: string }

Types may be specified as simple type literals. The number and string types simply use the keywords number and string. (Note: Strings are not yet implemented.)

Array types use square brackets surrounding a positive integer that represents the size of the array. Arrays may be nested. (Note: Arrays are not yet implemented.)

Structure types are denoted with curly braces surrounding a comma-separated list of one or more field definitions. Each field definition consists of a field name and a type separated by a colon.

Numeric Constraints

  number<1..2>
  number<0..8 step 3>
  number<0, 3.5, 7>
  number<200>
  200
  number<Left = 0, Straight = 1, Right = 2>
  number<1..100><2..99><50..51>

A number type can express any value compatible with a double-precision floating-point representation. Sometimes it’s useful to constrain a number type to represent smaller ranges or sets of numbers.

A numeric type with a range constraint specifies a minimum and maximum value plus an optional step value, surrounded by angle brackets. For example, number<0..8 step 4> represents a number that can take on the values 0, 4 or 8. If the step value does not evenly divide the range, the upper bound is not included. For example, number<0..8 step 3> includes only the values 0, 3 and 6.
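
To make the step rule concrete, the following Python sketch enumerates the values a stepped range admits. The function name is hypothetical and the small floating-point tolerance is an assumption of this sketch, not something the language specifies.

```python
import math


def range_values(low, high, step):
    """Enumerate the values of number<low..high step step>."""
    if step <= 0 or high < low:
        raise ValueError("expected low <= high and a positive step")
    # Small tolerance so float steps such as 0.01 do not lose the last value.
    count = math.floor((high - low) / step + 1e-9)
    return [low + i * step for i in range(count + 1)]


print(range_values(0, 8, 4))  # [0, 4, 8]
print(range_values(0, 8, 3))  # [0, 3, 6]
```

With a step of 3 the upper bound 8 is never reached, so the last admitted value is 6.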

A numeric type with an ordinal enumerated constraint specifies discrete, unique values in a comma-separated list. The values must be listed in increasing order. For example, number<0, 3.5, 7> is constrained to take on one of those three values.

If an enumerated constraint contains only one value (e.g. number<200>), it can be represented more tersely using the value alone (e.g. 200).

A numeric type with a nominal enumerated constraint specifies unordered symbolic identifiers in a comma-separated list. A unique numeric value must be specified for each identifier. For example, number<Left = 0, Straight = 1, Right = 2> takes on one of three values. Values specified in a nominal enumerated type can be accessed as constants elsewhere in the program by using the type name followed by a dot and the symbolic name of the value (for example, Direction.Left for a type declared with the name Direction).

A constraint can be applied to an already-constrained type if it further narrows the constraints on the type.

Built-in Number Types

  using Number
  Number.UInt8
  Number.Int64
  Number.Float32
  Number.Bool

Inkling supports built-in symbolic names for number types that correspond to integer and floating-point types commonly found in other typed languages. These are shorthands for constrained number types.

These types make use of the built-in namespace Number, which must be referenced somewhere in the program with a using statement.

  • Unsigned integers: UInt8, UInt16, UInt32, UInt64
  • Signed Integers: Int8, Int16, Int32, Int64
  • Floating-point: Float32, Float64
  • Boolean: Bool
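
As a rough guide to what the integer shorthands constrain, the Python sketch below computes the conventional bounds for each width. Only the UInt8 range (0 to 255) is stated in this reference; the other bounds assume the usual two's-complement conventions, and the printed constrained-type spelling is illustrative rather than confirmed syntax.

```python
def int_bounds(bits: int, signed: bool):
    """Inclusive (min, max) for a fixed-width integer, two's complement assumed."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


BUILTIN_INT_RANGES = {
    f"{'Int' if signed else 'UInt'}{bits}": int_bounds(bits, signed)
    for signed in (False, True)
    for bits in (8, 16, 32, 64)
}

for name, (low, high) in BUILTIN_INT_RANGES.items():
    # Written as the constrained type each name is assumed to abbreviate.
    print(f"Number.{name} ~ number<{low}..{high} step 1>")
```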

String Constraints

  string<"A", "B", "C">
  string<"Z">
  "Z"
  string<Left = "L", Straight = "S", Right = "R">

As with numbers, a string type can be constrained using an enumerated constraint.

Complex Types

  using Image
  # Default order of parameters is <width, height>
  Image.Gray<100, 200>
  Image.Gray<Height=200, Width=100>

Currently, the only supported complex type is Image.Gray, which represents a two-dimensional collection of gray-scale pixels. Each pixel is represented as a Number.UInt8 value (0 is black and 255 is white).

The width and height of the image must be specified using a constant expression.

Type Declarations

  type Height number

  type Coordinate {X: number, Y: number}

  type TicTacToeValue string<"", "X", "O">

  type TicTacToeBoard TicTacToeValue[3][3]

A type can be associated with an identifier using a type declaration statement. This identifier can be used anywhere a type literal would be used.

Type Declaration Syntax

  typeDeclaration ::=
    type <typeName> typeReference

  typeReference ::=
    <typeName> | typeLiteral

  typeLiteral ::=
    primitiveType |
    typeReference '<' rangeExpression '>' |
    typeReference '<' valueList '>' |
    typeReference '<' typeArgumentList '>' |
    typeReference '[' constExpression ']' |
    '{' typeFieldList '}'

  rangeExpression ::=
    constExpression '..' constExpression [step constExpression]?

  valueList ::=
    constExpression [',' constExpression]* [',']?

  typeArgumentList ::=
    typeArgument [',' typeArgument]* [',']?

  typeArgument ::=
    [<paramName> '=']? typeReference

  typeFieldList ::=
    typeField [',' typeField]* [',']?

  typeField ::=
    <fieldName> ':' typeReference

Type Equivalence

Inkling uses types to understand and interpret the format of data. The Inkling compiler performs type checks to guarantee type compatibility. In some cases, type equivalence is required (i.e. the types must match exactly). In other cases, castability is required (i.e. one type can be trivially transformed into another type).

The table below explains the rules for equivalence and castability.

Type | Rules for equivalence | Rules for castability
Number | Constraints match | Source is more constrained than dest
String | Constraints match | Source is more constrained than dest
Array | Element type is equivalent; number of dimensions matches; size of each dimension matches | Element type is castable; number of dimensions matches; size of each dimension matches
Structure | All fields are present in the same order; field names match; field types are equivalent | Subset of source fields are present in dest, not necessarily in same order; field types of present fields are castable
Complex | Base type matches; generic type arguments match | Base type matches; generic type arguments match
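
The number and structure rules can be sketched in Python as follows. This is one reading of the rules, not the compiler's implementation: NumberRange is a hypothetical stand-in for a range-constrained number, structures are modeled as dicts from field name to type, and a source whose constraint equals the destination's is assumed to be castable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberRange:
    """Stand-in for number<low..high>; not part of any Inkling tooling."""
    low: float
    high: float


def is_equivalent(src, dst) -> bool:
    if isinstance(src, NumberRange) and isinstance(dst, NumberRange):
        return src == dst  # constraints match
    if isinstance(src, dict) and isinstance(dst, dict):
        # Same field names in the same order, with equivalent field types.
        return list(src) == list(dst) and all(
            is_equivalent(src[name], dst[name]) for name in src)
    return False


def is_castable(src, dst) -> bool:
    if isinstance(src, NumberRange) and isinstance(dst, NumberRange):
        # Source is at least as constrained as dest (equality allowed here).
        return dst.low <= src.low and src.high <= dst.high
    if isinstance(src, dict) and isinstance(dst, dict):
        # Dest keeps a subset of the source fields, in any order.
        return all(name in src and is_castable(src[name], dst[name])
                   for name in dst)
    return False


narrow, wide = NumberRange(1, 4), NumberRange(0, 255)
point3 = {"X": narrow, "Y": narrow, "Z": wide}
point2 = {"Y": wide, "X": wide}
print(is_castable(point3, point2))  # True
print(is_castable(point2, point3))  # False
```

The first cast succeeds because it drops the Z field and widens X and Y. The reverse fails because the source has no Z field to supply.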

Constants

This is the reference for the keyword const.

const MaxVelocity = 100
const MinVelocity = 0
const Environment = "NewYorkCity"
const DebugMode = true
const DestinationPoint = [4.5, 2.7, 9.2]

Inkling supports named constants for all Inkling types. The name can be used wherever a value can appear.

Constants make code easier to understand and modify because the constant name conveys the semantics of the constant. They document the logic of the program.

Types can be specified for constants. If a type is omitted, it is inferred. If a type is specified, the compiler validates that the constant value matches the specified type.

The constant value is specified as an expression that can make use of supported operators, numeric or string literals, and other constants. Cyclical constant references are not allowed.

# Valid constant declarations
const NumericConst1: number = 100
const NumericConst4: number<0 .. 2> = 3 * (5 / 2) - 7
const NumericConst5 = NumericConst1 * 2
const StringConst1: string = "Hello"
const ArrayConst1: number[3] = [0, 2, 5]
const StructConst1: {X: number, Y: string} = {X: 3, Y: "Foo"}

# Errors: type mismatch
const NumericConst2: number = "Hello"
const NumericConst3: number<0 .. 2> = 3
const StringConst2: string = ["Hello"]
const StructConst2: {X: number, Y: string} = {X: 3, Y: 3}
const StructConst3: {X: number, Y: string} = {X: 3}

# Error: cyclical reference
const NumericConst6 = NumericConst6

Some valid and invalid constant declarations are shown in the accompanying panel.

The constants declaration syntax is shown in the accompanying panel.

Constant Syntax

constantDeclaration ::=
  const <constName> [':' typeReference]? '=' constExpression

Usage

  • A constant declaration can appear only in the global program scope. It cannot be nested within another statement.
  • A named constant does not have to be declared before it is used.
  • A named constant can be used anywhere a literal can be used.
  • The initialized value of the constant must be compatible with the declared type of the constant.
  • If no type is declared, it is inferred from the value.
  • The declared type of the constant must be compatible with the type of the context where it is used.
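
Because a constant may be used before it is declared, a compiler has to work out an evaluation order and reject cyclical references. The Python sketch below shows one way to do that with a depth-first search. The function name and the dict-based input format are hypothetical, and the actual compiler may work differently.

```python
def resolution_order(references):
    """Order constants so each follows the constants it refers to.

    `references` maps a constant name to the names used in its expression.
    Raises ValueError on a cyclical reference.
    """
    order, done, in_progress = [], set(), set()

    def visit(name, path):
        if name in done:
            return
        if name in in_progress:
            raise ValueError("cyclical reference: " + " -> ".join(path + [name]))
        in_progress.add(name)
        for dep in references.get(name, ()):
            visit(dep, path + [name])
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    for name in references:
        visit(name, [])
    return order


# NumericConst5 is listed before NumericConst1, which it refers to.
print(resolution_order({"NumericConst5": ["NumericConst1"], "NumericConst1": []}))
```

Passing {"NumericConst6": ["NumericConst6"]} raises ValueError, which mirrors the cyclical-reference error shown earlier.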

Graph

The graph keyword declares the concept graph. Only one graph declaration is allowed per Inkling program. It defines the input type and output type of the BRAIN. It also contains the list of declared concepts and designates which concept provides the output for the graph.

graph (input: InputType): OutputType {
  # Concept declarations omitted

  output MyConcept
}

Graph Syntax

graphStatement ::=
graph '(' input ':' inputType ')' [':' outputType]? '{'
  conceptDeclaration*

  output <conceptName>
'}'

inputType ::= typeReference
outputType ::= typeReference

The inputType specifies the graph’s input type.

The outputType specifies the graph’s output type. If omitted, it is inferred from the output type of the output concept. If provided, the compiler verifies that the specified type matches that of the output concept.

The output clause must refer to one of the declared concepts.

Concepts

The concept keyword declares an abstract concept that is to be learned by the system. It can be a feature (such as a curvy line in an image) or a goal (such as high score in a game). Ultimately, a concept represents a function from input to output. By declaring a concept, you are instructing the AI Engine that this is a node in the concept graph that must be learned.

concept AbstractConceptName(Antecedent1, Antecedent2): OutputType {
  # Curriculum omitted
}

Because concepts are learned, their declarations tend to be fairly simple. The typical components of a concept statement are shown in the accompanying panel.

Concept Syntax

conceptDeclaration ::=
concept <conceptName> '(' inputItemList ')' ':' outputType '{'
  curriculumClause
'}'

inputItemList ::=
  [inputItem] | [inputItem ',' inputItemList]

inputItem ::=
  input | <conceptName>  # name of a concept or input

outputType ::= typeReference

The inputItemList specifies the concept’s inputs. Cycles in the graph are not allowed.

The outputType specifies the concept’s output type.

Usage

The concept statement specifies input sources and an output type. Input sources are provided in the input list.

concept bar (input): Move {
  # Curriculum omitted
}

concept foo (bar): Action {
  # Curriculum omitted
}

Inputs can be the output of other concepts or the input data stream. The input stream is the original input to the concept graph.

Input types do not need to be specified because their types are provided in the graph statement (in the case of input) or via the concept type (in the case of other concepts).
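
The structural rules stated for graphs and concepts (the output clause names a declared concept, every input item is either input or a declared concept, and the graph has no cycles) can be checked mechanically. The Python sketch below is illustrative only: check_graph and its dict-based input are hypothetical and not part of the Inkling toolchain.

```python
def check_graph(concepts, output):
    """Return a list of problems found in a concept graph.

    `concepts` maps each concept name to the list of its input items.
    """
    problems = []
    if output not in concepts:
        problems.append(f"output {output!r} is not a declared concept")
    for name, inputs in concepts.items():
        for item in inputs:
            if item != "input" and item not in concepts:
                problems.append(f"{name}: unknown input {item!r}")

    # Depth-first search for cycles among concept-to-concept edges.
    state = {}  # name -> "visiting" or "done"

    def has_cycle(name):
        if state.get(name) == "done":
            return False
        if state.get(name) == "visiting":
            return True
        state[name] = "visiting"
        found = any(has_cycle(dep) for dep in concepts[name] if dep in concepts)
        state[name] = "done"
        return found

    if any(has_cycle(name) for name in concepts):
        problems.append("cycle between concepts")
    return problems


print(check_graph({"bar": ["input"], "foo": ["bar"]}, output="foo"))  # []
```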

Example

The following Inkling declares the concept GetHighScore.

concept GetHighScore(input): PlayerMove {
  # Curriculum omitted
}
  • conceptName: GetHighScore
  • inputItemList: input
  • outputType: PlayerMove

Curriculums

concept MyConcept(input): OutputType {
  curriculum {
    source MySimulator

    # Lessons specified here

  }
}

The curriculum statement is used within a concept to specify how the concept should be trained.

The lessons defined within the curriculum can be used to create a staged teaching plan.

Curriculum Syntax

curriculum {
  source <simulatorReference>

  [lessonStatement]*
}

Usage

There can be only one curriculum per concept, and every concept must have a curriculum.

Every curriculum must provide a source clause that specifies the data source for teaching the concept. Currently, only simulators are supported as data sources, so the source keyword must be followed by a simulator declaration or the name of a simulator declared at the global scope of the program.

Breakout Example

simulator BreakoutSimulator(action: PlayerMove, config: BreakoutConfig): GameState {
}

The simulator declaration specifies the simulator name and the types of two input values (the action and configuration). It also specifies the type of the simulator’s output (the simulator state). In this instance, the configuration type is BreakoutConfig. The lessons initialize the configuration type of BreakoutConfig.

type BreakoutConfig {
  level: Number.UInt16,
  paddle_width: Number.UInt8<1 .. 4>,
  bricks_percent: number
}

type PlayerMove number<Left = -1, Stay = 0, Right = 1>

type GameState {
  pixels: Image.Gray<84, 336>
}

graph (input: GameState) {
  concept HighScore(input): PlayerMove {
    curriculum {
      source BreakoutSimulator

      lesson ScoreLesson {
        constraint {
          level: Number.UInt16<1 .. 100>,
          paddle_width: Number.UInt8<1 .. 4>,
          bricks_percent: 1
        }
      }
    }
  }

  output HighScore
}

The configuration type BreakoutConfig is constrained in the lesson. When a concept is being trained, a new simulation configuration is passed to the simulator for each new episode. The configuration values are chosen in accordance with the current lesson’s constraint. For example, in the lesson above, the bricks_percent field is constrained to a value of 1, so the configuration passed to the simulator at the beginning of every episode will contain a value of 1 for this field. The paddle_width field is constrained to integers in the range of 1 through 4, so the configuration passed to the simulator at the beginning of each episode will contain a value of 1, 2, 3 or 4 for this field. Values are chosen uniformly at random from the set specified in the constraint.
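
The per-episode selection described above can be sketched in Python. The dict encoding of the lesson constraint is hypothetical, and picking each field independently is an assumption: the text only says that values are chosen uniformly at random from the set the constraint specifies.

```python
import random


def sample_config(constraint, rng):
    """Pick one value per field, uniformly from the values the lesson allows.

    `constraint` maps a field name to the list of allowed values; a field
    constrained to a single value is written as a one-element list.
    """
    return {field: rng.choice(values) for field, values in constraint.items()}


# Hypothetical encoding of the ScoreLesson constraint as plain Python lists.
score_lesson = {
    "level": list(range(1, 101)),        # Number.UInt16<1 .. 100>
    "paddle_width": [1, 2, 3, 4],        # Number.UInt8<1 .. 4>
    "bricks_percent": [1],               # fixed at 1
}

rng = random.Random(0)  # seeded only so repeated runs pick the same values
for episode in range(3):
    print(episode, sample_config(score_lesson, rng))
```

Every sampled configuration has bricks_percent equal to 1, while level and paddle_width vary from episode to episode.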

The GameState type describes the data that is passed as an input to the graph. This same data type is provided as an output from the simulator. For Breakout, this data includes a field called pixels which refers to an 84x336 grayscale image.

The simulator return type must match (or be castable to) the graph’s input type. For an explanation of castability, see Type Equivalence.

In this example, the simulator declares the action type to be PlayerMove. The simulator action type must match the output type for the concept being trained. In our example, the concept HighScore trains the BRAIN to select the next move, which is of type PlayerMove.

graph (input: GameState) {
  concept KeepPaddleUnderBall(input): PlayerMove {
    curriculum {
      source BreakoutSimulator

      lesson TrackBallWidePaddle {
        constraint {
          level: Number.UInt16<1 .. 100>,
          paddle_width: 4,
          bricks_percent: 1
        }
      }

      lesson TrackBallAnyPaddle {
        constraint {
          level: Number.UInt16<1 .. 100>,
          paddle_width: Number.UInt8<1 .. 4>,
          bricks_percent: 1
        }
      }
    }
  }

  output KeepPaddleUnderBall
}

Another concept that helps with playing Breakout is KeepPaddleUnderBall. The curriculum for this concept has two lessons. The first lesson, TrackBallWidePaddle, uses a wide paddle and is easier to learn. The second lesson, TrackBallAnyPaddle, uses a range expression to vary the paddle width from 1 to 4. The second lesson is more difficult to learn, but by starting with an easier problem, the overall training time is reduced.

Lessons

Lessons provide a staged way to teach a concept starting with easier tasks and moving to more difficult. This approach can reduce the total time it takes to train a concept.

Lesson Syntax

lessonStatement ::=
  lesson <lessonName> '{'
    [constraint <typeReference>]?
  '}'

The lesson statement declares a lesson within a curriculum. Lessons provide control over the training of a concept, allowing it to be staged from easy to difficult. This approach can reduce the overall training time.

Usage

Multiple lessons can be used within a curriculum, and they are used at training time in the same order in which they’re declared. The constraint clause should configure the simulator in a restricted manner. Subsequent lessons should incrementally reduce the restrictions, making the problem more difficult and the exploration space larger. A range constraint used within a lesson should be less restrictive (or at least not more restrictive) than the corresponding range constraint used in a previous lesson.
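
The guidance that each lesson should be no more restrictive than the one before it can be checked with a short Python sketch. The helper names and the (low, high) encoding of each field are hypothetical, and the sketch assumes every lesson constrains the same fields.

```python
def widens(previous, current):
    """True if every (low, high) range in `current` covers the previous one."""
    return all(
        current[field][0] <= low and high <= current[field][1]
        for field, (low, high) in previous.items()
    )


def check_lesson_order(lessons):
    """Return names of lessons that are more restrictive than their predecessor."""
    names = list(lessons)
    return [
        later for earlier, later in zip(names, names[1:])
        if not widens(lessons[earlier], lessons[later])
    ]


# The two KeepPaddleUnderBall lessons, with fixed values written as (v, v).
lessons = {
    "TrackBallWidePaddle": {"level": (1, 100), "paddle_width": (4, 4), "bricks_percent": (1, 1)},
    "TrackBallAnyPaddle": {"level": (1, 100), "paddle_width": (1, 4), "bricks_percent": (1, 1)},
}
print(check_lesson_order(lessons))  # []
```

Declaring the same two lessons in the opposite order would report TrackBallWidePaddle, because it narrows paddle_width after a wider lesson.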

Example

simulator BreakoutSimulator(action: PlayerMove, config: BreakoutConfig): GameState {
}

type BreakoutConfig {
  level: Number.UInt16,
  paddle_width: Number.UInt8<1 .. 4>,
  bricks_percent: number
}

type PlayerMove number<Left = -1, Stay = 0, Right = 1>

type GameState {
  pixels: Image.Gray<84, 336>
}

graph (input: GameState) {
  concept HighScore(input): PlayerMove {
    curriculum {
      source BreakoutSimulator

      lesson ConstantBreakout {
        constraint {
          level: 1,
          paddle_width: 4,
          bricks_percent: 1
        }
      }

      lesson VaryBreakout {
        constraint {
          level: Number.UInt16<1 .. 100>,
          paddle_width: Number.UInt8<1 .. 4>,
          bricks_percent: number<0.1 .. 1 step 0.01>
        }
      }
    }
  }

  output HighScore
}

This example shows lessons that break the task of playing Breakout into stages.

  • The first lesson, ConstantBreakout, trains the AI with fixed configuration parameters.
  • The second lesson, VaryBreakout, trains the AI with configuration parameters that vary.

The type specified after the constraint keyword must be compatible with the config type defined in the simulator declaration (BreakoutConfig).

Simulators

simulator MySimulator(action: MyActionType, config: MyConfigType): MyStateType {
}

Curriculum statements reference an associated simulator in the source clause.

The simulator statement describes the interface to a simulator program. The Inkling program does not contain code for the simulator itself. Instead the Inkling program defines how the simulator is used to train Inkling concepts.

Simulator Declaration Syntax

simulator <simulatorName> '(' action ':' typeReference
  [',' config ':' typeReference]? ')' ':' typeReference '{'
'}'

A simulator statement defines the interface to the simulator - its input and output parameter types.

  • The action input type
  • An optional configuration input type
  • The state output type

Usage

curriculum {
  source BreakoutSimulator
}

The curriculum statement references a simulator. The example in the code panel shows the use of simulator BreakoutSimulator.

Discussion

Simulators are virtual environments designed to simulate the behavior of a real-world physical environment and the objects or agents within it. Every simulator has “observable state”, a representation of the world inside the virtual environment. This state changes over time in response to actions taken by an agent. Typically, the observable state represents the inputs received by the simulated agent’s sensors (cameras, temperature gauges, depth sensors, etc.).

During training, the AI Engine “drives” the simulator by giving it an action. The simulator then performs the action and returns the resulting state. The AI Engine then selects a new action and sends it to the simulator, and so on. In this manner, the AI Engine is able to explore the “state space” of the simulated environment.
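
The action and state exchange described above can be pictured with a toy Python loop. Nothing here uses the Bonsai SDK: CounterSimulator, choose_action and run_episode are hypothetical names, and the fixed policy stands in for what the AI Engine would actually learn.

```python
class CounterSimulator:
    """Toy stand-in for a simulator: the state is a position on a line."""

    def __init__(self, start=0):
        self.position = start

    def step(self, action):
        """Apply an action (-1, 0 or 1) and return the new observable state."""
        self.position += action
        return {"position": self.position}


def choose_action(state, target=3):
    """Placeholder policy; a real AI Engine learns this mapping instead."""
    if state["position"] < target:
        return 1
    if state["position"] > target:
        return -1
    return 0


def run_episode(simulator, steps):
    """Alternate action and state the way the training loop is described."""
    state = {"position": simulator.position}
    history = [state]
    for _ in range(steps):
        action = choose_action(state)
        state = simulator.step(action)
        history.append(state)
    return history


print(run_episode(CounterSimulator(), steps=5)[-1])  # {'position': 3}
```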

Our Library Reference describes the classes and methods used to connect an existing simulator or create a new simulator in Python. Find the Center is an example of a basic simulator implementation.

Example


simulator BreakoutSimulator(action: PlayerMove, config: BreakoutConfig): GameState {
}

type BreakoutConfig {
  level: Number.UInt16,
  paddle_width: Number.UInt8<1 .. 4>,
  bricks_percent: number
}

type PlayerMove number<Left = -1, Stay = 0, Right = 1>

type GameState {
  pixels: Image.Gray<84, 336>
}

graph (input: GameState) {
  concept HighScore(input): PlayerMove {
    curriculum {
      source BreakoutSimulator

      lesson ScoreLesson {
        constraint {
          level: Number.UInt16<1 .. 100>,
          paddle_width: Number.UInt8<1 .. 4>,
          bricks_percent: 1
        }
      }
    }
  }

  output HighScore
}

In this example we show some of the Inkling code for training the game Breakout. The curriculum source clause specifies BreakoutSimulator, and the simulator statement specifies the action, state, and configuration types.

Algorithm Hints

Syntax

algorithm {
   <fieldName> ':' <literal>
   [ ',' <fieldName> ':' <literal> ]*
}

Example

graph (input: GameState): Action {
    concept balance(input): Action {
        curriculum {
            source MySimulator

            algorithm {
                Algorithm: "APEX",
                HiddenLayers: [
                    {
                        Size: 64,
                        Activation: "tanh"
                    },
                    {
                        Size: 64,
                        Activation: "tanh"
                    }
                ]
            }
        }
    }
    output balance
}

The algorithm clause is for advanced users who want to experiment with changing machine learning algorithms and their tuning parameters. An algorithm statement is a hint to the AI Engine, which may ignore it if it decides that the specified algorithm or parameters are inappropriate or obsolete. If the hint is followed, it applies only to the concept in which it is specified.

Valid Algorithm Identifiers

  • APEX (Distributed Deep Q Network) - discrete action spaces only
  • PPO (Proximal Policy Optimization) - continuous action spaces only

Shared Parameter Details

The following parameters all apply to APEX and PPO. Additional parameter options are shown in subsequent tables below.

Parameter | Example use | Description
HiddenLayers: [HiddenLayerInfo] | HiddenLayers: [{Size: 400, Activation: "relu"}, {Size: 300, Activation: "tanh"}] | An array of structures that define the size and (optionally) the activation function for each hidden layer. Sizes must be positive integers, and activation functions must be one of "linear", "tanh", "relu", "logistic", "softmax", "elu", or "default".
ConvolutionLayers: [ConvolutionLayerInfo] | ConvolutionLayers: [{XSize: 8, YSize: 8, XStride: 4, YStride: 4, FilterCount: 32}] | An array of structures that define the size, stride and (optionally) the filter count for each convolution layer.
CompressionLayers: [HiddenLayerInfo] | CompressionLayers: [{Size: 20, Activation: "tanh"}, {Size: 10, Activation: "softmax"}] | If specified, adds one or more fully-connected hidden layers after the convolutional layers to compress the representation before incorporating other features. The format is the same as the HiddenLayerInfo described above.
ConvolutionLimitDimension: string | ConvolutionLimitDimension: "x" | A string that selects which dimension of the input shape is used as the convolutional filter size. Must be one of "max", "min", "x", "y", or "none". The default is "none", which uses the original input shapes as the filter sizes. "max" uses the maximum of the x and y input dimensions as the filter size. "min" uses the minimum of the x and y input dimensions. "x" uses the x dimension of the input shape. "y" uses the y dimension of the input shape. (The example uses the x dimension of the input shape as the convolutional filter size.)
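
The value rules in this table lend themselves to a quick pre-flight check. The Python sketch below assumes the algorithm clause has been transcribed into a plain dict. check_hints is a hypothetical helper, not an API of the AI Engine, and it covers only the layer and dimension rules stated above.

```python
ACTIVATIONS = {"linear", "tanh", "relu", "logistic", "softmax", "elu", "default"}
LIMIT_DIMENSIONS = {"max", "min", "x", "y", "none"}


def check_hints(hints):
    """Return a list of problems in a dict-shaped algorithm clause."""
    problems = []
    for key in ("HiddenLayers", "CompressionLayers"):
        for index, layer in enumerate(hints.get(key, [])):
            size = layer.get("Size")
            if not isinstance(size, int) or size <= 0:
                problems.append(f"{key}[{index}]: Size must be a positive integer")
            # Activation is optional, so only check it when present.
            if "Activation" in layer and layer["Activation"] not in ACTIVATIONS:
                problems.append(f"{key}[{index}]: unknown activation")
    limit = hints.get("ConvolutionLimitDimension", "none")
    if limit not in LIMIT_DIMENSIONS:
        problems.append("ConvolutionLimitDimension: unknown value")
    return problems


hints = {
    "Algorithm": "APEX",
    "HiddenLayers": [{"Size": 64, "Activation": "tanh"},
                     {"Size": 64, "Activation": "tanh"}],
}
print(check_hints(hints))  # []
```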

APEX-Specific Parameters

In addition to the first (shared) table of parameters, APEX also supports these additional parameters.

Parameter | Example use | Description
ExplorationDecay: float | ExplorationDecay: 0.00001 | The decay rate for the exploration policy.
QLearningRate: float | QLearningRate: 0.0001 | The learning rate for training the Q network.

PPO-Specific Parameters

In addition to the first (shared) table of parameters, PPO supports these additional parameters.

Parameter | Example use | Description
BatchSize: (1000-2000000) | BatchSize: 8000 | See below for details.
PolicyLearningRate: float | PolicyLearningRate: 0.0001 | The learning rate for training the policy network.

BatchSize Details

Proximal Policy Optimization works by gathering a large number of complete trajectories and analyzing them in aggregate to obtain a confidence metric for the probability that a given change to the policy will improve performance. With a sufficiently large batch size, this yields monotonic policy improvement, as the algorithm only makes changes it is confident will lead to real improvements based on large amounts of data. The BatchSize parameter determines how much data is aggregated to make this decision. Smaller batches lead to faster convergence because updates are computed more frequently, but these updates are less reliable, and the policy may become unstable if this value is too small. We recommend you set this value to be large enough to contain a significant fraction of a complete episode from each of your simulators, as that aids in estimating policy performance. For example, if your episodes are up to 100 steps each and you have 100 simulators, try a value between 5000 and 10000.
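
The arithmetic behind that recommendation can be written out in Python. This is a rule-of-thumb sketch: suggest_batch_size and its fraction parameter are hypothetical, and reading "a significant fraction" as 50% to 100% comes from the worked example, not from a stated formula.

```python
def suggest_batch_size(max_episode_steps, simulator_count, fraction=0.5):
    """Batch size covering `fraction` of one full episode from every simulator.

    The result is clamped to the documented 1000-2000000 range.
    """
    steps_per_round = max_episode_steps * simulator_count
    return max(1000, min(2_000_000, round(steps_per_round * fraction)))


print(suggest_batch_size(100, 100, 0.5))  # 5000
print(suggest_batch_size(100, 100, 1.0))  # 10000
```

With 100-step episodes and 100 simulators, one full round of episodes is 10000 steps, so the suggested 5000 to 10000 range corresponds to half to all of that data.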