
Examples Overview

This page contains working examples of Inkling code in conjunction with Python simulator files. All of these examples and the libraries that accompany them can be found on BonsaiAI’s GitHub page and are also linked within each example.

All of the Python, OpenAI Gym, and EnergyPlus examples can be trained in the cloud with Bonsai managed simulators. A full list of supported Docker containers for remotely managed simulators can be found in the Project File Reference.

If you have any suggestions for examples you’d like to see us implement, please contact the support team.

Custom Simulator Examples

  • Basic Python/C++ Simulation: A project called Find the Center which walks you through how to create a simple Inkling file that connects to a basic Python or C++ simulator.
  • SimPy Elevator Simulation: A basic example of how to use SimPy to simulate a scenario. This one is an elevator dropping people off on 3 floors.

OpenAI Gym Examples

  • Cartpole: A simple control problem where a cart must move left or right to keep a pole balanced on top.
  • Mountain Car: A simple control problem where a car must build up enough momentum to climb a hill.

Real World Examples

  • HVAC with EnergyPlus: An example of climate control machine teaching using EnergyPlus and BCVTB for simulation.
  • Simulink Cartpole: The compiled Simulink version of a pole balancing on a cart.
  • Simulink Househeat: The compiled Simulink version of an HVAC system taking into account the outdoor environment, the thermal characteristics of the house, and the house heating system.

Basic Python/C++ Simulation

Find the Center Diagram

Download the full source code on GitHub if you want to run this simulator locally.

In this example, we’ll walk you through the various statements that are part of the Find the Center game, including options for either a Python or C++ simulator file and the Inkling file. This is a very basic example of Inkling and how to connect to a custom simulator; it also shows the differences between using the libbonsai (C++) and bonsai-ai (Python) libraries.

Find the Center is a simple game where the AI seeks the average value between two numbers. In this game, the AI begins at a random value of 0, 1, or 2. The AI then can move to a lower number by outputting -1, a higher number by outputting +1, or staying on the same number by outputting 0. The goal of Find the Center is to remain in the center of 0 and 2 (the number 1).

Inkling File

Schema
schema GameState
    Int8 value
end

The GameState schema has one field, value, with type Int8.

schema PlayerMove
    Int8{-1, 0, 1} delta
end

The PlayerMove schema has one field, delta, with type Int8. The Int8 type is constrained to three possible values: -1, 0, and 1.

schema SimConfig
    Int8 dummy
end

The SimConfig schema has one field, dummy, because there is no configuration needed in this particular example.

Concept
concept find_the_center
    is classifier
    predicts (PlayerMove)
    follows input(GameState)
    feeds output
end

This concept is named find_the_center. find_the_center expects input about the state of the game (defined by the GameState schema) and replies with output defined by the PlayerMove schema. This is the AI’s next move in the simulation.

Simulator
simulator find_the_center_sim(SimConfig)
    action (PlayerMove)
    state (GameState)
end

The simulator is called find_the_center_sim (shown in the Simulator File section below) and takes SimConfig as its configuration schema (even though nothing is configured in this example, the schema is still required). The find_the_center concept will be trained using the find_the_center_sim simulator. To define the training relationship between the simulator and the concept we must begin by defining the simulator. find_the_center_sim expects an action defined in the PlayerMove schema as input and replies with a state defined in the GameState schema as output.

Curriculum
curriculum find_the_center_curriculum
    train find_the_center
    with simulator find_the_center_sim
    objective time_at_goal
        lesson seek_center
            configure
                constrain dummy with Int8{-1}
            until
                maximize time_at_goal
end

The curriculum is named find_the_center_curriculum, and it trains the find_the_center concept using the find_the_center_sim.

This curriculum contains one lesson, called seek_center. It configures the simulation by constraining the dummy configuration field. The lesson trains until the AI has maximized the objective time_at_goal.

Simulator File

""" This Basic simulator is for learning the simulator interface.
It can be used in this case to find the center between two numbers.
"""
import bonsai_ai
from random import randint


class BasicSimulator(bonsai_ai.Simulator):
    """ A basic simulator class that takes in a move from the inkling file,
    and returns the state as a result of that move.
    """
    min = 0
    max = 2
    goal = 1
    started = False

    def episode_start(self, parameters=None):
        """ called at the start of every episode. should
        reset the simulation and return the initial state
        """

        # reset internal initial state
        self.goal_count = 0
        self.value = randint(self.min, self.max)

        # print out a message for our first episode
        if not self.started:
            self.started = True
            print('started.')

        # return initial external state
        return {"value": self.value}

    def simulate(self, action):
        """ run a single step of the simulation.
        if the simulation has reached a terminal state, mark it as such.
        """

        # perform the action
        self.value += action["delta"]
        if self.value == self.goal:
            self.goal_count += 1

        # is this episode finished?
        terminal = (self.value < self.min or
                    self.value > self.max or
                    self.goal_count > 3)
        state = {"value": self.value}
        reward = self.goal_count
        return (state, reward, terminal)


if __name__ == "__main__":
    config = bonsai_ai.Config()
    brain = bonsai_ai.Brain(config)
    sim = BasicSimulator(brain, "find_the_center_sim")

    print('starting...')
    while sim.run():
        continue

// Copyright (C) 2017 Bonsai, Inc.

#include <iostream>
#include <memory>
#include <string>
#include <random>

#include "bonsai/bonsai.hpp"

// std
using std::cout;
using std::endl;
using std::make_shared;
using std::move;
using std::shared_ptr;
using std::string;

using std::random_device;
using std::mt19937;
using std::uniform_int_distribution;

// bonsai
using bonsai::Brain;
using bonsai::Config;
using bonsai::InklingMessage;
using bonsai::Simulator;

// random number generator
random_device rd;
mt19937 rng(rd());

// basic simulator
class BasicSimulator : public Simulator {
    constexpr static int8_t _min = 0, _max = 2, _goal = 1;
    int8_t _goal_count = 0;
    int8_t _value = 0;
    uniform_int_distribution<int8_t> _uni{_min, _max};
 public:
    explicit BasicSimulator(shared_ptr<Brain> brain, string name)
        : Simulator(move(brain), move(name)) {}

    void episode_start(const InklingMessage& params,
        InklingMessage& initial_state) override;
    void simulate(const InklingMessage& action,
        InklingMessage& state,
        float& reward,
        bool& terminal) override;
};

void BasicSimulator::episode_start(
    const InklingMessage& params,
    InklingMessage& initial_state) {

    // reset
    _goal_count = 0;
    _value = _uni(rng);

    // set initial state
    initial_state.set_int8("value", _value);

    // print a message for our first episode
    static bool started = false;
    if (!started) {
        started = true;
        cout << "started." << endl;
    }
}

void BasicSimulator::simulate(
    const InklingMessage& action,
    InklingMessage& state,
    float& reward,
    bool& terminal) {

    // perform the action
    _value += action.get_int8("delta");
    if (_value == _goal)
        _goal_count++;

    // output
    state.set_int8("value", _value);
    terminal = _value < _min || _value > _max || _goal_count > 3;
    reward = _goal_count;
}


int main(int argc, char** argv) {
    auto config = make_shared<Config>(argc, argv);
    auto brain = make_shared<Brain>(config);

    BasicSimulator sim(brain, "find_the_center_sim");

    cout << "starting..." << endl;
    while ( sim.run() ) {
        continue;
    }

    return 0;
}

This is a basic simulator for learning the simulator library. In this case it is used to find the center between two numbers, 0 and 2. The goal, as outlined in the Inkling file, is to reach 1. The moves that the simulator is able to make are sent from the Inkling file to the simulator and the state of the simulator is sent back to Inkling.

The README file contained in the project has instructions for running this simulator in either Python or C++.

SimPy Elevator Simulation

ASCII Elevator Output

Download the full source code on GitHub if you want to run this simulator locally.

In this example, SimPy, a process-based discrete-event simulation framework based on standard Python, is used to simulate an elevator transporting people to their desired floors. The simulated elevator is rewarded for minimizing the time people spend waiting. This example includes a simple elevator SimPy simulation, a Python simulator, and the simulator’s Inkling file.

“Processes in SimPy are defined by Python generator functions and may, for example, be used to model active components like customers, vehicles or agents. SimPy also provides various types of shared resources to model limited capacity congestion points (like servers, checkout counters and tunnels).” - SimPy docs

The point of this simulation is to provide actions (up, down, open doors) for an elevator, given floor requests from randomly arriving passengers. SimPy has a great framework for simulating time only when some state changes, which speeds up training for systems that would otherwise spend most of their time waiting.
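
If you haven’t used SimPy before, the following minimal sketch (not part of this example) shows the generator-based style it uses; simulated time only advances when a process yields an event such as a timeout:

import simpy

def passenger_arrivals(env):
    """ a SimPy process: a passenger arrives every 30 simulated seconds """
    while True:
        yield env.timeout(30)
        print('passenger arrived at t =', env.now)

env = simpy.Environment()
env.process(passenger_arrivals(env))
env.run(until=120)  # finishes instantly in real time; only events consume simulated time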

In the image above, the elevator logs output every 100 seconds, showing the state of the world followed by a list of recent passengers. The world state shows, for each floor, the number of people waiting, plus the elevator and the number of people inside it.

1: 0| 2: 1| 3: 0| [_1_] shows zero people on the first floor, one person waiting on the second floor, and one person in the elevator on the third floor.

For more ideas of how SimPy can simulate real world problems see the SimPy Examples page.

Inkling File

Schema
# Position is current location of elevator
# State of each floor: 1 if the floor is requested, 0 if not
schema FloorState
    Int8{0, 1, 2} Position,
    Int8{0, 1} Floor1,
    Int8{0, 1} Floor2,
    Int8{0, 1} Floor3
end

The FloorState schema defines the dictionary returned from the Python simulation’s advance method to the BRAIN.
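
For illustration, a state dictionary matching this schema might look like the following (hypothetical values: the elevator is at position 0 and only floor 2 has a pending request):

state = {'Position': 0, 'Floor1': 0, 'Floor2': 1, 'Floor3': 0}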

# command options: up, open, down
schema Action
    Int8{0, 1, 2} command
end

The Action schema defines the possible actions the elevator can take. In this case the command given to the elevator is one of three values: 0 opens the doors, 1 goes up a floor, and 2 goes down a floor.

# Possible option for configuration
schema ElevatorConfig
    Int8 episode_length
end

The ElevatorConfig schema outlines some possible configurations you could give to the elevator if you wanted to tailor its learning in the lessons outlined later.

Concept
# Predicts an Action and follows input from the FloorState schema
concept elevator_plan is classifier
    predicts (Action)
    follows input(FloorState)
    feeds output
end

This concept is named elevator_plan, a classifier, which predicts an Action given the current FloorState. In this simple example, we are training the concept to make an action (go up a floor, go down a floor, or open the doors) based on the current state of the floor the elevator is on.

Simulator
# Connect to SimPy simulator for training
simulator elevator_simulator(ElevatorConfig)
    action (Action)
    state (FloorState)
end

The simulator clause declares that a simulator named elevator_simulator will be connecting to the server for training. This elevator_simulator expects an action defined in the Action schema as input and replies with a state defined in the FloorState schema as output.

Curriculum
# This trains the concept using a single lesson
# Maximize the elevator_objective defined in elevator_simulator.py
curriculum high_score_curriculum
    train elevator_plan
    with simulator elevator_simulator
    objective elevator_objective
        lesson get_high_score
            configure
                constrain episode_length with Int8{-1}
            until
                maximize elevator_objective
end

The curriculum high_score_curriculum trains elevator_plan using elevator_simulator. The BRAIN that runs this Inkling code will try to maximize the value returned from elevator_objective until you stop training. elevator_objective refers to a method in elevator_simulator.py that returns the total waiting time across all people as a negative number, so that maximizing the objective minimizes waiting time. The code for this can be seen at the end of the simulator excerpt below.

This curriculum contains one simple lesson, called get_high_score. It configures the simulation, but in this case the configuration is a dummy variable, so the elevator is not given any particular starting conditions to learn from; training starts from random placement and actions.

Simulator Excerpt

# Excerpt of simulator class from the elevator_simulator.py file

class ElevatorSimulator(Simulator):

    def episode_start(self, parameters=None):
        print('called episode_start')
        self.env = env = simpy.Environment()
        floors = [simpy.Resource(env, 1) for _ in range(0, BUILDING_SIZE)]
        store = simpy.Store(env, 1)
        state = elevator.Lstate()
        person_score = []
        reqs = []
        env.process(elevator.claim(floors, reqs))
        env.process(elevator.person_generator(env, floors, person_score))
        env.process(elevator.display_process(env, person_score, state))

        # We use the single step version of elevator (elevator_one)
        # this allows the simulator to run until the elevator uses a command.
        ep = env.process(
            elevator.elevator_one(env, floors, state, store, reqs))

        self.floors = floors
        self.store = store
        self.state = state
        self.person_score = person_score
        self.reqs = reqs
        self.ep = ep

        return self._get_state()

    def simulate(self, action):
        command = action['command']
        env = self.env
        # print('[advance]', end='')
        # print('command: {}'.format(command))
        self.state.command = command

        # pass our command to a resource by way of this doit() method
        env.process(doit(self.store, command))

        env.run(until=self.ep)
        self.ep = env.process(elevator.elevator_one(
            self.env, self.floors, self.state, self.store, self.reqs))
        # print('stepped to {}'.format(env.now))

        state = self._get_state()
        done = self._get_done()
        reward = None

        # only calculate reward for training mode
        if not self.predict:
            reward = self._elevator_objective()

        return state, reward, done

    def _get_state(self):
        """ return the current state of the simulation """

        # print('[get_state]', end='')

        # if a floor is requested, state=1
        values = [min(len(q.queue), 1) for q in self.floors]
        state = {'Floor{}'.format(ix+1): v for ix, v in enumerate(values)}
        state['Position'] = self.state.floor
        # print(state)

        return state

    def _get_done(self):
        self.done = done = self.env.now > SIM_TIME
        return done

    def _elevator_objective(self):
        # print('[objective]', end='')
        waiting = elevator.count_waiting(self.person_score)
        # print("returning score %d for %d people" % (active, len(scores)))

        # return as negative because the simulator maximizes this value.
        return -waiting

The full simulator file elevator_simulator.py and the elevator simulation file elevator.py for this example are with the rest of the simpy-elevator code on GitHub.

This is a Python simulator which uses elevator.py, a custom Python simulation built with SimPy. The elevator_simulator.py file re-runs the elevator simulation for each episode of training, tracking details such as where people start out, how many are on the elevator, and what floor they are going to. During each episode, the BRAIN sends a new Action to the simulator based on the FloorState the simulator reports back.

The _elevator_objective function returns a negative waiting value because this value is going to be maximized, and we want to actually minimize the collective group of people’s wait time.

For more information on the functions inside of this simulator class and how to implement them see the Library Reference.

Also note that if you would like to see how this simulation behaves without Bonsai (driven by a hard-coded algorithm instead), you can run elevator.py directly with Python.

OpenAI Gym: Cartpole

Cartpole Balance

Download the full source code on GitHub if you want to run this simulator locally. If you want to run Cartpole remotely on the Bonsai Platform as a managed simulator, create a new BRAIN selecting the Cartpole demo on beta.bons.ai.

In this example, we’ll walk you through the various statements that are part of the Cartpole Inkling file. Each statement is followed by an explanation of the statement.

Cartpole is a classic control problem. OpenAI Gym describes it as:

A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pendulum starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center.
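
If you have the openai-gym package installed, a minimal sketch of driving the raw environment (gym 0.x API) looks like this; the four observation values are what the GameState schema below names position, velocity, angle, and rotation:

import gym

env = gym.make('CartPole-v0')
observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # 0 pushes the cart left, 1 pushes it right
    observation, reward, done, info = env.step(action)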

Inkling File

Schema
schema GameState
    Float32 position,
    Float32 velocity,
    Float32 angle,
    Float32 rotation
end

The schema GameState names four records — position, velocity, angle, and rotation — and assigns a type to them. This information is input from the simulation.

schema Action
    Int8{0, 1} command
end

The schema Action names a record, command, and assigns it a constrained type.

schema CartPoleConfig
    Int8 episode_length,
    UInt8 deque_size
end

The schema CartPoleConfig names two records — episode_length and deque_size — and assigns each of them a type. episode_length is a signed Int8 because -1 is used for “run until pole drops”.

Concept
concept balance is classifier
    predicts (Action)
    follows input(GameState)
    feeds output
end

The concept is named balance, and it takes input from the simulator. That input is the records in the schema GameState. The balance concept outputs the move the AI should make in the simulator. This output is the record in the Action schema.

Simulator
simulator cartpole_simulator(CartPoleConfig) 
    action (Action)
    state (GameState)
end

The cartpole_simulator gets information from two schemas. The first schema, CartPoleConfig, specifies the schema for configuration of the simulation. The second schema contains the state of the simulator that is sent to the lesson.

Curriculum
curriculum balance_curriculum
    train balance
    with simulator cartpole_simulator
    objective open_ai_gym_default_objective

        lesson balancing
            configure
                constrain episode_length with Int8{-1},
                constrain deque_size with UInt8{1}
            until
                maximize open_ai_gym_default_objective
end

The curriculum’s name is balance_curriculum. It trains the balance concept with the cartpole_simulator. The objective for this curriculum is open_ai_gym_default_objective, which measures how long the pole stays upright.

This curriculum contains one lesson, called balancing. It configures the simulation by setting two constraints for the state of the simulator. The lesson trains until the AI has maximized the objective.

Simulator File

import sys
import logging
from bonsai_ai import Brain, Config
from bonsai_gym import GymSimulator

log = logging.getLogger('gym_simulator')
log.setLevel(logging.DEBUG)


class CartPole(GymSimulator):
    # Environment name, from openai-gym
    environment_name = 'CartPole-v0'

    # simulator name from Inkling
    # Example Inkling:
    #   curriculum balance_curriculum
    #       train balance
    #       with simulator cartpole_simulator
    #       ....
    simulator_name = 'cartpole_simulator'

    # convert openai gym observation to our state schema
    # Example Inkling:
    #   schema GameState
    #       Float32 position,
    #       Float32 velocity,
    #       Float32 angle,
    #       Float32 rotation
    #   end
    def gym_to_state(self, observation):
        state = {'position': observation[0],
                 'velocity': observation[1],
                 'angle':    observation[2],
                 'rotation': observation[3]}
        return state

    # convert our action schema into openai gym action
    # Example Inkling:
    #   schema Action
    #       Int8{0, 1} command
    #   end
    def action_to_gym(self, action):
        return action['command']


if __name__ == '__main__':
    # create a brain, openai-gym environment, and simulator
    config = Config(sys.argv)
    brain = Brain(config)
    sim = CartPole(brain)
    sim.run_gym()

This is an OpenAI Gym example which uses the OpenAI environment as its simulator. For more information about the simulator used see the Bonsai Gym Common GitHub repo which is a python library for integrating a Bonsai BRAIN with OpenAI Gym environments.

OpenAI Gym: Mountain Car

Mountain Car Control

Download the full source code on GitHub if you want to run this simulator locally. If you want to run Mountain Car remotely on the Bonsai Platform as a managed simulator, create a new BRAIN selecting the Mountain Car demo on beta.bons.ai.

We’ve used pieces of code from this example in several places, but here we’ll walk you through all the various statements that are part of the Mountain Car Inkling file. Each statement is followed by an explanation of the statement.

Mountain Car is a classic control problem. OpenAI Gym describes it as:

A car is on a one-dimensional track, positioned between two “mountains”. The goal is to drive up the mountain on the right; however, the car’s engine is not strong enough to scale the mountain in a single pass. Therefore, the only way to succeed is to drive back and forth to build up momentum.
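
Unlike Cartpole, this continuous version of the environment takes a floating-point action. A minimal sketch of driving the raw environment (gym 0.x API), assuming the openai-gym package is installed:

import gym
import numpy

env = gym.make('MountainCarContinuous-v0')
observation = env.reset()  # [x_position, x_velocity]

# the continuous environment expects the push force wrapped in a numpy array
action = numpy.asarray([0.5])
observation, reward, done, info = env.step(action)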

Inkling File

Schema
schema GameState
    Float32 x_position,
    Float32 x_velocity
end

The GameState schema names two records — x_position and x_velocity — and assigns a type to them.

schema Action
    Float32{-1.0:1.0} command
end

The Action schema names a single record — command — and assigns a constrained type to it.

schema MountainCarConfig
    UInt8 deque_size
end

The MountainCarConfig schema names a single record - deque_size - and assigns an unconstrained type to it.

Concept
concept high_score is classifier
    predicts (Action)
    follows input(GameState)
    feeds output
end

The concept is named high_score, and it takes input from the simulator about the state of the game (GameState schema). It outputs to the Action schema. This is the AI’s next move in the game.

Simulator
simulator mountaincar_continuous_simulator(MountainCarConfig)
   action  (Action)
   state  (GameState)
end

The mountaincar_continuous_simulator gets information from two schemas. The first schema, MountainCarConfig, specifies the schema for configuration of the simulation. The second schema contains the state of the simulator that is sent to the lesson.

Curriculum
curriculum high_score_curriculum
    train high_score
    with simulator mountaincar_continuous_simulator
    objective open_ai_gym_default_objective

        lesson get_high_score
            configure
                constrain deque_size with UInt8{1}
            until
                maximize open_ai_gym_default_objective
end

The curriculum is named high_score_curriculum, and it trains the high_score concept using the mountaincar_continuous_simulator. This curriculum contains one lesson, called get_high_score. It configures the simulation by setting one constraint for the state of the simulator.

The lesson trains until the AI has maximized the objective open_ai_gym_default_objective.

Simulator File

import sys
import numpy
import logging
from bonsai_ai import Brain, Config
from bonsai_gym import GymSimulator

log = logging.getLogger('gym_simulator')
log.setLevel(logging.DEBUG)


class MountainCarContinuous(GymSimulator):
    environment_name = 'MountainCarContinuous-v0'
    simulator_name = 'mountaincar_continuous_simulator'

    def gym_to_state(self, observation):
        state = {'x_position': observation[0],
                 'x_velocity': observation[1]}
        return state

    # As an Estimator, continuous mountaincar returns the command
    # as a numpy array.
    def action_to_gym(self, actions):
        # return actions['command']
        return numpy.asarray([actions['command']])


if __name__ == '__main__':
    # create a brain, openai-gym environment, and simulator
    config = Config(sys.argv)
    brain = Brain(config)
    sim = MountainCarContinuous(brain)
    sim.run_gym()

This is an OpenAI Gym example which uses the OpenAI environment as its simulator. For more information about the simulator used see the Bonsai Gym Common GitHub repo which is a python library for integrating a Bonsai BRAIN with OpenAI Gym environments.

EnergyPlus HVAC Optimization

EnergyPlus Graph

Download the full source code on GitHub if you want to run this simulator locally. If you want to run EnergyPlus remotely on the Bonsai Platform as a managed simulator, create a new BRAIN selecting the EnergyPlus demo on beta.bons.ai.

In this example, we’ll walk you through the various statements that are part of a sample implementation of EnergyPlus on the Bonsai Platform, including the simulator and the Inkling files. This is a real-world example of how to use the Bonsai Platform for HVAC control using BCVTB and EnergyPlus.

While this BRAIN is training, the Bonsai AI Engine launches the EnergyPlus simulator in the background for every episode. energyplus_simulator.py then drives the simulator forward one step at a time until the episode finishes, relaunching it for the next episode, feeding actions into it and sending state results back to the Bonsai AI Engine.

Inkling File

Schema
schema SimState
    Int32{0:10} SolarIrradiation
end

The SimState schema defines the dictionary returned from the Python simulation’s advance method to the BRAIN.

schema SimAction
    Int32 {0, 1} shade
end

The SimAction schema defines the ‘actions’, a dictionary of control signals this AI can send to the climate control. In this case there is a single signal, shade, which is constrained to 0 or 1.

schema SimConfig
    Int32{-1} unused
end

The SimConfig schema is not used in this case, but it is still required to be defined in Inkling. It would otherwise define the dictionary passed as a parameter to the set_properties method of the Python simulator.

Concept
concept my_concept is classifier
   predicts (SimAction)
   follows input(SimState)
   feeds output
end

This concept is named my_concept, and it predicts a SimAction given a SimState. In this simple demo we just ask the Bonsai Platform to generate any model that can learn to control the system using these inputs and outputs.

Simulator
simulator energyplus_simulator(SimConfig)
    action (SimAction)
    state (SimState)
end

The simulator clause declares that a simulator named energyplus_simulator will be connecting to the server for training. This code snippet binds the previous schemas to this simulator. To define the training relationship between the simulator and the concept we must begin by defining the simulator. energyplus_simulator expects an action defined in the SimAction schema as input and replies with a state defined in the SimState schema as output.

Curriculum
curriculum my_curriculum
    train my_concept
    with simulator energyplus_simulator
    objective reward_function
        lesson my_first_lesson
            configure
                constrain unused with Int32{-1}
            until
                maximize reward_function
end

The curriculum my_curriculum trains my_concept using energyplus_simulator. The BRAIN that runs this Inkling code will try to maximize the value returned from reward_function until you stop training. reward_function is a method in the Python simulator.

This curriculum contains one lesson, called my_first_lesson. It configures the simulation by constraining the unused configuration field.

Simulator Excerpt

# Excerpt of simulator class from the energyplus_simulator.py file

class EnergyPlusSimulator(Simulator):
    """
    Runs the Actuator model for training or prediction by launching it
    against the Ptolemy server above. This uses the Bonsai Simulator
    base class to interface with the BRAIN server.
    """
    model = ePlus85Actuator()
    server = None

    clientState = {'SolarIrradiation': 0}
    shade = 0.
    is_terminal = True

    def episode_start(self, parameters):
        """
        Callback called when a training episode starts. We use this
        to reset the Ptolemy server and start a new simulation session.
        Returns the initial state.
        """
        self.restartPtolemyServer()
        return self.clientState

    def simulate(self, action):
        """
        Callback called when stepping the simulation. It sends actions to the
        model and returns the state of the simulation to the BRAIN.
        """

        # take the action
        self.shade = action['shade'] * 6.

        if self.is_terminal:
            self.restartPtolemyServer()
        else:
            self.server.writeToClient([self.shade])
            self.readFromPtolemyClient()

        # you like graphs? WE HAVE GRAPHS. SO MANY GRAPHS.
        if self.is_terminal:
            graph = self.model.grapher()
            write_graph(graph)

            # clear old data
            self.model.data = ([], [], [], [], [])

        reward = self.reward_function()
        return self.clientState, float(reward), self.is_terminal

    def finished(self):
        """
        Called at the end of the simulation to output the graph.
        """
        graph = self.model.grapher()
        py.plot(graph, filename="graph.html")

    def readFromPtolemyClient(self):
        """
        Utility method used to do the read portion of the exchange
        with the Ptolemy server and client.
        """
        self.server.readFromClient()
        if self.model.fromClient and len(self.model.fromClient) == 4:
            self.clientState = {
                'SolarIrradiation': int(self.model.fromClient[2]/100)
                }

            # save the client input in our graph
            for n in range(len(self.model.fromClient)):
                value = self.model.fromClient[n]
                # scale some of the values for readability
                if n == 2:
                    value /= 100.
                self.model.data[n].append(value)

        self.is_terminal = self.model.exitFlag != 0

    def restartPtolemyServer(self):
        """
        Used to restart the server and setup the initial state.
        """

        # set some default values for get_state
        self.is_terminal = True
        self.clientState = {'SolarIrradiation': 0}

        # close the old connections if they're still open
        if self.server:
            self.server.close()

        # start a new episode
        print("EnergyPlusSimulator: starting PtolemyServer")
        self.server = PtolemyServer(self.model)

        try:
            self.server.start()
            self.server.waitForClient()
            # get initial state
            self.readFromPtolemyClient()

        except OSError as msg:
            print("EnergyPlusSimulator: error on restart:", msg)
            self.server = None

    def reward_function(self):
        """
        Calculates the reward for the current state of the simulation
        """
        print("EnergyPlusSimulator: reward_function")

        # largest reward is best reward (maximize)
        reward = 0
        if self.model.fromClient and len(self.model.fromClient) == 4:
            # SolarIrradiation === Shades down === good
            SolarIrradiation = self.model.fromClient[2] / 100.

            # sun is down
            if SolarIrradiation <= 1:
                if self.shade > 0:
                    reward = -1  # shades on
                else:
                    reward = 1  # shade off

            # sun is out
            else:
                if self.shade > 0:
                    reward = 1  # shades on
                else:
                    reward = -1  # shades off

            self.model.data[4].append(reward)

        print("EnergyPlusSimulator reward:", reward)
        return reward

The full simulator file energyplus_simulator.py for this example is with the rest of the energyplus-sample code on GitHub.

This is a Python simulator for integrating the EnergyPlus simulator into the Bonsai AI Engine. This energyplus_simulator.py file repeatedly runs the EnergyPlus simulator in the background with new actions sent from the Bonsai AI Engine by passing the state from EnergyPlus to the backend, and the action from the backend back to EnergyPlus.
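
The excerpt above omits the launch code. Below is a minimal sketch of how this simulator class would typically be driven, following the same pattern as the other bonsai_ai examples on this page; see energyplus_simulator.py on GitHub for the actual version.

# hypothetical launch code, mirroring the pattern of the other
# bonsai_ai examples on this page
import bonsai_ai

if __name__ == "__main__":
    config = bonsai_ai.Config()
    brain = bonsai_ai.Brain(config)
    sim = EnergyPlusSimulator(brain, "energyplus_simulator")

    print('starting...')
    while sim.run():
        continue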

For more information on the functions inside of this simulator class and how to implement them see the Library Reference.

Simulink Examples

Simulink, developed by The MathWorks, is a graphical programming environment for modeling, simulating and analyzing multi-domain dynamical systems.

Bonsai uses Simulink as a training environment for a Bonsai BRAIN, supporting a wide range of classical control and machine tuning use cases. Please review our simulator and simulation model requirements guide.

Prerequisites

You can also clone the repo on GitHub

git clone https://github.com/BonsaiAI/bonsai-simulink

Installation is covered within each specific example.

  • Python3
  • Bonsai CLI and SDK
  • Asynchronous HTTP Client/Server
  • MATLAB Engine API for Python
  • MATLAB & Simulink (R2017b)

Download the full source code

Simulink Cartpole

This is a step-by-step guide for using Bonsai’s Universal Coordinator in Python to connect the Bonsai Platform to a Simulink model.

You must have MATLAB and Simulink installed. You can download a trial version from The MathWorks.

CLI Guide

We’re using the Bonsai Command Line Interface (CLI) for this example. Refer to the Bonsai CLI guide for more details.

Install Pre-Requisites

pip install aiohttp
pip install bonsai-cli bonsai-ai bonsai-gym

Install Asynchronous HTTP Client/Server and the Bonsai CLI and Bonsai libraries needed to run our examples.

Install Matlab Dependencies:

cd <matlabroot>/extern/engines/python
python setup.py install

Install the Matlab API for Python from the root directory of your Matlab installation.

This example requires the installation of the MATLAB Simscape Multibody Add-On.

Go to your Add-On Explorer in Matlab and add Simscape and Simscape Multibody.

Running and Training

bonsai create simulink-cartpole
bonsai push
bonsai train start



../../coordinator/coordinator --brain=simulink-cartpole

Run the following in the simulink-cartpole directory:

The bonsai push command will upload all needed files to the Bonsai backend. This includes an Inkling file which describes the state, the actions, and the learning curriculum that the Bonsai AI uses for training.

Inkling

Inkling is a declarative, strongly typed programming language specifically designed for artificial intelligence (AI). It abstracts away the world of dynamic AI algorithms that require expertise in machine learning and enables more developers to program AI. Please review our Inkling Guide.

Schema
schema CartpoleState
    Float32 theta,
    Float32 dtheta,
    Float32 x,
    Float32 dx
end

The CartpoleState schema names four records — theta, dtheta, x, and dx — and assigns a type to them.

schema CartpoleAction
    Float32 {-10.0:10.0} f
end

The CartpoleAction schema names a single record — f — and assigns a constrained type to it.

schema CartpoleConfig
    Int8 dummy
end

The CartpoleConfig schema uses a dummy field because this model doesn’t use any configuration.

Concept
concept balance is estimator
   predicts (CartpoleAction)
   follows input(CartpoleState)
   feeds output
end

The concept, balance, takes input from the simulator about the state of the model (CartpoleState schema). It outputs to the CartpoleAction schema. This is the AI’s next control signal to the cartpole.

Simulator
simulator simulink_sim(CartpoleConfig)
    action (CartpoleAction)
    state (CartpoleState)
end

The simulink_sim gets information from two schemas. The first schema, CartpoleConfig, specifies the schema for configuration of the simulation. The second schema, CartpoleState, contains the state of the simulator sent to the lesson.

Curriculum
curriculum my_curriculum
    train balance
    with simulator simulink_sim
    objective cartpole_balance
        lesson my_first_lesson
            configure
                constrain dummy with Int8{-1}
            until
                maximize cartpole_balance
end

The curriculum, my_curriculum, trains the balance concept using the simulink_sim. This curriculum contains one lesson, called my_first_lesson. It configures the simulation, by setting a constraint for the state of the simulator.

The lesson trains until the AI has maximized the objective named cartpole_balance.

Predicting with a BRAIN
bonsai train stop

When you see rewards reaching 1000 you can stop training. You may need to press Ctrl+C in the terminal to stop.

../../coordinator/coordinator --predict

Now you can predict using the trained BRAIN.

Simulink Househeat

This is a step-by-step guide for using Bonsai’s Universal Coordinator in Python to connect the Bonsai Platform to a Simulink model.

You must have MATLAB and Simulink installed. You can download a trial version from The MathWorks.

This example shows how to use Simulink to create the thermal model of a house. The system models the outdoor environment, the thermal characteristics of the house, and the house heating system. The objective for the Bonsai AI is to reach the desired temperature.

CLI Guide

We’re using the Bonsai Command Line Interface (CLI) for this example. Refer to the Bonsai CLI guide for more details.

Pre-requisites to run the Example

pip install aiohttp
pip install bonsai-cli bonsai-ai bonsai-gym

Install Asynchronous HTTP Client/Server and the Bonsai CLI and Bonsai libraries needed to run our examples.

Install Matlab Engine:

cd <matlabroot>/extern/engines/python
python setup.py install

Install the Matlab API for Python from the root directory of your Matlab installation.

Running and Training

bonsai create simulink-househeat
bonsai push
bonsai train start

../../coordinator/coordinator --brain=simulink-househeat

Run the following in the simulink-househeat directory:

The bonsai push command will upload all needed files to the Bonsai backend. This includes an Inkling file which describes the state, the actions, and the learning curriculum that the Bonsai AI uses for training.

Inkling

Inkling is a declarative, strongly typed programming language specifically designed for artificial intelligence (AI). It abstracts away the world of dynamic AI algorithms that require expertise in machine learning and enables more developers to program AI. Please review our Inkling Guide.

Schema
schema HouseheatState
    Float32 set_temp,
    Float32 room_temp,
    Float32 room_temp_change,
    Float32 heat_cost,
    Float32 outside_temp,
    Float32 outside_temp_change
end

The HouseheatState schema names six records — set_temp, room_temp, room_temp_change, heat_cost, outside_temp, and outside_temp_change — and assigns a type to them.

schema HouseheatAction
    Float32{ 0.0:1:1.0 } heater_on
end

The HouseheatAction schema names a single record — heater_on — and assigns a constrained type to it.

schema HouseheatConfig
    Float32 outside_phase
end

The HouseheatConfig schema names one record — outside_phase — and assigns a type to it.

Concept
concept thermostat is classifier
   predicts (HouseheatAction)
   follows input(HouseheatState)
   feeds output
end

The concept, thermostat, takes input from the simulation model about the state of the temperature in the house (HouseheatState schema). It outputs to the HouseheatAction schema. This is the AI’s next move in the game.

Simulator
simulator simulink_sim(HouseheatConfig)
    action (HouseheatAction)
    state (HouseheatState)
end

The simulink_sim gets information from two schemas. The first schema, HouseheatConfig, specifies the schema for configuration of the simulation. The second schema, HouseheatState, contains the state of the simulation sent to the lesson.

Curriculum
curriculum my_curriculum
    train thermostat
    with simulator simulink_sim
    objective match_set_temp
        lesson my_first_lesson
            configure
                constrain outside_phase with Float32{0.0:48.0}
            until
                maximize match_set_temp
end

The curriculum, my_curriculum, trains the thermostat concept using the simulink_sim. This curriculum contains one lesson, called my_first_lesson. It configures the simulation by setting a constraint for the state of the simulator.

The lesson trains until the AI has maximized the objective named match_set_temp.

Predicting with a BRAIN
bonsai train stop

When you see rewards above 100 you can stop training. You may need to press Ctrl+C in the terminal to stop.

../../coordinator/coordinator --predict

Now you can predict using the trained BRAIN.

Simulink Coder

Simulink Coder provides a mechanism to compile Simulink models into fast-running C executables for your operating system. Three main benefits of training Bonsai BRAINs with these executables are as follows.

First, Simulink Coder executables return simulation results much faster than raw Simulink models. When training a BRAIN using deep reinforcement learning, this difference in speed adds up. For example, if you train a BRAIN for 1M iterations with a Simulink Coder executable that runs at 0.5s per iteration instead of 1.5s per iteration, your total training time decreases by about 277 hours!
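
The arithmetic behind those figures, including the parallel scenario discussed below (assuming ideal parallel scaling):

# worked arithmetic for the training-time claims in this section
iterations = 1000000
slow, fast = 1.5, 0.5  # seconds per iteration

saved_hours = (slow - fast) * iterations / 3600
print(round(saved_hours, 1))  # ~277.8 hours saved by switching to the coder executable

remaining_hours = fast * iterations / 3600  # ~138.9 hours, run sequentially
parallel_hours = remaining_hours / 100      # ~1.4 hours across 100 parallel copies
print(round(remaining_hours, 1), round(parallel_hours, 1))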

Second, Simulink Coder executables are much easier to connect to the Bonsai Platform in parallel because they require far less memory and CPU than instances of interactive Simulink models. Continuing the example above, if you run 100 copies of your coder executable in parallel, the remaining training time drops from roughly 139 hours to under 2 hours.

Lastly, Simulink Coder executables allow you to share your models with people and teams, who can then use them to train Bonsai BRAINs without a MATLAB or Simulink license.

Visit The MathWorks for more information on MATLAB and Simulink Coder.

How to connect your own model

Please review the HOWTO.md file in the downloaded folder for more information on how to connect your own Simulink model to the Bonsai AI platform.