Success Stories With Prof. Neil Leach
According to the French philosopher Henri Lefebvre, everyday life in the city should be understood in terms of rhythms. Jazz, too, is made up of rhythms, forever mutating and changing against a background condition. This workshop explored the possibility of using advanced machine learning to generate patterns and rhythms. It explored the potential of pattern recognition and prediction, which lie at the heart of Artificial Intelligence (AI), as a generative technique to trace existing patterns and extrapolate variations on those patterns, much like the logic of jazz. Susmita was able to meet the goals of Jazz Urbanism with Neil Leach, who is a NASA Innovative Advanced Concepts Fellow.
In jazz, the band members constantly exchange leadership roles, either to infuse novelty and creativity or to reinforce the underlying harmonic structure. The idea is to achieve a “Fantastic Harmony”. This is the principle behind this workshop. Imagine an AI joining in, first trying to grasp the essence, rather than the process, of improvisation, learning by imitating the other players while also developing its own novel strategies and tactics, so as to help the entire band get closer and closer to its common goal of reaching the fantastic harmonic state.
NO MAN IS BETTER THAN A MACHINE,
NO MACHINE IS BETTER THAN A MAN WITH A MACHINE.
The result is a form of creative jamming between human and artificial intelligence. This hybrid human-machine modus operandi, borrowing its fundamentals from cyborg philosophy, forms the base of the new paradigm of Extended Intelligence (EI). As Paul Tudor Jones notes, “No man is better than a machine, and no machine is better than a man with a machine.”
THE RESULT IS A CREATIVE HUMAN – AI JAMMING
In this workshop we went on an extraordinary computational journey. We explored how the latest advances in Machine Learning can be understood and used by designers in conjunction with gaming technologies. Using Unity’s recently released Machine Learning toolkit (Unity ML-Agents) and state-of-the-art algorithms (based on TensorFlow), we created and trained intelligent agents capable of sensing and interacting with each other and with their environment. Each agent can be considered an instrument within an ensemble, with a different range of expressions triggered by sensitivities to specific harmonic structures. The trained agents are then exposed to a live feed from a camera recording choreographies of movement in the exhibition space. They begin to jam, taking on lives of their own that remain mysterious to human beings.
DigitalFUTURES 2019 Summer Workshop “Architectural Intelligence” achieved great success. In Group 7-1, Prof. Neil Leach and Claudiu Barsan-Pipu from Tongji University led 16 students from the University of New South Wales, Cairo University, University College London, Tsinghua University, Tongji University, the University of Hong Kong, UNStudio, Zaha Hadid Architects, and other universities and architectural firms across the world in researching the potential application of Artificial Intelligence to architecture and urbanism under the theme of “Jazz Urbanism”.
Tutors: Prof. Neil Leach | Claudiu Barsan-Pipu
Students: Oana Nituica | Zhuoxing GU | Xinyi YANG | Andrea Macruz | Shaoxing ZENG | Xuexin DUAN | Abhijit Mojumder | Xiaohan XU | Lin MENG | Reham Ahmed Abdelwahab Hassan | Xi PENG | Zimian CHEN | Susmita Biswas Sathi | Radwa Ahmed Abdelhamid Abdelhafez | Jie SHEN
Prof. Neil Leach giving a lecture
(1) Understanding the principles
Before going into the built-in features of ML-Agents in Unity, the general rules for agent training were introduced. Training is based on the following concepts, which describe how agents perceive, act, and are rewarded for their respective actions:
Agent Observations——how agents perceive the environment from their local perspective. These can be numeric (measurable) attributes, either discrete or continuous, as well as visual observations, where images generated from the viewpoint(s) of the agent are used to drive the training behavior.
Agent Actions——in response to the observations the agent makes about its local environment, actions can be taken in a continuous or discrete manner. These can relate, for example, to continuous changes in position or to performing a task or interaction at specific points in time.
Agent Reward Signals——this is a measure of the agent’s performance that may or may not be assessed at every step. The reward signals are the means by which the objectives of the task are communicated to the agent, so they need to be set up in a manner where maximizing reward generates the desired optimal behavior. The reward can be positive (encouraging a specific behavior) or negative (as a penalty).
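The observation/action/reward vocabulary above can be sketched in a few lines of plain Python. This is a minimal, illustrative stand-in, not the Unity ML-Agents C# API: the class name, the 1-D chasing task, and the reward rule are all invented for the example.

```python
class ChaserAgent:
    """Toy 1-D agent illustrating observations, actions, and reward signals.
    All names here are illustrative, not part of the ML-Agents API."""

    def __init__(self, position=0.0, target=10.0):
        self.position = position
        self.target = target

    def collect_observations(self):
        # Continuous numeric observation: signed distance to the target.
        return [self.target - self.position]

    def act(self, action):
        # Continuous action: a step along the axis, clamped to [-1, 1].
        step = max(-1.0, min(1.0, action))
        before = abs(self.target - self.position)
        self.position += step
        after = abs(self.target - self.position)
        # Positive reward for moving closer, negative for moving away.
        return before - after

agent = ChaserAgent()
obs = agent.collect_observations()  # signed distance to the target
reward = agent.act(1.0)             # positive: the step moved us closer
```

Maximizing this reward drives the agent toward the target, which is exactly the sense in which a reward signal "communicates the objective of the task" to the agent.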
The reinforcement learning cycle
For Unity’s ML-Agents, the reinforcement learning of agents builds on the agent features described above, in simulations comprising the following elements:
Learning Environment——containing the Unity environment used as a base for the simulation
Python API——which contains all the machine learning algorithms that are used for training (learning a behavior or policy) and communicates with Unity through the External Communicator.
External Communicator——lives inside the Learning Environment and connects it with the Python API.
Schematic Diagram of the Proposed Reinforcement Learning Environment in Unity
The Learning Environment contains three additional components:
Agents——each Agent generates observations, performs the actions it receives, and assigns a reward (positive / negative) when appropriate. Each Agent is linked to exactly one Brain.
Brains——each Brain encapsulates the logic for making decisions for its Agents. The Brain holds the policy for each Agent and determines which actions the Agent should take at each instance. It is the component that receives the observations and rewards from the Agent and returns an action.
Academy——orchestrates the observation and decision-making process and holds the environment-wide parameters controlling the simulation.
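The Agent–Brain–Academy relationship can be sketched as three small Python classes. This is a conceptual sketch only: the real components are Unity C# objects, and the trivial "step toward the target" policy here is invented for illustration.

```python
class Brain:
    """Holds the decision policy shared by one or more agents (illustrative)."""
    def decide(self, observation):
        # Trivial policy: step toward the target at unit speed.
        distance = observation[0]
        return 1.0 if distance > 0 else -1.0

class Agent:
    """Generates observations, performs actions, assigns rewards.
    Each agent is linked to exactly one Brain."""
    def __init__(self, brain, position=0.0, target=5.0):
        self.brain = brain
        self.position = position
        self.target = target

    def step(self):
        observation = [self.target - self.position]
        action = self.brain.decide(observation)
        self.position += action
        return -abs(self.target - self.position)  # reward: closeness to target

class Academy:
    """Orchestrates the observe/decide/act cycle for every agent."""
    def __init__(self, agents):
        self.agents = agents

    def tick(self):
        return [agent.step() for agent in self.agents]

# Two agents sharing one Brain, stepped together by the Academy.
shared_brain = Brain()
academy = Academy([Agent(shared_brain), Agent(shared_brain, position=2.0)])
rewards = academy.tick()
```

Note how the same Brain serves several Agents while each Agent keeps its own state and reward, mirroring the one-Brain-to-many-Agents linkage described above.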
(2) Multiple training modes
During the workshop, we addressed the multiple training modes provided by Unity’s ML-Agents:
Built-in Training and Inference
The ML-Agents toolkit comes with several implementations of state-of-the-art algorithms for training intelligent agents.
The Training Phase
During training, all the agents in the environment send their observations to the Python API via the
External Communicator. The Python API processes these observations and sends back actions for each agent to take. During training, these actions are mostly exploratory to help the Python API learn the best policy for each agent. Once training concludes, the learned policy for each agent can be exported, and since all implementations are based on TensorFlow, the learned policy is just a TensorFlow model file.
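The exploratory loop described above can be caricatured in a few lines of Python. The `train` function and the toy reward below are invented stand-ins for the Python API's TensorFlow-based trainers (such as PPO), not the real training code.

```python
import random

def train(env_step, episodes=200, seed=0):
    """Sketch of the training phase: the trainer (standing in for the
    Python API) sends mostly exploratory actions, observes the reward the
    environment returns, and keeps the best policy found. Illustrative
    only; the real toolkit learns a neural-network policy."""
    rng = random.Random(seed)
    best_action, best_reward = None, float("-inf")
    for _ in range(episodes):
        action = rng.uniform(-1.0, 1.0)  # exploratory action
        reward = env_step(action)        # environment evaluates it
        if reward > best_reward:
            best_action, best_reward = action, reward
    return best_action                   # the "exported policy"

# Toy environment: reward is highest when the action is close to 0.7.
policy = train(lambda a: -abs(a - 0.7))
```

After enough exploratory episodes, the returned action concentrates near the optimum, which is the caricatured analogue of exporting the learned TensorFlow model file once training concludes.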
The Inference Phase
In this phase, the agents continue to generate their observations, but instead of being sent to the Python API, the observations are fed into the agent’s internal, embedded model to generate the optimal action at every point in time.
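The contrast with training can be sketched as follows: at inference time the policy is deterministic and local to the agent. The `EmbeddedPolicy` class and its linear gain are hypothetical stand-ins for the exported TensorFlow model file.

```python
class EmbeddedPolicy:
    """Stands in for the exported model: at inference time the agent feeds
    observations into this embedded policy instead of sending them to the
    Python API (illustrative only)."""

    def __init__(self, learned_gain):
        self.gain = learned_gain

    def optimal_action(self, observation):
        # Deterministic: no exploration once training has concluded.
        distance = observation[0]
        return max(-1.0, min(1.0, self.gain * distance))

policy = EmbeddedPolicy(learned_gain=0.5)
action = policy.optimal_action([4.0])  # clamped to the action range
```

Because the model is embedded, inference needs no External Communicator and no Python process: the trained behavior runs entirely inside the Unity scene.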
Curriculum Learning
This mode is an extension of Built-in Training and Inference, and is particularly helpful when training intricate behaviors for complex environments. Curriculum learning is a way of training a machine learning model in which more difficult aspects of a problem are gradually introduced, such that the model is always optimally challenged. The ML-Agents toolkit supports setting custom environment parameters within the Academy, allowing elements of the environment related to difficulty or complexity to be dynamically adjusted based on training progress.
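The idea of scheduling an environment parameter by training progress can be sketched directly. The thresholds and parameter values below are invented for illustration; in the toolkit they would live in a curriculum configuration read by the Academy.

```python
def curriculum_parameter(progress, thresholds=(0.2, 0.5, 0.8),
                         values=(1.0, 2.0, 4.0, 8.0)):
    """Sketch of curriculum learning: an environment parameter (say, the
    distance to a target) grows as training progress crosses thresholds,
    so the model is always challenged but never overwhelmed. Thresholds
    and values are hypothetical."""
    for threshold, value in zip(thresholds, values):
        if progress < threshold:
            return value
    return values[-1]

easy = curriculum_parameter(0.1)   # early training: easiest lesson
hard = curriculum_parameter(0.9)   # late training: full difficulty
```

Each reward-progress threshold unlocks a harder "lesson", which is the essence of keeping the model optimally challenged.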
Imitation Learning
Often it is more intuitive to simply demonstrate the behavior we want an agent to perform, rather than having it learn via trial-and-error methods. For example, instead of training the agent by setting up its reward function, this mode allows providing real examples from a game controller of how the agent should behave. More specifically, in this mode the Brain type during training is set to Player, and all the actions performed with the controller (in addition to the agent observations) are recorded and sent to the Python API. The imitation learning algorithm then uses these pairs of observations and actions from the human player to learn a policy.
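Learning a policy from recorded (observation, action) pairs can be sketched with a nearest-neighbour lookup. This is an invented stand-in for the idea: the real imitation learning algorithm trains a neural network on the demonstrations rather than memorizing them.

```python
def clone_policy(demonstrations):
    """Sketch of imitation learning: (observation, action) pairs recorded
    from a human player become a policy by nearest-neighbour lookup over
    the demonstrated observations (illustrative only)."""
    def policy(observation):
        nearest = min(demonstrations,
                      key=lambda pair: abs(pair[0] - observation))
        return nearest[1]
    return policy

# Demonstrations: move fast when far from the target, slow down when close.
demos = [(10.0, 1.0), (5.0, 0.5), (1.0, 0.1), (0.0, 0.0)]
policy = clone_policy(demos)
action = policy(8.0)  # imitates the closest demonstrated situation
```

Even this crude lookup reproduces the demonstrated slow-down-near-the-target behavior without any reward function being specified.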
AI-Driven Multi-Agent Urbanism
In addition, ML-Agents allows for the following scenarios for training intelligent agents:
Single Agent——a single agent linked to a single Brain, with its own reward signal.
Simultaneous Single-Agent——multiple independent agents with independent reward signals linked to a single Brain.
Adversarial Self-Play——two interacting agents with inverse reward signals linked to a single Brain.
Cooperative Multi-Agent——multiple interacting agents with a shared reward signal linked to either a single or multiple different Brains.
Competitive Multi-Agent——multiple interacting agents with inverse reward signals linked to either a single or multiple different Brains.
Agents Ecosystem——multiple interacting agents with independent reward signals linked to either a single or multiple different Brains.
With simple rules set up in Unity, multiple agents were trained to chase a target; the same trained brain was then applied to different types of agents with different local behaviors, which accumulate into a global behavior.
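The principle of one shared rule producing different local behaviors, and those local behaviors accumulating into a global pattern, can be sketched like this. The chasing rule, the agent speeds, and the 1-D setting are all invented for illustration.

```python
def chase_step(position, target, speed):
    """One update of the shared chasing rule; each agent type plugs in its
    own speed, so identical logic yields different local behaviors."""
    if abs(target - position) <= speed:
        return target
    direction = 1.0 if target > position else -1.0
    return position + direction * speed

def simulate(positions, speeds, target, steps):
    """Run several agent types under the same shared rule; the accumulated
    local moves form the global chasing behavior (illustrative sketch)."""
    for _ in range(steps):
        positions = [chase_step(p, target, s)
                     for p, s in zip(positions, speeds)]
    return positions

# A slow agent and a fast agent, starting on opposite sides of the target.
final = simulate(positions=[0.0, 20.0], speeds=[1.0, 2.0], target=10.0, steps=10)
```

Although each agent only applies its local rule, the population as a whole converges on the target, which is the global behavior emerging from accumulated local ones.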
Final results of the group
Different moments of the behaviors, Oana Nituica