How to use AI to play Sonic the Hedgehog. It's NEAT!

Generation after generation, humans have adapted to become better suited to our surroundings. We started as primates living in an eat-or-be-eaten world. Eventually we evolved into who we are today, a reflection of modern society. Through the process of evolution we become more intelligent, better able to work with our environment and get what we need.

The concept of learning through evolution can also be applied to artificial intelligence. We can train AIs to perform certain tasks using NEAT, NeuroEvolution of Augmenting Topologies. Simply put, NEAT is an algorithm that takes a batch of AIs (genomes) attempting a given task. The top-performing AIs "breed" to create the next generation. This process continues until we have a generation that is capable of completing the task.
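
To make this concrete, the generational loop at the heart of NEAT can be sketched in a few lines of Python-style pseudocode. The function names here are illustrative placeholders, not a real library API:

population = create_initial_genomes(size=20)           # a batch of random AIs

for generation in range(max_generations):
    for genome in population:
        genome.fitness = evaluate(genome)              # let each AI attempt the task
    if max(g.fitness for g in population) >= fitness_threshold:
        break                                          # a genome has completed the task
    parents = select_fittest(population)               # the best performers survive
    population = breed_and_mutate(parents)             # they "breed" the next generation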

NEAT is amazing because it eliminates the need for pre-existing data to train our AIs. Using the power of NEAT and OpenAI's Gym Retro, I trained an AI to play Sonic the Hedgehog for the SEGA Genesis. Let's learn how!

NEAT Neural Network (Python implementation)

GitHub Repository

Vedant-Gupta523 / sonicNEAT


Note: All of the code in this article and in the repository above is a slightly modified version of Lucas Thompson's Sonic AI Bot Using Open-AI and NEAT YouTube tutorials and code.

Understanding OpenAI Gym

If you are not already familiar with OpenAI Gym, read through the terminology below; these terms will be used frequently throughout the article. A minimal example after the list shows them in action.

agent - The AI player. In this case, it will be Sonic.

environment - The agent's complete surroundings. The game environment.

action - Something the agent has the option of doing (i.e. move left, move right, jump, do nothing).

step - Performing 1 action.

state - A frame of the environment. The current situation the AI is in.

observation - What the AI observes from the environment.

fitness - How well our AI is performing.

done - When the AI has completed its task or cannot continue any further.
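
To ground these terms, here is a minimal sketch of a Gym Retro loop in which an agent takes random actions; it assumes the Sonic environment we will set up later in the article:

import retro

env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
ob = env.reset()                                # the initial observation
done = False
while not done:
    action = env.action_space.sample()          # pick a random action
    ob, reward, done, info = env.step(action)   # take 1 step and observe the result
env.close()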

Installing the dependencies

Below are the GitHub links for OpenAI Gym Retro and NEAT-Python, along with installation instructions.

OpenAI: https://github.com/openai/retro

NEAT: https://github.com/CodeReclaimers/neat-python

Pip install libraries such as cv2 and numpy (cv2 is provided by the opencv-python package; pickle ships with the Python standard library, so it does not need to be installed).

Importing libraries and setting up the environment

To start, we need to import all of the modules we will use:

import retro
import numpy as np
import cv2
import neat
import pickle

We also define our environment, consisting of a game and a state:

env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")

In order to train an AI to play Sonic the Hedgehog, you will need the game's ROM (the game file). The easiest way to get it is to buy the game off of Steam for $5. You could also find free ROM downloads online, but that is illegal, so don't do it.

In the OpenAI retro repository, under retro/retro/data/stable/, you will find a folder for Sonic the Hedgehog Genesis. Place the game's ROM here and make sure it is called rom.md. This folder also contains the .state files. You can pick one and set the state parameter equal to it. I chose GreenHillZone Act 1 since it is the very first level of the game.
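
If you want to confirm that the ROM was picked up, Gym Retro can list the games and save states it knows about. This is just a sanity-check sketch, assuming a standard gym-retro installation:

import retro

# Should print True once rom.md is in the right folder
print("SonicTheHedgehog-Genesis" in retro.data.list_games())

# Lists the available .state files, e.g. "GreenHillZone.Act1"
print(retro.data.list_states("SonicTheHedgehog-Genesis"))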

Understanding data.json and scenario.json

In the Sonic the Hedgehog folder, you will find these two files:

data.json

{ "info": { "act":  "address": 16776721, "type": ", "level_end_bonus":  "address": 16775126, "type": ", "lives":  "address": 16776722, "type": ", "rings": { "address": 16776736, "type": ">u2" }, "score": { "address": 16776742, "type": ">u4" }, "screen_x": { "address": 16774912, "type": ">u2" }, "screen_x_end": { "address": 16774954, "type": ">u2" }, "screen_y": { "address": 16774916, "type": ">u2" }, "x": { "address": 16764936, "type": ">i2" }, "y": { "address": 16764940, "type": ">u2" }, "zone": u1"  } }

scenario.json

{ "done": { "variables": { "lives": { "op": "zero" } } }, "reward": { "variables": { "x": { "reward": 10.0 } } } }

Both of these files contain important information pertaining to the game and its training.

As it sounds, the data.json file contains information/data on different game-specific variables (i.e. Sonic's x-position, the number of lives he has, etc.).

The scenario.json file allows us to perform actions in sync with the values of the data variables. For example, we can reward Sonic 10.0 every time his x-position increases. We could also set our done condition to true when Sonic's lives hit 0.
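
Concretely, these definitions surface through the values returned by env.step, which we will use later in the training loop (a sketch of gym-retro's scenario semantics as I understand them):

ob, rew, done, info = env.step(action)
# rew  -> driven by changes in "x", scaled by scenario.json's "reward": 10.0
# done -> becomes True once "lives" reaches zero, per the "op": "zero" condition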

Understanding NEAT feedforward configuration

The config-feedforward file can be found in my GitHub repository linked above. It acts like a settings menu to set up our training. To point out a few simple settings:

fitness_threshold = 10000   # How fit we want Sonic to become
pop_size = 20               # How many Sonics per generation
num_inputs = 1120           # Number of inputs into our model
num_outputs = 12            # 12 buttons on the Genesis controller

There are tons of settings you can experiment with to see how they affect your AI's training! To learn more about NEAT and the different settings in the feedforward configuration, I would highly recommend reading the documentation here.

Putting it all together: Creating the Training File

Setting up configuration

Our feedforward configuration is defined and stored in the variable config.

config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     'config-feedforward')

Creating a function to evaluate each genome

We start by creating the function eval_genomes, which will evaluate our genomes (a genome could be compared to 1 Sonic in a population of Sonics). For each genome we reset the environment and sample a random action:

def eval_genomes(genomes, config):
    for genome_id, genome in genomes:
        ob = env.reset()                    # a fresh observation for this genome
        ac = env.action_space.sample()      # a random action

We also record the game environment's height, width, and number of color channels, and divide the height and width by 8:

inx, iny, inc = env.observation_space.shape   # height, width, color channels
inx = int(inx/8)
iny = int(iny/8)

We create a recurrent neural network (RNN) using the NEAT library and input the genome and our chosen configuration.

net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)

Finally, we define a few variables: current_max_fitness (the highest fitness in the current population), fitness_current (the current fitness of the genome), frame (the frame count), counter (to count the number of steps our agent takes), xpos (the x-position of Sonic), and done (whether or not we have reached our fitness goal).

current_max_fitness = 0
fitness_current = 0
frame = 0
counter = 0
xpos = 0
done = False

While we have not met our done requirement, we render the environment, increment our frame counter, and reshape our observation into something the network can work with (still for each genome).

env.render()
frame += 1
ob = cv2.resize(ob, (inx, iny))             # downscale the frame
ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)   # convert to grayscale
ob = np.reshape(ob, (inx, iny))

We will take our observation and put it in a one-dimensional array, so that our RNN can understand it. We receive our output by feeding this array to our RNN.

imgarray = np.ndarray.flatten(ob)   # flatten the 2-D frame into a 1-D array
nnOutput = net.activate(imgarray)   # feed the array through the network

Using the output from the RNN our AI takes a step. From this step we can extract fresh information: a new observation, a reward, whether or not we have reached our done requirement, and information on variables in our data.json (info).

ob, rew, done, info = env.step(nnOutput)

At this point we need to evaluate our genome’s fitness and whether or not it has met the done requirement.

We look at our “x” variable from data.json and check if it has surpassed the length of the level. If it has, we will increase our fitness by our fitness threshold, signifying we are done.

xpos = info['x']
if xpos >= 10000:              # past the end of the level
    fitness_current += 10000
    done = True

Otherwise, we will increase our current fitness by the reward we earned from performing the step. We also check if we have a new highest fitness and adjust the value of our current_max_fitness accordingly.

fitness_current += rew
if fitness_current > current_max_fitness:
    current_max_fitness = fitness_current
    counter = 0
else:
    counter += 1

Lastly, we check if we are done or if our genome has taken 250 steps. If so, we print information on the genome which was simulated. Otherwise we keep looping until one of the two requirements has been satisfied.

if done or counter == 250:
    done = True
    print(genome_id, fitness_current)

genome.fitness = fitness_current

Defining the population, printing training stats, and more

The absolute last thing we need to do is define our population, print out statistics from our training, save checkpoints (in case you want to pause and resume training), and pickle our winning genome.

p = neat.Population(config)
p.add_reporter(neat.StdOutReporter(True))   # print training statistics
stats = neat.StatisticsReporter()
p.add_reporter(stats)
p.add_reporter(neat.Checkpointer(1))        # save a checkpoint every generation
winner = p.run(eval_genomes)

with open('winner.pkl', 'wb') as output:
    pickle.dump(winner, output, 1)

All that’s left is the matter of running the program and watching Sonic slowly learn how to beat the level!
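
Once training finishes, you may want to watch the winning genome play back. The repository's exact replay code may differ; the sketch below simply loads winner.pkl and reuses the same environment and preprocessing as the training loop:

import pickle

import cv2
import neat
import numpy as np
import retro

env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     'config-feedforward')

with open('winner.pkl', 'rb') as f:
    winner = pickle.load(f)

net = neat.nn.recurrent.RecurrentNetwork.create(winner, config)

inx, iny, inc = env.observation_space.shape
inx, iny = int(inx/8), int(iny/8)

ob = env.reset()
done = False
while not done:
    env.render()
    ob = cv2.resize(ob, (inx, iny))
    ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
    ob, rew, done, info = env.step(net.activate(np.ndarray.flatten(ob)))
env.close()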

To see all of the code put together check out the Training.py file in my GitHub repository.

Bonus: Parallel Training

If you have a multi-core CPU you can run multiple training simulations at once, drastically increasing the rate at which you can train your AI! Although I will not go through the specifics of how to do this in this article, I highly suggest you check out the sonicTraning.py implementation in my GitHub repository. A rough sketch of the idea follows.
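
For reference, NEAT-Python ships a ParallelEvaluator that farms genome evaluation out to worker processes. The outline below is only a sketch of the idea, not the repository's actual implementation: it restructures the evaluation into a per-genome function returning a fitness, which is what ParallelEvaluator expects:

import cv2
import neat
import numpy as np
import retro

def eval_genome(genome, config):
    # Each worker process creates its own environment instance
    env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
    net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)
    ob = env.reset()
    inx, iny, inc = env.observation_space.shape
    inx, iny = int(inx/8), int(iny/8)
    fitness, counter, done = 0.0, 0, False
    while not done and counter < 250:
        ob = cv2.resize(ob, (inx, iny))
        ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
        ob, rew, done, info = env.step(net.activate(np.ndarray.flatten(ob)))
        fitness += rew
        counter = 0 if rew > 0 else counter + 1   # simplified stagnation counter
    env.close()
    return fitness

config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     'config-feedforward')
p = neat.Population(config)
pe = neat.ParallelEvaluator(4, eval_genome)   # 4 worker processes
winner = p.run(pe.evaluate)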

Conclusion

That’s all there is to it! With a few adjustments, this framework is applicable to any game for the NES, SNES, SEGA Genesis, and more. If you have any questions or you just want to say hello, feel free to email me at vedantgupta523[at]gmail[dot]com!

Also, be sure to check out Lucas Thompson's Sonic AI Bot Using Open-AI and NEAT YouTube tutorials and code to see what originally inspired this article.

Key Takeaways

  1. NeuroEvolution of Augmenting Topologies (NEAT) is an algorithm used to train AI to perform certain tasks. It is modeled after genetic evolution.
  2. NEAT eliminates the need for pre-existing data when training AI.
  3. OpenAI and NEAT, implemented in Python, can be used to train an AI to play just about any game.