Evolutionary strategy 2 - Elitarian selection evolution

티스토리 뷰

keep9oing

Evolutionary strategy 2 - Elitarian selection evolution

HTS3 2021. 7. 18. 00:56

Evolutionary 알고리즘에 대한 개략적인 설명은 이전 포스트에서 언급됐다.

Evolutionary strategy 1 - Simple Gaussian Evolution

최근 RL관련 논문이나 구루들의 토의 영상들을 보면 evolution strategy(혹은 evolution algorithm)이란 단어가 심심치 않게 등장한다. Evolution algorithm은 Black box optimization의 일종이다. Black box opti..

ropiens.tistory.com

오늘은 $(\mu, \lambda)$-ES 방식과는 약간 다른 Elitarian selection $(\mu + \lambda)$-ES에 대해 알아보겠다.

Elitarian selection $(\mu + \lambda)$-ES

Gaussian search와 동일한 방식으로 데이터를 샘플링하지만 한가지 다른점은 이전에 샘플링한 데이터 셋을 모두 버리지 않고 과거와 현재의 데이터셋의 합집합에서 가장 좋은 샘플들을 살려놓고 다음 샘플링을 진행한다는 것이다. 이런 특성으로인해 좋은 성능을 내는 Good parents들도 다음세대에 살아남기때문에 "elitarian"(엘리트주의)이란 이름이 붙는다.이때 $\mu=1, \lambda=1$이면 Hill Climber 알고리즘이다. 이 알고리즘은 수렴성에 대한 많은 이론들이 존재한다.

Implementation

문제 상황은 Gaussian Search때와 같다. 이 정도로 간단한 문제에서는 성능면으로 큰 차이를 보이지 않는다.

# "elitarian" selection (mu+lambda ES)
from matplotlib import pyplot as plt
import numpy as np

# Configuration
theta=[0.0, 0.0]    
mu_=100
lambda_= 5000     
target=[100.0, 100.0]
var=1.0
step=1

assert isinstance(theta, list) and len(theta)==2,"theta should be 2D list"
assert isinstance(target, list) and len(target)==2, "target point should be 2D list"

cov = [[var, 0], [0, var]]
theta = np.array(theta)
target = np.array(target)


while np.linalg.norm(theta-target, ord=2)>=0.01:
    # Plot
    if step%5==0:
        plt.plot(sampled_theta[:,0],sampled_theta[:,1],'x',label='theta')
        plt.plot(target[0], target[1], 'ro', label='target')
        plt.title('Step:'+str(step))
        plt.xlim([-5,105])
        plt.ylim([-5,105])
        plt.legend()
        plt.show()
        
    # Sample data from distribution model
    sampled_theta = np.random.multivariate_normal(theta, cov, lambda_) # x ~ p_t(x) = N(x|x_hat, sigma^2)

    # Evaluate Samples
    loss = np.linalg.norm(sampled_theta-target,ord=2,axis=1)

    # dataset D
    D = np.concatenate((sampled_theta,loss[:,None]),axis=1)

    if step == 1:  
        D_prime = np.copy(D)
    else:
        D_prime = np.concatenate((D, D_prime),axis=0)

    # Update D' by select mu best from D'UD
    best_idx = []
    loss_=D_prime[:,-1]
    for i in range(mu_):
        best_idx.append(loss_.argmin())
        loss=np.delete(loss_,loss_.argmin())
    D_prime = D_prime[np.array(best_idx)]
    
    # Update heuristic
    theta = D_prime[D_prime[:,-1].argmin()][:-1]
    
    step += 1
    
plt.plot(sampled_theta[:,0],sampled_theta[:,1],'x',label='theta')
plt.plot(target[0], target[1], 'ro', label='target')
plt.title('Step:'+str(step))
plt.xlim([-5,105])
plt.ylim([-5,105])
plt.legend()
plt.show()

Little Summary of EA(Evolutionary Algorithms)

아주 간단한 EA관련 알고리즘 2개에 대해 알아보았다. 전체적인 프레임웍을 진화적 관점에서 정리해보자면

$\theta$는 지금까지 발견한 가장 좋은 성능의 샘플들의 집합이다.
돌연변이와 번식의 과정은 $p_{\theta}(x)$를 통해 일어난다.
샘플된 데이터셋 $D$는 자손 세대이다.
$\theta$의 업데이트는 가장 적합한 집합을 $D$에서 고르는 것으로 한다.

EA의 종류는 여러가지가 있는데 특징들을 정리하자면

Evolution Strategies: $x \in \mathbb{R}^{n}$
Genetic Algorithms: $x \in\{0,1\}^{n}$
Genetic Programming: $x$ 는 program이나 tree다.
Estimation of Distribution Algorithms: $\theta$가 직접적으로 $p_{\theta}(x)$를 정의한다

OK Then, What?

앞으로 있을 포스팅은 본디 목적에 따른 Evolution Strategies들을 중점적으로 다룰 듯 하다.

CMA-ES, NES을 다루고 강화학습이나 딥러닝에 적용된 CEM-RL, PBT, WANN등도 다룰 수 있다면,, 다뤄보도록 하겠다.

'keep9oing' 카테고리의 다른 글

Robotarium - 원격 로봇 테스트 베드 (5)	2022.04.07
로봇 작업 할당 - Hungarian Algorithm (9)	2022.04.05
Evolutionary strategy 1 - Simple Gaussian Evolution (7)	2021.07.16
Graph spectral partitioning/clustering with python (0)	2021.05.13
강화학습 환경들 (9)	2021.04.15

공지사항

최근에 올라온 글

최근에 달린 댓글

Total

Today

Yesterday

링크

TAG more

일	월	화	수	목	금	토
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

로봇이 아닙니다.

티스토리 뷰