Huawei HEBO (河伯)

2022-10-24  Author: 臻甄

Huawei Noah's Ark Lab has open-sourced a repository covering Bayesian optimisation and RL. It contains several parts:

HEBO github & paper

Repository: https://github.com/huawei-noah/HEBO
Among its components, HEBO (河伯) is the winning solution of the NeurIPS 2020 Black-Box Optimisation Challenge.

Features claimed by the HEBO repository

Paper: HEBO: Heteroscedastic Evolutionary Bayesian Optimisation

GitHub doc

import torch
from hebo.design_space.design_space import DesignSpace
params = [
    {'name' : 'hidden_size', 'type' : 'int', 'lb' : 16, 'ub' : 128},
    {'name' : 'batch_size',  'type' : 'int', 'lb' : 16, 'ub' : 128},
    {'name' : 'lr', 'type' : 'pow', 'lb' : 1e-4, 'ub' : 1e-2, 'base' : 10},
    {'name' : 'use_bn', 'type' : 'bool'},
    {'name' : 'activation', 'type' : 'cat', 'categories' : ['relu', 'tanh','sigmoid']},
    {'name' : 'dropout_rate', 'type' : 'num', 'lb' : 0.1, 'ub' : 0.9},
    {'name' : 'optimizer', 'type' : 'cat', 'categories' : ['sgd', 'adam', 'rmsprop']}
]

space = DesignSpace().parse(params)
space.sample(5)  # random sampling; returns a pandas DataFrame
# Internally, hebo converts every parameter type to torch.FloatTensor or torch.LongTensor
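
The 'pow' type used for lr above is, as I understand it, meant for parameters that are searched on a log scale. The sketch below illustrates that idea in plain Python: sample uniformly in the exponent, then map back. It is my own illustration, not HEBO's implementation, and the helper name sample_pow is hypothetical.

```python
import math
import random

def sample_pow(lb, ub, base=10, rng=random):
    """Draw one value that is uniform in log_base space between lb and ub."""
    lo = math.log(lb, base)   # exponent of the lower bound, about -4 for 1e-4
    hi = math.log(ub, base)   # exponent of the upper bound, about -2 for 1e-2
    exponent = rng.uniform(lo, hi)
    # Clamp to guard against floating-point round-off at the edges.
    return min(max(base ** exponent, lb), ub)

rng = random.Random(0)  # fixed seed so the run is repeatable
lrs = [sample_pow(1e-4, 1e-2, base=10, rng=rng) for _ in range(5)]
print(lrs)
```

With this scheme, values between 1e-4 and 1e-3 are drawn about as often as values between 1e-3 and 1e-2, which is usually what one wants for a learning rate.
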
import numpy as np
from hebo.optimizers.hebo import HEBO
from hebo.optimizers.bo import BO  # basic BO with LCB acquisition
hebo_seq = HEBO(space, model_name = 'gpy', rand_sample = 4)
for i in range(64):
    rec_x = hebo_seq.suggest(n_suggestions=1)  # n_suggestions sets how many points are suggested per round for parallel evaluation
    hebo_seq.observe(rec_x, obj(rec_x))  # obj is the objective function (user-defined, not shown here)
    if i % 4 == 0:
        print('Iter %d, best_y = %.2f' % (i, hebo_seq.y.min()))

conv_hebo_seq = np.minimum.accumulate(hebo_seq.y)  # running best over all observations, used for plotting
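
For readers unfamiliar with np.minimum.accumulate: it turns the sequence of observed objective values into a "best so far" curve, which is what gets plotted as the convergence curve. A minimal plain-Python sketch of the same computation follows; the helper name running_best and the sample values are made up for illustration.

```python
from itertools import accumulate

def running_best(ys):
    """Return the best (lowest) value seen so far after each evaluation."""
    return list(accumulate(ys, min))

# Hypothetical objective values, in the order they were observed.
ys = [3.2, 2.5, 2.9, 1.1, 1.7]
print(running_best(ys))  # [3.2, 2.5, 2.5, 1.1, 1.1]
```
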
from sklearn.datasets import load_boston
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error
from hebo.sklearn_tuner import sklearn_tuner

space_cfg = [
        {'name' : 'max_depth',        'type' : 'int', 'lb' : 1, 'ub' : 20},
        {'name' : 'min_samples_leaf', 'type' : 'num', 'lb' : 1e-4, 'ub' : 0.5},
        {'name' : 'max_features',     'type' : 'cat', 'categories' : ['auto', 'sqrt', 'log2']},
        {'name' : 'bootstrap',        'type' : 'bool'},
        {'name' : 'min_impurity_decrease', 'type' : 'pow', 'lb' : 1e-4, 'ub' : 1.0},
        ]
X, y   = load_boston(return_X_y = True)  # note: load_boston was removed in scikit-learn 1.2, so this line needs an older scikit-learn
result = sklearn_tuner(RandomForestRegressor, space_cfg, X, y, metric = r2_score, max_iter = 16)

print(result)  # prints e.g. {'max_depth': 15, 'min_samples_leaf': 0.00011814573477638075, 'max_features': 'log2', 'bootstrap': False, 'min_impurity_decrease': 0.00010743041070558209}
from pymoo.factory import get_problem
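problem = get_problem("zdt1", n_var = 5)  # multi-objective benchmark function from pymoo

dim = problem.n_var  # number of decision variables
num_obj = problem.n_obj  # number of objectives
num_constr = problem.n_constr  # number of constraints

from hebo.optimizers.general import GeneralBO  # a lot of other configuration precedes this point; see the official HEBO docs. Only the core code is excerpted below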
opt = GeneralBO(space, num_obj, num_constr, model_conf = conf)
for i in range(50):
    rec = opt.suggest(n_suggestions=4)
    opt.observe(rec, obj(rec))

# Plotting the results shows that the Pareto front found by HEBO is better than the one found by random search
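
To compare Pareto fronts, one first has to extract the non-dominated points from the observed objective values. The sketch below does this in plain Python for minimization problems. The function names and sample points are hypothetical, and this is a generic filter, not code from the HEBO repository.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical two-objective results (both objectives are minimized).
observed = [(0.1, 0.9), (0.4, 0.4), (0.5, 0.5), (0.9, 0.1), (0.8, 0.8)]
print(pareto_front(observed))  # [(0.1, 0.9), (0.4, 0.4), (0.9, 0.1)]
```

The filter is quadratic in the number of points, which is fine for the few hundred evaluations of a typical BO run.
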
from hebo.benchmarks.synthetic_benchmarks import BraninDummy
from hebo.optimizers.hebo_embedding import HEBO_Embedding

prob  = BraninDummy(1000) # 1000-D Branin function where only the first two dimensions are active
opt   = HEBO_Embedding(prob.space, rand_sample = 10, eff_dim = 2, scale = 1)
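
The eff_dim = 2 argument suggests that the optimizer searches a 2-dimensional subspace of the 1000-dimensional problem. I have not checked how HEBO_Embedding builds its projection, so the following is only a generic random linear embedding sketch under that assumption. All names and the [-1, 1] box are hypothetical.

```python
import random

def make_projection(high_dim, eff_dim, rng):
    """Random Gaussian matrix that maps an eff_dim vector to a high_dim vector."""
    return [[rng.gauss(0.0, 1.0) for _ in range(eff_dim)] for _ in range(high_dim)]

def embed(z, proj, lb=-1.0, ub=1.0):
    """Project the low-dimensional point z up, then clip it into the box [lb, ub]."""
    x = [sum(a * zi for a, zi in zip(row, z)) for row in proj]
    return [min(max(v, lb), ub) for v in x]

rng = random.Random(0)  # fixed seed so the projection is repeatable
proj = make_projection(high_dim=1000, eff_dim=2, rng=rng)
x = embed([0.3, -0.7], proj)  # the optimizer would only ever propose the 2-D point
print(len(x), min(x), max(x))
```

The optimizer then models the objective as a function of the 2-D point only, which keeps the surrogate model small even though each evaluation happens in 1000 dimensions.
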
from hebo.optimizers.evolution import Evolution
from hebo.benchmarks.synthetic_benchmarks import Ackley

prob = Ackley(dim = 2)
opt = Evolution(prob.space, num_obj = 1, num_constr = 0, algo = 'de', verbose = True)
n_eval = 0
for i in range(30):
    rec     = opt.suggest()
    obs     = prob(rec)
    n_eval += rec.shape[0]
    opt.observe(rec, obs)
    print(f'After iter {i+1}, evaluated {n_eval}, best_y is {opt.best_y.squeeze()}')
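
For intuition about what algo = 'de' refers to, here is a textbook differential evolution loop (DE/rand/1/bin) on a 2-D Ackley function in plain Python. It is a sketch for illustration, not what Evolution runs internally. The bounds, population size, and the F and CR values are arbitrary choices of mine.

```python
import math
import random

def ackley(x):
    """Ackley test function; its global minimum is 0 at the origin."""
    n = len(x)
    sq = sum(v * v for v in x) / n
    cs = sum(math.cos(2 * math.pi * v) for v in x) / n
    return -20 * math.exp(-0.2 * math.sqrt(sq)) - math.exp(cs) + 20 + math.e

def differential_evolution(f, dim, lb, ub, pop_size=20, iters=30, F=0.5, CR=0.9, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(pop_size)]
    fit = [f(p) for p in pop]
    for _ in range(iters):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    v = min(max(v, lb), ub)  # keep the trial inside the box
                else:
                    v = pop[i][j]
                trial.append(v)
            f_trial = f(trial)
            if f_trial <= fit[i]:  # greedy selection: never accept a worse point
                pop[i], fit[i] = trial, f_trial
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

best_x, best_y = differential_evolution(ackley, dim=2, lb=-5.0, ub=5.0)
print(best_x, best_y)
```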

Reproducing the HEBO results

Step 1: Set up the environment

pip install HEBO
pip install pymoo==0.4.1  # newer pymoo (>=0.5.0) cannot run hebo
pip install bayesmark[optimizers,notebooks] # installs all built-in optimizers as well as the notebook environment

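Step 2: Download HEBO and the evaluation scripts

git clone https://github.com/rdturnermtl/bbo_challenge_starter_kit.git # the official one-command evaluation scripts, which mainly rely on the bayesmark library as the benchmark
git clone https://github.com/huawei-noah/HEBO.git  # the HEBO code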

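Step 3: Test the evaluation script by running the pysot example optimizer for 3 repetitions. The models being tuned are SVM and DT (decision tree), and the datasets are boston and wine. The script automatically handles initialization, evaluation, and result analysis.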

cd bbo_challenge_starter_kit
bash ./run_local.sh ./example_submissions/pysot 3  # fails: the implementation in pysot/optimizer.py is broken and imports a class that does not exist
bash ./run_local.sh ./example_submissions/random-search 3  # switching to another optimizer works

(Screenshots of the output: all evaluation metrics, and the summary statistics.)

Step 4: Re-run Huawei's script. To save time, evaluate just 1 repetition.

bash ./run_local.sh ../HEBO/HEBO/archived_submissions/hebo 1

This fails with ModuleNotFoundError: No module named 'pymoo.algorithms.so_genetic_algorithm'. The cause is that pymoo has been upgraded; running hebo requires the older pymoo==0.4.1.
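
To catch this before starting a long benchmark run, one can check the installed pymoo version up front. The sketch below uses only the standard library. The helper names are hypothetical, and the 0.5.0 cutoff is taken from the install note in Step 1, not from any official compatibility statement.

```python
from importlib import metadata

def parse_version(text):
    """Turn '0.4.1' into (0, 4, 1); non-digit characters in each part are ignored."""
    parts = []
    for piece in text.split('.'):
        digits = ''.join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)

def pymoo_is_compatible(version_text):
    """True for pymoo versions older than 0.5.0 (assumed cutoff, see lead-in)."""
    return parse_version(version_text) < (0, 5, 0)

try:
    installed = metadata.version('pymoo')
    print('pymoo', installed, 'works with the archived hebo submission:', pymoo_is_compatible(installed))
except metadata.PackageNotFoundError:
    print('pymoo is not installed')
```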

