2017-11-23

catch構文を使ったときのパフォーマンスはどうなるか

node.js v8

メモ

node8.x系においてasync/awaitにといてtry/catchを使う場合のパフォーマンスに関する議論 stackoverflow.com

ざっくりいうと

V8の最新版においては、optimizer compilerであるturbofanによってtry catchの処理の最適化がされているため、V8(5.3以下 ) に比べて try/catchのパフォーマンスが向上している

V8 5.3時点(node.jsでいうと7.x系)でのtry catch周りの処理

https://github.com/v8/v8/blob/5.3-lkgr/src/ast/ast-numbering.cc#L306-L319
DisableCrankshaftによって、Crankshaftによる最適化が無効にされていることがわかる

void AstNumberingVisitor::VisitTryCatchStatement(TryCatchStatement* node) {
  IncrementNodeCount();
  DisableCrankshaft(kTryCatchStatement);
  Visit(node->try_block());
  Visit(node->catch_block());
}


void AstNumberingVisitor::VisitTryFinallyStatement(TryFinallyStatement* node) {
  IncrementNodeCount();
  DisableCrankshaft(kTryFinallyStatement);
  Visit(node->try_block());
  Visit(node->finally_block());
}

V8 2017/11/23時点でのtry catch周りの処理

最適化の無効処理は消されていて、最適化が走っていると思われる（実際にどう最適化されているかまでは追えていない）
https://github.com/v8/v8/blob/master/src/ast/ast-numbering.cc#L193-L202

void AstNumberingVisitor::VisitTryCatchStatement(TryCatchStatement* node) {
  DCHECK(node->scope() == nullptr || !node->scope()->HasBeenRemoved());
  Visit(node->try_block());
  Visit(node->catch_block());
}

void AstNumberingVisitor::VisitTryFinallyStatement(TryFinallyStatement* node) {
  Visit(node->try_block());
  Visit(node->finally_block());
}

2017-10-21

C言語 crypt()を使ったソースコンパイル時に undefined reference to `crypt' が出たときの対処法

c linux

概要

Linuxプログラミングインターフェースの8.5章「パスワードの暗号化とユーザ認証」のサンプルコードをコンパイルしたときにエラーがでて躓いたので対処法をメモする

Linuxプログラミングインタフェース

作者:Michael Kerrisk
オライリージャパン

Amazon

前提

まず最初に、今回コンパイルをした環境は以下のとおりです。

$ cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core)

$ gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)

cryptについて

GNU Cライブラリの一つであるcryptを使って、ユーザから入力した平文パスワードを暗号化できる http://www.gnu.org/software/libc/manual/html_node/crypt.htmlwww.gnu.org

char * crypt (const char *key, const char *salt)

暗号化する際には、変換対象の平文と、ソルトを指定する

Linuxプログラミングインターフェースの8.5章「パスワードの暗号化とユーザ認証」に示されているサンプルコードは以下になります。

#define _BSD_SOURCE // <unistd.h>内のgetpass()の宣言を有効にする

#define _XOPEN_SOURCE // <unistd.h>内のcrypt()の宣言を有効にする

#include <unistd.h>
#include <limits.h>
#include <pwd.h>
#include <shadow.h>
#include "../tlpi_hdr.h"

int main(int argc, char *argv[])
{
  char *username, *password, *encrypted, *p;

  struct passwd *pwd;
  struct spwd *spwd;
  Boolean authOk;
  size_t len;
  long lnmax;

  lnmax = sysconf(_SC_LOGIN_NAME_MAX);
  if (lnmax == -1) {
    // 最小ログイン名長が不明な場合 推測
    lnmax = 256;
  }

  username = malloc(lnmax);
  if (username == NULL) {
    errExit("malloc");
  }

  printf("Username: ");
  fflush(stdout);
  if (fgets(username, lnmax, stdin) == NULL) {
    // EOFならば終了
    exit(EXIT_FAILURE);
  }

  len = strlen(username);
  if (username[len - 1] == '\n') {
    // 行末を削除
    username[len - 1] = '\0';
  }

  pwd = getpwnam(username);
  if (pwd == NULL) {
    fatal("Couldn't get password record");
  }

  spwd = getspnam(username);
  if (spwd == NULL && errno == EACCES) {
    fatal("No permission to read shadow password file");
  }

  if (spwd != NULL) {
    // shadowパスワードが存在すれば shadowパスワードを利用する
    pwd->pw_passwd = spwd->sp_pwdp;
  }
  password = getpass("Password: ");

  // パスワードを暗号化し、平文は即座に破棄する
  encrypted = crypt(password, pwd->pw_passwd);
  for (p = password; *p != '\0';) {
    *p++ = '\0';
  }

  if (encrypted == NULL) {
    errExit("crypt");
  }

  // 入力されたパスワードを暗号化したものと、shadowパスワードが一致するかチェック
  authOk = strcmp(encrypted, pwd->pw_passwd) == 0;

  if (!authOk) {
    printf("Incorrect password\n");
    exit(EXIT_FAILURE);
  }

  printf("Successfully authenticated: UID=%ld\n", (long) pwd->pw_uid);

  // 認証成功 ここで処理を行う
  printf("Now do authenticated work");

  exit(EXIT_SUCCESS);
}

コンパイル時のエラー

この環境で、コンパイルする際に以下のエラーが出て、コンパイルできない状態になっていた。

check_password.c:(.text+0x162): undefined reference to `crypt'

原因

OSやOSバージョン、gccのバージョンにより変わると思いますが、コンパイル時に -lcryptオプションをつかってcryptをlinkさせる必要がありますが、今回コンパイル時に-lcryptオプションを使用していませんでした。
-lcryptをつけるとコンパイルができました

$ gcc users_groups/check_password.c error_functions.c -lcrypt

参考

stackoverflow.com

2017-09-21

facebook/flow Comment Typesによるpure javascriptでの型チェック

javascript flow

概要

javascriptの静的型付けのチェックを行うツールであるflowというツールを導入し、業務で開発しているプロダクトの品質向上を目指そうと考えています。
既にpure javascriptで書かれた環境があるので、このコードを書き直さずにflowを導入したいと考えており、導入方法について調べた内容をまとめます。
github.com

flowを使った従来の型チェック

一般的に、flowを使ってjavascriptコードの型チェックを行うためには、flow独自のsyntaxでコードを記述し、実環境へのデプロイは、babelを使ってpure javascriptのコードへトランスパイルするのが一般的です。

// flow独自のsyntaxで記述
// @flow
function square(n: number): number {
  return n * n;
}

square("2"); // Error!

既に本番運用されているプロダクトへのflowの導入の辛さ

しかし、既にpure javascriptで記述して運用されている実環境において、flowを導入するために、イチからflow syntaxで書き直すのは現実的ではありません。
デプロイ処理にbabelを使った pure javascriptへのトランスパイルを挟む必要があるし、テスト周りの処理も見直さなければなりません。
そもそもトランスパイルを挟むなら、型指定のあるtypescriptにまるっとコードを置き換えたほうが良いかと思います。

型チェックのために、コードをflow用に書き直すことなく、pure javascriptでflowの型チェックの利点を活かす事ができれば、既に運用されているnode.jsプロダクトへ楽に導入できるし、プロダクトの品質向上につながります。

私と同じように考えている人は多いらしく、本家のissueや、stackoverflowにも幾つか言及がありました。 github.com

stackoverflow.com

Comment Typesを使った型チェック

前述した要望を見たすために、flowは公式にコメントベースでのsyntaxをサポートしています。
このsyntaxを使うことで、pure javascriptのコードを変えずに、クラスプロパティや、関数、メソッド引数の型チェックが可能になります。
babelによるトランスパイルを挟まなくても型チェックを導入できるので非常に便利です。
flow.org

Comment type include

このComment Typesを使う際には、従来のコメントと区別するために、コメントの先頭にダブルコロン「::」を付ける必要があります。

/*::
type Foo = {
  foo: number,
  bar: boolean,
  baz: string
};
*/

class MyClass {
  /*:: prop: string; */
}

上記のコードは、pure javascriptであり、従来通り実行が可能です。 flowによる型チェックの際に、以下のような形でインクルードされ、型チェックが行われます。

type Foo = {
  foo: number,
  bar: boolean,
  baz: string
};

class MyClass {
  prop: string;
}

ダブルコロンの他に「flow-include」と記述しても動作します。

/*flow-include
type Foo = {
  foo: number,
  bar: boolean,
  baz: string
};
*/

class MyClass {
  /*flow-include prop: string; */
}

Comment type annotation

前項のように毎回::やflow-includeを使って書く代わりに、コメントの先頭にシングルコロン「:」を使って楽に書くことができます。

例えば下記のように、関数の引数のあとに追加します。

function concat(a /*: string */, b /*: number*/) {
  return a + b;
}

// 引数の型がそれぞれ指定したものと異なっている
concat(1, "a")

flowによる型チェックを実行すると

Error: index.js:24
 24: concat(1, "a")
            ^ number. This type is incompatible with the expected param type of
 13: function concat(a /*: string */, b /*: number*/) {
                           ^^^^^^ string

Error: index.js:24
 24: concat(1, "a")
               ^^^ string. This type is incompatible with the expected param type of
 13: function concat(a /*: string */, b /*: number*/) {
                                            ^^^^^^ number

ちゃんと型チェックが行われ、エラーを出してくれます。

まとめ

flowによる型チェックは、引数のnullチェックなどちゃんとやってくれるようで非常に強力です。
これを実環境のコードやデプロイフローを変えるなく導入できれば、コードのさらなる品質向上が期待できます。

前述したように、flowはComment Typeで、コメントとして記述する方法もちゃんとサポートされているので、導入するハードルがかなり下がるので、pure javascriptで書いているプロジェクトは導入する余地がかなりあると思います。

2017-09-20

Redis Sentinel configファイルのまとめ

概要

Redis Sentinelを用いて、Redisクラスタのモニタリング、master slaveの各プロセスの監視と自動フェイルオーバーを行うことができます。今回はRedis Sentinelの起動時のconfigファイルについてメモがてらざっくりまとめます。

Redis Sentinel についてざっくりと

Redis Sentinelは、 Redisの運用において高可用性を提供し、Redisに関する障害に関して、人の手を介さずに対応することを可能にします。

さらに、Redisの運用に関する下記の機能を提供します。

Monitoring Redisのmaster, slaveが正常に動作しているか常に監視します。
Notification 監視対象のRedisノードに問題があった場合に、APIを通じて管理者に通知をしたり、他のプログラムと通信をします。
Automatic failover もしRedis masterノードが正常に動かない場合、Sentinelはフェイルオーバープロセスを起動し、slaveをmasterに昇格させます。複数slaveが存在している場合、masterに昇格しなかったslaveノードに対して、新たなmasterノードのデータをレプリケーションするように再設定します。
Configuration provider Sentinelは、Redisクライアントのためのサービスディスカバリを管理します。例えば、Redisクライアントは現在のRedisサーバのmasterノードを特定するために Sentinelへ接続します。
もしフェイルオーバーが発生した場合、Sentinelは昇格後の新たなノードへのアドレスをRedisクライアントに伝えます。

Redis Sentinelの設定ファイルについて

Redis Sentinelの起動時に読み込む設定ファイルに付いて解説します。デプロイする環境によって異なるかと思いますが、 sentinel.confというファイルに設定を記述し、起動時に読み込む形になります。

Redis公式のSentinel解説記事に書いてあるとおり、以下のように記述していきます。 Redis Sentinel Documentation – Redis

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel parallel-syncs mymaster 1

sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel parallel-syncs resque 5

ここでは、2つのRedisクラスタに対する設定を記述しています。

各ブロックの1行目の項目は、監視対象のmasterノードの設定を記述しています。

sentinel monitor

sentinel monitor <master-group-name> <ip> <port> <quorum>

監視するmasterノード毎に名前をつけて、設定ファイル内でノード毎にフェイルオーバーの設定などを記述します。

ip portは監視対象のノードのip portを指定します。
quorumは、masterノードが死んだことを判定したSentinelの数を設定します。 2を設定していた場合、2つのSentinelが、masterノードがダウンしたと判定した時、フェイルオーバープロセスを始動させます。

sentilnel down-after-milliseconds

sentinel down-after-milliseconds mymaster 60000

down-after-millisecondsは、各Sentinelがmasterノードに対して、どの間pingによる疎通確認が取れなかったらダウンと判定するかを指定します。

sentinel parallel-syncs

sentinel parallel-syncs <master-group-name> <number-of-syncs>

parallel-syncsは、フェイルオーバーが始動した際に、新たなmasterノードを向くように再設定させるslaveの数を設定します。
この数が小さいほど、フェイルオーバープロセスが完了する時間が長くなります。

まとめ

今回はメモがてら sentinel.confの主な設定項目についてまとめました。このあたりは設定をいじりながら動かしてみると理解が進むと思うのでまた別記事でまとめたいと思います。

2017-09-19

Go revelでのmongoDBとのコネクション、CRUD処理についてのメモ

mongoDB Go

概要

Go revelというwebフレームワークで個人のwebアプリを作っている最中にハマった箇所のメモ。

mgoというGoのmongoドライバを使って、mongoDBとのコネクションを保持させ、CRUD処理を1ファイルにまとめて他のcontroller層から呼び出して使いたかったのでその方法をまとめました。今回、mongoに関しては、特にレプリカセットを組まずに、単一のmongodに対してアクセスさせます。

アプリのファイル構成

revelを使ったアプリのファイル構成は以下のとおりです。ファイル構成はrevel new app コマンドで自動生成されたままで、特に変えていないです。

app
├── controllers
│   ├── app.go
│   └── session.go
├── init.go
├── models
│   └── user.go
├── routes
│   └── routes.go
├── tmp
│   └── main.go
└── views
    ├── App
    │   ├── Index.html
    │   ├── Test.html
    │   └── register.html
    ├── debug.html
    ├── errors
    │   ├── 404.html
    │   └── 500.html
    ├── flash.html
    ├── footer.html
    └── header.html

今回はapp/controllers/session.goに、mongoDBとのコネクションやCRUDのロジックをまとめます。

goでのmongoDBへのクライアントとしては、mgoというパッケージが一番主流なのでそれを使う。
https://godoc.org/gopkg.in/mgo.v2 mongoのドキュメントサーチのクエリ生成において、mgoにバンドルされているbsonパッケージが使われているので、こちらも利用する https://godoc.org/gopkg.in/mgo.v2/bson

session.goは、ざっと書いてみると以下のような感じになる。
mongodへ接続したときのsession情報を保持しており、毎回 insert search removeなど、mongodへアクセスする際に sessionを確認する。 sessionがない場合は新たにmgo.Dial(dialURL)を呼ぶことでコネクションを貼る。

基本だと思うが、mongodへのアクセスを行う際に毎回コネクションを貼るのは、明らかに良くないし、もともと推奨されていない。
ローカルとかで軽く勉強するときには問題ないが、コネクションを貼るコストが高いので実環境では絶対やってはいけない。

mongoDB側で許容されているコネクションプールの設定は、mongod側の設定で色々変えることができますが、本稿では省きます。

package controllers

import (
    "gopkg.in/mgo.v2"
    "gopkg.in/mgo.v2/bson"
)

var (
    session *mgo.Session
    err error
)

// dialURL to mongodb instance
const dialURL = "mongodb://192.168.33.10"

// Create new session.
// if session has already exists, reuse it.
func cloneSession(dialURL string) *mgo.Session {
    if session == nil {
        session, err = mgo.Dial(dialURL)
        if err != nil {
            panic(err)
        }
    }
    return session.Clone()
}

// Insert document to specified db.
func InsertEntity(dbName string,
    collection string,
    model interface{}) {

    // Get DB session
    session := cloneSession(dialURL)
    defer session.Close()

    err := session.DB(dbName).C(collection).Insert(model)
    if err != nil {
        panic(err)
    }
}

// Remove document from collection
func RemoveEntry(dbName string,
    collection string,
    model interface{}) {

    session := cloneSession(dialURL)
    defer session.Close()

    err := session.DB(dbName).C(collection).Remove(model)
    if err != nil {
        panic(err)
    }
}

func queryCollection(dbName string,
    collection string,
    query func(c *mgo.Collection) error) error {

    session := cloneSession(dialURL)
    defer session.Close()

    c := session.DB(dbName).C(collection)
    return query(c)
}

func findOne(userId string,
    dbName string,
    collection string,
    ) (results []interface{}, errorCode string) {
    query := bson.M{"userid": userId}
    return Search(query, 0, 1, dbName, collection)
}

func Search(q interface{},
    skip int,
    limit int,
    dbName string,
    collection string,
) (results []interface{}, errorCode string) {

    errorCode = ""
    query := func(c *mgo.Collection) error {
        fn := c.Find(q).Skip(skip).Limit(limit).All(&results)
        if limit < 0 {
            fn = c.Find(q).Skip(skip).All(&results)
        }
        return fn
    }

    search := func() error {
        return queryCollection(dbName, collection, query)
    }

    err := search()
    if err != nil {
        errorCode = "error"
    }
    return
}

まとめ

今回 mongoDBへの接続や、CRUD処理をすべてsession.goにまとめて、 revelアプリ内でのcontroller層からsession.goを呼び出す形にした。
session生成とCRUDはおそらく別ファイルに分けた方が良いかと思うので、改良の余地がある

2017-09-07

Vagrant VM内にたてたmongodにhost OSから接続できるようにする

vagrant mongo

概要

Vagrant VM内でmongodを立てて、host OSからアクセスできるようにする方法をまとめます。 docker containerで以前mongo用コンテナの建て方についてまとめたのですが、仕事でVagrantでVMを立て、その中でmongodを立てる機会があり、
今回はその時に行った手順をまとめます。

vagrant 1.8.1
MongoDB server version: 3.4.7

やりたいこと

Vagrant で立ち上げたVM内に、mongoDB 3.4系 (本稿では3.4.7 ）のmongodプロセスを立ち上げる。 VMのプライベートネットワーク内のipを192.168.33.10と設定し、host(MacOS)のポート27017を、VM内のmongodが使用しているポート27017にポートフォワーディングさせる。

hostのmongoシェルから mongo --host 192.168.33.10とコマンドを打つと、VM内のmongodにアクセスできるようにする

Vagrant VMを立ち上げる

下準備として、centOS用のboxを追加する作業がありますが、今回は主旨とずれるため説明しません。 centOSのboxは、下記のところから、6系か7系のものを選んで vagrant box add {url} とすればboxが追加されるかと思います A list of base boxes for Vagrant - Vagrantbox.es

Vagrantfileの記述

そして、Vagrantで起動するVMの設定ファイルであるVagrantfileに、プライベートネットワークのipや、ポートフォワーディング設定を記述していきます。今回はすでに用意したcentOSのboxを使用します。 box名はここではcentosとしています。

$ vagrant box list
centos (virtualbox, 0)

今回はVagrantfileの記述はほんとにシンプルで、最小限の設定にします。プライベートネットワークIPと、ポートフォワーディングの設定のみ行います。 hostの27017ポートをguest(VM)の27017ポートに紐付けます

Vagrant.configure(2) do |config|
  config.vm.box = "centos"

  # VMのプライベートネットワークのIPを設定する
  config.vm.network "private_network", ip: "192.168.33.10" 
  # ポートフォワーディングの設定をする
  config.vm.network "forwarded_port", host: 27017, guest: 27017
end

上記のconfig.vmのパラメータは色々あり、ここでVMの設定を行うことができます。
設定できるパラメータはドキュメントを参照ください。 www.vagrantup.com

今回つかう config.vm.networkの設定はこちらです www.vagrantup.com

vmを立ち上げる

Vagrantfileの記述ができたのでvmを立ち上げます。

Vagrantfileのある階層でvagrant upを実行してください

Vagrantfileの内容をもとにVMを立ち上げる
$vagrant up


Vagrant VMでポートフォワーディングされているポートを確認
$ vagrant port
The forwarded ports for the machine are listed below. Please note that
these values may differ from values configured in the Vagrantfile if the
provider supports automatic port collision detection and resolution.

    22 (guest) => 2222 (host)
 27017 (guest) => 27017 (host)

vagrant up して立ち上げ、 vagrant portで紐付けられたポートを確認できます。

これでVMが立ち上がりました。あとはvagrant ssh してVM インスタンス内で作業します。

centOS内でのmongo3.4系のインストールとmongod起動

今回はmongoのインストールは主旨と外れるのでここでは詳細な手順をスキップします。本家のドキュメントに書いてある通りにおこなってyum でインストールできるようになります。 Install MongoDB on Red Hat Enterprise or CentOS Linux — MongoDB Manual

mongodの起動設定の編集

mongo3.4系がインストールできたらmongodを起動する前に、VM外からmongodへアクセスさせるための設定を入れます。 centOSでmongoDBをインストールした際、デフォルトで/etc/mongod.conf にmongodの設定ファイルが作成されます。
ここで、mongodへの接続を待ち受けるipの設定を行う事ができます。

net.bindIp項目でその設定を行う必要があり、初期設定では下記の用になっています。

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1  # Listen to local interface only, comment to listen on all interfaces.

mongod起動時のportは 27017、接続を待ち受けるipは127.0.0.1つまりVM内部のみからとなっております。

bindIpのドキュメントを抜粋すると下記のとおりです。

The IP address that mongos or mongod binds to in order to listen for connections from applications. You may attach mongos or mongod to any interface. When attaching mongos or mongod to a publicly accessible interface, ensure that you have implemented proper authentication and firewall restrictions to protect the integrity of your database.

To bind to multiple IP addresses, enter a list of comma separated values. https://docs.mongodb.com/manual/reference/configuration-options/#net-options

今回は、VMに設定したプライベートネットワークのIP 192.168.33.10 に対して接続も受け付けるため、 bindIpにカンマ区切りで192.168.33.10を追記します。

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,192.168.33.10  # カンマ区切りで192.168.33.10追加

これでmongodの設定はできたので、mongodを起動します。

#mongod start
$sudo service mongod start
Redirecting to /bin/systemctl stop  mongod.service

# mongodのプロセスが立っていることを確認する
$ps aux | grep mongo
mongod    3451  7.6  8.1 972640 38172 ?        Sl   15:57   0:00 /usr/bin/mongod -f /etc/mongod.conf

これでmongodを起動できました。

vm内のファイアウォールの設定を変更する

現時点では、まだVM外部から27017ポートへの接続はできません。ファイアウォールの設定を変更し、VM内の27017ポートに対して、VM外からtcpでアクセスできるようにします。

firewall-cmdを使うのが非常に簡単なので、今回はfirewall-cmdを使った方法を説明します。

## port27017に対して、tcpを使った外部からの接続を許可する
sudo firewall-cmd --zone=public --add-port=27017/tcp --permanent

## ↑で設定を変更したのでfirewall-cmdをリロードする
sudo firewall-cmd --reload

これで27017ポートの外部公開設定が完了しました。

ホストOSからVM内のmongodに接続する

VM、mongodの設定が完了したのであとはホストOSのターミナルから接続するだけです。

$ mongo --host 192.168.33.10
MongoDB shell version v3.4.4
connecting to: mongodb://192.168.33.10:27017/
MongoDB server version: 3.4.7
Server has startup warnings: 
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] 
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] 
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] 
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-09-06T16:04:00.953+0000 I CONTROL  [initandlisten] 
> show dbs
admin  0.000GB
local  0.000GB

これで接続できない場合は、mongodの設定ファイルを確認し、mongod再起動したり、あとはfirewall-cmdでポートの公開設定がされているかを確認してください。 firewallまわりで手こずるかもしれませんが、一度理解するとあとは簡単に設定できるかと思います。

まとめ

今回はVagrant で立ち上げたcentOS VM内のmongodを、ホストOS(MacOS) から接続できるようにしました。コレができるとVagrantで簡単に検証用のmongo環境が作れて、mongo関連の検証・動作確認が捗るかと思います。

2017-09-06

redis 4系から追加されたUNLINKについて

redis

はじめに

2017年7月にリリースされたredis4.0から、特定のキーの値を非同期的に削除するUNLINKコマンドがでました。 UNLINKコマンドの既存のDELコマンドとの違いついてまとめます。

redis4.0系リリース

2017年7月にredis4.0系がリリースされました。主な機能に関しては、下記のqiitaの記事にまとめられています。 qiita.com

redis4.0.0のリリースノートは下記のリンクで見ることができます。 https://raw.githubusercontent.com/antirez/redis/4.0/00-RELEASENOTES

若干読みづらいですが、 UNLINKに関する記述を抜粋します。

Lazy freeing of keys. Redis is now able to delete keys in the background in a different thread without blocking the server. The new UNLINK command is the same as DEL but working in a non blocking way.

UNLINKコマンドが生まれた経緯シングルスレッドモデルにおける DELコマンドの問題点

redisはシングルスレッドモデルのため、既存のDELコマンドの場合だと複数のクライアントからクエリを実行された時、1つのキューに並んで順番に処理されます。
そのため、データサイズの大きいデータセットに対してDELを実行すると、計算量が大きいため、後続の処理が遅れて大きなパフォーマンス低下の要因となってしまいます。

今回4.0系で登場したUNLINKは、ある値を削除する際にこの問題を解決するコマンドです。
UNLINKを使ったクエリが実行された際、文字通り、指定されたキーの値を即座に削除せず、非同期的にデータセット内のキーを格納したキースペースからUNLINKされ、実際のデータは後に削除されます。これにより、ノンブロッキングな処理が可能となります。

クエリの実行的には即座に完了し、データセットのサイズに関係なく定数時間 O(1)で処理が完了します。

同時に複数のキーを指定した場合はキーの数N 文の時間の計算量O(N)となります。

UNLINKの内部処理を見る

UNLINKコマンドの内部処理が気になったので、githubに上がっているredisのソースを読んでみました。関連のソースは、 src/lazyfree.cあたりにまとまっています。

UNLINKが実行されたときに、dbAsyncDeleteと呼ばれる関数が呼ばれます。 https://github.com/antirez/redis/blob/unstable/src/lazyfree.c#L49-L85 文字通り、データを非同期的に削除する処理を行っています。

/* Delete a key, value, and associated expiration entry if any, from the DB.
 * If there are enough allocations to free the value object may be put into
 * a lazy free list instead of being freed synchronously. The lazy free list
 * will be reclaimed in a different bio.c thread. */
#define LAZYFREE_THRESHOLD 64
int dbAsyncDelete(redisDb *db, robj *key) {
    /* Deleting an entry from the expires dict will not free the sds of
     * the key, because it is shared with the main dictionary. */
    if (dictSize(db->expires) > 0) dictDelete(db->expires,key->ptr);

    /* If the value is composed of a few allocations, to free in a lazy way
     * is actually just slower... So under a certain limit we just free
     * the object synchronously. */
    dictEntry *de = dictUnlink(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);
        size_t free_effort = lazyfreeGetFreeEffort(val);

        /* If releasing the object is too much work, let's put it into the
         * lazy free list. */
        if (free_effort > LAZYFREE_THRESHOLD) {
            atomicIncr(lazyfree_objects,1);
            bioCreateBackgroundJob(BIO_LAZY_FREE,val,NULL,NULL);
            dictSetVal(db->dict,de,NULL);
        }
    }

    /* Release the key-val pair, or just the key if we set the val
     * field to NULL in order to lazy free it later. */
    if (de) {
        dictFreeUnlinkedEntry(db->dict,de);
        if (server.cluster_enabled) slotToKeyDel(key);
        return 1;
    } else {
        return 0;
    }
}

さらに内部を辿っていくと、 src/dict.cの方で dictGenericDeleteと呼ばれるstatic関数を呼び出しています。
これは、データセット内の特定のキーへのリンクを解除しています。
https://github.com/antirez/redis/blob/06263485d46696ba76a653d2b594f3493103c001/src/dict.c#L361-L399

/* Search and remove an element. This is an helper function for
 * dictDelete() and dictUnlink(), please check the top comment
 * of those functions. */
static dictEntry *dictGenericDelete(dict *d, const void *key, int nofree) {
    unsigned int h, idx;
    dictEntry *he, *prevHe;
    int table;

    if (d->ht[0].used == 0 && d->ht[1].used == 0) return NULL;

    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);

    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        prevHe = NULL;
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                /* Unlink the element from the list */
                if (prevHe)
                    prevHe->next = he->next;
                else
                    d->ht[table].table[idx] = he->next;
                if (!nofree) {
                    dictFreeKey(d, he);
                    dictFreeVal(d, he);
                    zfree(he);
                }
                d->ht[table].used--;
                return he;
            }
            prevHe = he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) break;
    }
    return NULL; /* not found */
}

更に細かくは見れていないですが、データセットは内部的にlinked listのようなデータ構造でリストを保持していて、 dictGenericDelete内では、対象のキーへの参照を外すことで、UNLINKを行っているようです。
上記のコードをのwhileループ内の処理を抜粋してみると、おぼろげながらも理解ができるかと思います。

        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                /* Unlink the element from the list */
                if (prevHe)
                    prevHe->next = he->next;
                else
                    d->ht[table].table[idx] = he->next;
                if (!nofree) {
                    dictFreeKey(d, he);
                    dictFreeVal(d, he);
                    zfree(he);
                }
                d->ht[table].used--;
                return he;
            }
            prevHe = he;
            he = he->next;
        }

dictGenerictDeleteの第3引数nofree は2値の値で、1の場合は、直接データを消さずに参照だけ外し、 0の場合は直接データを消す、つまりメモリをfreeしているようです。

static dictEntry *dictGenericDelete(dict *d, const void *key, int nofree) {

実際には下記の用な感じで呼び出されています。

nofree == ０つまりメモリをfreeさせる場合

/* Remove an element, returning DICT_OK on success or DICT_ERR if the
 * element was not found. */
int dictDelete(dict *ht, const void *key) {
    return dictGenericDelete(ht,key,0) ? DICT_OK : DICT_ERR;
}

nofree == 1 つまり今回紹介したUNLINKのように参照を外す場合

/* Remove an element from the table, but without actually releasing
 * the key, value and dictionary entry. The dictionary entry is returned
 * if the element was found (and unlinked from the table), and the user
 * should later call `dictFreeUnlinkedEntry()` with it in order to release it.
 * Otherwise if the key is not found, NULL is returned.
 *
 * This function is useful when we want to remove something from the hash
 * table but want to use its value before actually deleting the entry.
 * Without this function the pattern would require two lookups:
 *
 *  entry = dictFind(...);
 *  // Do something with entry
 *  dictDelete(dictionary,entry);
 *
 * Thanks to this function it is possible to avoid this, and use
 * instead:
 *
 * entry = dictUnlink(dictionary,entry);
 * // Do something with entry
 * dictFreeUnlinkedEntry(entry); // <- This does not need to lookup again.
 */
dictEntry *dictUnlink(dict *ht, const void *key) {
    return dictGenericDelete(ht,key,1);
}

という感じになります。このあたりの実装は読んでいて非常に勉強になるので、redisを使っている方は読んでみることをおすすめします。

まとめ

redis4.0系から、データセット内の特定のキーを非同期で削除できるUNLINKコマンドが追加されました。 UNLINKは、DELとは異なり、実際にクエリが実行されたときにデータを削除せず、あくまでデータセット内からキーの参照を非同期で UNLINKしています。
これにより、シングルスレッドモデルを採用しているredisで、データ容量が大きい場合にDELによって実行速度が遅くなってしまう問題を解消できます。

実際にパフォーマンスについては今回検証できていないですが、redis4.0系以上を使う場合は UNLINKを使うことによって大きくパフォーマンス改善が図れるかと思います。