Starting blog again with Nikola

I haven't updated my site for more than two years. To make a long story short, I was in the UK for over a year, and my desktop machine, which holds all the sources of my site, stayed packed in luggage the whole time. Recently I moved to the Bay Area in the US, finally set up the old desktop again, and replaced the site generator, from Jekyll to Nikola.

Jekyll was great, but it required too many modifications for my needs. My site consists of some static pages plus blog entries, and I write all my posts in Org-mode, which Jekyll does not support. So I had to come up with my own pipeline: convert .org files to .html files with Emacs myself, insert a YAML front matter into each of them, and let Jekyll generate the final output.

Then I read a very interesting article, Blogging in org-mode with Nikola, which introduced Nikola to me, and I decided to experiment with it. After a few hours of experimentation, I was really surprised: it contains everything that I had imagined before!

  • Org-mode support using Emacs instead of a custom converter
  • RSS feeds by default
  • support for non-blog sites
  • Disqus integration
  • support for multi-lingual sites

I dare say that, for my needs at least, Nikola is better than vanilla Jekyll, and also better than Octopress.

Of course, I have already found some problems, but they are minor. Hopefully I'll explain them in another post.

Anyway, here I am, with the blog restarted. I hope to find many interesting topics soon.

Remove a node from a linked list in almost O(1)

struct node {
  int data;
  /* ... */
  struct node *next;
};
typedef struct node NODE;

Suppose that we have a singly linked list of NODE (or a circular list of NODE). To remove a node from the list, we'd like to implement the following function:

/* Remove NPTR from the list pointed by HEAD, and return it. */
NODE *list_extract(NODE **head, NODE *nptr);

The node that we need to extract from the list is pointed to by nptr. So we need to make sure that the next field of the node before nptr ends up pointing to the node after nptr. One way to find the previous node is to iterate over all nodes from head, looking for the node whose next field points to the node pointed to by nptr:

NODE *before;
for (before = head; before != NULL; before = before->next)
  if (before->next == nptr)
    break;
However, traversing the many nodes before nptr may be inefficient if the list is very long. There is another way to extract a node from a list without iterating over it to find the previous node.

head           nptr
[10] -> ... -> A:[12] -> B:[34] -> ...

Suppose we have a list like the above. head is the pointer to the head of the list, and nptr points to the node that needs to be extracted. Let the node pointed to by nptr be node A, and the next node of node A be node B. Node A holds an integer datum, 12, and node B holds 34.

Instead of finding the node before nptr, we'd like to swap the contents of node A and node B. To do that, I'll introduce a temporary node, T, and copy the contents of node A into it.

head           nptr
[10] -> ... -> A:[12] -> B:[34] -> ...

               T:[12] -----+

Note that we copied every field of node A into node T, so the next field of node T also points to the same node that node A points to. Then, we copy all fields of node B into node A:

head           nptr
[10] -> ... -> A:[34] -------------+
               T:[12] -> B:[34] -> ...

Then, we copy all fields of node T into node B:

head           nptr
[10] -> ... -> A:[34] -----------> ...

                            |    |
                            V    |
               T:[12] -> B:[12] -+

Since we copied the next field from node T into node B, node B has become a circular list of its own, which we don't intend. So set the next field of node B to NULL.

head           nptr
[10] -> ... -> A:[34] -----------> ...

               T:[12] -> B:[12] -|

Now the node that we wanted to extract has been copied into node B, which is accessible via the next field of node T. And the list no longer contains the contents of the node pointed to by nptr.

However, when the target nptr is the last node, we cannot use this trick, since there is no next node. In that case, we fall back to the well-known solution: iterate over all nodes to find the previous node.

The full source code of list_extract() is here:

NODE *
list_extract(NODE **head, NODE *nptr)
{
  NODE t;
  NODE *prev;

  if (*head == nptr) {
    if (nptr->next == NULL)
      *head = NULL;
    else {
      *head = nptr->next;
      nptr->next = NULL;
    }
    return nptr;
  }

  if (nptr->next) {
    /* Swap the contents of NPTR and NPTR->NEXT via the temporary
     * node T, then detach the old next node, which now holds the
     * original contents of NPTR. */
    memcpy(&t, nptr, sizeof(t));
    memcpy(nptr, nptr->next, sizeof(t));
    memcpy(t.next, &t, sizeof(t));
    t.next->next = NULL;
    return t.next;
  }

  /* NPTR is the last node; fall back to a linear scan. */
  for (prev = *head; prev != NULL; prev = prev->next)
    if (prev->next == nptr)
      break;
  if (!prev)
    return NULL;

  prev->next = nptr->next;
  nptr->next = NULL;

  return nptr;
}

And the time complexity will be:

\begin{eqnarray}
O(f(n)) & = & \left\{ \begin{array}{l l}
  O(1)     & \quad \text{ if \(nptr\) is the head} \\
  O(n - 1) & \quad \text{ if \(nptr\) is the last} \\
  O(1)     & \quad \text{ the rest } \\
\end{array} \right. \\
&=& \frac{1}{n}O(1) + \frac{1}{n}O(n-1) + \frac{n - 2}{n}O(1) \\
&=& \frac{n - 1}{n}O(1) + \frac{1}{n}O(n) \\
&\cong& O(1) \quad \text{ if \(n\) is relatively large }
\end{eqnarray}

Detect stale NFS mount

Check stale NFS

Here's a simple script to check whether a given directory (an NFS mount point) is stale.
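The original script is missing from this post; below is a minimal sketch of the same idea. A stale NFS handle typically makes stat(2) hang or fail, so we probe the directory under a timeout (this sketch assumes the coreutils timeout(1) command is available):

```shell
# is_stale DIR -- succeed (exit 0) if DIR looks stale or unreachable.
# A stale NFS mount makes stat(2) block, so give it a few seconds at most.
is_stale() {
  if timeout 3 stat "$1" >/dev/null 2>&1; then
    return 1    # responded in time: not stale
  else
    return 0    # timed out or failed: stale (or simply missing)
  fi
}

# usage: check a (hypothetical) mount point and report
if is_stale /mnt/nfs; then
  echo "/mnt/nfs looks stale"
fi
```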

SSH wrapper to connect Scalr-managed servers

It's annoying to connect to one of the virtual machines managed by Scalr, for various reasons. So I created a small shell script for easy connection: browse the servers with their IP addresses, then connect to them.


We created lots of Scalr-managed servers. Some of them have external IP addresses, but some do not. So we need to prepare a proxy machine to connect to those servers.

            Internet     Firewall     Cloud IaaS
  +---------+          +----+----+         +---------+
  | Client  |          | Proxy   |         | Target  |
  | Machine |          | Server  |         | Server  |
  |         |--------->|         |-------->|         |
  |         |          |         |         |         |
  |         |          |         |         |         |
  +---------+          +----+----+         +---------+

For example, suppose that we've prepared the proxy machine. Normally, you could connect to the destination server by issuing ssh twice, like this (which is annoying):

$ ssh
$ ssh

We are using Scalr's auto-scaling feature; this means the number of servers dynamically increases or decreases depending on the servers' load. In other words, at a given moment, we do not know exactly how many servers there are, nor what IP addresses they have.

So I created a small shell script named sssh (which stands for "Scalr-ssh") to find out the list of Scalr-managed servers and provide an easy ssh connection to one of them. With this, you can connect to a server instantly even if the server does not have an external IP address.

Download & Installation

First, you'll need to download the Scalr command line tool from Scalr Command Line Tools, and finish the scalr configure step.

$ sudo easy_install scalr
$ scalr configure -i d41d8cd98f00b204 \
    -a 3bEGXWzaoT92BMhOaqv13bEGXWzaoT92BMhOaqv13bEGXWzaoT92BMhOaqv1+0=

The above example will save your configuration in $HOME/.scalr/config.ini.

Then, grab the source from here, and save it to some place like /usr/local/bin. Then edit that file to update the proxy endpoint in SSH_PROXY_ENDPOINT to reflect your proxy endpoint. For example:
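The original example value is missing from this post; as a made-up illustration (both the user and the host below are placeholders):

```shell
# hypothetical endpoint -- replace with your real proxy user@host
SSH_PROXY_ENDPOINT="root@proxy.example.com"
```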


You can test whether the installation was successful via the following command. Note that the actual output may vary depending on your Scalr configuration/usage.

$ sssh env
  [149]  AWS-TEST-ENV
  [158]  US-EAST-9
  [161]  AP-KR-FOOBAR


If you have more than one Scalr environment, you'll need to list the environments using sssh env, then select one of them with the following command:

$ # select environment with id, 158
$ sssh set-env 158

Then, you'll need to select one of your farms. First, list the farms using sssh farms, then select one using sssh set-farm:

$ # list the farms
$ sssh farms
  [808]                          test-vpc (Stopped):   VPC farm for testing
  [809]      ec2-us-east-1-management-dev (Running):   None
  [814]           ec2-us-east-2-store-dev (Stopped):   None
  [953]                template-test-farm (Running):   None
$ # select one of the farm
$ sssh set-farm 809

Once the env/farm is selected, you can browse the list of servers with sssh list:

$ sssh list
  [ 0]  proxy-server
  [ 1]  zookeeper-3-centos6-cl
  [ 2]  zookeeper-3-centos6-cl
  [ 3]  zookeeper-3-centos6-cl
  [ 4]             None  sessionmgr-master-centos6-cl
  [ 5]  cs-sessionmgr-master-centos6-cl
  [ 6]  cs-sessionmgr-master-centos6-cl
  [ 7]  cs-frontend-centos6-cl
  [ 8]  cs-frontend-centos6-cl
  [ 9]             None  cs-datastore-centos6-cl

Each item contains four fields: the server index, the external IP, the internal IP, and the name of the server. In the above example, the 4th and 9th servers do not have external IPs. Remember that we configured SSH_PROXY_ENDPOINT to point to the 0-th server's endpoint, "root@". This server is used as the ssh proxy for this demonstration.

These servers belong to the farm with id 809, as we selected it using sssh set-farm 809. To connect to one of these servers, you need to download the PEM file of this farm and place it in $HOME/.ssh/809.pem. Finally, you can connect to one of the servers with the following command:

$ # connect to 1st server
$ sssh connect 1
Last login: Tue Feb 11 05:32:28 2014 from

Appliance:      centos-6-scalr appliance 1.0
Hostname:       ip-10-102-9-135
IP Address:

[root@ec2-53-84-9-110 ~]# _

You can even connect to a server without an external IP. For example:

$ # connect to 9th server
$ sssh connect 9
Last login: Wed Feb 12 09:04:02 2014 from

Appliance:      centos-6-scalr appliance 1.0
Hostname:       ip-10-102-9-210
IP Address:

[root@ip-10-101-3-210 ~]# _

Note the first message of the output: it says that the connection came from the internal IP address of the 0-th server, which is used as the ssh proxy.

Internally, when we specify a server without an external IP address, sssh connects to the server indirectly via the pre-configured ssh proxy server, using the ssh ProxyCommand option with netcat(1):

ssh -i "$pem" -o "ProxyCommand ssh -i $pem $SSH_PROXY_ENDPOINT nc %h %p" root@${iip}

If you're interested in this, read the nice article Configure openssh to tunnel through another server using SSH.

Easy way to resize/select windows or frames of Emacs

Window/Frame Selection

When you're using Emacs on a graphic display (that is, not in a console or a terminal), you'll have multiple Emacs frames, with multiple windows in them.

The default navigation interface of Emacs is quite surprising to non-Emacs users, since Emacs provides a relative selection mechanism.

For example, suppose you have the following Emacs frames and windows:

+-Frame A------------+   +-Frame B------------+   +-Frame C------------+
|                    |   |                    |   |                    |
| Window A           |   | Window C           |   | Window F           |
|                    |   |                    |   |                    |
|                    |   +--------------------+   |                    |
|                    |   |                    |   |                    |
+--------------------+   | Window D           |   |                    |
|                    |   |                    |   |                    |
| Window B           |   |                    |   |                    |
|                    |   +--------------------+   |                    |
|                    |   |                    |   |                    |
|                    |   | Window E           |   |                    |
+--------------------+   +--------------------+   +--------------------+

Emacs provides basic frame and window selection commands: other-frame and other-window. They select the next frame or window from a list, in cyclic order. The problem is that the default order may not reflect the coordinates of the frames/windows, especially after you have moved some of them.

Suppose that the currently selected frame is Frame B in the above figure. If Emacs kept the frame list as (Frame B, Frame A, Frame C), the next frame would be Frame A. Of course, by giving a negative prefix argument to other-frame, you can select the previous frame if you want.

What I want is to select a frame in the order of the frames' actual coordinates. That is, I want to give a command something like "select the frame whose X coordinate is the closest to the origin", or "select the frame whose X coordinate is the second closest to the origin".

So I made a simple function, wfu/frame-list-ordered, to return an ordered list of frames, and another, wfu/window-list-ordered, to return an ordered list of windows. Using these two functions, I made two commands: wfu/other-frame-or-window and wfu/other-window-or-frame. wfu/other-frame-or-window selects the next frame; if there is no other frame, it selects the next window. Similarly, wfu/other-window-or-frame selects the next window; if there is no other window, it selects the next frame.

Binding a key shortcut to an Emacs command is a tricky thing, since you don't know whether you'll easily memorize the new keybindings. Anyway, I found the following key sequences best for my personal use:

Keys             Description
C-<tab>          bound to wfu/other-window-or-frame; selects the next window of the current frame
C-N C-<tab>      selects the N-th window of the current frame
C-- C-N C-<tab>  selects the N-th frame
C-x o            bound to wfu/other-frame-or-window; selects the next frame

N is a number between 0 and 9. Note that the first window/frame is number 1, not 0.

Normally, I just stick to C-<tab> to select another window. When I want to select another frame, I feed it a negative number, whose absolute value denotes the N-th frame.


You may download wfutils.el from GitHub.

A script to create Redis Cluster using GNU screen(1)

An easy way to set up local redis replication using GNU screen(1)

One of my responsibilities is to write easy-to-use client libraries for Redis in C and Java. There are already well-known clients: the C library hiredis and the Java library jedis. At the time of development, neither supported our Redis replication cluster. I'll write later about the client libraries that support replication.

Anyway, during development, I needed to launch a simple redis cluster consisting of 1 master and 2 slaves. It is a tiresome job to set up the configuration of the master and slaves, and it is very easy to make a mistake.

Thus, I wrote a small shell script to launch a redis cluster locally. Internally, it uses GNU screen to create multiple shells to launch the required processes:

shell no.  description
0          redis-sentinel, listening on 26379
1          redis monitor to the sentinel
2          redis-cli to the sentinel
3          redis-server master, listening on 6379
4          redis monitor to the master
5          redis-cli to the master
6          redis-server slave#1, listening on 6380
7          redis monitor to the slave#1
8          redis-cli to the slave#1
9          redis-server slave#2, listening on 6381
10         redis monitor to the slave#2
11         redis-cli to the slave#2

Since the master and slaves are managed by the sentinel process, if you shut down the master, one of the slaves will be promoted to the new master.

This way, you can easily experiment and test your client software or libraries.

Here is the source code of the script:

How I measured Emacs init script performance


How did I load lots of init scripts?

If you're like me, you have a lot of elisp files for your Emacs configuration. Not counting unmaintained or junk code, I have almost 80 elisp files in my $HOME/.emacs.d/. This made Emacs launch slower and slower. Normally I don't turn off my computer, nor do I need to launch Emacs frequently, so it was not a big deal.

However, it sometimes took more than 7 seconds to launch Emacs on my idle Gentoo machine. Why did it take so much time to start? Which file was the time-consuming monster? Unfortunately, I couldn't answer.

So I tried to clean up my configuration. First, I removed all unused junk from my $HOME/.emacs.d/init.el. Even after that, my init file was too big to maintain easily, so I modularized it into separate code pieces. Now there is just a small init.el, and lots of code pieces reside in $HOME/.emacs.d/init/. For example, my customization for the dired package is stored in $HOME/.emacs.d/init/dired.el, my Korean language customization is stored in $HOME/.emacs.d/init/korean.el, and so on.

After that, I wrote a simple macro, which loads an init code piece from the specified directory, $HOME/.emacs.d/init/, if it was not loaded before, much like Emacs's require function.

(uinit/load "korean" enable-multibyte-characters)

You can read the above code as "If enable-multibyte-characters is not nil, load $HOME/.emacs.d/init/korean.el, unless it was loaded before."

In detail, uinit/load first checks whether the given code piece (korean.el above) was loaded before, by searching the list uinit/loaded-init-files. If it wasn't, it calls the load function to load the file. It also records the time spent loading the piece, and saves it into the report buffer, *uinit*.

After my init.el is loaded by Emacs, it repeatedly calls uinit/load to load the init code pieces, saving the load timings in the *uinit* buffer. At the end of the init.el file, I call uinit/summarize, which sorts the *uinit* buffer by consumed time and writes the total time spent loading my init code pieces. Here's a screenshot of the *uinit* buffer:



You can grab the source of the uinit package from here.

If your Emacs init script ($HOME/.emacs or $HOME/.emacs.d/init.el) modifies load-path, you must set load-path before calling any function in uinit, because uinit will try to byte-compile your init pieces on load; otherwise the byte compilation will fail. Especially, if you're using package, call (package-initialize) before loading uinit. For example:

(when (locate-library "package")
  (require 'package)
  (package-initialize))

(setq uinit/init-directory "~/.emacs.d/init")
(require 'uinit)

Then you're free to call uinit/load to load your init code pieces. (Your init code pieces should be placed in uinit/init-directory.) For example:

(uinit/load "darwin" (eq system-type 'darwin))
(uinit/load "X" (eq window-system 'x))

Finally, add the following code at the end of your init file:

(uinit/summarize)

A dirty macro that converts a scalar value to a string in binary representation

A handy way to convert a scalar value to a string in C. This is a non-portable GCC macro to produce a binary-represented string in C or C++.

Sometimes it is useful to have a way to print a scalar value in binary representation. Unfortunately, there is no such format specifier for printf(3). Luckily, glibc has a customization feature for printf(3)-like functions; if you're interested, read the code here. For example, you could write:

int i = 0xdeadbeef;
printf("%b\n", i);      /* print 11011110101011011011111011101111 */

If you're not using glibc, you could create a utility function to do this. Here are some candidate prototypes:

/* return the binary represented string of VALUE.  The width
 * of the string is CHAR_BIT * size.  The return value should
 * be free(3)ed after use. */
char *binstr(unsigned long long value, size_t size);

/* store the binary representation of VALUE in string pointed by BUF
 * with the size SIZE.  */
char *binstr(unsigned long long value, char *buf, size_t size);

I'm not satisfied with either. The first one, which returns dynamically allocated memory, looks heavy. Plus, the caller must supply the size of the original type, since the function always accepts the value as unsigned long long. The second one works only when the user provides the buffer in the second and third parameters. It does not look heavy, but someone may be unhappy with it, since the caller always needs to prepare the buffer.

I couldn't come up with a solution that satisfies everybody.

Since most of my work is done on Linux and MacOS, both of which provide GCC, I decided to drop portability. :)

#define binstr(x)       ({ char *p = alloca(sizeof(x) * CHAR_BIT + 1);  \
      char *q = p + sizeof(x) * CHAR_BIT;                               \
      *q-- = '\0';                                                      \
      int i;                                                            \
      typeof(x) tmp = (x);                                              \
      for (i = 0; i < sizeof(x) * CHAR_BIT; i++) {                      \
        *q-- = (tmp & 1) ? '1' : '0';                                   \
        tmp >>= 1;                                                      \
      }                                                                 \
      q + 1;                                                            \
    })
  • I'm using alloca(3) instead of malloc(3), so that the memory buffer is freed automatically. The caller no longer needs to call free(3).
  • To determine the right number of digits for the value's type, binstr is a macro, not a function.
  • I used a statement expression, ({…}) (a GCC extension), which allows me to declare local variables.
  • I used the typeof operator (a GCC extension), which lets me declare a local variable with the same type as the parameter.

How I kill a process with suspicious TCP CLOSE_WAIT

During our server-side application development, we encountered a lot of connections stuck in the CLOSE_WAIT state, making our server process run out of file descriptors. We are in the middle of developing a client application that runs on mobile Android devices, and a server-side application that runs on a cloud infrastructure.

I'm on the server-side team, and our team is focusing on server-side development. Our service has multiple front-end servers that expose the interface to the clients. The front-end servers act like load balancers: they dispatch client requests to one of several back-end workers. Since we're in the middle of development, our front-end and back-end servers have a couple of bugs in them. They sometimes made the servers crash, or even hang unpredictably.

Unfortunately, while we were trying to stabilize our server applications, the client team needed a prototype server cluster so they could develop their client application and test the interaction between the client and the front-end. Personally, I didn't want to provide our prototype servers to the client team until the server side was stabilized, but the client team was also in a hurry to meet their deadline, so we had no choice but to provide the still-unstable servers.

The biggest problem was that the server application left TCP connections in the CLOSE_WAIT state in unexpected network situations. So, after a couple of hours, the server process ran out of file descriptors, denying client requests. Since we use a sophisticated third-party network library, it would take some time to fix the problem.

So I needed some kind of watchdog which periodically checks whether a server process leaves CLOSE_WAIT connections, kills it, leaves some logs, and so on. Our server applications are managed by an init(1)-like launcher, so when a server process is terminated, the launcher automatically restarts it.


I was in a hurry to implement this watchdog, so I decided to write a small bash script, though I later switched to a Ruby script. Fortunately, all of our servers already have Ruby 1.8 installed.

At a given moment, the output of netstat(1) would look like this:

$ netstat -ntp
tcp  0  0  ESTABLISHED 16125/fe-server
tcp  0  0  ESTABLISHED 16118/fe-server          
tcp  0  0  ESTABLISHED 16120/fe-server          
tcp  0  0   CLOSE_WAIT  16063/fe-server   
tcp  0  0  ESTABLISHED 16063/fe-server
tcp  0  0  CLOSE_WAIT  -
$ _

netstat(1) from net-tools accepts the -n option, which indicates to use numerical addresses and ports; the -t option, to show only TCP connections; and the -p option, to show the related PID and program name.

It looks trivial to catch the PID of a process that has one or more CLOSE_WAIT connections. One thing to keep in mind is that netstat(1) sometimes displays "-" in the PID/PROGRAM field. I haven't dug into when exactly netstat(1) shows "-", but fortunately, fuser(1) can identify the owner PID of the connection.

$ fuser -n tcp 8010
35250/tcp:           16063
$ fuser -n tcp 8010 2>/dev/null

My first implementation simply counted the number of CLOSE_WAIT connections per process, and kill(1)ed $PID if the process had more than N CLOSE_WAIT connections.

The limitation of the first implementation is that it may kill a process whose CLOSE_WAIT connection is one it was just about to close(2).

So the second implementation works like this:

  1. Save the connection information (source address:port, destination address:port) per process in a set-like container.
  2. Wait for a certain amount of time.
  3. Save the connection information again, in another set-like container.
  4. Get the intersection of the two sets.
  5. If the number of elements in the intersection exceeds N, kill the process.

I couldn't come up with a good implementation of a set-like container in bash(1), so I re-implemented it from scratch with ruby(1).
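The two-snapshot idea can be sketched in Ruby like this (a simplified illustration, not the actual resreap source; the field layout assumes the netstat -ntp output shown above):

```ruby
require 'set'

# Parse `netstat -ntp` output into { pid => set of "local foreign" }
# pairs for connections in the CLOSE_WAIT state.  Lines whose
# PID/Program field is "-" are skipped (fuser(1) would be needed there).
def close_wait_snapshot(netstat_output)
  snap = {}
  netstat_output.each_line do |line|
    f = line.split
    next unless f[5] == "CLOSE_WAIT"
    pid = f[6].to_s[/\d+/]          # "16063/fe-server" -> "16063"
    next unless pid
    (snap[pid] ||= Set.new) << "#{f[3]} #{f[4]}"
  end
  snap
end

# Connections present in both snapshots (taken a few seconds apart)
# have stayed in CLOSE_WAIT for the whole interval.
def long_lived(old_snap, new_snap)
  result = {}
  old_snap.each do |pid, conns|
    common = conns & new_snap.fetch(pid, Set.new)
    result[pid] = common unless common.empty?
  end
  result
end
```

A process whose long-lived CLOSE_WAIT count exceeds the limit N would then be killed; the kill step is omitted in this sketch.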

After a few hours, another problem arose. Some server processes went comatose and did not respond to SIGTERM. We could only kill them with SIGKILL, so I modified the killing line like this:

kill $pid; sleep 2; kill -0 $pid && kill -9 $pid

This line first sends SIGTERM to $pid, then sleeps for 2 seconds; if it can still send a signal to the process (in other words, if the process is still alive), it sends SIGKILL to $pid.


I named the script resreap. The reason is that we call our server processes resources, so it stands for 'RESource REAPer'. The full source code is available from here.

With some extra features, resreap accepts the following options:

$ ./resreap --help
Kill processes that have enough CLOSE_WAIT socket(s)
Usage: resreap [OPTION...]

    -f PAT        Kill only processes whose command matches PAT
    -F HOST:PORT  Ignore if foreign endpoint matches to HOST:PORT
		  HOST should be in IPv4 numerical notation.

    -l N          If a process has more than or equal to N CLOSE_WAIT
		  socket(s), it will be killed with a signal
		  (default: 2)

    -i N          Set sleep interval between checks in seconds
		  (default: 2)

    -c CMD        Before sending a signal, execute CMD in the shell,
		  If this CMD returns non-zero, the process
		  will not receive any signal.

    -s SIG        Set the signal name (e.g. TERM) that will be send
		  to a process (default: TERM)
    -w SEC        Set the waiting time in seconds between the signal and
		  SIGKILL (default: 2)

    -d            dry run, no kill
    -D            debug mode

    -h            show this poor help messages and exit
    -v            show version information and exit

Note that if a process receives the signal, and the process is alive
for 2 second(s), the process will receive SIGKILL.

If you are going to use the "-f" option, I recommend trying the "-d -D"
options first.  If you get the pid of the culprit process, try to get
the command name by "ps -p PID -o command=", where PID is the pid of
that process.

You could send two signal(s) before sending SIGKILL using '-S' option.
This can be useful since some JVM print stacktrace on SIGQUIT.

$ _

For example, if you want to kill processes that have 2 or more CLOSE_WAIT connections, and you only care about java programs, you can do:

$ ./resreap -l 2 -f ^java

Plus, if you want to ignore CLOSE_WAIT connections on a specific foreign endpoint, you could do:

$ ./resreap -F

I really hope that we won't need to use this awful script for our servers. :)

Introduction to GNU obstack

An Introduction to Obstack

Obstack stands for "Object Stack"; it is a kind of small-memory allocator. Most C/C++ books suggest that when you need to allocate many small chunks of memory, you should write your own allocation routine rather than call malloc(3) or the new operator directly. Of course, a well-written custom allocator may perform better, but since obstack already ships with the GNU C library, using it can be the wiser choice. (Why reinvent the wheel?)

Obstack is included in the GNU C library. More precisely, it is part of the GNU libiberty library, which is bundled with the sources of the GNU C library, GCC, GDB, and so on. Since the only sources you need are obstack.h and obstack.c, it is also very easy to port to non-GNU systems.

Speaking from personal experience, obstack is highly portable. I have ported it to Windows, DOS (Turbo C 2.0), VxWorks, and pSOS, and almost no source modification was needed. It also paid off greatly in situations where the system's own memory allocator was very slow or had large overhead.

GNU obstack differs from malloc(3) in several ways, summarized below:

  • It pre-allocates memory in blocks, and serves each user request by carving a piece out of the current block. Hence the per-call overhead is much smaller than that of malloc(3).
  • As the name suggests, memory allocated from an obstack is stacked. Freeing an earlier allocation automatically frees every allocation made after it: if you allocated memory N times and free the first allocation, all N memory blocks are freed.
  • With the growing-object feature, you can build up an allocation incrementally: grow (or shrink) an object piece by piece, and fix its final size only when it is fully known.
  • Most obstack operations are provided as macros, so they are very fast.
  • One drawback: since obstack hands out pieces of internally allocated blocks, a careless developer can corrupt a block, and memory-debugging libraries such as efence are of little help in that case.

Trying Out Obstack

To use obstack, you first have to tell it which underlying memory allocator to use. You must define the macros obstack_chunk_alloc and obstack_chunk_free; the simplest definitions are:

#define obstack_chunk_alloc malloc
#define obstack_chunk_free  free

Of course, you also need to include the obstack header (the order of the macro definitions and the #include does not matter):

#include <obstack.h>

Once this setup is done, you need to create an obstack (you may create several if needed). The canonical way to create one is obstack_init():

struct obstack stack;
obstack_init(&stack);

obstack_init() allocates an initial memory block internally and finishes the basic setup. If obstack_init() fails, it calls the function pointer stored in the global variable obstack_alloc_failed_handler to report the error. Unless you have registered your own handler there, the default prints an error message and terminates the program.

There are several functions that allocate memory from a given obstack. The most basic one is obstack_alloc(), which works like malloc(3). For example, you can write a function similar to strdup() like this (the code below is quoted from the GNU C Library Manual):

struct obstack string_obstack;

char *
copystring(char *string)
{
  size_t len = strlen(string) + 1;
  char *s = (char *) obstack_alloc(&string_obstack, len);
  memcpy(s, string, len);
  return s;
}
Various other allocation functions are provided:

/* Allocate SIZE bytes from the obstack, and copy SIZE bytes into it
 * starting from ADDRESS.  Returns the address of the new object. */
void *obstack_copy(struct obstack *OBSTACK_PTR, void *ADDRESS, int SIZE);

/* Like obstack_copy(), but allocates SIZE + 1 bytes and appends a
 * terminating '\0'.  Handy for copying strings. */
void *obstack_copy0(struct obstack *OBSTACK_PTR, void *ADDRESS, int SIZE);

As mentioned briefly before, freeing memory in an obstack works a bit differently from malloc(3) … free(3). The function that frees memory is obstack_free():

void obstack_free(struct obstack *OBSTACK_PTR, void *OBJECT);

This function frees the given OBJECT in the obstack, together with every object allocated after it. If you pass NULL as the OBJECT parameter, every object in the obstack is freed, and the obstack becomes unusable. Therefore, to free everything while keeping the obstack usable for later, remember the address of the first object you allocated in it and pass that as the OBJECT parameter.

For example, suppose we have pointers A, B, and C, allocated 10, 100, and 1000 bytes, in that order. To free all of this memory at once, we call:

struct obstack my_stack;

void *A, *B, *C;
obstack_init(&my_stack);
A = obstack_alloc(&my_stack, 10);
B = obstack_alloc(&my_stack, 100);
C = obstack_alloc(&my_stack, 1000);
/* ... */
obstack_free(&my_stack, A);

As noted above, remember once more: freeing one object in an obstack also frees every object that was allocated in that obstack after it.

Growing Objects

Obstacks provide a way to allocate a memory block incrementally. For example, suppose we read one token from a file and store it in memory. Normally we cannot allocate memory for the token string until we have read all of it, because we do not yet know its size. With an obstack, however, we can grow the object little by little, and at the moment we finally know the size, fix the object at exactly the size grown so far. This is especially useful when writing functions that read text of unknown length from a file or the network.

The functions that handle a growing object behave a bit differently from the functions described above. While the object is still growing, its final address is not known; in other words, the object may move while it grows. Think of realloc(3)⁠ from the standard C library and this will be easy to understand.

Note that only one growing object can exist in an obstack at a time.

Many functions add memory to a growing object. Pick whichever fits your needs; it is fine to call them repeatedly or to mix them.

/* Grow the object by SIZE uninitialized bytes. */
void obstack_blank(struct obstack *OBSTACK_PTR, int SIZE);
/* Grow the object by SIZE bytes, copied from DATA. */
void obstack_grow(struct obstack *OBSTACK_PTR, void *DATA, int SIZE);
/* Like obstack_grow(), but grow by SIZE + 1 bytes and append
 * a terminating '\0'. */
void obstack_grow0(struct obstack *OBSTACK_PTR, void *DATA, int SIZE);
/* Grow the object by a single char C. */
void obstack_1grow(struct obstack *OBSTACK_PTR, char C);
/* Grow the object by the pointer value DATA. */
void obstack_ptr_grow(struct obstack *OBSTACK_PTR, void *DATA);
/* Grow the object by the int value DATA. */
void obstack_int_grow(struct obstack *OBSTACK_PTR, int DATA);

I will not write a separate example for each of these. One thing worth noting, however: obstack_blank()⁠ has a capability beyond what is described above. If, while growing an object with the functions above, you want to give back some of the memory, pass a negative value as the SIZE parameter of obstack_blank()⁠.

Later, once you know the final size for certain, you must fix the memory grown so far. This is the job of obstack_finish()⁠, and it is at this point that the actual address of the object is determined.

void *obstack_finish(struct obstack *OBSTACK_PTR);

If, while the object is still growing, you want to know its (temporary) address, use obstack_base()⁠. To find the total size grown so far, use obstack_object_size()⁠. If obstack_object_size()⁠ returns 0, no object is currently growing. Be careful, though: when no object is growing, obstack_base()⁠ does not return NULL⁠; it returns the address where the next object will be allocated. So to find out whether an object is currently growing, rely on obstack_object_size()⁠.

/* Return the (temporary) base address of the growing object. */
void *obstack_base(struct obstack *OBSTACK_PTR);
/* Return the current size of the growing object, or 0 if none. */
int obstack_object_size(struct obstack *OBSTACK_PTR);

Finally, let us close with a complete example that uses a growing object. We will write a function get_word()⁠ that reads text from standard input (stdin), reads one whitespace-delimited word, allocates it in an obstack, and returns it.

#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <obstack.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free  free

struct obstack stack_;
struct obstack *stack;

char *
get_word(void)
{
  int ch;

  /* Make sure no other growing object is in progress. */
  assert(obstack_object_size(stack) == 0);

  while ((ch = getchar()) != EOF)
    /* Skip leading whitespace. */
    if (!isspace((unsigned char)ch))
      break;

  while (ch != EOF) {
    if (isspace((unsigned char)ch))
      break;

    /* Grow the object one character at a time until
       whitespace or EOF. */
    obstack_1grow(stack, ch);
    ch = getchar();
  }

  if (obstack_object_size(stack) == 0)
    return NULL;

  /* Terminate the word, then fix the object. */
  obstack_1grow(stack, '\0');
  return obstack_finish(stack);
}

int
main(void)
{
  char *word;

  stack = &stack_;
  obstack_init(stack);

  while ((word = get_word()) != NULL)
    printf("word: %s\n", word);

  obstack_free(stack, NULL);
  return 0;
}

Memory Usage

Internally, an obstack allocates memory in block units (usually 4096 bytes) and hands out pieces of them as the user asks. So if you watch dynamic memory being allocated over time, you can expect the requests to form a staircase pattern. The graph below shows how memory is allocated while running the program above. (The red line is the amount of dynamically allocated memory.)



Obstacks provide various other features as well (not covered in this article); if you are interested, look them up in the GNU C Library manual.

A few additional things worth knowing about obstacks:

  • When growing an object little by little, there is an even faster way: the "extra fast growing object" functions. They are for when you are sure the growth will not exceed the memory block the obstack has allocated internally. You can check how much room remains in the current block with obstack_room()⁠.
  • Normally, calling obstack_init()⁠ makes the obstack start by allocating one large memory block. The size is system-dependent, but usually 4096 bytes. If you think this initial block is too big, you can (although the manual does not document it) use obstack_begin()⁠ instead of obstack_init()⁠ to create an obstack with a smaller initial size. (See the obstack source for details.)
  • When an obstack runs out of memory while allocating internally, it does not return an error; it calls an error-handling function. To change this function, set the global function pointer obstack_alloc_failed_handler⁠ appropriately. By setting this pointer suitably, you could even build wrappers so that every obstack-related function reports allocation failure by returning an error.