Sunday, November 01, 2009

Improve serialize() in R for Windows

The serialize() function in R is very slow on Windows. My observation is that the call to realloc is the bottleneck. A workaround is to replace realloc with Rm_realloc in $R_HOME/src/main/serialize.c. More specifically, I made the following changes:

1. In the function resize_buffer(...), replace
mb->buf = realloc(mb->buf, newsize);
with
mb->buf = Rm_realloc(mb->buf, newsize);

2. In the function free_mem_buffer(...), replace
free(buf);
with
Rm_free(buf);
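
The two replacements can be pictured with a small, self-contained sketch. It is not the code from serialize.c: the membuf struct, the membuf_write helper, the doubling growth policy, and the MEMBUF_REALLOC / MEMBUF_FREE macros are all hypothetical names and choices made for illustration. Only resize_buffer, free_mem_buffer, Rm_realloc and Rm_free are names taken from the steps above. The macros default to the standard allocator so the sketch compiles anywhere; in a patched Windows build of R they would presumably map to Rm_realloc and Rm_free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocator hooks (hypothetical macros). The standard functions are the
   default; a patched R build on Windows would use Rm_realloc / Rm_free. */
#ifndef MEMBUF_REALLOC
#define MEMBUF_REALLOC realloc
#define MEMBUF_FREE free
#endif

/* Simplified stand-in for a growable memory buffer (assumed layout). */
typedef struct {
    size_t size;         /* bytes allocated */
    size_t count;        /* bytes in use */
    unsigned char *buf;  /* storage, NULL while empty */
} membuf;

/* Grow the buffer so it can hold at least `needed` bytes.
   Returns 0 on success, -1 if the allocation fails. */
int resize_buffer(membuf *mb, size_t needed)
{
    assert(mb != NULL);
    size_t newsize = 2 * needed;  /* growth policy is an assumption */
    unsigned char *p = MEMBUF_REALLOC(mb->buf, newsize);
    if (p == NULL)
        return -1;                /* old buffer is still valid here */
    mb->buf = p;
    mb->size = newsize;
    return 0;
}

/* Append `len` bytes, growing the buffer through the hook when needed. */
int membuf_write(membuf *mb, const void *data, size_t len)
{
    if (len == 0)
        return 0;
    if (mb->count + len > mb->size &&
        resize_buffer(mb, mb->count + len) != 0)
        return -1;
    memcpy(mb->buf + mb->count, data, len);
    mb->count += len;
    return 0;
}

/* Release the storage with the allocator that created it. */
void free_mem_buffer(membuf *mb)
{
    MEMBUF_FREE(mb->buf);
    mb->buf = NULL;
    mb->size = 0;
    mb->count = 0;
}
```

The point of routing both calls through one pair of macros is that a block obtained from one allocator is never released by the other, which is why step 2 has to accompany step 1.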


The following are timings from my Windows 7 desktop.
Before the change:
> system.time(serialize(matrix(0, 1000, 1000), NULL))
user system elapsed
5.74 4.39 10.15
> system.time(serialize(matrix(0, 2000, 2000), NULL))
user system elapsed
85.40 74.80 161.62

After the change (elapsed time drops roughly 9-fold and 15-fold, respectively):
> system.time(serialize(matrix(0, 1000, 1000), NULL))
user system elapsed
0.78 0.30 1.10
> system.time(serialize(matrix(0, 2000, 2000), NULL))
user system elapsed
6.21 4.13 10.54
