Tarantool User Guide, version 1.5.1-219-g1d00e15


1. Preface
Tarantool: an overview
Conventions
Reporting bugs
2. Getting started
Downloading and installing a binary package
Downloading and building a source package
Starting Tarantool and making your first database
3. Data model and data persistence
Dynamic data model
Data persistence
4. Language reference
Data manipulation
Memcached protocol
Administrative console
Writing stored procedures in Lua
Package box
Package box.tuple
Package box.cjson
Package box.space
Package box.index
Package box.fiber
Package box.session
Package box.ipc — inter procedure communication
Package box.socket — TCP and UDP sockets
Package box.net.box — working with networked Tarantool peers
Packages box.cfg, box.info, box.slab and box.stat: server introspection
Limitations of stored procedures
Defining triggers in Lua
Triggers on connect and disconnect
5. Replication
Replication architecture
Setting up the master
Setting up a replica
Recovering from a degraded state
6. Server administration
Server signal handling
Utility tarantar
Utility tarancheck
Utility tarantool_deploy
System-specific administration notes
Debian GNU/Linux and Ubuntu
Fedora, RHEL, CentOS
FreeBSD
Mac OS X
7. Configuration reference
Command line options
The configuration file
8. Connectors
Packet example
C
Erlang
Java
node.js
Perl
PHP
Python
Ruby
A. Server process titles
B. List of error codes
C. Limitations
D. Client reference

Chapter 1. Preface

Tarantool: an overview

Tarantool is an in-memory NoSQL database management system. The code is available for free under the terms of the BSD license. Supported platforms are GNU/Linux, Mac OS and FreeBSD.

The server maintains all its data in random-access memory, and therefore has very low read latency. The server keeps persistent copies of the data in non-volatile storage, such as disk, when users request "snapshots". The server maintains a write-ahead log (WAL) to ensure consistency and crash safety of the persistent copies. The server performs inserts and updates atomically -- changes are not considered complete until the WAL is written. The logging subsystem supports group commit.

When the rate of data changes is high, the write-ahead log file (or files) can grow quickly. This uses up disk space, and increases the time necessary to restart the server (because the server must start with the last snapshot, and then replay the transactions that are in the log). The solution is to make snapshots frequently. Therefore the server ensures that snapshots are quick, resource-savvy, and non-blocking. To accomplish this, the server uses delayed garbage collection for data pages and uses a copy-on-write technique for index pages. This ensures that the snapshot process has a consistent read view.
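The restart sequence described above (load the last snapshot, then replay the log) can be illustrated with a toy model. This is a minimal Python sketch of the idea only, not Tarantool's implementation or file format. The names recover, snapshot and wal, the operation names, and the numbering of log entries are assumptions made for the illustration.

```python
def recover(snapshot, snapshot_lsn, wal):
    """Rebuild in-memory state from a snapshot plus the log entries written after it.

    snapshot     -- dict of key -> tuple, the state when the snapshot was taken
    snapshot_lsn -- sequence number of the last change included in the snapshot
    wal          -- list of (lsn, operation, key, value) entries in log order
    """
    state = dict(snapshot)  # start from the persistent copy
    replayed = 0
    for lsn, operation, key, value in wal:
        if lsn <= snapshot_lsn:
            continue  # this change is already reflected in the snapshot
        if operation == "replace":
            state[key] = value
        elif operation == "delete":
            state.pop(key, None)
        replayed += 1
    return state, replayed


# Hypothetical data: the snapshot covers the first two log entries.
snapshot = {1: (1,), 2: (2, "Music")}
wal = [
    (1, "replace", 1, (1,)),
    (2, "replace", 2, (2, "Music")),
    (3, "replace", 3, (3, "length", 93)),
    (4, "delete", 1, None),
]
state, replayed = recover(snapshot, snapshot_lsn=2, wal=wal)
print(state)
print(replayed)
```

In this model, the more recent the snapshot, the fewer log entries remain to be replayed, which is why frequent snapshots shorten the restart.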

Tarantool is lock-free. Instead of the operating system's concurrency primitives, such as mutexes, Tarantool uses cooperative multitasking to handle thousands of connections simultaneously. There is a fixed number of independent execution threads. The threads do not share state. Instead they exchange data using low-overhead message queues. While this approach limits the number of cores that the server will use, it removes competition for the memory bus and ensures peak scalability of memory access and network throughput. CPU utilization of a typical highly-loaded Tarantool server is under 10%.
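To give a feeling for cooperative multitasking, here is a minimal Python sketch in which several tasks share one thread and hand control back voluntarily. It is an illustration of the general technique, not of Tarantool's own fibers; the names connection and run are hypothetical.

```python
from collections import deque


def connection(name, requests):
    """A task that handles its requests one at a time, yielding after each."""
    for request in requests:
        yield "%s handled %s" % (name, request)  # hand control back to the scheduler


def run(tasks):
    """Round-robin scheduler: resume each task until it yields or finishes."""
    ready = deque(tasks)
    log = []
    while ready:
        task = ready.popleft()
        try:
            log.append(next(task))
        except StopIteration:
            continue  # the task is finished, so drop it
        ready.append(task)  # not finished, put it back in line
    return log


log = run([connection("A", ["r1", "r2"]), connection("B", ["r1"])])
for line in log:
    print(line)
```

Because a task can lose control only at a yield, no other task can observe its data half-changed, so the sketch needs no mutexes.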

Unlike most NoSQL DBMSs, Tarantool supports secondary index keys and multi-part index keys as well as primary keys. The possible index types are HASH, TREE, and BITSET.

As its key feature, Tarantool supports Lua stored procedures, which can access and modify data atomically. Users can create, modify and drop Lua procedures at runtime.

There is a role not only for Lua procedures, but also for Lua programs. During startup, Lua programs can be used to define triggers and background tasks, or interact with networked peers. Unlike popular application development frameworks based on a "reactor" pattern, networking in server-side Lua is sequential, yet very efficient, as it is built on top of the cooperative multitasking environment that Tarantool itself uses.

Extended with Lua, Tarantool typically replaces multiple components of an existing system. Complex multi-tier Web application architectures become simpler, and performance is good.

Tarantool supports asynchronous replication, locally or to a remote host. Replication does not cause blocking of writes to the master database. If the master becomes unavailable, a replica can assume the master role without requiring a restart.

Tarantool is in production today. Tarantool was created by and is actively used at Mail.Ru, one of the leading Russian web content providers. At Mail.Ru the software serves the hottest data, such as online users and their sessions, online application properties, mapping between users and their serving shards, and so on.

Outside Mail.Ru the software is used by a growing number of projects in online gaming, digital marketing, and social media industries. While product development is sponsored by Mail.Ru, the roadmap, the bugs database and the development process are fully open. The software incorporates patches from dozens of community contributors. The Tarantool community writes and maintains most of the drivers for programming languages.

Conventions

This manual is written in the DocBook 5 XML markup language and uses the standard DocBook XSL formatting conventions:

UNIX shell command input is prefixed with '$ ' and is in a fixed-width font:

$ tarantool_box --background

File names are also in a fixed-width font:

/path/to/var/dir

Text that represents user input is in boldface:

$ your input here

Within user input, replaceable items are in italics:

$ tarantool_box --option

Reporting bugs

Please report bugs in Tarantool at http://github.com/tarantool/tarantool/issues. You can contact developers directly on the #tarantool IRC channel or via the mailing list, the Tarantool Google group.

Chapter 2. Getting started

This chapter shows how to download, how to install, and how to start Tarantool for the first time.

For production, if possible, you should download a binary (executable) package. This will ensure that you have the same build of the same version that the developers have. That makes analysis easier if later you need to report a problem, and avoids subtle problems that might happen if you used different tools or different parameters when building from source. All programs in the binary tarballs are linked statically so there will be no external dependencies. The section about binaries is Downloading and installing a binary package.

For development, you will want to download a source package and make the binary by yourself using a C/C++ compiler and common tools. Although this is a bit harder, it gives more control. And the source packages include additional files, for example the Tarantool test suite. The section about source is Downloading and building a source package.

If the installation has already been done, then you should try it out. So we've provided some instructions that you can use to make a temporary sandbox. In a few minutes you can start the server, start the client, and type in some database-manipulation statements. The section about sandbox is Starting Tarantool and making your first database.

Downloading and installing a binary package

The repositories for the stable release are at tarantool.org/dist. The repositories for the master release are at tarantool.org/dist/master. Since this is the manual for the stable release, all instructions use tarantool.org/dist.

An automatic build system creates, tests and publishes packages for every push into the stable branch. Therefore if you looked at tarantool.org/dist you would see many files. Names of binary packages have the format tarantool-<version>-<OS>-<machine>.tar.gz. Here is one example:

tarantool-1.5.1-97-g8e8cd06-linux-x86_64.tar.gz    26-Sep-2013 15:55             3664777

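This name means: the Tarantool package, major version 1, minor version 5, patch number 1, git revision id g8e8cd06, built as a compressed tarball for Linux on 64-bit x86. (The number 97 between the patch number and the revision id follows the pattern of git describe output, where it counts the commits made since the tagged release.) The rest of the listing line shows that the file was pushed on September 26, 2013 and is about 3.6 MB in size.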
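If you need to pick such names apart in a script, a regular expression is enough. The following Python sketch is only an illustration: the pattern is inferred from the example names in this chapter rather than from a published naming rule, the group names (including build) are our own, and parse_package_name is a hypothetical helper.

```python
import re

# Assumed layout, inferred from the example names in this chapter:
# tarantool-<major>.<minor>.<patch>-<build>-g<hash>-<OS>-<machine>.tar.gz
PACKAGE_NAME = re.compile(
    r"^tarantool-(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"-(?P<build>\d+)-(?P<revision>g[0-9a-f]+)"
    r"-(?P<os>[a-z0-9]+)-(?P<machine>\w+)\.tar\.gz$"
)


def parse_package_name(file_name):
    """Split a binary package file name into its parts (all returned as strings)."""
    match = PACKAGE_NAME.match(file_name)
    if match is None:
        raise ValueError("unrecognized package name: %r" % file_name)
    return match.groupdict()


info = parse_package_name("tarantool-1.5.1-97-g8e8cd06-linux-x86_64.tar.gz")
for part in ("major", "minor", "patch", "build", "revision", "os", "machine"):
    print(part, "=", info[part])
```

The sketch handles binary package names only; a name of a different shape raises ValueError instead of being guessed at.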

To download and install the package that's appropriate for your environment, start a shell (terminal) and enter one of the following sets of command-line instructions.

# DEBIAN commands for Tarantool stable binary download:
# There is always an up-to-date Debian repository at http://tarantool.org/dist/debian
# The repository contains builds for Debian unstable "Sid", stable "Wheezy", forthcoming "Jessie", ...
# add the tarantool.org repository to your apt sources list:
# ($release is a shell variable which will contain the Debian version code name, e.g. "wheezy")
wget http://tarantool.org/dist/public.key
sudo apt-key add ./public.key
release=`lsb_release -c -s`
# append two lines to a list of source repositories
echo "deb http://tarantool.org/dist/debian/ $release main" | sudo tee -a /etc/apt/sources.list.d/tarantool.list
echo "deb-src http://tarantool.org/dist/debian/ $release main" | sudo tee -a /etc/apt/sources.list.d/tarantool.list
# install
sudo apt-get update
sudo apt-get install tarantool tarantool-client

# UBUNTU commands for Tarantool stable binary download:
# There is always an up-to-date Ubuntu repository at http://tarantool.org/dist/ubuntu
# The repository contains builds for Ubuntu 12.04 "precise", 12.10 "quantal", 13.04 "raring", 13.10 "saucy", ...
# add the tarantool.org repository to your apt sources list:
# ($release is an environment variable which will contain the Ubuntu version code e.g. "precise")
# (if you want the version that comes with Ubuntu, start with the lines that follow the '# install' comment)
cd ~
wget http://tarantool.org/dist/public.key
sudo apt-key add ./public.key
release=`lsb_release -c -s`
# append two lines to a list of source repositories
echo "deb http://tarantool.org/dist/ubuntu/ $release main" | sudo tee -a /etc/apt/sources.list.d/tarantool.list
echo "deb-src http://tarantool.org/dist/ubuntu/ $release main" | sudo tee -a /etc/apt/sources.list.d/tarantool.list
# install
sudo apt-get update
sudo apt-get install tarantool tarantool-client

# CENTOS commands for Tarantool stable binary download:
# These instructions are applicable for CentOS version 5 or 6, and RHEL version 5 or 6
# Pick the CentOS repository which fits your CentOS/RHEL version and your x86 platform:
# http://tarantool.org/dist/centos/5/os/i386 for version 5, x86-32
# http://tarantool.org/dist/centos/5/os/x86_64 for version 5, x86-64
# http://tarantool.org/dist/centos/6/os/i386 for version 6, x86-32
# http://tarantool.org/dist/centos/6/os/x86_64 for version 6, x86-64
# Add the following section to your yum repository list (/etc/yum.repos.d/tarantool.repo):
# (in the following instructions, $releasever i.e. CentOS release version must be either 5 or 6)
# (in the following instructions, $basearch i.e. base architecture must be either i386 or x86_64)
# [tarantool]
# name=CentOS-$releasever - Tarantool
# baseurl=http://tarantool.org/dist/centos/$releasever/os/$basearch/
# enabled=1
# gpgcheck=0
# For example, if you have CentOS version 6 and x86-64, you can
# add the new section thus:
echo "[tarantool]" | sudo tee /etc/yum.repos.d/tarantool.repo
echo "name=CentOS-6 - Tarantool"| sudo tee -a /etc/yum.repos.d/tarantool.repo
echo "baseurl=http://tarantool.org/dist/centos/6/os/x86_64/" | sudo tee -a /etc/yum.repos.d/tarantool.repo
echo "enabled=1" | sudo tee -a /etc/yum.repos.d/tarantool.repo
echo "gpgcheck=0" | sudo tee -a /etc/yum.repos.d/tarantool.repo

# GENTOO commands for Tarantool stable binary download:
# Tarantool is available from tarantool portage overlay. Use layman to add the overlay to your system:
layman -S
layman -a tarantool
emerge dev-db/tarantool -av

# ANY-LINUX commands for Tarantool stable binary download:
# If you have a GNU/Linux distribution which is not one of the above,
# or if you want to install on your own subdirectory without affecting /usr /etc etc.,
# start your browser and go to
# http://tarantool.org/download.html download page.
# Look for words similar to "Other Linux distributions". You will want the
# binary tarball (.tar.gz) file for your release architecture (32-bit or 64-bit).
# Click on either "32-bit" or "64-bit" depending on your release architecture.
# This will cause a download of the latest stable tarball.
# Suppose it is tarantool-1.5.1-133-g11edda1-linux-x86_64.tar.gz. Say:
tar zxvf tarantool-1.5.1-133-g11edda1-linux-x86_64.tar.gz
# You now have a directory named tarantool-1.5.1-133-g11edda1-linux-x86_64.
# Let's move it to ~/tarantool, which is an easier name to remember.
mv tarantool-1.5.1-133-g11edda1-linux-x86_64 ~/tarantool
# Within it there is a subdirectory /bin containing both server and client.

# FREEBSD commands for Tarantool stable binary download:
# With your browser go to the FreeBSD ports page
# http://www.freebsd.org/ports/index.html
# Enter the search term: tarantool
# Choose the package you want ...
# However, there is a known issue with the binary of Tarantool
# version 1.5, see https://github.com/tarantool/tarantool/issues/19.

# MAC OS X commands for Tarantool stable binary download:
# This is actually a homebrew recipe
# so it's not a true binary download, some source code is involved.
# First upgrade Clang (the C compiler) to version 3.2 or later using
# Command Line Tools for Xcode disk image version 4.6+ from Apple Developer web-site.
brew install --use-clang http://tarantool.org/dist/tarantool.rb
# or
brew install http://tarantool.org/dist/tarantool.rb

More advice for binary downloads is at http://tarantool.org/download.html.

Downloading and building a source package

For downloading Tarantool source and building it, the platforms can differ and the preferences can differ. But the steps are always the same. Here in the manual we'll explain what the steps are, then on the Internet you can look at some example scripts.

1. Get tools and libraries that will be necessary for building and testing. The absolutely necessary ones are:

  • A program for downloading source repositories. In this case the necessary program is git. Although tarantool.org/dist has source tarballs (the files whose names end in -src.tar.gz), the latest complete source downloads are on github.com, and downloads from github are made with git.

  • A C/C++ compiler. Ordinarily the compiler is GCC version 4.5 or later; on Mac OS X it should be Clang version 3.2 or later. There may be some benefit in rebuilding gcc to suit tarantool requirements.

  • A program for managing the build process. This is always CMake for GNU/Linux and FreeBSD.

Here are names of tools and libraries which may have to be installed in advance, using sudo apt-get install (for Ubuntu), sudo yum install (for CentOS), or the equivalent on other platforms. Different platforms may use slightly different names. Do not worry about the ones marked "optional, for build with -DENABLE_DOC" unless you intend to work on the documentation.

   binutils-dev or binutils-devel        # contains GNU bfd for printing stack traces
   gcc or clang                          # see above
   git                                   # see above
   uuid-dev                              # optional, for box_uuid_* functions
   cmake                                 # see above
   libreadline-dev                       # optional, for build with -DENABLE_CLIENT
   libncurses5-dev or ncurses-devel      # optional, for build with -DENABLE_CLIENT
   xsltproc                              # optional, for build with -DENABLE_DOC
   lynx                                  # optional, for build with -DENABLE_DOC
   jing                                  # optional, for build with -DENABLE_DOC
   libxml2-utils                         # optional, for build with -DENABLE_DOC
   docbook5-xml                          # optional, for build with -DENABLE_DOC
   docbook-xsl-ns                        # optional, for build with -DENABLE_DOC
   w3c-sgml-lib                          # optional, for build with -DENABLE_DOC
   libsaxon-java                         # optional, for build with -DENABLE_DOC
   libxml-commons-resolver1.1-java       # optional, for build with -DENABLE_DOC
   libxerces2-java                       # optional, for build with -DENABLE_DOC
   libxslthl-java                        # optional, for build with -DENABLE_DOC
   autoconf                              # optional, appears only in Mac OS scripts
   zlib1g or zlib                        # optional, appears only in Mac OS scripts

2. Pick a default directory. This can be anywhere. We'll assume that your default directory is ~, and therefore the tarantool download will go inside it, as ~/tarantool.

3. Use git to download from github.com.

cd ~
git clone -b stable https://github.com/tarantool/tarantool.git tarantool

The optional argument -b stable causes download from the stable branch instead of the master branch, and the optional last word on the line, tarantool, means download is to ~/tarantool.

4. Use git again so that third-party contributions will be seen as well. This step is only necessary once, the first time you do a download. There is an alternative -- say git clone --recursive in step 3 -- but we prefer this method because it works with older versions of git.

cd ~/tarantool
git submodule init
git submodule update
cd ../

5. Use CMake to initiate the build.

cd ~/tarantool
make clean         # unnecessary, added for good luck
rm CMakeCache.txt  # unnecessary, added for good luck
cmake .            # The command to initiate with build type=Debug, no client, no doc

The option for specifying build type is -DCMAKE_BUILD_TYPE=type where type = {None | Debug | Release | RelWithDebInfo | MinSizeRel} and a reasonable choice for production is -DCMAKE_BUILD_TYPE=RelWithDebInfo (Debug is used only by project maintainers and Release is used only when the highest performance is required). The option for asking to build client is -DENABLE_CLIENT={true|false} and a reasonable choice is -DENABLE_CLIENT=true. The option for asking to build documentation is -DENABLE_DOC={true|false} and the assumption is that only a minority will need to rebuild the documentation (such as what you're reading now), so details about documentation are in the developer manual, and the reasonable choice is -DENABLE_DOC=false or just don't use the -DENABLE_DOC clause at all.

6. Use make to complete the build.

make

It's possible to say make install too, but that's not generally done.

7. Set up python modules for running the test suite. This step is optional.

Say:

python --version

You should see that the python version is at least 2.6 and lower than 3.
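The same check can be made from inside the interpreter. This small Python sketch states the rule as code; is_supported is a hypothetical helper name, and the bounds are simply the ones given above.

```python
import sys


def is_supported(version):
    """True if a (major, minor, ...) version is at least 2.6 and below 3.0."""
    return (2, 6) <= tuple(version[:2]) < (3, 0)


# Report whether the interpreter running this script is in the expected range.
print(is_supported(sys.version_info))
```

Run with the same python command that will run the test suite, it prints True when the version is in range.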

On Ubuntu you can get modules from the repository:

sudo apt-get install python-daemon python-yaml python-argparse python-pexpect

On CentOS too you can get modules from the repository:

sudo yum install python26 python26-PyYAML python26-argparse

But in general it is best to set up the modules by getting a tarball and doing the setup with python setup.py, thus:

# python module for parsing YAML (pyYAML):
# (If wget fails, check the PyYAML web site to see what the current version is.)
cd ~
wget http://pyyaml.org/download/pyyaml/PyYAML-3.10.tar.gz
tar -xzf PyYAML-3.10.tar.gz
cd PyYAML-3.10
sudo python setup.py install
# python module for spawning child applications (pexpect):
# (If wget fails, check the  python-pexpect web site to see what the current version is.)
cd ~
wget http://pypi.python.org/packages/source/p/pexpect-u/pexpect-u-2.5.1.tar.gz
tar -xzvf pexpect-u-2.5.1.tar.gz
cd pexpect-u-2.5.1
sudo python setup.py install
# python module for assisting programs to turn themselves into daemons (daemon):
# (if wget fails, check the python-daemon web site to see what the current version is.)
cd ~
wget http://pypi.python.org/packages/source/d/daemon/daemon-1.0.tar.gz
tar -xzvf daemon-1.0.tar.gz
cd daemon-1.0
sudo python setup.py install

8. Run the test suite. This step is optional.

Tarantool's developers always run the test suite before they publish new versions. You should run the test suite too, if you make any changes in the code. Assuming you downloaded to ~/tarantool, the principal steps are:

mkdir ~/tarantool/bin                        # make a subdirectory named bin
ln -s /usr/bin/python ~/tarantool/bin/python # link python to bin (symbolic link)
cd ~/tarantool/test                          # go to the test subdirectory
PATH=~/tarantool/bin:$PATH ./run             # run tests using python

The output should contain reassuring reports, for example

======================================================================
TEST                                             RESULT
------------------------------------------------------------
box/admin.test                                  [ pass ]
box/admin_coredump.test                         [ pass ]
box/args.test                                   [ pass ]
box/cjson.test                                  [ pass ]
box/configuration.test                          [ pass ]
box/errinj.test                                 [ pass ]
box/fiber.test                                  [ pass ]
... etc.

There are more than 70 tests in the suite. To prevent later confusion, clean up what's in the bin subdirectory:

rm ~/tarantool/bin/python
rmdir ~/tarantool/bin

9. Make an rpm. This step is optional. It's only for people who want to redistribute Tarantool. Ordinarily it should be skipped. It will add a new subdirectory: ~/tarantool/RPM.

make rpm

This is the end of the list of steps to take for source downloads.

For your added convenience, github.com has README files with example scripts: README.CentOS for CentOS 5.8, README.FreeBSD for FreeBSD 8.3, README.MacOSX for Mac OS X Lion, README.md for generic GNU/Linux. These example scripts assume that the intent is to download from the master branch, build the server and the client (but not the documentation), and run tests after build.

Starting Tarantool and making your first database

Here is how to create a simple test database after installing.

1. Create a new directory. It's just for tests; you can delete it when the tests are over.

mkdir ~/tarantool_sandbox
cd ~/tarantool_sandbox
mkdir work_dir

2. Create a configuration file. The Tarantool server requires a configuration file with some definitions of ports and database objects. The server, by default, looks for its configuration file in the current working directory and in etc/. Enter the following commands which make a minimal configuration file that will be suitable for day one.

echo "slab_alloc_arena = 0.1" | tee tarantool.cfg
echo "pid_file = \"box.pid\"" | tee -a tarantool.cfg
echo "primary_port = 33013" | tee -a tarantool.cfg
echo "secondary_port = 33014" | tee -a tarantool.cfg
echo "admin_port = 33015" | tee -a tarantool.cfg
echo "rows_per_wal = 50000" | tee -a tarantool.cfg
echo "space[0].enabled = 1" | tee -a tarantool.cfg
echo "space[0].index[0].type = \"HASH\"" | tee -a tarantool.cfg
echo "space[0].index[0].unique = 1" | tee -a tarantool.cfg
echo "space[0].index[0].key_field[0].fieldno = 0" | tee -a tarantool.cfg
echo "space[0].index[0].key_field[0].type = \"NUM\"" | tee -a tarantool.cfg
echo "logger = \"tee --append tarantool.log\"" | tee -a tarantool.cfg
echo "work_dir = \"work_dir\"" | tee -a tarantool.cfg
(With some downloads a tarantool.cfg file like this is already available in a test subdirectory.)
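If you would rather generate the file in one step than type thirteen echo commands, a short script can produce the same text. The Python sketch below is one way to do it; render_config and SETTINGS are hypothetical names, and the only rule it assumes is the one visible in the commands above (string values are quoted, numeric values are not).

```python
def render_config(settings):
    """Render (name, value) pairs as 'name = value' lines, quoting string values."""
    lines = []
    for name, value in settings:
        if isinstance(value, str):
            value = '"%s"' % value  # strings are written inside double quotes
        lines.append("%s = %s" % (name, value))
    return "\n".join(lines) + "\n"


# The same settings as in the echo commands, in the same order.
SETTINGS = [
    ("slab_alloc_arena", 0.1),
    ("pid_file", "box.pid"),
    ("primary_port", 33013),
    ("secondary_port", 33014),
    ("admin_port", 33015),
    ("rows_per_wal", 50000),
    ("space[0].enabled", 1),
    ("space[0].index[0].type", "HASH"),
    ("space[0].index[0].unique", 1),
    ("space[0].index[0].key_field[0].fieldno", 0),
    ("space[0].index[0].key_field[0].type", "NUM"),
    ("logger", "tee --append tarantool.log"),
    ("work_dir", "work_dir"),
]

config_text = render_config(SETTINGS)
print(config_text)
```

The sketch only prints the text. To create the file, redirect the output to tarantool.cfg in the sandbox directory.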

Initialize the storage area. You only have to do this once.

/usr/bin/tarantool_box --init-storage            #if you downloaded a binary with apt-get or yum
~/tarantool/bin/tarantool_box --init-storage     #if you downloaded and untarred a binary tarball to ~/tarantool
~/tarantool/src/box/tarantool_box --init-storage #if you built from a source download

Start the server. The server name is tarantool_box.

/usr/bin/tarantool_box             #if you downloaded a binary with apt-get or yum
~/tarantool/bin/tarantool_box      #if you downloaded and untarred a binary tarball to ~/tarantool
~/tarantool/src/box/tarantool_box  #if you built from a source download

If all goes well, you will see the server displaying progress as it initializes, something like this:

2013-10-18 20:20:36.806 [16560] 1/sched C> version 1.5.1-141-ga794d35
2013-10-18 20:20:36.830 [16560] 1/sched I> Loading plugin: /usr/lib/tarantool/plugins/libmysql.so
2013-10-18 20:20:37.016 [16560] 1/sched I> Plugin 'mysql' was loaded, version: 1
2013-10-18 20:20:37.016 [16560] 1/sched I> Loading plugin: /usr/lib/tarantool/plugins/libpg.so
2013-10-18 20:20:37.044 [16560] 1/sched I> Plugin 'postgresql' was loaded, version: 1
2013-10-18 20:20:37.044 [16560] 1/sched I> space 0 successfully configured
2013-10-18 20:20:37.044 [16560] 1/sched I> recovery start
2013-10-18 20:20:37.060 [16560] 1/sched I> recover from `./00000000000000000001.snap'
2013-10-18 20:20:37.060 [16560] 1/sched I> snapshot recovered, confirmed lsn: 1
2013-10-18 20:20:37.070 [16560] 1/sched I> done `./00000000000000000002.xlog' confirmed_lsn: 2
2013-10-18 20:20:37.070 [16560] 1/sched I> WALs recovered, confirmed lsn: 2
2013-10-18 20:20:37.070 [16560] 1/sched I> building secondary indexes
2013-10-18 20:20:37.070 [16560] 1/sched I> bound to primary port 33013
2013-10-18 20:20:37.070 [16560] 1/sched I> I am primary
2013-10-18 20:20:37.070 [16560] 1/sched I> bound to secondary port 33014
2013-10-18 20:20:37.070 [16560] 1/sched I> bound to admin port 33015
2013-10-18 20:20:37.071 [16560] 1/sched C> log level 4
2013-10-18 20:20:37.071 [16560] 1/sched C> entering event loop

Now take the server down, with

Ctrl+C

Now start the server again. This time start it in the background.

/usr/bin/tarantool_box --background             #if you downloaded a binary with apt-get or yum
~/tarantool/bin/tarantool_box --background      #if you downloaded and untarred a binary tarball to ~/tarantool
~/tarantool/src/box/tarantool_box --background  #if you built from a source download

If all went well, there is now an instance of the Tarantool server running in the background. You can confirm that with the command:

ps -A | grep tarantool_box

or look at the log file:

less work_dir/tarantool.log
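Another quick check is to see whether the configured ports accept TCP connections. The following Python sketch does only that; it does not speak the Tarantool protocol. The function name port_is_open is hypothetical, and the port numbers are the ones from the sandbox configuration file.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False  # refused, timed out, or otherwise unreachable
    connection.close()
    return True


# Ports taken from the sandbox tarantool.cfg.
for name, port in [("primary", 33013), ("secondary", 33014), ("admin", 33015)]:
    print(name, port, port_is_open("127.0.0.1", port))
```

A result of True shows only that something is listening on the port, so it complements the ps and log checks rather than replacing them.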

Please follow distribution-specific instructions to find out how to manage Tarantool instances on your operating system.

Note

Alternatively, the server can be started right out of the in-source build. Use the Tarantool regression testing framework:

$ ./test/run --start-and-exit

It will create necessary files in directory ./test/var/, and start the server with minimal configuration.

Now that the server is up, you can start the client. The client name is tarantool.

/usr/bin/tarantool                      #if you downloaded a binary with apt-get or yum
~/tarantool/bin/tarantool               #if you downloaded and untarred a binary tarball to ~/tarantool
~/tarantool/client/tarantool/tarantool  #if you built from a source download to ~/tarantool

If all goes well, a prompt will appear:

localhost>

The client is waiting for the user to type instructions.

To insert three tuples (our name for records) into the first space of the database (which is called t0), try this:

localhost> INSERT INTO t0 VALUES (1)
localhost> INSERT INTO t0 VALUES (2,'Music')
localhost> INSERT INTO t0 VALUES (3,'length',93)

To select a tuple from the first space of the database, using the first defined key (which is called k0), try this:

localhost> SELECT * FROM t0 WHERE k0 = 3

Your terminal screen should now look like this:

localhost> INSERT INTO t0 VALUES (1)
Insert OK, 1 rows affected
localhost> INSERT INTO t0 VALUES (2,'Music')
Insert OK, 1 rows affected
localhost> INSERT INTO t0 VALUES (3,'length',93)
Insert OK, 1 rows affected
localhost> SELECT * FROM t0 WHERE k0 = 3
Select OK, 1 rows affected
[3, 'length', 93]
localhost>

You can repeat INSERT and SELECT indefinitely. When the testing is over: To stop the client: Ctrl+C. To stop the server: sudo pkill -f tarantool_box. To destroy the test: rm -r ~/tarantool_sandbox.

Chapter 3. Data model and data persistence

This chapter describes how Tarantool stores values and what operations with data it supports.

Dynamic data model

If you tried out the Starting Tarantool and making your first database exercise from the last chapter, then your database looks like this:

+--------------------------------------------+
|                                            |
| SPACE 'space[0]'                           |
| +----------------------------------------+ |
| |                                        | |
| | TUPLE SET 't0'                         | |
| | +-----------------------------------+  | |
| | | Tuple: [ 1 ]                      |  | |
| | | Tuple: [ 2, 'Music' ]             |  | |
| | | Tuple: [ 3, 'length', 93 ]        |  | |
| | +-----------------------------------+  | |
| |                                        | |
| | INDEX 'index[0]'                       | |
| | +-----------------------------------+  | |
| | | Key: 1                            |  | |
| | | Key: 2                            |  | |
| | | Key: 3                            |  | |
| | +-----------------------------------+  | |
| |                                        | |
| +----------------------------------------+ |
+--------------------------------------------+

Space

A space -- 'space[0]' in the example -- is a container.

There is always at least one space; there can be many spaces, numbered as space[0], space[1], and so on. Spaces always contain one tuple set and one or more indexes.

Tuple Set

A tuple set -- 't0' in the example -- is a group of tuples.

There is always one tuple set in a space. For the tarantool client, the identifier of a tuple set is t followed by the space's number, for example t0 refers to the tuple set of space[0]. (The letter t stands for tuple set.)

A tuple fills the same role as a row or a record, and the components of a tuple (which we call fields) fill the same role as a row column or record field, except that the fields of a tuple don't need to have names. That's why there was no need to pre-define the tuple set in the configuration file, and that's why each tuple can have a different number of elements, and that's why we say that Tarantool has a dynamic data model.

Any given tuple may have any number of fields and the fields may have any of these three types: NUM (32-bit unsigned integer between 0 and 2,147,483,647), NUM64 (64-bit unsigned integer between 0 and 18,446,744,073,709,551,615), or STR (string, any sequence of octets). The identifier of a field is k followed by the field's number, for example k0 refers to the first field of a tuple.

Note

This manual follows the tarantool client convention by using tuple identifier = t followed by the space's number, and using field identifier = k followed by the field's number. The server knows nothing about such identifiers; it only cares about the number. Other clients follow different conventions, and may even have sophisticated ways of mapping meaningful names to numbers.

When the tarantool client displays a tuple, it surrounds strings with single quotes, separates fields with commas, and encloses the tuple inside square brackets. For example: [ 3, 'length', 93 ].
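The dynamic data model can be pictured with a few lines of Python. This is a toy model for illustration only and says nothing about how the server stores data: the class Space and the function show are hypothetical names, and the only rules modeled are that field 0 must exist and that the primary key is unique.

```python
class Space:
    """Toy model of a space: one tuple set with a unique primary index on field 0."""

    def __init__(self):
        self.primary = {}  # value of field 0 -> whole tuple

    def insert(self, *fields):
        if not fields:
            raise ValueError("field 0 must exist, it is the primary key")
        key = fields[0]
        if key in self.primary:
            raise KeyError("duplicate primary key: %r" % (key,))
        self.primary[key] = tuple(fields)

    def select(self, key):
        """Return the tuple whose field 0 equals key, or None."""
        return self.primary.get(key)


def show(fields):
    """Format a tuple the way this manual displays one."""
    return "[ " + ", ".join(repr(field) for field in fields) + " ]"


t0 = Space()
t0.insert(1)                  # one field
t0.insert(2, "Music")         # two fields
t0.insert(3, "length", 93)    # three fields
print(show(t0.select(3)))
```

Nothing declares how many fields a tuple has, or what the fields after the key are called, which is the point of the word dynamic.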

Index

An index -- 'index[0]' in the example -- is a group of key values and pointers.

There is always at least one index in a space; there can be many. The identifier of an index is 'index' followed by the index's number within the space, so in our example there is one index and its identifier is index[0].

An index may be multi-field, that is, the user can declare that an index key value is taken from two or more fields in the tuple, in any order. An index may be unique, that is, the user can declare that it would be illegal to have the same key value twice. An index may have one of three types: HASH which is fastest and uses the least memory but must be unique, TREE which allows partial-key searching and ordered results, and BITSET which can be good for searches that contain '=' and 'AND' in the WHERE clause. The first index -- index[0] -- is called the primary key index and it must be unique; all other indexes -- index[1], index[2], and so on -- are secondary indexes.
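The difference between HASH and TREE can be sketched with ordinary Python containers. This is an analogy under our own assumptions, not the server's data structures: a dict stands in for a HASH index, a sorted list searched with bisect stands in for a TREE index, and tree_range is a hypothetical helper.

```python
import bisect

keys = [30, 10, 20, 40]

# HASH-like: a dict gives exact-match lookup only and keeps no key order.
hash_index = {key: "tuple %d" % key for key in keys}

# TREE-like: a sorted list stands in for a tree and keeps the keys ordered.
tree_index = sorted(keys)


def tree_range(index, low, high):
    """Return the keys with low <= key <= high, in ascending order."""
    start = bisect.bisect_left(index, low)
    stop = bisect.bisect_right(index, high)
    return index[start:stop]


print(hash_index[20])                  # exact match, works with either kind
print(tree_range(tree_index, 15, 40))  # range search needs ordered keys
print(tree_index[::-1])                # so does descending traversal
```

The ordered structure costs more to maintain than the dict, which matches the trade-off described above: HASH is the leaner choice when exact-match lookups are all that is needed.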

An index definition always includes at least one identifier of a tuple field and its expected type. Take our example configuration file, which has the lines:

space[0].index[0].key_field[0].fieldno = 0
space[0].index[0].key_field[0].type = "NUM"

The effect is that, for all tuples in t0, field number 0 (k0) must exist and must be a 32-bit unsigned integer.

For the current version of the Tarantool server, space definitions and index definitions must be in the configuration file. Administrators must take care that what's in the configuration file matches what's in the database. If a server is started with the wrong configuration file, it could behave in an unexpected way or crash. However, it is possible to stop the server or disable database accesses, then add new spaces and indexes, then restart the server or re-enable database accesses. The syntax details for defining spaces and indexes are in chapter 7 Configuration reference.

Operations

The basic operations are: the four data-change operations (INSERT, UPDATE, DELETE, REPLACE), and the data-retrieval operation (SELECT). There are also minor operations like ping which are not available via the tarantool client's SQL-like interface but can only be used with the binary protocol. Also, there are index iterator operations, which can only be used with Lua stored procedures. (Index iterators are for traversing indexes one key at a time, taking advantage of features that are specific to an index type, for example evaluating Boolean expressions when traversing BITSET indexes, or going in descending order when traversing TREE indexes.)

Five examples of basic operations:

/* Add a new tuple to tuple set t0.
   The first field, k0, will be 999 (type is NUM).
   The second field, k1, will be 'Taranto' (type is STR). */
INSERT INTO t0 VALUES (999,'Taranto')

/* Update the tuple, changing field k1.
   The clause "WHERE primary-key-field-identifier = value" is mandatory
   because UPDATE statements must always have a WHERE clause that
   specifies the primary key, which in this case is k0. */
UPDATE t0 SET k1 = 'Tarantino' WHERE k0 = 999

/* Replace the tuple, adding a new field.
   This is not possible with the UPDATE statement because
   the SET clause of an UPDATE statement can only refer to
   fields that already exist. */
REPLACE INTO t0 VALUES (999,'Tarantella','Tarantula')

/* Retrieve the tuple.
   The WHERE clause is still mandatory, although it does not have to
   mention the primary key. */
SELECT * FROM t0 WHERE k0 = 999

/* Delete the tuple.
   Once again the clause "WHERE k0 = value" is mandatory. */
DELETE FROM t0 WHERE k0 = 999

How does Tarantool do a basic operation? Let's take this example:

UPDATE t0 SET k1 = 'size', k2=0 WHERE k0 = 3

STEP #1: the client parses the statement and changes it to a binary-protocol instruction which has already been checked, and which the server can understand without needing to parse everything again. The client ships a packet to the server.

STEP #2: the server's transaction processor thread uses the primary-key index on field k0 to find the location of the tuple in memory. It determines that the tuple can be updated (not much can go wrong when you're merely changing an unindexed field value to something shorter).

STEP #3: the transaction processor thread sends a message to the write-ahead logging (WAL) thread.

At this point a yield takes place. To know the significance of that -- and it's quite significant -- you have to know a few facts and a few new words.

FACT #1: there is only one transaction processor thread. Some people are used to the idea that there can be multiple threads operating on the database, with (say) thread #1 reading row #x while thread #2 writes row #y. With Tarantool no such thing ever happens. Only the transaction processor thread can access the database, and there is only one transaction processor thread for each instance of the server.

FACT #2: the transaction processor thread can handle many fibers. A fiber is a set of computer instructions that may contain yield signals. The transaction processor thread will execute all computer instructions until a yield, then switch to execute the instructions of a different fiber. Thus (say) the thread reads row #x for the sake of fiber #1, then writes row #y for the sake of fiber #2.

FACT #3: yields must happen, otherwise the transaction processor thread would stick permanently on the same fiber. There are implicit yields: every data-change operation or network-access causes an implicit yield, and every statement that goes through the tarantool client causes an implicit yield. And there are explicit yields: in a Lua stored procedure one can and should add yield statements to prevent hogging. This is called cooperative multitasking.
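Cooperative multitasking can be sketched with Python generators. This is an analogy, not Tarantool's fiber implementation: the function names are hypothetical, a generator stands in for a fiber, and the loop stands in for the single transaction processor thread, which runs one fiber until it yields and then switches to the next.

```python
from collections import deque

def fiber(name, steps, log):
    """A fiber: do a piece of work, then yield control voluntarily."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        yield  # hand control back to the scheduler

def run(fibers):
    """Single-threaded scheduler: run each fiber up to its next yield."""
    queue = deque(fibers)
    while queue:
        current = queue.popleft()
        try:
            next(current)
            queue.append(current)  # not finished yet: reschedule it
        except StopIteration:
            pass  # the fiber has run to completion

log = []
run([fiber("fiber1", 2, log), fiber("fiber2", 3, log)])
print(log)
```

If a fiber never reached a yield, the loop would never get to the other fibers. That is the "hogging" that explicit yields in long-running procedures are meant to prevent.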

Since all data-change operations end with an implicit yield and an implicit commit, and since no data-change operation can change more than one tuple, there is no need for any locking. Consider, for example, a stored procedure that does three operations:

SELECT              /* this does not yield and does not commit */
UPDATE              /* this yields and commits */
SELECT              /* this does not yield and does not commit */

The combination SELECT plus UPDATE is an atomic transaction: the stored procedure holds a consistent view of the database until the UPDATE ends. For the combination UPDATE plus SELECT the view is not consistent, because after the UPDATE the transaction processor thread can switch to another fiber, and delete the tuple that was just updated.

Since locks don't exist, and disk writes only involve the write-ahead log, transactions are usually fast. Also the Tarantool server may not be using up all the threads of a powerful multi-core processor, so advanced users may be able to start a second Tarantool server on the same processor without ill effects.

Additional examples of SQL statements can be found in the Tarantool regression test suite. A complete grammar of supported SQL is provided in the Language reference chapter.

Since not all Tarantool operations can be expressed in SQL, complete access to data manipulation functionality requires a connector for Perl, Python, Ruby or another programming language. The client/server protocol is open and documented: an annotated BNF can be found in the source tree, in the file doc/box-protocol.txt.

Data persistence

To maintain data persistence, Tarantool writes each data change request (INSERT, UPDATE, DELETE) into a write-ahead log. WAL files have extension .xlog and are stored in wal_dir. A new WAL file is created for every rows_per_wal records. Each INSERT, UPDATE or DELETE gets assigned a continuously growing 64-bit log sequence number. The name of the log file is based on the log sequence number of the first record this file contains.
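The relationship between log sequence numbers, rows_per_wal and file names can be sketched as follows. The function is hypothetical, and the exact file name format (a zero-padded 20-digit number) is an assumption made for illustration; the text above only states that the name is based on the log sequence number of the first record in the file.

```python
def wal_file_names(first_lsn, record_count, rows_per_wal):
    """Return the names of the WAL files holding record_count records."""
    names = []
    for offset in range(0, record_count, rows_per_wal):
        # Each file is named after the LSN of the first record it contains.
        # The 20-digit zero padding is an assumption, not taken from the text.
        names.append(f"{first_lsn + offset:020d}.xlog")
    return names

# Five records starting at LSN 1, with a new file every two records.
for name in wal_file_names(1, 5, 2):
    print(name)
```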

Apart from a log sequence number and the data change request (its format is the same as in the binary protocol and is described in doc/box-protocol.txt), each WAL record contains a checksum and a UNIX time stamp.

Tarantool processes requests atomically: a change is either accepted and recorded in the WAL, or discarded completely. Let's clarify how this happens, using the REPLACE command as an example:

  1. The server attempts to locate the original tuple by primary key. If found, a reference to the tuple is retained for later use.

  2. The new tuple is then validated. If it violates a unique-key constraint, misses an indexed field, or an index-field type does not match the type of the index, the change is aborted.

  3. The new tuple replaces the old tuple in all existing indexes.

  4. A message is sent to the WAL writer, which runs in a separate thread, requesting that the change be recorded in the WAL. The server switches to work on the next request until the write is acknowledged.

  5. On success, a confirmation is sent to the client. Upon failure, a rollback procedure is begun. During the rollback procedure, the transaction processor rolls back all changes to the database which occurred after the first failed change, from latest to oldest, up to the first failed change. All rolled back requests are aborted with an ER_WAL_IO error. No new change is applied while rollback is in progress. When the rollback procedure is finished, the server restarts the processing pipeline.
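The five steps above can be sketched in a few lines of Python. This is a deliberately simplified model, not the server's code: the Space class, the WalError exception and the wal_write callable are hypothetical, there is only a primary index, and the WAL write is synchronous, so the pipelining and the multi-request rollback described in step 5 are not modelled.

```python
class WalError(Exception):
    """Stands in for a failed write to the write-ahead log."""

class Space:
    def __init__(self, wal_write):
        self.primary = {}           # primary key -> tuple
        self.wal_write = wal_write  # callable, raises WalError on failure

    def replace(self, new_tuple):
        key = new_tuple[0]
        # Step 1: locate the original tuple and keep a reference to it.
        old_tuple = self.primary.get(key)
        # Step 2: validate the new tuple (here: the key field must be an int).
        if not isinstance(key, int):
            raise ValueError("indexed field has the wrong type")
        # Step 3: the new tuple replaces the old one in the index.
        self.primary[key] = new_tuple
        try:
            # Step 4: ask the WAL writer to record the change.
            self.wal_write(new_tuple)
        except WalError:
            # Step 5, failure: restore the state from before the change.
            if old_tuple is None:
                del self.primary[key]
            else:
                self.primary[key] = old_tuple
            raise
        # Step 5, success: confirm the change to the caller.
        return new_tuple

def wal_ok(record):
    pass

def wal_broken(record):
    raise WalError("write failed")

space = Space(wal_ok)
space.replace((999, "Taranto"))
space.wal_write = wal_broken
try:
    space.replace((999, "Tarantino"))
except WalError:
    print("change discarded:", space.primary[999])
```

Either the new tuple is both in the index and in the log, or neither: after the failed write the space still holds the tuple that was there before.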

One advantage of the described algorithm is that complete request pipelining is achieved, even for requests on the same value of the primary key. As a result, database performance doesn't degrade even if all requests touch upon the same key in the same space.

The transaction processor and the WAL writer threads communicate using asynchronous (yet reliable) messaging; the transaction processor thread, not being blocked on WAL tasks, continues to handle requests quickly even at high volumes of disk I/O. A response to a request is sent as soon as it is ready, even if there were earlier incomplete requests on the same connection. In particular, SELECT performance, even for SELECTs running on a connection packed with UPDATEs and DELETEs, remains unaffected by disk load.

The WAL writer supports a number of durability modes, selected with the configuration variable wal_mode. It is possible to turn the write ahead log off completely by setting wal_mode to none. Even without the write ahead log it's still possible to take a persistent copy of the entire data set with SAVE SNAPSHOT.

Chapter 4. Language reference

This chapter provides a reference of Tarantool data operations and administrative commands.

Digression: data and administrative ports

Unlike many other key/value servers, Tarantool uses different TCP ports and client/server protocols for data manipulation and administrative statements. During start up, the server can listen on up to five TCP ports:

  • Read/write data port, to handle INSERTs, UPDATEs, DELETEs, SELECTs and CALLs. This port speaks the native Tarantool protocol, and provides full data access.

    The default value of the port is 33013, as defined in primary_port configuration option.

  • Read only port, which only accepts SELECTs and CALLs, default port number 33014, as defined in secondary_port configuration option.

  • Administrative port, which defaults to 33015, and is defined in admin_port configuration option.

  • Replication port (see replication_port), by default set to 33016, used to send updates to replicas. Replication is optional, and if this port is not set in the configuration file, the corresponding server process is not started.

  • Memcached port. Optional, read-write data port that speaks Memcached text protocol. This port is off by default.

In the absence of authentication, this approach allows system administrators to restrict access to the read/write or administrative ports. The client, however, has to be aware of the separation; the tarantool command line client selects the correct port automatically with the help of a simple regular expression. SELECTs, UPDATEs, INSERTs, DELETEs and CALLs are sent to the primary port. SHOW, RELOAD, SAVE and other statements are sent to the administrative port.
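The port selection described above might look roughly like the following Python sketch. The regular expression and the function name are hypothetical (the pattern actually used by the tarantool client is not reproduced here); the port numbers are the defaults listed above.

```python
import re

PRIMARY_PORT = 33013  # default value of primary_port
ADMIN_PORT = 33015    # default value of admin_port

# Hypothetical pattern. The keyword list mirrors the five statement
# types that the text says are sent to the primary port.
DATA_STATEMENT = re.compile(
    r"^\s*(select|update|insert|delete|call)\b", re.IGNORECASE
)

def port_for(statement):
    """Pick the port a statement should be sent to."""
    if DATA_STATEMENT.match(statement):
        return PRIMARY_PORT
    return ADMIN_PORT

print(port_for("SELECT * FROM t0 WHERE k0 = 999"))
print(port_for("show info"))
```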

Data manipulation

Five basic request types are supported: INSERT, UPDATE, DELETE, SELECT and CALL. All requests, including INSERT, UPDATE and DELETE may return data. A SELECT can be requested to limit the number of returned tuples. This is useful when searching in a non-unique index or when a special wildcard (zero-length string) value is supplied as search key or a key part.

The UPDATE statement supports operations on fields (assignment, arithmetic operations on numeric fields, cutting and pasting fragments of a field) as well as operations on a tuple: push and pop of a field at the tail of a tuple, deletion and insertion of a field. Multiple operations can be combined into a single update, and in this case they are performed atomically. Each operation expects a field number as its first argument. When a sequence of changes is present, the field identifier in each operation is assumed to be relative to the most recent state of the tuple, i.e. as if all previous operations in a multi-operation update have already been applied. In other words, it's always safe to merge multiple UPDATE statements into a single one, with no change in semantics.
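The rule that each field number refers to the most recent state of the tuple can be illustrated with a small Python model. It is only a sketch: the operation names and the apply_update function are hypothetical, a tuple is modelled as a Python list, and only four operations are covered.

```python
def apply_update(fields, operations):
    """Apply operations in order; each one sees the result of the previous ones."""
    fields = list(fields)  # work on a copy, so a failure leaves the input intact
    for op, field_no, arg in operations:
        if op == "assign":
            fields[field_no] = arg
        elif op == "add":
            fields[field_no] += arg
        elif op == "delete":
            del fields[field_no]
        elif op == "insert":
            fields.insert(field_no, arg)
        else:
            raise ValueError(f"unknown operation: {op}")
    return fields

original = [999, "C", 3]

# Delete field 1, then subtract 3 from what used to be field 2.
# After the delete the fields are renumbered, so the second
# operation must address field 1, not field 2.
merged = apply_update(original, [("delete", 1, None), ("add", 1, -3)])

# The same two operations sent as two separate updates.
separate = apply_update(apply_update(original, [("delete", 1, None)]),
                        [("add", 1, -3)])
print(merged, separate)
```

Because every operation is applied to the state left by the previous one, the merged update and the two separate updates produce the same tuple.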

The Tarantool protocol was designed with a focus on asynchronous I/O and easy integration with proxies. Each client request starts with a 12-byte binary header, containing three fields: request type, length, and a numeric id.

The mandatory length, present in the request header, simplifies client or proxy I/O. A response to a request is sent to the client as soon as it is ready. It always carries in its header the same type and id as in the request. The id makes it possible to match a request to a response, even if the latter arrived out of order.
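A 12-byte header with three fields can be packed and unpacked with Python's struct module. The sketch below rests on assumptions that should be checked against doc/box-protocol.txt: each field is taken to be an unsigned 32-bit little-endian integer, and the length is taken to count only the bytes that follow the header. The function names are hypothetical.

```python
import struct

# Assumed layout: three unsigned 32-bit little-endian integers.
HEADER = struct.Struct("<III")
CALL = 22  # command code for CALL, as given later in this chapter

def pack_request(request_type, request_id, body=b""):
    """Prefix the body with a header: request type, body length, request id."""
    return HEADER.pack(request_type, len(body), request_id) + body

def unpack_header(packet):
    """Return (request_type, body_length, request_id) from the first 12 bytes."""
    return HEADER.unpack_from(packet)

# Responses carry the id of their request, so they can be matched
# to the right caller even when they arrive out of order.
pending = {1: "first request", 2: "second request"}
responses = [pack_request(CALL, 2), pack_request(CALL, 1)]
matched = [pending.pop(unpack_header(r)[2]) for r in responses]
print(matched)
```

Because the length is known from the header, a client or proxy can read exactly one packet from the stream without understanding its payload.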

Request type defines the format of the payload. INSERTs, UPDATEs and DELETEs can only be made by the primary key, so an index id and a key (possibly multipart) are always present in these requests. SELECTs can use secondary keys. UPDATE only needs to list the fields that are actually changed. With this one exception, all commands operate on whole tuple(s).

Unless implementing a client driver, one needn't concern oneself with the complications of the binary protocol. Language-specific drivers provide a friendly way to store domain language data structures in Tarantool, and the command line client supports a subset of standard SQL. A complete description of both, the binary protocol and the supported SQL, is maintained in annotated Backus-Naur form in the source tree: please see doc/box-protocol.txt and doc/sql.txt respectively.

Memcached protocol

If full access to Tarantool functionality is not needed, or there is no readily available connector for the programming language in use, any existing client driver for Memcached will make do as a Tarantool connector. To enable the text Memcached protocol, turn on memcached_port in the configuration file. Since Memcached has no notion of spaces or secondary indexes, this port only makes it possible to access one dedicated space (see memcached_space) via its primary key. Unless tuple expiration is enabled with memcached_expire, the TTL part of the message is stored but ignored.

Note that memcached_space is also accessible using the primary port or Lua. A common use of the Memcached port in Tarantool is when the default Memcached expiration algorithm is insufficient, and a custom Lua expiration procedure is used.

Tarantool does not support the binary protocol of Memcached. If top performance is a must, Tarantool's own binary protocol should be used.

Administrative console

The administrative console uses a simple text protocol. All commands are case-insensitive. You can connect to the administrative port using any telnet client, or a tool like rlwrap, if access to readline features is desired. Additionally, tarantool, the SQL-capable command line client, understands all administrative statements and automatically directs them to the administrative port. The server response to an administrative command, even though it is always in plain text, can be quite complex. It is encoded using YAML markup to simplify automated parsing.

To learn about all supported administrative commands, you can type help in the administrative console. A reference description also follows below:

save snapshot

Take a snapshot of all data and store it in snap_dir/<latest-lsn>.snap. To take a snapshot, Tarantool first enters the delayed garbage collection mode for all data. In this mode, tuples which were allocated before the snapshot started are not freed until the snapshot has finished. To preserve consistency of the primary key, which is used to iterate over tuples, a copy-on-write technique is employed: if the master process changes part of a primary key, the corresponding process page is split, and the snapshot process obtains an old copy of the page.

Since a snapshot is written sequentially, one can expect very high write performance (averaging 80 MB/second on modern disks), which means an average database instance gets saved in a matter of minutes. Note that as long as concurrent updates change the parent index memory, there are going to be page splits, and therefore some extra free memory is needed to run this command; 10% of slab_alloc_arena is, on average, sufficient.

This statement waits until the snapshot is taken and returns the operation result. For example:

localhost> show info
---
info:
  version: "1.4.6"
  lsn: 843301
...
localhost> save snapshot
---
ok
...
localhost> save snapshot
---
fail: can't save snapshot, errno 17 (File exists)
...

Taking a snapshot does not cause the server to start a new write ahead log. Once a snapshot is taken, old WALs can be deleted as long as all replicas are up to date. But the WAL which was current at the time save snapshot started must be kept for recovery, since it still contains log records written after the start of save snapshot.

An alternative way to save a snapshot is to send the server SIGUSR1 UNIX signal. While this approach could be handy, it is not recommended for use in automation: a signal provides no way to find out whether the snapshot was taken successfully or not.

reload configuration

Re-read the configuration file. If the file contains changes to dynamic parameters, update the runtime settings. If configuration syntax is incorrect, or a read-only parameter is changed, produce an error and do nothing.

show configuration

Show the current settings. Displays all settings, including those that have default values and thus are not necessarily present in the configuration file.

show info

localhost> show info
---
info:
  version: "1.4.5-128-ga91962c"
  uptime: 441524
  pid: 12315
  logger_pid: 12316
  lsn: 15481913304
  recovery_lag: 0.000
  recovery_last_update: 1306964594.980
  status: primary
  config: "/usr/local/etc/tarantool.cfg"

recovery_lag holds the difference (in seconds) between the current time on the machine (wall clock time) and the time stamp of the last applied record. In replication setup, this difference can indicate the delay taking place before a change is applied to a replica.

recovery_last_update is the wall clock time of the last change recorded in the write ahead log. To convert it to human-readable time, you can use date -d@1306964594.980.

status is either "primary" or "replica/<hostname>".
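The two time-related fields can also be interpreted programmatically. The following Python sketch uses hypothetical helper names. It converts a recovery_last_update value to a UTC date and computes a lag the way the description of recovery_lag suggests, as the difference between the current time and the time stamp of the last applied record.

```python
from datetime import datetime, timezone

def to_utc(timestamp):
    """Convert a UNIX time stamp, as shown in recovery_last_update, to UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)

def lag_seconds(now, last_applied):
    """Difference between the current time and the last applied record."""
    return now - last_applied

last_update = 1306964594.980
print(to_utc(last_update).isoformat())
print(round(lag_seconds(1306964595.0, last_update), 3))
```

Unlike date -d, which prints local time, this sketch always reports UTC.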

show stat

Show the average number of requests per second, and the total number of requests since startup, broken down by request type: INSERT or SELECT or UPDATE or DELETE.

localhost> show stat
---
statistics:
  INSERT:        { rps:  139  , total:  48207694    }
  SELECT_LIMIT:  { rps:  0    , total:  0           }
  SELECT:        { rps:  1246 , total:  388322317   }
  UPDATE_FIELDS: { rps:  1874 , total:  743350520   }
  DELETE:        { rps:  147  , total:  48902544    }
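Since the output is plain text with a regular layout, a monitoring script can pick the counters out of it. The Python sketch below is one hypothetical way to do so with a regular expression written to fit the listing above. A YAML parser would be the more general tool, but none is part of the Python standard library.

```python
import re

# Matches lines such as "  INSERT:  { rps:  139  , total:  48207694  }".
STAT_LINE = re.compile(
    r"^\s*(\w+):\s*\{\s*rps:\s*(\d+)\s*,\s*total:\s*(\d+)\s*\}\s*$"
)

def parse_stat(text):
    """Return {request type: {"rps": int, "total": int}} from show stat output."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            name, rps, total = match.groups()
            stats[name] = {"rps": int(rps), "total": int(total)}
    return stats

sample = """\
---
statistics:
  INSERT:        { rps:  139  , total:  48207694    }
  UPDATE_FIELDS: { rps:  1874 , total:  743350520   }
"""
print(parse_stat(sample))
```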

show slab

Show the statistics of the slab allocator. The slab allocator is the main allocator used to store tuples. This can be used to monitor the total memory use and memory fragmentation.

items_used contains the % of slab_alloc_arena already used to store tuples.

arena_used contains the % of slab_alloc_arena that is already distributed to the slab allocator.

show palloc

A pool allocator is used for temporary memory, when serving client requests. Every fiber has its own temporary pool. Shows the current state of pools of all fibers.

save coredump

Fork and dump a core. Since Tarantool stores all tuples in memory, it can take some time. Mainly useful for debugging.

show fiber

Show all running fibers, with their stack. Mainly useful for debugging.

lua ...

Execute a chunk of Lua code. This can be used to define, invoke, debug and drop stored procedures, inspect the server environment, and perform automated administrative tasks.

Writing stored procedures in Lua

Lua is a light-weight, multi-paradigm, embeddable language. Stored procedures in Lua can be used to implement data manipulation patterns or data structures. A server-side procedure written in Lua can select and modify data, access configuration and perform administrative tasks. It is possible to dynamically define, invoke, alter and drop Lua procedures. Lua procedures can run in the background and perform administrative tasks, such as data expiration or re-sharding.

Tarantool uses the LuaJIT just-in-time Lua compiler and virtual machine. Apart from increased performance, this provides such features as bitwise operations and 64-bit integer arithmetic.

Procedures can be invoked from the administrative console and using the binary protocol, for example:

localhost> lua function f1() return 'hello' end
---
...
localhost> call f1()
Found 1 tuple:
['hello']

In the language of the administrative console LUA ... evaluates an arbitrary Lua chunk. CALL is an SQL standard statement, so its syntax was adopted by Tarantool command line client to invoke the CALL command of the binary protocol.

In the example above, a Lua procedure is first defined using the text protocol of the administrative port, and then invoked using the Tarantool client-side SQL parser plus the binary protocol on the primary_port. Since it's possible to execute any Lua chunk in the administrative console, the newly created function f1() can be called there too:

localhost> lua f1()
---
 - hello
...
localhost> lua 1+2
---
 - 3
...
localhost> lua "hello".." world"
---
 - hello world
...

Lua procedures could also be called at the time of initialization using a dedicated init.lua script, located in script_dir. An example of such a script is given below:

    
-- Importing expirationd module
dofile("expirationd.lua")

function is_expired(args, tuple)
   if tuple == nil then
       return true
   end

   if #tuple <= args.field_no then
       return true
   end

   local field = tuple[args.field_no]
   if field == nil or #field ~= 4 then
       return true
   end

   local current_time = os.time()
   local tuple_ts = box.unpack("i", field)
   return current_time >= tuple_ts + args.ttl
end

function purge(args, tuple)
    box.space[0]:delete(tuple[0])
end

-- Run task
expirationd.run_task("exprd space 0", 0, is_expired, purge,
                    { field_no = 1, ttl = 30 * 60 })

    

The initialization script can select and modify data. However, if the server is a running replica, data change requests from the start script fail just the same way they would fail if they were sent from a remote client.

Another common task to perform in the initialization script is to start background fibers for data expiration, re-sharding, or communication with networked peers.

Finally, the script can be used to define Lua triggers invoked on various events within the system.

There is a single global instance of the Lua interpreter, which is shared across all connections. Anything prefixed with lua on the administrative console is sent directly to this interpreter. Any change of the interpreter state is immediately available to all client connections.

Each connection, however, uses its own Lua coroutine, a mechanism akin to Tarantool fibers. A coroutine has its own execution stack and its own Lua closure, that is, a set of local variables and definitions.

The interpreter environment is not restricted when init.lua is loaded. But before the server starts accepting requests, the standard Lua APIs, such as for file I/O, process control and module management are unset, to avoid possible trivial security attacks.

In the binary protocol, it's only possible to invoke existing procedures, but not define or alter them. The CALL request packet contains the command code for CALL (22), the name of a procedure to be called, and a tuple for procedure arguments. Currently, Tarantool tuples are type-agnostic, thus each field of the tuple is passed into the procedure as an argument of type string. For example:

kostja@atlas:~$ cat arg.lua
function f1(a)
    local s = a
    if type(a) == 'string' then
        s = ''
        for i=1, #a, 1 do
            s = s..string.format('0x%x ', string.byte(a, i))
        end
    end
    return type(a), s
end
kostja@atlas:~$ tarantool
localhost> lua dofile('arg.lua')
---
...
localhost> lua f1('1234')
---
 - string
 - 0x31 0x32 0x33 0x34
...
localhost> call f1('1234')
Call OK, 2 rows affected
['string']
['0x31 0x32 0x33 0x34 ']
localhost> lua f1(1234)
---
 - number
 - 1234
...
localhost> call f1(1234)
Call OK, 2 rows affected
['string']
['0xd2 0x4 0x0 0x0 ']

In the above example, the way the procedure receives its argument is identical in both protocols when the argument is a string. A numeric field, however, when submitted via the binary protocol, is seen by the procedure as a 4-byte blob, not as a Lua number type.
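The 4-byte blob seen in the listing can be reproduced outside the server. The Python sketch below assumes that the number is sent as an unsigned 32-bit little-endian integer, which is consistent with the bytes printed above. The helper name is hypothetical.

```python
import struct

def hex_bytes(blob):
    """Format bytes the same way f1() does in the listing above."""
    return "".join("0x%x " % byte for byte in blob)

as_string = b"1234"                  # the argument sent as a string
as_number = struct.pack("<I", 1234)  # the argument sent as a 32-bit number

print(hex_bytes(as_string))  # 0x31 0x32 0x33 0x34
print(hex_bytes(as_number))  # 0xd2 0x4 0x0 0x0

# Turning the blob back into a number is the reverse operation.
(value,) = struct.unpack("<I", as_number)
print(value)  # 1234
```

Inside a stored procedure the same decoding is done with box.unpack, as in the is_expired() function of the init.lua example earlier in this section.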

In addition to conventional method invocation, Lua provides object-oriented syntax. Access to the latter is available on the administrative console only:

localhost> lua box.space[0]:truncate()
---
...
localhost> call box.space[0]:truncate()
error: 1:15 expected '('

Since it's impossible to invoke object methods from the binary protocol, the object-oriented syntax is often used to restrict certain operations to be used by a system administrator only.

Every value, returned from a stored function by means of a return clause, is converted to a Tarantool tuple. Tuples are returned as such, in binary form; a Lua scalar, such as a string or an integer, is converted to a tuple with only one field. When the returned value is a Lua table, the resulting tuple contains only table values, but not keys.

When a function in Lua terminates with an error, the error is sent to the client as ER_PROC_LUA return code, with the original error message preserved. Similarly, an error which has occurred inside Tarantool (observed on the client as an error code), when it happens during execution of a Lua procedure, produces a genuine Lua error:

localhost> lua function f1() error("oops") end
---
...
localhost> call f1()
Call ERROR, Lua error: [string "function f1() error("oops") end"]:1: oops (ER_PROC_LUA)
localhost> call box.insert('99', 1, 'test')
Call ERROR, Space 99 is disabled (ER_SPACE_DISABLED)
localhost> lua pcall(box.insert, 99, 1, 'test')
---
 - false
 - Space 99 is disabled
...

It's possible not only to invoke trivial Lua code, but call into Tarantool storage functionality, using the box Lua library. The contents of the library can be inspected at runtime:

localhost> lua for k, v in pairs(box) do print(k, ": ", type(v)) end
---
fiber: table
space: table
cfg: table
on_reload_configuration: function
update: function
process: function
delete: function
insert: function
select: function
index: table
unpack: function
replace: function
select_range: function
pack: function
...

As is shown in the listing, the box package contains:

  • high-level functions, such as process(), update(), select(), select_range(), insert(), replace(), delete(), to manipulate tuples and access spaces from Lua.

  • libraries, such as cfg, space, fiber, index, tuple, to access server configuration, create, resume and interrupt fibers, inspect contents of spaces, indexes and tuples, send and receive data over the network.

Global Lua names added by Tarantool

tonumber64(value)

Convert a given string or a Lua number to a 64-bit integer. The returned value supports all arithmetic operations, but uses 64-bit integer arithmetic, rather than floating-point arithmetic as in the built-in number type.

Example

localhost> lua tonumber64('123456789'), tonumber64(123456789)
---
 - 123456789
 - 123456789
...
localhost> lua i=tonumber64(1)
---
...
localhost> lua type(i), type(i*2),  type(i/2), i, i*2, i/2
---
 - cdata
 - cdata
 - cdata
 - 1
 - 2
 - 0
...

Package box

box.process(op, request)

Process a request passed in as a binary string. This is an entry point into the server request processor. It can be used to insert, update, select and delete tuples from within a Lua procedure.

This is a low-level API, and it expects all arguments to be packed in accordance with the binary protocol (iproto header excluded). Normally, there is no need to use box.process() directly: box.select(), box.update() and other convenience wrappers invoke box.process() with correctly packed arguments.

Parameters

op — number, any Tarantool command code, except 22 (CALL). See doc/box-protocol.txt.
request — command arguments packed in binary format.

Returns

This function returns zero or more tuples. In Lua, a tuple is represented by a userdata object of type box.tuple. If a Lua procedure is called from the administrative console, returned tuples are printed out in YAML format. When called from the binary protocol, the binary format is used.

Errors

Any server error produced by the executed command.

Please note that, since all requests from Lua enter the core through box.process(), all checks and triggers run by the core automatically apply. For example, if the server is in read-only mode, an update or delete fails. Analogously, if a system-wide "instead of" trigger is defined, it is run.

box.select(space_no, index_no, ...)

Search for a tuple or tuples in the given space. A wrapper around box.process().

Parameters

space_no — space id,
index_no — index number in the space, to be used for match
...— index key, possibly multipart.

Returns

Returns zero or more tuples.

Errors

Same as in box.process(). Any error results in a Lua exception.

Example

localhost> call box.insert(0, 'test', 'my first tuple')
Call OK, 1 rows affected
['test', 'my first tuple']
localhost> call box.select(0, 0, 'test')
Call OK, 1 rows affected
['test', 'my first tuple']
localhost> lua box.insert(5, 'testtest', 'firstname', 'lastname')
---
 - 'testtest': {'firstname', 'lastname'}
...
localhost> lua box.select(5, 1, 'firstname', 'lastname')
---
 - 'testtest': {'firstname', 'lastname'}
...

box.select_limit(space_no, index_no, offset, limit, ...)

Search for tuples in the given space. This is a full version of the built-in SELECT command, in which one can specify offset and limit for a multi-tuple return. The server may return multiple tuples when the index is non-unique or a partial key is used for search.

box.insert(space_no, ...)
box.replace(space_no, ...)

Insert a tuple into a space. Tuple fields follow space_no. If a tuple with the same primary key already exists, box.insert() returns an error, while box.replace() replaces the existing tuple with a new one. These functions are wrappers around box.process().

Returns

Returns the inserted tuple.

box.update(space_no, key, format, ...)

Update a tuple identified by a primary key. If a key is multipart, it is passed in as a Lua table. Update arguments follow, described by format. The format and arguments are passed to box.pack() and the result is sent to box.process(). A correct format is a sequence of pairs: update operation, operation arguments. A single character of format describes either an operation which needs to take place or operation argument. A format specifier also works as a placeholder for the number of a field, which needs to be updated, or for an argument value. For example:

+p=p — add a value to one field and assign another,
:p — splice a field: start at offset, cut length bytes, and add a string.
#p — delete a field.
!p — insert a field (before the one specified).

Possible format specifiers are: = for assignment, + for addition, - for subtraction, & for bitwise AND, | for bitwise OR, ^ for bitwise exclusive OR (XOR), : for string splice, # for field deletion, ! for field insertion, and p for an operation argument.
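To make the pairing of format characters and arguments concrete, here is a small Python model. It is not how box.update() is implemented: the function names are hypothetical, a tuple is modelled as a list, only the =, + and | operations are handled, and each operation is assumed to consume exactly one field number and one argument.

```python
def parse_update(fmt, args):
    """Pair each operation character with its field number and its argument."""
    if len(fmt) % 2 != 0 or len(args) != len(fmt):
        raise ValueError("format and arguments do not match")
    operations = []
    for i in range(0, len(fmt), 2):
        if fmt[i + 1] != "p":
            raise ValueError("expected 'p' after the operation character")
        operations.append((fmt[i], args[i], args[i + 1]))
    return operations

def update(fields, fmt, *args):
    """Apply the parsed operations to a copy of the tuple, in order."""
    fields = list(fields)
    for op, field_no, arg in parse_update(fmt, args):
        if op == "=":
            if field_no == len(fields):
                fields.append(arg)  # assigning one past the end adds a field
            else:
                fields[field_no] = arg
        elif op == "+":
            fields[field_no] += arg
        elif op == "|":
            fields[field_no] |= arg
        else:
            raise ValueError(f"operation not modelled: {op}")
    return fields

t = [999, "A"]
t = update(t, "=p", 1, "B")
t = update(t, "=p", 2, 1)
t = update(t, "+p", 2, 1)
t = update(t, "|p=p", 2, 1, 1, "C")
print(t)
```

The four calls use the same formats and arguments as the first updates in the example that follows.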

Returns

Returns the updated tuple.

Example

#Assume that the initial state of the database is ...
#  space[0] has one tuple set and one primary key whose type is 32-bit integer.
#  There is one row, with field[0] = 999 and field[1] = 'A'.

#In the following update ...
#  The first argument is 0, that is, the affected space is space[0]
#  The second argument is 999, that is, the affected tuple is identified by primary key value = 999
#  The third argument is '=p', that is, there is one operation, assignment to a field
#  The fourth argument is 1, that is, the affected field is field[1]
#  The fifth argument is 'B', that is, field[1] contents change to 'B'
#  Therefore, after the following update, field[0] = 999 and field[1] = 'B'.
lua box.update(0, 999, '=p', 1, 'B')

#In the following update, the arguments are the same, except that ...
#  the key is passed as a Lua table (inside braces). This is unnecessary
#  when the primary key has only one field, but would be necessary if the
#  primary key had more than one field.
#  Therefore, after the following update, field[0] = 999 and field[1] = 'B' (no change).
lua box.update(0, {999}, '=p', 1, 'B')

#In the following update, the arguments are the same, except that ...
#   The fourth argument is 2, that is the affected field is field[2].
#   It is okay that, until now, field[2] has not existed. It gets added.
#   Therefore, after the following update, field[0] = 999, field[1] = 'B', field[2] = 1.
lua box.update(0, 999, '=p', 2, 1)

#In the following update, the arguments are the same, except that ...
#   The third argument is '+p', that is, the operation is addition rather than assignment.
#   Since field[2] previously contained 1, this means we're adding 1 to 1.
#   Therefore, after the following update, field[0] = 999, field[1] = 'B', field[2] = 2.
lua box.update(0, 999, '+p', 2, 1)

#In the following update ...
#   The idea is to modify two fields at once.
#   The third argument is '|p=p', that is, there are two operations, OR and assignment.
#   The fourth and fifth arguments mean that field[2] gets ORed with 1.
#   The sixth and seventh arguments mean that field[1] gets assigned 'C'.
#   Therefore, after the following update, field[0] = 999, field[1] = 'C', field[2] = 3.
lua box.update(0, 999, '|p=p', 2, 1, 1, 'C')

#In the following update ...
#   The idea is to delete field[1], then subtract 3 from field[2], but ...
#   after the delete, there is a renumbering -- so field[2] becomes field[1]
#   before we subtract 3 from it, and that's why the sixth argument is 1 not 2.
#   Therefore, after the following update, field[0] = 999, field[1] = 0.
lua box.update(0, 999, '#p-p', 1, 0, 1, 3)

#In the following update ...
#   We're assigning a longer string so that the splice will work in the next example.
#   Therefore, after the following update, field[0] = 999, field[1] = 'XYZ'.
lua box.update(0, 999, '=p', 1, 'XYZ')

#In the following update ...
#   The third argument is ':p', that is, this is the example of splice.
#   The fifth argument is actually three values packed together by box.pack('ppp', ...) ...
#      an offset (1), the number of bytes to cut (1), and the string to add ('!')
#   Therefore, after the following update, field[0] = 999, field[1] = 'X!Z'.
lua box.update(0, 999, ':p', 1, box.pack('ppp', 1, 1, '!'))
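
The renumbering rule used in the examples above can be modelled outside the server. The following Python sketch is only an illustration, not Tarantool code: apply_update and its (operation, field number, argument) triples are hypothetical names, and the sketch assumes that operations are applied one after another in the order given, as the comments above describe.

```python
def apply_update(fields, ops):
    """Apply (op, field_no, arg) triples to a list of fields, in order."""
    fields = list(fields)  # work on a copy, leave the input untouched
    for op, n, arg in ops:
        if op == '=':
            if n == len(fields):
                fields.append(arg)      # assigning one past the end adds a field
            else:
                fields[n] = arg
        elif op == '+':
            fields[n] += arg
        elif op == '-':
            fields[n] -= arg
        elif op == '&':
            fields[n] &= arg
        elif op == '|':
            fields[n] |= arg
        elif op == '^':
            fields[n] ^= arg
        elif op == '#':
            del fields[n]               # later fields shift down by one
        elif op == '!':
            fields.insert(n, arg)       # insert before field n
        elif op == ':':
            offset, length, text = arg  # splice: cut `length` bytes at `offset`
            s = fields[n]
            fields[n] = s[:offset] + text + s[offset + length:]
        else:
            raise ValueError('unknown operation: %r' % op)
    return fields

# '#p-p' from the example: delete field[1], then subtract 3 from the new field[1].
after_delete = apply_update([999, 'C', 3], [('#', 1, 0), ('-', 1, 3)])
# ':p' from the example: replace one byte at offset 1 with '!'.
after_splice = apply_update([999, 'XYZ'], [(':', 1, (1, 1, '!'))])
```

Because the delete runs first, the subtraction has to address the field by its new number, which is exactly why the example passes 1 rather than 2.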

box.delete(space_no, ...)

Delete a tuple identified by a primary key.

Returns

Returns the deleted tuple.

Example

localhost> call box.delete(0, 'test')
Call OK, 1 rows affected
['test', 'my first tuple']
localhost> call box.delete(0, 'test')
Call OK, 0 rows affected
localhost> call box.delete(0, 'tes')
Call ERROR, Illegal parameters, key is not u32 (ER_ILLEGAL_PARAMS)

box.select_range(space_no, index_no, limit, key, ...)

Select a range of tuples, starting from the offset specified by key. The key can be multipart. Limit selection with at most limit tuples. If no key is specified, start from the first key in the index.

For TREE indexes, this returns tuples in sorted order. For HASH indexes, the order of tuples is unspecified, and can change significantly if data is inserted or deleted between two calls to box.select_range(). BITSET indexes do not support this call. This function is a simple wrapper around box.space[space_no]:select_range(index_no, ...).

Example

localhost> show configuration
---
...
  space[4].cardinality: "-1"
  space[4].estimated_rows: "0"
  space[4].index[0].type: "HASH"
  space[4].index[0].unique: "true"
  space[4].index[0].key_field[0].fieldno: "0"
  space[4].index[0].key_field[0].type: "STR"
  space[4].index[1].type: "TREE"
  space[4].index[1].unique: "false"
  space[4].index[1].key_field[0].fieldno: "1"
  space[4].index[1].key_field[0].type: "STR"
...
localhost> insert into t4 values ('0', '0')
Insert OK, 1 rows affected
localhost> insert into t4 values ('1', '1')
Insert OK, 1 rows affected
localhost> insert into t4 values ('2', '2')
Insert OK, 1 rows affected
localhost> insert into t4 values ('3', '3')
Insert OK, 1 rows affected
localhost> lua box.select_range(4, 0, 10)
---
 - '3': {'3'}
 - '0': {'0'}
 - '1': {'1'}
 - '2': {'2'}
...
localhost> lua box.select_range(4, 1, 10)
---
 - '0': {'0'}
 - '1': {'1'}
 - '2': {'2'}
 - '3': {'3'}
...
localhost> lua box.select_range(4, 1, 2)
---
 - '0': {'0'}
 - '1': {'1'}
...
localhost> lua box.select_range(4, 1, 2, '1')
---
 - '1': {'1'}
 - '2': {'2'}
...
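
The TREE results above can be reproduced with a small model. This Python sketch is an illustration under stated assumptions, not the server's code: it treats the index as a sorted list of keys, and it assumes the selection starts at the first key greater than or equal to the given key. The function name merely mirrors the Lua call.

```python
import bisect

def select_range(sorted_keys, limit, key=None):
    """Return up to `limit` keys in ascending order, starting at `key`."""
    # With no key, start from the first key in the index.
    start = 0 if key is None else bisect.bisect_left(sorted_keys, key)
    return sorted_keys[start:start + limit]

keys = ['0', '1', '2', '3']
first_two = select_range(keys, 2)
from_one = select_range(keys, 2, '1')
```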

box.select_reverse_range(space_no, index_no, limit, key, ...)

Select a reverse range of tuples, starting from the offset specified by key. The key can be multipart. Limit selection with at most limit tuples. If no key is specified, start from the last key in the index.

For TREE indexes, this returns tuples in descending order. For other index types this call is not supported.

Example

localhost> show configuration
---
...
  space[4].cardinality: "-1"
  space[4].estimated_rows: "0"
  space[4].index[0].type: "HASH"
  space[4].index[0].unique: "true"
  space[4].index[0].key_field[0].fieldno: "0"
  space[4].index[0].key_field[0].type: "STR"
  space[4].index[1].type: "TREE"
  space[4].index[1].unique: "false"
  space[4].index[1].key_field[0].fieldno: "1"
  space[4].index[1].key_field[0].type: "STR"
...
localhost> insert into t4 values ('0', '0')
Insert OK, 1 rows affected
localhost> insert into t4 values ('1', '1')
Insert OK, 1 rows affected
localhost> insert into t4 values ('2', '2')
Insert OK, 1 rows affected
localhost> insert into t4 values ('3', '3')
Insert OK, 1 rows affected
localhost> lua box.select_reverse_range(4, 0, 10)
---
 error: 'Illegal parameters, hash iterator is forward only'
...
localhost> lua box.select_reverse_range(4, 1, 10)
---
 - '3': {'3'}
 - '2': {'2'}
 - '1': {'1'}
 - '0': {'0'}
...
localhost> lua box.select_reverse_range(4, 1, 2)
---
 - '3': {'3'}
 - '2': {'2'}
...
localhost> lua box.select_reverse_range(4, 1, 2, '1')
---
 - '1': {'1'}
 - '0': {'0'}
...
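
The reverse selection can be modelled in the same spirit. This Python sketch is an illustration, not the server's code: it treats the TREE index as a sorted list of keys and assumes the selection starts at the last key less than or equal to the given key, then walks towards the start of the index.

```python
import bisect

def select_reverse_range(sorted_keys, limit, key=None):
    """Return up to `limit` keys in descending order, starting at `key`."""
    # With no key, start from the last key in the index.
    end = len(sorted_keys) if key is None else bisect.bisect_right(sorted_keys, key)
    return sorted_keys[:end][::-1][:limit]

keys = ['0', '1', '2', '3']
last_two = select_reverse_range(keys, 2)
down_from_one = select_reverse_range(keys, 2, '1')
```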

box.pack(format, ...)

To use Tarantool binary protocol primitives from Lua, it's necessary to convert Lua variables to binary format. This helper function is prototyped after Perl 'pack'. It takes a format and a list of arguments, and returns a binary string with all arguments packed according to the format.

Format specifiers

b — converts Lua variable to a 1-byte integer, and stores the integer in the resulting string.
s — converts Lua variable to a 2-byte integer, and stores the integer in the resulting string, low byte first.
i — converts Lua variable to a 4-byte integer, and stores the integer in the resulting string, low byte first.
l — converts Lua variable to an 8-byte integer, and stores the integer in the resulting string, low byte first.
n — converts Lua variable to a 2-byte integer, and stores the integer in the resulting string, big endian.
N — converts Lua variable to a 4-byte integer, and stores the integer in the resulting string, big endian.
Q — converts Lua variable to an 8-byte integer, and stores the integer in the resulting string, big endian.
f — converts Lua variable to a 4-byte float, and stores the float in the resulting string.
d — converts Lua variable to an 8-byte double, and stores the double in the resulting string.
w — converts Lua integer to a BER-encoded integer.
p — stores the length of the argument as a BER-encoded integer followed by the argument itself (a little-endian 4-byte integer for integers, and a binary blob for other types).
=, +, &, |, ^, : — stores the corresponding Tarantool UPDATE operation code: field assignment, addition, conjunction, disjunction, exclusive disjunction, splice (from Perl SPLICE function). Expects the number of the field to update as an argument. These format specifiers only store the corresponding operation code and field number, but do not describe operation arguments.

Errors

Unknown format specifier.

Example

localhost> lua box.insert(0, 0, 'hello world')
---
 - 0: {'hello world'}
...
localhost> lua box.update(0, 0, "=p", 1, 'bye world')
---
 - 0: {'bye world'}
...
localhost> lua box.update(0, 0, ":p", 1, box.pack('ppp', 0, 3, 'hello'))
---
 - 0: {'hello world'}
...
localhost> lua box.update(0, 0, "=p", 1, 4)
---
 - 0: {4}
...
localhost> lua box.update(0, 0, "+p", 1, 4)
---
 - 0: {8}
...
localhost> lua box.update(0, 0, "^p", 1, 4)
---
 - 0: {12}
...
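
To make the w and p specifiers more concrete, here is a Python sketch of the byte layout they describe. It is a model, not the server's implementation: ber_encode and pack_p are hypothetical helper names, and the sketch assumes that "BER-encoded" means the same base-128 encoding as the w template of Perl's pack (most significant group first, high bit set on every byte except the last).

```python
import struct

def ber_encode(n):
    """Encode a non-negative integer in base 128, most significant group
    first, with the high bit set on every byte except the last."""
    if n < 0:
        raise ValueError('BER integers are non-negative')
    out = [n & 0x7F]                 # last byte: high bit clear
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))

def pack_p(value):
    """Model of the 'p' specifier: BER length prefix, then the payload."""
    if isinstance(value, int):
        payload = struct.pack('<I', value)   # little-endian 4-byte integer
    elif isinstance(value, str):
        payload = value.encode('utf-8')      # binary blob
    else:
        payload = bytes(value)
    return ber_encode(len(payload)) + payload

packed_int = pack_p(4)
packed_str = pack_p('hello')
```

Under these assumptions the integer 4 becomes a one-byte length (4) followed by four little-endian bytes, and a string becomes its length followed by its bytes.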

box.unpack(format, binary)

Counterpart to box.pack().

Example

localhost> lua tuple=box.replace(2, 0)
---
...
localhost> lua string.len(tuple[0])
---
 - 4
...
localhost> lua box.unpack('i', tuple[0])
---
 - 0
...
localhost> lua box.unpack('bsil', box.pack('bsil', 255, 65535, 4294967295, tonumber64('18446744073709551615')))
---
 - 255
 - 65535
 - 4294967295
 - 18446744073709551615
...
localhost> lua num, str, num64 = box.unpack('ppp', box.pack('ppp', 666, 'string', tonumber64('666666666666666')))
---
...
localhost> lua print(box.unpack('i', num));
---
666
...
localhost> lua print(str);
---
string
...
localhost> lua print(box.unpack('l', num64))
---
666666666666666
...
localhost> lua box.unpack('=p', box.pack('=p', 1, '666'))
---
 - 1
 - 666
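
The fixed-width specifiers of box.pack() and box.unpack() map naturally onto Python's struct module, which makes the round trip in the 'bsil' example easy to check by hand. The sketch below is a model only: the SPEC table, pack and unpack are hypothetical names, integers are assumed to be unsigned, and the f, d, w and p specifiers are left out.

```python
import struct

# Assumed mapping from box.pack specifiers to struct codes (unsigned integers).
SPEC = {
    'b': '<B',  # 1 byte
    's': '<H',  # 2 bytes, low byte first
    'i': '<I',  # 4 bytes, low byte first
    'l': '<Q',  # 8 bytes, low byte first
    'n': '>H',  # 2 bytes, big endian
    'N': '>I',  # 4 bytes, big endian
    'Q': '>Q',  # 8 bytes, big endian
}

def pack(fmt, *args):
    """Pack each argument according to the matching specifier in fmt."""
    return b''.join(struct.pack(SPEC[c], a) for c, a in zip(fmt, args))

def unpack(fmt, data):
    """Inverse of pack(): read one value per specifier, left to right."""
    values, pos = [], 0
    for c in fmt:
        code = SPEC[c]
        values.append(struct.unpack_from(code, data, pos)[0])
        pos += struct.calcsize(code)
    return values

maxima = [255, 65535, 4294967295, 18446744073709551615]
round_trip = unpack('bsil', pack('bsil', *maxima))
```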

box.print(...)

Redefines Lua print() built-in to print either to the log file (when Lua is used from the binary port) or back to the user (for the administrative console).

When printing to the log file, INFO log level is used. When printing to the administrative console, all output is sent directly to the socket.

Note: the administrative console output must be YAML-compatible.

box.dostring(s, ...)

Evaluates an arbitrary chunk of Lua code passed in s. If there is a compilation error, it's raised as a Lua error. In case of compilation success, all arguments which follow s are passed to the compiled chunk and the chunk is invoked.

This function is mainly useful to define and run an arbitrary piece of Lua code, without having to introduce changes to the global Lua environment.

Example

lua box.dostring('abc')
---
error: '[string "abc"]:1: ''='' expected near ''<eof>'''
...
lua box.dostring('return 1')
---
 - 1
...
lua box.dostring('return ...', 'hello', 'world')
---
 - hello
 - world
...
lua box.dostring('local f = function(key) t=box.select(0, 0, key); if t ~= nil then return t[0] else return nil end end return f(...)', 0)
---
 - nil
...

box.time()

Returns current system time (in seconds) as a Lua number. The time is taken from the event loop clock, which makes this call very cheap, but still useful for constructing artificial tuple keys.

box.time64()

Returns current system time (in seconds) as a 64-bit integer. The time is taken from the event loop clock.

box.uuid()

Generates a 128-bit (16-byte) unique id. The id is returned in binary form.

Requires libuuid library to be installed. The library is loaded at runtime, and if the library is not available, this function returns an error.

box.uuid_hex()

Generates a 128-bit (16-byte) unique id and returns it as a 32-character hexadecimal string.

Example
lua box.uuid_hex()
---
 - a4f29fa0eb6d11e19f7737696d7fa8ff
...
            
box.raise(errcode, errtext)

Raises a client error. The difference between this function and the built-in error() function in Lua is that when the error reaches the client, its error code is preserved, whereas every Lua error is presented to the client as ER_PROC_LUA. This function makes it possible to emulate any kind of native exception, such as a unique constraint violation, no such space/index, etc. A complete list of errors is present in the errcode.h file in the source tree. Lua constants which correspond to Tarantool errors are defined in the box.error module. The error message can be arbitrary.

Example
lua box.raise(box.error.ER_WAL_IO, 'Wal I/O error')
---
error: 'Wal I/O error'
...
            
box.auto_increment(space_no, ...)

Insert values into space designated by space_no, using an auto-increment primary key. The space must have a NUM or NUM64 primary key index of type TREE.

Example
localhost> lua box.auto_increment(0, "I am a duplicate")
---
 - 1: {'I am a duplicate'}
...
localhost> lua box.auto_increment(0, "I am a duplicate")
---
 - 2: {'I am a duplicate'}
...
            
box.counter.inc(space_no, key)

Increments a counter identified by the key. The key can be multipart, but there must be an index covering all fields of the key. If there is no tuple identified by the given key, creates a new one with initial counter value set to 1. Returns the new counter value.

Example
localhost> lua box.counter.inc(0, 'top.mail.ru')
---
 - 1
...
localhost> lua box.counter.inc(0, 'top.mail.ru')
---
 - 2
...
box.counter.dec(space_no, key)

Decrements a counter identified by the given key. If the key is not found, this is a no-op. When the counter value drops to 0, the tuple is deleted.

Example
localhost> lua box.counter.dec(0, 'top.mail.ru')
---
 - 1
...
localhost> lua box.counter.dec(0, 'top.mail.ru')
---
 - 0
...
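
The life cycle of a counter tuple, as described for box.counter.inc() and box.counter.dec(), can be summarised with a dictionary-based model. This Python sketch is an illustration only: the Counters class is hypothetical, and the value returned by dec() for a missing key (0 here) is an assumption, since the text above only says the call is a no-op.

```python
class Counters:
    """Toy model of a space used as a set of counters keyed by `key`."""

    def __init__(self):
        self.tuples = {}

    def inc(self, key):
        # A missing tuple is created with the counter set to 1.
        self.tuples[key] = self.tuples.get(key, 0) + 1
        return self.tuples[key]

    def dec(self, key):
        if key not in self.tuples:
            return 0                  # no-op (return value is an assumption)
        self.tuples[key] -= 1
        value = self.tuples[key]
        if value == 0:
            del self.tuples[key]      # the tuple is deleted at zero
        return value

counters = Counters()
history = [counters.inc('top.mail.ru'), counters.inc('top.mail.ru'),
           counters.dec('top.mail.ru'), counters.dec('top.mail.ru')]
```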

Package box.tuple

This package provides read-only access for the box.tuple userdata type. It allows, for a single tuple: selective retrieval of the field contents, retrieval of information about size, iteration over all the fields, and conversion to a Lua table.

box.tuple.new(...)

Construct a new tuple from a Lua table or a scalar. Alternatively, one can get new tuples from Tarantool's SQL-like statements: SELECT, INSERT, UPDATE, REPLACE, which can be regarded as statements that do new() implicitly. In the following example, x and t will be new tuple objects. Saying lua t returns the entire tuple t.

Example
localhost> lua x=box.insert(0,'a',tonumber('1'),tonumber64('2')):totable()
---
...
localhost> lua t=box.tuple.new({'abc','def','ghi','abc'})
---
...
localhost> lua t
---
 - 'abc': {'def', 'ghi', 'abc'}
...
#

The # operator in Lua means "return count of components". So, if t is a tuple instance, #t will return the number of fields. In the following example, a tuple named t is created and then the number of fields in t is returned.

Example
localhost> lua t=box.tuple.new({'Field#1','Field#2','Field#3','Field#4'})
---
...
localhost> lua #t
---
 - 4
...
bsize()

If t is a tuple instance, t:bsize() will return the number of bytes in the tuple. It is useful to check this number when making changes to data, because there is a fixed maximum: one megabyte. Every field has one or more "length" bytes preceding the actual contents, so bsize() returns a value which is slightly greater than the sum of the lengths of the contents. In the following example, a tuple named t is created which has three fields, and for each field it takes one byte to store the length and three bytes to store the contents, so bsize() returns 3*(1+3).

Example
localhost> lua t=box.tuple.new({'aaa','bbb','ccc'})
---
...
localhost> lua t:bsize()
---
 - 12
...
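
The arithmetic behind bsize() can be checked with a short model. This Python sketch is an approximation under stated assumptions, not the server's code: the bsize function here is a hypothetical stand-in that handles only string and bytes fields, and it assumes each length prefix uses one byte per 7 bits of length, as in the BER encoding mentioned for box.pack().

```python
def bsize(fields):
    """Sum of (length prefix + contents) over all fields."""
    total = 0
    for field in fields:
        data = field.encode('utf-8') if isinstance(field, str) else bytes(field)
        prefix, n = 1, len(data) >> 7
        while n:                 # each further 7 bits of length costs a byte
            prefix += 1
            n >>= 7
        total += prefix + len(data)
    return total

three_short_fields = bsize(['aaa', 'bbb', 'ccc'])
one_long_field = bsize(['a' * 128])
```

With these assumptions, three 3-byte fields give 3*(1+3) bytes, while a single 128-byte field needs a 2-byte prefix.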
[ ]

If t is a tuple instance, t[n] will return the nth field in the tuple. The first field is t[0]. In the following example, a tuple named t is created and then the second field in t is returned.

Example
localhost> lua t=box.tuple.new({'Field#1','Field#2','Field#3','Field#4'})
---
...
localhost> lua t[1]
---
 - Field#2
...
find() or findall()

If t is a tuple instance, t:find(search-value) will return the number of the first field in t that matches search-value, and t:findall(search-value) will return numbers of all fields in t that match search-value. Optionally one can put a numeric argument n before the search-value to indicate start searching at field number n. In the following example, a tuple named t is created and then: the number of the first field in t which matches 'a' is returned, then the numbers of all the fields in t which match 'a' are returned, then the numbers of all the fields in t which match 'a' and are at or after the second field are returned.

Example
localhost> lua t=box.tuple.new({'a','b','c','a'})
---
...
localhost> lua t:find('a')
---
 - 0
...
localhost> lua t:findall('a')
---
 - 0
 - 3
...
localhost> lua t:findall(1,'a')
---
 - 3
...
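
The search semantics can be expressed as a simple scan over zero-based field numbers. This Python sketch is a model, not the tuple implementation; note that, unlike t:findall(n, value), the hypothetical helpers below take the optional start position as the last argument.

```python
def findall(fields, value, start=0):
    """Numbers of all fields at or after `start` that equal `value`."""
    return [i for i in range(start, len(fields)) if fields[i] == value]

def find(fields, value, start=0):
    """Number of the first matching field, or None if there is none."""
    matches = findall(fields, value, start)
    return matches[0] if matches else None

t = ['a', 'b', 'c', 'a']
first = find(t, 'a')
everywhere = findall(t, 'a')
from_second = findall(t, 'a', 1)
```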
transform()

If t is a tuple instance, t:transform(n1,n2) will return a tuple where, starting from field n1, a number of fields (n2) are removed. Optionally one can add more arguments after n2 to indicate new values that will replace what was removed. In the following example, a tuple named t is created and then, starting from the second field, two fields are removed but one new one is added, then the result is returned.

Example
localhost> lua t=box.tuple.new({'Field#1','Field#2','Field#3','Field#4','Field#5'})
---
...
localhost> lua t:transform(1,2,'x')
---
 - 'Field#1': {'x', 'Field#4', 'Field#5'}
...
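
In list terms, transform() removes a run of fields and puts the replacement values in its place. The following Python sketch models that with slice assignment; the function is a hypothetical stand-in that operates on a plain list rather than on a tuple object.

```python
def transform(fields, start, count, *new_values):
    """Remove `count` fields starting at `start`, insert `new_values` there."""
    fields = list(fields)                       # leave the original untouched
    fields[start:start + count] = new_values
    return fields

t = ['Field#1', 'Field#2', 'Field#3', 'Field#4', 'Field#5']
result = transform(t, 1, 2, 'x')
```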
slice()

If t is a tuple instance, t:slice(n) will return all fields starting with field number n, and t:slice(n1,n2) will return a tuple containing fields starting with field number n1, but stopping before field number n2. In the following example, a tuple named t is created and then, starting from the second field, fields before the fourth field are selected, then the result is returned.

Example
localhost> lua t=box.tuple.new({'Field#1','Field#2','Field#3','Field#4','Field#5'})
---
...
localhost> lua t:slice(1,3)
---
 - Field#2
 - Field#3
...
unpack()

If t is a tuple instance, t:unpack() will return all fields. In effect, unpack() is the same as slice(0,-1). In the following example, a tuple named t is created and then all its fields are selected, then the result is returned.

Example
localhost> lua t=box.tuple.new({'Field#1','Field#2','Field#3','Field#4','Field#5'})
---
...
localhost> lua t:unpack()
---
 - Field#1
 - Field#2
 - Field#3
 - Field#4
 - Field#5
...
pairs()

In Lua, pairs() is a method which returns: function, value, nil. It is useful for Lua iterators, because Lua iterators traverse a value's components until an end marker is reached. In the following example, a tuple named t is created and then all its fields are selected using a Lua for-end loop.

Example
localhost> lua t=box.tuple.new({'Field#1','Field#2','Field#3','Field#4','Field#5'})
---
...
localhost> lua for k,v in t:pairs() do print(v) end
---
Field#1
Field#2
Field#3
Field#4
Field#5
...

Package box.cjson

This package provides JSON manipulation routines. It's based on the Lua-CJSON package by Mark Pulford. For a complete manual on Lua-CJSON please read the official documentation.

box.cjson.encode(object)

Convert a Lua object to a JSON string.

box.cjson.decode(string)

Convert a JSON string to a Lua object.

Example
lua box.cjson.encode(123)
---
 - 123
...
lua box.cjson.encode({123})
---
 - [123]
...
lua box.cjson.encode({123, 234, 345})
---
 - [123,234,345]
...
lua box.cjson.encode({abc = 234, cde = 345})
---
 - {"cde":345,"abc":234}
...
lua box.cjson.encode({hello = { 'world' } })
---
 - {"hello":["world"]}
...
lua box.cjson.decode('123')
---
 - 123
...
lua box.cjson.decode('[123, "hello"]')[2]
---
 - hello
...
lua box.cjson.decode('{"hello": "world"}').hello
---
 - world
...

Package box.space

This package is a container for all configured spaces. A space object provides access to space attributes, such as id, whether or not a space is enabled, space cardinality, and estimated number of rows. It also contains object-oriented versions of box functions. For example, instead of box.insert(0, ...) one can write box.space[0]:insert(...). Package source code is available in file src/box/lua/box.lua.

A list of all space members follows.

space.n
Ordinal space number, box.space[i].n == i
space.enabled
Whether or not this space is enabled in the configuration file.
space.cardinality
A limit on tuple field count for tuples in this space. This limit can be set in the configuration file. Value 0 stands for unlimited.
space.index[]
A container for all defined indexes. An index is a Lua object of type box.index with methods to search tuples and iterate over them in predefined order.
space:select(index_no, ...)
space:select_range(index_no, limit, key)

Select a range of tuples, starting from the offset specified by key. The key can be multipart. Limit selection with at most limit tuples. If no key is specified, start from the first key in the index.

For TREE indexes, this returns tuples in sorted order. For other indexes, the order of tuples is unspecified, and can change significantly if data is inserted or deleted between two calls to select_range(). If key is nil or unspecified, the selection starts from the start of the index.

space:select_reverse_range(index_no, limit, key)
Select a reverse range of tuples, limited by limit, starting from key. The key can be multipart. For TREE indexes, this returns tuples in descending order. For other index types this call is not supported.
space:insert(...)
space:replace(...)
space:delete(key)
space:update(key, format, ...)
Object-oriented forms of respective box methods.
space:len()
Returns number of tuples in the space.
space:truncate()
Deletes all tuples.
space:pairs()

A helper function to iterate over all space tuples, Lua style.

Example
localhost> lua for k,v in box.space[0]:pairs() do print(v) end
---
1: {'hello'}
2: {'my     '}
3: {'Lua    '}
4: {'world'}
...

Package box.index

This package implements methods of type box.index. Indexes are contained in box.space[i].index[] array within each space object. They provide an API for ordered iteration over tuples. This API is a direct binding to corresponding methods of index objects in the storage engine.

index.unique
Boolean, true if the index is unique.
index.type
A string for index type, either 'TREE', 'HASH', or 'BITSET'.
index.key_field[]
An array describing index key fields.
index.idx
The underlying userdata which does all the magic.
index:iterator(type, ...)

This method provides iteration support within an index. Parameter type is used to identify the semantics of iteration. Different index types support different iterators. The remaining arguments of the function are varying and depend on the iteration type. For example, a TREE index maintains a strict order of keys and can return all tuples in ascending or descending order, starting from the specified key. Other index types, however, do not support ordering.

To understand consistency of tuples returned by an iterator, it's essential to know the principles of the Tarantool transaction processing subsystem. An iterator in Tarantool does not own a consistent read view. Instead, each procedure is granted exclusive access to all tuples and spaces until it encounters a "context switch": by causing a write to disk, network, or by an explicit call to box.fiber.yield(). When the execution flow returns to the yielded procedure, the data set could have changed significantly. Iteration, resumed after a yield point, does not preserve the read view, but continues with the new content of the database.

Parameters

type — iteration strategy as defined in tables below.

Returns

This method returns an iterator closure, i.e. a function which can be used to get the next value on each invocation.

Errors

Selected iteration type is not supported in the subject index type or supplied parameters do not match iteration type.

Table 4.1. Common iterator types

box.index.ALL
Arguments: none. Supported by HASH, TREE and BITSET indexes. Iterate over all tuples in an index. When iterating over a TREE index, tuples are returned in ascending order of the key. When iterating over a HASH or BITSET index, tuples are returned in physical order or, in other words, unordered.

box.index.EQ
Arguments: key. Supported by HASH, TREE and BITSET indexes. Equality iterator: iterate over all tuples matching the key. Parts of a multipart key need to be separated by commas.

Semantics of the match depend on the index. A HASH index only supports exact match: all parts of a key participating in the index must be provided. A TREE index also accepts a subset of key parts or a key prefix. In this case, all tuples with the same prefix or matching key parts are considered matching the search criteria.

When a TREE index is not unique, or only part of a key is given as a search criterion, matching tuples are returned in ascending order. BITSET and HASH indexes are always unique.

box.index.GT
Arguments: key. Supported by HASH (*) and TREE indexes, not supported by BITSET indexes. Iterate over tuples strictly greater than the search key. For TREE indexes, a key prefix or key part can be sufficient. If the key is nil, iteration starts from the smallest key in the index. The tuples are returned in ascending order of the key. (*) A HASH index also supports this iterator type, but returns tuples in unspecified order. However, if the server does not receive updates, this iterator can be used to retrieve all tuples via a HASH index piece by piece, by supplying the last key from the previous range as the start key for an iterator over the next range. A BITSET index does not support this iteration type yet.


Table 4.2. TREE iterator types

box.index.REQ
Arguments: key or key part. Reverse equality iterator. Equivalent to box.index.EQ, with the only distinction that the order of returned tuples is descending, not ascending.

box.index.GE
Arguments: key or key part. Iterate over all tuples for which the corresponding fields are greater than or equal to the search key. The tuples are returned in ascending order. Similarly to box.index.EQ, a key prefix or key part can be used to seed the iterator. If the key is nil, iteration starts from the smallest key in the index.

box.index.LT
Arguments: key or key part. Similar to box.index.GT, but returns all tuples which are strictly less than the search key. The tuples are returned in descending order of the key. A nil key can be used to start from the end of the index range.

box.index.LE
Arguments: key or key part. Similar to box.index.GE, but returns all tuples which are less than or equal to the search key or key prefix, and returns tuples in descending order, from biggest to smallest. If the key is nil, iteration starts from the end of the index range.


Table 4.3. BITSET iterator types

box.index.BITS_ALL_SET
Arguments: bit mask. Matches tuples in which all specified bits are set.

box.index.BITS_ANY_SET
Arguments: bit mask. Matches tuples in which any of the specified bits is set.

box.index.BITS_ALL_NOT_SET
Arguments: bit mask. Matches tuples in which none of the specified bits is set.
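
The three BITSET match conditions correspond to simple bitwise tests on the indexed value. The Python sketch below spells them out; the function names are hypothetical and only mirror the iterator type names, and the indexed field is assumed to be a plain non-negative integer.

```python
def bits_all_set(value, mask):
    """True if every bit of `mask` is set in `value`."""
    return (value & mask) == mask

def bits_any_set(value, mask):
    """True if at least one bit of `mask` is set in `value`."""
    return (value & mask) != 0

def bits_all_not_set(value, mask):
    """True if no bit of `mask` is set in `value`."""
    return (value & mask) == 0

mask = 0b0101
values = [0b0111, 0b0100, 0b1000]
all_set = [v for v in values if bits_all_set(v, mask)]
any_set = [v for v in values if bits_any_set(v, mask)]
none_set = [v for v in values if bits_all_not_set(v, mask)]
```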


Examples

localhost> show configuration
---
...
  space[0].enabled: "true"
  space[0].index[0].type: "HASH"
  space[0].index[0].unique: "true"
  space[0].index[0].key_field[0].fieldno: "0"
  space[0].index[0].key_field[0].type: "NUM"
  space[0].index[1].type: "TREE"
  space[0].index[1].unique: "false"
  space[0].index[1].key_field[0].fieldno: "1"
  space[0].index[1].key_field[0].type: "NUM"
  space[0].index[1].key_field[1].fieldno: "2"
  space[0].index[1].key_field[1].type: "NUM"
...
localhost> INSERT INTO t0 VALUES (1, 1, 0)
Insert OK, 1 rows affected
localhost> INSERT INTO t0 VALUES (2, 1, 1)
Insert OK, 1 rows affected
localhost> INSERT INTO t0 VALUES (3, 1, 2)
Insert OK, 1 rows affected
localhost> INSERT INTO t0 VALUES (4, 2, 0)
Insert OK, 1 rows affected
localhost> INSERT INTO t0 VALUES (5, 2, 1)
Insert OK, 1 rows affected
localhost> INSERT INTO t0 VALUES (6, 2, 2)
Insert OK, 1 rows affected
localhost> lua it = box.space[0].index[1]:iterator(box.index.EQ, 1); print(it(), " ", it(), " ", it());
---
1: {1, 0} 2: {1, 1} 3: {1, 2}
...
localhost> lua it = box.space[0].index[1]:iterator(box.index.EQ, 1, 2); print(it(), " ", it(), " ", it());
---
3: {1, 2} nil nil
...
localhost> lua it = box.space[0].index[1]:iterator(box.index.GE, 2, 1); print(it(), " ", it(), " ", it());
---
5: {2, 1} 6: {2, 2} nil
...
localhost> lua for v in box.space[0].index[1]:iterator(box.index.ALL) do print(v) end
---
1: {1, 0}
2: {1, 1}
3: {1, 2}
4: {2, 0}
5: {2, 1}
6: {2, 2}
...
localhost> lua i = box.space[0].index[0]:iterator(box.index.LT, 1);
---
error: 'Iterator type is not supported'
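
The partial-key behaviour of EQ and GE on the two-part TREE index in the examples above can be reproduced with sorted tuples. This Python sketch is a model, not the storage engine: rows are (primary key, field 1, field 2) triples taken from the INSERT statements, and iterate_eq and iterate_ge are hypothetical helpers that compare only as many key parts as were supplied.

```python
rows = [(1, 1, 0), (2, 1, 1), (3, 1, 2), (4, 2, 0), (5, 2, 1), (6, 2, 2)]

def index_key(row):
    """Key of the secondary TREE index: fields 1 and 2."""
    return row[1:3]

def iterate_eq(rows, *key):
    """Rows whose index key starts with the given key parts, ascending."""
    ordered = sorted(rows, key=index_key)
    return [r for r in ordered if index_key(r)[:len(key)] == key]

def iterate_ge(rows, *key):
    """Rows whose leading key parts are >= the given key parts, ascending."""
    ordered = sorted(rows, key=index_key)
    return [r for r in ordered if index_key(r)[:len(key)] >= key]

eq_prefix = iterate_eq(rows, 1)       # partial key: first part only
eq_full = iterate_eq(rows, 1, 2)      # full key
ge_full = iterate_ge(rows, 2, 1)
```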

index:min()
The smallest value in the index. Available only for indexes of type 'TREE'.
index:max()
The biggest value in the index. Available only for indexes of type 'TREE'.
index:random(randint)
Return a random value from an index. A random non-negative integer must be supplied as input, and a value is selected accordingly in index-specific fashion. This method is useful when it's important to get insight into data distribution in an index without having to iterate over the entire data set.
index:count()
Iterate over an index, counting the number of tuples which match the provided search criteria. The argument can be a tuple, a key, or one or more key parts. Returns the number of matched tuples.

Package box.fiber

Functions in this package allow you to create, run and manage existing fibers.

A fiber is an independent execution thread implemented using a mechanism of cooperative multitasking. A fiber has three possible states: running, suspended or dead. When a fiber is created with box.fiber.create(), it is suspended. When a fiber is started with box.fiber.resume(), it is running. When a fiber's control is yielded back to the caller with box.fiber.yield(), it is suspended. When a fiber ends (due to return or by reaching the end of the fiber function), it is dead.
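
The state transitions of an attached fiber resemble those of a Python generator, which is also a cooperatively scheduled routine. The sketch below is an analogy only, not Tarantool code: there is no scheduler and no notion of detaching, and the worker function is hypothetical. Creating the generator corresponds to box.fiber.create(), next() and send() to box.fiber.resume(), and yield to box.fiber.yield().

```python
import inspect

def worker():
    received = yield 'first'      # hand control (and a value) back to the caller
    yield 'got ' + received       # resumed with a value, yield again

fiber = worker()                                    # created, not yet started
state_created = inspect.getgeneratorstate(fiber)
first = next(fiber)                                 # resume: run to the first yield
state_yielded = inspect.getgeneratorstate(fiber)
second = fiber.send('hello')                        # resume and pass a value in
finished = next(fiber, None)                        # let the function run to its end
state_done = inspect.getgeneratorstate(fiber)
```

The three observed states play the roles of a freshly created (suspended) fiber, a fiber suspended at a yield point, and a dead fiber.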

A fiber can also be attached or detached. An attached fiber is a child of the creator, and is running only if the creator has called box.fiber.resume(). A detached fiber is a child of Tarantool internal sched fiber, and gets scheduled only if there is a libev event associated with it.

To detach, a running fiber must invoke box.fiber.detach(). A detached fiber loses connection with its parent forever.

All fibers are part of the fiber registry, box.fiber. This registry can be searched (box.fiber.find()) either by fiber id (fid), which is numeric, or by fiber name, which is a string. If there is more than one fiber with the given name, the first fiber that matches is returned.

Once the fiber function finishes or calls return, the fiber is considered dead. Its carcass is put into a fiber pool, and can be reused when another fiber is created.

A runaway fiber can be stopped with box.fiber.cancel(). However, box.fiber.cancel() is advisory — it works only if the runaway fiber calls box.fiber.testcancel() once in a while. Most box.* hooks, such as box.delete() or box.update(), do call box.fiber.testcancel(). box.select() doesn't.

In practice, a runaway fiber can only become unresponsive if it does a lot of computations and doesn't check whether it's been canceled.

The other potential problem comes from detached fibers which never get scheduled, because they are not subscribed to any events, or because no relevant events occur. Such morphing fibers can be killed with box.fiber.cancel() at any time, since box.fiber.cancel() sends an asynchronous wakeup event to the fiber, and box.fiber.testcancel() is checked whenever such an event occurs.

Like all Lua objects, dead fibers are garbage collected: the garbage collector frees pool allocator memory owned by the fiber, resets all fiber data, and returns the fiber to the fiber pool.

box.fiber.id(fiber)
Return a numeric id of the fiber.
box.fiber.self()
Return box.fiber userdata object for the currently scheduled fiber.
box.fiber.find(id)
Locate a fiber userdata object by id.
box.fiber.create(function)

Create a fiber for a function.

Errors

Can hit a recursion limit.

box.fiber.resume(fiber, ...)
Resume a created or suspended fiber.
box.fiber.yield(...)

Yield control to the calling fiber, if the fiber is attached, or to sched otherwise.

If the fiber is attached, whatever arguments are passed to this call are passed on to the calling fiber. If the fiber is detached, box.fiber.yield() returns everything passed into it after temporarily yielding control back to the scheduler.

box.fiber.detach()
Detach the current fiber. This is a cancellation point. This is a yield point.
box.fiber.wrap(function, ...)
This is a quick way to create and start a detached fiber. The fiber function is passed in the first argument, the function arguments follow. The fiber is created, detached, and resumed immediately.
box.fiber.sleep(time)
Yield to the sched fiber and sleep time seconds. Only the current fiber can be made to sleep.
box.fiber.status(fiber)
Returns the status of the fiber. If no argument is provided, the current fiber's status is returned. The status can be either dead, suspended, attached or running.
box.fiber.cancel(fiber)
Cancel a fiber. Running and suspended fibers can be canceled. Returns an error if the subject fiber does not permit cancel.
box.fiber.testcancel()
Check if the current fiber has been canceled and throw an exception if this is the case.
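
The advisory nature of box.fiber.cancel() can be illustrated with a minimal sketch of a long computation that polls for cancellation. The function name cancellable_sum and its step and checkpoint parameters are hypothetical. The checkpoint is passed in as an argument so that the loop can also be exercised outside the server; inside Tarantool one would pass box.fiber.testcancel.

```lua
-- Sum the integers 1..n, calling checkpoint() every `step` iterations.
-- Hypothetical helper: inside the server, pass box.fiber.testcancel as
-- `checkpoint`, so that box.fiber.cancel() can interrupt the loop.
function cancellable_sum(n, step, checkpoint)
    local total = 0
    for i = 1, n do
        total = total + i
        if i % step == 0 then
            checkpoint()  -- box.fiber.testcancel raises an error if canceled
        end
    end
    return total
end

-- Possible use in a detached fiber (assumes a running Tarantool server):
-- box.fiber.wrap(cancellable_sum, 100000000, 1000, box.fiber.testcancel)
```

A smaller step makes the fiber react to cancellation sooner, at the cost of more frequent checks.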

Package box.session

Learn session state, set on-connect and on-disconnect triggers.

A session is an object associated with each client connection. Through this module, it's possible to query session state, as well as set a Lua chunk executed on a connect or disconnect event.

box.session.id()
Return a unique monotonic identifier of the current session. The identifier can be used to check whether or not a session is alive. 0 means there is no session (e.g. a procedure is running in a detached fiber).
box.session.fd(id)
Return an integer file descriptor associated with the connected client.
box.session.exists(id)
Return true if a session is alive, false otherwise.
box.session.peer(id)
Return the host address and port for the session peer, for example "127.0.0.1:55457", or "0.0.0.0:0" if the session is not connected.

This module also makes it possible to define triggers on connect and disconnect events. Please see the triggers chapter for details.

Package box.ipc — inter procedure communication

box.ipc.channel(capacity)

Create a new communication channel with predefined capacity. The channel can be used to synchronously exchange messages between stored procedures. The channel is garbage collected when no one is using it, just like any other Lua object. Channels can be worked with using functional or object-oriented syntax. For example, the following two lines are equivalent:

    channel:put(message)
    box.ipc.channel.put(channel, message)
box.ipc.channel.put(channel, message, timeout)
Send a message using a channel. If the channel is full, box.ipc.channel.put() blocks until there is a free slot in the channel. If timeout is provided, and no slot becomes free for the duration of the timeout, box.ipc.channel.put() returns false. Otherwise it returns true.
box.ipc.channel.get(channel, timeout)
Fetch a message from a channel. If the channel is empty, box.ipc.channel.get() blocks until there is a message. If timeout is provided, and there are no new messages for the duration of the timeout, box.ipc.channel.get() returns nil.
box.ipc.channel.broadcast(channel, message, timeout)
If the channel is empty, this call is equivalent to box.ipc.channel.put(). Otherwise it sends the message to all readers of the channel.
box.ipc.channel.is_empty(channel)
Check if the channel is empty (has no messages).
box.ipc.channel.is_full(channel)
Check if the channel is full (has no room for a new message).
box.ipc.channel.has_readers(channel)
Check if the channel is empty and has readers waiting for a message.
box.ipc.channel.has_writers(channel)

Check if the channel is full and has writers waiting for empty room.

Example
local channel = box.ipc.channel(10)
function consumer_fiber()
    while true do
        local task = channel:get()
        ...
    end
end

function consumer2_fiber()
    while true do
        local task = channel:get(10)        -- 10 seconds
        if task ~= nil then
            ...
        else
            print("timeout!")
        end
    end
end

function producer_fiber()
    while true do
        local task = box.select(...)
        ...
        if channel:is_empty() then
            -- channel is empty
        end

        if channel:is_full() then
            -- channel is full
        end

        ...
        if channel:has_readers() then
            -- some fibers are waiting for data
        end
        ...

        if channel:has_writers() then
            -- some fibers are waiting for free room to write
        end
        channel:put(task)
    end
end

function producer2_fiber()
    while true do
        local task = box.select(...)

        if channel:put(task, 10) then       -- 10 seconds
            ...
        else
            print("timeout!")
        end
    end
end

Package box.socket — TCP and UDP sockets

BSD sockets is a mechanism to exchange data with a local or remote host in connection-oriented (TCP) or datagram-oriented (UDP) mode. Semantics of the calls in the box.socket API closely follow semantics of the corresponding POSIX calls. Function names and signatures are mostly compatible with luasocket.

Similarly to luasocket, box.socket doesn't throw exceptions on errors. On success, most calls return a socket object. On error, a multiple return of nil, status, errno, errstr is produced. Status can be one of "error", "timeout", "eof" or "limit". On success, status is always nil. A call which returns data (recv(), recvfrom(), readline()) on success returns a Lua string of the requested size and nil status. On error or timeout, an empty string is followed by the corresponding status, error number and message. A call which sends data (send(), sendto()) on success returns the number of bytes sent, and the status is, again, nil. On error or timeout 0 is returned, followed by status, error number and message.

The last error can be retrieved from the socket using socket:error(). Any call except error() clears the last error first (but may set a new one).

Calls which require a socket address, and which in POSIX expect struct sockaddr_in, simply accept a host name and port as additional arguments in box.socket. Name resolution is done automatically. If it fails, status is set to "error", errno is set to -1 and the error string is set to "Host name resolution failed".

All calls that can take time block the calling fiber and can get it preempted. The implementation, however, uses non-blocking cooperative I/O, so Tarantool continues processing queries while a call is blocked. A timeout can be provided for any socket call which can take a long time.

As with all other box libraries, the API can be used in procedural style (e.g. box.socket.close(socket)) as well as in object-oriented style (socket:close()).

A closed socket should not be used any more. If a socket is not closed explicitly, it is closed when its userdata is garbage collected by Lua.

box.socket.tcp()

Create a new TCP socket.

Returns

A new socket or nil.

box.socket.udp()

Create a new UDP socket.

Returns

A new socket or nil.

socket:connect(host, port, [timeout])

Connect a socket to a remote host. Can be used with IPv6 and IPv4 addresses, as well as domain names. If multiple addresses correspond to a domain, tries them all until successfully connected.

Returns

A connected socket on success; nil, status, errno, errstr on error or timeout.

socket:send(data, [timeout])

Send data over a connected socket.

Returns

The number of bytes sent. On success, this is exactly the length of data. In case of error or timeout, returns the number of bytes sent before error, followed by status, errno, errstr.

socket:recv(size, [timeout])

Read size bytes from a connected socket. An internal read-ahead buffer is used to reduce the cost of this call.

Returns

A string of the requested length on success. On error or timeout, returns an empty string, followed by status, errno, errstr. If there was some data read before a timeout occurred, it will be available on the next call. In case the writing side has closed its end, returns the remainder read from the socket (possibly an empty string), followed by "eof" status.

socket:readline([limit], [separator list], [timeout])

Read a line from a connected socket.

socket:readline() with no arguments reads data from a socket until '\n' or eof. If a limit is set, the call reads data until a separator is found, or the limit is reached. By default, there is no limit. Instead of the default separator, a Lua table can be used with one or multiple separators. Then the data is read until the first matching separator is found.

Returns

A Lua string with data in case of success or an empty string in case of error. When multiple separators were provided in a separator table, the matched separator is returned as the third argument.

Table 4.4. readline() returns

Return values                        Condition
data, nil, separator                 success
"", "timeout", ETIMEDOUT, errstr     timeout
"", "error", errno, errstr           error
data, "limit"                        limit
data, "eof"                          eof


socket:bind(host, port[, timeout])

Bind a socket to the given host/port. A UDP socket after binding can be used to receive data (see recvfrom()). A TCP socket can be used to accept new connections, after it's been put in listen mode. The timeout is used for name resolution only. If host name is an IP address, the call never yields and the timeout is unused.

Returns

Socket object on success; nil, status, errno, errstr on error.

socket:listen()

Start listening for incoming connections. The listen backlog, on Linux, is taken from /proc/sys/net/core/somaxconn, whereas on BSD it is set to SOMAXCONN.

Returns

Socket on success; nil, "error", errno, errstr on error.

socket:accept([timeout])

Wait for a new client connection and create a connected socket.

Returns

peer_socket, nil, peer_host, peer_port on success; nil, status, errno, errstr on error.

socket:sendto(data, host, port, [timeout])

Send a message on a UDP socket to a specified host.

Returns

The number of bytes sent on success; 0, status, errno, errstr on error or timeout.

socket:recvfrom(limit[, timeout])

Receive a message on a UDP socket.

Returns

Message, nil, client address, client port on success; "", status, errno, errstr on error or timeout.

socket:shutdown(how)

Shut down the reading end, the writing end, or both ends of a socket. Accepts box.socket.SHUT_RD, box.socket.SHUT_WR and box.socket.SHUT_RDWR.

Returns

Socket on success; nil, "error", errno, errstr on error.

socket:close()

Close (destroy) a socket. A closed socket should not be used any more.

socket:error()

Retrieve the last error that occurred on a socket.

Returns

errno, errstr. 0, "Success" if there is no error.
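
The calls above can be combined into a small TCP client. The following is a minimal sketch rather than a definitive implementation: the function name request_line and the 4096-byte limit are hypothetical, and the socket is passed in as an argument (inside the server it would come from box.socket.tcp()). It relies only on the return conventions described in this section: nil status on success, and a non-nil status otherwise.

```lua
-- Send one request over TCP and read one '\n'-terminated reply.
-- `sock` is expected to be a fresh socket, e.g. from box.socket.tcp().
-- Returns the reply on success, or nil, status, errno, errstr on failure.
function request_line(sock, host, port, request, timeout)
    local ok, status, errno, errstr = sock:connect(host, port, timeout)
    if ok == nil then
        sock:close()
        return nil, status, errno, errstr
    end

    local sent
    sent, status, errno, errstr = sock:send(request, timeout)
    if status ~= nil then
        sock:close()
        return nil, status, errno, errstr
    end

    -- 4096 is an arbitrary limit chosen for this sketch.
    local line
    line, status, errno, errstr = sock:readline(4096, { "\n" }, timeout)
    sock:close()
    if status ~= nil then
        return nil, status, errno, errstr
    end
    return line
end

-- Possible use inside the server (host, port and request are placeholders):
-- local reply, status = request_line(box.socket.tcp(), "localhost", 8080, "PING\n", 5)
```

The sketch treats the "limit" and "eof" statuses as failures and discards partial data; a caller that wants the partial line would return it together with the status instead.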

Package box.net.box — working with networked Tarantool peers

Library box.net contains connectors to remote database systems, such as MariaDB or PostgreSQL. The first supported system is Tarantool itself.

The basic object provided by box.net.box library is a connection. A connection is created by calling box.net.box.new(). To execute remote requests, simply invoke methods of the connection object; a physical connection is established upon request and is re-established if necessary. When done, issue conn:close(). Connection objects are garbage collected just like any other objects in Lua, so an explicit destruction is not mandatory. However, since close() is a system call, it is a good programming practice to close a connection explicitly when it is no longer needed, to avoid lengthy stalls of the garbage collector.

The library also provides a pre-created connection object to the local server, box.net.self. This connection is always established. The purpose of this object is to make polymorphic use of the box.net.box API easier. There is an important difference, however, between the embedded connection and a remote one. With the embedded connection, requests which do not modify data do not yield. When using a remote connection, it is important to keep in mind that any request can yield, and local database state may have changed by the time it returns.

All box.net.box methods are fiber-safe. It's safe to share and use the same connection object across multiple concurrent fibers. In fact, it's perhaps the best programming practice with Tarantool: when multiple fibers use the same connection, all requests are pipelined through the same network socket, but each fiber gets a correct response back. Reducing the number of active sockets lowers the overhead of system calls and increases the overall server performance. There are, however, cases when a single connection is not enough: when it's necessary to prioritize requests, use different authentication ids, etc.

All remote calls support execution timeouts. A specialized wrapper, box.net.box.timeout(), allows setting a timeout. Using a wrapper object makes the remote connection API compatible with the local one, removing the need for a separate timeout argument, ignored by the local version. Once sent, a request cannot be revoked from the remote server even if a timeout expires: the expired timeout only aborts the wait for the remote server response.

Object-oriented and functional APIs are equivalent: conn:close() is identical to box.net.box.close(conn).

conn = box.net.box.new(host, port [, reconnect_interval])

Create a new connection. The connection is established on demand, at the time of the first request. It is re-established automatically after a disconnect. The argument reconnect_interval (in seconds) is responsible for the amount of time the server sleeps between failing attempts to reconnect. The returned object supports methods for making remote calls, such as select, update or delete.

conn:ping()

Execute a PING command. Returns true on success, false on error.

conn:close()

Close a connection. The connection is also closed when it's garbage collected. It's still recommended to close connections explicitly, to spare the garbage collector from heavy work such as closing the socket.

conn:select(space_no, index_no, ...)

See box.select(...). Note that, unlike a local box.select(), any call to a remote server yields, so local data may change while a remote select() is running.

conn:select_limit(space_no, index_no, offset, limit, ...)

See box.select_limit(...)

conn:select_range(space_no, index_no, limit, key, ...)

See box.select_range(...).

conn:insert(space_no, ...)

See box.insert(...).

conn:replace(space_no, ...)

See box.replace(...).

conn:update(...)

See box.update(...).

conn:delete(space_no, key)

See box.delete(...).

conn:call(proc_name, ...)

Call a remote stored procedure, such as box.select_reverse_range(). Keep in mind that the call uses the binary protocol to pack procedure arguments, and since the protocol is type-agnostic, it's recommended to pass all arguments of a remote stored procedure as strings, for example:

    conn:call("box.select_reverse_range", "1", "4", "10", "Smith")

conn:timeout(timeout)

Returns a closure which is identical to the invoked function, except for the added timeout functionality.

-- wait for 'update' until it is finished
local tuple = conn:update('1', 'key', ...)

-- send update but don't bother to wait for results
local other = conn:timeout(0):update('1', 'arg1', ...)

Example
-- connect to the local server
local self = box.net.box.new('127.0.0.1', box.info.primary_port)
self:insert("1", "Hello", "World")
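
The timeout wrapper can also be combined with conn:ping() in a small helper. This is a sketch under stated assumptions: the helper name select_with_timeout and its error message are hypothetical, and conn is any connection object, for example one returned by box.net.box.new(). Only ping(), timeout() and select() from the list above are used.

```lua
-- Ping a peer, then run a select that waits at most `timeout` seconds.
-- `conn` is a connection object, e.g. from box.net.box.new(host, port).
-- Returns nil plus a message when the peer does not answer the ping.
function select_with_timeout(conn, timeout, space_no, index_no, ...)
    if not conn:ping() then
        return nil, "peer did not answer ping"
    end
    -- conn:timeout() returns a wrapper exposing the same request methods
    return conn:timeout(timeout):select(space_no, index_no, ...)
end
```

What the wrapped call returns when the timeout expires is not specified in this section, so the sketch passes the result through unchanged. As noted above, an expired timeout only stops the wait; the request itself may still be executed on the remote server.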

Packages box.cfg, box.info, box.slab and box.stat: server introspection

Package box.cfg

This package provides read-only access to all server configuration parameters.

box.cfg
Example
localhost> lua for k, v in pairs(box.cfg) do print(k, " = ", v) end
---
io_collect_interval = 0
pid_file = box.pid
panic_on_wal_error = false
slab_alloc_factor = 2
slab_alloc_minimal = 64
admin_port = 33015
logger = cat - >> tarantool.log
...

Package box.info

This package provides access to information about server variables: pid, uptime, version and such. Its contents are identical to the output from SHOW INFO.

box.info()

Since box.info contents are dynamic, it's not possible to iterate over keys with the Lua pairs() function. For this purpose, box.info() builds and returns a Lua table with all keys and values provided in the package.

Example
localhost> lua for k,v in pairs(box.info()) do print(k, ": ", v) end
---
version: 1.4.7-92-g4ba95ca
status: primary
pid: 1747
lsn: 1712
recovery_last_update: 1306964594.980
recovery_lag: 0.000
uptime: 39
build: table: 0x419cb880
logger_pid: 1748
config: /home/unera/work/tarantool/test/box/tarantool_good.cfg
...
box.info.status, box.info.pid, box.info.lsn, ...
Example
localhost> lua box.info.pid
---
 - 1747
...
localhost> lua box.info.logger_pid
---
 - 1748
...
localhost> lua box.info.version
---
 - 1.4.7-92-g4ba95ca
...
localhost> lua box.info.config
---
 - /home/unera/work/tarantool/test/box/tarantool_good.cfg
...
localhost> lua box.info.uptime
---
 - 3672
...
localhost> lua box.info.lsn
---
 - 1712
...
localhost> lua box.info.status
---
 - primary
...
localhost> lua box.info.recovery_lag
---
 - 0.000
...
localhost> lua box.info.recovery_last_update
---
 - 1306964594.980
...
localhost> lua box.info.snapshot_pid
---
 - 0
...
localhost> lua for k, v in pairs(box.info.build) do print(k .. ': ', v) end
---
flags:  -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DCORO_ASM -fno-omit-frame-pointer -fno-stack-protector -fexceptions -funwind-tables -fgnu89-inline -pthread  -Wno-sign-compare -Wno-strict-aliasing -std=gnu99 -Wall -Wextra -Werror
target: Linux-x86_64-Debug
compiler: /usr/bin/gcc
options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_STATIC=OFF -DENABLE_GCOV=OFF -DENABLE_TRACE=ON -DENABLE_BACKTRACE=ON -DENABLE_CLIENT=OFF
...

Package box.slab

This package provides access to slab allocator statistics.

box.slab
Example
localhost> lua box.slab.arena_used
---
 - 4194304
...
localhost> lua box.slab.arena_size
---
 - 104857600
...
localhost> lua for k, v in pairs(box.slab.slabs) do print(k) end
---
64
128
...
localhost> lua for k, v in pairs(box.slab.slabs[64]) do print(k, ':', v) end
---
items:1
bytes_used:160
item_size:64
slabs:1
bytes_free:4194144
...
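
The arena counters can be turned into a single utilization figure. The helper below is a hypothetical sketch: the name arena_usage_percent is not part of the server, and it accepts any table with arena_used and arena_size fields, so box.slab itself can be passed in.

```lua
-- Percentage of the slab arena currently in use.
-- `slab` is a table shaped like box.slab (fields arena_used, arena_size).
-- tonumber() is a precaution in case the fields are not plain Lua numbers.
function arena_usage_percent(slab)
    return 100 * tonumber(slab.arena_used) / tonumber(slab.arena_size)
end

-- With the values from the example above:
-- arena_usage_percent({ arena_used = 4194304, arena_size = 104857600 })  --> 4
```

Inside the server the call would be arena_usage_percent(box.slab).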

Package box.stat

This package provides access to request statistics.

box.stat
Example
localhost> lua box.stat -- a virtual table
---
 - table: 0x41a07a08
...
localhost> lua box.stat() -- a full table (the same)
---
 - table: 0x41a0ebb0
...
localhost> lua for k, v in pairs(box.stat()) do print(k) end
---
DELETE
SELECT
REPLACE
CALL
UPDATE
DELETE_1_3
...
localhost> lua for k, v in pairs(box.stat().DELETE) do print(k, ': ', v) end
---
total: 23210
rps: 22
...
localhost> lua for k, v in pairs(box.stat.DELETE) do print(k, ': ', v) end -- the same
---
total: 23210
rps: 22
...
localhost> lua for k, v in pairs(box.stat.SELECT) do print(k, ': ', v) end
---
total: 34553330
rps: 23
...
localhost>

Additional examples can be found in the open source Lua stored procedures repository and in the server test suite.

Limitations of stored procedures

There are two limitations in stored procedures support one should be aware of: execution atomicity and lack of typing.

Cooperative multitasking environment

Tarantool core is built around a cooperative multitasking paradigm: unless a running fiber deliberately yields control to some other fiber, it is not preempted. Yield points are built into all calls from Tarantool core to the operating system. Any system call which can block is performed in an asynchronous manner, and the fiber waiting on the system call is preempted in favor of a fiber that is ready to run. This model makes all programmatic locks unnecessary: cooperative multitasking ensures that there is no concurrency around a resource, no race conditions and no memory consistency issues.

When requests are small, e.g. simple UPDATE, INSERT, DELETE, SELECT, fiber scheduling is fair: it takes only a little time to process the request, schedule a disk write, and yield to a fiber serving the next client.

A stored procedure, however, can perform complex computations, or be written in such a way that control is not given away for a long time. This can lead to unfair scheduling, when a single client throttles the rest of the system, or to apparent stalls in request processing. Avoiding this situation is the responsibility of the stored procedure author. Most of the box calls, such as box.insert(), box.update(), box.delete() are yield points; box.select() and box.select_range(), however, are not.

It should also be noted that, in the absence of transactions, any yield in a stored procedure is a potential change in the database state. Effectively, it's only possible to have CAS-like (compare-and-swap) atomic stored procedures, i.e. procedures which select and then modify a record. Multiple data change requests always run through a built-in yield point.
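
One way for a stored procedure author to avoid unfair scheduling is to yield explicitly after each batch of work. The following is a minimal sketch, not a prescribed technique: the function name process_in_batches and all of its parameters are hypothetical, and the yield function is passed in so that the caller decides how control is given away.

```lua
-- Apply handle(item) to every element of `items`, calling yield()
-- after each full batch so that other fibers get a chance to run.
-- Returns the number of times control was yielded.
function process_in_batches(items, batch_size, handle, yield)
    local yields = 0
    for i, item in ipairs(items) do
        handle(item)
        if i % batch_size == 0 then
            yield()
            yields = yields + 1
        end
    end
    return yields
end

-- Possible use inside the server (assumes a zero-second sleep is accepted):
-- process_in_batches(keys, 100, handler, function() box.fiber.sleep(0) end)
```

Every explicit yield is also a point at which the database state may change, so a procedure written this way should not assume that data read before a batch is still current after it.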

Lack of field types

When invoking a stored procedure from the binary protocol, it's not possible to convey types of arguments. Tuples are type-agnostic. The conventional workaround is to use strings to pass all (textual and numeric) data.

Defining triggers in Lua

Triggers are Lua scripts invoked by the system upon a certain event. Tarantool currently only supports system-wide triggers, run when a new connection is established or dropped. Since the trigger body is a Lua script, it is external to the server, and a trigger must be set up on each server start. This is most commonly done in the initialization script. Once a trigger for an event exists, it is automatically invoked whenever the event occurs. The performance overhead of triggers, as long as they are not defined, is minimal: merely a pointer dereference and check. If a trigger is defined, its overhead is equivalent to the overhead of calling a stored procedure.

Triggers on connect and disconnect

box.session.on_connect(chunk)

Set a callback (trigger) invoked on each connected session. The callback doesn't get any arguments, but is the first thing invoked in the scope of the newly created session. If the trigger fails by raising an error, the error is sent to the client and the connection is shut down. Returns the old value of the trigger.

Warning

If a trigger always results in an error, it may become impossible to connect to the server to reset it.

box.session.on_disconnect(chunk)
Set a trigger invoked after a client has disconnected. Returns the old value of the trigger. If the trigger raises an error, the error is logged but otherwise is ignored. The trigger is invoked while the session associated with the client still exists and can access session properties, such as id.
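
A minimal sketch of setting both triggers from the initialization script is given below. The function names on_client_connect, on_client_disconnect and install_session_triggers are hypothetical; only the box.session calls come from this guide, and it is assumed that print() output ends up somewhere visible, such as the server log.

```lua
-- Hypothetical trigger bodies; they only read session properties.
function on_client_connect()
    local id = box.session.id()
    -- tostring() keeps the trigger from raising on unexpected values:
    -- a failing on_connect trigger would shut the connection down.
    print("session " .. tostring(id) .. " connected from " ..
          tostring(box.session.peer(id)))
end

function on_client_disconnect()
    print("session " .. tostring(box.session.id()) .. " disconnected")
end

-- Call once per server start, e.g. from the initialization script.
-- Returns the previous triggers so that they can be restored later.
function install_session_triggers()
    local old_connect = box.session.on_connect(on_client_connect)
    local old_disconnect = box.session.on_disconnect(on_client_disconnect)
    return old_connect, old_disconnect
end
```

Keeping the trigger bodies this small limits the risk described in the warning above, where a trigger that always fails makes it impossible to connect.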

Chapter 5. Replication

To set up replication, it's necessary to prepare the master, configure a replica, and establish procedures for recovery from a degraded state.

Replication architecture

A replica gets all updates from the master by continuously fetching and applying its write ahead log (WAL). Each record in the WAL represents a single Tarantool command such as INSERT or UPDATE or DELETE, and is assigned a monotonically growing log sequence number (LSN). In essence, Tarantool replication is row-based: all data change commands are fully deterministic and operate on a single record.

A stored program invocation does not enter the Write Ahead Log. Instead, log events for actual UPDATEs and DELETEs, performed by the Lua code, are written to the log. This ensures that possible non-determinism of Lua does not cause replication to go out of sync.

For replication to work correctly, the latest LSN on the replica must match or fall behind the latest LSN on the master. If the replica had its own updates, this would lead to it getting out of sync, since updates from the master having identical LSNs would not be applied. In fact, if replication is ON, Tarantool does not accept updates, even on its primary_port.

Setting up the master

To prepare the master for connections from the replica, it's only necessary to enable replication_port in the configuration file. An example configuration file can be found in test/replication/cfg/master.cfg. A master with enabled replication_port can accept connections from as many replicas as necessary on that port. Each replica has its own replication state.

Setting up a replica

A server, whether master or replica, always requires a valid snapshot file to boot from. For a master, a snapshot file is usually prepared with the --init-storage option. For a replica, it's usually copied from the master.

To start replication, configure replication_source. Other parameters can also be changed, but existing spaces and their primary keys on the replica must be identical to the ones on the master.

Once connected to the master, the replica requests all changes that happened after the latest local LSN. It is therefore necessary to keep WAL files on the master host as long as there are replicas that haven't applied them yet. An example configuration can be found in test/replication/cfg/replica.cfg.

If required WAL files are absent, a replica can be "re-seeded" at any time with a newer snapshot file, manually copied from the master.

Note

Replication parameters are "dynamic", which allows the replica to become a master and vice versa with the help of the RELOAD CONFIGURATION statement.

Recovering from a degraded state

"Degraded state" is a situation when the master becomes unavailable -- due to hardware or network failure, or due to a programming bug. There is no reliable way for a replica to detect that the master is gone for good, since sources of failure and replication environments vary significantly.

A separate monitoring script (or scripts, if a decision-making quorum is desirable) is necessary to detect a master failure. Such a script would typically try to update a tuple in an auxiliary space on the master, and raise an alarm if a network or disk error persists for longer than is acceptable.
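
The "persists for longer than is acceptable" part of such a script can be sketched as a counter of consecutive failed probes. This is an illustration only: the name make_failure_detector and its parameters are hypothetical, and the probe is any function supplied by the caller that returns a true value on success and returns false or nil, or raises an error, on failure.

```lua
-- Build a checker that reports failure only after `max_failures`
-- consecutive unsuccessful probes. A successful probe resets the count.
function make_failure_detector(probe, max_failures)
    local failures = 0
    return function()
        local ok, result = pcall(probe)
        if ok and result then
            failures = 0
        else
            failures = failures + 1
        end
        return failures >= max_failures
    end
end
```

If the monitor is itself written in Lua on another Tarantool server, the probe could wrap a request made through a box.net.box connection to the master. The text above recommends a probe that updates a tuple in an auxiliary space, since that also exercises the disk; a conn:ping() probe would be a simplification that only checks the network path.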

When a master failure is detected, the following needs to be done:

  • First and foremost, make sure that the master does not accept updates. This is necessary to prevent the situation when, should the master failure end up being transient, some updates still go to the master, while others already end up on the replica.

    If the master is available, the easiest way to turn on read-only mode is to turn Tarantool into a replica of itself. This can be done by setting the master's replication_source to point to self.

    If the master is not available, the best bet is to log into the machine and kill the server, or change the machine's network configuration (DNS, IP address).

    If the machine is not available, it's perhaps prudent to power it off.

  • Record the replica's LSN, by issuing SHOW INFO. This LSN may prove useful if there are updates on the master that never reached the replica.

  • Promote the replica to become a master. This is done by setting replication_source on the replica to an empty string.

  • Change the application configuration to point to the new master. This can be done either by changing the application's internal routing table, or by setting up the old master's IP address on the new master's machine, or using some other approach.

  • Recover the old master. If there are updates that didn't make it to the new master, they have to be applied manually. You can use the Tarantool command line client to read the server log files.

Chapter 6. Server administration

Typical server administration tasks include starting and stopping the server, reloading configuration, taking snapshots, and rotating logs.

Server signal handling

The server is configured to shut down gracefully on SIGTERM, SIGINT (keyboard interrupt) or SIGHUP. SIGUSR1 can be used to save a snapshot. All other signals are blocked or ignored. The signals are processed in the main event loop. Thus, if the control flow never reaches the event loop (because of a runaway stored procedure), the server stops responding to any signal, and can only be killed with SIGKILL (this signal cannot be ignored).

Utility tarantar

The tarantar utility program will create new snapshots by reading existing snapshots and write-ahead-log (xlog) files. Thus it differs from SAVE SNAPSHOT, which creates new snapshots from the database. Since tarantar uses less memory than SAVE SNAPSHOT, it is especially appropriate for taking periodic snapshots as a background task.

To prepare: ensure that the configuration file contains wal_dir and snap_dir clauses. Tarantar does not assume that wal_dir and snap_dir have default values.

To run:

 tarantar [options] configuration-file

where possible options are:

-i seconds-count or --interval seconds-count — repeat every seconds-count seconds. example: -i 3600
-n lsn-number or --lsn lsn-number — start from lsn = lsn-number. if not specified, lsn = latest. example: -n 5
-l bytes-count or --limit bytes-count — do not use more than bytes-count bytes of memory. example: -l 5000000
--help — display a help message and exit. example: --help
-v or --version — display version and exit. example: -v

Example:

$ ~/tarantool/client/tarantar/tarantar -c -i 30 ./tarantool.cfg
snap_dir: /home/user/tarantool_test/work_dir
wal_dir:  /home/user/tarantool_test/work_dir
spaces:   1
interval: 30
memory_limit: 0M

START SNAPSHOTTING Fri Oct 25 09:35:25 2013

last snapshot lsn: 7
(snapshot) 00000000000000000007.snap 0.000M processed

( >> ) 00000000000000000006.snap 0.000M processed

START SNAPSHOTTING Fri Oct 25 09:35:55 2013

last snapshot lsn: 7
(snapshot) 00000000000000000007.snap 0.000M processed

( >> ) 00000000000000000006.snap 0.000M processed

snapshot exists, skip.

...

For an explanation of tarantar's design see the Tarantool wiki.

Utility tarancheck

The tarancheck utility program will generate and verify signature files. A signature file contains, along with basic information that identifies the database, checksums calculated for each index in each space of the database, based on the latest snapshot and all subsequent entries in the write-ahead log. Signature files are useful for ensuring that databases have been saved without error, and for quick comparisons to see whether a database's components have been modified.

The main reason that tarancheck was created was so that users would be able to compare the consistency of two running servers, the master and the replica. By creating a signature file on the master using the master directory, and then copying the signature file to the replica, one will be able to confirm that the replica is not corrupt.

There is one necessary warning. Since either the master or the replica is likely to be active when tarancheck runs, the check can only be applicable for the database as of the last transaction that was run on both the master and the replica. That is why tarancheck displays last_xlog_lsn, which is the log sequence number of the write-ahead log, when it finishes.

To prepare: ensure that the configuration file contains wal_dir and snap_dir clauses. Tarancheck does not assume that wal_dir and snap_dir have default values.

To run:

 tarancheck [options] configuration-file

where possible options are:

-G signature file or --generate signature-file — generate signature file. example: -G x.crc
-V signature file or --verify signature-file — verify signature file. example: --verify x.crc
--help — display a help message and exit. example: --help
-v or --version — display version and exit. example: -v

Example:

$ ~/tarantool/client/tarantar/tarancheck --generate=x.crc tarantool.cfg
>>> Signature file generation
configured spaces: 1
snap_dir: ./work_dir
wal_dir: ./work_dir
last snapshot lsn: 1
last xlog lsn: 0
(snapshot) 00000000000000000001.snap
(signature) saving x.crc

Utility tarantool_deploy

With tarantool_deploy one can arrange for one or more instances of the tarantool_box server to start during system boot. This utility is for use on Red Hat or CentOS where Tarantool was installed using rpm --install.

Technically, tarantool_deploy will place instructions in /etc/init.d which will initiate tarantool_box with appropriate options and with settings that maximize resource usage. The root password is necessary. These options are available, as shown by tarantool_deploy --help:

Tarantool deployment script: add more Tarantool instances.
usage: tarantool_deploy.sh [options] <instance>

  --prefix <path>       installation path (/usr)
  --prefix_etc <path>   installation etc path (/etc)
  --prefix_var <path>   installation var path (/var)

  --status              display deployment status
  --dry                 don't create anything, show commands

  --debug               show commands
  --yes                 don't prompt
  --help                display this usage

The default prefixes (/usr and /etc and /var) are appropriate if a Tarantool installation was done with default settings, for example tarantool_box should be in /usr/bin. The only necessary argument is the "instance", which is an arbitrary numeric identification formatted as digit.digit. The following is a sample run:

$ tarantool_deploy.sh 0.1
tarantool_deploy.sh: About to deploy Tarantool instance 0.1.
tarantool_deploy.sh: Continue? [n/y]
y
tarantool_deploy.sh: >>> deploy instance 0.1
tarantool_deploy.sh: >>> updating deployment config
tarantool_deploy.sh: done

System-specific administration notes

This section provides a cheat sheet for the most common server management routines on every supported operating system.

Debian GNU/Linux and Ubuntu

Setting up an instance: ln -s /etc/tarantool/instances.available/instance-name.cfg /etc/tarantool/instances.enabled/

Starting all instances: service tarantool start

Stopping all instances: service tarantool stop

Starting/stopping one instance: service tarantool-instance-name start/stop

Fedora, RHEL, CentOS

tba

FreeBSD

tba

Mac OS X

tba

Chapter 7. Configuration reference

This chapter provides a reference of options which can be set in the command line or tarantool.cfg configuration file.

Tarantool splits its configuration parameters between command line options and a configuration file. Command line options are provided only for the most basic properties; the rest must be set in the configuration file. At runtime, this makes the source of a configuration setting unambiguous: it comes either from the command line or from the configuration file, but never from both.

Command line options

Tarantool follows the GNU standard for its command line interface: long options start with a double dash (--option), their short counterparts use a single one (-o). For phrases, both dashes and underscores can be used as word separators (--cfg-get and --cfg_get both work). If an option requires an argument, you can separate it with either a space or an equals sign (--cfg-get=pid_file and --cfg-get pid_file both work).
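The two equivalences described above (dash versus underscore, space versus equals sign) can be illustrated with a small Python sketch. This is not the server's option parser; the function name normalize_long_options is hypothetical, and the sketch assumes that any word not starting with a dash is the value of the preceding option.

```python
def normalize_long_options(argv):
    """Return (name, value) pairs for GNU-style long options.

    Dashes and underscores in the option name are treated alike, and a
    value may follow either '=' or a space.
    """
    pairs = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if not arg.startswith("--"):
            raise ValueError("not a long option: " + arg)
        name, sep, value = arg[2:].partition("=")
        name = name.replace("_", "-")          # only the name is normalized
        if not sep:
            # No '=': take the next word as the value unless it is an option.
            if i + 1 < len(argv) and not argv[i + 1].startswith("-"):
                value = argv[i + 1]
                i += 1
            else:
                value = None
        pairs.append((name, value))
        i += 1
    return pairs
```

With this sketch, ["--cfg-get=pid_file"] and ["--cfg_get", "pid_file"] produce the same pair, while an option without a value, such as --check-config, yields None as its value.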

  • --help, -h

    Print an annotated list of all available options and exit.

  • --version, -V

    Print product name and version, for example:

    $  ./tarantool_box --version
    Tarantool 1.4.0-69-g45551dd
            

    In this example:

    Tarantool is the name of the reusable asynchronous networking programming framework.
    Box is the name of the storage back-end.
    The 3-number version follows the standard <major>.<minor>.<patch> scheme, in which the <major> number is changed only rarely, <minor> is incremented for each new milestone and indicates possible incompatible changes, and <patch> stands for the number of bug fix releases made after the start of the milestone. The optional commit number and commit SHA1 are output for non-released versions only, and indicate how much this particular build has diverged from the last release.

    Note

    Tarantool uses git describe to produce its version id, and this id can be used at any time to check out the corresponding source from our git repository.

  • --config=/path/to/config.file, -c

    Tarantool does not start without a configuration file. By default, the server looks for a file named tarantool.cfg in the current working directory. An alternative location can be provided using this option.

  • --check-config

    Check the configuration file for errors. This option is normally used on the command line before RELOAD CONFIGURATION is issued on the administrative port, to ensure that the new configuration is valid. When the configuration is correct, the program produces no output and returns 0. Otherwise, information about the discovered error is printed and the program terminates with a non-zero value.

  • --cfg-get=option_name

    Given an option name, print the option's value. If the option does not exist, or the configuration file is incorrect, an error is returned. If the option is not explicitly specified, its default value is used instead. Example:

    $ ./tarantool_box --cfg-get=admin_port
    33015   

  • --init-storage

    Initialize the directory specified in the vardir configuration option by creating an empty snapshot file in it. If vardir doesn't contain at least one snapshot, the server does not start. The server deliberately does not initialize vardir automatically on boot, so that potential system errors are more noticeable. For example, if the operating system reboots and fails to mount the partition on which vardir is expected to reside, the rc.d or service script responsible for server restart will also fail, because initialization only happens when this option is given.
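As an illustration of the version id printed by --version, the following Python sketch splits a string such as 1.4.0-69-g45551dd into its parts. The function name parse_version and the returned dictionary keys are hypothetical; the pattern assumes the usual git describe layout, in which the abbreviated SHA1 is prefixed with the letter g.

```python
import re

# <major>.<minor>.<patch>, optionally followed by -<commits>-g<sha1>
VERSION_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+))?$"
)

def parse_version(text):
    """Split a version id into numbers, commit count and abbreviated SHA1."""
    m = VERSION_RE.match(text)
    if m is None:
        raise ValueError("unrecognized version id: " + text)
    parts = m.groupdict()
    return {
        "major": int(parts["major"]),
        "minor": int(parts["minor"]),
        "patch": int(parts["patch"]),
        # Released versions carry no commit count or SHA1.
        "commits": int(parts["commits"]) if parts["commits"] else 0,
        "sha": parts["sha"],
    }
```

For a released version the commit count is reported as 0 and the SHA1 as None, which matches the statement that these two parts appear for non-released builds only.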

The only two options which could affect a running server are:

  • --verbose, -v

    Increase verbosity level in log messages. This option currently has no effect.

  • --background, -b

    Detach from the controlling terminal and run in the background.

    Caution

    Tarantool uses stdout and stderr for debug and error log output. When starting the server with option --background, make sure to either redirect its standard out and standard error streams, or provide logger option in the configuration file, since otherwise all logging information will be lost.

The configuration file

All advanced configuration parameters must be specified in a configuration file, which is required for server start. If no path to the configuration file is specified on the command line (see --config), the server looks for a file named tarantool.cfg in the current working directory.

To facilitate centralized and automated configuration management, runtime configuration modifications are supported solely through the RELOAD CONFIGURATION administrative statement. Thus, the procedure to change Tarantool configuration at runtime is to edit the configuration file. This ensures that, should the server get killed or restart, no unexpected changes to configuration can occur.

Not all configuration file settings are changeable at runtime: such settings will be highlighted in this reference. If the same setting is given more than once, the latest occurrence takes effect. You can always invoke SHOW CONFIGURATION from the administrative console to show the current configuration.
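The "latest occurrence takes effect" rule can be shown with a minimal Python sketch. It is not Tarantool's configuration parser: the function name parse_settings is hypothetical, and the sketch only understands flat name = value lines, not the composite space syntax.

```python
def parse_settings(text):
    """Parse flat 'name = value' lines; a repeated name overwrites the earlier one."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected name = value: " + line)
        settings[name.strip()] = value.strip()   # later lines win
    return settings
```

Given a file that sets log_level to 4 and later to 5, the resulting dictionary holds 5 only, mirroring what SHOW CONFIGURATION would be expected to report.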

Tarantool maintains a set of all allowed configuration parameters in two template files, which are easy to maintain and extend: cfg/core_cfg.cfg_tmpl, src/box/box_cfg.cfg_tmpl. These files can always be used as a reference for any parameter in this manual.

In addition, two working examples can be found in the source tree: test/box/tarantool.cfg, test/big/tarantool.cfg.

Table 7.1. Basic parameters

Name | Type | Default | Required? | Dynamic? | Description
username | string | "" | no | no | UNIX user name to switch to after start.
work_dir | string | "" | no | no | A directory where database working files will be stored. The server switches to work_dir with chdir(2) after start. Can be relative to the current directory. If not specified, defaults to the current directory.
script_dir | string | "" | no | no | If this path is set, it is added to the Lua package search path, so that instance-specific Lua scripts can be loaded and executed. If the directory specified in the path contains an init.lua file, it is loaded and executed at server start.
wal_dir | string | "" | no | no | A directory where write ahead log (.xlog) files are stored. Can be relative to work_dir. Most commonly used so that snapshot files and write ahead log files can be stored on separate disks. If not specified, defaults to work_dir.
snap_dir | string | "" | no | no | A directory where snapshot (.snap) files will be stored. Can be relative to work_dir. If not specified, defaults to work_dir. See also wal_dir.
bind_ipaddr | string | "INADDR_ANY" | no | no | The network interface to bind to. By default, the server binds to all available addresses. Applies to all ports opened by the server.
primary_port | integer | none | yes | no | The read/write data port. Has no default value, so must be specified in the configuration file. Normally set to 33013. Note: a replica also binds to this port, and accepts connections, but these connections can only serve reads until the replica becomes a master.
secondary_port | integer | none | no | no | Additional, read-only port. Normally set to 33014. Not used unless set.
admin_port | integer | none | no | no | The TCP port to listen on for administrative connections. Has no default value. Not used unless assigned a value. Normally set to 33015.
pid_file | string | tarantool.pid | no | no | Store the process id in this file. Can be relative to work_dir.
custom_proc_title | string | "" | no | no | Inject the given string into the server process title (what's shown in the COMMAND column for ps and top commands).

For example, ordinarily ps shows the Tarantool server process thus:

kostja@shmita:~$ ps -a -o command | grep box
tarantool_box: primary pri: 33013 sec: 33014 adm: 33015

But if the configuration file contains custom_proc_title=sessions then the output looks like:

kostja@shmita:~$ ps -a -o command | grep box
tarantool_box: primary@sessions pri: 33013 sec: 33014 adm: 33015

Table 7.2. Configuring the storage

Name | Type | Default | Required? | Dynamic? | Description
slab_alloc_arena | float | 1.0 | no | no | How much memory Tarantool allocates to actually store tuples, in gigabytes. When the limit is reached, INSERT or UPDATE requests begin failing with error ER_MEMORY_ISSUE. While the server does not go beyond the defined limit to allocate tuples, there is additional memory used to store indexes and connection information. Depending on actual configuration and workload, Tarantool can consume up to 20% more than the limit set here.
slab_alloc_minimal | integer | 64 | no | no | Size of the smallest allocation unit. It can be tuned down if most of the tuples are very small.
slab_alloc_factor | float | 2.0 | no | no | Use slab_alloc_factor as the multiplier for computing the sizes of memory chunks that tuples are stored in. A lower value may result in less wasted memory depending on the total amount of memory available and the distribution of item sizes.
space | array of objects | none | yes | no | This is the main Tarantool parameter, describing the data structure that users get access to via the client/server protocol. It holds an array of entries, and each entry describes a tuple set and its indexes. Every entry is a composite object, best seen as a C programming language "struct" [a].
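To make the effect of slab_alloc_minimal and slab_alloc_factor more concrete, here is a rough Python sketch. It assumes that chunk sizes form a simple geometric progression starting at slab_alloc_minimal, which is a simplification and not necessarily how the allocator rounds sizes; the function names and the largest cut-off parameter are hypothetical. The second function only applies the "up to 20% more" estimate quoted for slab_alloc_arena.

```python
def slab_size_classes(minimal=64, factor=2.0, largest=4096):
    """List chunk sizes minimal, minimal*factor, ... up to 'largest' (assumed cut-off)."""
    sizes = []
    size = float(minimal)
    while size <= largest:
        sizes.append(int(size))
        size *= factor
    return sizes

def worst_case_memory_gb(slab_alloc_arena=1.0):
    """Upper estimate of total memory use: the arena plus up to 20% overhead."""
    return slab_alloc_arena * 1.2
```

Under these assumptions the default settings give chunk sizes 64, 128, 256 and so on, while a factor of 1.5 gives 64, 96, 144, 216: more size classes, so a tuple is less likely to land in a chunk much larger than itself.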

[a]

Space settings explained

Space is a composite parameter, that is, it has properties.

/*
 * Each tuple consists of fields. Three field types are
 * supported.
 */

enum field_type { STR, NUM, NUM64 };

/*
 * Tarantool is interested in field types only inasmuch as
 * it needs to build indexes on fields. An index
 * can cover one or more fields.
 */

struct index_field_t {
  unsigned int fieldno;
  enum field_type type;
};

/*
 * HASH and TREE and BITSET index types are supported.
 */

enum index_type { HASH, TREE, BITSET };

struct index_t {
  struct index_field_t key_field[];
  enum index_type type;
  /* Secondary index may be non-unique */
  bool unique;
};

struct space_t
{
  /* A space can be quickly disabled and re-enabled at run time. */
  bool enabled;
  /*
   * If cardinality is given, each tuple in the space must have exactly
   * this many fields.
   */
  unsigned int cardinality;
  /* estimated_rows is only used for HASH indexes, to preallocate memory. */
  unsigned int estimated_rows;
  struct index_t index[];
};

The way a space is defined in a configuration file is similar to how one would initialize a C structure in a program. For example, a minimal storage configuration looks like the following:

space[0].enabled = 1
space[0].index[0].type = HASH
space[0].index[0].unique = 1
space[0].index[0].key_field[0].fieldno = 0
space[0].index[0].key_field[0].type = NUM64

The parameters listed above are mandatory. Other space properties are set in the same way. An alternative syntax, mainly useful when defining large spaces, exists:

space[0] = {
    enabled = 1,
    index = [
        {
            type = HASH,
            unique = 1,
            key_field = [
                {
                    fieldno = 0,
                    type = NUM64
                }
            ]
        }
    ]
}

When defining a space, please be aware of these restrictions:

  • at least one space must be configured,
  • each configured space needs at least one unique index,
  • "unique" property doesn't have a default, and must be set explicitly,
  • space configuration can not be changed dynamically; currently you need to restart the server even to disable or enable a space,
  • HASH indexes can not be non-unique.
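The restrictions above can be expressed as a short checking routine. The Python sketch below is only an illustration: it assumes the space configuration has already been loaded into a list of dictionaries (a hypothetical in-memory representation, not something the server exposes), and the function name check_spaces is likewise made up.

```python
def check_spaces(spaces):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if not spaces:
        problems.append("at least one space must be configured")
    for s, space in enumerate(spaces):
        has_unique = False
        for i, index in enumerate(space.get("index", [])):
            where = "space[%d].index[%d]" % (s, i)
            if "unique" not in index:
                # 'unique' has no default value
                problems.append(where + ": 'unique' must be set explicitly")
                continue
            if index["unique"]:
                has_unique = True
            elif index.get("type") == "HASH":
                problems.append(where + ": HASH indexes can not be non-unique")
        if not has_unique:
            problems.append("space[%d]: needs at least one unique index" % s)
    return problems
```

A space with a single unique HASH index passes with no problems reported, whereas a space whose only index is a non-unique HASH index is reported twice: once for the index type and once for the missing unique index.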


Table 7.3. Binary logging and snapshots

Name | Type | Default | Required? | Dynamic? | Description
panic_on_snap_error | boolean | true | no | no | If there is an error reading the snapshot file (at server start), abort.
panic_on_wal_error | boolean | false | no | no | If there is an error reading a write ahead log file (at server start), abort.
rows_per_wal | integer | 500000 | no | no | How many log records to store in a single write ahead log file. When this limit is reached, Tarantool creates another WAL file named <first-lsn-in-wal>.xlog. This can be useful for simple rsync-based backups.
snap_io_rate_limit | float | 0.0 | no | yes | Reduce the throttling effect of SAVE SNAPSHOT on INSERT/UPDATE/DELETE performance by setting a limit on how many megabytes per second it can write to disk. The same can be achieved by splitting wal_dir and snap_dir locations and moving snapshots to a separate disk.
wal_fsync_delay | float | 0 | no | yes | Do not flush the write ahead log to disk more often than once in wal_fsync_delay seconds. By default the delay is zero, which means there is no flushing after writes (the meaning of wal_fsync_delay=0 may change in later versions). Setting the delay may be necessary to increase write throughput, but may lead to several last updates being lost in case of a power failure. Such failure, however, does not lead to data corruption: all WAL records have a checksum, and only complete records are processed during recovery.
wal_mode | string | "fsync_delay" | no | yes | Specify fiber-WAL-disk synchronization mode as: none: write ahead log is not maintained; write: fibers wait for their data to be written to the write ahead log (no fsync(2)); fsync: fibers wait for their data, fsync(2) follows each write(2); fsync_delay: fibers wait for their data, fsync(2) is called every N=wal_fsync_delay seconds (N=0.0 means no fsync(2), equivalent to wal_mode = "write").
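The <first-lsn-in-wal>.xlog naming can be sketched in Python as follows. The zero-padding to 20 digits is an assumption based on the sample file names shown elsewhere in this guide (for example the snapshot named after lsn 1); the function names are hypothetical.

```python
def wal_file_name(first_lsn):
    """Build the name of the WAL file whose first record has the given lsn."""
    return "%020d.xlog" % first_lsn        # assumed: zero-padded to 20 digits

def first_lsn_of(file_name):
    """Recover the lsn from a .xlog or .snap file name."""
    stem, dot, ext = file_name.rpartition(".")
    if ext not in ("xlog", "snap") or not stem.isdigit():
        raise ValueError("not a WAL or snapshot file name: " + file_name)
    return int(stem)
```

Because the names sort in lsn order, a backup script could use a helper such as first_lsn_of to pick out only the files written after the last copied lsn.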

Table 7.4. Replication

Name | Type | Default | Required? | Dynamic? | Description
replication_port | integer | 0 | no | no | If replication_port is greater than zero, the server is considered to be a Tarantool master. The master server listens on the specified port for incoming connections from replicas. See also replication_source, which complements this setting on the replica side.
replication_source | string | NULL | no | yes | If replication_source is not an empty string, the server is considered to be a Tarantool replica. The replica server will try to connect to the master which replication_source specifies with format ip:port. For example, if replication_source = "1.2.3.4:55555" then the replica server tries to connect to 1.2.3.4 port 55555. A replica server does not accept updates on primary_port. This parameter is dynamic, that is, to enter master mode, simply set replication_source to an empty string and issue RELOAD CONFIGURATION.
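The ip:port format of replication_source can be illustrated with a small Python sketch. The function name parse_replication_source is hypothetical, and returning None for an empty string is simply this sketch's way of representing "no master configured".

```python
def parse_replication_source(value):
    """Split 'ip:port' into (ip, port); an empty string means master mode."""
    if value == "":
        return None                        # not a replica
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected ip:port, got: " + value)
    port_no = int(port)
    if not 0 < port_no < 65536:
        raise ValueError("port out of range: " + port)
    return host, port_no
```

For the value used in the table, "1.2.3.4:55555", the sketch returns the address 1.2.3.4 and the port 55555.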

Table 7.5. Networking

Name | Type | Default | Required? | Dynamic? | Description
io_collect_interval | float | 0.0 | no | yes | The server will sleep for io_collect_interval seconds between iterations of the event loop. Can be used to reduce CPU load in deployments in which the number of client connections is large, but requests are not so frequent (for example, each connection issues just a handful of requests per second).
readahead | integer | 16384 | no | no | The size of the read-ahead buffer associated with a client connection. The larger the buffer, the more memory an active connection consumes and the more requests can be read from the operating system buffer in a single system call. The rule of thumb is to make sure the buffer can contain at least a few dozen requests. Therefore, if a typical tuple in a request is large, e.g. a few kilobytes or even megabytes, the read-ahead buffer size should be increased. If batched request processing is not used, it's prudent to leave this setting at its default.
backlog | integer | 1024 | no | no | The size of the listen backlog.

Table 7.6. Logging

Name | Type | Default | Required? | Dynamic? | Description
log_level | integer | 4 | no | yes | How verbose the logging is. There are 5 log verbosity classes: 1 -- ERROR, 2 -- CRITICAL, 3 -- WARNING, 4 -- INFO, 5 -- DEBUG. By setting log_level, one can enable logging of all classes below or equal to the given level. Tarantool prints its logs to the standard error stream by default, but this can be changed with the "logger" configuration parameter.
logger | string | "" | no | no | By default, the log is sent to the standard error stream (stderr). If logger is given a value, Tarantool creates a child process, executes the command indicated by the value, and pipes its standard output to the standard input of the created process. Example setting: tee --append tarantool.log (this will duplicate log output to stdout and a log file).
logger_nonblock | integer | 0 | no | no | If logger_nonblock equals 1, Tarantool does not block on the log file descriptor when it's not ready for write, and drops the message instead. If log_level is high, and a lot of messages go to the log file, setting logger_nonblock to 1 may improve logging performance at the cost of some log messages getting lost.
too_long_threshold | float | 0.5 | no | yes | If processing a request takes longer than the given value (in seconds), warn about it in the log. Has effect only if log_level is greater than or equal to 3 (WARNING), since warnings are not logged at lower levels.
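The rule that log_level enables "all classes below or equal to the given level" can be written down as a short Python sketch. The class numbers and names are taken from the table above; the function name enabled_classes is hypothetical.

```python
# Verbosity classes as listed for log_level
LOG_CLASSES = {1: "ERROR", 2: "CRITICAL", 3: "WARNING", 4: "INFO", 5: "DEBUG"}

def enabled_classes(log_level):
    """Return the names of the classes that are logged at the given log_level."""
    return [name for level, name in sorted(LOG_CLASSES.items())
            if level <= log_level]
```

At the default level of 4, everything except DEBUG is logged; at level 2, only ERROR and CRITICAL messages appear.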

Table 7.7. Memcached protocol support

Name | Type | Default | Required? | Dynamic? | Description
memcached_port | integer | none | no | no | Turn on Memcached protocol support on the given port. All requests on this port are directed to a dedicated space, set in memcached_space. Memcached-style flags are supported and stored along with the value. The expiration time can also be set and is persistent, but is ignored, unless memcached_expire is turned on. Unlike Memcached, all data still goes to the binary log and to the replica, if the latter is set up, which means that power outage does not lead to loss of all data. Thanks to data persistence, cache warm up time is also very short.
memcached_space | integer | 23 | no | no | Space id to store memcached data in. The format of tuple is [key, metadata, value], with a HASH index based on the key. Since the space format is defined by the Memcached data model, it must not be previously configured.
memcached_expire | boolean | false | no | no | Turn on tuple time-to-live support in memcached_space. This effectively turns Tarantool into a persistent, replicated and scriptable implementation of Memcached.
memcached_expire_per_loop | integer | 1024 | no | yes | How many records to consider per iteration of the expiration loop. Tuple expiration is performed in a separate green thread within our cooperative multitasking framework and this setting effectively limits how long the expiration loop stays uninterrupted on the CPU.
memcached_expire_full_sweep | float | 3600 | no | yes | Try to make sure that every tuple is considered for expiration within this time frame (in seconds). Together with memcached_expire_per_loop this defines how often the expiration green thread is scheduled on the CPU.
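How memcached_expire_per_loop and memcached_expire_full_sweep interact can be estimated with simple arithmetic: to look at every tuple within the sweep time while handling a fixed number of tuples per iteration, the loop has to run a certain number of times, and the sweep time divided by that number gives the pause between iterations. The Python sketch below shows this reasoning only; it is not the server's actual scheduling formula, and the function name is hypothetical.

```python
import math

def expire_loop_delay(total_tuples, per_loop=1024, full_sweep=3600.0):
    """Estimate the pause (seconds) between iterations of the expiration loop."""
    # Number of iterations needed to look at every tuple once.
    iterations = max(1, math.ceil(total_tuples / per_loop))
    return full_sweep / iterations
```

With the default settings and about 3.7 million tuples (1024 * 3600), this estimate comes to one iteration per second; with fewer tuples the loop would need to run less often.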

Chapter 8. Connectors

This chapter documents APIs for various programming languages.

Apart from the native Tarantool client driver, you can always use a Memcached driver of your choice, after enabling Memcached protocol in the configuration file.

Packet example

The Tarantool API exists so that a client program can send a request packet to the server, and receive a response. Here is an example of what the client would send for INSERT INTO t0 VALUES ('A','BB'). The BNF description of the components is in file doc/box-protocol.txt.

Component | Byte#0 | Byte#1 | Byte#2 | Byte#3
type | 13 | 0 | 0 | 0
body_length | 17 | 0 | 0 | 0
request_id | 1 | 0 | 0 | 0
space_no | 0 | 0 | 0 | 0
flags | 2 | 0 | 0 | 0
cardinality | 2 | 0 | 0 | 0
field[0] size | 1
field[0] data | 65
field[1] size | 2
field[1] data | 66 | 66

Now, one could send that packet to the tarantool_box server, and interpret the response (box-protocol.txt has a description of the packet format for responses as well as requests). But it would be easier, and less error-prone, if one could invoke a routine that formats the packet according to typed parameters. Something like response=tarantool_routine("insert",0,"A","BB");. And that is why APIs exist for drivers for C, Perl, Python, PHP, Ruby, and so on.
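As a rough illustration of the byte layout in the packet table above, the following Python sketch builds the same 29 bytes with the standard struct module. It is not one of the connectors: the function names are hypothetical, the little-endian 32-bit layout is read off the table, and the sketch only handles field lengths that fit in a single byte (longer fields are not covered here; see doc/box-protocol.txt).

```python
import struct

INSERT = 13   # request type shown in the packet table

def encode_field(data):
    """Prefix a field with its length (single-byte lengths only in this sketch)."""
    if len(data) > 127:
        raise ValueError("sketch handles only single-byte field lengths")
    return bytes([len(data)]) + data

def build_insert(request_id, space_no, flags, fields):
    """Assemble header (type, body_length, request_id) followed by the body."""
    body = struct.pack("<III", space_no, flags, len(fields))   # cardinality = field count
    body += b"".join(encode_field(f) for f in fields)
    header = struct.pack("<III", INSERT, len(body), request_id)
    return header + body

packet = build_insert(1, 0, 2, [b"A", b"BB"])
```

The body is 17 bytes long (three 4-byte integers plus 1 + 1 + 1 + 2 bytes of field data), which is where the body_length value in the table comes from.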

C

Here is a complete C program that inserts ['A','BB'] into space[0] via the C API for the binary protocol. To compile, paste the code into a file named example.c and say gcc -o example example.c -I/tarantool-directory/connector/c/include where /tarantool-directory/connector/c/include is the directory that contains the necessary file tp.h; the default library path must contain the directory where the Tarantool library files were placed at installation time. Before trying to run, check that the server (tarantool_box) is running on localhost (127.0.0.1), that its primary port is the default (33013), and that space[0]'s primary key type is string (space[0].index[0].key_field[0].type = "STR" in the configuration file). To run, say ./example. The program will format a buffer for sending an INSERT request, open a socket connection with the tarantool_box server at localhost:33013, send the request, check whether the server returned an error and, if all is well, print "Insert succeeded". If the row already exists, the program will print Duplicate key exists in unique index 0.

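#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>                                            /* exit */
#include <string.h>                                            /* memset */
#include <sys/socket.h>                                        /* socket, connect */
#include <unistd.h>                                            /* read, write, close */
#include <tp.h>                                                /* the usual Tarantool include */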

int main()
{
  struct tp request;                                           /* area for sending to server */
  struct tp reply;                                             /* area for getting server reply */
  int fd;                                                      /* file descriptor for socket */
  struct sockaddr_in tt;                                       /* the usual socket address info */
  tp_init(&request, NULL, 0, tp_realloc, NULL);                /* initialize request buffer */
  tp_insert(&request, 0, 2);                                   /* append INSERT header */
  tp_tuple(&request);                                          /* begin appending body */
  tp_sz(&request,"A");                                         /* append field[0] */
  tp_sz(&request,"BB");                                        /* append field[1] */
  if ((fd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)    /* open the socket, abort if failure */
    exit(1);
  memset(&tt, 0, sizeof(tt));                                  /* connect to localhost:33013 */
  tt.sin_family = AF_INET;
  tt.sin_addr.s_addr = inet_addr("127.0.0.1");
  tt.sin_port = htons(33013);
  if (connect(fd, (struct sockaddr *) &tt, sizeof(tt)) < 0)    /* connect, abort if failure */
    exit(1);
  int rc = write(fd, tp_buf(&request), tp_used(&request));     /* send the INSERT request */
  if (rc != tp_used(&request))                                 /* abort if send failed */
    exit(1);
  tp_init(&reply, NULL, 0, tp_realloc, NULL);                  /* initialize reply buffer */
  while (1) {
    ssize_t to_read = tp_req(&reply);
    if (to_read <= 0)
      break;
    ssize_t new_size = tp_ensure(&reply, to_read);
    if (new_size == -1)                                        /* abort if error e.g. no memory */
      exit(1);
    ssize_t res = read(fd, reply.p, to_read);                  /* get reply */
    if (res <= 0)                                              /* abort if error e.g. no reply */
      exit(1);
    tp_use(&reply, res);
  }
  ssize_t server_code = tp_reply(&reply);                      /* display+abort if error e.g. duplicate key */
  if (server_code != 0) {
    printf("error: %-.*s\n", tp_replyerrorlen(&reply),
           tp_replyerror(&reply));
    tp_free(&reply);
    exit(1);
  }
  tp_free(&request);                                           /* clean up */
  tp_free(&reply);
  close(fd);
  printf("Insert succeeded\n");                                /* congratulate self */
}

The example program only shows one command and does not show all that's necessary for good practice. For that, please see connector/c in the source tree.

Perl

The Perl driver requires a schema definition. Please refer to CPAN module DR::Tarantool.

PHP

Please see tarantool-php project at GitHub.

Python

Here is a complete Python program that inserts ['A','BB'] into space[0] via the high-level Python API. To prepare, paste the code into a file named example.py and say: export PYTHONPATH=tarantool-directory/test/lib where tarantool-directory/test/lib is the directory that contains the necessary file box_connection.py. This will be the directory where Tarantool Python library files were placed at installation time for a source download. Before trying to run, check that the server (tarantool_box) is running on localhost (127.0.0.1), that its primary port is the default (33013), and that space[0]'s primary key type is string (space[0].index[0].key_field[0].type = "STR" in the configuration file). To run, say python example.py. The program will connect to the server, send the request, and display Insert OK, 1 row affected if all went well. If the row already exists, the program will print Duplicate key exists in unique index 0.

#!/usr/bin/python
from box_connection import BoxConnection

c = BoxConnection("127.0.0.1", 33013)
result = c.execute("INSERT INTO t0 VALUES ('A','BB')")
print result

The example program only shows one command and does not show all that's necessary for good practice. For that, please see http://github.com/mailru/tarantool-python and https://github.com/zlobspb/txtarantool.

Ruby

You need Ruby 1.9 or later to use this connector. Connector sources are located in http://github.com/mailru/tarantool-ruby.

Appendix A. Server process titles

Linux and FreeBSD operating systems allow a running process to modify its title, which otherwise contains the program name. Tarantool uses this feature to help meet the needs of system administration, such as figuring out what services are running on a host, what TCP/IP ports are in use, and so on.

The Tarantool process title follows this naming scheme: program_name: role[@custom_proc_title] [ports in use]

program_name is typically tarantool_box. The role can be one of the following:

  • primary -- the master node,

  • replica/IP:port -- a replication node,

  • replication_server -- runs only if replication_port is set, accepts connections on this port and creates a replication_relay for each of them,

  • replication_relay -- a process that serves a single replication connection.

Possible port names are: pri for primary_port, sec for secondary_port, adm for admin_port and memcached for memcached_port.

For example:

  • tarantool_box: primary pri: 50000 sec: 50001 adm: 50002

  • tarantool_box: primary@infobox pri: 15013 sec: 15523 adm: 10012
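A monitoring script could take such a title apart along the lines of the following Python sketch. The regular expressions are written against the naming scheme and the examples above only; the function name parse_title and the result layout are hypothetical.

```python
import re

# program_name: role[@custom_proc_title] [ports in use]
TITLE_RE = re.compile(
    r"^(?P<program>[^:]+): (?P<role>[^@\s]+)"
    r"(?:@(?P<custom>\S+))?(?P<ports>(?: \w+: \d+)*)$"
)
PORT_RE = re.compile(r"(\w+): (\d+)")

def parse_title(title):
    """Split a process title into program, role, custom title and ports."""
    m = TITLE_RE.match(title)
    if m is None:
        raise ValueError("unrecognized process title: " + title)
    ports = {name: int(num) for name, num in PORT_RE.findall(m.group("ports"))}
    return {
        "program": m.group("program"),
        "role": m.group("role"),
        "custom": m.group("custom"),      # None when no custom_proc_title is set
        "ports": ports,
    }
```

Applied to the second example above, the sketch reports the role primary, the custom title infobox, and the three ports keyed by pri, sec and adm.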

Appendix B. List of error codes

In the current version of the binary protocol, the error message, which is normally more descriptive than the error code, is not present in the server response. The actual message may contain a file name, a detailed reason or an operating system error code. All such messages, however, are logged in the error log. When using the Memcached protocol, the error message is sent to the client along with the code. Only general descriptions of some common codes follow. A complete list of errors can be found in the file errcode.h in the source tree.

List of error codes

ER_NONMASTER

Attempt to execute an update on a running replica.

ER_ILLEGAL_PARAMS

Illegal parameters. Malformed protocol message.

ER_MEMORY_ISSUE

Out of memory: slab_alloc_arena limit is reached.

ER_WAL_IO

Failed to record the change in the write ahead log. Some sort of disk error.

ER_KEY_PART_COUNT

Key part count is greater than index part count.

ER_NO_SUCH_SPACE

Attempt to access a space that is not configured (doesn't exist).

ER_NO_SUCH_INDEX

No index with the given id exists.

ER_PROC_LUA

An error inside a Lua procedure.

ER_FIBER_STACK

Recursion limit reached when creating a new fiber. This is usually an indicator of a bug in a stored procedure, recursively invoking itself ad infinitum.

ER_UPDATE_FIELD

An error occurred during update of a field.

Appendix C. Limitations

Number of fields in an index

For BITSET indexes, the maximum is 1. For TREE indexes, the theoretical maximum is about 4 billion (BOX_FIELD_MAX) but the practical maximum is the number of fields in a tuple.

Number of indexes in a space

10 (BOX_INDEX_MAX).

Number of fields in a tuple

There is no theoretical maximum. The practical maximum is whatever is specified by space.cardinality in the configuration file, or the maximum tuple length.

Number of spaces

255.

Number of connections

The practical limit is the number of file descriptors that one can set with the operating system.

Space size

The total maximum size for all spaces is in effect set by the slab_alloc_arena parameter in the configuration file, which in turn is limited by the total available memory.

Update operations count

The maximum number of operations that can be in a single update is 4000 (BOX_UPDATE_OP_CNT_MAX).

Appendix D. Client reference

This appendix shows all legal syntax for the tarantool command-line client, with short notes and examples. Other client programs may have similar options and statement syntaxes.

Conventions used in this appendix

Tokens are character sequences which are treated as syntactic units within statements. Square brackets [ and ] enclose optional syntax. Three dots in a row ... mean the preceding tokens may be repeated. A vertical bar | means the preceding and following tokens are mutually exclusive alternatives.

Options when starting client from the command line

General form: tarantool [option...] [statement]. Statement will be described in a later section. Option is one of the following (in alphabetical order by the long form of the option):

--admin-port

Syntax: short form: -a port-number long form: --a[dmin-port] [=] port-number. Effect: Client will look for the server on the port designated by port-number. Notes: The administrative port is normally set to 33015 in the server configuration file.

--bin

Syntax: short form: -B long form: --b[in]. Effect: When displaying with the Lua printer, treat values with type NUM as if they are type STR, unless they are arguments in updates used for arithmetic. Example: --bin

--cat

Syntax: short form: -C file-name long form: --c[at] file-name. Effect: Client will print the contents of the write-ahead log or snapshot designated by file-name. Example: --cat /tarantool_user/work_dir/00000000000000000018.xlog Notes: The client stops after displaying the contents.

--delim

Syntax: short form: -D delimiter long form: --d[elim] delimiter. Effect: If --cat is used, then put delimiter at end of each line of a Lua file. If --cat is not used, then require that all statements end with delimiter. Example: --delim = '!' Notes: See also the SETOPT DELIMITER statement.

--format

Syntax: short form: -M tarantool|raw long form: --fo[rmat] tarantool|raw. Effect: set format for output from --cat Example: --format tarantool Notes: The default format is tarantool.

--from

Syntax: short form: -F log-sequence-number long form: --fr[om] log-sequence-number. Effect: Play only what has a log sequence number greater than or equal to log-sequence-number. Example: --from 55 Notes: see also --play and --to.

--header

Syntax: short form: -H long form: --hea[der]. Effect: Add a header if --format=raw. Example: --header Notes: the default is 'no header'.

--help

Syntax: short form: (none) long form: --hel[p]. Effect: Client displays a help message including a list of options. Example: --help Notes: The client stops after displaying the help.

--host

Syntax: short form: -h host-name long form: --ho[st] [=] host-name. Effect: Client will look for the server on the computer designated by host-name. Example: --host = 127.0.0.1 Notes: default value is localhost.

--play

Syntax: short form: -P file-name long form: --pl[ay] file-name. Effect: Client will tell server to replay the write-ahead log designated by file-name. Example: --play /tarantool_user/work_dir/00000000000000000018.xlog

--port

Syntax: short form: -p port-number long form: --po[rt] [=] port-number. Effect: Client will look for the server on the port designated by port-number. Example: --port = 33013 Notes: default value is 33013.

--rpl

Syntax: short form: -R server-name long form: --rpl server-name. Effect: Act as a replica for the server specified by server-name. Example: --rpl = wombat

--space

Syntax: short form: -S space-number long form: --s[pace] space-number. Effect: Play only what is applicable to the space designated by space-number. Example: --space 0

--to

Syntax: short form: -T log-sequence-number long form: --t[o] log-sequence-number. Effect: Play only what has a log sequence number less than or equal to log-sequence-number. Example: --to 66 Notes: see also --play and --from.
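
The combined effect of --from, --to and --space can be pictured as a filter over log records. The Python sketch below is illustrative only: the record layout (a dict with the hypothetical keys lsn and space) and the function name select_records are assumptions made for the example, not the client's actual internals.

```python
def select_records(records, from_lsn=None, to_lsn=None, space=None):
    """Keep records with from_lsn <= lsn <= to_lsn that belong to `space`.

    Any bound left as None is not applied, mirroring an omitted option.
    """
    selected = []
    for rec in records:
        if from_lsn is not None and rec["lsn"] < from_lsn:
            continue  # below the --from bound
        if to_lsn is not None and rec["lsn"] > to_lsn:
            continue  # above the --to bound
        if space is not None and rec["space"] != space:
            continue  # not in the space chosen with --space
        selected.append(rec)
    return selected


# Hypothetical log: LSNs 50..69, alternating between spaces 0 and 1.
log = [{"lsn": n, "space": n % 2} for n in range(50, 70)]

# Equivalent in spirit to: --from 55 --to 66 --space 0
window = select_records(log, from_lsn=55, to_lsn=66, space=0)
print([rec["lsn"] for rec in window])
```

Both bounds are inclusive, matching the "greater than or equal to" and "less than or equal to" wording of --from and --to.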

--version

Syntax: short form: -v long form: --v[ersion]. Effect: Client displays version information. Example: --version Notes: The client stops after displaying the version.
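
The bracket notation in the long forms above, such as --fo[rmat], marks the part of the name that may be left out. The following Python sketch shows one way to resolve an abbreviated long option to its full name. It assumes that any prefix at least as long as the unbracketed part is accepted; that reading, the table LONG_OPTIONS and the function resolve_long_option are assumptions for illustration and are not taken from the client's source code.

```python
# Minimum prefix and full name for each long option, read off the bracket
# notation in this section, e.g. --fo[rmat] -> ("fo", "format").
LONG_OPTIONS = [
    ("a", "admin-port"), ("b", "bin"), ("c", "cat"), ("d", "delim"),
    ("fo", "format"), ("fr", "from"), ("hea", "header"), ("hel", "help"),
    ("ho", "host"), ("pl", "play"), ("po", "port"), ("rpl", "rpl"),
    ("s", "space"), ("t", "to"), ("v", "version"),
]


def resolve_long_option(arg):
    """Return the full option name for an argument such as '--fo', or None."""
    if not arg.startswith("--"):
        return None
    name = arg[2:]
    for minimum, full in LONG_OPTIONS:
        # The argument must cover the mandatory prefix and must itself be
        # a prefix of the full option name.
        if name.startswith(minimum) and full.startswith(name):
            return full
    return None


print(resolve_long_option("--fo"))   # resolves to the --format option
print(resolve_long_option("--f"))    # too short: could be --format or --from
```

The minimum prefixes are what keep options with a shared beginning apart: --format and --from need two letters, --header and --help need three.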

Tokens for use within statements

Keywords are: Character sequences containing only letters of the English alphabet. Examples: SELECT, INTO, FIBER. Notes: Keywords are case insensitive so SELECT and Select are the same thing.

Tuple set identifiers are: Lower case letter 't' followed by one or more digits. Examples: t0, t55.

Field identifiers are: Lower case letter 'k' followed by one or more digits. Examples: k0, k55.

Procedure identifiers are: Any sequence of letters, digits, or underscores which is legal according to the rules for Lua identifiers.

String literals are: Any sequence of zero or more characters enclosed in single quotes. Examples: 'Hello, world', 'A'.

Numeric literals are: Character sequences containing only digits, optionally preceded by + or -. Examples: 55, -1. Notes: Tarantool NUM data type is unsigned, so -1 is understood as a large unsigned number.

Single-byte tokens are: * or , or ( or ). Examples: * , ( ).

Tokens must be separated from each other by one or more spaces, except that spaces are not necessary around single-byte tokens or string literals.
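
The token classes above can be recognized with a small regular-expression tokenizer. The Python sketch below is a rough illustration, not the client's real lexer. Two points are assumptions: the = sign is treated as a single-byte token because the statement syntax uses it even though it is not in the list above, and keywords and procedure identifiers are both reported as "word" because they cannot be told apart without knowing the keyword list.

```python
import re

# One alternative per token class described above, tried from top to bottom.
TOKEN_RE = re.compile(r"""
    (?P<string>'[^']*')                 # string literal in single quotes
  | (?P<number>[+-]?\d+)                # numeric literal with optional sign
  | (?P<tuple_set>t\d+\b)               # tuple set identifier: t0, t55
  | (?P<field>k\d+\b)                   # field identifier: k0, k55
  | (?P<word>[A-Za-z_][A-Za-z0-9_]*)    # keyword or procedure identifier
  | (?P<single>[*,()=])                 # single-byte tokens, plus "=" (assumed)
  | (?P<space>\s+)                      # separator, discarded
""", re.VERBOSE)


def tokenize(statement):
    """Split a statement into (kind, text) pairs; raise ValueError on bad input."""
    tokens = []
    pos = 0
    while pos < len(statement):
        match = TOKEN_RE.match(statement, pos)
        if match is None:
            raise ValueError(
                f"unexpected character at offset {pos}: {statement[pos]!r}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("DELETE FROM t0 WHERE k0='a'"))
```

Because single-byte tokens and string literals are matched on their own, the example needs no spaces around = or the quoted value, in line with the separation rule above.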

Statements in alphabetical order

Although an initial statement may be entered on the tarantool command line, statements are generally entered following the prompt in interactive mode while tarantool is running. (The prompt is the name of the host followed by a greater-than sign, for example localhost>.) The end-of-statement marker is a newline (line feed).

CALL

Syntax: CALL procedure-identifier (). Effect: The client tells the server to execute the procedure identified by procedure-identifier. Example: CALL proc50().

DELETE

Syntax: DELETE FROM tuple-set-identifier WHERE field-identifier = literal. Effect: Client tells server to delete the tuple identified by the WHERE clause. Example: DELETE FROM t0 WHERE k0='a'. Notes: field-identifier must identify the primary key.

EXIT

Syntax: E[XIT]. Effect: The tarantool program stops. Example: EXIT. Notes: same as QUIT.

HELP

Syntax: H[ELP]. Effect: Client displays a message including a list of possible statements. Example: HELP.

INSERT

Syntax: INSERT [INTO] tuple-set-identifier VALUES (literal [,literal...]). Effect: Client tells server to add the tuple consisting of the literal values. Example: INSERT INTO t0 VALUES ('a',0).

LOADFILE

Syntax: LOADFILE string-literal. Effect: The client loads instructions from the file identified by string-literal. Example: LOADFILE '/home/tarantool_user/file5.txt'.

LUA

Syntax: LUA token [token...]. Effect: Client tells server to execute the tokens as Lua statements. Example: LUA "hello".." world".

NOTEE

Syntax: NOTEE. Effect: Client ceases to write to a file, thus canceling the effect of the TEE statement. Example: NOTEE.

PING

Syntax: PING. Effect: Client sends a ping to the server. Example: PING.

QUIT

Syntax: Q[UIT]. Effect: The client stops. Example: QUIT. Notes: same as EXIT.

RELOAD

Syntax: RELOAD CONFIGURATION. Effect: Client tells server to re-read the configuration file. Example: RELOAD CONFIGURATION. Notes: The client sends to the server's administrative port.

REPLACE

Syntax: REPLACE [INTO] tuple-set-identifier VALUES (literal [,literal...]). Effect: Client tells server to add the tuple consisting of the literal values, replacing any existing tuple that has the same primary key. Example: REPLACE INTO t0 VALUES ('a',0). Notes: REPLACE and INSERT are the same, except that INSERT will return an error if a tuple already exists with the same primary key.
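
The difference between INSERT and REPLACE can be modelled in a few lines of Python. This is a minimal sketch, not server code: the tuple set is represented by a plain dict, the primary key is assumed to be the first field (k0), and the names insert, replace and DuplicateKeyError are hypothetical.

```python
class DuplicateKeyError(Exception):
    """Raised by insert() when the primary key is already present."""


def insert(space, tup):
    """INSERT semantics: fail if a tuple with the same primary key exists."""
    key = tup[0]  # assumption: the first field is the primary key
    if key in space:
        raise DuplicateKeyError(key)
    space[key] = tup


def replace(space, tup):
    """REPLACE semantics: add the tuple, overwriting any existing one."""
    space[tup[0]] = tup


t0 = {}
insert(t0, ("a", 0))       # like INSERT INTO t0 VALUES ('a',0)
replace(t0, ("a", 1))      # like REPLACE INTO t0 VALUES ('a',1): overwrites
try:
    insert(t0, ("a", 2))   # same primary key: rejected
except DuplicateKeyError:
    print("INSERT rejected, tuple is still", t0["a"])
```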

SAVE

Syntax: SAVE COREDUMP | SNAPSHOT. Effect: Client tells server to save the designated object. Example: SAVE SNAPSHOT. Notes: The client sends to the server's administrative port.

SELECT

Syntax: SELECT * FROM tuple-set-identifier WHERE field-identifier = literal [AND|OR field-identifier = literal...] [LIMIT numeric-literal [,numeric-literal]]. Effect: Client tells server to find the tuple or tuples identified in the WHERE clause. Example: SELECT * FROM t0 WHERE k0 = 5 AND k1 = 7 LIMIT 1.

SET

Syntax: SET INJECTION name-token state-token. Effect: In normal mode: error. Notes: This statement is only available in debug mode.

SETOPT

Syntax: SETOPT DELIMITER = string-literal. The string must be a value in single quotes. Effect: string becomes end-of-statement delimiter, so newline alone is not treated as end of statement. Example: SETOPT DELIMITER = '!'.
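
The effect of a delimiter on statement boundaries can be sketched as follows. This Python fragment is a simplified model under stated assumptions: the function name split_statements is hypothetical, continuation lines are joined with a single space, and a delimiter character that appears inside a string literal is not handled. The real client may behave differently in those details.

```python
def split_statements(text, delimiter=None):
    """Split client input into statements.

    With no delimiter, every non-empty line is one statement. With a
    delimiter, a statement runs up to the delimiter and may span lines.
    """
    if delimiter is None:
        pieces = text.split("\n")
    else:
        pieces = text.split(delimiter)
    # Join continuation lines with a space and drop empty pieces.
    statements = [piece.replace("\n", " ").strip() for piece in pieces]
    return [s for s in statements if s]


typed = "INSERT INTO t0\nVALUES ('a',0)!\nSELECT * FROM t0 WHERE k0 = 'a'!"
print(split_statements(typed, delimiter="!"))
```

With the delimiter set to '!', the INSERT typed across two lines is treated as one statement. Without a delimiter, the same input would be cut at each newline.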

SHOW

Syntax: SHOW CONFIGURATION | FIBER | INFO | INJECTIONS | PALLOC | PLUGINS | SLAB | STAT. Effect: The client asks the server for information about environment or statistics. Example: SHOW INFO. Notes: The client sends to the administrative port. SHOW INJECTIONS is only available in debug mode.

TEE

Syntax: TEE string-literal. Effect: The client begins logging in the file identified by string-literal. Example: TEE '/home/tarantool_user/log.txt'. Notes: TEE may also be set up via an option on the command line.

UPDATE

Syntax: UPDATE tuple-set-identifier SET field-identifier = literal [,field-identifier = literal...] WHERE field-identifier = literal. Effect: Client tells server to change the tuple identified in the WHERE clause. Example: UPDATE t1 SET k1= 'K', k2 = 7 WHERE k0 = 0.
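
The SET clause of UPDATE pairs field identifiers with new values. The Python sketch below applies such assignments to a tuple. It assumes that kN refers to the field at zero-based position N, which is consistent with k0 being used as the primary key in the examples here but is not stated explicitly. The function name apply_update is hypothetical.

```python
import re


def apply_update(tup, assignments):
    """Return a copy of tup with fields replaced per {'kN': value} assignments."""
    fields = list(tup)
    for field_id, value in assignments.items():
        match = re.fullmatch(r"k([0-9]+)", field_id)
        if match is None:
            raise ValueError(f"not a field identifier: {field_id!r}")
        # Assumption: kN addresses the field at zero-based position N.
        fields[int(match.group(1))] = value
    return tuple(fields)


# Mirrors: UPDATE t1 SET k1 = 'K', k2 = 7 WHERE k0 = 0
before = (0, "old", 1)
after = apply_update(before, {"k1": "K", "k2": 7})
print(after)
```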

For a condensed Backus-Naur Form [BNF] description of some of the statements, see doc/box-protocol.txt and doc/sql.txt.