Jun 8, 2012

Evaluating LIO Linux iSCSI target (Source Origin)

I'd like to share some notes on LIO, a new iSCSI target in the Linux kernel, as there is not much information available about it.
  1. General information
  2. Concepts
  3. Using targetcli
  4. Using rtslib
  5. Resizing LUN
  6. Protect against split-brain
  7. Summary

General information

Before LIO, the iSCSI target implementation in Ubuntu was iET.  Compared to iET, LIO is implemented as a pure kernel driver.  LIO is operated through the configFS special filesystem.

These packages are related to LIO:
  • targetcli
    provides targetcli command-line utility.  This is the standard way to manipulate LIO.
  • python-rtslib
    provides rtslib, a full-fledged python API library over configFS.

Concepts

In addition to the standard iSCSI concepts, you should know some LIO-specific ones. For the standard concepts, read RFC 3720.
  • Fabric
    LIO supports several fabrics other than iSCSI. Look in the /var/target/fabric/ directory for the available fabric specifications.
  • Backstores
    LIO supports several storage types as backing storage for LUNs. Specifically, PSCSI passes SCSI commands through to a (real) SCSI device. IBLOCK emulates SCSI devices on top of block devices such as LVM logical volumes.
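
As a small illustration of the fabric lookup mentioned above, the following Python sketch lists the fabric specifications found in a directory. It assumes that each specification is a file whose name ends in ".spec"; that suffix and the helper name list_fabric_specs are assumptions made for this example, not something taken from the rtslib documentation.

```python
import os


def list_fabric_specs(spec_dir="/var/target/fabric"):
    """Return the sorted names of fabric spec files found in spec_dir.

    Assumption: one file per fabric, named <fabric>.spec.
    Returns an empty list if the directory does not exist.
    """
    if not os.path.isdir(spec_dir):
        return []
    names = []
    for entry in sorted(os.listdir(spec_dir)):
        if entry.endswith(".spec"):
            # Strip the suffix so that "iscsi.spec" is reported as "iscsi".
            names.append(entry[:-len(".spec")])
    return names
```

Calling list_fabric_specs() on a target host should then print something like the fabric names that FabricModule() accepts, provided the naming assumption holds.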

Using targetcli

How to set up:
$ sudo apt-get install --no-install-recommends targetcli python-urwid
$ sudo reboot
To avoid a bug in the targetcli package, the python-urwid package needs to be installed explicitly. Basic usage is described in this page:
There is little other documentation for it. Use the context help inside the targetcli command as follows:
$ sudo targetcli
/> help
(snip)
AVAILABLE COMMANDS
  The following commands are available in the current path:
    - bookmarks action [bookmark]
    - cd [path]
    - exit
    - get [group] [parameter...]
    - help [topic]
    - ls [path] [depth]
    - pwd
    - refresh
    - saveconfig
    - set [group] [parameter=value...]
    - status
    - version
/> cd backstores/pscsi
/backstores/pscsi> help
(snip)
AVAILABLE COMMANDS
(snip)
    - create name dev
    - delete name
> help create
SYNTAX
  create name dev
DESCRIPTION
  Creates a PSCSI storage object, with supplied name and SCSI device. The SCSI
  device dev can either be a path name to the device, in which case it is
  recommended to use the /dev/disk/by-id hierarchy to have consistent naming
  should your physical SCSI system be modified, or an SCSI device ID in the
  H:C:T:L format, which is not recommended as SCSI IDs may vary in time.
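
The help text above says that dev is either a path name or an ID in the H:C:T:L format. A script that wraps targetcli may want to tell the two apart and warn about the latter. The sketch below is one way to do that; parse_hctl and looks_like_hctl are hypothetical helper names, and the check only assumes that the four fields are non-negative integers separated by colons.

```python
from collections import namedtuple

# Host, channel, target and LUN numbers of a SCSI device.
ScsiAddress = namedtuple("ScsiAddress", "host channel target lun")


def parse_hctl(text):
    """Parse 'H:C:T:L' into a ScsiAddress of four non-negative integers."""
    parts = text.strip().split(":")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError("expected H:C:T:L, got %r" % text)
    return ScsiAddress(*(int(p) for p in parts))


def looks_like_hctl(dev):
    """Return True if dev is written as H:C:T:L rather than as a path."""
    try:
        parse_hctl(dev)
        return True
    except ValueError:
        return False
```

For example, looks_like_hctl("2:0:0:1") is true, while a /dev/disk/by-id path is not, so the wrapper can recommend the by-id form as the help text does.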

Using rtslib

First, there is a bug in the current rtslib (2.1-2).  Fix it, or you will fail to set some parameters.  Since targetcli is built on top of rtslib, this bug affects targetcli too.  Apply the following patch:

--- /usr/lib/python2.7/dist-packages/rtslib/node.py     2012-06-06 17:59:41.515308657 +0000
+++ node.py     2012-06-06 17:59:50.185146504 +0000
@@ -189,7 +189,7 @@
                               % str(parameter))
         else:
             try:
-                fwrite(path, "%s \n" % str(value))
+                fwrite(path, "%s\n" % str(value))
             except IOError, msg:
                 msg = msg[1]
                 raise RTSLibError("Cannot set parameter %s: %s"
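
To see why a single trailing space can matter, consider a consumer that strips only the final newline and then compares the value literally. The sketch below is not the kernel's parser; strict_parse_yes_no is a hypothetical stand-in that merely shows how "No " can be rejected where "No" is accepted.

```python
def format_buggy(value):
    """Format a value the way the unpatched line does (space before newline)."""
    return "%s \n" % str(value)


def format_fixed(value):
    """Format a value the way the patched line does."""
    return "%s\n" % str(value)


def strict_parse_yes_no(raw):
    """Hypothetical strict consumer of a Yes/No parameter.

    It removes one trailing newline, nothing else, and then expects
    exactly 'Yes' or 'No'.
    """
    text = raw[:-1] if raw.endswith("\n") else raw
    if text not in ("Yes", "No"):
        raise ValueError("invalid value %r" % text)
    return text == "Yes"
```

With this model, strict_parse_yes_no(format_fixed("No")) succeeds, while the same call with format_buggy("No") raises ValueError because the value seen is "No " with a space.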

HTML API documentation is available at /usr/share/doc/python-rtslib/doc/html/.
Run a Python web server as follows, then point your browser to port 8000 to read the documents.

$ cd /usr/share/doc/python-rtslib/doc/html
$ python -m SimpleHTTPServer &
To initialize iSCSI fabric:

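from rtslib.target import *

# Load the iSCSI fabric kernel modules if they are not loaded yet.
fabric = FabricModule('iscsi')
if not fabric.exists:
    # Iterate over load() so that every loading step actually runs.
    for mod in fabric.load():
        pass
if not fabric.exists:
    raise RuntimeError('failed to load iSCSI fabric modules')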

To create a storage object:


from rtslib.tcm import *
bs = IBlockBackstore(0)
storage = None
for s in bs.storage_objects:
    if s.name == 'NAME':
        storage = s
        break
if storage is None:
    storage = bs.storage_object('NAME', '/dev/ubuntu/VOLUME')

To create a target, target portal group, LUN, and network portal:


from rtslib.target import *
  
target = Target(fabric, 'iqn.2012-06.com.cybozu:NAME')
tpg = TPG(target, 1)
if len(tpg.luns) == 0:
    tpg.lun(0, storage, 'NAME at HOST')
tpg.network_portal('IP', 3260)
tpg.set_attribute('authentication', '0')
tpg.set_attribute('generate_node_acls', '1')
tpg.set_attribute('demo_mode_write_protect', '0')
tpg.set_parameter('InitialR2T', 'No')
  
tpg.enable = True
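
The target name used above follows the iqn naming scheme of RFC 3720: "iqn.", the year and month, the reversed domain name, and an optional colon-separated suffix. The helper below builds such a name; make_iqn is a hypothetical function written for this note and is not part of rtslib.

```python
def make_iqn(year, month, domain, name):
    """Build an iSCSI qualified name such as iqn.2012-06.com.cybozu:NAME.

    year and month give the date part; domain is written in normal
    order (e.g. 'cybozu.com') and is reversed here.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12, got %r" % month)
    reversed_domain = ".".join(reversed(domain.lower().split(".")))
    return "iqn.%04d-%02d.%s:%s" % (year, month, reversed_domain, name)
```

The result can be passed straight to Target(fabric, ...), for example Target(fabric, make_iqn(2012, 6, 'cybozu.com', 'NAME')).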

Resizing LUN

Unlike with iET, a LUN can easily be extended by using lvresize. Initiators need to rescan the session, though.
INITIATOR$ sudo blockdev --getsize64 /dev/sdb
1073741824
TARGET$ sudo lvresize -L 2g /dev/ubuntu/test
INITIATOR$ sudo iscsiadm -m session -R
Rescanning session [sid: 1, target: iqn.2012-06.com.cybozu:test, portal: 10.xx.xx.xx,3260]
INITIATOR$ sudo blockdev --getsize64 /dev/sdb
2147483648
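
When scripting this check, it helps to convert the size given to lvresize into the byte count that blockdev --getsize64 reports. The sketch below handles only a small subset of LVM size suffixes (b, k, m, g, t, treated as powers of 1024) and requires an explicit suffix; lv_size_to_bytes is a hypothetical helper, not an LVM tool.

```python
# Subset of LVM size suffixes, all interpreted as powers of 1024.
_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}


def lv_size_to_bytes(size):
    """Convert a size such as '2g' into bytes.

    Only whole numbers with an explicit one-letter suffix are accepted.
    """
    text = size.strip().lower()
    if len(text) < 2 or text[-1] not in _UNITS or not text[:-1].isdigit():
        raise ValueError("unsupported size %r" % size)
    return int(text[:-1]) * _UNITS[text[-1]]
```

lv_size_to_bytes("2g") gives 2147483648, which matches the value the initiator reports after the rescan above.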

Protect against split-brain

In our environment, iSCSI initiators are virtual machines which, when they malfunction, are restored on another server. That means two or more instances of the same VM can run at once. It would be disastrous if these VMs wrote to the same iSCSI target LUN at the same time.

To eliminate such possible split-brain problems, we restrict each target portal group (TPG) to a single session by applying this patch to LIO.

--- linux-3.2.0/drivers/target/iscsi/iscsi_target_login.c       2012-06-07 05:37:34.000000000 +0000
+++ iscsi_target_login.c        2012-06-07 05:34:22.866487169 +0000
@@ -1107,6 +1107,20 @@
                goto new_sess_out;
        }
  
+       /* Cybozu */
+       if (zero_tsih) {
+               int error = 0;
+               spin_lock_bh(&np->np_thread_lock);
+               if( tpg->nsessions > 0 )
+                       error = 1;
+               spin_unlock_bh(&np->np_thread_lock);
+
+               if( error ) {
+                       pr_err("Detected possible split brain\n");
+                       goto new_sess_out;
+               }
+       }
+
        if (zero_tsih) {
                if (iscsi_login_zero_tsih_s2(conn) < 0) {
                        iscsi_target_nego_release(login, conn);

This is quite a dirty and imprecise hack, but it is good enough for us.

Note that the MaxConnections session parameter does not prevent this problem: it limits the number of connections within one session, not the number of sessions. Even if two iSCSI initiators share the same IQN, their sessions may be treated as different ones because iSCSI sessions are identified by a random numeric session ID.
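
The idea behind the kernel patch above can be modelled in user space as follows. This is only a sketch of the policy, not of the kernel code: SingleSessionTPG is a hypothetical class, and unlike the patch (which checks the counter and lets the normal login path update it later) the model checks and increments under one lock.

```python
import threading


class SingleSessionTPG(object):
    """Toy model of a TPG that accepts at most one session."""

    def __init__(self):
        self._lock = threading.Lock()
        self.nsessions = 0

    def login(self, tsih):
        """Return True if the login is accepted.

        A TSIH of zero means the initiator asks for a new session;
        a non-zero TSIH adds a connection to an existing session.
        """
        if tsih != 0:
            return True
        with self._lock:
            if self.nsessions > 0:
                # A second session would mean a possible split brain.
                return False
            self.nsessions += 1
            return True

    def close_session(self):
        """Forget one session so that a new one can log in."""
        with self._lock:
            if self.nsessions > 0:
                self.nsessions -= 1
```

The first login(0) succeeds and any further login(0) is refused until close_session() is called, which is the behaviour we want from the TPG.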


Summary

This article describes LIO, its Python API (rtslib), and its command-line utility (targetcli).
Bugs lurking in the current Ubuntu packages are also addressed.

A kernel patch to restrict the number of sessions per TPG is also presented.

Hope this helps.

Yamamoto, Hirotaka @ymmt2005

A tale of two SCSI targets (Source Origin)

January 22, 2011

This article was contributed by Goldwyn Rodrigues

At the end of 2010, the LIO project was chosen to replace STGT as the in-kernel SCSI target implementation. There were two main contenders (LIO and SCST) which tried to get their code into the Linux kernel tree. This article will compare the two projects and try to describe what these implementations have to offer.

What are SCSI targets?

The SCSI subsystem uses a sort of client-server model. Typically a computer is the client or "initiator," requesting that blocks be written to or read from a "target," which is usually a data storage device. The SCSI target subsystem enables a computer node to behave as a SCSI storage device, responding to storage requests by other SCSI initiator nodes. This opens up the possibility of creating custom SCSI devices and putting intelligence behind the storage.

An example of an intelligent SCSI target is Data Domain's online backup appliance, which supports de-duplication (thus saving space). The appliance, functioning as a SCSI target, is a computer node which intelligently writes only those blocks which are not already stored, and increases the reference counts of the blocks which are already present, thus writing only the blocks which have changed since the last backup. On the other side of the SCSI link, the initiator sees the appliance as a normal, shared SCSI storage device and uses its regular backup application to write to the target.
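
The de-duplication idea described above can be sketched in a few lines of Python. This is not how any particular appliance is implemented; the DedupStore class, the use of SHA-256 as the block fingerprint, and the in-memory dictionaries are all assumptions made for illustration.

```python
import hashlib


class DedupStore(object):
    """Toy block store that keeps one copy of each distinct block."""

    def __init__(self):
        self.blocks = {}      # fingerprint -> block contents
        self.refcount = {}    # fingerprint -> number of references
        self.physical_writes = 0

    def write(self, block):
        """Store a block (bytes) and return its fingerprint."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.blocks:
            # Already stored: only bump the reference count.
            self.refcount[digest] += 1
        else:
            self.blocks[digest] = block
            self.refcount[digest] = 1
            self.physical_writes += 1
        return digest

    def read(self, digest):
        """Return the block stored under the given fingerprint."""
        return self.blocks[digest]
```

Writing the same 512-byte block twice results in one physical write and a reference count of two, which is the space saving the paragraph describes.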

The most common implementation of the SCSI target subsystem is an iSCSI server, which uses a standard TCP/IP encapsulation of SCSI to export a SCSI device over the network. Most SCSI target projects started with the idea of supporting iSCSI targets before supporting other protocols. Since only a network interface is needed to act as both an iSCSI initiator and an iSCSI target, supporting iSCSI doesn't require any special hardware beyond a network port, which almost every computer has these days. However, most SCSI targets can be supported with existing initiator cards, so if you have a Fibre Channel, SAS, or Parallel SCSI card, it should be possible to use one of the SCSI target projects to make your computer into a SCSI target for the particular SCSI bus supported by the card.

Current Status

The Linux kernel SCSI subsystem currently uses STGT to implement the SCSI target functionality; STGT was introduced into the Linux kernel at the end of 2006 by Fujita Tomonori. It has a library in the kernel which assists the in-kernel target drivers. All target processing happens in user space, which may lead to performance bottlenecks.

Two out-of-tree kernel SCSI target solutions were contenders to replace STGT: LIO and SCST. SCST has been pushing to be included in the Linux kernel since at least 2008. It was decided then that the STGT project could serve the kernel for a little longer. As time passed, the design limitations of STGT were encountered and a replacement was sought. The main criteria for a replacement SCSI target subsystem defined by James Bottomley, the SCSI maintainer, were:

  1. That it would be a drop in replacement for STGT (our current in-kernel target mode driver), since there is room for only one SCSI target infrastructure.
  2. That it used a modern sysfs-based control and configuration plane.
  3. That the code was reviewed as clean enough for inclusion.
The first condition proved to be too restrictive; it was not possible to avoid breaking the ABI entirely. So the current goal, instead, is to find a way to gracefully transition STGT users to the new interface.

Hints of LIO replacing the STGT project came in the 2010 Linux Storage and Filesystem Summit. Christoph Hellwig volunteered to review and clean up the code; he managed to reduce the code-base by around 10,000 lines to make it ready to merge into the kernel.

Comparison

Both projects have drawn comparison charts of their feature lists which are available on their respective web sites: LIO and SCST. However, before exploring the differences, let's compare the similarities. Both projects implement an in-kernel SCSI target core. They provide local SCSI targets similar to loop devices, which come in handy for using targets in virtualized environments. Both projects support iSCSI, which was one of the initial and main motivations for both projects.

Back-storage handlers are available in both projects in kernel space as well as in user space. Back-storage handlers allow target administrators to control how devices are exported to the initiators. For example, a pass-through handler allows exporting the SCSI hardware as it is, instead of masking the details of that hardware, while a virtual-disk handler allows exporting files as virtual disks to the initiator.

Both projects support Persistent Reservations (PR), a feature for I/O fencing and failover/retakeover of storage devices in high-availability clusters. Using the PR commands, an initiator can establish, preempt, query, or reset a reservation policy with a specified target. During a failover takeover, the new virtual resource can reset the reservation policy of the old virtual resource, making device takeover easier and faster.
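
The establish, preempt, and query operations can be pictured with a toy model. The sketch below is a heavy simplification under stated assumptions: it tracks one device, one exclusive reservation, and plain string keys, and the PersistentReservation class and its method names are hypothetical rather than taken from either project or from the SCSI command set.

```python
class PersistentReservation(object):
    """Toy model of a single exclusive persistent reservation."""

    def __init__(self):
        self.registered = set()   # keys of registered initiators
        self.holder = None        # key of the current reservation holder

    def register(self, key):
        """Register an initiator key with the device."""
        self.registered.add(key)

    def reserve(self, key):
        """Try to take the reservation; return True on success."""
        if key not in self.registered:
            raise KeyError("unregistered key %r" % key)
        if self.holder in (None, key):
            self.holder = key
            return True
        return False

    def preempt(self, key, victim):
        """Remove the victim's registration and take over its reservation."""
        if key not in self.registered:
            raise KeyError("unregistered key %r" % key)
        self.registered.discard(victim)
        if self.holder == victim:
            self.holder = key

    def query(self):
        """Return the key of the current holder, or None."""
        return self.holder
```

In a failover, the surviving node would call preempt() with the failed node's key and then continue I/O as the new holder, which is the takeover described above.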

SCST

The main users of the SCSI target subsystem are storage companies providing storage solutions to the industry. Most of these storage solutions are plug-and-play appliances which can be attached to the storage network and used with little or no configuration. SCST boasts a wider user base, which probably comes from the fact that it supports a wider range of transports.

SCST supports both Qlogic and Emulex fibre channel cards whereas LIO supports only Qlogic target drivers for now, and that support is still in its beta stages of development. SCST supports the SCSI RDMA Protocol (SRP), and claims to be ahead in terms of development with respect to Fibre Channel over Ethernet (FCoE), LSI's Parallel/Wide SCSI Fibre Channel, and Serial Attached SCSI (SAS). It already has support for IBM's pSeries Virtual SCSI. Companies such as Scalable Informatics, Storewize, and Open-e have developed PnP appliance products which rely on these target transports based on SCST.

SCST supports notifications of session changes using asynchronous event notification (AEN). AEN is a protocol feature that may be used by SCSI targets to notify a SCSI initiator of events that occur in the target, when the target is not serving a request. This enables initiators to be notified of changes at the target end, such as devices added, removed, resized, or media changes. This way the initiators can see any target changes in a plug-and-play manner.

The SCST developers claim that their design conforms to more SCSI standards in terms of robustness and safety. The SCSI protocol requires that if an initiator clears a reservation held by another initiator, the reservation holder must be notified about the reservation clearance or else several initiators could change reservation data, ultimately corrupting it. SCST is capable of implementing safe RESERVE/RELEASE operations on devices to avoid such corruption.

According to the SCSI protocol, the initiator and target can communicate with each other to decide on the transfer size. An incorrect transfer size communicated by the initiator can lead to target device lockups or a crash. SCST safeguards against miscommunication of transfer sizes or transfer directions to avoid such a situation. The code claims to have a good memory management policy to avoid out-of-memory (OOM) situations. It can also limit the number of initiators that can connect to the target to avoid resource usage by too many connections. It also offers per-portal visibility control, which means that it can be configured in such a way that a target is visible to a particular subset of initiators only.

LIO

The LIO project began with the iSCSI design as its core objective, and created a generic SCSI target subsystem to support iSCSI. Simplicity has been a major design goal and hence LIO is easier to understand. Beyond that, the LIO developers have shown more willingness to work with the kernel developers, as James pointed out to SCST maintainer Vladislav Bolkhovitin:

Look, let me try to make it simple: It's not about the community you bring to the table, it's about the community you have to join when you become part of the linux kernel. The interactions in the wider community are critical to the success of an open source project. You've had the opportunity to interact with a couple of them: sysfs we've covered elsewhere, but in the STGT case you basically said, here's our interface, use it. LIO actually asked what they wanted and constructed something to fit. Why are you amazed then when the STGT people seem to prefer LIO?

The LIO project also boasts features which are either not present in SCST or are in early development phases. For example, LIO supports asymmetric logical unit access (ALUA). ALUA allows a target administrator to manage the access states and path attributes of the targets. This allows the multipath routing method to select the best possible path to optimize usage of available bandwidth, depending on the current access states of the targets. In other words, the path taken by the initiator in a multipath environment can be manipulated by the target administrator by changing the access states.
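
How an initiator-side multipath layer might use these access states can be sketched as follows. The state names are the usual ALUA ones, but the choose_path function and its simple preference order are assumptions for illustration and do not describe any real multipath implementation.

```python
# Lower number means more preferred; states not listed are unusable here.
PREFERENCE = {"active/optimized": 0, "active/non-optimized": 1}


def choose_path(paths):
    """Pick the best usable path.

    paths maps a path name to its ALUA access state. Returns the name
    of the most preferred usable path, or None if no path is usable.
    Ties are broken by path name.
    """
    usable = [(PREFERENCE[state], name)
              for name, state in paths.items()
              if state in PREFERENCE]
    if not usable:
        return None
    return min(usable)[1]
```

If the administrator switches the optimized path to standby, the next call to choose_path() returns the non-optimized path instead, which is the kind of steering the paragraph refers to.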

LIO supports a Management Information Base (MIB), which makes management of SCSI devices simpler. The SCSI target devices export the management information values described in the SCSI MIB (RFC 4455), which are picked up by an SNMP agent. This feature extends to iSCSI devices and is beneficial in managing a storage network with multiple SCSI devices.

An error in the iSCSI connection can happen at three different levels: the session, digest, or connection level. Error recovery can be initiated at each of these levels, which makes sure that the recovery is made at the current level, and the error does not pass through to the next one. Error recovery starts with detecting a broken connection. In response, the iSCSI initiator driver establishes another TCP connection to the target, then it informs the target that the SCSI command path is being changed to the new TCP connection. The target can then continue processing SCSI commands on the new TCP connection. The upper-level SCSI driver remains unaware that a new TCP connection has been established and that control has been transferred to the new connection. The iSCSI session remains active during the period and does not have to be reinstated. LIO supports a maximum Error Recovery Level (ERL) of 2, which means that it can recover from errors at the session, digest, or connection levels. SCST supports an ERL of 0, which means it can recover from session-level errors only and that all connection-oriented errors are communicated to the SCSI driver.

LIO also supports "multiple connections per session" (MC/S). MC/S allows the initiator to open multiple connections between the initiator and target, either on the same or a different physical link. Hence, in case of a failure of one path, the established session can use another path without terminating the session. MC/S can also be used for load balancing across all established connections. Architectural session command ordering is preserved across those communication paths.
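
A toy model may make the MC/S behaviour more concrete. In the sketch below a session spreads commands round-robin over its live connections, keeps one session-wide command sequence number, and survives the loss of a connection as long as one remains. The Session class and its methods are hypothetical and leave out everything a real iSCSI stack does on the wire.

```python
class Session(object):
    """Toy iSCSI session with multiple connections (MC/S)."""

    def __init__(self, connections):
        self.alive = list(connections)   # connections still usable
        self.cmd_sn = 0                  # session-wide command sequence number
        self._next = 0                   # round-robin cursor

    def fail(self, conn):
        """Mark a connection as lost; the session itself stays open."""
        self.alive.remove(conn)

    def send(self, command):
        """Send a command and return (cmd_sn, connection, command)."""
        if not self.alive:
            raise RuntimeError("session lost: no connection left")
        conn = self.alive[self._next % len(self.alive)]
        self._next += 1
        # Numbering is per session, so ordering holds across connections.
        self.cmd_sn += 1
        return (self.cmd_sn, conn, command)
```

After fail() is called for one of two connections, send() keeps working on the other one and the sequence numbers continue without a gap.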

The LIO project also claims that its code is used in a number of appliance products and deployments though the user base does not seem to be as varied as that of SCST.

No comparison can be complete without a performance comparison. SCST developers have released their performance numbers from time to time. However, all their numbers were compared against STGT. The SCST comparison page speaks of SCST performing better than LIO, but the results were drawn from a source-code study and not from real-world tests. SCST blames LIO for not releasing performance numbers, and there is no performance data (to my knowledge) which would compare apples to apples.

The decision has finally been made, though, with quite a bit of opposition. Now comes the task of getting all the niche features which LIO lacks to be ported from SCST to LIO. While the decision was contentious, it is yet another example of the difficulty of getting something merged without being able to cooperate with the kernel development community.


A tale of two SCSI targets

Posted Jan 27, 2011 19:02 UTC (Thu) by jeremiah

I know it hurts to have someone else's code chosen for mainline inclusion. But I really hope the SCST folks and the LIO folks can work together. It seems that the combined feature set of these two packs a heck of a punch. I think this is an area that is extremely important and underrated. Most people can't afford expensive SAN tech. And having the ability to use a simple linux box as a full FC/iSCSI/AoE disk array is just incredible, esp. when it can be used as all 3 at the same time. As the standards change it's nice to be able to utilize new hardware and designs w/o losing your investment in old infrastructure.

A tale of two SCSI targets

Posted Jan 27, 2011 22:15 UTC (Thu) by dougg

Thanks for the writeup. Even for those of us who are close to the action, it is difficult to get an overview. One small criticism: the title should be "A tale of two iSCSI targets". Explaining the term "SCSI target" is difficult and the definitive document (SAM-5) does not help in that regard. iSCSI is one of many SCSI transports (and recently a new one was proposed: SOP (SCSI over PCIe)). So I question this statement: "The most common implementation of the SCSI target subsystem is an iSCSI server, ...". IMO USB (storage) keys are the most common (and worst) implementation of the SCSI target subsystem.

A tale of two SCSI targets

Posted Jan 27, 2011 23:48 UTC (Thu) by martinfick

> So I question this statement: "The most common implementation of the SCSI target subsystem is an iSCSI server, ...". IMO USB (storage) keys are the most common (and worst) implementation of the SCSI target subsystem.

I suspect that he meant a SCSI target implemented in linux. I could be wrong, but I doubt those USB storage keys are running linux (yet), are they?

SCSI target, iSCSI target or SCSI target framework ?

Posted Jan 28, 2011 11:16 UTC (Fri) by abacus

Maybe A tale of two SCSI target frameworks would be a better title ? While LIO started as an iSCSI target (and its source code still shows this), SCST has been designed from the start as a generic SCSI target framework.

A tale of two SCSI targets

Posted Jan 28, 2011 13:03 UTC (Fri) by zdzichu

But isn't SCSI target framework about much more than iSCSI? Like Solaris's COMSTAR http://hub.opensolaris.org/bin/view/Project+comstar/WebHome ?

SCSI target users

Posted Jan 28, 2011 11:21 UTC (Fri) by abacus

A quote from the article:

The main users of the SCSI target subsystem are storage companies providing storage solutions to the industry.

I'm not sure that's correct. Many organizations are using software iSCSI target implementations as a storage server for virtual machines, e.g. in combination with VMware ESX. These organizations install the iSCSI target software themselves instead of buying a storage product from a storage company.

A tale of two SCSI targets

Posted Aug 11, 2011 10:04 UTC (Thu) by slashdot

How can something which essentially just provides a network block device (which means read, write and barrier) be so complicated?

Does all this apparent insane bloat need to be in the kernel?

A tale of two SCSI targets

Posted Aug 13, 2011 17:56 UTC (Sat) by abacus

> Does all this apparent insane bloat need to be in the kernel?

Implementing a storage target in the kernel allows faster operation than a comparable implementation in user space.

Copyright 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds

Four Linux SCSI Targets

Lio Target