Remote BLOB Store Provider Library Implementation Specification

SQL Server Technical Article

Writers: Kevin Farlee, Pradeep Madhavarapu

Technical Reviewers: Pradeep Madhavarapu, Michael Warmington

Published: August 2008

Applies to: SQL Server 2008

Summary: This is a specification to be used by those creating a storage provider plug-in library for the SQL Server 2008 Remote BLOB Store feature.

Introduction

Remote BLOB Store (RBS) is designed to move the storage of large binary data (BLOBs) from database servers to commodity storage solutions.

With RBS, BLOB data is stored in storage solutions such as Content Addressable Stores (CAS), commodity hardware with data integrity and fault-tolerance systems, or mega service storage solutions like MSN Blue. A reference to the BLOB is stored in the database. An application stores and accesses BLOB data by calling into the RBS client library. RBS manages the life cycle of the BLOB, such as doing garbage collection as and when needed.

RBS is an add-on that can be applied to Microsoft SQL Server 2008 and later. It uses auxiliary tables, stored procedures, and an executable to provide its services. A reference to the BLOB (provided by the BLOB Store) is stored in RBS auxiliary tables and an RBS BLOB ID is generated. Applications store this RBS BLOB ID in a column in application tables. These columns in application tables are called RBS Columns in this specification. The RBS Column is not a new data type; it is just a simple binary(20).

RBS exposes three views for interacting with it: application view (through the RBS client library), administrator view (through stored procedures), and provider view (through a provider interface). This document discusses the provider view.

RBS Provider Requirements

The requirements of RBS are covered in Functional Description, later in this paper. The requirements of RBS providers are listed here.

Goals of an RBS Provider

The main goal of an RBS provider is to enable the use of a particular type of BLOB store (called a target BLOB store) to store RBS BLOB data.

Typically, target BLOB stores offer large storage space at a low cost, including hardware costs, maintenance, and expandability. The technical requirements and recommendations for RBS providers are listed here.

Required

An RBS provider must:

  • Provide an implementation of the BlobStore abstract class that uses the target BLOB store to store BLOB data. Honor the semantics specified by RBS.
  • Allow multiple instances of the provider (pointing to the same or different instances of the target store, and using the same or different credentials) to be used simultaneously from one or more client machines.

An RBS provider should optimally:

  • Allow the use of the features of the target BLOB store through RBS interfaces and configuration options wherever possible. It should also minimize the need for custom configuration options to exploit features of the target BLOB store.
  • Implement optional optimizations and capabilities if possible. These help improve performance and provide extra functionality.
  • Allow attaching, detaching, enabling, disabling, configuring, and deploying target stores and providers without affecting the availability of SQL Server and client computers.
  • Avoid introducing too much overhead; the performance of an RBS provider should be close to the performance of native access to the target BLOB store.

Guarantees Provided by RBS Providers

An RBS provider must guarantee certain implementation features. Others are recommended.

Required

An RBS provider must guarantee:

  • Link-level consistency. This means that there are no dangling references: if the provider gives out a StoreBlobId to represent a newly stored BLOB, the BLOB can be accessed later using the same StoreBlobId as long as it is not deleted.
  • That the BLOB persists when a Store() call returns. BLOB data and any metadata that the provider associates with a BLOB must be persisted by the BLOB store before the call to store the BLOB returns successfully. This means that if the BLOB store goes down because of a power outage or other reason, after the successful completion of a Store() operation, the BLOB is available after the BLOB store comes online.
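
As one illustration of the persistence guarantee, the following is a minimal sketch for a provider whose target store happens to be a file system. That choice is an assumption made only for this example; the class name DurableWriteSketch and the ".tmp" naming convention are hypothetical and are not part of RBS. The sketch writes the data to a temporary file, asks the operating system to flush it to disk, and only then makes the file visible under its final name.

```csharp
using System;
using System.IO;

// Hypothetical helper for this sketch; not part of the RBS class library.
public static class DurableWriteSketch
{
    // Writes data to finalPath so that the bytes are flushed to disk
    // before this method returns successfully.
    public static void StoreDurably(string finalPath, byte[] data)
    {
        string tempPath = finalPath + ".tmp"; // assumed naming convention

        using (var fs = new FileStream(tempPath, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            fs.Write(data, 0, data.Length);
            // Flush(true) asks the operating system to write its buffers to disk,
            // not just to hand the bytes to the file system cache.
            fs.Flush(true);
        }

        // File.Move throws IOException if finalPath already exists, so an
        // existing BLOB is never overwritten by this sketch.
        File.Move(tempPath, finalPath);
    }
}
```

A real provider would apply the same idea using whatever durability mechanism its target BLOB store offers.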

An RBS provider should optimally guarantee:

  • BLOB data immutability. This means that BLOB data cannot be changed after a BLOB is stored initially. This guarantees that the data returned on reading a BLOB is the same as the data that was given to the provider when the BLOB was stored; no changes are allowed after that.

Deliverables

Each provider must deliver the following pieces, together known as a Provider Pack:

  • Provider library (set of managed DLLs and dependencies, such as native libraries)
  • Documentation
  • Sample configuration files
  • Installer
  • Optional: Provider source code if this is a sample provider

Functional Description

Overview and Component Descriptions

An RBS provider consists of a managed library and, optionally, a set of native libraries that communicate with the BLOB store. The basic components and their interactions are as follows:

  • Application – RBS Maintainer or an application that uses RBS, such as Microsoft SharePoint.
  • RBS Client Library – In the case of applications other than RBS Maintainer, the provider library is called by the RBS client library and not by the application directly.
  • BLOB Store – An entity which is used to store BLOB data. This can be a CAS storage solution (such as EMC Centera or Microsoft SRS), SMB file server, a mega storage service (such as MSN XStore) or even a SQL Server database.
  • Provider Library – Managed library for implementing the BlobStore abstract class. This is also referred to as the provider. It knows how to use the BLOB store for storing BLOBs.
  • Native Library for BLOB Store – Any libraries used by the provider library to communicate with the BLOB store. This is optional.

Figure 1: Provider Architecture

Figure 2: Provider Architecture with Native Library

Figure 3: Provider Architecture with RBS Client Library

Sample Control Flow

Following is a sample control flow for a simple operation.

  1. The application calls the provider library to perform an operation.
    • The provider library calls into the native library to perform the operation.
  2. The provider (or native) library sends the request to the store.
  3. The BLOB store returns a response.
    • The native library returns a response to the provider library.
  4. The provider library returns a response to the application.

Figure 4: Sample Control Flow for Provider Library

Figure 5: Sample Control Flow for Provider Library with Optional Native Library

Provider Abstract Class

RBS defines an abstract class named BlobStore that must be inherited and implemented by provider writers. The reasons to use an abstract class instead of an interface are as follows:

  • It is easy to extend an abstract class in future versions without breaking backward compatibility, which is not possible with interfaces. For example, in an abstract class, new methods (with default implementations) can be added without breaking compatibility with previous versions.
  • The core function of the provider library is that it is an RBS provider, so it makes sense to have it inherit an abstract class.
  • Some common code that may be useful to many providers can be included in the abstract class. The derived providers can choose to either use it or write their own code.

Overview

Following is an overview of the steps performed by the application (RBS maintainer or RBS client library) on a provider library.

  1. RBS loads the provider library managed DLL and uses configuration information to find the required class within that DLL that is derived from BlobStore.
  2. RBS gets information about the provider through configuration information that is added to the machine-wide CLR configuration file when a provider library is installed.
  3. Using this provider information, it associates zero or more BLOB stores with this provider class.
  4. When a BLOB store associated with this provider class needs to be used, one object of the class is instantiated.
  5. The object is initialized with information about the BLOB store.
  6. Operations (such as storing and fetching BLOBs, creating pools, and so on) are performed using this object. The object may be cached for use later and operations may be performed again after long pauses.
  7. Dispose() is called on the provider object and it is not used after that.
  8. Multiple instances of the same class can be used simultaneously to access the same or different BLOB stores.

The following sections list the members that the provider class must implement. They are discussed in groups.

Exceptions

BLOB store providers are only expected to throw exceptions of type BlobStoreException. A valid exception code must be specified while throwing an exception. Each operation has a set of expected exception codes. Throwing any other exceptions or codes indicates a bug in the provider or that exceptions have occurred outside the provider’s control. The valid exception codes are:

  • AccessDenied. The caller or application does not have permissions to perform the requested action.
  • NoMoreSpace. No more storage space is available on the BLOB store or pool.
  • PoolNotFound. Specified pool does not exist on the BLOB store.
  • BlobNotFound. Specified BLOB does not exist on the BLOB store or pool.
  • BlobIdAlreadyExists. A BLOB with the specified StoreBlobId already exists in the same pool, so a new one cannot be created.
  • BlobInUse. A BLOB is currently being used, so it cannot be deleted or expunged.
  • PoolNotEmpty. Specified pool is not empty, so it cannot be deleted.
  • ConfigurationMissing. Required BLOB store configuration items are missing.
  • ConfigurationDoesNotAllowOperation. Current configuration of the BLOB store does not allow the requested operation.
  • OperationFailedAuthoritative
    • The requested operation failed for a reason not included in other codes.
    • The failure is authoritative: no part of the operation was performed.
  • OperationFailedMaybe
    • The requested operation may have failed for a reason not included in other codes.
    • The failure is not authoritative: all, some, or no part of the operation may have been performed.
  • NotImplemented. The requested operation is not implemented by this BLOB store provider.

Providers are encouraged to include descriptive messages while throwing any exception.
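
The exception contract can be sketched as follows. The types SketchExceptionCode and SketchBlobStoreException are simplified stand-ins written for this example; the real BlobStoreException and its codes are defined by the RBS class library and may differ in shape. The in-memory dictionary stands in for a pool. The sketch shows a delete operation that reports every failure through one exception type, with a code from the list allowed for DeleteBlob and a descriptive message.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the exception codes listed above.
public enum SketchExceptionCode
{
    AccessDenied, NoMoreSpace, PoolNotFound, BlobNotFound, BlobIdAlreadyExists,
    BlobInUse, PoolNotEmpty, ConfigurationMissing, ConfigurationDoesNotAllowOperation,
    OperationFailedAuthoritative, OperationFailedMaybe, NotImplemented
}

// Hypothetical stand-in for BlobStoreException; the real class is defined by RBS.
public class SketchBlobStoreException : Exception
{
    public SketchExceptionCode Code { get; private set; }

    public SketchBlobStoreException(SketchExceptionCode code, string message, Exception inner = null)
        : base(message, inner)
    {
        Code = code;
    }
}

public static class DeleteBlobSketch
{
    // Deletes a BLOB from an in-memory "pool" and reports failures using only
    // codes that the specification allows for DeleteBlob.
    public static void DeleteBlob(IDictionary<string, byte[]> pool, string blobKey)
    {
        if (pool == null)
            throw new SketchBlobStoreException(SketchExceptionCode.PoolNotFound,
                "The specified pool does not exist.");
        if (!pool.ContainsKey(blobKey))
            throw new SketchBlobStoreException(SketchExceptionCode.BlobNotFound,
                "BLOB '" + blobKey + "' was not found in the pool.");
        try
        {
            pool.Remove(blobKey);
        }
        catch (NotSupportedException ex)
        {
            // For example, a read-only collection refuses the removal.
            throw new SketchBlobStoreException(SketchExceptionCode.ConfigurationDoesNotAllowOperation,
                "The store is read-only.", ex);
        }
        catch (Exception ex)
        {
            // Unknown failure: the delete may or may not have happened.
            throw new SketchBlobStoreException(SketchExceptionCode.OperationFailedMaybe,
                "Delete failed: " + ex.Message, ex);
        }
    }
}
```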

Initialization

Constructor()

After a provider class is picked for a store registered with RBS, an object of the provider class is instantiated to use that store. RBS instantiates an object of the provider class by using the empty constructor. Within this constructor, the provider must call the base constructor (base()).

void Initialize(ConfigItemList commonConfiguration, ConfigItemList coreConfiguration, ConfigItemList extendedConfiguration, BlobStoreCredentials[] credentials)

RBS calls this method once on an object of the provider class before using it for any operations.

Configuration information is passed in the form of ConfigItemList objects that contain multiple ConfigItem objects. ConfigItems are explained in the RBS Functional Description. They are essentially (key, value) pairs. There is a pre-defined list of ConfigItems that the RBS client library defines. In addition, providers can define their own ConfigItems that are used for provider-specific configuration.

CommonConfiguration contains configuration information that is understood by the RBS client library. ConfigItems present in this are: StoreMajorVersion, StoreMinorVersion, and StoreLocation. CoreConfiguration and ExtendedConfiguration contain provider-specific configuration items associated with this BLOB store in the RBS database. The core configuration consists of configuration information that is required to access existing BLOBs in the back-end BLOB store. The extended configuration consists of configuration information that is not needed to access existing BLOBs, but is needed for other operations, such as create pool, store BLOB, and so on. Extended configuration information is optional and may not be present. This is because extended configuration information is not included in BLOB Locators, which can be used to access a BLOB. BlobStoreCredentials is optional (it may be null). If specified, the specified credentials should be used to connect to the store.

Providers are encouraged to check validity of the passed configuration items and credentials and build internal structures as part of initialization. They may optionally connect to the store as well.

Allowed exception codes are: AccessDenied, ConfigurationMissing.

void Dispose()

This is the counterpart of Initialize(), described previously. RBS calls it to indicate that internal structures, connections, and so on can be cleaned up. An object is not used after Dispose() is called on it.
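
The initialization and disposal contract might be sketched as follows. This is not the RBS API: plain dictionaries stand in for ConfigItemList, the credentials parameter is omitted, ConfigurationMissingSketchException stands in for a BlobStoreException with the ConfigurationMissing code, and the provider-specific key "RootFolder" is invented for the example. StoreLocation is one of the common items named above.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a BlobStoreException carrying the ConfigurationMissing code.
public class ConfigurationMissingSketchException : Exception
{
    public ConfigurationMissingSketchException(string message) : base(message) { }
}

// Hypothetical provider skeleton showing only the lifecycle methods.
public sealed class FolderProviderSketch : IDisposable
{
    private string _rootFolder;
    private bool _initialized;
    private bool _disposed;

    public string RootFolder { get { return _rootFolder; } }

    public void Initialize(IDictionary<string, string> commonConfiguration,
                           IDictionary<string, string> coreConfiguration,
                           IDictionary<string, string> extendedConfiguration)
    {
        if (_disposed) throw new ObjectDisposedException("FolderProviderSketch");
        if (_initialized) throw new InvalidOperationException("Initialize is called only once per object.");

        // Validate the common configuration supplied by RBS.
        if (commonConfiguration == null || !commonConfiguration.ContainsKey("StoreLocation"))
            throw new ConfigurationMissingSketchException("StoreLocation is missing.");

        // Validate the core configuration. "RootFolder" is a hypothetical
        // provider-specific key used only in this sketch.
        string root;
        if (coreConfiguration == null
            || !coreConfiguration.TryGetValue("RootFolder", out root)
            || string.IsNullOrWhiteSpace(root))
            throw new ConfigurationMissingSketchException("RootFolder is missing.");

        // Extended configuration is optional, so null is accepted here.
        _rootFolder = root;
        _initialized = true;
    }

    // Releases internal state; safe to call more than once.
    public void Dispose()
    {
        if (_disposed) return;
        _rootFolder = null;
        _disposed = true;
    }
}
```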

Pool Operations

Pool operations are operations performed on pools. None of these operations are performed by RBS in parallel (on multiple threads) on the same provider object. For each operation, a list of expected exceptions is specified. If some exception other than those specified is thrown, it indicates either a bug or extraordinary circumstances.

byte[] storePoolId CreatePool(ConfigItemList configuration)

This creates a new pool on the BLOB store. A byte array representing the StorePoolId for that pool is returned.

If the OptimizationSpecifiedIds capability is TRUE, the StorePoolId must be no longer than 16 bytes.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, NoMoreSpace, OperationFailedAuthoritative.

void DeletePool(byte[] StorePoolId)

This deletes an existing pool.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, PoolNotEmpty, OperationFailedAuthoritative, OperationFailedMaybe.

object ResumeObject BeginEnumerateBlobs(byte[] StorePoolId)

This method is called to start enumerating the list of BLOBs in a particular pool. Because the number of BLOBs expected in a pool is very high, paging support is needed: entries are retrieved a few at a time. This method is called to set up any context and internal structures to represent such an enumeration.

The provider is free to create any type of object to store its enumeration state. The provider should then return this object from this method. RBS uses this object in subsequent method calls to enumerate BLOBs.

This method must be implemented even if OptimizationSortedEnumeration is TRUE.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, OperationFailedAuthoritative.

object resumeObject BeginEnumerateBlobs(byte[] storePoolId, byte[] startingStoreBlobId, DateTime createTimeFilterStart, DateTime createTimeFilterEnd)

This method is called by RBS to get a sorted enumeration of BLOBs in a pool. The provider is expected to return a ResumeObject that can be used to enumerate BLOBs in that pool in sorted order of StoreBlobId. The enumeration should return BLOBs belonging to that pool with (StoreBlobId >= StartingStoreBlobId). Comparison of BLOB IDs is a binary comparison of all the bytes of the ID. In addition, all BLOBs returned should have a CreateTime such that (CreateTime >= CreateTimeFilterStart) and (CreateTime <= CreateTimeFilterEnd). If CreateTimeFilterStart or CreateTimeFilterEnd is set to DateTime.MinValue or DateTime.MaxValue respectively, the clause for that parameter should be skipped (that clause is assumed to be satisfied). Both times are specified in UTC.

This method is equivalent to retrieving BLOB entries from a completely sorted list of BLOBs belonging to the specified pool, starting at the lowest entry that satisfies (StoreBlobId >= StartingStoreBlobId). For any two consecutive entries in the returned array B1 and B2, the following conditions hold:

  • B1 < B2
  • CreateTimeFilterStart <= B1 CreateTime <= CreateTimeFilterEnd
  • CreateTimeFilterStart <= B2 CreateTime <= CreateTimeFilterEnd
  • There is no BLOB Bk belonging to the specified pool such that (B1 < Bk < B2) and (CreateTimeFilterStart <= Bk CreateTime <= CreateTimeFilterEnd)

This method must be implemented if OptimizationSortedEnumeration is TRUE.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, OperationFailedAuthoritative, NotImplemented.
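
The ordering and filtering rules above can be sketched over an in-memory list. BlobEntrySketch and SortedEnumerationSketch are hypothetical names for this example only. The byte-by-byte comparison follows the description above; treating a shorter ID as smaller when one ID is a prefix of the other is an assumption, because the specification does not say how IDs of different lengths compare.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a BLOB entry in a pool.
public sealed class BlobEntrySketch
{
    public byte[] StoreBlobId;
    public DateTime CreateTimeUtc;
    public long Length;
}

public static class SortedEnumerationSketch
{
    // Binary comparison: the first differing byte decides (unsigned order).
    // Assumption: if one ID is a prefix of the other, the shorter ID sorts first.
    public static int CompareIds(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i]) return a[i] < b[i] ? -1 : 1;
        }
        return a.Length.CompareTo(b.Length);
    }

    // Returns the entries that a sorted enumeration would yield, in order.
    public static List<BlobEntrySketch> SortedFiltered(
        IEnumerable<BlobEntrySketch> pool,
        byte[] startingStoreBlobId,
        DateTime createTimeFilterStart,
        DateTime createTimeFilterEnd)
    {
        return pool
            .Where(b => CompareIds(b.StoreBlobId, startingStoreBlobId) >= 0)
            // MinValue and MaxValue mean "skip this clause".
            .Where(b => createTimeFilterStart == DateTime.MinValue || b.CreateTimeUtc >= createTimeFilterStart)
            .Where(b => createTimeFilterEnd == DateTime.MaxValue || b.CreateTimeUtc <= createTimeFilterEnd)
            .OrderBy(b => b.StoreBlobId, Comparer<byte[]>.Create(CompareIds))
            .ToList();
    }
}
```

A provider backed by a real store would normally push the filter and the ordering down to the store rather than sorting in memory; the sketch only pins down the expected result.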

BlobInformation[] EnumerateBlobs(object resumeHandle, int maxBlobs)

This method is called by RBS, specifying a ResumeObject (the resumeHandle parameter) that was previously returned by the provider. The provider is expected to return an array of BLOB entries belonging to that pool. The maxBlobs parameter is the maximum number of entries to be returned from this call.

The next time this method is called, the provider should continue enumerating BLOBs in the pool at the point where the current call stops. No BLOB should be returned twice and no BLOB should be missed. Returning fewer than maxBlobs entries indicates that there are no more BLOBs left to enumerate.

BlobInformation includes the StoreBlobId, the CreateTime of the BLOB (this should be the same value that was returned when the BLOB was stored) and Length of the BLOB.

Allowed exception codes are: OperationFailedAuthoritative.

void EndEnumerateBlobs(object resumeHandle)

This method is called to end enumerating BLOBs in a pool. The provider can clean up any internal state related to this enumeration.

Allowed exception codes are: None.
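
The begin, enumerate, and end sequence can be sketched as follows. PagingSketch and its nested ResumeState class are hypothetical; strings stand in for BlobInformation entries, and the pool parameter is omitted. Taking a snapshot of the IDs when enumeration begins is just one simple way to ensure that no entry is returned twice or missed; a real provider would more likely keep a cursor into the store.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical paging sketch over an in-memory list of BLOB IDs.
public sealed class PagingSketch
{
    // Provider-defined enumeration state; RBS only sees it as an object.
    private sealed class ResumeState
    {
        public List<string> Snapshot;
        public int Next;
    }

    private readonly List<string> _blobIds;

    public PagingSketch(IEnumerable<string> blobIds)
    {
        _blobIds = new List<string>(blobIds);
    }

    // Sets up the enumeration state and returns it as an opaque object.
    public object BeginEnumerateBlobs()
    {
        return new ResumeState { Snapshot = new List<string>(_blobIds), Next = 0 };
    }

    // Returns up to maxBlobs entries, continuing where the previous call stopped.
    public string[] EnumerateBlobs(object resumeHandle, int maxBlobs)
    {
        if (maxBlobs < 0) throw new ArgumentOutOfRangeException("maxBlobs");
        var state = (ResumeState)resumeHandle;
        int count = Math.Min(maxBlobs, state.Snapshot.Count - state.Next);
        string[] page = state.Snapshot.GetRange(state.Next, count).ToArray();
        state.Next += count;
        return page;
    }

    // Releases the enumeration state.
    public void EndEnumerateBlobs(object resumeHandle)
    {
        var state = (ResumeState)resumeHandle;
        state.Snapshot = null;
    }
}
```

A caller keeps requesting pages until a page comes back with fewer than maxBlobs entries. When the number of BLOBs is an exact multiple of maxBlobs, the final page is empty.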

BLOB Operations

These are operations that are performed on BLOBs within pools. These operations may be performed by RBS in parallel (on multiple threads) on the same provider object. So, they must be thread-safe. For each operation, a list of expected exceptions is specified. If some exception other than those specified is thrown, it indicates either a bug or extraordinary circumstances.

BlobStoreWriterStream CreateNewBlob(byte[] storePoolId)

This is the “Push” version of storing a BLOB: the provider is expected to return a writable stream, into which RBS or the application writes data that must be stored in the BLOB store. The BLOB should be stored in the specified pool.

BlobStoreWriterStream is inherited from System.IO.Stream and has one additional method: Commit(). When RBS calls Commit() on this object, the provider should commit the BLOB on the back-end BLOB store and return the BlobInformation for the stored BLOB. The stream cannot be used after that.

This method must be implemented even if the OptimizationSpecifiedIds capability is TRUE.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, NoMoreSpace, OperationFailedAuthoritative.

BlobInformation CreateNewBlobFromStream(byte[] storePoolId, Stream inStream)

This is the “Pull” version of storing a BLOB: a stream containing the data to be stored is given. The BLOB should be stored in the specified pool.

The specified stream supports reading (CanRead is TRUE) and supports querying the Length property. No other assumptions (including assumptions related to CanSeek) should be made about this stream object.

This method must be implemented even if the OptimizationSpecifiedIds capability is TRUE.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, NoMoreSpace, OperationFailedAuthoritative.

BlobStoreWriterStream CreateNewBlob(byte[] storePoolId, byte[] storeBlobId)

This is similar to CreateNewBlob, described above, with the addition that here RBS specifies the StoreBlobId instead of the provider generating it.

This method must be implemented if the OptimizationSpecifiedIds capability is TRUE.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, NotImplemented, PoolNotFound, NoMoreSpace, BlobIdAlreadyExists, OperationFailedAuthoritative.

BlobInformation CreateNewBlobFromStream(byte[] storePoolId, Stream inStream, byte[] storeBlobId)

This is similar to CreateNewBlobFromStream, described above, with the addition that here RBS specifies the StoreBlobId instead of the provider generating it.

This method must be implemented if the OptimizationSpecifiedIds capability is TRUE.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, NotImplemented, PoolNotFound, NoMoreSpace, BlobIdAlreadyExists, OperationFailedAuthoritative.

Stream ReadBlob(byte[] storePoolId, byte[] storeBlobId)

This is the “Pull” version of fetching a BLOB: the provider returns a readable stream that contains the BLOB data.

The returned stream object must allow reading and seeking (CanRead and CanSeek are TRUE) and must support querying the Length property (correct length should be returned). It must disallow writing (CanWrite is FALSE).

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, BlobNotFound, OperationFailedAuthoritative.

void ReadBlobIntoStream(byte[] storePoolId, byte[] storeBlobId, Stream outStream)

This is the “Push” version of fetching a BLOB: the provider copies BLOB data into the passed writable stream.

The passed stream supports writing (CanWrite is TRUE). No other assumptions (including assumptions related to CanSeek) should be made about this stream object.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, BlobNotFound, OperationFailedAuthoritative.
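
The stream assumptions for the “Push” read can be illustrated with a forward-only copy loop. ReadIntoStreamSketch is a hypothetical helper, and blobSource stands for whatever readable stream the provider obtains from its store. The loop uses only Read on the source and Write on the destination, so it never relies on CanSeek, Length, or Position of the passed stream.

```csharp
using System;
using System.IO;

// Hypothetical helper for this sketch; not part of the RBS class library.
public static class ReadIntoStreamSketch
{
    // Copies all remaining bytes from blobSource to outStream and
    // returns the number of bytes copied.
    public static long CopyBlob(Stream blobSource, Stream outStream)
    {
        if (blobSource == null) throw new ArgumentNullException("blobSource");
        if (outStream == null) throw new ArgumentNullException("outStream");
        if (!outStream.CanWrite)
            throw new ArgumentException("The destination stream must be writable.", "outStream");

        byte[] buffer = new byte[81920];
        long total = 0;
        int read;
        // Read returns 0 only at the end of the source stream.
        while ((read = blobSource.Read(buffer, 0, buffer.Length)) > 0)
        {
            outStream.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

The same forward-only pattern suits the “Pull” store method, where the input stream guarantees only CanRead and Length.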

void DeleteBlob(byte[] storePoolId, byte[] storeBlobId)

This should delete the specified BLOB.

Allowed exception codes are: AccessDenied, ConfigurationDoesNotAllowOperation, PoolNotFound, BlobNotFound, BlobInUse, OperationFailedAuthoritative, OperationFailedMaybe.

BlobStoreWriterStream Class

BlobStoreWriterStream is inherited from System.IO.Stream and has one additional method: Commit(). The members are briefly outlined in the table below. Important methods are described after the table.

Return Value      Method
----------------  ---------------------------------
                  Write(buffer, offset, count)
Optional          Get CanRead
Optional          Get CanSeek
TRUE              Get CanWrite
                  Get Length
                  Get Position
Optional          Set Position, Seek(), SetLength()
                  Flush()
                  Close()
Optional          Read(buffer, offset, count)
BlobInformation   Commit()

Table 1

void Close()

If this method is called, the BLOB data should be discarded and the BLOB should not be stored in the BLOB store.

BlobInformation Commit()

When this method is called, the provider should ensure the BLOB is stored in the BLOB store, and return the details of the BLOB (StoreBlobId, CreateTime, and Length). This method should do an implicit Close().
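
The Commit() and Close() semantics can be sketched with a stream that buffers writes in memory. WriterStreamSketch is a hypothetical class that derives directly from System.IO.Stream, not from the real BlobStoreWriterStream; a dictionary stands in for the BLOB store, and Commit() returns only the length rather than a BlobInformation object. Buffering the whole BLOB in memory is a simplification for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical write-only stream: data reaches the store only on Commit().
public sealed class WriterStreamSketch : Stream
{
    private readonly IDictionary<string, byte[]> _store; // stand-in for the BLOB store
    private readonly string _blobKey;
    private MemoryStream _pending = new MemoryStream();

    public WriterStreamSketch(IDictionary<string, byte[]> store, string blobKey)
    {
        _store = store;
        _blobKey = blobKey;
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return _pending != null; } }
    public override long Length { get { EnsureOpen(); return _pending.Length; } }
    public override long Position
    {
        get { EnsureOpen(); return _pending.Position; }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        EnsureOpen();
        _pending.Write(buffer, offset, count);
    }

    // Stores the BLOB, then closes the stream implicitly.
    public long Commit()
    {
        EnsureOpen();
        byte[] data = _pending.ToArray();
        _store[_blobKey] = data;
        Close();
        return data.LongLength;
    }

    // Close() ends up here; data that was never committed is discarded.
    protected override void Dispose(bool disposing)
    {
        if (disposing && _pending != null)
        {
            _pending.Dispose();
            _pending = null;
        }
        base.Dispose(disposing);
    }

    private void EnsureOpen()
    {
        if (_pending == null) throw new ObjectDisposedException("WriterStreamSketch");
    }
}
```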

Supporting Objects

These objects are all defined by the RBS client library infrastructure and are used by the provider library. Details on each of these objects are in the RBS class library documentation.

BlobInformation

Return Value         Method
-------------------  --------------------------------------------
BlobInformation      Constructor()
BlobInformation      Constructor(StoreBlobId, CreateTime, Length)
StoreBlobId          Get StoreBlobId
StoreBlobCreateTime  Get StoreBlobCreateTime
BlobLength           Get BlobLength
                     Set StoreBlobId
                     Set StoreBlobCreateTime
                     Set BlobLength

Table 2

BlobStoreCredentials

Return Value          Method
--------------------  -------------------------------
BlobStoreCredentials  Constructor(Credentials)
BlobStoreCredentials  Constructor(Username, Password)
Credentials           Get Credentials
Username              Get Username
Password              Get Password

Table 3

Data Types Used

Friendly Name          C# Type
---------------------  ---------------------
StorePoolID            byte[]
StoreBlobID            byte[]
StoreBlobCreateTime    DateTime
BlobLength             long
InStream               Stream
OutStream              Stream
ConfigItem             ConfigItem
ConfigItemList         ConfigItemList
Config Value           string
Config Key             string
BlobInformation        BlobInformation
BlobStoreWriterStream  BlobStoreWriterStream
ResumeObject           Object

Table 4

Setup

As part of setup for the provider library, Setup must register the DLL and class names to be used by the RBS client library. In addition, configuration information about the provider needs to be registered. This is done through the machine-wide CLR XML configuration file.

The different pieces of information needed are described below. Helper classes present in the RBS client library can be used to set this configuration during setup. See the sample provider in the RBS SDK for examples of how to use these helper classes to specify the XML elements.

BlobStoreType

Type: string.

This is a Unicode string of up to 128 characters. This uniquely identifies the type of this provider. This is the same string that is used by applications and DBAs in the BlobStoreType field when configuring RBS BLOB stores for a database. Examples are “EMC Centera”, “Microsoft SRS”. Provider writers are encouraged to start the type with the name of the company so as to avoid collisions with other provider writers.

DllFile

Type: string.

This specifies the path to locate the assembly in which the provider class is present.

ClassName

Type: string.

This specifies the name of the class implementing the BlobStore abstract class within the specified assembly.

ProviderVersion

Type: string.

These fields indicate the version number for this provider class. The provider writer is free to pick any non-negative values for these fields. It is expected that these numbers increase over a period of time as new versions are released.

MinSupportedBackendStoreVersion

Type: string.

These fields indicate the minimum version number of the backend BLOB store that is supported by this provider library.

ImplementedCommonBlobStoreSpecificationVersion

Type: string.

These fields indicate the version number of the RBS specification (RBS client library and BlobStore abstract class) that is implemented by this provider library. This means that the provider understands and complies with all the requirements of the specified version of RBS specification.

This property is not used currently, but may be used in the future. Providers are required to set this correctly.

ProviderSpecificConfigKey

This describes ConfigItems that are specific to this provider. Provider-specific configuration items can be used to store configuration information about the back-end BLOB store. This configuration is passed to the provider in the Initialize method.

Multiple instances of this element are allowed. One such element needs to be specified for each ConfigItem key that the provider class understands (only provider-specific keys, not common keys defined by RBS). It has the following fields:

name

Type: string

Key name of the provider-specific configuration item.

format

Type: string

The format of this configuration item. It must be one of: Name, Boolean, Number, Binary, Duration.

Provider/Store Version Picking Algorithm

RBS uses standard four-part version numbers, that is, w.x.y.z where each of the terms is progressively decreasing in significance.

The RBS client library uses the above set of version numbers to determine which provider libraries to use with which back-end BLOB stores. The algorithm used is described below.

  1. Build a list of provider libraries available for each BlobStoreType. Current_RbsVersion is the version of this RBS client library. Load all the provider libraries available and for each provider class:
    • Add this provider class with version {ProviderVersion} to the list of providers available for type {BlobStoreType}. Maintain the list in sorted order: descending order of {ProviderVersion}.
  2. For a BLOB store that is registered as an RBS BLOB store in the database, find a suitable provider class. BackendStoreVersion is the version of the backend BLOB store as specified in the database.
    • Find the list of providers available for this {BlobStoreType}. Process each entry in the list in order (highest version first):
      • If (MinSupportedBackendStoreVersion > BackendStoreVersion) skip to the next entry. The store is too old for this provider class.
      • Else pick this provider class for this store.
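
The selection steps above can be sketched with System.Version, which parses and compares four-part version numbers. ProviderInfoSketch and ProviderPickerSketch are hypothetical names for this example; the actual RBS client library implementation is not shown in this specification. Comparing BlobStoreType with an ordinal, case-sensitive string comparison is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of the registration data for one provider class.
public sealed class ProviderInfoSketch
{
    public string BlobStoreType;
    public string ClassName;
    public Version ProviderVersion;
    public Version MinSupportedBackendStoreVersion;
}

public static class ProviderPickerSketch
{
    // Returns the highest-versioned provider of the given type whose minimum
    // supported back-end version does not exceed the store's version,
    // or null if no provider qualifies.
    public static ProviderInfoSketch Pick(
        IEnumerable<ProviderInfoSketch> providers,
        string blobStoreType,
        Version backendStoreVersion)
    {
        return providers
            .Where(p => p.BlobStoreType == blobStoreType)
            // Highest provider version first.
            .OrderByDescending(p => p.ProviderVersion)
            // Skip providers for which the store is too old.
            .FirstOrDefault(p => p.MinSupportedBackendStoreVersion <= backendStoreVersion);
    }
}
```

For example, with two providers of the same type at versions 1.0.0.0 (minimum store version 1.0.0.0) and 2.0.0.0 (minimum store version 3.0.0.0), a store at version 2.0.0.0 is served by the 1.0.0.0 provider, and a store at version 3.0.0.0 is served by the 2.0.0.0 provider.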

Conclusion

This specification should be used to guide the development of provider plug-in libraries for the Remote BLOB Store feature of SQL Server 2008.

For more information:

https://www.microsoft.com/sqlserver/: SQL Server Web site

https://technet.microsoft.com/en-us/sqlserver/: SQL Server TechCenter

https://msdn.microsoft.com/en-us/sqlserver/: SQL Server DevCenter 
