How Should We Store Images And Blob Files In SQL Server

Vishal Pawar
8y
11.5k
0
4

Article

There is always a requirement from developers - How should I store images and BLOB files on SQL Server?

In this post, let's explore the data types and methods used for storing Binary Large Objects also know as BLOBs, such as images and sounds, into SQL Server. The facility is already there in SQL 2000 and SQL 2005 but here, we are digging into SQL 2008 too.

Let's first understand what BLOBs are.

To start with, we’ll compare two types of files, ASCII and Binary. Most of the values stored in SQL Server consist of ASCII (American Standard Code for Information Interchange) characters. An overly simplified explanation of ASCII characters would be letters, numbers, and symbols found on the keyboard. A file containing only ASCII characters can be modified by a text editor such as Notepad without consequence. Binary files, however, contain both ASCII characters and special control characters and byte combinations that are not found on the keyboard. An MP3 music file would be binary. Opening an MP3 inside Notepad and removing characters in an attempt to make the song shorter would result in the file being corrupted and not playable because Notepad is limited to ASCII characters and cannot correctly interpret or create binary bits. Other examples of binary data include images and EXE compiled programs. BLOBs, then, are binary files that are large, or Binary Large Objects (BLOB).

Now, let's see why one needs to store BOLBs in SQL Server.

There are justified reasons both for and against storing binary objects inside SQL Server. We’ll look at both sides. As a real world example, we’ll consider a typical sales organization. There are usually product lines or families of products being sold. A level below the product line would be the individual or discreet parts; we’ll call them widgets. Each widget has the standard inventory columns, such as price, cost, quantity on hand, vendor, etc. In addition, many may have sales literature or brochures describing the widget. Often, these brochures are electronic such as PDF, Power Point, or some type of image. One way of dealing with these electronic documents would be just to throw them up on a File Server and create a directory for each widget. This will work, until customers or employees want an application they enter search parameters into and receive back the sales brochures that match. For example, “show me all documents for blue widgets that sell for less than $100”. At this point, a database tied to an application will usually be involved. Therefore, for this series of articles, we’ll create a Visual Studio application that connects to SQL Server to retrieve widget sales brochures.

File Storage Locations

One of the early questions is where to store the electronic brochures. Either the application could store the file system path information leading to the document, such as “d:\sales doc\widgeta-picture.jpg”, inside a varchar column, leaving the actual document on the file system, or we could place the actual JPG file inside a binary or image column. A few key questions will help determine the best option.

Performance
Are these binary objects performance hungry, such as a streaming video? If so, the file system may perform better than trying to stream the binary out of SQL Server.
Size
Is the binary object to be retrieved large? Large is defined as over 1 MB in size. If the object is large, the file system will typically be more efficient at presenting, or reading the object than SQL Server. If the binaries are small, say little images of each widget, then storing them inside SQL server will be more than adequate.
Security
Is access to the binaries a high-security concern? If the objects are stored in SQL Server, then security can be managed through the usual database access methods. If the files are stored on the file system, then alternative security methods will need to be in place.
Client Access
How will the client access the database, ODBC, Native SQL Driver? For large streaming video, a client such as ODBC may time out or fail.
Fragmentation
If the binaries will be frequently modified and are large, the file system may handle fragmentation better than SQL Server.
Transactions
Do you need transactional control? If so, then SQL has a built in solution.

For an in-depth discussion on database vs. file system storage for Blobs, as well as where the previous 1MB size reference came from, see the Microsoft article - To BLOB or Not to BLOB, located at http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-2006-45

Data Types

For this first example, we’ll create an application that will store images of each product. Because these files are small, we’ll opt to store them in SQL Server. In SQL 2000, there were two different families of data type options for these type of files, binary, and image. The Binary family includes three different data types. The Binary data type itself, which requires a fixed size. For this first example, because our images vary in size, we’ll use the varbinary data type, the “var” standing for a variable.

The varbinary data type has a maximum length of 8,000 bytes. Starting in SQL 2005, “varbinary(max)” was included in the binary data type family. The keyword MAX indicates the size is unlimited. If the SQL version is before 2005 and the file size is greater than 8,000, then the Image data type can be used. It’s a variable size type with a maximum file size of 2GB. Although the Image data type is included in SQL 2005 and SQL 2008, it shouldn’t be used. Microsoft says it’s there for backward compatibility and will be dropped at some point in the future. Therefore, this example will use the Binary type, the three versions of which are recapped below,

Binary - Fixed size up to 8,000 bytes.
VarBinary(n) - Variable size up to 8,000 bytes (n specifies the max size).
VarBianry(max) - Variable size, no maximum limit.

A few reasons to store images or binary data inside SQL Server include security, client access, and transactional control. This article will focus on the varBinary(MAX) data type. It is available in SQL 2005 and SQL 2008. The (MAX) extension to the Binary type means there is no upper size limit. The “var” means the size is variable rather than fixed as in the case of the standard Binary data type. SQL BOL (Books On Line) gives a good example of the three binary types:

Use Binary when the sizes of the column data entries are consistent (and less than 8,000 bytes).

Use varBinary when the sizes of the column data entries vary considerably (and are less than 8,000 bytes).

Use varBinary(max) when the column data entries exceed 8,000 bytes.

For SQL 2000, use the Image data type. However, be aware that Microsoft has stated the Image data type is for backward compatibility only and may be discontinued on a future version.

Example Table

To begin with, we’ll create a test database and table to hold our images. Use the following TSQL statement.

USE master;
GO
CREATE DATABASE Test;
GO
USE Test;
GO
CREATE TABLE BLOBTest(TestID int IDENTITY(1, 1), BLOBName varChar(50), BLOBData varBinary(MAX));

In this example, the column name is BLOBData, but the column name can be any standard SQL name. Binary data types do not have any special naming restrictions.

If you’re running SQL 2005 or SQL 2008, test the database inserting an image using a TSQL statement. This statement will not work in SQL 2000 because only a (MAX) data type can be the target. Find a small image then execute the following statement:

INSERT INTO BLOBTest(BLOBName, BLOBData)
SELECT‘ First test file’,
BulkColumn FROM OPENROWSET(Bulk‘ C: \temp\ nextup.jpg’, SINGLE_BLOB) AS BLOB

Substitute the ‘C:\temp\nextup.jpg’ for the file system path to your file. The OPENROWSET statement allows SQL to access data from an external provider. Bulk is a special provider for OPENROWSET created for inserting documents and images. For additional details, see BOL “Importing Large Objects by using the OPENROWSET Bulk Rowset Provider”. A Select of the table should produce one record as shown below.

SELECT * FROM BLOBTest

SQL Server

The SELECT statement will verify data has been inserted, but there isn’t a way in SQL to view the image. For that, we’ll create a small Visual Studio application.

Example Table

To begin with, we’ll create a test database and table to hold our images. Use the following TSQL statement.

USE master;
GO
CREATE DATABASE Test;
GO
USE Test;
GO
CREATE TABLE BLOBTest(TestID int IDENTITY(1, 1), BLOBName varChar(50), BLOBData varBinary(MAX));

In this example, the column name is BLOBData, but the column name can be any standard SQL name. Binary data types do not have any special naming restrictions.

INSERT INTO BLOBTest(BLOBName, BLOBData)
SELECT‘ First test file’,
BulkColumn FROM OPENROWSET(Bulk‘ C: \temp\ nextup.jpg’, SINGLE_BLOB) AS BLOB

SELECT * FROM BLOBTest

SQL Server

The SELECT statement will verify data has been inserted, but there isn’t a way in SQL to view the image. For that, we’ll create a small Visual Studio application.

Several different types of VARBINARIES were introduced, including the VARBINARY(MAX) data type which will be the focus of this article. The first code example loaded an image into the database then retrieved it using an ASPX page with a Binary Write. In this article, a web-based application will be created for inserting images into SQL Server. In addition, the examples from the first two articles will be expanded and improved.

SQL Test Database

To begin, create a test database and table for storing images using the following T-SQL script.

USE master;
GO
CREATE DATABASE BLOBTest3;
GO
USE BLOBTest3;
GO
CREATE TABLE BLOBFromWeb(BLOBData varbinary(max));
GO

The script created a single database, containing a single table, containing a single column. The test column “BLOBData” uses the MAX keyword allowing it to store binaries of any size.

Next, create a stored procedure that will be used by a web page for uploading images.

CREATE PROCEDURE WebUp(@FileData varbinary(max))
AS
INSERT INTO BLOBFromWeb(BLOBData)
VALUES(@FileData);

The stored procedure will pass one variable, the image “@FileData”, into SQL Server.

The rest of the examples in this series will use stored procedures rather than SQL statements. There are many benefits to using stored procedures, such as increased security, preventing SQL Injection attacks, portability, and performance.

Inserting into SQL from the Web

This example uses Microsoft Visual Studio 2008 to create a web form, which will ask the end user to browse to a file, and then upload the file into SQL Server. The application will also work in Visual Studio 2005. To begin, create a new web site and an .aspx page using code behind. Drag a FileUpload and a Button control onto the form as shown below.

Switch to the source view of the page and change the default <form> tag to the following

<form id="form1" runat="server" enctype="multipart/form-data">

The “enctype” specifies the how the form data is encoded.

Change back to the design view and double click the button to create an On Click event. This will bring up the code behind page as shown below,

SQL Server

At the top of the page, add using statements for the SQL,

using System.Data.Sql;
using System.Data.SqlClient;

And also a statement for the file system

using System.IO;

Add the following code into Button1_Click event

string sConn = @ "server=.; database=BLOBTest3; Integrated Security=True";
SqlConnection objConn = new SqlConnection(sConn);
objConn.Open();
SqlCommand objCmd = new SqlCommand("WebUp", objConn);
objCmd.CommandType = CommandType.StoredProcedure;
SqlParameter paramFileData = objCmd.Parameters.Add("@FileDatalDbType.VarBinary);
paramFileData.Direction = ParameterDirection.Input;
byte[] bImage = new byte[FileUpload1.PostedFile.ContentLength];
Stream objStream = FileUpload1.PostedFile.InputStream; objStream.Read(bImage, 0, FileUpload1.PostedFile.ContentLength); paramFileData.Value = bImage; objCmd.ExecuteNonQuery(); objConn.Close();

The first line in the statement sets the connection to SQL Server. The phrase “Integrated Security” means Windows security will be used rather than a SQL Login ID. Next the connection to the database is opened. After the database is opened, the next two lines create a SQL Command object.

SqlCommand objCmd = new SqlCommand("WebUp", objConn);
objCmd.CommandType = CommandType.StoredProcedure;

WebUp is the name of the stored procedure created earlier. It’s tied to the open database connection, then the SQL Command is told WebUp is a stored procedure rather than a TSQL text statement.

The stored procedure requires a single parameter, the image, to be passed in. In the next line, a parameter is created. The name “paramFileData” can be anything, there is nothing special about it, but the Parameters.Add, “@FileData” must match the parameter name in the stored procedure.

SqlParameter paramFileData = objCmd.Parameters.Add("@FileDatalDbType.VarBinary);

Now, with the paramater created, the direction is specified

paramFileData.Direction = ParameterDirection.Input;

Any data being passed into SQL is a command type of Input, data coming out would be ParameterDirection.Output.

Ideally, at this point, the FileUpload control could pass the image directly into the stored procedure parameter, but this doesn’t work. Instead, a byte array is created with the size of the image,

byte[] bImage = new byte[FileUpload1.PostedFile.ContentLength];

Next a Stream is created pointing to the image content:

Stream objStream = FileUpload1.PostedFile.InputStream;

Lastly the Stream transferes the image into the byte array:

objStream.Read(bImage, 0, FileUpload1.PostedFile.ContentLength);

Now the image data can be passed into the SQL parameter and executed.

paramFileData.Value = bImage;
objCmd.ExecuteNonQuery();

Here is a link for the complete web application, both the design page and the code behind.

Running the Application

View the web page inside a browser and click the browse button. A Windows file chooser will appear as shown below,

Once a file is selected and the final Button is clicked, the SQL Stored procedure will be executed. Viewing the results inside SQL Server will show there is data, but not what it looks like.

Because there isn’t a BLOB viewer as part of SQL Server, we’ll create a Viewer similar to the one in the previous article, but now, it is stored procedure driven.

Viewer

Create a simple stored procure that will select the image data using the TSQL code below

CREATE PROCEDURE BLOBViewer AS SELECT BLOBData FROM BLOBFromWeb

Next create a aspx page with code behind. On the code behind page add using statements for SQL Server

using System.Data.Sql;
using System.Data.SqlClient;

In the Page_load section, use the following code to execute the stored procedure just created.

string sConn = @ "server=.; database=BLOBTest3; Integrated Security=True";
SqlConnection objConn = new SqlConnection(sConn);
objConn.Open();
SqlCommand objCmd = new SqlCommand("BLOBViewer", objConn);
objCmd.CommandType = CommandType.StoredProcedure;
SqlDataReader dr = objCmd.ExecuteReader();
dr.Read();
Response.BinaryWrite((byte[]) dr["BLOBData"]);
objConn.Close();

The web viewer code used is very similar to the previous article. The CommandType has been changed to Stored Procedure, and the SQLCommand now uses the name of the procedure rather than a sql string to execute.

BLOBs files are binary data, in other words, not text. Files like an MP3, exe, or pictures are all examples of BLOBs (Binary Large Objects). Often, these types of files may be part of a larger database project. The question usually arises as to how and where to store them, on the File System as files, or inside SQL Server. There are legitimate reasons to justify both. As a very general guide, if the files are small, store them inside SQL server. Another good reason to store binary data inside SQL Server is to obtain the benefit of transactional control. For example, if you need to be sure that an image has been updated or deleted before some other processing step begins, then SQL Server has that control built in. On the other hand, large files, or streaming video, will perform better being served from the Windows file system rather than inside SQL Server. Also, the file system will handle fragmentation better than SQL Server.

Now we will focus on storing BLOBs on the Windows File System and using Microsoft SQL Server to organize them from a web page.

File System Example

To begin, we’ll build on a control used in the previous examples, the File Upload control. This control will live on a web page and will gather file information, such as file name and directory path to a test image file, and eventually, this information will be passed to SQL Server. To being, create a new page called FileSystemIn.aspx with code behind and drag four controls on to it, a File Upload, a button, and two labels as shown below.

We’ll use the default control names for ease of explanations.

<asp:FileUpload ID="FileUpload1" runat="server" /> <br /> <br />
<asp:Button ID="Button1" runat="server" onclick="Button1_Click" Text="Button" /> <br /> <br />
<asp:Label ID="Label1" runat="server" Text="Label"></asp:Label> <br />
<asp:Label ID="Label2" runat="server" Text="Label"></asp:Label>

On the code behind page, we’ll gather the file name and directory path and display them in the labels. Create a Click event and insert the following code

protected void Button1_Click(object sender, EventArgs e)
{
Label1.Text = System.IO.Path.GetFileName(FileUpload1.PostedFile.FileName.ToString());
Label2.Text = ystem.IO.Path.GetDirectoryName(FileUpload1.PostedFile.FileName.ToString());
}

When the button is clicked, the “PostedFile”, (the file browsed to by the File Upload control) will be displayed.

We can also pass this data to SQL Server though a stored procedure so the database can become our image organizer. First, create a test database to hold the file information from the TSQL below

USE master;
GO
CREATE DATABASE BLOBTest4;
GO
USE BLOBTest4;
GO
CREATE TABLE FileInfo(TheName varchar(50), DirPath varchar(50));

Now, we’ll create a stored procedure the web page will use to INSERT

CREATE PROCEDURE FileSystemIn(@TheName varchar(50), @DirPath varchar(50))
AS
INSERT INTO FileInfo(TheName, DirPath)
VALUES(@TheName, @DirPath);

Next, we’ll replace the web page code behind On Click code with a stored procedure. So now, the file information will be saved inside SQL Server rather than displayed on a label.

string sTheName, sDirPath;
sTheName = System.IO.Path.GetFileName(FileUpload1.PostedFile.FileName.ToString());
sDirPath = System.IO.Path.GetDirectoryName(FileUpload1.PostedFile.FileName.ToString());
string sConn = @ "server=.; database=BLOBTest4; Integrated Security=True";
SqlConnection objConn = new SqlConnection(sConn);
objConn.Open();
SqlCommand objCmd = new SqlCommand("FileSystemIn", objConn);
objCmd.CommandType = CommandType.StoredProcedure;
SqlParameter pTheName = objCmd.Parameters.Add("@TheNamelDbType.VarChar, 50);
pTheName.Direction = ParameterDirection.Input; pTheName.Value = sTheName; SqlParameter pDirPath = objCmd.Parameters.Add("DirPathlDbType.VarChar, 50);
pDirPath.Direction = ParameterDirection.Input; pDirPath.Value = sDirPath; objCmd.ExecuteNonQuery(); objConn.Close();

Test the web page, SELECT on the FileInfo table; one row will be returned as shown below.

The same SQL Connection and Command objects were used there as well. From here, a web page can be created to view the images by getting the path information from SQL Server.

SQL Server 2008 FILESTREAM

There are a couple of problems with the type of implementation demonstrated. First, if files are deleted or added by some method other than the web application, SQL Server will be unaware, and thus out of sync with the file system. The second problem is the nightly backup. Now you’ll need to backup the directory paths where the files are stored as well as backing up the SQL Server database. In addition, security to these files is now outside of SQL Server controls.

A way to overcome these problems in SQL Server 2008 is to use the new FILESTREAM option for a VARBINARY(MAX) column. FILESTREAMS are physically stored on WINDOWS NTFS, just like a standard MDF or LDF SQL Server files, but they are specifically created for storing binary data. The CREATE DATABASE statement is used to create a special FILEGROUP and mark it as a stream.

Once the database is created, a column inside a table can be designated as a type “VARBINARY(MAX) FILESTREAM”. BLOBs stored inside the FILESTREAM are not accessible from the file system. You can’t open Windows File explorer and have access to them, meaning security is handled by SQL Server. In addition, the images can be manipulated with standard INSERT, UPDATE, and DELETE statements. So for large BLOBs or BLOBs with high disk activity like streaming video, SQL Server now has a viable option for handling this type of data.

Summary

Binary data can be stored in SQL and retrieved by a web application. In the next article, a web application will be created for inserting images into SQL. In addition, the examples in this article will be expanded to include passing the file name of the picture back to the application, passing in a specific image name to retrieve, and converting these applications to use stored procedures and code behind pages.
SQL Server can be a handy container for image and BLOB data. Working with binary data is somewhat different from working with ASCII. The data saved inside SQL Server isn’t visible like character data. In addition, special handling of the data, such as with a byte array, is required before sending the data to SQL Server. Overall, however, the working with BLOBS and SQL Server is very worthwhile.
SQL Server has several options for managing BLOBs or binary data. VARBINARY (MAX) in SQL 2005, IMAGE data types in older versions, and the new FILESTREAM option in SQL Server 2008. Working with them requires a little additional effort compared to standard data types, but the methods and objects used are straightforward to use. Remember that error checking wasn’t included in these examples, so make sure to check the File Upload for a data before sending it to SQL Server (in case the end user clicks the button before selecting a file).