Splitting PDF File In C# Using iTextSharp

Vivek Kumar

This article uses the iTextSharp library. It is an open source library for creating, adapting, inspecting, and maintaining documents in the Portable Document Format (PDF).

For an introduction to working with PDF files using the iTextSharp library, please refer to the article linked below.

Generating PDF File Using C#

Sometimes we need to split the pages from one PDF file into multiple PDF files.

This article walks through a sample that splits a PDF file. The sample is a console application, but the same code can be used in ASP.NET, Web API, and so on, as per our requirement.

We have to follow a few simple steps to split the pages of one PDF file and save them into multiple PDF files.

Step 1

Install iTextSharp through Manage NuGet Packages in Visual Studio.

Alternatively, we can install it from the Package Manager Console with the command given below.

Install-Package iTextSharp

Step 2

Now, add the three namespaces given below at the top of the .cs file.

using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;

Step 3

Write the code in the Program class to extract the pages from one PDF and save them into multiple PDF files.

class Program
{
    static void Main(string[] args)
    {
        string pdfFilePath = @"C:\PdfFiles\sample.pdf";
        string outputPath = @"C:\SplitedPdfFiles";
        int interval = 10;
        int pageNameSuffix = 0;

        // Initialize a new PdfReader instance with the contents of the source PDF file.
        // The using block releases the file once the loop has finished.
        using (PdfReader reader = new PdfReader(pdfFilePath))
        {
            FileInfo file = new FileInfo(pdfFilePath);
            string pdfFileName = file.Name.Substring(0, file.Name.LastIndexOf(".")) + "-";
            Program obj = new Program();

            // Each iteration starts at the first page of the next output file.
            for (int pageNumber = 1; pageNumber <= reader.NumberOfPages; pageNumber += interval)
            {
                pageNameSuffix++;
                string newPdfFileName = string.Format(pdfFileName + "{0}", pageNameSuffix);
                obj.SplitAndSaveInterval(pdfFilePath, outputPath, pageNumber, interval, newPdfFileName);
            }
        }
    }

    private void SplitAndSaveInterval(string pdfFilePath, string outputPath, int startPage, int interval, string pdfFileName)
    {
        using (PdfReader reader = new PdfReader(pdfFilePath))
        using (FileStream stream = new FileStream(outputPath + "\\" + pdfFileName + ".pdf", FileMode.Create))
        {
            Document document = new Document();
            PdfCopy copy = new PdfCopy(document, stream);
            document.Open();

            // Copy up to 'interval' pages, stopping early when the source runs out of pages.
            for (int pagenumber = startPage; pagenumber < (startPage + interval); pagenumber++)
            {
                if (reader.NumberOfPages >= pagenumber)
                {
                    copy.AddPage(copy.GetImportedPage(reader, pagenumber));
                }
                else
                {
                    break;
                }
            }

            // Closing the document finishes writing the new PDF file.
            document.Close();
        }
    }
}

In the code given above, we are using the PdfReader, FileInfo, Document and PdfCopy classes.

Now, I am going to explain the code written above.

Here, the pdfFilePath variable is the location of the source PDF file and the outputPath variable is the folder where the new PDF files are saved.

string pdfFilePath = @"C:\PdfFiles\sample.pdf";
string outputPath = @"C:\SplitedPdfFiles";

The interval variable is the number of pages to copy into each new PDF file. Every new file has this many pages, except possibly the last one, which gets whatever pages remain.

In my example, sample.pdf has 102 pages and the interval variable is 10, so the first 10 PDF files will contain 10 pages each and the last (eleventh) PDF file will contain the remaining 2 pages.
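
The arithmetic behind this can be checked without touching any PDF. The following is a minimal sketch and not part of the original sample: the PageRanges class and its GetRanges method are hypothetical names, and the code only computes the first and last page of each output file for a given page count and interval.

```csharp
using System;
using System.Collections.Generic;

static class PageRanges
{
    // Returns the 1-based (first, last) page pair for each output file.
    public static List<Tuple<int, int>> GetRanges(int totalPages, int interval)
    {
        // A zero or negative interval would make the loop below run forever.
        if (interval <= 0)
            throw new ArgumentOutOfRangeException("interval", "interval must be positive");

        var ranges = new List<Tuple<int, int>>();
        for (int first = 1; first <= totalPages; first += interval)
        {
            // The last file may be shorter than the interval.
            int last = Math.Min(first + interval - 1, totalPages);
            ranges.Add(Tuple.Create(first, last));
        }
        return ranges;
    }

    static void Main()
    {
        // 102 pages with an interval of 10, as in the article's example.
        foreach (var range in GetRanges(102, 10))
            Console.WriteLine("pages {0}-{1}", range.Item1, range.Item2);
    }
}
```

For 102 pages and an interval of 10, this yields 11 ranges, the last of which covers pages 101 to 102.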

We are using the pageNameSuffix variable to append a sequence number to the original PDF name, giving sample-1.pdf, sample-2.pdf and so on.

int interval = 10;
int pageNameSuffix = 0;
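
The naming scheme can also be sketched on its own. This is an illustrative variant, not the article's code: the SplitNames class and BuildOutputPath method are hypothetical, and the sketch uses Path.GetFileNameWithoutExtension and Path.Combine from System.IO in place of the Substring call and the manual "\\" concatenation used in the sample.

```csharp
using System;
using System.IO;

static class SplitNames
{
    // Builds "<outputPath><separator><source name>-<sequence>.pdf".
    public static string BuildOutputPath(string pdfFilePath, string outputPath, int sequence)
    {
        // Strips the directory and the extension, e.g. "sample.pdf" becomes "sample".
        string baseName = Path.GetFileNameWithoutExtension(pdfFilePath);
        return Path.Combine(outputPath, baseName + "-" + sequence + ".pdf");
    }

    static void Main()
    {
        Console.WriteLine(BuildOutputPath(@"C:\PdfFiles\sample.pdf", @"C:\SplitedPdfFiles", 1));
    }
}
```

On Windows, the call in Main produces C:\SplitedPdfFiles\sample-1.pdf. A source file without an extension is also handled, which the Substring and LastIndexOf combination would not do.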

Now, the PdfReader instance holds the content of the source PDF file, and we can get the number of pages from it through reader.NumberOfPages. The for loop increments pageNumber by the interval value, so each iteration starts at the first page of the next output file, as given below.

// Initialize a new PdfReader instance with the contents of the source PDF file.
// The using block releases the file once the loop has finished.
using (PdfReader reader = new PdfReader(pdfFilePath))
{
    FileInfo file = new FileInfo(pdfFilePath);
    string pdfFileName = file.Name.Substring(0, file.Name.LastIndexOf(".")) + "-";
    Program obj = new Program();

    // Each iteration starts at the first page of the next output file.
    for (int pageNumber = 1; pageNumber <= reader.NumberOfPages; pageNumber += interval)
    {
        pageNameSuffix++;
        string newPdfFileName = string.Format(pdfFileName + "{0}", pageNameSuffix);
        obj.SplitAndSaveInterval(pdfFilePath, outputPath, pageNumber, interval, newPdfFileName);
    }
}

Here, we are using the SplitAndSaveInterval method for all the PDF operations. It's always better to keep such work in a separate method for separation of concerns.

The method creates the new PDF file through the parameterized constructor of the PdfCopy class, imports each page from the original PDF with the GetImportedPage method, and adds it to the new PDF file with the AddPage method.

private void SplitAndSaveInterval(string pdfFilePath, string outputPath, int startPage, int interval, string pdfFileName)
{
    using (PdfReader reader = new PdfReader(pdfFilePath))
    using (FileStream stream = new FileStream(outputPath + "\\" + pdfFileName + ".pdf", FileMode.Create))
    {
        Document document = new Document();
        PdfCopy copy = new PdfCopy(document, stream);
        document.Open();

        // Copy up to 'interval' pages, stopping early when the source runs out of pages.
        for (int pagenumber = startPage; pagenumber < (startPage + interval); pagenumber++)
        {
            if (reader.NumberOfPages >= pagenumber)
            {
                copy.AddPage(copy.GetImportedPage(reader, pagenumber));
            }
            else
            {
                break;
            }
        }

        // Closing the document finishes writing the new PDF file.
        document.Close();
    }
}
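
One way to check the result is to add up the page counts of the new files and compare the total with the source. The sketch below is an illustration under stated assumptions rather than part of the sample: SplitCheck and CountPages are hypothetical names, and it relies only on the PdfReader constructor and NumberOfPages property already used above, plus Directory.GetFiles from System.IO.

```csharp
using System;
using System.IO;
using iTextSharp.text.pdf;

static class SplitCheck
{
    // Sums the page counts of every PDF in outputPath whose file name starts with prefix.
    public static int CountPages(string outputPath, string prefix)
    {
        int total = 0;
        foreach (string path in Directory.GetFiles(outputPath, prefix + "*.pdf"))
        {
            // Open each split file just long enough to read its page count.
            using (PdfReader reader = new PdfReader(path))
            {
                total += reader.NumberOfPages;
            }
        }
        return total;
    }
}
```

With the article's paths, CountPages(@"C:\SplitedPdfFiles", "sample-") should return the same number as NumberOfPages on the source file, provided the output folder holds no leftover files from an earlier run with a different interval.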

The screenshot given below shows the PDF files newly created from the sample.pdf file.

In this example, I explained how to split a PDF file and save it into multiple PDF files in C#, using iTextSharp.

Download the attachment for the source code of the sample Application.