
Extract Text from PDF in C#

Cien S


15y
8.4k
1
Hi,

I want to extract text from a PDF in C# ASP.NET. I am using the code from the following link:

Link: http://www.codeproject.com/KB/cs/PDFToText.aspx

But this code is not working properly. The main problem is that the output file does not contain the content that is in the input file.

Is there any way to resolve this problem?

Please help me.

Thanks in advance.

Pankaj

Answers (4)
0
Jony Green


NA 100 0 9y
You can try PDFLib; it's an open source library that can extract text from PDF pages. Before you start coding, you can test it with this free online PDF text extractor (http://www.online-code.net/pdf-to-word.html), which uses PDFLib.
0
Anna Harris


NA 34 0 11y
Using iTextSharp, you can easily read the text from a PDF file in ASP.NET.
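For example, here is a minimal sketch of page-by-page text extraction. It assumes iTextSharp 5.x is referenced in the project; the class name PdfTextDemo, the method name ExtractText, and the file name "In.PDF" are placeholders for illustration, not part of any posted code.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class; assumes the iTextSharp 5.x assembly is referenced.
public static class PdfTextDemo
{
    // Returns the text of every page, one page after another.
    public static string ExtractText(string pdfPath)
    {
        var text = new StringBuilder();
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            // iTextSharp page numbers start at 1, not 0.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string pageText = PdfTextExtractor.GetTextFromPage(
                    reader, page, new SimpleTextExtractionStrategy());
                text.AppendLine(pageText);
            }
        }
        finally
        {
            // Release the file handle even if extraction throws.
            reader.Close();
        }
        return text.ToString();
    }

    public static void Main()
    {
        // "In.PDF" is a placeholder path.
        Console.WriteLine(ExtractText("In.PDF"));
    }
}
```

If this returns empty strings, the PDF may contain only scanned images with no text layer, in which case no text extractor will find anything and OCR would be needed instead.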
0
Dorothea HANKS


NA 4 0 11y
I happen to be working on PDF processing and finished my PDF programming a week ago. For your problem, I can share my source code for extracting text from a PDF in C# with you; I hope it gives you some useful help.
0
Mike Gold


NA 32k 21.3m 15y

There is a C# library called SharpPDF, and I think another one called iTextSharp. iTextSharp can be used to read a PDF file:
 
PdfReader reader = new PdfReader("In.PDF");
 
From there you can extract the images. I'm not sure how to do the text, but this thread should point you in the right direction: http://www.vbforums.com/showthread.php?t=530736
 
-Mike