Read data from PDF files using C#

Pankaj Nagarsekar

12y
11.1k
1
I have five PDFs containing around 38,000 objective (multiple-choice) questions. I want to build an application that imports these questions, saves them to a database, and then gives the user an interface for choosing a question along with its four options. I used iTextSharp to read the PDFs, both as chunks and line by line. The extracted content is scattered, and I cannot find a sequence I could use to split or distinguish the question from its four options. Is there a better way to import data from PDFs? The content in the PDFs is in tabular format.
Please see the PDF file (s8.postimage.org/owm0hsej9/Qbank.jpg)
and the resulting string in the window (s13.postimage.org/4toy70lqf/Resulting_String.jpg).
Answers (3)
0
kianu rieves

NA 192 0 12y
Usually you do not read a PDF line by line. You read the document first (PdfReader in the Java version of iText), and then you can access the elements of a page through high-level objects such as Chunk, Phrase, Paragraph, List, and so on. These objects are often referred to as iText's basic building blocks.
http://www.dapfor.com/en/net-suite/net-grid/tutorial/editors
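For reference, here is a minimal sketch of page-by-page text extraction in C#, assuming iTextSharp 5.x is referenced in the project. It uses PdfReader together with PdfTextExtractor and LocationTextExtractionStrategy, which orders text by its position on the page. The class name PdfTextDump and the helper SplitLines are hypothetical names chosen for illustration. With a tabular layout the cells of one row may still come out interleaved, so treat this as a starting point rather than a guaranteed fix.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class; the name is not part of iTextSharp.
public static class PdfTextDump
{
    // Returns the extracted text of every page, one string per page.
    public static List<string> ExtractPages(string path)
    {
        var pages = new List<string>();
        var reader = new PdfReader(path);
        try
        {
            // iTextSharp page numbers start at 1.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // Use a fresh strategy per page so text from earlier pages is not carried over.
                var strategy = new LocationTextExtractionStrategy();
                pages.Add(PdfTextExtractor.GetTextFromPage(reader, page, strategy));
            }
        }
        finally
        {
            reader.Close();
        }
        return pages;
    }

    // Splits one page of extracted text into trimmed, non-empty lines.
    public static List<string> SplitLines(string pageText)
    {
        var lines = new List<string>();
        foreach (var raw in pageText.Split('\n'))
        {
            var line = raw.Trim();
            if (line.Length > 0)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

Calling PdfTextDump.ExtractPages on one of the PDFs and passing each page through SplitLines gives a list of lines that can be inspected to see whether the question and its four options appear in a consistent order before any database import is attempted.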
0
Pankaj Nagarsekar

NA 3 11.1k 12y
@Sukesh Marla Thanks for the quick reply.
The code in the link you posted only reads the text from the PDF, and the text is all scattered. Is there a way to read it in the same order as it appears in the PDF files? Please check the image links I have posted in the question.

0
Sukesh Marla

NA 11.8k 1.2m 12y