Extract text from a .doc file in ASP.NET
Hi,
I am using the following code to extract text from a .doc file.
Code::
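using System.IO;

string text;
// Open the .doc file read-only; the using blocks close the stream and reader even if an exception is thrown.
using (FileStream fileStream = new FileStream("F:\\Resume_Rajib_Ghosal.doc", FileMode.Open, FileAccess.Read, FileShare.None))
using (StreamReader srd = new StreamReader(fileStream))
{
    // Read the entire stream into a single string.
    text = srd.ReadToEnd();
}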
But after extracting, searching the text for keywords such as xml, hidden, control, form, html and so on does not work properly. In the original file, a search for the keyword xml finds nothing, but in the extracted text the same search returns multiple matches. The extracted text is also longer than the content of the .doc file, and it contains invalid characters such as ? ? 8??i ?i ?BN?? .
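One possible explanation, which I have not confirmed: a binary .doc file is stored as an OLE compound file rather than plain text, so StreamReader would be decoding raw bytes (formatting data and metadata included) as characters. Below is a minimal sketch to check this, assuming the standard compound-file signature D0 CF 11 E0 A1 B1 1A E1 at the start of the file. The class name DocCheck and the helper LooksLikeOleCompoundFile are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;

static class DocCheck
{
    // Signature bytes at the start of an OLE compound file (the container used by binary .doc).
    static readonly byte[] OleSignature = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Hypothetical helper: returns true when the file begins with the OLE signature.
    public static bool LooksLikeOleCompoundFile(string path)
    {
        byte[] header = new byte[OleSignature.Length];
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int total = 0;
            // Read may return fewer bytes than requested, so loop until the header is full.
            while (total < header.Length)
            {
                int n = fs.Read(header, total, header.Length - total);
                if (n == 0) return false; // file is shorter than the signature
                total += n;
            }
        }
        for (int i = 0; i < OleSignature.Length; i++)
        {
            if (header[i] != OleSignature[i]) return false;
        }
        return true;
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: DocCheck <path-to-file>");
            return;
        }
        Console.WriteLine(LooksLikeOleCompoundFile(args[0])
            ? "OLE signature found: the file is a binary container, not plain text."
            : "No OLE signature found.");
    }
}
```

If the check returns true for the file above, reading it as text would be expected to produce extra length and unreadable characters like the ones I am seeing, and a parser that understands the .doc format would be needed instead.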
Please suggest a better solution.
Thanks in advance.
Pankaj