Extracting snippets from PDF stored in BFILE [message #176287] Wed, 07 June 2006 12:55
Mark Kane
Messages: 21
Registered: January 2000
Junior Member
I have a table with BFILEs that I need to search through, but I only want a snippet of the data, not the entire BFILE itself.

Some of the BFILEs are text and some are PDF. I can separate the text files from the PDFs, putting them into LONG and BLOB columns respectively. The next step is to isolate an 80-character snippet of each file, centered on the word the users were interested in.

I'm doing this because my shop relies heavily on Discoverer, and I do not want to make my users open every document to learn the context in which the word or term was used.

How can I extract a snippet of words from a PDF file stored in a BFILE and put it into a VARCHAR?

I have scratched my head for quite a while and have made little progress, so any hints or help would be greatly appreciated.

-m
Re: Extracting snippets from PDF stored in BFILE [message #176319 is a reply to message #176287] Wed, 07 June 2006 19:19
andrew again
Messages: 2577
Registered: March 2000
Senior Member
You may need to use Oracle Text because it can get the text out of PDFs.
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:440419921146
Re: Extracting snippets from PDF stored in BFILE [message #176360 is a reply to message #176287] Thu, 08 June 2006 02:04
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
Oracle Text ctx_doc.snippet is designed to do exactly that:

http://download-west.oracle.com/docs/cd/B19306_01/text.102/b14218/cdocpkg.htm#sthref1771
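
A minimal sketch of how that might look, assuming a 10g Release 2 database with Oracle Text installed. The table docs, its columns id and doc_file, the index docs_idx, and the search word 'oracle' are all hypothetical names chosen for illustration; none of them come from the original post. It also assumes the BFILE locators point at files the database can read through a directory object.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE docs (
  id       NUMBER PRIMARY KEY,
  doc_file BFILE
);

-- A CONTEXT index can be built directly on a BFILE column.
-- AUTO_FILTER (10g Release 2) extracts text from PDF and other
-- binary formats at indexing time; plain-text files pass through.
CREATE INDEX docs_idx ON docs (doc_file)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('FILTER CTXSYS.AUTO_FILTER');

SET SERVEROUTPUT ON

-- Fetch one keyword-in-context snippet per matching document.
DECLARE
  v_snippet VARCHAR2(4000);
BEGIN
  FOR r IN (SELECT id
              FROM docs
             WHERE CONTAINS(doc_file, 'oracle') > 0)
  LOOP
    -- Arguments: index name, document key as text, query.
    -- The default key type is the primary key.
    v_snippet := CTX_DOC.SNIPPET('DOCS_IDX', TO_CHAR(r.id), 'oracle');
    DBMS_OUTPUT.PUT_LINE(r.id || ': ' || v_snippet);
  END LOOP;
END;
/
```

By default the matched terms come back wrapped in <b> and </b> tags, which can be changed through the starttag and endtag parameters. In this release the snippet length is chosen by Oracle Text rather than fixed at 80 characters, so a SUBSTR on the result may still be needed before storing it in a narrower VARCHAR2 column.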