I need to search a PDF for about 200 known reference numbers and, for each one that is found, return an associated value that I do not know in advance. Examples of the reference numbers:
- ABC-12-012
- ABC-012-86
- ABC-0512-10
Where the reference number will always:
- Be at the beginning of a line
- Follow the word "References:"
- Start with ABC-
- Have a varying number of digits in each group between the hyphens
The data that I need is actually several lines above the "References:" line. It is a series of dot-separated numbers followed by a description. It resembles "9.8.1 Appendix A" but could just as easily be "9.1 Appendix D" or "9.2.8.63.4 Appendix C".
Also, in case it matters, a given reference number may not appear in every PDF.
Thanks for any help on this!
Sample Text:
________________________________
9.8.1 Appendix A
Description:
This is where a description would be. there could be another header as well.
Additional Information:
One or more additional sections may exist between the 9.8.1 Appendix A (which is the text I need) and the ABC-0012-083 which is what I know to search for.
References:
ABC-0012-083
9.8.2 Addendum 9
Description:
This is where a description would be. there could be another header as well.
Additional Information:
One or more additional sections may exist between the 9.8.1 Appendix A (which is the text I need) and the ABC-0012-083 which is what I know to search for.
References:
ABC-021-19
________________________________
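
To make the goal concrete, here is a rough Python sketch of the logic I have in mind. It assumes the PDF text has already been extracted into a plain string (for example with a library such as pypdf) and that the extraction keeps one item per line, as in the sample above. The function name and both regular expressions are my own guesses based on the examples, not something from an existing tool, and a description line that happens to start with dotted numbers would be mistaken for a heading.

```python
import re

# Assumed heading shape: dot-separated numbers, whitespace, then a description,
# e.g. "9.8.1 Appendix A" or "9.2.8.63.4 Appendix C".
HEADING_RE = re.compile(r"^\d+(?:\.\d+)+\s+\S.*$")

# Assumed reference shape: "ABC-", digits, a hyphen, digits, alone on the line.
REFERENCE_RE = re.compile(r"^ABC-\d+-\d+$")


def map_references_to_headings(text):
    """Return {reference number: nearest heading line above it}."""
    result = {}
    current_heading = None    # most recent heading seen so far
    after_references = False  # True while directly below a "References:" line
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if HEADING_RE.match(line):
            current_heading = line
            after_references = False
        elif line == "References:":
            after_references = True
        elif after_references and REFERENCE_RE.match(line):
            if current_heading is not None:
                # Keep the first heading found for a reference number.
                result.setdefault(line, current_heading)
        elif line:
            # Any other non-empty line ends the references block.
            after_references = False
    return result


sample = """9.8.1 Appendix A
Description:
This is where a description would be.
References:
ABC-0012-083
9.8.2 Addendum 9
Description:
This is where a description would be.
References:
ABC-021-19
"""

found = map_references_to_headings(sample)
# Hypothetical list of known reference numbers; the last one is absent on purpose.
for ref in ["ABC-0012-083", "ABC-021-19", "ABC-12-012"]:
    print(ref, "->", found.get(ref, "not found in this PDF"))
```

Building the whole mapping in one pass and then looking up each of the 200 known numbers seemed simpler to me than searching the text 200 times, and it also copes with a reference number that is missing from a given PDF.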