Hi there,
I need some Delphi 2010-compatible code, for integration into another project, which does the following:
Process a directory full of DOC/DOCX/PDF files (which will be CVs of people) and extract the following data from each document:
Fullname (as used at the top of the file)
Firstname
Lastname
Middle name (if any)
Email (if any)
Phone (if any)
Country
City
Address
Full text of the document (plain text with no formatting, but with line breaks preserved, e.g. as a TStringList)
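The field list above could be carried by a small class along these lines. This is only a minimal sketch: the unit name CvData, the class name TCvData and all field names are placeholders chosen for illustration, not fixed requirements.

```delphi
unit CvData;

interface

uses
  Classes;

type
  // Holds the data extracted from one CV.
  // All identifiers here are illustrative placeholders.
  TCvData = class
  private
    FFullText: TStringList;
  public
    FullName: string;    // as written at the top of the document
    FirstName: string;
    LastName: string;
    MiddleName: string;  // empty if none
    Email: string;       // empty if none
    Phone: string;       // empty if none
    Country: string;
    City: string;
    Address: string;
    constructor Create;
    destructor Destroy; override;
    // Plain text of the document, one entry per line.
    property FullText: TStringList read FFullText;
  end;

implementation

constructor TCvData.Create;
begin
  inherited Create;
  FFullText := TStringList.Create;
end;

destructor TCvData.Destroy;
begin
  FFullText.Free;
  inherited Destroy;
end;

end.
```

A class is used here rather than a record so that the TStringList is created and freed together with its owner.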
Sample documents can be provided. They generally follow a standard format, i.e. they have the name and contact details in the header. However, there are also documents which only have the name at the top of the document.
The component should also be able to process just one document, i.e. it should not be limited to processing whole directories of files.
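To illustrate the two modes (a whole directory or a single document), here is a rough sketch of the file selection step only. The names IsCvFile and CollectCvFiles are hypothetical, and the actual parsing of each file is not shown.

```delphi
program ListCvFiles;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// True if the file name has one of the three supported extensions.
function IsCvFile(const FileName: string): Boolean;
var
  Ext: string;
begin
  Ext := LowerCase(ExtractFileExt(FileName));
  Result := (Ext = '.doc') or (Ext = '.docx') or (Ext = '.pdf');
end;

// Adds the full path of every DOC/DOCX/PDF file in Dir to Files.
// Subdirectories are not searched in this sketch.
procedure CollectCvFiles(const Dir: string; Files: TStrings);
var
  SR: TSearchRec;
  Path: string;
begin
  Path := IncludeTrailingPathDelimiter(Dir);
  if FindFirst(Path + '*.*', faAnyFile, SR) = 0 then
  try
    repeat
      if ((SR.Attr and faDirectory) = 0) and IsCvFile(SR.Name) then
        Files.Add(Path + SR.Name);
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;

var
  Files: TStringList;
  Target: string;
  i: Integer;
begin
  Target := ParamStr(1); // a directory or a single document
  Files := TStringList.Create;
  try
    if DirectoryExists(Target) then
      CollectCvFiles(Target, Files)
    else if FileExists(Target) and IsCvFile(Target) then
      Files.Add(Target);
    // Each entry would then be passed to the extraction routine.
    for i := 0 to Files.Count - 1 do
      Writeln(Files[i]);
  finally
    Files.Free;
  end;
end.
```

The extension is checked in code rather than through the search mask because a mask such as '*.doc' can also match .docx files on Windows.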
The data will then be written into a database program we are developing.
The component should also include a function that searches the DOC/DOCX/PDF document for certain keywords. These will be company names, defined in an .INI file; I will provide the list in .txt format, with the company names arranged into groups. If the function finds one or more of these keywords, it should return an integer value corresponding to the group the keyword belongs to.
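As a rough illustration of the keyword lookup, the sketch below works on text that has already been extracted from the document. It rests on several assumptions that are not part of the requirements: the .INI layout (one [Groups] section with group number = comma-separated company names), the function name FindCompanyGroup, the sample company names, case-insensitive substring matching, and returning 0 when nothing matches. If names from several groups occur, it simply returns the first group that matches.

```delphi
program KeywordGroupDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, IniFiles;

// Returns the group number of the first company name found in DocText,
// or 0 if none is found. Assumed INI layout (illustrative only):
//   [Groups]
//   1=Acme Ltd,Globex
//   2=Initech
function FindCompanyGroup(const DocText: string; Ini: TCustomIniFile): Integer;
var
  Keys, Names: TStringList;
  i, j: Integer;
  LowerText, Name: string;
begin
  Result := 0;
  LowerText := AnsiLowerCase(DocText);
  Keys := TStringList.Create;
  Names := TStringList.Create;
  try
    Names.StrictDelimiter := True; // split on commas only, keep spaces
    Names.Delimiter := ',';
    Ini.ReadSection('Groups', Keys); // key names are the group numbers
    for i := 0 to Keys.Count - 1 do
    begin
      Names.DelimitedText := Ini.ReadString('Groups', Keys[i], '');
      for j := 0 to Names.Count - 1 do
      begin
        Name := AnsiLowerCase(Trim(Names[j]));
        if (Name <> '') and (Pos(Name, LowerText) > 0) then
        begin
          Result := StrToIntDef(Keys[i], 0);
          Exit; // the finally block still frees both lists
        end;
      end;
    end;
  finally
    Names.Free;
    Keys.Free;
  end;
end;

var
  Ini: TMemIniFile;
  Lines: TStringList;
begin
  // An in-memory INI is used so that the demo needs no file on disk.
  Ini := TMemIniFile.Create('');
  Lines := TStringList.Create;
  try
    Lines.Add('[Groups]');
    Lines.Add('1=Acme Ltd,Globex');
    Lines.Add('2=Initech');
    Ini.SetStrings(Lines);
    Writeln(FindCompanyGroup('Worked at INITECH from 2005 to 2008.', Ini)); // 2
    Writeln(FindCompanyGroup('No known employer here.', Ini));              // 0
  finally
    Lines.Free;
    Ini.Free;
  end;
end.
```

Plain substring matching will also hit a company name that appears inside a longer word, so a real implementation would probably need a whole-word check.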
More details by PM if needed.
This is a simple project, so cheap offers please. If you need or want to use any additional components, please tell me with your bid and let me know whether a commercial licence is required. I would generally prefer not to license any third-party components, but I will consider it if necessary. I will give preference to bidders who can provide some sort of basic demo (.exe). Payment in full upon delivery of the working source code.
If this project goes well, there will be plenty of follow-up work.
Thanks