I require a program with a GUI that will perform probabilistic record linkage upon ANY two databases the user loads into the program. The resulting records must then be placed into a new database. The program should do the following:
Have a simple GUI.
Allow the user to load ANY two databases.
Remove duplicates within each database first, then compare the records of each database against the other to find further duplicates, and remove those as well.
Account for different naming standards between the databases, for example 'Salary' vs 'Wage'.
Account for attributes not necessarily being in the same order.
Account for different data formatting. For example, one database could store address details under a single attribute 'Address', while another may use 'Street', 'City', 'State', etc. One attribute in Database A could therefore correspond to multiple attributes in Database B.
If Database A has an attribute that Database B does not, that attribute must be added to the resulting database, even though some records from Database B will contain null for it.
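To make the naming, ordering, and formatting requirements above concrete, here is a minimal Python sketch of one possible approach. It assumes records are loaded as dictionaries (so attribute order does not matter) and that a column mapping is already known. The names `COLUMN_MAP`, `to_common_schema`, and `align` are hypothetical, and in the real program the mapping would probably have to be suggested by the software and confirmed by the user.

```python
# Hypothetical mapping from Database B column names to Database A names.
# A tuple with several names means those B columns are combined into one A column.
COLUMN_MAP = {
    ("Wage",): "Salary",
    ("Street", "City", "State"): "Address",
}


def to_common_schema(record, column_map):
    """Rename and combine the columns of one B record to follow A's naming."""
    out = {}
    consumed = set()
    for sources, target in column_map.items():
        consumed.update(sources)
        values = [record[s] for s in sources if record.get(s) is not None]
        if not values:
            continue
        # A one-to-one rename keeps the original value; a many-to-one
        # mapping joins the parts into a single string.
        if len(sources) == 1:
            out[target] = values[0]
        else:
            out[target] = ", ".join(str(v) for v in values)
    # Columns without a mapping are carried over unchanged.
    for key, value in record.items():
        if key not in consumed:
            out[key] = value
    return out


def align(records_a, records_b):
    """Give every record the union of all attributes, using None (null) for gaps."""
    combined = records_a + records_b
    all_columns = []
    for rec in combined:
        for key in rec:
            if key not in all_columns:
                all_columns.append(key)
    return [{col: rec.get(col) for col in all_columns} for rec in combined]
```

With this sketch, a record from Database B such as `{"Name": "Bob", "Wage": 90, "Street": "2 Oak Ave", "City": "Springfield", "State": "IL"}` would end up with `Salary` and `Address` attributes, and any attribute that exists only in Database A would be present with a null value.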
Probabilistic techniques must be used to achieve this. Further requirements will likely be identified during development, but I will be in constant communication, so do not worry. Thank you.
People with knowledge of record linkage preferred.
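For anyone unfamiliar with the term, the following is a rough Python sketch of the kind of probabilistic scoring commonly used in record linkage (Fellegi-Sunter style match weights). It is only an illustration under stated assumptions, not a specification: the field names, the m and u probabilities in `FIELD_PARAMS`, the similarity threshold, and the cutoff are all made-up values that would normally be estimated from the data or tuned.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities (assumed, not estimated from real data):
# m = P(field agrees | the two records describe the same entity)
# u = P(field agrees | the two records describe different entities)
FIELD_PARAMS = {
    "Name": (0.95, 0.01),
    "Address": (0.90, 0.05),
    "Salary": (0.80, 0.10),
}


def agrees(a, b, threshold=0.85):
    """Treat two values as agreeing if their normalized strings are similar enough."""
    s1, s2 = str(a).strip().lower(), str(b).strip().lower()
    return SequenceMatcher(None, s1, s2).ratio() >= threshold


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log-likelihood ratios over fields; higher means more likely a match."""
    total = 0.0
    for field, (m, u) in params.items():
        va, vb = rec_a.get(field), rec_b.get(field)
        if va is None or vb is None:
            continue  # a missing value gives no evidence either way
        if agrees(va, vb):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def deduplicate(records, cutoff=5.0):
    """Keep the first record of each group whose match weight reaches the cutoff."""
    kept = []
    for rec in records:
        if not any(match_weight(rec, k) >= cutoff for k in kept):
            kept.append(rec)
    return kept
```

Calling `deduplicate` on each database and then on the two results concatenated would cover both the within-database and cross-database steps. This sketch compares every pair of records, so for large databases some form of blocking or indexing would likely be needed to keep the number of comparisons manageable.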
13 freelancers are bidding an average of $395 for this job.
I have experience creating a similar database sync app in which fields can have different names or sometimes be missing entirely. I have a total of 15 years of experience in software development.