ID verification: The first objective of the project is to propose an automatic solution for ID document analysis and integrity verification. An ID document goes through three processes: classification, text extraction, and ID verification. The classification module determines the ID family (type, country, and version). Both image-based and text-based approaches will be used to achieve a precise classification. The document then goes through several technical modules (binarization, noise removal, zoning, OCR, etc.) that extract its content. The forensic authentication process is then executed over both the visual ID data (captured under white and ultraviolet light) and the textual ID data. This ID verification process will rely on a set of rules that are externalized in a formal manner so that they can be managed easily and can evolve over time.
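
As an illustration of what externalized verification rules could look like, the following minimal sketch keeps the rules as plain data and evaluates them against extracted text fields. The rule format, the field names (`document_number`, `birth_date`), and the patterns are assumptions made for this example; the project does not specify its rule language.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One declarative check. The structure is hypothetical."""
    rule_id: str
    field: str    # name of the extracted field to check
    pattern: str  # regular expression the whole field value must match


# Rules are plain data, so they could equally be loaded from a file
# or a database and changed without touching the verification code.
RULES = [
    Rule("doc_number_format", "document_number", r"[A-Z]{2}\d{7}"),
    Rule("birth_date_format", "birth_date", r"\d{4}-\d{2}-\d{2}"),
]


def verify(fields: dict[str, str], rules: list[Rule]) -> list[str]:
    """Return the ids of the rules violated by the extracted fields."""
    failed = []
    for rule in rules:
        value = fields.get(rule.field)
        # A missing field is treated as a violation of the rule.
        if value is None or re.fullmatch(rule.pattern, value) is None:
            failed.append(rule.rule_id)
    return failed
```

Calling `verify` with well-formed fields returns an empty list, while a document with a malformed number and no birth date fails both rules. Adding or changing a check only requires editing `RULES`.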

ID knowledge management: The objective of this module is to organize ID models and ID analysis rules in a knowledge base (KB) so that global coherence between models and rules remains easy to maintain when models and rules are inserted, modified, or deleted. The knowledge base will integrate information from public registers (e.g., PRADO) and from the output of ID fraud analysis (see below). The aim is for this KB to cover the largest possible number of ID models, with the greatest level of detail deemed useful for ID analysis and verification. ID analysis rules will be based on the knowledge about ID models and will be used to decide, at each step of the verification process, which classifier, extractor, or verifier to run in order to refine the classification, extraction, or verification of the ID document under analysis.
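
The idea of using knowledge about ID models to choose the next processing step can be sketched as follows. This is only an illustration under stated assumptions: the `IDModel` attributes, the module names, and the sample entries (with the placeholder country code `XX`) are invented for the example and do not describe the project's actual knowledge base.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IDModel:
    """A knowledge-base entry. All attributes are hypothetical."""
    doc_type: str
    country: str
    version: str
    has_mrz: bool  # assumed flag: the model carries a machine-readable zone


# Placeholder entries; "XX" is not a real country code.
KB = [
    IDModel("passport", "XX", "v1", True),
    IDModel("passport", "XX", "v2", True),
    IDModel("id_card", "XX", "v1", False),
]


def candidates(kb: list[IDModel], **known: object) -> list[IDModel]:
    """Models consistent with what has been established so far."""
    return [m for m in kb if all(getattr(m, k) == v for k, v in known.items())]


def next_module(kb: list[IDModel], **known: object) -> str:
    """Choose the next step from the remaining candidate models."""
    remaining = candidates(kb, **known)
    if not remaining:
        return "manual_review"  # nothing in the KB matches
    if len(remaining) > 1:
        families = {(m.doc_type, m.country) for m in remaining}
        # Same type and country: only the version is still ambiguous.
        return "version_classifier" if len(families) == 1 else "type_classifier"
    # A single model is left, so extraction can be tailored to it.
    return "mrz_extractor" if remaining[0].has_mrz else "zone_ocr_extractor"
```

For instance, `next_module(KB)` asks for a type classifier, `next_module(KB, doc_type="passport")` asks for a version classifier, and once a single model remains the choice of extractor depends on that model's attributes.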

ID fraud analysis: As mentioned above, many ID fraud activities rely on common professional sources that control this growing black market. Analyzing ID documents individually does not help to uncover these sources. Therefore, the effort in this part will focus on analyzing the accumulated set of detected false documents in order to find potential forensic links between them. Extracting such similarity groups will be of great importance for detecting the common sources and for studying new falsification techniques and trends. Cluster analysis methods will be used to discover relations between false IDs in their multidimensional feature space. This pattern extraction module will be coupled with a suitable visualization mechanism in order to facilitate the comprehension and analysis of the extracted groups of inter-linked fraud cases. It is worth mentioning that the knowledge management module will allow easy and direct exploitation of any newly suggested control rules within the automatic ID analysis and verification module.
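
To make the notion of forensic links concrete, here is a minimal sketch of one simple form of cluster analysis: cases whose feature sets have a Jaccard similarity above a threshold are linked, and linked cases are merged into groups (single-linkage grouping with a union-find structure). The text above does not say which clustering method or features will be used; the feature labels, the threshold, and the case identifiers below are assumptions chosen for illustration.

```python
from __future__ import annotations

from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of features common to both cases (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_cases(cases: dict[str, set[str]], threshold: float = 0.5) -> list[list[str]]:
    """Group cases connected by pairwise similarity >= threshold."""
    parent = {case_id: case_id for case_id in cases}

    def find(x: str) -> str:
        # Follow parent links to the group representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(cases, 2):
        if jaccard(cases[a], cases[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups: dict[str, list[str]] = {}
    for case_id in cases:
        groups.setdefault(find(case_id), []).append(case_id)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical fraud cases described by invented feature labels.
CASES = {
    "case_1": {"font_A", "ink_X", "laminate_1"},
    "case_2": {"font_A", "ink_X", "laminate_2"},
    "case_3": {"font_B", "ink_Y", "laminate_3"},
    "case_4": {"font_A", "ink_X", "laminate_1"},
}
```

With these sample cases, `link_cases(CASES)` groups `case_1`, `case_2`, and `case_4` together and leaves `case_3` on its own, which is the kind of similarity group a visualization layer could then display. Single linkage is used here only because it is short to write; it can chain loosely related cases together, so a real system would likely compare several methods.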