Open Internship & Ph.D. Positions
Laboratoire d’Informatique Fondamentale d’Orléans (LIFO, EA 4022), |
Last update: November 8, 2024
THE POSITION IS NO LONGER AVAILABLE.
Reasoning with probabilistic and contradictory information (PDF)
Supervisors / Contacts:
Sabine Frittella — sabine.frittella@insa-cvl.fr
Vincent Hugot — vincent.hugot@insa-cvl.fr
Information can take various forms: crisp, fuzzy, probabilistic... whether we are asking a yes/no question, asking how tall someone is, or how likely it is to rain tomorrow. It can also be incomplete – when one does not have or cannot have access to some pieces of information – or contradictory – when one or many sources provide contradictory pieces of evidence. In addition, the information can also be imprecise: a measurement is given with a margin of error, one does not know the exact probability of it raining tomorrow, but one can provide an interval of probability with a certain degree of confidence...
Different mathematical frameworks enable us to reason with different kinds of pieces of information. For instance, classical logics are well suited for crisp non-contradictory information, while paraconsistent logics were developed to reason with contradictory information. Similarly, imprecise probabilities have been developed to formalise situations where a classical probability measure cannot be provided, for instance when one does not know the exact probability of an event, but can provide a lower bound on that probability.
This internship takes place within a project that aims to develop probability theory and probabilistic logics in a paraconsistent setting. We have already introduced paraconsistent probabilistic logics [3, 4] where we study probabilities and belief functions [2] to reason with contradictory and incomplete probabilistic information. Now, we wish to extend this research to other notions of imprecise probabilities such as possibility functions and lower probabilities. In addition, we intend to look at applications of this work by first developing software to study this theory on real data (for instance on this real, imprecise and uncertain birds dataset [5] generated through crowdsourcing specifically to run experiments on belief functions) and by looking at potential applications (one could start by looking at connections with [6] that uses belief functions together with the open-world assumptions to treat contradictory information from satellite imagery).
The goal of the internship is to develop software to study probabilities in a paraconsistent setting on real data. The intern will need to be comfortable with reasoning with functions over the powerset, manipulating basic notions about probabilities and classical propositional logic. In addition, we shall need to implement a large number of symbolic transformations on formulae; functional languages with structural pattern-matching, such as OCaml or Haskell, are ideal for such tasks, and candidates with this background will be greatly appreciated. Python (3.10+) is also suitable for rapid prototyping. We shall begin by implementing belief functions and standard combination rules such as Dempster Shafer’s rule in a classical setting, then we shall generalise to the paraconsistent setting.
In this section we introduce the main mathematical notions that will be needed throughout the internship. Then we comment briefly on the generalisation of probability theory to the paraconsistent setting.
Consider a random variable X that takes its value in the finite set Ω={x1, …, xn}. For instance, one can represent the throw of dice via a random variable X over Ω={1,2,3,4,5,6}.
A probability distribution over Ω is a map p : Ω → [0,1] such that
| p(xi)=1. |
p(xi) encodes the probability that the variable X takes the value xi. If we consider our dice, we have p(n)=1/6 for 1≤ n ≤ 6.
A probability measure over P(Ω) is a map µ : P(Ω) → [0,1] such that
One can show that any probability measure is monotone (that is, if A⊆ B, then p(A)≤ p(B)).
Belief functions generalise the notion of probabilities. They enable us to treat some situations in which one cannot provide the specific probability of an event, but one can provide an interval.
A mass function over the set P(Ω) is a function m : P(Ω)→ [0,1] such that
| m(A)=1 . |
A belief function over P(Ω) is a map bel : P(Ω) → [0,1] such that there is a mass function m over P(Ω) such that : for every A⊆ Ω
bel(A)= |
| m(B). |
Mass functions are used to encode information. When the mass function m : P(Ω)→ [0,1] is non-zero only on singletons, its associated belief function bel is in fact a probability measure. Otherwise, bel(A) provides a lower bound on the probability that the random variable takes its value in A.
Since mass functions encode information and more specifically pieces of evidence, many rules have been proposed to combine mass functions, that is, to aggregate the information contained in many mass functions. One example is Demspter-Shafer combination rule.
|
To code belief functions, Dempster-Shafer combination rule, and other similar combination rules from the literature.
The software will take as input a set Ω and a collection of mass functions over P(Ω), each expressed as a collection of couples (A,x), where A ⊆ Ω and x∈ ]0,1]. From there, the Dempster-Shafer combination rule will be straightforward to implement.
This software will then be extended to classical logic and Belnap-Dunn logic, other combination rules, etc, depending on time and affinities. This will require implementing a small parser for propositional formulæ, a transformation into (non-redundant) Disjunctive Normal Form, etc.
Let Prop be a finite set of propositional variables. Formulas of classical propositional logic are defined by induction as follows:
φ := p ∣ φ ∧ φ ∣ φ ∨ φ ∣ ¬ φ |
with p∈ Prop. When interpreted over a powerset algebra, the conjunction ∧ can be interpreted as the intersection ∩, the disjunction ∨ as the union ∪, and the negation ¬ as the complement.
Classical propositional logic is used to reason about facts of the world. For instance, the statement John lives in London is either true or false. Belnap-Dunn logic 10 was introduced to reason about the information available about a statement. For instance, one could have no information about where John lives, the information that he lives in London, the information that he does not live in London, or contradictory information about where he lives. To formalise the 4 possible situations regarding the available information about the statement John lives in London, one uses 4 truth values T (true), F (false), N (neither), B (both). These truth values are interpreted as follows:
Understand belief functions over Belnap-Dunn logic (these notions are introduced here [3] and will be explained by the supervisors), and take advantage of the fact that Belnap-Dunn logic can be encoded in classical logic to reuse the previous code to compute aggregation of mass functions in the setting of Belnap-Dunn logic.
Depending on the interests and the profile of the intern, many other tasks can be done, such as:
Before applying, please check that you have the latest version of this document and that the position is still available: https://tcs.vhugot.com/Offers
Please submit your application by email to
Dr. Sabine Frittella (email:
sabine.frittella@insa-cvl.fr) and
Dr. Vincent Hugot (email:
vincent.hugot@insa-cvl.fr), including:
Your are:
The internship has been filled, and is no longer available.
The Ph.D. position has been filled.
Towards Formal Semantics and Proven Compliance of Business Workflows Processing Personal Data (PDF)
You can check that you have the latest version of this document and that the position is still available on https://tcs.vhugot.com/Offers.
Laboratory & Team:
Laboratoire d’Informatique Fondamentale d’Orléans (LIFO, EA 4022),
Systems and Data Security (SDS) team,
INSA Centre
Val de Loire,
88 boulevard Lahitolle
18022 Bourges
Supervisors and collaborators:
The internship will be supervised by Vincent Hugot.
The following Ph.D. thesis will be co-supervised by Vincent Hugot and directed by either Pascal Berthomé or Benjamin Nguyen, depending on scope.
The workflow software company Relyens/QualNet2 shall also offer support in the form of real-world workflows and use-cases, as well as access to their IntraQualDyn software.
Duration and start:
The internship will last 5 to 6 months, and start at the candidate’s earliest convenience, sometime around March, April, or May 2024.
It serves as prelude to, and groundwork for, a Ph.D. thesis, which shall be funded (3 years) by the iPoP project3. Do not apply if you have no intention of pursuing a Ph.D. after.
Candidates with a Master’s degree and an excellent background may also apply directly to the Ph.D. position. The starting date is more flexible in that case.
Contact:
To apply or request additional information,
send an email to
vincent.hugot@insa-cvl.fr,
preferably with a [Workflows] prefix to the mail title.
To apply, please enclose:
Before applying, please check that you have the latest version of this document and that the position is still available: https://tcs.vhugot.com/Offers
Requirements:
Useful Skills and Traits:
This project builds on a previous collaboration between SDS and Qualnet, during which we defined a formal semantics for (a subset of) the Intraqual graphical business process workflow language.
Note that proper formal semantics are very uncommon for business workflow languages, which tend to be defined graphically and given informal semantics, usually in natural language.
The short-term aim was to produce a principled, reliable execution engine and ensure correctness of execution. We are now in a position to apply formal verification techniques to workflows.
Of particular interest for our project are workflows dealing with personal data generally, and sensitive personal data especially; for instance, workflows in the medical domain.
During the Ph.D., we hope to develop methodologies to prove that workflows are in compliance with regulations on handling of private data. Thus we shall:
The aim is not an immediate industrial application; we shall stay agnostic with respect to specific languages and technologies. All research shall be publicly available, and all code open-sourced. However, QualNet will provide us with real-world workflows and use-cases, which should guide our modelling work.
The main goal of the internship is to lay the groundwork for the Ph.D. topic, as above. In particular, some time should be spent on exploring the prior art on workflow semantics and proof, and some time on reading real-world workflows and confronting them to the GDPR and similar documents to identify interesting properties to verify. This will inform the starting directions of the Ph.D.
A secondary project, the implementation of a CTL* model-checker (in Python), will be pursued in parallel. It is a more self-contained subject, and serves mainly as a pretext to test and practice the skills that will be necessary for completion of the Ph.D.
Correctness of the workflow on paper does not by itself imply correctness of the software; that requires additional work ensuring that the “compiling” process from workflow to software preserves the proven properties
This document was translated from LATEX by HEVEA.