Open Internship & Ph.D. Positions

Laboratoire d’Informatique Fondamentale d’Orléans (LIFO, EA 4022),
Systems and Data Security (SDS) team

Last update: November 8, 2024

Contents

1  Internship M1 or M2 on Reasoning with probabilistic and contradictory information

THE POSITION IS NO LONGER AVAILABLE.

Reasoning with probabilistic and contradictory information (PDF)

Supervisors / Contacts:
Sabine Frittellasabine.frittella@insa-cvl.fr
Vincent Hugotvincent.hugot@insa-cvl.fr

1.1 Description of the internship

Context.

Information can take various forms: crisp, fuzzy, probabilistic... whether we are asking a yes/no question, asking how tall someone is, or how likely it is to rain tomorrow. It can also be incomplete – when one does not have or cannot have access to some pieces of information – or contradictory – when one or many sources provide contradictory pieces of evidence. In addition, the information can also be imprecise: a measurement is given with a margin of error, one does not know the exact probability of it raining tomorrow, but one can provide an interval of probability with a certain degree of confidence...

Different mathematical frameworks enable us to reason with different kinds of pieces of information. For instance, classical logics are well suited for crisp non-contradictory information, while paraconsistent logics were developed to reason with contradictory information. Similarly, imprecise probabilities have been developed to formalise situations where a classical probability measure cannot be provided, for instance when one does not know the exact probability of an event, but can provide a lower bound on that probability.

This internship takes place within a project that aims to develop probability theory and probabilistic logics in a paraconsistent setting. We have already introduced paraconsistent probabilistic logics [3, 4] where we study probabilities and belief functions [2] to reason with contradictory and incomplete probabilistic information. Now, we wish to extend this research to other notions of imprecise probabilities such as possibility functions and lower probabilities. In addition, we intend to look at applications of this work by first developing software to study this theory on real data (for instance on this real, imprecise and uncertain birds dataset [5] generated through crowdsourcing specifically to run experiments on belief functions) and by looking at potential applications (one could start by looking at connections with [6] that uses belief functions together with the open-world assumptions to treat contradictory information from satellite imagery).

Goal of the internship.

The goal of the internship is to develop software to study probabilities in a paraconsistent setting on real data. The intern will need to be comfortable with reasoning with functions over the powerset, manipulating basic notions about probabilities and classical propositional logic. In addition, we shall need to implement a large number of symbolic transformations on formulae; functional languages with structural pattern-matching, such as OCaml or Haskell, are ideal for such tasks, and candidates with this background will be greatly appreciated. Python (3.10+) is also suitable for rapid prototyping. We shall begin by implementing belief functions and standard combination rules such as Dempster Shafer’s rule in a classical setting, then we shall generalise to the paraconsistent setting.

1.2 Some mathematical background

In this section we introduce the main mathematical notions that will be needed throughout the internship. Then we comment briefly on the generalisation of probability theory to the paraconsistent setting.

1.2.1 Probability measures

Consider a random variable X that takes its value in the finite set Ω={x1, …, xn}. For instance, one can represent the throw of dice via a random variable X over Ω={1,2,3,4,5,6}.

A probability distribution over Ω is a map p : Ω → [0,1] such that

 
xi∈Ω
p(xi)=1.

p(xi) encodes the probability that the variable X takes the value xi. If we consider our dice, we have p(n)=1/6 for 1≤ n ≤ 6.

A probability measure over P(Ω) is a map µ : P(Ω) → [0,1] such that

  1. p(Ω)=1 and
  2. p(AB)=p(A)+p(B) for all A,B⊆ Ω such that AB=∅.1

One can show that any probability measure is monotone (that is, if AB, then p(A)≤ p(B)).

1.2.2 Belief functions and the theory of evidence

Belief functions generalise the notion of probabilities. They enable us to treat some situations in which one cannot provide the specific probability of an event, but one can provide an interval.

A mass function over the set P(Ω) is a function m : P(Ω)→ [0,1] such that

 
AP(Ω)
m(A)=1 .

A belief function over P(Ω) is a map bel : P(Ω) → [0,1] such that there is a mass function m over P(Ω) such that : for every A⊆ Ω

bel(A)=
 
B ⊆ A
m(B).

Mass functions are used to encode information. When the mass function m : P(Ω)→ [0,1] is non-zero only on singletons, its associated belief function bel is in fact a probability measure. Otherwise, bel(A) provides a lower bound on the probability that the random variable takes its value in A.

Since mass functions encode information and more specifically pieces of evidence, many rules have been proposed to combine mass functions, that is, to aggregate the information contained in many mass functions. One example is Demspter-Shafer combination rule.

Definition 1 (Dempster’s combination rule over a powerset algebra) Let m1 and m2 be two mass functions on a powerset algebra P(Ω). Dempster’s combination rule computes their aggregation m1m2 as follows.
     
  m1 ⊕ m2 : P(Ω) → [0,1]         (1)
X
 ↦



   0 if  X=∅ (2)
{ m1(X1) · m2(X2) ∣ X1 ⋂ X2 = X } 
{ m1(X1) · m2(X2) ∣ X1 ⋂ X2 ≠ ∅ } 
 otherwise.
        

First step of the internship.

To code belief functions, Dempster-Shafer combination rule, and other similar combination rules from the literature.

The software will take as input a set Ω and a collection of mass functions over P(Ω), each expressed as a collection of couples (A,x), where A ⊆ Ω and x∈ ]0,1]. From there, the Dempster-Shafer combination rule will be straightforward to implement.

This software will then be extended to classical logic and Belnap-Dunn logic, other combination rules, etc, depending on time and affinities. This will require implementing a small parser for propositional formulæ, a transformation into (non-redundant) Disjunctive Normal Form, etc.

A very non-technical presentation of classical propositional logic and Belnap-Dunn logic

Let Prop be a finite set of propositional variables. Formulas of classical propositional logic are defined by induction as follows:

φ := p ∣ φ ∧ φ ∣ φ ∨ φ ∣ ¬ φ 

with pProp. When interpreted over a powerset algebra, the conjunction ∧ can be interpreted as the intersection ∩, the disjunction ∨ as the union ∪, and the negation ¬ as the complement.

Classical propositional logic is used to reason about facts of the world. For instance, the statement John lives in London is either true or false. Belnap-Dunn logic 10 was introduced to reason about the information available about a statement. For instance, one could have no information about where John lives, the information that he lives in London, the information that he does not live in London, or contradictory information about where he lives. To formalise the 4 possible situations regarding the available information about the statement John lives in London, one uses 4 truth values T (true), F (false), N (neither), B (both). These truth values are interpreted as follows:

Second step of the internship.

Understand belief functions over Belnap-Dunn logic (these notions are introduced here [3] and will be explained by the supervisors), and take advantage of the fact that Belnap-Dunn logic can be encoded in classical logic to reuse the previous code to compute aggregation of mass functions in the setting of Belnap-Dunn logic.

Further work.

Depending on the interests and the profile of the intern, many other tasks can be done, such as:

1.3 Practical information

1.3.1 Contacts & Application procedure:

Before applying, please check that you have the latest version of this document and that the position is still available: https://tcs.vhugot.com/Offers

Please submit your application by email to
Dr. Sabine Frittella (email: sabine.frittella@insa-cvl.fr) and
Dr. Vincent Hugot (email: vincent.hugot@insa-cvl.fr), including:

1.3.2 Requirements

Your are:

1.3.3 Additional Information

1.4 References

  1. R. Fagin, J.Y. Halpern, Y. Moses, M.Y. Vardi. Reasoning About Knowledge. MIT press. 1995.
  2. G. Shafer, A Mathematical Theory of Evidence. Princeton University Press. 1976.
  3. M. Bílková, S. Frittella, D. Kozhemiachenko, O. Majer, S. Nazari. Reasoning with belief functions over Belnap–Dunn logic. Annals of Pure and Applied Logic. 2023. PDF
  4. M. Bílková, S. Frittella, D. Kozhemiachenko, O. Majer, K. Manoorkar: Describing and quantifying contradiction between pieces of evidence via Belnap-Dunn logic and Dempster-Shafer theory. ISIPTA 2023: 37-47. PDF
  5. C. Thierry, A. Hoarau, A. Martin, J-C. Dubois, Y. Le Gall: Real Bird Dataset with Imprecise and Uncertain Values. BELIEF 2022: 275-285. PDF
  6. E. Skau, C. Armstrong, D. P. Truong, D. Gerts, K. Sentz: Open world Dempster-Shafer using complementary sets. ISIPTA 2023: 438-449. PDF
  7. J. Marques-Silva. "Practical applications of boolean satisfiability." In 2008 9th International Workshop on Discrete Event Systems, pp. 74-80. IEEE, 2008. PDF
  8. Bryant, "Graph-Based Algorithms for Boolean Function Manipulation," in IEEE Transactions on Computers, vol. C-35, no. 8, pp. 677-691, Aug. 1986. PDF
  9. A. Chandra, D. Kozen, and L. Stockmeyer. 1981. Alternation. J. ACM 28, 1 (Jan. 1981), 114–133. PDF
  10. N. Belnap, How a Computer Should Think, Springer, Cham, Synthese Library (Studies in Epistemology, Logic, Methodology, and Philosophy of Science), New Essays on Belnap-Dunn Logic, 418, 2019.

2 Internship / Ph.D: Towards Formal Semantics and Proven Compliance of Business Workflows Processing Personal Data

The internship has been filled, and is no longer available.

The Ph.D. position has been filled.

Towards Formal Semantics and Proven Compliance of Business Workflows Processing Personal Data (PDF)

You can check that you have the latest version of this document and that the position is still available on https://tcs.vhugot.com/Offers.

2.1 Practical Information

Laboratory & Team:

Laboratoire d’Informatique Fondamentale d’Orléans (LIFO, EA 4022),
Systems and Data Security (SDS) team,
INSA Centre Val de Loire,
88 boulevard Lahitolle
18022 Bourges

Supervisors and collaborators:

The internship will be supervised by Vincent Hugot.

The following Ph.D. thesis will be co-supervised by Vincent Hugot and directed by either Pascal Berthomé or Benjamin Nguyen, depending on scope.

The workflow software company Relyens/QualNet2 shall also offer support in the form of real-world workflows and use-cases, as well as access to their IntraQualDyn software.

Duration and start:

The internship will last 5 to 6 months, and start at the candidate’s earliest convenience, sometime around March, April, or May 2024.

It serves as prelude to, and groundwork for, a Ph.D. thesis, which shall be funded (3 years) by the iPoP project3. Do not apply if you have no intention of pursuing a Ph.D. after.

Candidates with a Master’s degree and an excellent background may also apply directly to the Ph.D. position. The starting date is more flexible in that case.

Contact:

To apply or request additional information, send an email to
vincent.hugot@insa-cvl.fr, preferably with a [Workflows] prefix to the mail title.

To apply, please enclose:

Before applying, please check that you have the latest version of this document and that the position is still available: https://tcs.vhugot.com/Offers

Requirements:

Useful Skills and Traits:

2.2 Context and Long-Term Goals

This project builds on a previous collaboration between SDS and Qualnet, during which we defined a formal semantics for (a subset of) the Intraqual graphical business process workflow language.

Note that proper formal semantics are very uncommon for business workflow languages, which tend to be defined graphically and given informal semantics, usually in natural language.

The short-term aim was to produce a principled, reliable execution engine and ensure correctness of execution. We are now in a position to apply formal verification techniques to workflows.

Of particular interest for our project are workflows dealing with personal data generally, and sensitive personal data especially; for instance, workflows in the medical domain.

During the Ph.D., we hope to develop methodologies to prove that workflows are in compliance with regulations on handling of private data. Thus we shall:

The aim is not an immediate industrial application; we shall stay agnostic with respect to specific languages and technologies. All research shall be publicly available, and all code open-sourced. However, QualNet will provide us with real-world workflows and use-cases, which should guide our modelling work.

2.3 Short-Term Goals of the Internship

The main goal of the internship is to lay the groundwork for the Ph.D. topic, as above. In particular, some time should be spent on exploring the prior art on workflow semantics and proof, and some time on reading real-world workflows and confronting them to the GDPR and similar documents to identify interesting properties to verify. This will inform the starting directions of the Ph.D.

A secondary project, the implementation of a CTL* model-checker (in Python), will be pursued in parallel. It is a more self-contained subject, and serves mainly as a pretext to test and practice the skills that will be necessary for completion of the Ph.D.


1
In this work, we will only consider probability measures over finite sets, therefore we can restrict ourselves to the axiom of finite additivity.
2
https://www.qualnet.fr/
3
https://files.inria.fr/ipop/
4
The IntraQualDyn software, for instance, automatically generates a website that implements the workflow, providing each user with the data and data entry forms they need at the current step in the process.

Correctness of the workflow on paper does not by itself imply correctness of the software; that requires additional work ensuring that the “compiling” process from workflow to software preserves the proven properties


This document was translated from LATEX by HEVEA.